Emily programming language: Status update one

June 4th, 2014

About a month ago, I put up some plans for a programming language. It actually got some interest, so I thought I’d go over some things that have happened since then.

First off, I decided the name of the language is definitely Emily, unless I’m somehow legally forced to call it something else. I made a website for it and an announcement-feed Twitter, at emilylang.org and @emilylanguage, respectively.

Second off, I now have a prototype implementation! It’s a very, very minimal implementation– you seriously couldn’t develop anything in it, and it’s missing most of the language’s better planned elements– but it does demonstrate a couple of the language’s striking features, and it’s Turing-complete. (And the Turing completeness part can be demonstrated in an amusing way: It turns out to be possible to embed Unlambda programs in it.)

Programs written in the language don’t look very pretty right now. Most of the “operators” in Emily, like = or +, are eventually intended to be defined as macros, and those macros aren’t implemented in the prototype yet, so setting a variable looks a little funny:

# Prints "3.0"
set .test 3
print test

And adding looks a little funny:

# Prints "7.0"
print (3 .plus 4)

In the final version of Emily, writing “3 + 4” will do the same thing as writing “3 .plus 4”– the only difference will be that + has operator precedence, and .plus does not.

There’s one macro that *is* in already– ^, or “lambda”, which turns whatever comes right after it into a function:

# Prints "4.0"
^x( print x ) 4

This defines an “anonymous function”– if you aren’t familiar with that idea, it’s a function that doesn’t have a name but just shows up in an expression, like a literal– that takes one argument and prints it. It then feeds the anonymous function the argument “4”.

With these couple things explained, here’s a more complicated Emily program that works in the current interpreter:

set .countup ^arg {
    set .count (arg.from)
    loop ^(
        print count
        print "\n"
        set .count ( count .plus (arg.step) )
        count.lt (arg.to)
    )
}

countup[ set .from 10; set .to 20; set .step 2 ]

Again, not very pretty, but this demonstrates a couple of interesting things that Emily tries to do.

To recap my last blog post a bit, everything in Emily is formally a function– everything. Objects are functions pretending to be objects, functions which take key names (in the form of strings like .plus– assume this is just a funny way of saying “plus”) and map them to values. Numbers are objects, which are functions. Scopes are objects (which are functions); the language sets up and shuffles around the scope functions silently, and it forwards any bare symbols it finds (like “set” or “count” in the examples) to the current scope function as a string argument. A lot of the time these various functions don’t really *act* like functions, of the kind we normally recognize (for this reason I’ve been calling them “blocks”, after Smalltalk) but the fact the language doesn’t distinguish the different kinds of functions semantically opens up a lot of possibilities for useful abstraction. The only “verb” in Emily is “apply”– apply function A to argument B. Everything else is just built on top of that.

Not a lot happens in this example, but a couple things do stand out: assigning a variable value just means invoking the “set” method on the current scope (i.e., passing .set to the current scope block, doing which returns a function that can be used to alter that scope block’s mappings). There are no special flow control constructs– “loop” is just a function that takes a closure as argument and executes it until it returns false. The thing that really stands out to me here though is the appearance of {} and []– the countup function body is wrapped in a {} to give it its own variable scope, and when we invoke countup the argument list is written as an “object literal”, denoted by []. What’s interesting here is how these two things are implemented by the language.

Parentheses in Emily create “groups”– I have to call them that because I used the word “block” for something else already, maybe a mistake. Any parentheses in Emily can contain multiple statements, separated by semicolons. A trick here is that group (), scoped group {}, and object literal [] are all the same thing, distinguished only by what the scope is within the group and what is returned at the end. More interestingly, there’s no “magic” to the group operators– effectively, what defines the different group types is a couple of lines of setup/teardown code that run at the beginning and end of the group, and that setup/teardown could in principle be written in Emily. In other words, {} and [] could in the final version just be implemented as macros! In the current implementation, all three group types are handled by the same code in the interpreter, with the difference being:

    (Assume any time a statement executes, its evaluated value is assigned to a variable named “last”)
    Plain groups: Set [scope] to the enclosing scope; after final statement return “last”.
    Scoped groups: Create a block [newscope], which prototype-inherits from the enclosing scope, and set [scope] to it. After final statement return “last”.
    Object literals: Create a block [object]. Create a block [newscope], which prototype-inherits from the enclosing scope. Assign [newscope].set = [object].set and [newscope].this = [object]. After final statement, return [object].
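To make those three rules concrete, here is a rough Python model of them. This is not the Objective-C interpreter's code. The names `Block`, `run_group` and `set_target` are made up for the sketch, and it assumes that a plain "set" in a scoped group always writes into the new inner scope, which the rules above don't actually pin down.

```python
# Hypothetical model: a "block" is a dict-backed scope with an optional parent.
class Block:
    def __init__(self, parent=None):
        self.table = {}
        self.parent = parent      # prototype-inherited enclosing scope, if any
        self.set_target = self    # where "set" writes; object literals redirect it

    def get(self, key):
        block = self
        while block is not None:
            if key in block.table:
                return block.table[key]
            block = block.parent
        raise KeyError(key)

    def set(self, key, value):
        self.set_target.table[key] = value


def run_group(kind, enclosing, statements):
    """Run statements (callables taking a scope) in a group of the given kind."""
    obj = None
    if kind == "plain":            # ( ): reuse the enclosing scope
        scope = enclosing
    elif kind == "scoped":         # { }: fresh scope inheriting from the enclosing one
        scope = Block(parent=enclosing)
    elif kind == "object":         # [ ]: fresh scope whose "set" targets a new object
        obj = Block()
        scope = Block(parent=enclosing)
        scope.set_target = obj     # [newscope].set = [object].set
        scope.table["this"] = obj  # [newscope].this = [object]
    else:
        raise ValueError(kind)
    last = None
    for statement in statements:
        last = statement(scope)    # every statement's value becomes "last"
    return obj if kind == "object" else last


top = Block()
args = run_group("object", top, [
    lambda s: s.set("from", 10),
    lambda s: s.set("to", 20),
    lambda s: s.set("step", 2),
])
print(args.table)  # {'from': 10, 'to': 20, 'step': 2}
```

The only differences between the three branches are the few lines of setup before the loop and the choice of return value after it, which is the whole point.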

Did I get too language wonky there? The point I’m trying to make here is that “an object literal syntax” is a really basic thing to a language, but this language kind of doesn’t have one, because it doesn’t need one. That’s just normal code running inside of the [], and the thing that makes [] “special”– the setup/teardown– could have been implemented by an end user. This is exciting to me because it implies end users can implement other language extensions, things as complicated as an object literal syntax, on their own.

The implementation

The prototype implementation of Emily is hosted on BitBucket, although it will not do you a lot of good right now. The main problem is that it is written in Objective-C, which means that it will only easily build if you have OS X and Xcode. Compiling on other operating systems is possible but will require setting up something like GNUstep. I haven’t investigated this myself, mostly because I intend to port to something-other-than-Objective-C rather than bother with GNUstep. If you do have a Mac, or do have GNUstep set up, you’ll probably find it pretty easy to set up and extend. The current implementation is only about five or six source files. (Sorry if the BitBucket page is a little confusing, they just changed their layout and now it’s all weird– the cloud with the arrow in it is the “download source tarball” button.)

Remember, this is literally the very first thing I got to run at all– as of this exact writing there’s no code documentation, no documentation of the exact language as implemented so far (the closest is the sample code files), and there are known memory leaks. I’m not even giving this a version number yet. Still, it’s a start! If you want to try actually using or extending it, feel free to contact me and I’ll help you get set up.

I designed a programming language and it looks like this.

April 22nd, 2014

For reasons I talk about here, I’m going to try to create a programming language. So far I’ve got basically a first-draft design.

There is a specific idea I decided I want to explore, when I do this: Programming languages duplicate too much. Programming languages often have multiple syntaxes that do very similar things, or multiple underlying concepts that do very similar things. It is sometimes possible to collapse these similar things into one thing, and when we do, I usually like the results better.

For example, many languages (Perl, Python, PHP) have both a dictionary type and an object type, but the two are used in effectively the same way; on the other hand Lua collapses dictionaries and objects into one type (tables), and makes an object field lookup identical to a dictionary string lookup. Or most object oriented languages distinguish objects and classes, but prototype based languages show that you can get by with just objects; if objects can inherit from other objects, then a “class” is just a pattern for a particular kind of object. When you collapse ideas together like this, or build language features on top of existing features rather than adding new primitives, you reduce both the amount of mental overhead in thinking about code implementation and also the amount of redundant syntax. There’s usually a lot of redundant syntax. C++ uses . to access a field from a reference, and -> to access a field from a pointer. Most languages use (x) to indicate an argument to a function, and [x] to indicate an index to an array. Why? If pointers and references were just special cases of one underlying concept, or if arrays and functions were, you could use one syntax for each pair and you wouldn’t have to mentally track what each variable is, you wouldn’t have to do all the obnoxious manual refactoring when you suddenly decide to replace a reference with a pointer somewhere or vice versa.

In the language I’ve been thinking about, I started with Lua’s “Table” idea– what if I built objects out of dictionaries?– and decided to take it one step further, and build both objects and dictionaries out of functions. In this language, there’s one underlying data structure that functions, objects, dictionaries, and some other stuff besides are just special cases of– design patterns of.

Taking a cue from Smalltalk, I’m going to call this underlying structure “blocks”.

Blocks are just functions

A block, for purposes of this blog post, is a unary function. It takes exactly one argument, and it returns a value. Anywhere in this blog post I say “blocks”, I could have just written “functions”. I’m going to mostly use the “block” jargon instead of saying “functions” because some blocks will be “used like” functions and some very much will not be.

In my language, you’ll define a function just with “=”:

    addOne ^x = x + 1

The ^x is an argument binding. The ^ signals to the language that the right side of the = needs to be a function body (a closure). If on the next line you just said

    y = addOne 3

That would just assign 4 to the variable “y”, it would not create a function.

Blocks are pattern-matched functions

A big part of this project is going to be that I really like the ideas in functional languages like ML or Haskell, but I don’t actually enjoy *writing* in those languages. I like OO. I want a language that gives me the freedom and expressiveness of FP, but comfortably lets me code in the OO style I use in Python or Lua or C++. So I’m going to steal as many ideas from FP languages as I can. Three really important ideas I’m going to steal are closures, currying, and pattern matching.

In case you don’t know those languages, let me stop and explain pattern matching real quick. You know how C++ lets you function overload?

    // In C++
    void addOneHour(int &k) { k = (k + 1) % 12; }
    void addOneHour(float &k) { k = fmod(k + 1.0, 12); }

Well, pattern matching is as if you could switch not just on type, but also on value:

    // In hypothetical-C++
    void addOneAbsolute(int &k where k > 0) { k = k + 1; }
    void addOneAbsolute(int &k where k < 0) { k = k - 1; }
    void addOneAbsolute(0) { } // Do nothing

That last line– the one demonstrating we could write a function whose pattern matches only *one single value*– is going to be important to this language. Why?

Blocks are dictionaries

In my language, if I want to assign more than one pattern to a single block, I just use = multiple times:

    factorial ^x = x * factorial (x - 1)
    factorial 0 = 1

“Factorial” is a block. The way I’m looking at it, a block is just a data structure which maps patterns to closures. It’s like a dictionary, but some of the keys (the ones with bound variables) match multiple values.
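Here is one way to picture that "patterns to closures" structure, sketched in Python. The class name `PatternBlock` and its methods are invented for illustration. The sketch also assumes that a single-value pattern like `factorial 0` takes priority over a bound-variable pattern even when it is defined later, which is what the factorial example needs but which I haven't formally specified.

```python
class PatternBlock:
    """A block modeled as exact-value keys plus ordered predicate patterns."""

    def __init__(self):
        self.exact = {}     # single-value patterns, e.g. factorial 0 = 1
        self.general = []   # (predicate, closure) pairs with a bound variable

    def define_value(self, key, value):
        self.exact[key] = value

    def define_pattern(self, predicate, closure):
        self.general.append((predicate, closure))

    def __call__(self, arg):
        if arg in self.exact:                 # assumption: exact matches win
            return self.exact[arg]
        for predicate, closure in self.general:
            if predicate(arg):
                return closure(arg)
        raise LookupError(f"no pattern matches {arg!r}")


factorial = PatternBlock()
factorial.define_pattern(lambda x: True, lambda x: x * factorial(x - 1))  # factorial ^x
factorial.define_value(0, 1)                                              # factorial 0
print(factorial(5))  # 120
```

A block with only `define_value` entries behaves like a plain dictionary, and one with only a match-anything pattern behaves like an ordinary function.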

However we could not bother assigning any bound-variable patterns, and then we’d just have a dictionary or an array:

    nameOfMonth 1 = "January"
    nameOfMonth 2 = "February"
    nameOfMonth 3 = "March"
    ...

Blocks are objects

Here I want to introduce a data type called an “atom”. This is an idea stolen from Erlang (and possibly Ruby?). Technically an atom is an “interned string”. It’s something that the programmer sees as a string, but the compiler sees as an integer (or a pointer, or something which has a constant-time comparison). You get at the atom by putting a . before a symbol; the symbol is the name of the atom:

    x = .atomname

It’s cheaper to compare atoms than strings (.atomname == .atomname is cheaper than “atomname” == “atomname”) and cheaper to use them as dictionary lookup keys. This means atoms work well as keys for fields of an object. Objective-C for example actually uses atoms as the lookup keys for its method names, although it calls them “selectors”. In my language, this looks like:

    constants.pi = 3.14
    constants.e = 2.71
    constants.phi = 1.61

Notice this looks like normal object syntax from any number of languages. But formally, what we’re doing is adding matching patterns to a function. What’s cool about that is it means we’ll eventually be able to use machinery designed for functions, on objects. Like to skip ahead a bit, eventually we’ll be able to do something like

    map constants [.pi, .e, .phi]

and this will evaluate to an array [3.14, 2.71, 1.61].
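The same idea in Python terms, as a small sketch: atoms are stood in for by ordinary strings, and `make_block` is a made-up helper that wraps a table in a unary function so that it can be passed straight to `map`.

```python
def make_block(table):
    """Return a unary function that maps atom names to values."""
    def block(key):
        return table[key]
    return block


constants = make_block({"pi": 3.14, "e": 2.71, "phi": 1.61})

# Because the "object" is just a function, function machinery works on it.
values = list(map(constants, ["pi", "e", "phi"]))
print(values)  # [3.14, 2.71, 1.61]
```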

What’s up with the square brackets? Oh, right. Well, I think all this “constants.” nonsense is gonna get kinda tiresome. So let’s say there’s a syntax like:

    constants = [ pi = 3.14, e = 2.71, phi = 1.61 ]

Notice I say “pi” and not “.pi”– on the left side of an =, the initial “.” is implicit. More on that in a moment.

One other thing. Inside of the [ ], there exists an implicit “this” variable, corresponding to the object the [ ] creates. So if you say

    counter = [
        count = 0
        increment ^x = { this.count = this.count + x }
        decrement ^x = { this.count = this.count - x }
    ]
    counter.increment 1
    counter.increment 3

Then at the end of this string of code “counter.count” is equal to four.

Blocks are prototypes

What if we want more than one counter object? Well, you’ll notice an interesting consequence of our pattern matching concept. Let’s say I said:

    counter = [
        init ^x = { this.count = x }
        increment ^x = { this.count = this.count + x }
        decrement ^x = { this.count = this.count - x }
    ]

    counter_instance ^x = counter x
    counter_instance.init 3
    counter_instance.increment 5

When we say “counter_instance.whatever”, the language interprets this as calling the block counter_instance with the argument .whatever. So if counter_instance is defined to just re-call “counter”, then on the next line saying “counter_instance.init 3” will fetch the block stored in counter.init, and then that block gets called with the argument 3. The way the “this” binding works is special, such that counter_instance.init gets invoked “on” counter_instance– “this” is equal to counter_instance, not counter.

The syntax we used to make counter_instance “inherit” is pretty ugly, so let’s come up with a better one:

    counter_instance.ditch = counter

I haven’t explained much about how = works, but when we say “counter_instance ^x = “, what we’re really doing is taking a closure with an argument binding and adding it to counter_instance’s implementation-internal key-value store, with the key being a pattern object that matches “anything”. “.ditch” is a shortcut for that one match-anything key slot. In other words, by setting counter_instance.ditch to counter, we are saying that counter is counter_instance’s “prototype”.
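A rough Python model of the ditch, to show how the “this” binding is meant to come out. `ProtoBlock`, `lookup` and `send` are names invented for this sketch. It assumes methods receive the original receiver as an explicit first argument, which is just one way the binding could be implemented.

```python
class ProtoBlock:
    def __init__(self, ditch=None):
        self.fields = {}
        self.ditch = ditch   # match-anything fallback, i.e. the prototype

    def lookup(self, key):
        block = self
        while block is not None:
            if key in block.fields:
                return block.fields[key]
            block = block.ditch          # fall through to the prototype
        raise KeyError(key)

    def send(self, key, *args):
        """Look up key along the ditch chain; call methods with this=self."""
        value = self.lookup(key)
        return value(self, *args) if callable(value) else value


def init(this, x):
    this.fields["count"] = x

def increment(this, x):
    this.fields["count"] = this.lookup("count") + x


counter = ProtoBlock()
counter.fields["init"] = init
counter.fields["increment"] = increment

counter_instance = ProtoBlock(ditch=counter)   # counter_instance.ditch = counter
counter_instance.send("init", 3)
counter_instance.send("increment", 5)
print(counter_instance.send("count"))  # 8
print("count" in counter.fields)       # False
```

The methods are found on `counter`, but the count ends up stored on `counter_instance`, so a second instance with the same ditch would get its own count.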

Something to make clear here: the lines inside a [ ] aren’t “magic”, like C++ inside a struct declaration or anything. They’re just normal lines of code, like you’d find inside a { }. The difference is the insides of [ ] are using a specially prepared scope with access to a “this” and a “super”, and at the end of the [ ] the [ ] expression returns the scope into which all these values are being assigned (“this”). The upshot is you could easily have the first line of your [ ] be something like an “inherit counter;” call that sets the ditch and does some various other fix-up to make this prototype system act more like some other kind of object system, like a class system (I like classes). This sort of thing is possible because

Blocks are scopes

Like most languages, this one has a chain of scopes. You’ll notice above I offhandedly use both ( ) and { } ; these are the same thing, in that they’re a series of statements which evaluate to the value of the final statement:

    x = ( 1; 2; 3 )

…sets x equal to 3. The 1; and 2; are noops. (Semicolon is equivalent, in the examples I’ve given here, to line-ending. There’s also a comma which is effectively a semicolon but different and the difference is not worth explaining right now.)

The one difference between { } and ( ) is that { } places its values into a new scope. What is a scope? A scope is a block. When you say

    a = 4

The unbound variable a is atom-ized, and fed into the current scope block. In other words “a” by itself translates to “scope.a”. When you create a new inner scope, say by using { }, a new scope block is created, and its ditch is set to the block for the enclosing scope. The scope hierarchy literally uses the same mechanism as the prototype chain.

Block constituents are properties (or: blocks are assignment statements)

Non-language geeks may want to skip this section.

I’ve been pretty vague about what = does, and that’s because it has to do several layers of things (matching items that already exist, binding variables, wrapping closures, and actually performing assignment). However, ultimately = must write [pattern, closure] pairs into one or more blocks. = cannot, however, actually write anything by itself. Ultimately, when = decides it needs to assign something, it is calling a “set” method.

    a = 4

Is ultimately equivalent to

    scope.set .a 4

That = is sugar for .set is a small detail, but it has some neat consequences. For one thing, since everything that happens in this language is curryable, it means you can trivially make a function:

    a_mutator = set.a

…which when called will reassign the “a” variable within this current scope (remember, “set” by itself will just be “scope.set”). For another thing, this means you can create a “property” for a particular variable:

    set.a ^x = ( b = x + 1 )
    a = 3

After this code runs, “b” will be equal to 4 (and “a” will still be equal to a function that mutates “b”).
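Both consequences can be mimicked in a few lines of Python. This is only a sketch under assumptions: `Scope` and its `setters` table are hypothetical stand-ins for a scope block and for pattern entries on its .set, and only the effect on “b” is modeled.

```python
class Scope:
    def __init__(self):
        self.values = {}
        self.setters = {}   # per-key overrides, like "set.a ^x = ..."

    def set(self, key):
        """Curried set: return a one-argument function that assigns key."""
        if key in self.setters:
            return self.setters[key]
        def assign(value):
            self.values[key] = value
        return assign


scope = Scope()
scope.set("b")(0)              # plain assignment: b = 0

b_mutator = scope.set("b")     # currying, like "a_mutator = set.a"
b_mutator(7)                   # b is now 7

# Property: assigning to "a" actually writes x + 1 into "b".
scope.setters["a"] = lambda x: scope.set("b")(x + 1)
scope.set("a")(3)              # a = 3
print(scope.values)            # {'b': 4}
```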

The existence of .set will also have some interesting effects once we have types and therefore a type checker. I’ve been kinda vague about whether = has “set” or “let” semantics– that is, if you assign to a variable does it auto-instantiate or must you predeclare it, if there is a variable by the assigned name in the ditch does assignment shadow in the assigned block or reassign in the parent block, etc. And the answer is it doesn’t much matter for purposes of this post, because any of the possible things that happen when you set a field (“not declared” error thrown, assigned to top-level block, assigned to a parent block) could just be all things that could and do happen in different blocks, depending on what that block’s .set is set to. For example, it would probably make sense for object blocks and scope blocks to have a different last-ditch “.set” behavior, or be sensible to allow different source files to have different “.set”s for their file-level scopes (“use strict”).

On that note, let’s talk about types. There’s a lot of very exciting stuff happening in the study of types in programming languages right now, both types as used in languages and types as used in extra-lingual static analysis tools. I don’t understand a lot of this research yet (and I want to learn) but I think I understand enough to have an idea of what’s possible with types right now, and that means I know how I want types in this language to work.

Blocks are types

Let’s say we have a syntax variable:type that we can use to constrain the arguments of a function.

    factorial ^x : int = x * factorial (x - 1)

When this function is called, there will be a runtime check: if “x” is not an int, the call fails at runtime. Let’s say we can use the a:b construct inside expressions too:

    square ^x = ( x * x:float ) :: stateless

Let’s say that :: instead of : indicates that the type is being applied not to the value returned by that parenthesis, but to the implicit “function” defined by the parenthesis itself. “stateless” is a type that applies to functions; if we assert a function is “stateless” we assert that it has no side-effects, and its resulting value depends only on its inputs. (In other words, it is what in another language might be called “pure”.)

There’s some kind of an inferred typing system in place. There’s a compile time type checker, and when it looks at that “square” function it can tell that since “x” is a float in one place inside the expression, that the “x” passed into square must itself be a float. It can also tell that since the only code executed in “square ^x” is stateless, that the function “square ^x” is also stateless. Actually the “stateless” is from the checker’s perspective unnecessary, since if the checker has enough information about x to know the * in (x * x) is a stateless operation– which, if it knows x is a float, it does know that– then square ^x would be stateless anyway.

There’s some kind of a gradual typing system in place. There is a compile-time step which, everywhere square ^x is called, tries to do some kind of a type-proving step and determine if the argument to square is a float. If it can prove the argument is a float, it actually omits the runtime check to save performance. If it *can’t* prove the argument is a float, or it can prove the argument *isn’t* a float, it adds the check and maybe prints some kind of compile-time warning. (To stress: some of these properties, like “stateless”, might be in many cases *impossible* to prove, in which case the checker is conservative and treats “can’t prove” as a failure.) Besides omitting safety checks, there are some other important kinds of optimizations that the type checker might be able to enable. Critically, and this will become important in a moment, if a function is stateless then it can be potentially executed at compile time.

So what are types? Well, they’re just functions. “int” and “stateless” are language-builtin functions that return true if their arguments are an int, or a provably stateless function, respectively. (For purposes of a type function, if the type *doesn’t* match, then either a runtime failure or a return false are okay.) Types are values, so you can construct new ones by combining them. Let’s say that this language has the || and && short-circuit boolean operators familiar from other languages, but it also has & and | which are “function booleans”– higher level functions, essentially, such that a | b returns a function f(x) which is true if either a(x) or b(x) is true. So if “stateless” and “nogc” are two of the builtin type functions, then we can say:

    inlineable = stateless & nogc

And if we want to define a totally unique type? Well, you just define a function:

    positive ^x = x > 0
    sqrt ^x : positive = x / 2    # Note: There might be a bug here

Obviously you can’t use just any function here– there would have to be some specific type condition (probably something like the “inlineable” I describe above) that any function used as a type in a pattern would be required to conform to. This condition would begin and end with “whatever the type checker can efficiently prove to apply or not at compile-time”.
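The “types are just functions” idea can be tried out in Python, with the big caveat that everything here happens at runtime, so none of the compile-time proving is modeled. The helper names (`both`, `either`, `checked`, `halve`) are invented for this sketch. `both` and `either` stand in for the function booleans & and |.

```python
def is_int(x):
    return isinstance(x, int) and not isinstance(x, bool)

def positive(x):
    return x > 0

def both(a, b):      # models "a & b": true only if both type functions accept x
    return lambda x: a(x) and b(x)

def either(a, b):    # models "a | b": true if at least one accepts x
    return lambda x: a(x) or b(x)

def checked(type_fn):
    """Decorator: run the type function on the argument before calling."""
    def wrap(fn):
        def guarded(x):
            if not type_fn(x):
                raise TypeError(f"{x!r} fails {type_fn.__name__}")
            return fn(x)
        return guarded
    return wrap


positive_int = both(is_int, positive)   # a new type built from two existing ones

@checked(positive)
def halve(x):        # stands in for the deliberately wrong "sqrt" above
    return x / 2

print(halve(8))          # 4.0
print(positive_int(3))   # True
print(positive_int(2.5)) # False
```

Calling `halve(-1)` raises a `TypeError`, which plays the role of the runtime failure on a pattern mismatch.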

Let’s finally say there’s some sugar for letting you define these “type condition” functions at the same time you define the function to whose parameters they apply; we could reduce that last block down to

    sqrt (^x > 0) = x / 2    # Square root implementation, WIP 2

One other bit of sugar that having a type system makes easy:

Blocks are argument lists

So everything so far has been a unary function, right? There’s only so much we can do with those. This language is set up for currying– that’s how method lookup works, after all– and I would like to offer explicit sugar for curry:

    curryadd ^x ^y = x + y

But ehh, I don’t actually like using currying for everything. I like argument lists. And I really, *really* like named arguments, like Python uses. Let’s say we have this syntax:

    divide [^numerator, ^denominator = 1] = numerator / denominator

The “parameters” block there? Is totally just a block. But there’s some kind of block wiring such that:

    divide [4, 2]           # Evaluates to 2
    divide [4]              # Evaluates to 4-- "denominator" has a default argument
    divide [9, denominator=3]                       # Evaluates to 3
    divide [denominator = 4, numerator = 16]        # Evaluates to 4
    divide [ ]       # Compile-time error -- assignment for "numerator" not matched

There’s some sort of block “matching” mechanism such that if the argument block can be wired to the parameter block, it will be. I don’t have an exact description handy of how the wiring works, but as long as blocks remember the order in which their (key, value) pairs are assigned, and as long as they can store (key, value) pairs where exactly one of key and value is (no value), then such a matching mechanism is at least possible.
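To show such a mechanism is at least possible, here is one candidate wiring written out in Python. It is a guess at the rules, not a spec. `wire`, `REQUIRED` and `DIVIDE_PARAMS` are made-up names, positional values are assumed to fill parameters in declaration order, and mismatches surface as runtime errors here rather than compile-time ones.

```python
REQUIRED = object()   # marks a parameter with no default ("no value")

def wire(params, positional=(), named=None):
    """Bind an argument block to an ordered parameter block.

    params: list of (name, default) pairs in declaration order.
    """
    named = dict(named or {})
    names = [name for name, _ in params]
    if len(positional) > len(params):
        raise TypeError("too many positional arguments")
    bound = dict(zip(names, positional))      # positional values fill in order
    for name, value in named.items():
        if name not in names:
            raise TypeError(f"unknown argument {name!r}")
        if name in bound:
            raise TypeError(f"{name!r} given twice")
        bound[name] = value
    for name, default in params:              # fill remaining slots from defaults
        if name not in bound:
            if default is REQUIRED:
                raise TypeError(f"assignment for {name!r} not matched")
            bound[name] = default
    return bound


DIVIDE_PARAMS = [("numerator", REQUIRED), ("denominator", 1)]

def divide(positional=(), named=None):
    args = wire(DIVIDE_PARAMS, positional, named)
    return args["numerator"] / args["denominator"]


print(divide([4, 2]))                                      # 2.0
print(divide([4]))                                         # 4.0
print(divide([9], {"denominator": 3}))                     # 3.0
print(divide(named={"denominator": 4, "numerator": 16}))   # 4.0
```

`divide()` with no arguments raises a `TypeError` because “numerator” has neither a value nor a default.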

My expectation is that almost all functions in this language will use the argument blocks for their parameters, and almost all invocations will have an argument block attached.

Blocks are macros

I wanna go back here and look at something closer: We’ve defined that there’s some subset of this language which can be run at compile time, and that the type checker can identify which functions are in that category. I think this is a pretty powerful concept, because it means the language can use *itself* as its macro language.

So far in this post, you’ve seen three main kinds of syntax in the code samples: Unary function application (again, a field lookup like a.b.c is really just a bunch of currying), “=”, and little extra operators like “+”. What I’m going to assert is that the extra operators– and also maybe =, and maybe even [ ]– are actually just rewrite rules. So for the line:

    3 + square 4

Before actually being executed, this line is transformed into

    3 .plus ( scope .square 4 )

“3”, like anything else, is a block. Like in Io or Self, adding three to four is just invoking a particular method on the 3 object. In this language “+”, the symbol, is just a shortcut for .plus, with parser rules to control grouping and precedence. (If we actually just wrote “3 .plus square 4”, then the currying would try to interpret this as “(3 .plus square) 4”, which is not what we want.)

There’s some kind of a syntax for defining line-rewrite rules, something like:

    op [ symbol = "!", precedence = 6, replace = .not, insert = .unary_postfix, group = .right_inclusive ]
    op [ symbol = "*", precedence = 5, replace = .times, insert = .infix, group = .both ]
    op [ symbol = "+", precedence = 4, replace = .plus, insert = .infix, group = .both ]
    op [ symbol = "==", precedence = 3, replace = .eq, insert = .infix, group = .both ]
    op [ symbol = "&&", precedence = 2, replace = .and, insert = .infix, group = .both ]
    op [ symbol = "||", precedence = 1, replace = .or, insert = .infix, group = .both ]

Which means for something like

    result = 12 * 2 + 9 == 3 + 8 * 4
    result = !parser.valid 34 && result

Ultimately what’s actually being executed is:

    scope .set .result ( ( ( 12 .times 2 ) .plus 9 ) .eq ( 3 .plus ( 8 .times 4 ) ) )
    scope .set .result ( ( ( scope .parser .valid 34 ) .not ) .and ( scope .result ) )
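The infix part of that rewrite can be sketched with ordinary precedence climbing. The Python below is only an approximation under assumptions: it handles the five infix ops from the table, treats each operand as a single token, treats every op as left-associative, and ignores the unary “!” rule, the “group” field, and the scope lookups. `OPS` and `rewrite` are names invented here.

```python
# (precedence, replacement) per infix symbol, copied from the "op" table above.
OPS = {
    "*": (5, ".times"),
    "+": (4, ".plus"),
    "==": (3, ".eq"),
    "&&": (2, ".and"),
    "||": (1, ".or"),
}

def rewrite(tokens):
    """Rewrite a flat infix token list into grouped method applications."""
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]            # operand: a single token in this sketch
        pos += 1
        while pos < len(tokens) and tokens[pos] in OPS:
            prec, name = OPS[tokens[pos]]
            if prec < min_prec:       # a looser operator belongs to the caller
                break
            pos += 1
            right = parse(prec + 1)   # tighter operators bind first (left-assoc)
            left = f"( {left} {name} {right} )"
        return left

    return parse(1)


print(rewrite("12 * 2 + 9 == 3 + 8 * 4".split()))
# ( ( ( 12 .times 2 ) .plus 9 ) .eq ( 3 .plus ( 8 .times 4 ) ) )
```

The output matches the grouping on the right-hand side of the first rewritten line above.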

So this is fine for symbols like + and - which operate on two clearly-defined values, but what about something more complicated like “=”? Well, there ought to be some kind of way to pass “op” a .custom function, which takes in a list of lexed tokens representing a line and returns a transformed list of tokens. At that point you can do pretty much anything. “=” might be the *one* thing that you can’t implement this way because = does special things involving adding bindings. But short of that, custom “op”s would be sufficient even for things like, I don’t know, flow control:

    if ( a == 4 ) { k.x = 3 } else { k.x = 4 }

I may be getting into the language-geek weeds again here but I’m gonna walk through this: Let’s say I have a higher order function “if ^pred ^exec” which takes functions “pred” and “exec”, executes “pred” (pred is probably nullary… which I haven’t decided what that means in this language yet), if the result is true it executes “exec” and returns the void combinator (v ^x = v), if the result is false it returns a function which expects as argument either .elsif (in which case it returns if) or .else (in which case it returns a function that takes a nullary function as argument and evaluates it). We’ve now defined the familiar if…elsif…else construct entirely in terms of higher order functions, but actually *using* this construct would be pretty irritating, because the “pred” and “exec” blocks couldn’t just be ( ) or { } as people expect from other languages, they’d have to be function-ized (which means annoying extra typing, toss some ^s in or however lambdas are made in this language). But, we can declare “if”, “else” and “elsif” rewrite ops: “if” ^-izes the next two tokens and then replaces itself with just “if” again; “else” and “elsif” ^-ize the next one token each and then replace themselves with .else or .elsif. If we do this, then the familiar if… else syntax above just *works*.
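The higher-order half of that walkthrough, before any rewrite ops are applied, looks roughly like this in Python. It is a sketch under assumptions: nullary functions are modeled as zero-argument lambdas, the atoms .elsif and .else are plain strings, and `if_` and `void` are names invented for the sketch (`if` is reserved in Python).

```python
def void(_arg=None):
    """The void combinator (v ^x = v): absorbs any argument, returns itself."""
    return void


def if_(pred):
    def take_exec(exec_):
        if pred():
            exec_()
            return void              # swallow any trailing elsif/else chain
        def on_false(keyword):
            if keyword == "elsif":
                return if_           # start over with a new pred/exec pair
            if keyword == "else":
                return lambda body: body()
            raise ValueError(keyword)
        return on_false
    return take_exec


log = []
a = 5
if_(lambda: a == 4)(lambda: log.append("k.x = 3"))("else")(lambda: log.append("k.x = 4"))
print(log)  # ['k.x = 4']
```

All the lambdas in that last call are exactly the “annoying extra typing” the rewrite ops are supposed to remove.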

…why am I going into all this, about “if” “else”? Well, because I want to stress that it means *flow control constructs can be implemented in the language itself*, and they will be truly first-class equals with builtins like “if” or “while”. In my eventual vision of this language, the *only* language-level syntactical elements are

    .
    ^
    ( )
    [ ]
    { }
    ;

And *everything* else, including comment indicators and the end-of-line statement-terminator, is just rewrite rules, ideally rewrite rules written in the language itself. Which implies if you don’t like the language’s syntax much, you could just unload the builtin “stdops” module that contains things like “+” and “if”, and substitute your own. “op” rules are local to scopes, so syntax could vary hugely file to file. Which… well, shouldn’t it? I know people who avoid entire languages because they don’t like one or two things about the syntax. Say, people who go “well, Objective-C has a neat object model, but I can’t get used to all those square brackets”. Or I in my last blog post specifically said that although they both have lots of features I like, I personally won’t use LISP because I can’t make visual sense of S-expressions, I won’t use Javascript because of the casting rules. None of this makes any sense! Languages should be about *features*. They should be models of computation, and we should be evaluating them based on how expressive that model is, based on the features of the underlying flow control or object model or type system or whatever. Syntax shouldn’t have to be part of the language selection process, and if languages let us put the sugar on ourselves instead of pre-sugaring everything then it wouldn’t have to be. I’m probably getting carried away here. What was I talking about? Did I say something just now about casting rules? Let’s talk about casting rules.

Blocks are language machinery

Some syntactical elements, like [ ] and =, might be too complex for the programmer to plausibly implement themselves. The programmer should still have a fair amount of control over how these builtins work. One way to do this would be to have things like [ ] and = implicitly call functions that exist in the current scope. For example, instead of calling .set, = might call a function “assign” that exists in the current scope; this would allow individual scopes to make policy decisions such as the variable auto-instantiation rules I mentioned earlier. [ ], at the moment it instantiates the new block, might call a function “setup” that exists in the current scope, allowing the programmer to do things like change the default ditch (base class) or the exact meaning of “inherit”. There might be a function that defines the default type constraints for numbers, or strings, or lines of code. Maybe somewhere there’s a Haskell fan who wants to be able to have every ( ) wrapped up to be ^-ized and every line wrapped in ( ) :: stateless, so that any code *they* write winds up being effectively lazy-evaluated and side-effect-free and they can only communicate with the rest of the language using unsafe monads. They should be able to do that.
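As a rough illustration of the “assign” idea, here is a small Python sketch in which each scope carries its own assignment policy. `Scope`, `assign_auto` and `assign_strict` are hypothetical names, and the two policies are just guesses at what a scope might want; nothing here is meant as the actual design.

```python
class Scope:
    """A scope whose '=' defers to an 'assign' function chosen per scope."""

    def __init__(self, assign):
        self.variables = {}
        self._assign = assign

    def declare(self, name, value=None):
        self.variables[name] = value

    def set(self, name, value):
        # Stand-in for what "name = value" would expand to in this scope.
        self._assign(self.variables, name, value)


def assign_auto(variables, name, value):
    """Policy: assigning to an unknown name silently creates it."""
    variables[name] = value


def assign_strict(variables, name, value):
    """Policy: assigning to an undeclared name is an error."""
    if name not in variables:
        raise NameError(f"{name!r} was never declared in this scope")
    variables[name] = value
```

The same `set` call then behaves differently depending on which policy the enclosing scope installed, which is the kind of per-scope decision the paragraph above has in mind.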

One thing I definitely want in is for there to be something like a “fallback” function which, if a particular block is called with an argument whose type doesn’t fit any pattern the block has defined, attempts to map the argument to one of the patterns the block *can* handle. In other words, questions about whether different but interconvertible types like ints and floats can be converted between without a cast would be a decision made on a per-project or per-file basis. Or for example if there’s a function

    square ^x:int = x*x

and one of the patterns on the fallback block is

    fallback ^fn : function( ^type, _ ) [^x : type] = fn x    # Follow all that?

(Let’s assume “function” is a higher-order type function such that function(a,b) is the type of a function a -> b, and let’s assume _ has the magic property of “match anything, but don’t capture it” when used in a pattern.)

…then even though the function is only defined for (square x) we could totally get away with calling square[ x ], because the fallback function could match [ x ] to x.
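Here is one way the fallback idea could be mocked up in Python, as a sketch only. `Block`, `NO_MATCH` and `unwrap_singleton` are invented names, patterns are reduced to plain `isinstance` checks, and a Python list with one element stands in for [ x ].

```python
NO_MATCH = object()  # sentinel: "this fallback rule does not apply"


class Block:
    """A function defined by typed patterns, with user-supplied fallback rules."""

    def __init__(self, fallbacks=()):
        self.patterns = []  # (type, function) pairs, tried in the order assigned
        self.fallbacks = list(fallbacks)

    def define(self, typ, fn):
        self.patterns.append((typ, fn))

    def _match(self, arg):
        for typ, fn in self.patterns:
            if isinstance(arg, typ):
                return fn
        return None

    def __call__(self, arg):
        fn = self._match(arg)
        if fn is not None:
            return fn(arg)
        # No pattern fits: ask each fallback to convert the argument, then retry.
        for convert in self.fallbacks:
            converted = convert(arg)
            if converted is not NO_MATCH:
                fn = self._match(converted)
                if fn is not None:
                    return fn(converted)
        raise TypeError(f"no pattern or fallback accepts {arg!r}")


def unwrap_singleton(arg):
    """Fallback rule: treat a one-element list as the element itself."""
    if isinstance(arg, list) and len(arg) == 1:
        return arg[0]
    return NO_MATCH


square = Block(fallbacks=[unwrap_singleton])
square.define(int, lambda x: x * x)
```

With this in place `square(4)` and `square([4])` both go through the int pattern, while `square("4")` is rejected because no fallback rule knows what to do with a string. This version does the lookup at call time on every call, so it says nothing about whether the idea can be made performant.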

Uh, incidentally, I’m not totally sure this thing with the fallback function is actually in general possible or possible to make performant. But as with most of the stuff in this language, I think it would be fun to try!

Blocks are C++ or Javascript objects in disguise, potentially

There’s one last thing I want to talk about here, although it’s one of the most important features from my perspective. The model we have here– where formally speaking all field accesses execute functions, all field assignments execute functions, and there’s some kind of type checker at work capable of tracking fine detail about what kinds of operations get performed on individual blocks– means that the underlying language-level implementation of “a block” could differ from block to block.

The model I’ve described here for blocks is extremely dynamic and flexible– *too* flexible, such that it would be very difficult to make code using all these dynamic features performant. Except not every block will be using all of the features blocks have. Some blocks will only contain “value” keys (i.e. never a ^var:type pattern), and the type inferrer will be able to prove this the case. The compiler/interpreter could represent this one block internally as a plain hashtable, rather than taking the overhead to enable executing arbitrary code on every access. Some blocks, despite being mutable, will have a fixed known set of keys; the language could maybe represent these in memory as plain structs, and translate atoms to fixed memory offsets at compile time.

And some blocks, at the programmer’s direction, might be doing something else altogether. It’s easy to imagine a “proxy object” where each invocation of an atom and an argument on the block is actually copying the atom and argument and shipping them into another thread or across a network, and the type checker ensures the contract is followed and objects are actually copyable; you could build an Erlang style messaging system this way.
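A very small Python sketch of such a proxy, under heavy assumptions: `Proxy` and `Inventory` are hypothetical names, a thread and a `queue.Queue` stand in for “another thread or across a network”, and `copy.deepcopy` failing at runtime stands in for the type checker proving that arguments are copyable.

```python
import copy
import queue
import threading


class Proxy:
    """Ships (atom, argument) pairs to a worker thread as copied messages."""

    def __init__(self, target):
        self._inbox = queue.Queue()
        self._worker = threading.Thread(
            target=self._serve, args=(target,), daemon=True)
        self._worker.start()

    def _serve(self, target):
        while True:
            message = self._inbox.get()
            if message is None:  # shutdown signal
                return
            atom, arg, reply = message
            try:
                reply.put((True, getattr(target, atom)(arg)))
            except Exception as error:
                reply.put((False, error))

    def send(self, atom, arg):
        reply = queue.Queue(maxsize=1)
        # Copy the argument so the two sides never share mutable state.
        self._inbox.put((atom, copy.deepcopy(arg), reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        self._inbox.put(None)
        self._worker.join()


class Inventory:
    """Example object living on the other side of the proxy."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)
        return len(self.items)
```

Calling `proxy.send("add", crate)` looks like an ordinary atom-plus-argument invocation to the caller, but the receiving object only ever sees a copy of `crate`.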

Of particular interest to me, some blocks might actually be guests from some totally other system, say a different language with its own object model. An FFI for some other language could make wrapper blocks for that language’s objects, and put in place type guarantees that the programmer does not interact with those blocks in any way the guest language does not support. The two languages I’d personally really like to be able to interface with this way are C++ and Javascript, because these languages have valuable platform and library support, but also are languages I do not actually want to *write*.

C++ in particular interests me, because I’m not aware of any languages which are “higher level” in the sense that interests me but which can currently adequately interface with C++. C++ is actually pretty tricky to interface with– the big problem here, to my mind, being that method calling conventions (name mangling) vary from compiler to compiler. Actually, on some platforms (by which I mean “Windows”) it’s the case that shared libraries (DLLs) can’t be shared between compilers even if you are writing in C++ yourself. It would probably be necessary, if making a C++ FFI, to target one particular compiler (I’d vote Clang, because it’s extensible and has good platform support). Choosing to target one particular compiler would have a neat side effect: With some knowledge of the compiler’s implementation details, it *ought* to be possible to make blocks that inherit from C++ classes, and have those blocks actually construct fake vtables at runtime that jump into the compiled code for (or interpreter for) my language. Since in my language “classes” and “objects” get constructed by calling functions whose execution could be potentially deferred to runtime, it would be essentially invisible to the programmer when they say [ inherit QObject; objectName = "Block" ] whether a normal block or a pseudo-C++ class is being constructed.

Okay?

Anyway, here’s what I think I’ve got here. I started with one single idea (pattern-matched unary functions that remember the order in which their patterns were assigned), and asked the question “how much of what a language normally does could I collapse into this one concept?”. The answer turns out to be “very nearly EVERYTHING”, including stuff (like type specifications, macros and FFIs) that most languages would wind up inventing effectively an entire sub-language just to support (templates… ugh). I actually *do* want a new programming language, mostly because of that thing I mentioned with not liking any existing language’s C++ interop, and I actually do intend to at least attempt this project. I’m basically just gonna download the Clang source at some point and see how far I get. One thing in my favor is that since this language is based on a small number of simple things that interact in complex ways, I could probably get a minimal implementation going without too much difficulty (especially if I don’t go for types on the first pass).

Oh, one very final thing: I never said what all of this is called. In my head I’m planning to name this language either Emily, because despite being fundamentally OO it strikes me as a fairly “ML-y” language; or Emmy, after Emmy Noether. I’ll decide which later.

That’s all.

Note to commenters: Harsh criticisms are very much welcomed. Criticisms based on demographics or assumptions about my background are not. Thanks.

Statement of Intent

April 22nd, 2014

I think I have decided that these are the things I most want in a programming language:

    1a. Objects with string keys, which double as dictionaries.
    2a. Closures.
    3a. A way to store a closure in an object and invoke it with the relevant object bound to some variable within the function (“this”).
    4a. Prototype-based inheritance for objects.
    5a. The potential to instantiate a closure, or an arbitrary singleton object, inside of an expression (“lambda”).
    6a. Gradual typing (with calls from untyped code into typed code being protected by something like a check and an exception throw at runtime).
    7a. Two-way communication with C++ (C++ objects may be visible within the language, language objects may be passed into C++ code).
    8a. Two-way communication with Javascript.
    9a. Support for multiple concurrent threads of execution, with interaction between by means of message passing.
    10a. It must be possible for me to develop a program on one operating system and execute it on another operating system (i.e. there is a VM or a cross compiler).

And here are some things that I think would be very nice in a programming language:

    1b. Pattern matching.
    2b. Syntax support to allow me to declare curried functions, as transparently as if they were non-curried functions.
    3b. Function invocation with named arguments (i.e. Python keyword args).
    4b. Suspension and resumption of stacks (coroutines or generators).
    5b. Atoms (interned strings which can be used as object keys without paying for a string lookup).
    6b. Properties (i.e. overloading of assignment).
    7b. The language itself should be ultimately defined as an AST, with multiple possible syntaxes reducing down to that single AST.
    8b. Language features (ints, exceptions) should be possible to enable or disable for individual programs or pieces of code. Not all features fit all projects.
    9b. There should be a potential for (arbitrary and user-defined?) type “adjectives” which are assertions that are remembered as part of the type and checked as part of the type checker. (Is this object mutable? Is this object mutable with regard to its set of keys? Does this function have side effects? Are these two properties recursive for the related object/call graph? Does this function ever use “this”? Is this object ever accessed with a non-atom key? Is this integer greater than three?)
    10b. Potential to directly manipulate the scope of a closure (I.E. mappings of unbound variables in the function body) after function declaration.
    11b. The collapsing of functions and objects. (“Look up key ‘f’ on object a” and “Invoke method a with unary argument ‘f’” should be the same operation, same syntax, same behavior. Function scope and prototype chain should be, insofar as the user can tell, the same mechanism).
    12b. Customizable automatic casting rules (if we have 1b, applied by a user-selected pattern-matched function).
    13b. Syntax support for object “classes” that have the familiar semantics of Python or Java. Ideally, classes would just be a pattern implemented on top of prototypes, but the syntax should make this pattern easy and the class identity should be visible to things like debuggers.
    14b. Some simple user-configurable syntax redefinition (like the ability to define an operator ^ where invoking 3^4 should be actually interpreted as exponent(3,4), and ^ has a user-defined operator precedence, etc).
    15b. The ability to have some sort of object which is ultimately represented at runtime as a packed memory buffer, such that saying “set element 4 of x to 3” is actually writing “3” into a specific well-defined byte or bytes in memory. (This is a useful thing both because of performance optimizations it makes possible, and for interacting with hardware like GPUs which only understand byte streams.)

I would like to explore ways to get to a point where I have a programming environment that satisfies everything from the A list and some emotionally satisfying number of items from the B list. In seeking such a language, there are incidentally some conditions that any language I spend time using will have to satisfy. I don’t want to have an argument with anyone about this final list; just think of them as personal preferences.

    1x. It must not require me to program with S-expressions.
    2x. It must not require the use of the JVM or CLR. (I am potentially willing to compromise on the CLR.)
    3x. It must not force me to program in a way which avoids the use of side-effects or state.
    4x. It must not require me to use any closed-source compiler, core library, or development tool.

No language currently exists which provides everything on my A list. Some of the items on the B list, particularly 10b and 11b, do not exist in any language that I am aware of. Most of the most interesting items on the B list only seem to exist in languages which fail one or more items on the X list. Any currently existing language will be a compromise. How fair do these compromises look?

If I use Lua, I am happiest. Lua gets me 1a, 2a, 3a, 4a, 5a, 10a, 4b, 6b, and 13b, 15b if I’m using Luajit, and since I have 5a I can awkwardly approximate 3b. No language I know of does 1a better than Lua. Unfortunately, 7a and 9a, which Lua does *not* support, are two of my *most* important conditions– I literally can’t develop without those two, whereas missing 1a through 5a will merely make me sad.

    (Lua gets something close to 7a– two-way communication with *C*– and two-way communication with C can be used to construct trampolines that *approximate* communication with C++, but in my experience using this technique is so frustrating I would rather avoid Lua altogether than attempt it again. There are also incidentally projects which purport to offer 8a for Lua, but I have not seen these projects actually working and I believe that they would need additional development before being actually used.)

If I use Javascript, I do get 1a, 2a, 3a, 4a, 5a, 8a obviously, 10a, and 6b. If I am using Javascript ES6, I also get 9a, 3b and 4b. Actually, Javascript does pretty well against my lists. Unfortunately, some of the personally important items on the list Javascript satisfies, it messes up quite badly. For example it offers 4a, but uses a baroque method for defining an object with a prototype, and it does not offer any way when overriding a prototype method to invoke the “super” method. That last one’s pretty huge; “super” is a basic feature of all object systems, and since it is present in Self, I would argue that without “super” you don’t have a prototype-based inheritance system at all. On similar lines, Javascript offers 2a but essentially ruins it by not (before ES6) offering proper block scope; if I can’t readily create a closure in (for example) a loop, that is so awkward I probably won’t ever use closures at all.

There is another problem with Javascript: It is an unending cavalcade of horrors. Javascript feels *shoddy*. Many extremely basic language features are riddled with weird exceptions, and exceptions to exceptions, such that you never feel certain what the code you have just written does; many kinds of simple typos result in a silent failure or unexpected behavior rather than any kind of error. (The type coercion rules alone would feel like a great argument against using Javascript even if everything else about the language were perfect.) I very much like what is in this toolbox, but the individual tools all feel as if they will fall apart in my hands if for example I hit anything too hard with that hammer. This shifting-sand problem is exacerbated by the dramatic variance between Javascript versions and implementations; the primary benefit of Javascript is its huge installed base, but only a small subset of the syntax is “safe” in the sense that an acceptable portion of your installed base is compatible with it. This is an unfortunate property when we are talking about mere syntactical convenience features: 3b and 4b are nice, but I would not literally give up users to get them.

    (I have not investigated if Javascript satisfies 7a).

If I use C++, I get 4a (or rather 13b, class-inheritance alone, which is close enough for me), 6a (not gradual typing but strong typing, which I don’t mind), 7a (sort of– assuming you never dynamically link against anything from a competing C++ compiler), 9a (with memory sharing instead of events, but you can add the events via a library), 10a (with *great* difficulty), 15b (although since 15b is *all* C++ offers for data storage, it isn’t very pleasant), and an extremely limited 14b that can be used to get 6b (although only for some kinds of variables). There is also a horrible, confusing mockery of 12b; really, the language would be better off if it had not tried to offer this feature. If I use C++11, I get 2a, but it’s gross (the closures capture variables manually, rather than automatically; they’re only created if you use the “lambda” feature, which is totally separate from any other kind of function or method; and the syntax is incredibly ugly). C++11 also has “auto”, which is… one step closer to my ideal form of 6a than strong typing alone would be, I guess. Overall C++ and C++11 don’t offer *anything* exactly the way I’d most prefer it, and the only items on my list I’d even feel C++ is really “good enough” at are 4a/13b and 9a.

    (At this point one notices something interesting about my criterion 7a: Not even C++ does very well against it. This is a hard criterion to meet.)

I’ll leave it as an exercise to the reader to test my criteria against Python, ML, Haskell, and Erlang.

Looking at my options, I feel that if I could somehow hack Luajit to offer 6a and 7a and also support Lua Lanes (which provides 9a), I’d have a language I was totally satisfied with. I spent a while seriously considering doing just that. However, Luajit is big, and 7a in *particular* is such a hairy thing to implement that I wouldn’t want to attempt it in someone else’s codebase.

So I think if I want a language that satisfies all of these things, it would have to be a new language. Along those lines, I have written a follow-up post here.

Double Union Game Jam recap

February 23rd, 2014

Yesterday I cohosted with a woman named Snail a game jam at Double Union, a feminist hacker space in San Francisco. The game jam was targeted at women and gender-nonbinary people and the goal was to reach out to people who might not be participating in SF’s existing indie/gamejam culture. We opened with a class on Twine I taught (my class notes are here) and then ran for about four hours before showing what we’d made. It seemed to go really well! We got about 12 people plus the normal Double Union Saturday crowd, and every single person who stayed to participate wound up making a game.

Here’s some of the games we made– all of them can be played in a browser:

Gina made A Grim Task, a really intense short story about volunteering at a nonprofit.

Haley made “LET ME TELL YOU ABOUT LIL B THE BASED GOD”. The name describes it pretty well.

Wonja made two games:
Lizard Quest, an INCREDIBLY AMAZING adventure game.
Hallway, a, uh… experimental experience.

Kanane made The Huntress, an impressively polished fantasy story.

I actually made a couple small things!:
San Francisco Quest, which I actually made during the Twine class at the beginning as a demonstration of how to use Twine. It’s very very short, probably the smallest a game could legally be without getting into trouble with the Game Police.
I then *tried* to make a Pacman clone, but it took me a really long time to fix some bugs where I couldn’t get anything at all to draw (it turns out C++/OpenGL is a really bad choice for a four-hour jam). By the time I got that fixed, I was out of time, so I couldn’t add anything like interactivity. So I decided the result was Social Anxiety Pac Man, a game where all you can do is stand in a maze immobilized by your choices. (The link goes to an image, not the game. Trust me, it’s the same experience.)

Snail, my cohost, made Another City, a really funny hallucinatory story that to me feels reminiscent of San Francisco Quest only better. She says this link is a “demo” she’ll finish later; some of the paths don’t complete.

Seanna made How To Be Successful. This one is actually really cool, it’s a Twine game consisting of nothing except pictures.

Alicia made the one non-Twine game of the jam, Make the kitties happy!. This one is totally adorable, you have some cats and you have to make them all happy at the same time by giving them what they want. Hint: You can change the tool selected (at the top) by clicking on it. (You may have to run this in Firefox.)

Jenn made Lady Bird and Tiger Bear, a short fairy tale.

And our two last participants, Tyler and Hannah, made complex Twine stories about, respectively, memories you can hold in your hand and an epic struggle to make yourself a sandwich. These haven’t been posted on the Internet yet but I played the sandwich one and it was really good.

Overall I’m really happy with how the jam turned out. We talked at the end and we are planning to do this again at minimum and hopefully expand this into a larger recurring women+nonbinary game making group. There seemed to have been more people who were interested in participating but couldn’t because of the limited space at DU, so we want to give them a chance to do that next time. Our current plan now that the jam is out of the way is to start doing a monthly Twine writer’s group at Double Union, and in a couple months do another game-making-class+gamejam like the one yesterday. Next time we’ll be teaching Stencyl. We should have more details soon, so keep an eye out!

A Game of the Year 2013 Poll: Results

January 14th, 2014

CLICK HERE TO JUMP TO THE PRETTY COLOR-CODED FULL RESULTS

This explanation will look a lot like that of previous years, but:

Every year since 2004 I’ve been hosting this Game of the Year poll for the users of some forums I read. There are a lot of GOTY polls out there, but this one I think is kind of special. Most polls, you’re given a list of four or five options and you’re asked to pick the one you liked best. This poll, people are given a list of a couple of hundred options, consisting of every new game released in the previous year– and asked to rate their top ten or twenty.

This does a few interesting things. First off, we get to see all the information about what people’s second, third etc choices are. Second off, because the second, third etc choices count, people are more likely to vote for the game they want to win, rather than the game they think is likely to win– they’re less likely to engage in “strategic voting”. Finally, because we have all this information, we’re actually able to provide somewhat reasonable rankings for something like the top hundred or so games of last year.

The full results– showing the exact number of voters who ranked each game first, second, third place etc– can be found here. In the meantime, the final results were:

  1. Gone Home (3426) *** GAME OF THE YEAR ***
  2. Bioshock Infinite (3373)
  3. Papers, Please (2506)
  4. Saints Row IV (2497)
  5. Tomb Raider 2013 (2490)
  6. The Legend of Zelda: A Link Between Worlds (2440)
  7. Pokémon X and Y (2423)
  8. The Last Of Us (2263)
  9. The Stanley Parable (2187)
  10. Fire Emblem: Awakening (2161)
  11. Animal Crossing: New Leaf (1924)
  12. Grand Theft Auto V (1721)
  13. Assassin’s Creed IV: Black Flag (1631)
  14. Brothers: A Tale of Two Sons (1554)
  15. Rogue Legacy (1547)
  16. Super Mario 3D World (1393)
  17. Antichamber (1334)
  18. Gunpoint (1302)
  19. Metal Gear Rising: Revengeance (1214)
  20. Kentucky Route Zero (1130)

The numbers in parentheses are the final scores each game got under the poll’s ranking system. Thanks if you voted, and some more elaborate analysis of the results (plus an explanation of the scores) can be found below.

NOTEWORTHY WINNERS

GOTY 2013:

#1, Gone Home

Top-ranked PC Exclusive:

#1, Gone Home

Top-ranked 3DS Exclusive:

#6, The Legend of Zelda: A Link Between Worlds

Top-ranked PS3 Exclusive:

#8, The Last Of Us

Top-ranked WiiU Exclusive:

#16, Super Mario 3D World

Top-ranked Browser Game:

#24, Depression Quest

Top-ranked Mobile Exclusive:

#46, 868-HACK
(OR)
#56, Ridiculous Fishing, if you disqualify 868-HACK because of its PC prototype, “86856527”

Top-ranked Vita exclusive:

#67, Tearaway

Top-ranked PS4 Exclusive:

#85, Resogun

Top-ranked 360 Exclusive:

#104, BattleBlock Theater

Top-ranked XB1 Exclusive:

#124, Killer Instinct

Top-ranked Ouya Exclusive:

#166, Towerfall

Top-ranked Wii Exclusive:

#189, Pandora’s Tower

Top-ranked FPS:

#1, Gone Home

Top-ranked “Indie” Game:

#1, Gone Home

Top-ranked RPG:

#7, Pokémon X and Y

Top-ranked Sports Game:

#45, Divekick

“Cult” Award (see below):

#46, 868-HACK

NOTEWORTHY LOSERS

Best game of 2013 which somehow nobody considered to be their #1 pick: #30, Guacamelee!
Worst game of 2013 that at least one person considered their #1 pick: Three-way tie between three games tied for the #326 slot: “I Hate the Dark”; Wizardry Online; and “Heroine’s Quest: The Herald of Ragnarok”. Each of these games got only one vote, but each of these voters considered it their game of the year.
Worst game of 2013: Two-way tie between the games tied for #402: “Composition 62” and “Ultionus: A Tale of Petty Revenge”. Both of these games scored only one vote each, each from someone who considered it their 20th best game of the year.

There were a whole 57 games on the nominations list that no one voted for at all.

ALTERNATE SCORING METHODS

The rankings listed above are based on a version of the Borda count voting method. Each vote cast for a game gives that game a certain number of points. If someone ranks a game #1, that game gets 20 points. If they rank it #2, the game gets 19 points. If they rank it #3 the game gets 18 points… and so on. I have a script that checks a couple of alternate ways of ranking the same data, though.
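For the curious, the point scheme described above can be sketched in a few lines of Python. This is an illustration and not the actual poll script: `borda_scores` and `ranked` are hypothetical names, the ballots in the usage note are made up, and details such as tie handling are assumptions.

```python
from collections import Counter

MAX_RANK = 20  # a #1 ranking is worth 20 points, #2 is worth 19, and so on


def borda_scores(ballots):
    """Each ballot is a list of game names, best first, at most MAX_RANK long."""
    scores = Counter()
    for ballot in ballots:
        for position, game in enumerate(ballot[:MAX_RANK], start=1):
            scores[game] += MAX_RANK + 1 - position
    return scores


def ranked(scores):
    """Highest score first; ties broken alphabetically (an arbitrary choice)."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With three toy ballots `[["A", "B", "C"], ["B", "A"], ["B", "C"]]`, game B collects 19 + 20 + 20 = 59 points, A collects 20 + 19 = 39, and C collects 18 + 19 = 37, so the ranking is B, A, C.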

For example, if we rank games only by the number of first place votes they got, the winner remains the same but almost the entire rest of the list changes dramatically– a lot more movement than usual this year, it seems like. I bolded entries that are different in the first-place-votes count:

First Past the Post

  1. Gone Home (52)
  2. The Last Of Us (51)
  3. The Legend of Zelda: A Link Between Worlds (38)
  4. Saints Row IV (35)
  5. Bioshock Infinite (33)
  6. Fire Emblem: Awakening (27)
  7. Papers, Please (23)
  8. Pokémon X and Y (23)
  9. Grand Theft Auto V (19)
  10. Dota 2 (19)
  11. The Stanley Parable (17)
  12. Animal Crossing: New Leaf (17)
  13. Kentucky Route Zero (16)
  14. Metal Gear Rising: Revengeance (15)
  15. Tomb Raider 2013 (14)
  16. StarCraft II: Heart of the Swarm (12)
  17. Assassin’s Creed IV: Black Flag (11)
  18. 868-HACK (10)
  19. Super Mario 3D World (9)
  20. Hate Plus (9)

Most years when I look at the first-past-the-post list a “cult” game emerges that received very few overall votes, but where an overwhelming percentage of those votes were #1 votes (I think of this as the “Persona award”); this year the standout was 868-HACK, which managed to grab #18 in the first past the post rankings despite being all the way down at #46 in the overall rankings. Also of note here are Hate Plus, which jumped from #33 to a tie for #19; Dota 2, which jumped from #22 to #10; and The Last Of Us, which jumped from #8 to #2 (actually, if it had received one more #1 vote, it would have tied Gone Home for first place).

I also did two more ways of sorting the rankings: an “approval” vote, where nothing is counted except the number of votes a game received (i.e. a first-place and a twentieth-place ranking count the same– all that matters is if the game was on someone’s list); and an instant runoff vote. Usually these two track the main count very closely, but this year, something rare happens in IRV: The first and second place games switch places! If you are qualified to comment on the differences between instant runoff and Borda-based ranked voting, feel free to tell us what that means.
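To show how the same ballots can produce different winners, here is a hedged Python sketch of the three alternate counts. The function names are invented, the IRV version only finds a single winner by textbook elimination (how the poll script turns IRV into a full ranked list is not described here, so this is not that), and the tie-breaking rule is an arbitrary assumption.

```python
from collections import Counter


def first_past_the_post(ballots):
    """Count only each ballot's #1 pick."""
    return Counter(ballot[0] for ballot in ballots if ballot)


def approval(ballots):
    """Count how many ballots mention each game, ignoring position."""
    return Counter(game for ballot in ballots for game in set(ballot))


def instant_runoff_winner(ballots):
    """Eliminate the weakest game until one holds a majority of live ballots."""
    remaining = {game for ballot in ballots for game in ballot}
    while remaining:
        tallies = {game: 0 for game in remaining}
        for ballot in ballots:
            # A ballot counts for its highest-ranked game still in the race.
            top = next((game for game in ballot if game in remaining), None)
            if top is not None:
                tallies[top] += 1
        live_ballots = sum(tallies.values())
        leader = min(tallies, key=lambda game: (-tallies[game], game))
        if 2 * tallies[leader] > live_ballots:
            return leader
        # Fewest votes goes out; ties go to the alphabetically last name.
        loser = max(tallies, key=lambda game: (-tallies[game], game))
        remaining.remove(loser)
    return None
```

Take nine made-up ballots: four of `["A", "B"]`, three of `["B", "C"]`, and two of `["C", "B"]`. A wins first past the post with 4 first-place votes, but B appears on all nine ballots and wins approval, and B also wins the runoff once C is eliminated and its two ballots transfer to B.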

APPROVAL

  1. Bioshock Infinite (223)
  2. Gone Home (205)
  3. Papers, Please (176)
  4. Tomb Raider 2013 (162)
  5. Saints Row IV (157)
  6. Pokémon X and Y (155)
  7. The Stanley Parable (149)
  8. The Legend of Zelda: A Link Between Worlds (144)
  9. Fire Emblem: Awakening (140)
  10. The Last Of Us (136)
  11. Animal Crossing: New Leaf (125)
  12. Rogue Legacy (117)
  13. Grand Theft Auto V (113)
  14. Assassin’s Creed IV: Black Flag (108)
  15. Brothers: A Tale of Two Sons (107)
  16. Gunpoint (98)
  17. Antichamber (91)
  18. Super Mario 3D World (84)
  19. Don’t Starve (83)
  20. Far Cry 3: Blood Dragon (82)

IRV

  1. Bioshock Infinite (223)
  2. Gone Home (205)
  3. Papers, Please (176)
  4. The Legend of Zelda: A Link Between Worlds (144)
  5. Saints Row IV (157)
  6. Pokémon X and Y (155)
  7. The Last Of Us (136)
  8. Tomb Raider 2013 (162)
  9. The Stanley Parable (149)
  10. Fire Emblem: Awakening (140)
  11. Animal Crossing: New Leaf (125)
  12. Grand Theft Auto V (113)
  13. Assassin’s Creed IV: Black Flag (108)
  14. Rogue Legacy (117)
  15. Brothers: A Tale of Two Sons (107)
  16. Gunpoint (98)
  17. Antichamber (91)
  18. Super Mario 3D World (84)
  19. Metal Gear Rising: Revengeance (77)
  20. Far Cry 3: Blood Dragon (82)

FINALLY: PER-FORUM BREAKDOWNS

Okay, so this is where things get… interesting.

When this poll first started, it was run out of the forums for Penny Arcade, and historically, that one forum has totally dominated the results. I traditionally link on a couple more small forums, but these usually only provide a handful of votes– and anyway, most of the forums I tended to target were themselves spinoffs from the PA forum community. The final results invariably look almost exactly like the PA-specific results.

Except not this year. This year, for whatever reason, my efforts to promote the poll on Twitter took off like crazy; one of the tweets about it got 51 retweets. The result was a HUGE voter influx, to the point where PA was almost outnumbered– in the end PA contributed about 300 votes, whereas Twitter contributed about 250.

And PA and Twitter voted *really differently*. My vote script tracks “where votes came from”, and lets me run results isolated to votes from a particular source. The tracking isn’t perfect, but ought to be able to give us some idea how different internet communities voted. Here are the breakdowns from the major vote contributors, with their respective color-coded listings linked:

Penny Arcade Forums (296 voters)

  1. Bioshock Infinite
  2. Tomb Raider 2013
  3. The Last Of Us
  4. Saints Row IV
  5. The Legend of Zelda: A Link Between Worlds
  6. Grand Theft Auto V
  7. Assassin’s Creed IV: Black Flag
  8. Fire Emblem: Awakening
  9. Pokémon X and Y
  10. Papers, Please
  11. Brothers: A Tale of Two Sons
  12. Rogue Legacy
  13. Gone Home
  14. The Stanley Parable
  15. StarCraft II: Heart of the Swarm
  16. Super Mario 3D World
  17. Gunpoint
  18. Metal Gear Rising: Revengeance
  19. Ni No Kuni
  20. Guacamelee!

Twitter (244 voters)

  1. Gone Home
  2. Papers, Please
  3. The Stanley Parable
  4. Animal Crossing: New Leaf
  5. Pokémon X and Y
  6. Depression Quest
  7. Saints Row IV
  8. Kentucky Route Zero
  9. Antichamber
  10. Hate Plus
  11. The Legend of Zelda: A Link Between Worlds
  12. Bioshock Infinite
  13. Fire Emblem: Awakening
  14. Tomb Raider 2013
  15. The Last Of Us
  16. Gunpoint
  17. Rogue Legacy
  18. Super Mario 3D World
  19. Candy Box
  20. 868-HACK

Platformers.net (32 voters)

  1. The Legend of Zelda: A Link Between Worlds
  2. Fire Emblem: Awakening
  3. Bioshock Infinite
  4. Rogue Legacy
  5. Pokémon X and Y
  6. The Stanley Parable
  7. Gone Home
  8. Far Cry 3: Blood Dragon
  9. Papers, Please
  10. Cookie Clicker
  11. Shin Megami Tensei IV
  12. Metal Gear Rising: Revengeance
  13. Animal Crossing: New Leaf
  14. Super Mario 3D World
  15. Salty Bet
  16. Phoenix Wright: Ace Attorney
  17. Brothers: A Tale of Two Sons
  18. Grand Theft Auto V
  19. Assassin’s Creed IV: Black Flag
  20. Saints Row IV

Tigsource.com (22 voters)

  1. Papers, Please
  2. The Stanley Parable
  3. The Last Of Us
  4. Path of Exile
  5. Brothers: A Tale of Two Sons
  6. Shelter
  7. Save the Date
  8. Monaco: What’s Yours is Mine
  9. Jelly No Puzzle
  10. Risk of Rain
  11. Don’t Starve
  12. SUPERHOT
  13. The Wonderful 101
  14. Starcraft II: Heart of the Swarm
  15. Dota 2
  16. Metal Gear Rising: Revengeance
  17. Towerfall
  18. Samurai Gunn
  19. Far Cry 3: Blood Dragon
  20. Dragon’s Crown

Looking at these breakdowns, the most pronounced difference between the big two voting blocs– Twitter and Penny Arcade– is how they treated the top two entries from the combined results: Twitter put Gone Home at #1 whereas PA put it at #13, and Bioshock Infinite was voted #1 by PA but ranked down at #12 by Twitter. One thing that might have hurt Bioshock here is that although Twitter voted overwhelmingly for Gone Home, the PA bloc did not vote for Bioshock Infinite nearly as solidly; in fact, going by “first place” votes alone, PA actually preferred The Last Of Us (39 first place votes to Bioshock’s 23).
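For the curious, here is a rough Python sketch of the “first place votes by source” idea. It is not my actual vote script: the ballot layout, the source tags and the game names are all invented for illustration, and the real poll ranks games with more than just first-place counts.

```python
from collections import Counter

# Toy ballots, invented for illustration only (not real poll data).
# Each ballot has a hypothetical source tag and a ranked list of games.
ballots = [
    {"source": "forum", "ranking": ["Game A", "Game B", "Game C"]},
    {"source": "forum", "ranking": ["Game B", "Game A"]},
    {"source": "forum", "ranking": ["Game B", "Game C"]},
    {"source": "twitter", "ranking": ["Game C", "Game A"]},
    {"source": "twitter", "ranking": ["Game C"]},
]

def first_place_counts(ballots, source=None):
    """Count how often each game is ranked first, optionally for one source."""
    counts = Counter()
    for ballot in ballots:
        if source is not None and ballot["source"] != source:
            continue
        if ballot["ranking"]:  # skip empty ballots
            counts[ballot["ranking"][0]] += 1
    return counts

for src in ("forum", "twitter"):
    print(src, first_place_counts(ballots, src).most_common())
```

Filtering by the source tag before counting is all “isolating” a community means here; the same filter could sit in front of any other scoring rule.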

The voting differences get even more interesting when we compare the Twitter results to the Tigsource results. The Twitter results seem to show a very heavy indie influence, and Tigsource is an indie community, but these are apparently slightly different parts of the indie community: a few of the games that made strong showings on the Twitter list (in particular, Gone Home and Depression Quest) didn’t rank at all among Tigsource’s 20 or so voters.

Anyway, that’s it, thanks so much for voting and I’ll be doing this again next year!

Game of the Year 2013: Vote Here

January 5th, 2014

Hello anyone out there: I’ve got this Game of the Year poll that I run on some web forums I frequent. The way it works is that you rank your favorite games of the year– up to 20, though vote for as many or as few as you want– and the script will sort out the top 200 or so out of everyone’s votes. Here’s last year’s results if you want to see what this looks like (or previous years here). I will run this poll script for one week and then post the results. If you’d like to give it a try:

Vote here

When one week is up (Jan 12) I will delete this post and post the results here on this blog. Thanks!

Player Piano

November 19th, 2013

My submission to the SHARECART 1000 project is a simple FM synthesizer/sequencer. When installed with other SHARECART 1000 games, Player Piano will allow you to listen to your SHARECART save files as music.

Note that auto-save on quit will only work if you quit by pressing “ESC”.

Download

How do I figure out the original git revision of a GitHub zip I downloaded?

October 20th, 2013

This is a problem I ran into today: A while back I downloaded a copy of a project hosted on GitHub, using the “Download Zip” button on the project page.

A few months later I came back, and I needed to know exactly which git revision it was I downloaded. Given this zip file, how does one “go backward” and figure out what revision it was? (The GitHub zips contain just a folder containing a single revision; there is no .git directory in it, so command line git can’t do anything.) I got some help on Twitter (thanks Robert!) and eventually figured out how— but I couldn’t find an explanation anywhere Google had picked up, so here’s a quick summary for future generations.

Method 1: Zip comment

This one’s pretty simple: There’s a thing in the headers of zip files called the “zip comment”, and apparently GitHub stores the original revision hash in there. You can print it with the command line unzip tool by running “unzip -z” on the zip file.

If you’re using Windows, the makers of the WinZip tool claim they display the comment automatically when you open the zip in WinZip.
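If you would rather not depend on a particular unzip tool, the comment can also be read with Python’s standard zipfile module. This is a minimal sketch; the function name read_zip_comment is my own, and it assumes the comment is plain text like the revision hash described above.

```python
import zipfile

def read_zip_comment(zip_path):
    """Return the archive-level comment of a zip file as text ('' if none)."""
    with zipfile.ZipFile(zip_path) as archive:
        # ZipFile.comment is a bytes object; decode it for display.
        return archive.comment.decode("utf-8", errors="replace").strip()

if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:
        print(read_zip_comment(sys.argv[1]))
```

Saved to a file, this can be run with the path of the downloaded zip as its only argument.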

Method 2: Dates

Maybe you’ve lost the original zip, and you just have the folder? We still have one piece of useful information: the “modified” dates. If you look, all the files in the unzipped repo will have their created/modified dates set to the same date and time.

In fact, this is the date and time corresponding to the exact timestamp on the original Git commit. So we can just go look through the git logs (in a clone of the repository) until we find a revision with that exact timestamp.
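Scanning a long log by eye gets old fast, so here is a rough Python sketch of the same search. The function names are mine, and it rests on two assumptions: that the relevant timestamp is the committer date (git log’s %ct), and that your unzip tool restored the times exactly.

```python
import os
import subprocess
from collections import Counter

def folder_timestamp(folder):
    """Return the most common modification time (whole seconds since the
    epoch) among all files under folder."""
    times = Counter()
    for root, _dirs, files in os.walk(folder):
        for name in files:
            times[int(os.path.getmtime(os.path.join(root, name)))] += 1
    if not times:
        raise ValueError("no files found under %r" % folder)
    return times.most_common(1)[0][0]

def matching_commits(log_text, timestamp):
    """Given lines of '<hash> <unix time>', return hashes whose time matches."""
    matches = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and int(parts[1]) == timestamp:
            matches.append(parts[0])
    return matches

def git_log_with_times(repo_dir):
    """Run git log in a clone, printing each commit hash and committer time."""
    return subprocess.check_output(
        ["git", "log", "--all", "--format=%H %ct"], cwd=repo_dir
    ).decode("ascii")
```

With a clone of the project on hand, matching_commits(git_log_with_times(clone_dir), folder_timestamp(unzipped_dir)) should return the candidate revisions. If it returns nothing, time zone or rounding differences in how the zip was unpacked may be to blame, and comparing with a small tolerance is worth a try.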

By the way: If you download a zip from BitBucket, they helpfully put the exact revision hash into the name of the downloaded zip, so you’re unlikely to run into this problem. Just so you know, though, BitBucket git repositories do work with both the “zip comment” and the timestamp rules above. BitBucket mercurial repositories don’t have the “comment” in the zip file, but they do contain an invisible file named .hg_archival at the root level of the unzipped directory which contains the same information and then some.

Exacto: How to copy scenes between Stencyl files

June 2nd, 2013

Warning: This has only been successfully tested in one specific case and it could potentially leave Stencyl files unusable. Only use it after backing up the Stencyl projects involved.

So this is a tool with a pretty narrow application, but, I made a thing. If you’re familiar with the (generally pretty nice) game-making tool Stencyl, you may be aware of a curious limitation it has: It is not possible to copy a scene between two Stencyl files. You can import and export assets, or scripts; but not entire scenes. My friend Liz wound up getting stuck on this hard when she was making her game Problem Attic (it’s really good, you should play it); she had split her project file into two, and was experimentally making levels in each. Then she went back to merge them and found out Stencyl won’t let you do this.

To unstick Liz, I wound up making a Python script that just goes behind Stencyl’s back and copies scene data between games. You can find the script, named Exacto, here; click “Download”, or look under “source” (it’s the only file). From our testing, the script appears to work— it was used to make the final version of Problem Attic.

Caveats

The script copies all the data related to a Stencyl scene that I am aware of. There are two problems here though. First off, it’s possible there’s some data I’m not aware of, and which the script therefore misses. Second off, the script does not even try to copy things which are global to the game— such as actors, or assets like tilemaps. So if you want this script to work, you will need to copy those things over first. You will need to separately load all the assets used in game A, into game B.

There’s one more, even more awkward thing though. Remember I said above our test programs were two versions of the same game? Well, all the assets have “ID numbers”. You can’t see them while you’re making the game, but they’re in the game files. I’m not sure, but it’s possible that if you copied between two *very* different games, and you used a lot of assets in the scene you copied, and both games contained the same assets *but* the “ID numbers” were different, then my script might get the ID numbers confused and cause Stencyl to use the wrong actors or the wrong assets or tilemaps after you copy the scene. I’m not sure; I haven’t had a chance to try this. Because Liz again was just working off two versions of one game file, I expect the ID numbers would be the same in both for her anyway.
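To make the worry concrete, here is a toy Python sketch. Everything in it is hypothetical: the asset names, the dictionaries and the remap-by-name idea are invented for illustration. It is not Stencyl’s real file format, and it is not what Exacto currently does.

```python
# Hypothetical asset tables: ID number -> asset name. These names and the
# dict layout are invented; they are not Stencyl's real file format.
game_from_assets = {1: "player", 2: "spike", 3: "door"}
game_to_assets = {1: "player", 2: "door", 3: "spike"}

# A scene copied verbatim refers to assets only by ID number.
scene_asset_ids = [1, 2, 2, 3]

def resolve(asset_ids, asset_table):
    """Look up what each ID in a scene points to in a given game."""
    return [asset_table[i] for i in asset_ids]

def remap_ids(asset_ids, source_table, dest_table):
    """Translate IDs by matching asset names between the two games."""
    name_to_dest_id = {name: i for i, name in dest_table.items()}
    return [name_to_dest_id[source_table[i]] for i in asset_ids]

print(resolve(scene_asset_ids, game_from_assets))  # what the scene meant
print(resolve(scene_asset_ids, game_to_assets))    # after a verbatim copy
fixed = remap_ids(scene_asset_ids, game_from_assets, game_to_assets)
print(resolve(fixed, game_to_assets))              # after remapping by name
```

In the second printout the spikes and the door have swapped places, which is exactly the kind of confusion I’m worried about. Remapping by name is one way a future version might avoid it, assuming names are unique in both games.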

I’m going into all this technical detail for a simple reason: Stencyl does not cope well when it sees something it doesn’t expect in the game file. Stencyl expects it will only see game files it made and that people like me won’t be mucking around behind its back. So if you copy something into a game file that Stencyl doesn’t understand, it might just crash, or freeze, or maybe you’ll think the copy worked but afterward you’ll find certain windows in the interface just refuse to open. (This is why I am making such a big deal about backing up your files.) So I want to warn about places where my script might confuse Stencyl, and the assets/asset ID thing is the biggest danger point I know of.

By the way though— if you *do* find a case where running this script left Stencyl unable to work on your game file, I’d be curious to talk to you. If I had more examples to work with, I’d be able to make the script more resistant to bugs.

Usage

Download the exacto.py script from the above URL. If you are using Windows, you’re also going to need to install something called “Python”. Go here and download the “Python 2.7 Windows Installer” or “Python 2.7.5 Windows installer” (it doesn’t much matter which you pick as long as it has “2.7” in the name and not “3.0”). (If you are on a Mac, you already have Python.) Run that installer.

Close Stencyl, and open up a command line window— on Windows this is “Command prompt” in the start menu, on the Mac this is Applications->Utilities->Terminal, but if you don’t already know how to use the command line this next part might not make much sense to you anyway. Cd to the directory you downloaded exacto.py to. Here’s a few examples of using Exacto– things you can type into the command line:

python exacto.py --help

This has Exacto print out exactly how to use it. There are a couple of small features listed in the --help output that I don’t document in this blog post.

python exacto.py -i ~/stencylworks/games/gameFROM list

This prints out a list of all the scenes, and their “Scene ID” numbers. By the way, where I say “gameFROM”, put the name of your game.

python exacto.py -i ~/stencylworks/games/gameFROM -o ~/stencylworks/games/gameTO copy "scene one" "scene two" "scene three"

This is what you actually want to do to copy scenes. You’ll want to put the name of the game you want to pull from instead of gameFROM after the -i, the name of the game you want to write into instead of gameTO after the -o, and at the end instead of “scene one” “scene two” put the names of the scenes you want to copy (in quotation marks). Once you’ve run this command, you can open up Stencyl again and see if it worked.

By the way– maybe you run “list” and realize the names are kinda complicated or messy to type? You could also just use numbers. Say you ran the “list” command and you saw the scenes you want to copy have scene IDs 6 and 8. You could say:

python exacto.py -i ~/stencylworks/games/gameFROM -o ~/stencylworks/games/gameTO copy --raw-id 6 8

Notice all these examples are for the Mac command line. If you run on Windows, you want to write it a little differently. Instead of saying “python” you want to say “C:\Python27\python.exe”, and everywhere I wrote “~/stencylworks/games/” above you’ll want to say “%APPDATA%\Stencyl\stencylworks\games\”. (The point with this “stencylworks/games” folder is that you need to find the place where stencylworks stores your game’s *directory*– the thing containing the game.xml. The script doesn’t work on .stencylworks files.)

TODO

I’m not sure if I intend to do any more work on this— it depends on how much interest there is. But some things I think would be worth exploring in a future version of this script:

  • It’s probably pretty easy to make a Windows GUI so you don’t have to do all this downloading-Python/command line junk. cx_Freeze or something.
  • Fix for the “ID numbers” thing I’m worried about above?
  • It’s probably possible to copy assets and actors and stuff! I just haven’t looked into it. At least, it’s probably pretty easy to *check* if the assets are present in the destination file, and if they aren’t, complain and refuse to copy, instead of just going ahead and breaking stuff.
  • It would be pretty easy to operate on .stencylworks files, or even make new .stencylworks files. They’re just zips.
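On that last point, here is a minimal Python sketch of what the first step might look like, going only on the observation that .stencylworks files are zips. The function name is mine, and I make no assumption about how the files are laid out inside the archive.

```python
import tempfile
import zipfile

def unpack_stencylworks(archive_path, dest_dir=None):
    """Extract a .stencylworks archive and return the directory it went to."""
    if not zipfile.is_zipfile(archive_path):
        raise ValueError("%s does not look like a zip file" % archive_path)
    if dest_dir is None:
        # No destination given: unpack into a fresh temporary directory.
        dest_dir = tempfile.mkdtemp(prefix="stencylworks_")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
    return dest_dir
```

The returned directory could then be searched for the game.xml and handed to the existing copy code.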

Anyway

Anyway, good luck, and if you try this script out, I’d be curious to hear how it went (whether it works or not!).

The Best Game

May 12th, 2013

Ian Snyder this weekend is hosting #objectjam, which challenges us to create games which instead of running on computers or game consoles run on “everyday” objects such as walls or chairs. I couldn’t think of anything myself, but I realized:

Java, our Shetland Sheepdog, actually invented a game you play with a piece of rope. He invented this game when he was about four and taught it to us by repeatedly forcing us to play it. I don’t know what Java calls this game because I’m not completely certain he has an internal concept of “language” or “names” so I’m calling it The Best Game because that is how he seems to think of it. If he were allowed to, Java would play this game and nothing else forever.

I hereby submit this game to #objectjam on Java’s behalf. Here is how you play.


Java, AKA Player One.

The Best Game requires:

  • A short piece of rope, say a little under a foot, preferably with one or more knots in it.
  • Two players. Player Two has to be a human because they need to have hands. Player One can be anybody.

The game is played in three phases, which repeat:

Phase 1: Keep-away. Player One starts with the rope (Java holds it with his mouth because that is how he holds things). Player Two has to try to get it. Player One should stand as close as possible to Player Two (this is not a hiding game), but if Player Two tries to grab the rope, Player One should immediately dodge out of the way.

Phase 2: Tug-of-war. Player Two has now grabbed one end of the rope. However, Player One still holds the other end. Player One should hold on as tight as possible to prevent Player Two from taking it away.

Phase 3: Fetch. Player Two, now that they have the rope, must throw the rope as far as possible, so that Player One can run really fast and go retrieve it. Player One then returns the rope to Player Two’s position, at which point Phase 1 restarts immediately.

Tips for playing: Sometimes, people may not realize they are playing the game, or might think the game is done. In this case, Player One can try dropping the rope in Player Two’s lap; when they pick the rope up, Player One can immediately re-grab the other end of the rope at which point Phase 2 begins immediately. (This means you skip Phase 1 but this is okay because it’s a good way of teaching Player Two the game. You can ease in Phase 1 as they get the idea.) If Player Two doesn’t immediately react to having the rope dropped in their lap, maybe they didn’t notice you put it there. Try picking the rope up again and dropping it 3 or 4 more times.