Archive for the 'I Made This' Category

I designed a programming language and it looks like this.

Tuesday, April 22nd, 2014

For reasons I talk about here, I’m going to try to create a programming language. So far I’ve got basically a first-draft design.

There is a specific idea I decided I want to explore, when I do this: Programming languages duplicate too much. Programming languages often have multiple syntaxes that do very similar things, or multiple underlying concepts that do very similar things. It is sometimes possible to collapse these similar things into one thing, and when we do, I usually like the results better.

For example, many languages (Perl, Python, PHP) have both a dictionary type and an object type, but the two are used in effectively the same way; on the other hand Lua collapses dictionaries and objects into one type (tables), and makes an object field lookup identical to a dictionary string lookup. Or most object oriented languages distinguish objects and classes, but prototype based languages show that you can get by with just objects; if objects can inherit from other objects, then a “class” is just a pattern for a particular kind of object. When you collapse ideas together like this, or build language features on top of existing features rather than adding new primitives, you reduce both the amount of mental overhead in thinking about code implementation and also the amount of redundant syntax. There’s usually a lot of redundant syntax. C++ uses . to access a field from a reference, and -> to access a field from a pointer. Most languages use (x) to indicate an argument to a function, and [x] to indicate an index to an array. Why? If pointers and references were just special cases of one underlying concept, or if arrays and functions were, you could use one syntax for each pair and you wouldn’t have to mentally track what each variable is, you wouldn’t have to do all the obnoxious manual refactoring when you suddenly decide to replace a reference with a pointer somewhere or vice versa.

In the language I’ve been thinking about, I started with Lua’s “Table” idea– what if i built objects out of dictionaries?– and decided to take it one step further, and build both objects and dictionaries out of functions. In this language, there’s one underlying data structure that functions, objects, dictionaries, and some other stuff besides are just special cases of– design patterns of.

Taking a cue from Smalltalk, I’m going to call this underlying structure “blocks”.

Blocks are just functions

A block, for purposes of this blog post, is a unary function. It takes exactly one argument, and it returns a value. Anywhere in this blog post I say “blocks”, I could have just written “functions”. I’m going to mostly use the “block” jargon instead of saying “functions” because some blocks will be “used like” functions and some very much will not be.

In my language, you’ll define a function just with “=”:

    addOne ^x = x + 1

The ^x is an argument binding. The ^ signals to the language that the right side of the = needs to be a function body (a closure). If on the next line you just said

    y = addOne 3

That would just assign 4 to the variable “y”, it would not create a function.

Blocks are pattern-matched functions

A big part of this project is going to be that I really like the ideas in functional languages like ML or Haskell, but I don’t actually enjoy *writing* in those languages. I like OO. I want a language that gives me the freedom and expressiveness of FP, but comfortably lets me code in the OO style I use in Python or Lua or C++. So I’m going to steal as many ideas from FP languages as I can. Three really important ideas I’m going to steal are closures, currying, and pattern matching.

In case you don’t know those languages, let me stop and explain pattern matching real quick. You know how C++ lets you function overload?

    // In C++
    void addOneHour(int &k) { k = (k + 1) % 12; }
    void addOneHour(float &k) { k = fod(k + 1.0, 12); }

Well, pattern matching is as if you could switch not just on type, but also on value:

    // In hypothetical-C++
    void addOneAbsolute(int &k where k > 0) { k = k + 1; }
    void addOneAbsolute(int &k where k < 0) { k = k - 1; }
    void addOneAbsolute(0) { } // Do nothing

That last line– the one demonstrating we could write a function whose pattern matches only *one single value*– is going to be important to this language. Why?

Blocks are dictionaries

In my language, if I want to assign more than one pattern to a single block, I just use = multiple times:

    factorial ^x = x * factorial (x - 1)
    factorial 0 = 1

“Factorial” is a block. The way I’m looking at it, a block is just a data structure which maps patterns to closures. It’s like a dictionary, but some of the keys (the ones with bound variables) match multiple values.

However we could not bother assigning any bound-variable patterns, and then we’d just have a dictionary or an array:

    nameOfMonth 1 = "January"
    nameOfMonth 2 = "February"
    nameOfMonth 3 = "March"

Blocks are objects

Here I want to introduce a data type called an “atom”. This is an idea stolen from Erlang (and possibly Ruby?). Technically an atom is an “interned string”. It’s something that the programmer sees as a string, but the compiler sees as an integer (or a pointer, or something which has a constant-time comparison). You get at the atom by putting a . before a symbol; the symbol is the name of the atom:

    x = .atomname

It’s cheaper to compare atoms than strings (.atomname == .atomname is cheaper than “atomname” == “atomname”) and cheaper to use them as dictionary lookup keys. This means atoms work well as keys for fields of an object. Objective-C for example actually uses atoms as the lookup keys for its method names, although it calls them “selectors”. In my language, this looks like:

    constants.pi = 3.14
    constants.e = 2.71
    constants.phi = 1.61

Notice this looks like normal object syntax from any number of languages. But formally, what we’re doing is adding matching patterns to a function. What’s cool about that is it means we’ll eventually be able to use machinery designed for functions, on objects. Like to skip ahead a bit, eventually we’ll be able to do something like

    map constants [.pi, .e, .phi]

and this will evaluate to an array [3.14, 2.71, 1.61].

What’s up with the square brackets? Oh, right. well, I think all this “constants.” nonsense is gonna get kinda tiresome. So let’s say there’s a syntax like:

    constants = [ pi = 3.14, e = 2.71, phi = 1.61 ]

Notice I say “pi” and not “.pi”– on the left side of an =, the initial “.” is implicit. More on that in a moment.

One other thing. Inside of the [ ], there exists an implicit “this” variable, corresponding to the object the [ ] creates. so if you say

    counter = [
        count = 0
        increment ^x = { this.count = this.count + x }
        decrement ^x = { this.count = this.count - x }
    counter.increment 1
    counter.increment 3

Then at the end of this string of code “counter.count” is equal to four.

Blocks are prototypes

What if we want more than one counter object? Well, you’ll notice an interesting consequence of our pattern matching concept. Let’s say I said:

    counter = [
        init ^x = { this.count = x }
        increment ^x = { this.count = this.count + x }
        decrement ^x = { this.count = this.count - x }

    counter_instance ^x = counter x
    counter_instance.init 3
    counter_instance.increment 5

When we say “counter_instance.whatever”, the language interprets this as calling the block counter_instance with the argument .whatever. So if counter_instance is defined to just re-call “counter”, then on the next line saying “counter_instance.init 3″ will fetch the block stored in counter.init, and then that block gets called with the argument 3. The way the “this” binding works is special, such that counter_instance.init gets invoked “on” counter_instance– “this” is equal to counter_instance, not counter.

The syntax we used to make counter_instance “inherit” is pretty ugly, so let’s come up with a better one:

    counter_instance.ditch = counter

I haven’t explained much about how = works, but when we say “counter_instance ^x = “, what we’re really doing is taking a closure with an argument binding and adding it to counter_instance’s implementation-internal key-value store, with the key being a pattern object that matches “anything”. “.ditch” is a shortcut for that one match-anything key slot. In other words, by setting counter_instance.ditch to counter, we are saying that counter is counter_instance’s “prototype”.

Something to make clear here: the lines inside a [ ] aren’t “magic”, like C++ inside a struct declaration or anything. They’re just normal lines of code, like you’d find inside a { }. The difference is the insides of [ ] are using a specially prepared scope with access to a “this” and a “super”, and at the end of the [ ] the [ ] expression returns the scope into which all these values are being assigned (“this”). The upshot is you could easily have the first line of your [ ] be something like an “inherit counter;” call that sets the ditch and does some various other fix-up to make this prototype system act more like some other kind of object system, like a class system (I like classes). This sort of thing is possible because

Blocks are scopes

Like most languages, this one has a chain of scopes. You’ll notice above I offhandedly use both ( ) and { } ; these are the same thing, in that they’re a series of statements which evaluate to the value of the final statement:

    x = ( 1; 2; 3 )

…sets x equal to 3. The 1; and 2; are noops. (Semicolon is equivalent, in the examples I’ve given here, to line-ending. There’s also a comma which is effectively a semicolon but different and the difference is not worth explaining right now.)

The one difference between { } and ( ) is that { } places its values into a new scope. What is a scope? A scope is a block. When you say

    a = 4

The unbound variable a is atom-ized, and fed into the current scope block. In other words “a” by itself translates to “scope.a”. When you create a new inner scope, say by using { }, a new scope block is created, and its ditch is set to the block for the enclosing scope. The scope hierarchy literally uses the same mechanism as the prototype chain.

Block constituents are properties (or: blocks are assignment statements)

Non-language geeks may want to skip this section.

I’ve been pretty vague about what = does, and that’s because it has to do several layers of things (matching items that already exist, binding variables, wrapping closures, and actually performing assignment). However, ultimately = must write [pattern, closure] pairs into one or more blocks. = cannot, however, actually write anything by itself. Ultimately, when = decides it needs to assign something, it is calling a “set” method.

    a = 4

Is ultimately equivalent to

    scope.set .a 4

That = is sugar for .set is a small detail, but it has some neat consequences. For one thing, since everything that happens in this language is curryable, it means you can trivially make a function:

    a_mutator = set.a

…which when called will reassign the “a” variable within this current scope (remember, “set” by itself will just be “scope.set”). For another thing, this means you can create a “property” for a particular variable:

    set.a ^x = ( b = x + 1 )
    a = 3

After this code runs, “b” will be equal to 4 (and “a” will still be equal to a function that mutates “b”).

The existence of .set will also have some interesting effects once we have types and therefore a type checker. I’ve been kinda vague about whether = has “set” or “let” semantics– that is, if you assign to a variable does it auto-instantiate or must you predeclare it, if there is a variable by the assigned name in the ditch does assignment shadow in the assigned block or reassign in the parent block, etc. And the answer is it doesn’t much matter for purposes of this post, because any of the possible things that happen when you set a field (“not declared” error thrown, assigned to top-level block, assigned to a parent block) could just be all things that could and do happen in different blocks, depending on what that block’s .set is set to. For example, it would probably make sense for object blocks and scope blocks to have a different last-ditch “.set” behavior, or be sensible to allow different source files to have different “.set”s for their file-level scopes (“use strict”).

On that note, let’s talk about types. There’s a lot of very exciting stuff happening in the study of types in programming languages right now, both types as used in languages and types as used in extra-lingual static analysis tools. I don’t understand a lot of this research yet (and I want to learn) but I think I understand enough to have an idea of what’s possible with types right now, and that means I know how I want types in this language to work.

Blocks are types

Let’s say we have a syntax variable:type that we can use to constrain the arguments of a function.

    factorial ^x : int = x - 1

When this function is called, there will be a runtime check, if “x” is not an int, it will be a runtime failure. Let’s say we can use the a:b construct inside expressions too:

    square ^x = ( x * x:float ) :: stateless

Let’s say that :: instead of : indicates that the type is being applied not to the value returned by that parenthesis, but to the implicit “function” defined by the parenthesis itself. “stateless” is a type that applies to functions; if we assert a function is “stateless” we assert that it has no side-effects, and its resulting value depends only on its inputs. (In other words, it is what in another language might be called “pure”.)

There’s some kind of a inferred typing system in place. There’s a compile time type checker, and when it looks at that “square” function it can tell that since “x” is a float in one place inside the expression, that the “x” passed into square must itself be a float. It can also tell that since the only code executed in “square ^x” is stateless, that the function “square ^x” is also stateless. Actually the “stateless” is from the checker’s perspective unnecessary, since if the checker has enough information about x to know the * in (x * x) is a stateless operation– which, if it knows x is a float, it does know that– then square ^x would be stateless anyway.

There’s some kind of a gradual typing system in place. There is a compile-time step which, everywhere square ^x is called, tries to do some kind of a type-proving step and determine if the argument to square is a float. If it can prove the argument is a float, it actually omits the runtime check to save performance. If it *can’t* prove the argument is a float, or it can prove the argument *isn’t* a float, it adds the check and maybe prints some kind of compile-time warning. (To stress: some of these properties, like “stateless”, might be in many cases *impossible* to prove, in which case the checker is conservative and treats “can’t prove” as a failure.) Besides omitting safety checks, there are some other important kinds of optimizations that the type checker might be able to enable. Critically, and this will become important in a moment, if a function is stateless then it can be potentially executed at runtime.

So what are types? Well, they’re just functions. “int” and “stateless” are language-builtin functions that return true if their arguments are an int, or a provably stateless function, respectively. (For purposes of a type function, if the type *doesn’t* match, then either a runtime failure or a return false are okay.) Types are values, so you can construct new ones by combining them. Let’s say that this language has the || and && short-circuit boolean operators familiar from other languages, but it also has & and | which are “function booleans”– higher level functions, essentially, such that a | b returns a function f(x) which is true if either a(x) or b(x) is true. So if “stateless” and “nogc” are two of the builtin type functions, then we can say:

    inlineable = stateless | nogc

And if we want to define a totally unique type? Well, you just define a function:

    positive ^x = x > 0
    sqrt ^x : positive = x / x    # Note: There might be a bug here

Obviously you can’t use just any function here– there would have to be some specific type condition (probably something like the “inlineable” I describe above) that any function used as a type in a pattern would be required to conform to. This condition would begin and end with “whatever the type checker can efficiently prove to apply or not at compile-time”.

Let’s finally say there’s some sugar for letting you define these “type condition” functions at the same time you define the function to whose parameters they apply; we could reduce that last block down to

    sqrt (^x >= 0) = x / 2    # Square root implementation, WIP 2

One other bit of sugar that having a type system makes easy:

Blocks are argument lists

So everything so far has been a unary function, right? There’s only so much we can do with those. This language is set up for currying– that’s how method lookup works, after all– and I would like to offer explicit sugar for curry:

    curryadd ^x ^y = x + y

But ehh, I don’t actually like using currying for everything. I like argument lists. And I really, *really* like named arguments, like Python uses. Let’s say we have this syntax:

    divide [^numerator, ^denominator = 1] = numerator / denominator

The “parameters” block there? Is totally just a block. But there’s some kind of block wiring such that:

    divide [4, 2]           # Evaluates to 2
    divide [4]              # Evaluates to 4-- "denominator" has a default argument
    divide [9, denominator=3]                       # Evaluates to 3
    divide [denominator = 4, numerator = 16]        # Evaluates to 4
    divide [ ]       # Compile-time error -- assignment for "numerator" not matched

There’s some sort of block “matching” mechanism such that if the argument block can be wired to the parameter block, it will be. I don’t have an exact description handy of how the wiring works, but as long as blocks remember the order in which their (key, value) pairs are assigned, and as long as they can store (key, value) pairs where exactly one of key and value is (no value), then such a matching mechanism is at least possible.

My expectation is that almost all functions in this language will use the argument blocks for their parameters, and almost all invocations will have an argument block attached.

Blocks are macros

I wanna go back here and look at something closer: We’ve defined that there’s some subset of this language which can be run at compile time, and that the type checker can identify which functions are in that category. I think this is a pretty powerful concept, because it means the language can use *itself* as its macro language.

So far in this post, you’ve seen three main kinds of syntax in the code samples: Unary function application (again, a field lookup like a.b.c is really just a bunch of currying), “=”, and little extra operators like “+”. What I’m going to assert is that the extra operators– and also maybe =, and maybe even [ ]– are actually just rewrite rules. So for the line:

    3 + square 4

Before actually being executed, this line is transformed into

    3 .plus ( scope .square 4 )

“3″, like anything else, is a block. Like in Io or Self, adding three to four is just invoking a particular method on the 3 object. In this language “+”, the symbol, is just a shortcut for .plus, with parser rules to control grouping and precedence. (If we actually just wrote “3 .plus square 4″, then the currying would try to interpret this as “(3 .plus square) 4″, which is not what we want.)

There’s some kind of a syntax for defining line-rewrite rules, something like:

    op [ symbol = "!", precedence = 6, replace = .not, insert = .unary_postfix, group = .right_inclusive ]
    op [ symbol = "*", precedence = 5, replace = .times, insert = .infix, group = .both ]
    op [ symbol = "+", precedence = 4, replace = .plus, insert = .infix, group = .both ]
    op [ symbol = "==", precedence = 3, replace = .eq, insert = .infix, group = .both ]
    op [ symbol = "&&", precedence = 2, replace = .and, insert = .infix, group = .both ]
    op [ symbol = "||", precedence = 1, replace = .or, insert = .infix, group = .both ]

Which means for something like

    result = 12 * 2 + 9 == 3 + 8 * 4
    result = !parser.valid 34 && result

Ultimately what’s actually being executed is:

    scope .set .result ( ( ( 12 .times 2 ) .plus 9 ) .eq ( 3 .plus ( 8 .times 4 ) ) )
    scope .set .result ( ( ( scope .parser .valid 34 ) .not ) .and ( scope .result ) )

So this is fine for symbols like + and – which operate on two clearly-defined values, but what about something more complicated like “=”? Well, there ought to be some kind of way to pass “op” a .custom function, which takes in a list of lexed tokens representing a line and returns a transformed list of tokens. At that point you can do pretty much anything. “=” might be the *one* thing that you can’t implement this way because = does special things involving adding bindings. But short of that, custom “op”s would be sufficient even for things like, I don’t know, flow control:

    if ( a == 4 ) { k.x = 3 } else { k.x = 4 }

I may be getting into the language-geek weeds again here but I’m gonna walk through this: Let’s say I have a higher order function “if ^pred ^exec” which takes functions “pred” and “exec”, executes “pred” (pred is probably nullary… which I haven’t decided what that means in this language yet), if the result is true it executes “exec” and returns the void combinator (v ^x = v), if the result is false it returns a function which expects as argument either .elsif (in which case it returns if) or .else (in which case it returns a function that takes a nullary function as argument and evaluates it). We’ve now defined the familiar if…elsif…else construct entirely in terms of higher order functions, but actually *using* this construct would be pretty irritating, because the “pred” and “exec” blocks couldn’t just be ( ) or { } as people expect from other languages, they’d have to be function-ized (which means annoying extra typing, toss some ^s in or however lambdas are made in this language). But, we can declare “if”, “else” and “elsif” rewrite ops: “if” ^-izes the next two tokens and then replaces itself with just “if” again; “else” and “elsif” ^-ize the next one token each and then replace themselves with .else or .elsif. If we do this, then the familiar if… else syntax above just *works*.

…why am I going into all this, about “if” “else”? Well, because I want to stress that it means *flow control constructs can be implemented in the language itself*, and they will be truly first-class equals with builtins like “if” or “while”. In my eventual vision of this language, the *only* language-level syntactical elements are

    ( )
    [ ]
    { }

And *everything* else, including comment indicators and the end-of-line statement-terminator, is just rewrite rules, ideally rewrite rules written in the language itself. Which implies if you don’t like the language’s syntax much, you could just unload the builtin “stdops” module that contains things like “+” and “if”, and substitute your own. “op” rules are local to scopes, so syntax could vary hugely file to file. Which… well, shouldn’t it? I know people who avoid entire languages because they don’t like one or two things about the syntax. Say, people who go “well, Objective-C has a neat object model, but I can’t get used to all those square brackets”. Or I in my last blog post specifically said that although they both have lots of features I like, I personally won’t use LISP because I can’t make visual sense of S-expressions, I won’t use Javascript because of the casting rules. None of this makes any sense! Languages should be about *features*. They should be models of computation, and we should be evaluating them based on how expressive that model is, based on the features of the underlying flow control or object model or type system or whatever. Syntax shouldn’t have to be part of the language selection process, and if languages let us put the sugar on ourselves instead of pre-sugaring everything then it wouldn’t have to be. I’m probably getting carried away here. What was I talking about? Did I say something just now about casting rules? Let’s talk about casting rules.

Blocks are language machinery

Some syntactical elements, like [ ] and =, might be too complex for the programmer to plausibly implement themselves. The programmer should still have a fair amount of control over these builtins work. One way to do this would be to have things like [ ] and = implicitly call functions that exist in the current scope. For example, instead of calling .set, = might call a function “assign” that exists in current scope; this would allow individual scopes to make policy decisions such as the variable auto-instantiation rules I mentioned earlier. [ ], at the moment it instantiates the new block, might call a function “setup” that exists in the current scope, allowing the programmer to do things like change the default ditch (base class) or the exact meaning of “inherit”. There might be a function that defines the default type constraints for numbers, or strings, or lines of code. Maybe somewhere there’s a Haskell fan who wants to be able to have every ( ) wrapped up to be ^-ized and every line wrapped in ( ) :: stateless, so that any code *they* write winds up being effectively lazy-evaluated and side-effect-free and they can only communicate with the rest of the language using unsafe monads. They should be able to do that.

One thing I definitely want in is for there to be something like a “fallback” function which, if a particular block is called with an argument whose type doesn’t fit any pattern the block has defined, attempts to map the argument to one of the patterns the block *can* handle. In other words questions about whether different but intraconvertable types like ints and floats can be converted between without a cast would be a decision made on a per-project or per-file basis. Or for example if there’s a function

    square ^x:int = x*x

and one of the patterns on the fallback block is

    fallback ^fn : function( ^type, _ ) [^x : type] = fn x    # Follow all that?

(Let’s assume “function” is a higher-order type function such that function(a,b) is the type of a function a -> b, and let’s assume _ has the magic property of “match anything, but don’t capture it” when used in a pattern.)

…then even though the function is only defined for (square x) we could totally get away with calling square[ x ], because the fallback function could match [ x ] to x.

Uh, incidentally, I’m not totally sure this thing with the fallback function is actually in general possible or possible to make performant. But as with most of the stuff in this language, I think it would be fun to try!

Blocks are C++ or Javascript objects in disguise, potentially

There’s one last thing I want to talk about here, although it’s one of the most important features from my perspective. The model we have here– where formally speaking all field accesses execute functions, all field assignments execute functions, and there’s some kind of type checker at work capable of tracking fine detail about what kinds of operations get performed on individual blocks– means that the underlying language-level implementation of “a block” could differ from block to block.

The model I’ve described here for blocks is extremely dynamic and flexible– *too* flexible, such that it would be very difficult to make code using all these dynamic features performant. Except not every block will be using all of the features blocks have. Some blocks will only contain “value” keys (i.e. never a ^var:type pattern), and the type inferrer will be able to prove this the case. The compiler/interpreter could represent this one block internally as a plain hashtable, rather than taking the overhead to enable executing arbitrary code on every access. Some blocks, despite being mutable, will have a fixed known set of keys; the language could maybe represent these in memory as plain structs, and translate atoms to fixed memory offsets at compile time.

And some blocks, at the programmer’s direction, might be doing something else altogether. It’s easy to imagine a “proxy object” where each invocation of an atom and an argument on the block is actually copying the atom and argument and shipping them into another thread or across a network, and the type checker ensures the contract is followed and objects are actually copyable; you could build an Erlang style messaging system this way.

Of particular interest to me, some blocks might actually be guests from some totally other system, say a different language with its own object model. An FFI for some other language could make wrapper blocks for that language’s objects, and put in place type guarantees that the programmer does not interact with those blocks in any way the guest language does not support. The two languages I’d personally really like to be able to interface with this way are C++ and Javascript, because these languages have valuable platform and library support, but also are languages I do not actually want to *write*.

C++ in particular interests me, because I’m not aware of any languages which are “higher level” in the sense that interests me but which can currently adequately interface with C++. C++ is actually pretty tricky to interface with– the big problem here, to my mind, being that method calling conventions (name mangling) vary from compiler to compiler. Actually, on some platforms (by which I mean “Windows”) it’s the case that shared libraries (DLLs) can’t be shared between compilers even if you are writing in C++ yourself. It would probably be necessary, if making a C++ FFI, to target one particular compiler (I’d vote Clang, because it’s extensible and has good platform support). Choosing to target one particular compiler would have a neat side effect: With some knowledge of the compiler’s implementation details, it *ought* to be possible to make blocks that inherit from C++ classes, and have those blocks actually construct fake vtables at runtime that jump into the compiled code for (or interpreter for) my language. Since in my language “classes” and “objects” get constructed by calling functions whose execution could be potentially deferred to runtime, it would be essentially invisible to the programmer when they say [ inherit QObject; objectName = "Block" ] whether a normal block or a pseudo-C++ class is being constructed.


Anyway, here’s what I think I’ve got here. I started with one single idea (pattern-matched unary functions that remember the order in which their patterns were assigned), and asked the question “how much of what a language normally does could I collapse into this one concept?”. The answer turns out to be “very nearly EVERYTHING”, including stuff (like type specifications, macros and FFIs) that most languages would wind up inventing effectively an entire sub-language just to support (templates… ugh). I actually *do* want a new programming language, mostly because of that thing I mentioned with not liking any existing language’s C++ interop, and I actually do intend to at least attempt this project. I’m basically just gonna download the Clang source at some point and see how far I get. One thing in my favor is that since this language is based on a small number of simple things that interact in complex ways, I could probably get a minimal implementation going without too much difficulty (especially if I don’t go for types on the first pass).

Oh, one very final thing: I never said what all of this is called. In my head I’m planning to name this language either Emily, because despite being fundamentally OO it strikes me as a fairly “ML-y” language; or Emmy, after Emmy Noether. I’ll decide which later.

That’s all.

Note to commenters: Harsh criticisms are very much welcomed. Criticisms based on demographics or assumptions about my background are not. Thanks.

Sweet Nothings

Thursday, January 31st, 2013

A collection of small art/music toys I’ve made over the last six months. Some are interactive, some are not. Included are “Brainfarm”, “CS/1″, “Devil’s Chord”, “A Cube for You”, “pxswap()” and “Sun Sets”.


All of these are improved versions of things previously posted on my anti-games page. I made the collection to submit to something called the “Experimental Gaming Workshop”.

Customer testimonials

“WHAT IS THIS MACHINE” — Porpentine, Rock Paper Shotgun

“I wonder if second one is broken; I’m not sure what the expected behavior is but the sound is static and only part of the screen animates…” — Hannah Greenwood

“Please turn that off, it’s really loud.” — Diana Heideman


Monday, December 17th, 2012

This is a game I made with Liz Ryerson. I programmed and Liz did all the actual work. We made this for Ludum Dare 25, the 72-hour jam; the theme was “You Are The Villain”. I shall leave it up to you if it fits that theme or not. It is very mysterious. Only Liz knows what it means for sure.


You can find the Ludum Dare competition entry page for the game here.

This game requires an OpenGL 2.0 compatible computer to run.

The World Hates You

Monday, August 27th, 2012

Made in 48 hours for the Ludum Dare 24 competition– theme: “evolution”. Accordingly, this game uses an evolutionary algorithm to generate the hardest platformer levels possible. As you play, the game reports back your progress over the internet to a server which breeds the hardest levels together and sends the results out to future players.

The game does not have a win condition. The purpose of this game is not to be won. The purpose of this game is to get progressively better at killing you.


You can find the Ludum Dare competition entry page for the game here.

This game requires an OpenGL 2.0 compatible computer to run.

Polyconsole: This is how I make games now

Sunday, August 12th, 2012

About a year ago, Ivan Safrin released his personal game engine under the name Polycode. I was impressed enough I began adopting the engine for all my personal game projects. However, a year after the release, it’s clear to me that although Polycode is a promising, powerful engine, it’s very hard to get started with. In hopes of doing something about this, I’ve got my own variant of the Polycode tools, which I would like to share with you. Using the tools below in place of the official Polycode distribution hopefully will make it easier to get started; make it easier to integrate Lua and C++ in a single game; and give you a much more up-to-date copy of Polycode than the (fairly out of date) one on the official website.

Below are three options– depending on exactly how dirty you want your hands to get– for using Polyconsole (my toolset). Each option includes support for Mac, Windows and Linux game development (although with options 2 and 3, there are currently complications for Windows users).

What’s this now?

Polycode is a general game engine with support for C++ and Lua. It supports a broad range of features out of the box, including pixel shaders, 2D and 3D physics via Box2D and Bullet, and the standard niceties like resource management and xml serialization. Support for networking, mobile devices, and a Stencyl-like IDE are being worked on and hopefully will be eventually added.

Polyconsole is my personal toolset for using Polycode– technically Polyconsole is an alternative to the “Polycode Player”, which is the official way to write Lua programs with Polycode. Polyconsole basically just loads a game written in Lua and runs it, along with a realtime Lua console for testing stuff out while you play:

If you run your game in Polyconsole you’ll also have access to some utilities I added that aren’t in standard Polycode, like a 2D scene loader for SVG files which you can make in Inkscape:

Here completely within Inkscape I’ve drawn a little 2D physics scene, set weights and such for the various objects, and in the case of the one red ball at the top I added a Lua collision handler (it prints “Bang!”). On the right you can see the aftermath where Polyconsole has loaded all this up.

These are simple examples, but I’ve used this basic framework to make some complex and varied things very rapidly. Here’s some stuff I’ve made with Polycode and Polyconsole this year:

So, how does one use this?

Option 1: I don’t know what “compiling” is, I just want to make games

Below you can download prebuilt versions of Polyconsole for Mac, Windows and Linux, and just start writing games with Lua:

How to use this

Polyconsole loads the game it’s running out of a file named “media.pak” (actually a zip file) in its internal resources. It has a mod system though, whereby if it sees a file named “” or a directory named “mod” in the same directory with it, it will selectively replace the files in media.pak with the “mod” files.

The version of Polyconsole above is set up so you can just write a whole game in the “mod” directory. In this package, the “mod” directory is preloaded with the contents of the sample program in media.pak. The package also has two extra magic files set up for you: “settings.xml” and “debug.xml”. The settings.xml lets you set the size of the window Polyconsole runs in. (Without this file, the game will just run full screen.) The debug.xml on the other hand, just by being there, causes Polyconsole to run in debug mode. This means you can access the Lua console, by hitting tab and then typing (you can hit tab again to hide it). Also in debug mode if you hit “esc”, all the current scripts, textures, images etc will be reloaded immediately from disk.

If you open up the mod/media directory included along with this Polyconsole, you’ll find the following:

init.txt: This describes the initial “room” that Polyconsole loads.
example.svg: This is the little sample physics scene in the demo.
material/: This directory contains textures and shaders, and the basic.mat xml file that describes them.
overlay/: This directory contains all your Lua.

Games in Polyconsole are made up of what I call “overlays” and “rooms”. An “overlay” is just a little package of Lua code– it contains one Lua script to run when the overlay loads, one Lua script to run when the overlay is closed, and one Lua script that runs once per frame as long as the overlay is up. A “room” on the other hand is a list of overlays and SVG files which Polyconsole loads all in a bunch. So for example in an average game I might have one room for the title screen, one room for the game over screen, and one room for the game itself. Alternately I could make each level its own room, if that makes more sense for the game. The sample Polyconsole program linked above contains only one room, which (as you can see if you look in init.txt) loads an overlay named “game” containing the code for the demo, the example.svg that describes the 2D scene, and standard “startup” and “shutdown” overlays. (You’ll always want to make the standard startup overlay the first thing in a room spec and the shutdown overlay the last thing in a room spec, because these are used by Polyconsole’s memory management.)

So to very quickly get started with Polyconsole you can download one of the above packages; open the exe; and while it’s running, open the mod/media/overlay/game folder and edit the onLoad and onUpdate scripts included within. Then whenever you change a line, you can go back to the program and hit “esc” to see your changes. (If it didn’t work, you can hit “tab” to see the console, where any errors will be printed.)

If you want to know how to write Polyconsole code, the following documentation should be helpful:

There’s also a rudimentary help system built into Polyconsole if you type help() at the console prompt.

You can find examples of Polyconsole code by looking at any of my games since “Markov Space”, or skimming of my in-development games I haven’t finished yet. They’re all built with this framework, and they all come with source code (and even if they weren’t, you could pull out the lua source just by pulling out the media.pak and unzipping it). Some of these are under no-commercial-use licenses, but if there’s something you want to copy out of one of my games just let me know and I can relicense it.

When you’re done

If you decide to write a game as a mod folder, once you’ve completed your game you can do the following to make something releaseable:

  • Zip up the mod folder into a “” file, and remove the “mod” folder itself
  • Remove the settings.xml and debug.xml files
  • Change the names in the Readme
  • Rename “Polyethylene” to something not stupid

And you’ve got a game! You can then port your game to [Mac, Windows, Linux] by downloading one of the other two packages above that you didn’t install, and dropping the in.

Option 2: I want to combine C++ and Lua in a single program

Writing these Lua mods will only get you so far, though. To me the entire appeal of Polycode in the first place is that I could write a game in Lua, getting the ease of use of something like Flashpunk– but if I really *needed* the power of C++, I could write C++ extensions that did whatever strange thing I wanted. I’m not restricted by what the developers of the VM or the browser plugin or whatever thought to put in, because Polyconsole is the Lua “VM” effectively and I control that. If you want to do this too, you’ll need to compile your own copy of Polyconsole.

To make that easier, here’s the source code to Polyconsole, packaged in each case with a complete build of the Polycode library. (The source is identical between the three versions, the difference between the three versions is which version of Polycode is included.)

With complications? Wait, what’s that about? Well, there’s a problem with the Windows version right now. Most people writing C++ software for Windows use Microsoft Visual Studio. But I don’t have a copy of Windows, so I don’t use Visual Studio. Instead, I use MinGW, which is an open source compiler that makes Windows exes and can be run on any operating system you like.

But: It turns out that if you have a library built with MinGW, it can’t be used with Visual Studio, or vice versa. Worse, not only are MinGW and Visual Studio incompatible, but each individual version of MinGW is incompatible with every other! So, the Windows Polycode I have linked above was made with MinGW32 version 4.3.0. That means you can only compile that version of Polyconsole with MinGW32 4.3.0– nothing else. (The actual, final .exe games made with this system will run on any Windows machine.)

So I’ve got a list of instructions I wrote up for how to install MinGW, MSYS and Python so you can compile Polyconsole on Windows, but when I got someone to test them, they didn’t work– because you can’t get an installer for MinGW32 4.3.0 on Windows anymore, and the versions accessible with their snazzy new mingw-get system aren’t compatible with 4.3.0. I’m working very hard on trying to find a solution for this– I need to either get a modern version of MinGW32 that runs on Mac or Linux, or I need to get an installer of MinGW32 4.3.0 for Windows I can give you, or I need to get a copy of Windows so I can just build the darn thing with MSVC. When I’ve figured out a solution, I’ll come back here and post it– any suggestions you might have are welcome. But I don’t have a solution yet. Sorry.

In the meantime, hilariously, you can build a Windows version of Polyconsole with the Windows package linked above– if you have a Macintosh. You just have to run the MinGW-for-Mac installer linked here. (This is what I do with my own games.)

How to use this

The first and most important thing I can tell you is use version control. Take that whole package you just downloaded and add all the files to a hg or git repository. Otherwise, the game bits you wrote will get all mixed in with the Polyconsole code, and it will be really, really hard to remember what you changed or upgrade to a different version of Polyconsole if you want to do that later. If you don’t know how to use hg or git, on Windows you can make this really really easy by downloading the Tortoise programs. On Mac or Linux, you can just go to the Polyconsole directory in the Terminal and run: hg init .; hg add .; hg commit -m "Initial"

With that out of the way, assuming you can even install the appropriate compiler, the rest is pretty easy. If you’re on a mac, open PolycodeTemplate.xcodeproj and build and run. If you’re in the Windows version, navigate to package/win and run make. If you’re in the Linux version, navigate to package/lin and run make.

The one other thing you do need to know is how to add C++ functions that are callable from Lua. When you build Polyconsole, it will run a script that’s part of Polycode called create_lua_library that will suck out parts of your program and make them visible to Lua. You need to tell it which parts of the program are supposed to be Lua-visible. To do this, you need to edit the file lua_visible.txt in the base of the repository to add the names of every header file you want the script to look in, and also edit media/project_classes.txt to add the names of any classes you want Lua to be able to use. You don’t want to just add all your header files to lua_visible.txt, by the way, because create_lua_library can sometimes get confused if you try to feed it overly fancy stuff.

Mostly, I don’t mess with that and I just add methods to the one class that’s already set up in there to be visible to lua– “project_bridge”. This is found in source/bridge.cpp and source/bridge.h; at startup, Polyconsole sets up an object of type project_bridge visible to Lua under the name “bridge”. So you can just add a new method void arglebarf() {} to bridge.h, recompile, and suddenly Lua code will be able to call bridge:arglebarf().

When you’re done

The makefiles attached to these packages make complete finished versions of Polyconsole, so there you’re pretty much done. On mac though you’ll want to run ./package/ from the project root to make a “Release” version with a readme and pretty stuff.

You also might want, before you compile, to run this magical perl oneliner which will rename the sample program from “Polyethylene” to something more sensible:

perl -p -i -e 's/Polyethylene/YourGameNameHere/g' * */* */*/* */*/*/* */*/*/*/*

Option 3: I want to do everything myself

What I originally intended with Polyconsole was that instead of writing a mod to this “Polyethylene” example program, or having to download a whole compiled version of Polycode from me, was that Polyconsole should just download and compile Polycode for you for all the different operating systems you want to support. You can still get this, if you download Polyconsole from the Bitbucket page.

There’s rather a lot of documentation at the Bitbucket page explaining how this works, but basically, I wrote a little program called that automates building Polycode; and makes it easy to make your own modifications to Polycode by keeping a binary cache of Polycode versions you’ve previously built, and keeping track of which Polycode version goes with which SCM checkin. Unfortunately, in practice the script wasn’t nearly as easy to use as I’d hoped, and when people tried to run it on their own computers unexpected things went wrong. (Also, it doesn’t work at all on Windows right now, although I think someday it will.) So unless you think you’re willing to tinker a bit and debug CMake problems if something goes wrong, you might be better off downloading one of the Option 2 packages above.

TODOs and alternatives

So: This is still all in progress. This is all a little jumbled. I’ll be continuing to make improvements to this. Here are the things at the top of my To Do list:

  1. I need better Windows support.
  2. This should all be documented more cleanly than it is.
  3. Rooms should be able to include 3D physics scenes– not just 2D ones.
  4. A lot of little features and cleanup are needed. Right now Polycode programs crash when you run them on a computer with an OpenGL 1.x video card; these should be supported, or at give you a polite error message. There should be an onClick handler for overlays. I think vsync is broken on Windows. There should be an option to use PortAudio for sound or at least do realtime-generated sound with OpenAL. And so on.

If you’re interested in helping push along Polycode development yourself, Polycode’s github page is here (my fork is the one marked “mcclure”) and as linked above I have a BitBucket project for Polyconsole here.

My goal here is to make something people can actually sit down with and use for development. So if you feel what’s linked above isn’t enough to be a usable development starting point in your view, feel free to post below and let me know what you think is missing.

In the meanwhile, you also may want to be aware of these alternatives to Polycode:

  • Gameplay3D– Haven’t used this but this is something RIM released, it seems to have many of the same features as Polycode including Lua scripting support and it supports mobile.
  • Panda3D– Haven’t used this either, again offers about the same features as Polycode, seems a lot more heavyweight than either Polycode or Gameplay3D but also very complete.
  • Unity3D– This is the gold standard in this kind of engine right now, and unlike Gameplay3D or Panda3D I’ve actually seen indie games using it. It is a little larger in scale, it comes with a VM so that you can run code written in C# or Javascript in it and it has a fancy IDE with a built in scene and 3D model editor. However, unlike everything else on this page, it is closed source and costs MONEY.
  • Love– This is a 2D-only, Lua-only game engine. I’ve heard really good things about it, and it will definitely be simpler than Polycode.
  • Jumpcore– This doesn’t much resemble the other items on this list, I link it only because I made it. This was what I used to make games before Polycode. It’s a really stripped-down, C++-only library that’s really more just a wrapper for porting SDL games to iPhone/Android, but if you are looking for a simple C++ starting point this might be useful for you.
  • Flashpunk and Flixel– These are some great 2D gaming Flash libraries. They offer programming interfaces like Polycode’s, but they’re really mature and have really strong communities. Since they’re Flash, you’re limited to what Flash can do and where Flash can run, but that’s increasingly “everything” and “everywhere”. I’ve got some brief experience with FlashPunk and liked it.
  • Pixie– Also a little different from the other listed items, but: These guys wrote a Stencyl-style editor for HTML5 games, in HTML5, so you can make a tiny game on their website and upload it right there. It seems really cool.

That’s all I’ve got


Wednesday, August 1st, 2012

This is a video game where the only way to interact with it is to write and execute Lua source code.


I made this back at the beginning of the year for a Super Friendship Club event, but at the time didn’t release it because I was trying to decide whether to expand on it. I think I’ve decided it’s done. The game requires an OpenGL 2.0 compatible computer to run.

I’ve been circulating this game in indie circles since I first wrote it, and to my knowledge no one has ever beat it. I got reports of a couple of people getting to level two.


Sunday, April 22nd, 2012

Made in 48 hours for the Ludum Dare 23 competition. The game is over when the music stops.


You can find the Ludum Dare competition entry page for the game here.

The game is based on this cellular automata. It requires an OpenGL 2.0 compatible computer to run.

My Own Footsteps

Sunday, December 18th, 2011

Ludum Dare is a periodically held competition to make a game in 48 hours. For this weekend’s Ludum Dare (theme: “Alone”), I waited until the last minute, started a game Sunday morning, and over about seven hours banged out the first and only 3D game I have ever made. It’s called My Own Footsteps and sound is recommended.

Thanks to amon26, Mark and Stephen for playtesting.

You can find the Ludum Dare competition entry page for the game here.


Saturday, October 15th, 2011

Found out yesterday that (1) a group of indie games folk are crashing the IGF with a “pirate cart” of over a hundred indie games, and (2) Klik of the Month Club is doing a competition today with the theme “violate Atari’s intellectual property”. Of course I had to do something, so I made this little 2-player game. I started it last night at about 10:30, and finished it this afternoon at about 2:30. So that would be about 16 hours, some of it spent sleeping.

It’s like pong, but with more pong.


Playing with sound is recommended.

You Don’t Fit

Sunday, August 21st, 2011

Here is a video game I made in 48 hours for the Ludum Dare competition. It is called You Don’t Fit.


You can find the Ludum Dare competition entry page for the game here.

Thanks to everyone who playtested, especially Kevin, Alice, and Yakul.

By the way: This game supports user-created levels using Inkscape. Check the comments below for instructions on how to do this.