Statement of Intent
April 22nd, 2014I think I have decided that these are the things I most want in a programming language:
- 1a. Objects with string keys, which double as dictionaries.
2a. Closures.
3a. A way to store a closure in an object and invoke it with the relevant object bound to some variable within the function (“this”).
4a. Prototype-based inheritance for objects.
5a. The potential to instantiate a closure, or an arbitrary singleton object, inside of an expression (“lambda”).
6a. Gradual typing (with calls from untyped code into typed code being protected by something like a check and an exception throw at runtime).
7a. Two-way communication with C++ (C++ objects may be visible within the language, language objects may be passed into C++ code).
8a. Two-way communication with Javascript.
9a. Support for multiple concurrent threads of execution, with interaction between by means of message passing.
10a. It must be possible for me to develop a program on one operating system and execute it on another operating system (i.e. there is a VM or a cross compiler).
And here are some things that I think would be very nice in a programming language:
- 1b. Pattern matching.
2b. Syntax support to allow me to declare curried functions, as transparently as if they were non-curried functions.
3b. Function invocation with named arguments (i.e. Python keyword args).
4b. Suspension and resumption of stacks (coroutines or generators).
5b. Atoms (interned strings which can be used as object keys without paying for a string lookup).
6b. Properties (i.e. overloading of assignment).
7b. The language itself should be ultimately defined as a AST, with multiple possible syntaxes reducing down to that single AST.
8b. Language features (ints, exceptions) should be possible to enable or disable for individual programs or pieces of code. Not all features fit all projects.
9b. There should be a potential for (arbitrary and user-defined?) type “adjectives” which are assertions that are remembered as part of the type and checked as part of the type checker. (Is this object mutable? Is this object mutable with regard to its set of keys? Does this function have side effects? Are these two properties recursive for the related object/call graph? Does this function ever use “this”? Is this object ever accessed with a non-atom key? Is this integer greater than three?)
10b. Potential to directly manipulate the scope of a closure (I.E. mappings of unbound variables in the function body) after function declaration.
11b. The collapsing of functions and objects. (“Look up key ‘f’ on object a” and “Invoke method a with unary argument ‘f'” should be the same operation, same syntax, same behavior. Function scope and prototype chain should be, insofar as the user can tell, the same mechanism).
12b. Customizable automatic casting rules (if we have 1b, applied by a user-selected pattern-matched function).
13b. Syntax support for object “classes” that have the familiar semantics of Python or Java. Ideally, classes would just be a pattern implemented on top of prototypes, but the syntax should make this pattern easy and the class identity should be visible to things like debuggers.
14b. Some simple user-configurable syntax redefinition (like the ability to define an operator ^ where invoking 3^4 should be actually interpreted as exponent(3,4), and ^ has a user-defined operator precedence, etc).
15b. The ability to have some sort of object which is ultimately represented at runtime as a packed memory buffer, such that saying “set element 4 of x to 3” is actually writing “3” into a specific well-defined byte or bytes in memory. (This is a useful thing both because of performance optimizations it makes possible, and for interacting with hardware like GPUs which only understand byte streams.)
I would like to explore ways to get to a point where I have a programming environment that satisfies everything from the A list and some emotionally satisfying number of items from the B list. In seeking such a language, there are incidentally some conditions that any language I spend time using will have to satisfy. I don’t want to have an argument with anyone about this final list; just think of them as personal preferences.
- 1x. It must not require me to program with S-expressions.
2x. It must not require the use of the JVM or CLR. (I am potentially willing to compromise on the CLR.)
3x. It must not force me to program in a way which avoids the use of side-effects or state.
4x. It must not require me to use any closed-source compiler, core library, or development tool.
No language currently exists which provides everything on my A list. Some of the items on the B list, particularly 10b and 11b, do not exist in any language that I am aware of. Most of the most interesting items on the B list only seem to exist in languages which fail one or more items on the X list. Any currently existing language will be a compromise. How fair do these compromises look?
If I use Lua, I am happiest. Lua gets me 1a, 2a, 3a, 4a, 5a, 10a, 4b, 6b, and 13b, 15b if I’m using Luajit, and since I have 5a I can awkwardly approximate 3b. No language I know of does 1a better than Lua. Unfortunately, 7a and 9a, which Lua does *not* support, are two of my *most* important conditions– I literally can’t develop without those two, whereas missing 1a through 5a will merely make me sad.
- (Lua gets something close to 7a– two-way communication with *C*– and two-way communication with C can be used to construct trampolines that *approximate* communication with C++, but in my experience using this technique is so frustrating I would rather avoid Lua altogether than attempt it again. There are also incidentally projects which purport to offer 8a for Lua, but I have not seen these projects actually working and I believe that they would need additional development before being actually used.)
If I use Javascript, I do get 1a, 2a, 3a, 4a, 5a, 8a obviously, 10a, and 6b. If I am using Javascript ES6, I also get 9a, 3b and 4b. Actually, Javascript does pretty well against my lists. Unfortunately, some of the personally important items on the list Javascript satisfies, it messes up quite badly. For example it offers 4a, but uses a baroque method for defining an object with a prototype, and it does not offer any way when overriding a prototype method to invoke the “super” method. That last one’s pretty huge; “super” is a basic feature of all object systems which, since it is present in Self, I would argue without “super” you don’t have a prototype-based inheritance system at all. On similar lines, Javascript offers 2a but essentially ruins it by not (before ES6) offering proper block scope; if I can’t readily create a closure in (for example) a loop, that is so awkward I probably won’t ever use closures at all.
There is another problem with Javascript: It is an unending cavalcade of horrors. Javascript feels *shoddy*. Many extremely basic language features are riddled with weird exceptions, and exceptions to exceptions, such that you never feel certain what the code you have just written does; many kinds of simple typos result in a silent failure or unexpected behavior rather than any kind of error. (The type coercion rules alone would feel like a great argument against using Javascript even if everything else about the language were perfect.) I very much like what is in this toolbox, but the individual tools all feel as if they will fall apart in my hands if for example I hit anything too hard with that hammer. This shifting-sand problem is exacerbated by the dramatic variance between Javascript versions and implementations; the primary benefit of Javascript is its huge installed base, but only a small subset of the syntax is “safe” in the sense that an acceptable portion of your installed base is compatible with it. This is an unfortunate property when we are talking about mere syntactical convenience features: 3b and 4b are nice, but I would not literally give up users to get them.
- (I have not investigated if Javascript satisfies 7a).
If I use C++, I get 4a (or rather 13b, class-inheritance alone, which is close enough for me), 6a (not gradual typing but strong typing, which I don’t mind), 7a (sort of– assuming you never dynamically link against anything from a competing C++ compiler), 9a (with memory sharing instead of events, but you can add the events via a library), 10a (with *great* difficulty), 15b (although since 15b is *all* C++ offers for data storage, it isn’t very pleasant), and an extremely limited 14b that can be used to get 6b (although only for some kinds of variables). There is also a horrible, confusing mockery of 12b which really the language would be better off if they had not tried to offer this feature. If I use C++11, I get 2a, but it’s gross (the closures capture variables manually, rather than automatically; they’re only created if you use the “lambda” feature, which is totally separate from any other kind of function or method; and the syntax is incredibly ugly). C++11 also has “auto”, which is… one step closer to my ideal form of 6a than strong typing alone would be, I guess. Overall C++ and C++11 don’t offer *anything* exactly the way I’d most prefer it, and the only items on my list I’d even feel C++ is really “good enough” at are 4a/13b and 9a.
- (At this point one notices something interesting about my criteria 7a: Not even C++ does very well against it. This is a hard criteria to meet.)
I’ll leave it as an exercise to the reader to test my criteria against Python, ML, Haskell, and Erlang.
Looking at my options, I feel that if I could somehow hack Luajit to offer 6a and 7a and also support Lua Lanes (which provides 9a), I’d have a language I were totally satisfied with. I spent a while seriously considering doing just that. However, Luajit is big, and 7a in *particular* is such a hairy thing to implement that I wouldn’t want to attempt it in someone else’s codebase.
So I think if I want a language that satisfies all of these things, it would have to be a new language. Along those lines, I have written a follow-up post here.