Saturday, October 31, 2015

Log 110115

Read up on R. Okay. Read the sources of the Hugs interpreter, again. Did some programming. Man, I am slow.

How useful is Hot-Swapping?

So, turns out WhatsApp is built using FreeBSD and Erlang. And one of the distinguishing qualities of Erlang is that you can hot-swap modules. So that prompted the question: do I want to be able to support something like this, in principle if not in practice?

I have a construct in the language which I want to compile out but which crosses module boundaries. Now what? If you cannot, or shouldn't, compile it out, you probably need to support it, possibly all the way down to the runtime. Which is annoying.

How useful is Erlang hot-swapping anyway?

Friday, October 30, 2015

Log 103015

Well, I didn't do a lot today except think. Compiling out try/catch/throw will go down in the books as just another silly idea I once had. This looks like it was the last possible show-stopper in the interpreter. There is only work left, and today I certainly didn't feel like doing any.

I posted an idea on LtU about back-compiling a half-way evaluated expression to a program; that looks to be feasible and interesting. Not sure it is that useful though.

Thursday, October 29, 2015

It is that simple

I am leaning towards: If you can't compile it out (desugar it) trivially, then you probably shouldn't.

It ain't that simple: Exceptional Control Flow

Long story short: the current scheme for translating try/catch to a set of combinators is (mostly) correct, but leaks, since curried combinators might escape a try/catch block with exception handlers still referring to that block. Seems I made a bad call here.

So, I need to rethink this. And since I am now provably 'Not that Smart'™, I'll do it in this blog. First, a number of observations:

There is a translation scheme of try/catch to an 'unsugared' term. (I use the term 'unsugared' loosely for a source code to source code translation.)
This is trivial; any calculation can be translated to SK combinators if need be.
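As a sanity check on that claim, below is a minimal sketch in C++ (nothing to do with the actual implementation; all names invented for illustration) of classic bracket abstraction to S/K, plus just enough leftmost-outermost rewriting to test that the translated term behaves like the original.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Term;
using TermP = std::shared_ptr<Term>;

enum class Tag { S, K, Var, App };

struct Term {
    Tag tag;
    std::string name;   // variable name, for Var
    TermP fun, arg;     // function and argument, for App
};

TermP mkS()  { return std::make_shared<Term>(Term{Tag::S, "", nullptr, nullptr}); }
TermP mkK()  { return std::make_shared<Term>(Term{Tag::K, "", nullptr, nullptr}); }
TermP mkVar(std::string n) { return std::make_shared<Term>(Term{Tag::Var, std::move(n), nullptr, nullptr}); }
TermP mkApp(TermP f, TermP a) { return std::make_shared<Term>(Term{Tag::App, "", std::move(f), std::move(a)}); }

// [x]e: abstract variable x out of e, so that ([x]e) a rewrites to e[x := a].
TermP abstract(const std::string& x, const TermP& e) {
    if (e->tag == Tag::Var && e->name == x)       // [x]x = I = S K K
        return mkApp(mkApp(mkS(), mkK()), mkK());
    if (e->tag == Tag::App)                       // [x](f a) = S ([x]f) ([x]a)
        return mkApp(mkApp(mkS(), abstract(x, e->fun)), abstract(x, e->arg));
    return mkApp(mkK(), e);                       // [x]c = K c, x not free in c
}

// One leftmost-outermost S/K reduction step; nullptr when e is normal.
TermP step(const TermP& e) {
    if (e->tag != Tag::App) return nullptr;
    if (e->fun->tag == Tag::App && e->fun->fun->tag == Tag::K)    // K a b -> a
        return e->fun->arg;
    if (e->fun->tag == Tag::App && e->fun->fun->tag == Tag::App &&
        e->fun->fun->fun->tag == Tag::S)                          // S f g x -> (f x) (g x)
        return mkApp(mkApp(e->fun->fun->arg, e->arg),
                     mkApp(e->fun->arg, e->arg));
    if (TermP f = step(e->fun)) return mkApp(f, e->arg);
    if (TermP a = step(e->arg)) return mkApp(e->fun, a);
    return nullptr;
}

TermP normalize(TermP e) {
    while (TermP n = step(e)) e = n;
    return e;
}
```

For example, `abstract("x", mkVar("x"))` yields `S K K`, which rewrites any argument to itself.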

If we can unsugar try/catch, we should.
It greatly simplifies the runtime if I only need to think about combinator applications. There is the possibility that I simply shouldn't try to unsugar this, to keep the operational model clean, but that's unlikely.

The translation of try/catch to an unsugared term is a 'must have.'
I want to do combinatorial rewriting, mostly to see where I'll end up, but also to experiment with what it enables. If this is to be a simple, slow toy, it should at least trivially enable certain functionality usually not found in other languages. If all computations are simple terms being rewritten, we might enable:
  1. Trivial low overhead concurrency. Concurrency deflates to interleaving evaluation steps amongst any number of terms.
  2. Trivial suspension of computations. A term could, for instance, be stopped, written to disk, and resumed at any given point.
  3. Trivial mobile code. Any term can be serialized, sent to another place, and be resumed there.
The last one is a must have. I want to see whether I could create a better R, or Python, for data centre applications, and the ability to move code to where the data resides is a must have. In Spark, this is achieved by either serializing Java bytecode for Java/Scala, or pickling Python objects, and sending the abstract calculations to their destinations. I want this.
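Point 1 can be made concrete with a toy scheduler: if every computation exposes a single rewrite step, concurrency deflates to round-robin stepping. A hypothetical C++ sketch (all names invented), with counters standing in for terms being rewritten:

```cpp
#include <cassert>
#include <vector>

// Stand-in for a term being rewritten: one call to step() is one
// 'evaluation step'; it returns false once the term is in normal form.
struct Counter {
    int id, n, limit;
    std::vector<int>* trace;     // records which task stepped when
    bool step() {
        if (n >= limit) return false;
        ++n;
        trace->push_back(id);
        return true;
    }
};

// The whole scheduler: interleave single steps round-robin over any
// number of tasks until none of them can make progress.
void run_interleaved(std::vector<Counter>& tasks) {
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (auto& t : tasks)
            if (t.step()) progressed = true;
    }
}
```

Running two counters with limits 3 and 2 yields the interleaved trace `1, 2, 1, 2, 1`: the tasks take turns, with no threads or callbacks involved.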

The unsugared term must allow for efficient evaluation.
Okay, here's the real catch. The most trivial source to source translation is simply having any calculation return a value from a disjoint union; i.e., any computation returns either a normal value or an exceptional result and either proceeds normally or passes on the exception. But the extra checks are wasteful, blow up all terms, and are inefficient in that they also need to 'pass up' exceptional results instead of simply 'tearing down a stack'.
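For illustration, the disjoint-union translation might look like the following C++ sketch (hypothetical, using `std::variant`; the function names are made up). Note how every call site grows an explicit check that merely passes the exceptional result up, which is exactly the overhead described above.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct Exn { std::string msg; };            // an exceptional result
using Result = std::variant<int, Exn>;      // normal value or exception

// Every 'compiled' function returns a Result...
Result safe_div(int a, int b) {
    if (b == 0) return Exn{"division by zero"};
    return a / b;
}

// ...and every caller must inspect it and propagate the exceptional
// case by hand instead of simply tearing down a stack.
Result sum_of_quotients(int a, int b, int c, int d) {
    Result r1 = safe_div(a, b);
    if (std::holds_alternative<Exn>(r1)) return r1;   // pass the exception up
    Result r2 = safe_div(c, d);
    if (std::holds_alternative<Exn>(r2)) return r2;
    return std::get<int>(r1) + std::get<int>(r2);
}
```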

The design space with magical combinators is limited.
Since we don't want the above translation, that warrants the idea that we should introduce 'magical' combinators which can modify the control flow.
In a term rewriting system, modifying control flow boils down to being able to rewrite the term in unexpected manners (and continue evaluation somewhere else).
For instance, a 'magical' combinator could have a value pop up somewhere else in the term, or erase and substitute entire terms in a non-normal fashion and then force evaluation elsewhere.

And we have two rough solutions for these 'magical' combinators: either do it more or less 'purely', by the creation of control flow objects which are passed around and hold information about what part of the term to rewrite, or do it 'impurely' with combinators which through side effects manage and query external tables referencing the term.

Of course, I prefer the first solution but fell flat on my face there. In the case of try/catch, since control flow objects might escape the local context through returned curried applications, those control flow objects cannot be as 'pure' as I imagined. (This is solvable, however, by making them impure; i.e., the control object becomes an impure shared stack of contexts.) Moreover, passing around control flow objects is again wasteful, as it usually means passing an extra argument around.

So there you have it. I want 'pure' terms, but for performance reasons I should give up on that idea and use side-effecting impure combinators; i.e., all terms are evaluated within a term-referencing context modified and queried by 'magical' combinators. Which implies a simple scheme where I introduce combinators for try/catch and throw that impurely modify, or extract information from, some tables. Tada. And this again bites with low-overhead concurrency and mobile code.
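A hypothetical C++ sketch of that impure scheme (all names invented): a shared stack of handler contexts is pushed by the try combinator and consulted by the throw combinator, with C++'s own unwinding standing in for the runtime's ability to tear down a stack.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// The shared, impure table: a stack of exception-handler contexts,
// mutated as a side effect by the try/throw combinators.
static std::vector<std::function<int(int)>> handler_stack;

struct Unwind { int value; };   // internal signal used to tear down

[[noreturn]] void egel_throw(int v) { throw Unwind{v}; }

int egel_try(const std::function<int()>& body,
             const std::function<int(int)>& handler) {
    handler_stack.push_back(handler);
    try {
        int r = body();
        handler_stack.pop_back();          // normal exit: drop the context
        return r;
    } catch (Unwind& u) {
        auto h = handler_stack.back();     // nearest enclosing handler
        handler_stack.pop_back();
        return h(u.value);
    }
}
```

Note that the external table is precisely what gets in the way of suspending or serializing a term: the handler stack lives outside the term being rewritten.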

So, now I wrote it down. Unsure what to do. It ain't that simple.

Wednesday, October 28, 2015

Log 102815

Various. Thinking about a simplistic runtime. Since I won't be compiling but interpreting, the sizes of various things are now run-time variables, not compile-time constants. And C++ doesn't have variable-sized (array) structs. They don't exist, period. Which implies I cannot implement natively reference-counted thunks without doubling the number of allocations: one for the thunk, one for the variable-sized array of arguments. Bad luck, I guess.
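That said, the usual workaround is one manual raw allocation that places the argument array directly behind the header, recovering C's flexible array member by hand. A hypothetical sketch (names invented, not the actual runtime):

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// A reference-counted thunk and its variable-sized argument array in a
// single allocation: the argument slots live directly behind the header.
struct Thunk {
    int refcount;
    std::size_t arity;

    // Pointer to the trailing argument slots. Thunk's size and alignment
    // make (this + 1) suitably aligned for an array of void*.
    void** args() { return reinterpret_cast<void**>(this + 1); }

    static Thunk* make(std::size_t arity) {
        void* mem = ::operator new(sizeof(Thunk) + arity * sizeof(void*));
        return new (mem) Thunk{1, arity};   // placement-new the header
    }

    static void release(Thunk* t) {
        if (--t->refcount == 0) ::operator delete(t);
    }
};
```

One `make` call, one allocation: header plus arguments, instead of the two allocations lamented above.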

I'll need to extend the try/catch transform and strengthen an invariant. Bloody currying... 

Tuesday, October 27, 2015

Log 102715

I implemented the transform which compiles out try/catch and throw blocks. The aim for the interpreter is to become a useful 'toy', very minimalist with an interactive mode for implementing small mathematical scripts.

The interpreter should be implemented in an excruciatingly simplistic manner.

The catch/throw compilation step is not entirely trivial and bites with a possible interactive mode. I need to think over how to implement the interactive mode upfront. I am too used to batch compiling things.

Thursday, October 22, 2015

Compiling out Try/Catch/Throw

Below is a piece of code I need to understand again. Screw me, not the foggiest.

namespace lambda (

    using system
    using list

    def reset: lambda = lambda_comb "_reset"

    /** The untry transform compiles out throw/catch statements by 
     *  supplying the 'exception handler' as the first argument to all
     *  functions. It also introduces a reset combinator for try/catch
     *  blocks.
     *
     *  - A definition gets one extra argument.
     *    .. F _ (def f = e) = def f = \x -> F x e
     *  - Any combinator is passed the handler.
     *    .. F x (X) = (X x)
     *  - A throw becomes an application of the exception handler to 
     *    the supplied value.
     *    .. F x (throw e) = x (F x e)
     *  - A try/catch block introduces the reset combinator
     *    .. F x (try e0 catch e1) = R (\s. (\y. F y e0) (\z. s (F x e1 z)))
     */
    def untry_transform: lambda_transform lambda =
        abs = [ v, l -> lambda_abs (list.single v) (lambda_alts (list.single l)) ];
        [ ef, lambda_def f l ->
            x = lambda_fresh_name nop;
            (_, l) = untry_transform x l;
            l = abs x l;
            l = lambda_def f l;
                (ef, l)
        | ef, lambda_comb t ->
            l = lambda_app (lambda_comb t) ef;
                (ef, l)
        | ef, lambda_try l0 l1 -> 
            y   = lambda_fresh_name nop;
            s   = lambda_fresh_name nop;
            z   = lambda_fresh_name nop;
            (_, l0) = untry_transform y  l0;
            (_, l1) = untry_transform ef l1;
            l0  = abs y l0;
            l1  = abs z (lambda_app s (lambda_app l1 z));
            l2  = abs s (lambda_app l0 l1);
            l   = lambda_app reset l2;
                (ef, l)
        | ef, lambda_throw l -> 
            (_, l) = untry_transform ef l;
            l = lambda_app ef l;
                (ef, l)
        | ef, l -> 
            lambda_child untry_transform ef l ]

    def untry: lambda -> lambda =
        [ l -> (_, l) = untry_transform (lambda_comb "nop") l; l ]

)

Catanastrophe: The Catagorical Mistake

I'll admit it, I am pleased with myself. I predicted on LtU that Haskell's fixation on category theory would lead nowhere, and some people now seem to agree.

The problem: software engineering concentrates on analyzing and describing rich (algebraic) structures, whereas category theory concentrates on the trivial, superficial structure of theories. There's nothing there to lift, except for the trivial, which you probably could code in half an hour anyway. So you end up with incomprehensible theories on elaborate types, providing all kinds of annotations describing structure and functionality you'll probably never use, when you could code the functionality you need yourself, but are forced to comprehend all the trivial observations anyway.

Of course, Haskell is a research language, a moving target, and I've no problem with being proven wrong somewhere in the future. But for the moment, I'll assume I was right and that other people are now starting to find out and agree.

So maybe it's not such a bad idea to have an untyped functional language?

Log 102215

So. Progress is slow but steady. Progress so far: provisional Unicode lexer (doesn't handle all cases), provisional parser (doesn't handle operator precedence right), provisional semantics checker (doesn't check all invariants), basic support for desugaring (I am now implementing all cases).

Desugaring is simple. I am going to desugar guarded rewrites, which will make them very expensive, but I don't think they're really worth supporting at this stage, or ever. The motto will be: you can use guards, but please don't if you can avoid them.

I am going to drop LLVM support and will write a rationale about that in the next post.

I am implementing in C++; not a bad choice, but I somewhat regret having dropped Java (bytecode) support.

First Post

First post for the Egel Language, an untyped eager scripting language.