Egel Language: July 2017

Monday, July 31, 2017

A short post on Concurrency

Right. So, I guess I am going the standard route to concurrency in a rewrite system and implement a 'par' primitive. I think it can be done without changes to the runtime.

'par f g' should do the following things:

Create a node for the results of evaluating the two arguments.
Start two threads which concurrently evaluate 'f nop' and 'g nop' which place their results in the node after being evaluated. Note, a dummy argument is necessary since otherwise both arguments would be fully reduced.
Block until the arguments are fully reduced.

Then there is exception handling. What to do if either thread returns an exception? There are two solutions.

The simplistic manner. Just place the exception thrown in the result node.
The right manner. As soon as either thread throws an exception, stop the other thread, and throw that exception upward.

There are no users, I expect no users, I'll do the simplistic manner.

Object Orientation

Since I have mutable state and records, an object oriented (OO) model is possible. The 'problem' is in late binding of methods, mostly. Below, a solution, though there are several approaches. It is split into two parts, general definitions which introduce a few combinators for constructing and updating objects and a few examples.

/*
 * There are several manners in which you can try to emulate object
 * orientation (OO) in Egel. This is one manner.
 *
 * An object has a mutable state and a collection of named methods. 
 */
import "prelude.eg"

// a namespace OO which introduces the 'object' datatype and 
// accessors.
namespace OO (
    using System

    data object

    def object_get_state =
        [ object STATE MM -> get STATE ]

    def object_set_state =
        [ object STATE MM, S -> object (set STATE S) MM ]

    def object_get_method0 =
        [ cons NAME0 (cons METHOD MM), NAME1 ->
            if NAME0 == NAME1 then METHOD else object_get_method0 MM NAME1
        | {}, NAME1 -> NAME1 ]

    def object_get_method =
        [ object STATE MM, NAME -> object_get_method0 MM NAME ]

    def @@ = 
        [ SELF, NAME -> (object_get_method SELF NAME) SELF ]

)

using OO
using System

// examples of a box and a circle class.
def box =
    [ W, H -> 
        object (var (W, H)) 
        { "resize", 
          [ SELF, WW, HH -> object_set_state SELF (WW, HH) ],
          "area",
          [ SELF -> [(W, H) -> W*H] (object_get_state SELF) ] }
    ]

def circle =
    [ R -> 
        object (var R) 
        { "resize", 
          [ SELF, R -> object_set_state SELF R ],
          "area",
          [ SELF -> [R -> (tofloat (R*R))*3 ] (object_get_state SELF) ] }
    ]

def main =
    B = box 1 5;
    AREA0 = B @@ "area";
    N = (B @@ "resize") 2 3;
    AREA1 = B @@ "area";
        (AREA0, AREA1)

This just shows it is possible to have OO in the graph rewrite system. There are other approaches. The named fields could be introduced as constants, the attributes and methods could be coalesced into one list, or map, of fields; i.e., essentially a named record. And some system could be supported through syntactic sugar.

It's good enough for now, I rather fix concurrency.

Sunday, July 30, 2017

Log 073017

I hardly needed to debug the Unicode bug, it seemed to have been a simple matter of using UChar instead of UChar32 which I noticed after just glancing over the code. The floating point bug was just a matter of setting the precision right on the output stream. Still took me about four or five hours; not sure why, I might have been unmotivated.

I cleaned up some example code. There's still a lot of stuff to implement, none of the basic libraries are up to par, I lack proper text handling primitives and I need to add concurrency.

Moreover, of course, Python is still the better tool because of OO. But I got variables, which means mutable state. I might be able to support a trivial OO model? Unsure. Then again, mutable state is already more than most small lambda calculus based OO encodings... Lots to do. I think I'll go for concurrency first, that seems the most interesting.

Saturday, July 29, 2017

Log 072917

There are two things I need to get right before the Egel interpreter is in any state of being made public. One is handling of Unicode, one is handling of floating point numbers.

The problem with Unicode is severe since it doesn't reproduce consistently.

[marco@stallion src]$ ./egel

>> "asfdas"
"asfdas"
>> "hello world!"
"l"
>> "!"
"!"
>> "h!"
"h!"
>> "hello world!"
"ɕ"

I am simply not a Unicode expert. Of course, the code looks right -just a number of conversions- but I simply don't know what goes wrong here.

Then there is the floating point handling.


[marco@stallion src]$ ./egel 
>> using System
>> 131.153535
131.154
>> 0.12341424 / 132414412.0
9.3203e-10

Somehow the float conversion doesn't seem to produce a 64bit representation. Or, it looks like it has an internal 64 bit representation but the conversion from and to a double seems to assume less precision. At least, this bug produces consistent behavior. Guess I am stuck for a few days. For the Unicode bug I might need an expert, though.

Friday, July 28, 2017

Local Variables

So, the following trivial solution would give me local variables with a few simple combinators.

[ X -> Y = var X; _ = set Y (get Y + get Y); Y ] 7

If var stores a tree, then the result is a tree. If get retrieves a previously stored tree, then the result is a tree. If set stores a tree, then the result is a tree. If I am wrong, then f*ck it, and I am gonna call it an unsafe feature.

Log 072817

I implemented some things - an environment variable for the include path, a build script, conversion routines, a prelude file - and debugged those. I failed to find a compiler switch to solve a problem with late resolution in shared libraries, solved it in another manner, and found the compiler switch later anyway.

Almost all functionality is there except for assignment. I first thought I would implement assignment on global variables only but it dawned on me that I have been wrong: it should be possible to have assignment on local variables also. I need the computation graph to be acyclic, but it looks it is sufficient to have assignment on variables only; as long as I don't update fields that should satisfy the constraint too.

Maybe take a small break and think on that. I hope I don't need a complete overhaul.

Thursday, July 27, 2017

Log 072717

Observe the pretty date! I haphazardly moved some code around and implemented further functionality for primitive input/output and basic math libraries. Of course, there were -and still are- some small bugs but the scheme I use seems to work.

So what is left is to make both libraries conceptually neat, and move on and implement a 'prelude' for the interpreter.

Wednesday, July 26, 2017

Log 072617

Well, the interpreter prints. That is, the scheme I've got to implement primitive input/output and a math library compiles and the interpreter is able to run dynamically loadable libraries.

It's a bit rough around the edges and I need to move some code around still, but it works.

Tuesday, July 25, 2017

Excel bindings for Egel?

So, I want to do this with Egel once.

Saturday, July 22, 2017

Log 072217

So, Thursday I wrote an introduction to the evaluation strategy I employ in the Egel interpreter. I moved that to the main website and started shaving, rewriting, and removing. Lastly, I updated the pictures a few times which took some work because Inkscape isn't very good at keeping a similar layout for different pictures. Now, it's Sunday and I am reasonably content with the result, though it might be somewhat informal to some people. But, ah well, this is the first time I had the opportunity to get half-baked 'feelings' right while loosely relating what I designed to other approaches so I guess I was entitled to wasting some time on it.

Meanwhile, I didn't code. I decided upon a C++ implementation for primitive input/output because reasons -callback hell, the convenience of Unicode support, minimal dependencies, C++ is reasonably portable, etc.- but that also implies a lot of work I can't seem to find the energy for at the moment.

I need to:

Implement channels in C++, an abstraction over several iostreams which is going to be pretty verbose in that it needs to support a lot of operations.
Write IO combinators employing that abstraction in a dynamically loadable module.
Test all of that. The channel abstraction, the combinators, and module loading.

Too. Much. Work.

The Unreasonable Effectiveness of Dynamic Typing

The Unreasonable Effectiveness of Dynamic Typing for Practical Programs from JavaZone on Vimeo.

Thursday, July 20, 2017

Operational Semantics of Eager Combinator Rewriting

I EXPANDED ON THIS POST AND ADDED IT TO THE MAIN WEBSITE.

I got a question about the operational semantics of the eager combinator rewrite system I use. It's embarrassingly facile and trivial to explain with two pictures, so here goes.

Say, you have a term language where you want to rewrite an expression consisting of combinators. In Egel's case, each combinator knows how to rewrite itself, therefore, corresponds to a procedure or some code.

You could use a stack and push stackframes for F, G, and H. However, I wanted to experiment with another model, rewriting. Note that since we're rewriting eagerly we're rewriting the outermost expression first.

The operational semantics Egel uses is extremely simplistic: Start with the node which is to be rewritten first, store the result where necessary, and continue with the next node to rewrite.

That's it. Note that the resulting graph is still a directed acyclic graph. That's what enables me to implement Egel's interpreter with native C++ reference counted pointers.

This is, of course, a slow and restrictive manner of evaluation. For one, stack based evaluators are simply faster because pushing data to, or popping from, a stack is less computationally expensive than allocating, and deallocating, heap nodes. Second, it's a term rewrite system so you can't allow for assignment since that would usually allow you to create cyclic structures.

The benefits of this model are threefold. For one, you don't run out of stack space which is really important in a functional language as I have experienced, unfortunately. Second, it is still a tree rewrite system so you don't need to care about tail call optimization. Third, you can reference count.

I simply like this model of evaluation, there isn't much more to it.

Notes: Yes, these are thunks or heap allocated stack frames. No, this isn't done with CPS since CPS is a transformation, not an evaluation strategy. Bytecode is generated directly for this model.

More: Egel also supports exceptions which is a natural extension of this model. It is left as an exercise to the reader.

Wednesday, July 19, 2017

Log 071917

I took a break after running into a few snags. For one, the Ocaml input/output (IO) specification wasn't very good. It neatly describes channels and cleanly separates them into input and output; that looks nice at first glance but what if you want both? I.e., a naive database application usually consists of seek operations followed by reads and writes. It's an odd design decision.

Then there is C++'s manner of working with stdin, stdout and stderr as well as istreams, ostreams and fstreams. They made the standard IO streams uncopyable! Implying you can only pass references but never copy them. Of course, you might be tempted to pass pointers but you run into another whole set of problems. For one, expect lots of template type errors, and also, you're not supposed to call delete when a standard stream goes out of scope.

Then there is the class inheritance design of C++'s streams. It's over-engineered to meet the requirements of an OO type system; here you also see that the type system forces a solution which is simply not that neat though multiple inheritance helps.

Looking at both specifications I can only conclude that C's POSIX file handles specification simply wasn't that bad.

I also started looking at libuv as the basis of file IO, and later multithreading. And I hit another snag: Callbacks don't neatly fit into a mostly referential transparant system. (Well, so far. Unless I come up with a scheme to make it fit.) The problem: Sure you can define a callback, sure it will read values from a pipe, but it can't do anything with it. It simply cannot inform another thread easily to continue evaluation with a calculated value. (Imagine multiple reading threads and one thread who consumes and writes the result back. It's difficult to combine results from multiple threads with one callback.)

I am tempted to roll out my own temporary solution for primitive IO. I am also tempted to ignore libuv and write a PAR combinator using C++ primitives. But libuv is cross-platform, well-maintained, scalable, and popular...

Sunday, July 16, 2017

Opaque Combinators

It's design time! As the implementation progresses I am working towards a solution for primitive input/output (IO) in Egel. As a design specification, I simply copied the, or one, interface of Ocaml's primitive IO libraries from a website. Keeping fingers crossed they thought that design through.

That means I am leaving purity behind. But then, if you want pure combinators, you can always build them upon the impure interface so not much is lost.

The bigger problem is that I now need to come up with some kind of scheme for impure 'stateful' combinators. I.e., a fix for file handles, raw pointers, matrices, etc.

So what have I got? For one, the symbol attached to a combinator isn't enough to uniquely describe it. Moreover, equality is problematic and needs to be handled elsewhere. The runtime can't look into the combinator since it simply doesn't know, cannot know, what it contains; but it needs a symbol attached it seems.

Right. A bit foggy but some solution is dawning on me.

Log 071617

Right. I managed to implement the N-Queens problem which the interpreter solves in around 13 seconds, GHC just promptly delivers the solutions.

That either implies I have a very slow runtime -which I know- or GHC sneakily also enjoys the benefit of lazy evaluation on this problem. I.e., I don't understand the solution I translated but the standard functional approach is to generate all board positions and filter out all admissible ones. GHC might benefit from lazy evaluation in that it doesn't need to produce all board positions but just needs to inspect whether a partly constructed solution is admissible according to the constraints, that would give almost back-tracking behavior on the solution space. (If I got it correct, the solution space is over-constrained, you only need 92 out of roughly forty thousand solutions. It helps a lot if you can swiftly dispose of most of them.)

Anyway, this task is far too complex for where the Egel interpreter is now in its development. I decided to implement the solution to a far more trivial problem, Euler project's task one. Sum all the integers from one to a thousand which are dividable by either three or five. Below the code, without the prelude primitives.

namespace ThreeOrFive (

    using System
    using List

    def sum = foldr (+) 0

    def three_or_five = [X -> or ((X%3) == 0) ((X%5) == 0) ]

    def euler1 =
        [ N -> (sum @ (filter three_or_five)) (fromto 1 N) ]

)

using ThreeOrFive

def main = euler1 1000

Friday, July 14, 2017

N-Queens

Woot! The N-Queens problem as translated from Haskell to Egel. I have absolutely no idea what the code does.


namespace System (

    def flip = [ F, X, Y -> F Y X ]

    def or =
        [ false, false -> false
        | X, Y         -> true ]

    def and =
        [ true, true    -> true
        | X, Y          -> false ]

    def @ =
        [ F, G, X -> F (G X) ]

)

namespace List (

    using System

    def length =
        [ nil -> 0
        | cons X XX -> 1 + (length XX) ]

    def foldl =
        [ F, Z, nil -> Z
        | F, Z, cons X XX -> foldl F (F Z X) XX ]

    def foldr =
        [ F, Z, nil -> Z
        | F, Z, cons X XX -> F X (foldr F Z XX) ]

    def ++ =
        [ nil, YY -> YY
        | cons X XX, YY -> cons X (XX ++ YY) ]

    def map =
        [ F, nil -> nil
        | F, cons X XX -> cons (F X) (map F XX) ]

    def reverse = 
       foldl (flip cons) nil

    def block =
        [ 0 -> nil
        | N -> cons (N-1) (block (N-1)) ]

    def fromto =
        [ X, Y -> 
            if X < Y then
                reverse (map [ N -> N + X ] (block (Y-X)))
            else nil ]

    def zipwith =
        [ Z, cons X XX, cons Y YY -> cons (Z X Y) (zipwith Z XX YY)
        | Z, XX, YY               -> nil ]

    def any =
        [ P, nil        -> false
        | P, cons B BB  -> if P B then true else any P BB ]

    def all =
        [ P, nil        -> true
        | P, cons B BB  -> if P B then all P BB else false ]

    def elem =
        [ X -> any ((==) X) ]

    def notelem =
        [ X -> all ((!=) X) ]

    def insertEverywhere =
        [ X, nil -> {{X}}
        | X, cons Y YY -> cons (cons X (cons Y YY))

                 (map (cons Y) (insertEverywhere X YY)) ]

    def concatMap =
        [ F -> foldr ((++) @ F) nil ]

    def permutations =
        foldr (concatMap @ insertEverywhere) {{}}
)

namespace NQueens (

    using System
    using List

    def nqueens =
        [ 0, NCOLS     -> cons nil nil
        | NROWS, NCOLS ->
            foldr
              [ SOLUTION, A ->
                  A ++
                  (foldr
                    [ICOL, B ->
                        if safe (NROWS - 1) ICOL SOLUTION
                          then B ++ ({SOLUTION ++ {ICOL}})
                          else B]
                    nil
                    (fromto 1 (NCOLS+1))) ]
            nil
            (nqueens (NROWS - 1) NCOLS) ]

    def safe =
        [ IROW, ICOL, SOLUTION ->
            notelem true
             (zipwith
                [SC, SR ->
                   or (ICOL == SC) 
                  (or (SC + SR == ICOL + IROW) 
                      (SC - SR == ICOL - IROW))]
                SOLUTION
                (fromto 0 IROW)) ]
)

using List
using NQueens

def main = length (nqueens 8 8)

Unfortunately, it takes some.. time.. A lot longer than Haskell which just prompty gives the answer.


[marco@stallion src]$ time ./egel ../examples/nqueens.eg 
92

real 0m13.020s
user 0m13.000s
sys 0m0.001s

But then again, I am more looking for automating tasks on matrix operations then anything else. So, I am happy.

Log 071417

Over the last few days, I fixed a number of trivialities and added some basic functionality. Escaping/unescaping on characters and strings, got an interactive prompt running, added basic -recursive- comparison operators on runtime objects, fixed a lifting bug in try/catch (and introduced a new one), and some more stuff.

I am working towards implementing an N-queens example in Egel but hit a snag, a core dump on a program. Well, that's a long time ago. So, a) good I started on a less trivial program, and b) bad I hit a core dump on that. That shouldn't happen anymore.

A well. Just wrinkling out stuff, I guess. Hope this bug ain't too pesky.

EDIT: Didn't turn out too pesky. I forgot a recursive case in desugaring lists, which probably implied code generation on an AST node which shouldn't exist. Phew...

Thursday, July 13, 2017

Log 071317

Right. So the way I work is that I program a bit, hit a problem, ponder too long on the problem, lose interest, and half a year later -after I forgot what that nasty little thing was all about- return to coding.

Ridiculous, I admit.

Anyway, I moved twenty lines around and went onto the next problem of getting interactive mode right.