Wednesday, October 4, 2017

Eval

I am still a bit dumbfounded how to get something useful out of the Commonmark bindings. One manner of making it useful, as others showcased, is to add Yaml meta-information and a template system to a processor so I looked at the various Yaml C/C++ implementations but shied a bit away of yet another binding.

The other manner is to use Egel for metainformation or even post-processing rules. Egel-enabled Commonmark where the language plays a similar role as Javascript plays for HTML.

And, of course, that idea is somewhat neat but.. Caveats, unclear how to implement that, etc.

For one, Egel constructs are of course far more verbose than Yaml specification which is designed to be terse. Then, Egel is pure; well, mostly. How do you insert information, or even processing instructions into a document when everything is supposed to be pure? And it seems clear I need some form of `eval` but in the interpreter, so far, the evaluator makes use of a module system which in turn makes use of the runtime system. If the runtime needs to know of the evaluator I suddenly got cyclic dependencies.

This actually needs far more thought than I imagined.

Friday, September 22, 2017

Log 092217

Right, so I worked on a foreign function interface (FFI) by employing C++ macros and bound the Commonmark C library in Egel. It took me a few days to subsequently implement transforms.

Types, those were half the reason I worked slowly. I am not sure I need them or miss them.

Meanwhile, I am trying to convince people on #proglangdesign that programming language designers need to be punished for their crimes. I pointed out that Anton Chigurh is a good artist's impression of a language designer: "Some sociopath killer who sees himself as a pure tool of reason, and inflicts world-wide suffering while never being aware of all the pain because of his own strange psychopathy."

Ah well. Still not sure I can do anything useful with these Commonmark bindings.

Thursday, September 14, 2017

Log 091417


<spruit11> the Self, as epitomized by 'this' in several misconstrued languages, 
           cannot exist as a pure presense vis-a-vis the denial of 
           the Other. an OO language which doesn't expose a 'that' as 
           its necessary, to be elocated, counterpoint is, therefore, a 
           metaphysical paradox reflecting the categorical mistakes of
           the language designer.
<spruit11> or, Stroustrup is a turd.

Rethinking OO and operators.

Sunday, September 10, 2017

That Constant Thing

I completely reversed my position with regard to the preceding post. I need some form of FFI but I am not going to do constant support. First, let's look at the problem.

Say, I'll allow native constants in Egel with a 'const' construct.

    const FIVE = 5

Conventionally, this would introduce a syntactic equivalence; the symbol 'FIVE' and the value '5' should be interchangeable in source code. The difference between such a construct and a normally named abstraction is small but significant. In expressions, you hardly notice.

    FIVE + 5

That should evaluate to '10'.

In a pattern match, the difference becomes apparent.

    [ FIVE -> "five" ] 5

Conventionally, you don't match against the symbol 'FIVE' here but against the value '5'; i.e., the expression above reduces to "five".

I looked at other languages. C/C++ support constants in header files and C++ has a notion of const expressions. Java has static final variables. Haskell doesn't have a notion of syntactic equivalence. Python defaults to using variables. Lisp has constant expressions, no doubt in the form of a macro.

In the end, none of them really handle syntactic equivalence well. And there's a reason for that.

There are a number of manners to support this construct. Through a preprocessor, through a macro extension, through syntactic sugar, or make it native to the language. For Egel, a preprocessor would be nice, macros would be nice, syntactic sugar would be the easiest, native to the language would mean a small overhaul.

Syntactic sugar is easy but conflicts with another goal I have, I want to be able to use source files and object files (which are C++ compiled dynamic libraries) interchangeably. In the future, it should be possible to compile any source file to an object file and load it. But byte code doesn't carry, and probably shouldn't carry, syntactic information -or some notion of the AST- so substitution on the source code can't be implemented.

Which would imply making constants native to the language but, at the same time, it's hard to implement a notion of syntactic equivalence if all you have are C++ defined combinators. Making constants fully native to the language would mostly imply overhauling the byte code or byte code generation. I.e., since the difference with normal combinators is in the pattern match, the simplest solution would be to generate code like "match against the definition of this symbol, not its name." I am not terribly comfortable with such a change at the moment.

A preprocessor or macro support would be nice but similar to a syntactic sugar extension conflicts with the goal of interchangeable dynamic modules.

And there you have it. Roughly one can support constant definitions at the text, AST, or code level, where code is the most portable but hardly anyone implements a scheme such that you can reason back from generated code to a syntactic equivalence.

But the problem I still have is in binding with C/C++ modules which normally introduce large numbers of constants through macros or enumerations. Some form of FFI support or C++ code generation would greatly diminish efforts needed.

Maybe C++ macros or templates could fix this? I should look into that.

Thursday, September 7, 2017

Log 090717

Small stuff. For some reason not completely clear to me I decided that after implementing some Rosetta code examples it would be a nice idea to do some documentation tooling. Commonmark -a small and fast system for Markdown- seems the way to go. I am busy implementing an Egel wrapper around that, but I am somewhat baffled by the hoops I need to jump through to get it all done.

Lack of a foreign function interface (FFI) and lack of constants, or enumerations, are hurdles to take.

The first one, lack of an FFI, I don't want to tackle. My own small experiments with FFI in the Hi language lead me to the conviction that either you support the full C type system, probably including support for parsing header files, or your solution isn't worth it. And even when you got that covered you'll still need glue packages for interfacing with C++.

My solution: no FFI. "Write the glue in C++!" is the adagium for the moment.

The second one, lack of constants/enumerations, I think I want to tackle but I am not sure how yet. There are two well-known solutions: implement a preprocessor or support constants natively in the language. The first solution has as an advantage that that would also bring conditional compilation into the system, the second solution I didn't think fully through yet. Why did C++ implement the 'const' directive? For large constants? Supporting a directive would probably give an advantage that I can mark (large) terms which don't need to be reduced. But at the same time I want to introduce names for terms to insert and pattern match against.

I implemented the start of a String library and noted that I needed to refactor the module system. With the side notes that in the first case a C FFI wouldn't have helped since I was binding C++ Unicode string objects, and in the latter case modules now wrap transforms but need to reflect the conceptual model of Egel constructs properly.

Sunday, August 27, 2017

APL's iota

So it's trivial to implement an APL-like iota combinator.

>> def iota = [ 0 -> 0 | N -> (iota (N-1)) N ]
>> iota 5
(0 1 2 3 4 5)

I subsequently set out to see if I could define factorial in an APL manner. The definition should be something like below.


>> def fac = mult @ map [X -> X + 1] @ iota

But the problem is Egel can't really pattern match readily against a variadic sized array. Take a variadic number of arguments, no problem; match against an n-adic construct, if n is known - sure; match against variadic array, no can do.

So, I either implement car/cdr like deconstruction combinators or I extend the pattern matching features. Where the new snag is that my bytecode doesn't support variadic pattern matching..

Log 082717

I took a small break. Most stuff works now but I hit a small snag which developed after changing the operational semantics of constant application.

>> def f = 1 2
>> f
(1 2)
>> def g = 1
>> def f = g 2
>> f
1

The first definition of f works according to the change in semantics but the second one still optimises constant applications away. I can easily change this but also at a potentially huge performance cost.  *I fixed this.*

I got the dual of variadic application working though, which is nice.


>> def app = [ 0 -> nil |  N -> (app (N-1)) N ]
>> app 3
(System.nil 1 2 3)

Wednesday, August 23, 2017

Conway's Life in Egel

I implemented some Rosetta Code examples, one is Conway's life which I am quite pleased with.

import "prelude.eg"

using System
using List
using IO

def boardsize = 5

def empty = [ X, Y -> 0 ]

def insert =
    [ X, Y, BOARD -> 
        [ X0, Y0 -> if and (X0 == X) (Y0 == Y) then 1
                    else BOARD X0 Y0 ] ]

def coords =
    let R = fromto 0 (boardsize-1) in
        [ XX, YY -> map (\X -> map (\Y -> X Y) YY) XX ] R R

def printcell =
    [ 0 -> print ". "
    | _ -> print "* " ]

def printboard =
    [ BOARD ->
        let M  = map [XX -> let _ = map [X Y -> printcell (BOARD X Y)] XX in print "\n" ] 
                     coords in
            nop ]

def count =
    [ BOARD, X, Y ->
        (BOARD (X-1) (Y-1)) + (BOARD (X) (Y-1)) + (BOARD (X+1) (Y-1)) +
        (BOARD (X-1) Y) + (BOARD (X+1) Y) +
        (BOARD (X-1) (Y+1)) + (BOARD (X) (Y+1)) + (BOARD (X+1) (Y+1)) ]

def next =
    [ 0, N -> if N == 3 then 1 else 0
    | _, N -> if or (N == 2) (N == 3) then 1 else 0 ]

def updateboard =
    [ BOARD ->
        let XX = map (\X Y -> X Y (BOARD X Y) (count BOARD X Y)) 
                     (flatten coords) in
        let YY = map (\X Y C N -> X Y (next C N)) XX in
            foldr [X Y 0, BOARD -> BOARD | X Y _, BOARD -> insert X Y BOARD ]
                  empty YY ]

def blinker =
    (insert 1 2) @ (insert 2 2) @ (insert 3 2)

def main = 
    let GEN0 = blinker empty in
    let GEN1 = updateboard GEN0 in
    let GEN2 = updateboard GEN1 in
    let _ = map [ G -> let _ = print "generation:\n" in printboard G ] 
                {GEN0, GEN1, GEN2} in
        nop

Sunday, August 20, 2017

Cutting Guards

I wrote a small example program, found that the source-to-source transform for getting rid of guards was faulty, and found that since I allow for a variadic number of arguments, a trivial transform doesn't exist.

Below, an example:

     def f = [ X ? X == 0 -> 1 | -> 15 ]

The problem with a variadic number of arguments is that in the above example you would need code for the case where an argument is consumed, but the match failed, and for the case where an argument isn't consumed. Thus, in a trivial solution, you'd copy code for every guard.

There is no trivial solution without copying code - a potential exponential blow-up -, moreover guarded matches add little in expressiveness over just using an if-then-else. I decided to cut guarded commands.

Friday, August 18, 2017

Duals

Well, in Egel you can trivially write down variadic functions, like sum shown below.

>> def sum = [ X, Y -> sum (X+Y) | X -> X ]
>> sum 1 2 3 4
10

But is the reverse possible too? Best I got to was this:

>> def un = [ 0 -> 0 | N -> N (un (N-1)) ]
>> un 5
(5 (4 (3 (2 (1 0)))))

I think I need something like a reverse too though I am not sure. Also get the feeling I am reinventing Lisp. Which is a bad idea I guess, Lisp will always be better at being Lisp than Egel will ever be.

Changes in the Operational Semantics

I decided to make a fundamental change in the semantics. Egel is untyped, so you need to deal with the case where a constant is applied to another constant; i.e., what does "a" 1 reduce to?

I sidestepped the issue until now and went for a scheme where "a" 1 reduces to "a", thus every constant gobbles up any arguments supplied to it. Mostly I did that because it was convenient, the code is somewhat shorter.

But it seems more natural to simply not reduce; i.e., the result should simply be "a" 1. That seems like a little change but users might make small typos, like forget an operator, and find it too hard to debug a program where constants 'magically' disappear.

The trivial case is easy to implement but now all operators also need this behavior. I.e., (1+2) 3 now reduces to 3, whereas it should reduce to 3 3.


Wednesday, August 16, 2017

Log 081617

Well. I ticked off the items on my to-do list. So, the new one:
  1. Switch over to the standard syntax for 'let'.
  2. Allow for assignment in the REPL.
  3. Maybe allow for object definitions in the REPL.
  4. Support a time-out in the IRC bot.
Another four items. Great!

Tuesday, August 15, 2017

Log 081517

The byte code generation bug turned out to be insignificant but in order to debug it, I changed some code first to trivialize some invariants, then I found the snag.

I fixed some small stuff but didn't find the motivation yet to tick off other parts of the to-do list.


Sunday, August 13, 2017

Log 081317

I implemented a quick-and-dirty IRC bot for Egel, a C++ application which allows for online evaluation. That gave me the opportunity to work on a somewhat neat manner of incorporating the Egel runtime in a C++ application.

It works but not to my satisfaction. I need to change and support two things: 1) Evaluation now works in a cumbersome manner with callbacks which need to go. 2) I need to think about a scheme to call Egel routines from the evaluator. I am not sure what I want here but some tighter integration seems in order.

I also found a not so nice bug in bytecode generation. Well, that's a show stopper. I am not worried since I trust the evaluation scheme I got but annoyed that a bug like that can pop up that late during development. I need to either test a lot or idiot proof it, and it turns out I am not idiotic enough to write down really strange programs which bring all bugs to the surface.

Then there is OO. I think I'll need syntactic sugar for an extend relation. That's not too hard but some work.

So, in order, my small todo list below:
  1. Fix the byte generation bug.
  2. Make calling the evaluator sane.
  3. Add extend relation to objects.
The work is never done.




Friday, August 11, 2017

Log 081117

I implemented a scheme for creating objects but I didn't add a safe scheme yet for destructive assignment.

For the moment I see three different approaches:
  1. Ignore, just let cycles be created. This what I have now.
  2. Detect cycle creation, and fail if a cycle would be created.
  3. Conservatively copy, that part of the tree which holds the cyclic reference is copied.

I am also working a bit on an IRC bot but didn't progress very far yet.

Friday, August 4, 2017

Log 080417

Well. I don't feel like doing much. I registered an account at readthedocs and started thinking a bit about performance. On a small test example, determine the length of a list of a million numbers, the interpreter runs in seventeen seconds whereas the Glasgow Haskell interpreter runs in around a second. 17x slower.

The hotspot is, of course, the bytecode interpreter.

There are a number of things I can do to gain some performance:
  1. Change the register storage from a container to an array.
  2. Fix the amount of copying. (The 'clone' method.)
  3. Expand the bytecode a bit since it's minimal but not fast.
  4. Use multiple inheritance to remove an indirection from arrays of objects.
  5. Write my own reference counting object.
I changed a few lines of code and implemented the first two steps. Performance increased from seventeen to ten seconds on the example.

Wednesday, August 2, 2017

Quotation

I am thinking of adding a trivial quotation scheme. Nothing difficult, only for combinators. I.e., `x is nothing more than informing the interpreter not to evaluate the combinator x.

I guess I could do it. But is a construct like that useful? I can't get my head around it.

Mike Acton "Data-Oriented Design and C++"

Tuesday, August 1, 2017

Log 080117

Anyways, after fighting with a small template instantiation problem the 'par' construct works.

Now what?

Monday, July 31, 2017

A short post on Concurrency

Right. So, I guess I am going the standard route to concurrency in a rewrite system and implement a 'par' primitive. I think it can be done without changes to the runtime.

'par f g' should do the following things:
  1. Create a node for the results of evaluating the two arguments.
  2. Start two threads which concurrently evaluate 'f nop' and 'g nop' which place their results in the node after being evaluated. Note, a dummy argument is necessary since otherwise both arguments would be fully reduced.
  3. Block until the arguments are fully reduced.
Then there is exception handling. What to do if either thread returns an exception? There are two solutions.
  1. The simplistic manner. Just place the exception thrown in the result node.
  2. The right manner. As soon as either thread throws an exception, stop the other thread, and throw that exception upward.
There are no users, I expect no users, I'll do the simplistic manner.

Object Orientation

Since I have mutable state and records, an object oriented (OO) model is possible. The 'problem' is in late binding of methods, mostly. Below, a solution, though there are several approaches. It is split into two parts, general definitions which introduce a few combinators for constructing and updating objects and a few examples.


/*
 * There are several manners in which you can try to emulate object
 * orientation (OO) in Egel. This is one manner.
 *
 * An object has a mutable state and a collection of named methods. 
 */
import "prelude.eg"

// a namespace OO which introduces the 'object' datatype and 
// accessors.
namespace OO (
    using System

    data object

    def object_get_state =
        [ object STATE MM -> get STATE ]

    def object_set_state =
        [ object STATE MM, S -> object (set STATE S) MM ]

    def object_get_method0 =
        [ cons NAME0 (cons METHOD MM), NAME1 ->
            if NAME0 == NAME1 then METHOD else object_get_method0 MM NAME1
        | {}, NAME1 -> NAME1 ]

    def object_get_method =
        [ object STATE MM, NAME -> object_get_method0 MM NAME ]

    def @@ = 
        [ SELF, NAME -> (object_get_method SELF NAME) SELF ]

)

using OO
using System

// examples of a box and a circle class.
def box =
    [ W, H -> 
        object (var (W, H)) 
        { "resize", 
          [ SELF, WW, HH -> object_set_state SELF (WW, HH) ],
          "area",
          [ SELF -> [(W, H) -> W*H] (object_get_state SELF) ] }
    ]

def circle =
    [ R -> 
        object (var R) 
        { "resize", 
          [ SELF, R -> object_set_state SELF R ],
          "area",
          [ SELF -> [R -> (tofloat (R*R))*3 ] (object_get_state SELF) ] }
    ]

def main =
    B = box 1 5;
    AREA0 = B @@ "area";
    N = (B @@ "resize") 2 3;
    AREA1 = B @@ "area";
        (AREA0, AREA1)
This just shows it is possible to have OO in the graph rewrite system. There are other approaches. The named fields could be introduced as constants, the attributes and methods could be coalesced into one list, or map, of fields; i.e., essentially a named record. And some system could be supported through syntactic sugar.

It's good enough for now, I rather fix concurrency.

Sunday, July 30, 2017

Log 073017

I hardly needed to debug the Unicode bug, it seemed to have been a simple matter of using UChar instead of UChar32 which I noticed after just glancing over the code. The floating point bug was just a matter of setting the precision right on the output stream. Still took me about four or five hours; not sure why, I might have been unmotivated.

I cleaned up some example code. There's still a lot of stuff to implement, none of the basic libraries are up to par, I lack proper text handling primitives and I need to add concurrency.

Moreover, of course, Python is still the better tool because of OO. But I got variables, which means mutable state. I might be able to support a trivial OO model? Unsure. Then again, mutable state is already more than most small lambda calculus based OO encodings... Lots to do. I think I'll go for concurrency first, that seems the most interesting.

Saturday, July 29, 2017

Log 072917

There are two things I need to get right before the Egel interpreter is in any state of being made public. One is handling of Unicode, one is handling of floating point numbers.

 The problem with Unicode is severe since it doesn't reproduce consistently.
[marco@stallion src]$ ./egel 
>> "asfdas"
"asfdas"
>> "hello world!"
"l"
>> "!"
"!"
>> "h!"
"h!"
>> "hello world!"
"É•"
I am simply not a Unicode expert. Of course, the code looks right -just a number of conversions- but I simply don't know what goes wrong here. 

Then there is the floating point handling.


[marco@stallion src]$ ./egel 
>> using System
>> 131.153535
131.154
>> 0.12341424 / 132414412.0
9.3203e-10

Somehow the float conversion doesn't seem to produce a 64bit representation. Or, it looks like it has an internal 64 bit representation but the conversion from and to a double seems to assume less precision. At least, this bug produces consistent behavior. Guess I am stuck for a few days. For the Unicode bug I might need an expert, though.

Friday, July 28, 2017

Local Variables

So, the following trivial solution would give me local variables with a few simple combinators.

[ X -> Y = var X; _ = set Y (get Y + get Y); Y ] 7

If var stores a tree, then the result is a tree. If get retrieves a previously stored tree, then the result is a tree. If set stores a tree, then the result is a tree. If I am wrong, then f*ck it, and I am gonna call it an unsafe feature.

Log 072817

I implemented some things - an environment variable for the include path, a build script, conversion routines, a prelude file - and debugged those. I failed to find a compiler switch to solve a problem with late resolution in shared libraries, solved it in another manner, and found the compiler switch later anyway.

Almost all functionality is there except for assignment. I first thought I would implement assignment on global variables only but it dawned on me that I have been wrong: it should be possible to have assignment on local variables also. I need the computation graph to be acyclic, but it looks it is sufficient to have assignment on variables only; as long as I don't update fields that should satisfy the constraint too.

Maybe take a small break and think on that. I hope I don't need a complete overhaul.

Thursday, July 27, 2017

Log 072717

Observe the pretty date! I haphazardly moved some code around and implemented further functionality for primitive input/output and basic math libraries. Of course, there were -and still are- some small bugs but the scheme I use seems to work.

So what is left is to make both libraries conceptually neat, and move on and implement a 'prelude' for the interpreter.

Wednesday, July 26, 2017

Log 072617

Well, the interpreter prints. That is, the scheme I've got to implement primitive input/output and a math library compiles and the interpreter is able to run dynamically loadable libraries.

It's a bit rough around the edges and I need to move some code around still, but it works.

Saturday, July 22, 2017

Log 072217

So, Thursday I wrote an introduction to the evaluation strategy I employ in the Egel interpreter. I moved that to the main website and started shaving, rewriting, and removing. Lastly, I updated the pictures a few times which took some work because Inkscape isn't very good at keeping a similar layout for different pictures. Now, it's Sunday and I am reasonably content with the result, though it might be somewhat informal to some people. But, ah well, this is the first time I had the opportunity to get half-baked 'feelings' right while loosely relating what I designed to other approaches so I guess I was entitled to wasting some time on it.

Meanwhile, I didn't code. I decided upon a C++ implementation for primitive input/output because reasons -callback hell, the convenience of Unicode support, minimal dependencies, C++ is reasonably portable, etc.- but that also implies a lot of work I can't seem to find the energy for at the moment.

I need to:
  • Implement channels in C++, an abstraction over several iostreams which is going to be pretty verbose in that it needs to support a lot of operations.
  • Write IO combinators employing that abstraction in a dynamically loadable module.
  • Test all of that. The channel abstraction, the combinators, and module loading.
Too. Much. Work.

The Unreasonable Effectiveness of Dynamic Typing

The Unreasonable Effectiveness of Dynamic Typing for Practical Programs from JavaZone on Vimeo.

Thursday, July 20, 2017

Operational Semantics of Eager Combinator Rewriting

I EXPANDED ON THIS POST AND ADDED IT TO THE MAIN WEBSITE.

I got a question about the operational semantics of the eager combinator rewrite system I use. It's embarrassingly facile and trivial to explain with two pictures, so here goes.

Say, you have a term language where you want to rewrite an expression consisting of combinators. In Egel's case, each combinator knows how to rewrite itself, therefore, corresponds to a procedure or some code.


You could use a stack and push stackframes for F, G, and H. However, I wanted to experiment with another model, rewriting. Note that since we're rewriting eagerly we're rewriting the outermost expression first.

The operational semantics Egel uses is extremely simplistic: Start with the node which is to be rewritten first, store the result where necessary, and continue with the next node to rewrite.


That's it. Note that the resulting graph is still a directed acyclic graph. That's what enables me to implement Egel's interpreter with native C++ reference counted pointers.

This is, of course, a slow and restrictive manner of evaluation. For one, stack based evaluators are simply faster because pushing data to, or popping from, a stack is less computationally expensive than allocating, and deallocating, heap nodes. Second, it's a term rewrite system so you can't allow for assignment since that would usually allow you to create cyclic structures.

The benefits of this model are threefold. For one, you don't run out of stack space which is really important in a functional language as I have experienced, unfortunately. Second, it is still a tree rewrite system so you don't need to care about tail call optimization. Third, you can reference count.

I simply like this model of evaluation, there isn't much more to it.

Notes: Yes, these are thunks or heap allocated stack frames. No, this isn't done with CPS since CPS is a transformation, not an evaluation strategy. Bytecode is generated directly for this model.

More: Egel also supports exceptions which is a natural extension of this model. It is left as an exercise to the reader.

Wednesday, July 19, 2017

Log 071917

I took a break after running into a few snags. For one, the Ocaml input/output (IO) specification wasn't very good. It neatly describes channels and cleanly separates them into input and output; that looks nice at first glance but what if you want both? I.e., a naive database application usually consists of seek operations followed by reads and writes. It's an odd design decision.

Then there is C++'s manner of working with stdin, stdout and stderr as well as istreams, ostreams and fstreams. They made the standard IO streams uncopyable! Implying you can only pass references but never copy them. Of course, you might be tempted to pass pointers but you run into another whole set of problems. For one, expect lots of template type errors, and also, you're not supposed to call delete when a standard stream goes out of scope.

Then there is the class inheritance design of C++'s streams. It's over-engineered to meet the requirements of an OO type system; here you also see that the type system forces a solution which is simply not that neat though multiple inheritance helps.

Looking at both specifications I can only conclude that C's POSIX file handles specification simply wasn't that bad.

I also started looking at libuv as the basis of file IO, and later multithreading. And I hit another snag: Callbacks don't neatly fit into a mostly referential transparant system. (Well, so far. Unless I come up with a scheme to make it fit.) The problem: Sure you can define a callback, sure it will read values from a pipe, but it can't do anything with it. It simply cannot inform another thread easily to continue evaluation with a calculated value. (Imagine multiple reading threads and one thread who consumes and writes the result back. It's difficult to combine results from multiple threads with one callback.)

I am tempted to roll out my own temporary solution for primitive IO. I am also tempted to ignore libuv and write a PAR combinator using C++ primitives. But libuv is cross-platform, well-maintained, scalable, and popular...

Sunday, July 16, 2017

Opaque Combinators

It's design time! As the implementation progresses I am working towards a solution for primitive input/output (IO) in Egel. As a design specification, I simply copied the, or one, interface of Ocaml's primitive IO libraries from a website. Keeping fingers crossed they thought that design through.

That means I am leaving purity behind. But then, if you want pure combinators, you can always build them upon the impure interface so not much is lost.

The bigger problem is that I now need to come up with some kind of scheme for impure 'stateful' combinators. I.e., a fix for file handles, raw pointers, matrices, etc.

So what have I got? For one, the symbol attached to a combinator isn't enough to uniquely describe it. Moreover, equality is problematic and needs to be handled elsewhere. The runtime can't look into the combinator since it simply doesn't know, cannot know, what it contains; but it needs a symbol attached it seems.

Right. A bit foggy but some solution is dawning on me.


Log 071617

Right. I managed to implement the N-Queens problem which the interpreter solves in around 13 seconds, GHC just promptly delivers the solutions.

That either implies I have a very slow runtime -which I know- or GHC sneakily also enjoys the benefit of lazy evaluation on this problem. I.e., I don't understand the solution I translated but the standard functional approach is to generate all board positions and filter out all admissible ones. GHC might benefit from lazy evaluation in that it doesn't need to produce all board positions but just needs to inspect whether a partly constructed solution is admissible according to the constraints, that would give almost back-tracking behavior on the solution space. (If I got it correct, the solution space is over-constrained, you only need 92 out of roughly forty thousand solutions. It helps a lot if you can swiftly dispose of most of them.)

Anyway, this task is far too complex for where the Egel interpreter is now in its development. I decided to implement the solution to a far more trivial problem, Euler project's task one. Sum all the integers from one to a thousand which are dividable by either three or five. Below the code, without the prelude primitives.

namespace ThreeOrFive (

    using System
    using List

    def sum = foldr (+) 0

    def three_or_five = [X -> or ((X%3) == 0) ((X%5) == 0) ]

    def euler1 =
        [ N -> (sum @ (filter three_or_five)) (fromto 1 N) ]

)

using ThreeOrFive

def main = euler1 1000

Friday, July 14, 2017

N-Queens

Woot! The N-Queens problem as translated from Haskell to Egel. I have absolutely no idea what the code does.

namespace System (

    def flip = [ F, X, Y -> F Y X ]

    def or =
        [ false, false -> false
        | X, Y         -> true ]

    def and =
        [ true, true    -> true
        | X, Y          -> false ]

    def @ =
        [ F, G, X -> F (G X) ]

)

namespace List (

    using System

    def length =
        [ nil -> 0
        | cons X XX -> 1 + (length XX) ]

    def foldl =
        [ F, Z, nil -> Z
        | F, Z, cons X XX -> foldl F (F Z X) XX ]

    def foldr =
        [ F, Z, nil -> Z
        | F, Z, cons X XX -> F X (foldr F Z XX) ]

    def ++ =
        [ nil, YY -> YY
        | cons X XX, YY -> cons X (XX ++ YY) ]

    def map =
        [ F, nil -> nil
        | F, cons X XX -> cons (F X) (map F XX) ]

    def reverse = 
       foldl (flip cons) nil

    def block =
        [ 0 -> nil
        | N -> cons (N-1) (block (N-1)) ]

    def fromto =
        [ X, Y -> 
            if X < Y then
                reverse (map [ N -> N + X ] (block (Y-X)))
            else nil ]

    def zipwith =
        [ Z, cons X XX, cons Y YY -> cons (Z X Y) (zipwith Z XX YY)
        | Z, XX, YY               -> nil ]

    def any =
        [ P, nil        -> false
        | P, cons B BB  -> if P B then true else any P BB ]

    def all =
        [ P, nil        -> true
        | P, cons B BB  -> if P B then all P BB else false ]

    def elem =
        [ X -> any ((==) X) ]

    def notelem =
        [ X -> all ((!=) X) ]

    def insertEverywhere =
        [ X, nil -> {{X}}
        | X, cons Y YY -> cons (cons X (cons Y YY))
                 (map (cons Y) (insertEverywhere X YY)) ]

    def concatMap =
        [ F -> foldr ((++) @ F) nil ]

    def permutations =
        foldr (concatMap @ insertEverywhere) {{}}
)

namespace NQueens (

    using System
    using List

    def nqueens =
        [ 0, NCOLS     -> cons nil nil
        | NROWS, NCOLS ->
            foldr
              [ SOLUTION, A ->
                  A ++
                  (foldr
                    [ICOL, B ->
                        if safe (NROWS - 1) ICOL SOLUTION
                          then B ++ ({SOLUTION ++ {ICOL}})
                          else B]
                    nil
                    (fromto 1 (NCOLS+1))) ]
            nil
            (nqueens (NROWS - 1) NCOLS) ]

    def safe =
        [ IROW, ICOL, SOLUTION ->
            notelem true
             (zipwith
                [SC, SR ->
                   or (ICOL == SC) 
                  (or (SC + SR == ICOL + IROW) 
                      (SC - SR == ICOL - IROW))]
                SOLUTION
                (fromto 0 IROW)) ]
)

using List
using NQueens

def main = length (nqueens 8 8)

Unfortunately, it takes some.. time.. A lot longer than Haskell which just prompty gives the answer.

[marco@stallion src]$ time ./egel ../examples/nqueens.eg 
92

real 0m13.020s
user 0m13.000s
sys 0m0.001s

But then again, I am more looking for automating tasks on matrix operations then anything else. So, I am happy.

Log 071417

Over the last few days, I fixed a number of trivialities and added some basic functionality. Escaping/unescaping on characters and strings, got an interactive prompt running, added basic -recursive- comparison operators on runtime objects, fixed a lifting bug in try/catch (and introduced a new one), and some more stuff.

I am working towards implementing an N-queens example in Egel but hit a snag, a core dump on a program. Well, that's a long time ago. So, a) good I started on a less trivial program, and b) bad I hit a core dump on that. That shouldn't happen anymore.

A well. Just wrinkling out stuff, I guess. Hope this bug ain't too pesky.

EDIT: Didn't turn out too pesky. I forgot a recursive case in desugaring lists, which probably implied code generation on an AST node which shouldn't exist. Phew...

Thursday, July 13, 2017

Log 071317

Right. So the way I work is that I program a bit, hit a problem, ponder too long on the problem, lose interest, and half a year later -after I forgot what that nasty little thing was all about- return to coding.

Ridiculous, I admit.

Anyway, I moved twenty lines around and went onto the next problem of getting interactive mode right.

Sunday, May 14, 2017

WanaDecrypt0r

Reading up on how WanaDecrypt0r works but details are scarce. It looks like a heap spray^1 with an unlink exploit but no confirmation of that anywhere.

No wonder Google is investing in Fuchsia?

^1: Actually, that came from some remark I read but I disagree. It might be as trivial as unlink. That is, set up non-paged memory in a deterministic manner, exploit a buffer overrun in SMB to corrupt a part of the kernel heap, unlink into the payload. I don't see why it would need a heap spray.

EDIT: Explanation here.

Thursday, May 11, 2017

Log 051117

I am trying to figure out why the constructor of a combinator is passed a pointer to the VM. It's not a bad design decision, any combinator might want to poll the VM for information for various reasons. Might have been operator overloading where you needed to inspect a global table? But, of course, combinators which reference the current runtime are hard to ship. I might want to cut that but at the risk that I would need to introduce it again at some stage later.

Stuff you know you don't know, stuff you don't know you don't know. And all that.

Wednesday, May 10, 2017

Log 051017

Well. Apart from dreaming, mobile code is pushed into the far future because there are still too many mundane tasks to do. I didn't even implement string unescaping yet, so that's why Hello World doesn't work right. And the more pressing matter after that is implementing a sane manner of being able to link in C++ defined combinators and some kind of sane library for primitive IO and standard operators. That's at least about half a year's of work before I even get to the point I can start thinking about linking in matrix operations.

So, I better get coding.

Monday, May 8, 2017

Not so bad

It dawned on me that a lot can be solved by extracting portable combinators from the runtime instead of making all combinators portable. That would solve the problem that I could 'mess' up a perfectly fine small language.

Saturday, May 6, 2017

So for future's sake

Well great. So, what to do? I have an eager combinator evaluation system. What do you naively need to make it mobile:

  1. The term being reduced (elsewhere) must be transportable, and
  2. the combinators it invokes must be transportable. Moreover, 
  3. the end-result is a term which must be transported back.
The term is a tree of combinators not referencing anything else (if I got the invariants right), so that just needs a means of serialization.

A combinator naively is just a piece of code, so when you're able to serialize that you're halfway there. The other half is that combinators might reference other combinators but that's easy to solve with standard compiler technology. A combinator can be described as a self-contained piece of code referencing external symbols and all you need to do is sweep, ship, and link.

Problem solved, right? However: Verbose, computationally intense, might end up sweeping the entire program, possibly too fine-grained in that you need to gather and link myriads of combinators, doesn't solve distributed scheduling and coordination let alone local exceptions..

And then there's the stuff I didn't think of yet.

Worst case: I manage to corrupt a perfectly fine small language in such a way that it becomes unusable both for stand-alone or distributed computing.

So. Cheers.

Friday, May 5, 2017

Scrap mobile code? For now? For ever?

Right. So the master plan is to develop a language which combines ideas of statistical computing and mobile code. Mobile R, in short.

Great ideas!

But I've been reading up on mobile code and there's a) either no market for that or b) it falls squarely into the category of academically uninteresting things to do but technologically one of the hardest stunts to pull off.

That is, if you walk into a university these days the response will likely be "So you can migrate combinators, huh? Neat. My paragliding functor does the same and I've published tons of papers about that."

But it's ridiculously hard to implement the functionality in such a way that you actually get to the point you have a tool people will be willing to use.

Even after trivializing the heck out of it, I am looking at, what, three years of development?

Tuesday, May 2, 2017

Log 050217

I committed the changes which cut operator overloading. I wasn't very sure about it because I saw it as a means of solving the expression problem for this language. However, the expression problem occurs in larger languages where you don't have all sources under your control because some code, modules, are maintained by others.

This language doesn't have users, I want to grow it into something domain specific, all source code is fully under control by the one user which is me. And the expression problem shouldn't occur to me or my users since I don't intend it to be used for general software engineering purposes.

As such, it was a solution to an entirely academic problem which only cost a steep amount of CPU cycles per lookup. And, in the unforeseen event I actually grow Egel into a language which attracts end-users, I imagine they care more about performance than about solutions to problems which don't exist.

Tuesday, April 25, 2017

Cutting operator overloading

I decided to cut operator overloading. What I want can be expressed from within the language, it doesn't seem to be a big deal, and it looks like that in an untyped language you want to handle operator overloading differently. I.e., with regard to the last remark, it looks like you want to handle arithmetic with a big case switch in a built-in.

Also, it simplifies the runtime.

Monday, April 24, 2017

Dare I cut more?

Egel is a melting pot of ideas, lots of which I needed to cut in the end. So. Operators.

I have a scheme where one can overload operator definitions to work on different 'types' of operands. But the computational cost is high, maybe too high for such a simple language.

Maybe it's just enough to define an operator once? It's a small language and an expensive lookup for every mathematical operator applied is hefty, certainly since operators can be disambiguated by hand by using namespaces.

Friday, April 21, 2017

Log 042117

Four months later.. Well, I implemented the new scheme and got an order speed-up out of it. That's less than I hoped for I guess but, ah well, from 40 to 3 seconds isn't bad I guess.

I'll upload the v0.1 interpreter tomorrow.

Monday, January 2, 2017

Log 010217

Well, happy New Year to everyone I guess. I can't say I am terribly amused with progress over the last month or so. For one, the slight change in semantical processing is something I should be able to code in a day or two/three, and somehow nearly a month has passed so far. I really wanted my source code on Github by now! But I also hurt my wrist which took two weeks to recover, so there is that too.

But I am back to programming, still in progress of implementing the data structure (simple multiple-linked nested symbol table), and discovered I probably need to overhaul it once again. Reason: I forgot a design decision in my former approach which uses one map for global symbols and one map for local analysis. And I made that decision because I want to go from source code to evaluation fast, and using one static symbol table and another one for local analysis means I'll be able to parallelize more in the long run.

So, now I need to figure out whether it is worth it to preemptively defensively code a global static and local dynamic table approach.

Pff. Design decisions over not even .5KLoc of code. It's a good thing it is only a hobby project because as a manager I would never hire myself to write any code.