Egel Language

Wednesday, December 25, 2024

Lessons learned from Advent of Code 2024

I should have kept a log, but here we are: a bullet list of lessons learned from Advent of Code this year in no particular order.

Egel is a fine language for short scripts, but—here we go again—it lacks an order in performance.
When I do stuff right, and this year I did, Egel programs are comparable in size to Haskell programs but -in my mind- way more readable.
Where Haskell shines is sometimes clever use of laziness and -very important- list comprehensions. Egel sometimes shines because of clever use of dynamic typing (usually the use of none), and having mutation.
In comparison to K, both Haskell and Egel take an order more symbols. Ken Iverson was on to something.
Where Egel is slow that is usually due to several factors.

First, the interpreter implements a slow rewriter and something like a one-instruction increment which comes at virtually no cost in most imperative languages takes orders, and not a small amount, more.

Then second, I often iterate over dictionaries, which means first creating the list of keys and then folding over all those keys. I am not sure that is even that hefty a cost, but I should probably write folds and maps for dictionaries.

Third, looking up a value in a dictionary costs much more than indexing in an array.

That means that one example, day 6, which involves an enormous amount of iteration over a board, is very, very slow and times in at 12 minutes.
I added namespace mechanisms and that seems to be a win. Because I added namespaces however, and I still wanted to be able to include the out-of-the box utilities with a short statement I decided to uniquely name all combinators. (I.e., you'll normally find a length operator on lists, strings, and dictionaries, those now all have different names.)
That was a mistake, roughly I need to use the namespaces where nothing in System, OS, and List overlaps but it's fine to have a length operator on strings and use the namespace mechanisms to rename or include that in the manner deemed fit.
I should implement more generic combinators akin to Python or array languages like J/K. Just having add on atoms and complex data shortened the code substantially.
I still get bitten by `-1` versus `-` `1`. I also got bitten by my dumb operator precedence heuristic, the latter did fine for a while but now I need to fix that.
At the end of Advent of Code, programs become longer, and development becomes painful in Egel. I expected this but Egel is indeed only usable for relatively straightforward code.
For short programs, something like a missing or wrongly named argument is trivially found by inspecting the output after running the program. For longer complex programs, it becomes hard to see where the bug is without strong typing.
There was a segfault somewhere during coding this month I can no longer reproduce.

Egel is, after several years of ad-hoc development, still in an alpha stage. But I am getting there.

Egel programs and several metrics are in the README.

Sunday, September 29, 2024

Namespaces, revisited.

I conservatively extended namespaces, but that took an internal overhaul of how scopes are treated. Shorthands are syntactic sugar for full-blown expressions.

using System

The above is sugar for:

using System (c0, c1, c2, ...)

Where c0, c1, c2, ... are all combinators in the System workspace. And that is sugar for:

using ε = System (c0, c1, c2, ...)

Which is again sugar for a number of renames.

using ε = System (c0 = c0, c1 = c1, c2 = c2, ...)

The renames all introduce bound names into the current scope. E.g.,

using Sys = System (p = print)

This will bind 'Sys::p' to 'System::print'.

Docstrings

I added docstrings to the Egel language some time ago. You can add documentation to either a file or a combinator.

@"""A multiline docstring for a file"""

def f =
    @"this is the fabulous 'f' combinator"
    [X -> 42]

Given docstrings, you can now ask for that interactively in the REPL.

>> help 'f'
{"this is the fabulous 'f' combinator"}

'Help' will do a best-match against all combinator definitions.

Saturday, July 27, 2024

Thread local Stopgap

I made the stopgap solution thread local. It's still expensive.

   ~VMObjectArray() {
#ifdef GC_STOPGAP
        if (deferring) {
            for (int i = 0; i < _size; i++) {
                defer.push(_array[i]);
            }
            delete[] _array;
        } else {
            deferring = true;
            for (int i = 0; i < _size; i++) {
                defer.push(_array[i]);
            }
            delete[] _array;
            while (!defer.empty()) {
                defer.pop();
            }
            deferring = false;
        }
#else
        delete[] _array;
#endif
    }

Tuesday, July 16, 2024

Stopgap solution to Cascading Free

Egel runtime objects are instances of C++'s std::shared_ptr. Because egel objects can be deeply nested, it is not uncommon for large structures to go out of scope and result in cascading frees that blow up the C stack.

I.e., the following short egel program would core dump:

# rewrite a list of a million numbers to nonefrom_to 0 1000000 |> [_ -> none]

I implemented a costly stopgap solution. First, we need a collector for trash. It has an in queue and an out queue, and its sole intention is to retain objects for a while such that they don't go out of scope recursively and blow up the C stack.

class GC {

public:
    GC() {
    }

    void push(const VMObjectPtr &o) {
        std::lock_guard<std::mutex> lock(mtx_in);
        _in.push(o);
    }

    void copy() {
        std::scoped_lock lock(mtx_in, mtx_out);
        while (!_in.empty()) {
            _out.push(_in.top());
            _in.pop();
        }
    }

    bool done() {
        std::scoped_lock lock(mtx_in, mtx_out);
        return _in.empty() && _out.empty();
    }

    void empty() {
        std::lock_guard<std::mutex> lock(mtx_out);
        while (!_out.empty()) _out.pop();
    }

    void clear() {
        while (!done()) {
            copy();
            empty();
        }
    }

private:
    mutable std::mutex mtx_in;
    mutable std::mutex mtx_out;
    std::stack<vmobjectptr> _in;
    std::stack<vmobjectptr> _out;
};

inline GC garbage_collector;

Now when an array goes out of scope, we place its members in the collector.

    ~VMObjectArray() {
        for (int i = 0; i < _size; i++) {
            garbage_collector.push(_array[i]);
        }
        delete[] _array;
    }

And during rewrites we then empty the queues.

    while ((trampoline != nullptr) && (*run != HALTED)) {
        if (*run == RUNNING) {
            trampoline = f->reduce(trampoline);
            garbage_collector.clear();
        } else if (*run == SLEEPING) {
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
        } else {  // *run == HALTED
        }
    }

I am not happy with this solution. First, I consider it a std::shared_ptr bug that it can blow up the C stack, and second, the performance cost is way too steep for my taste.

I'ld rather convince C++ developers that this is their bug; for all I know cascading frees are a normal thing in reference counted garbage collection and their implementation should be able to handle that.

Monday, June 3, 2024

Temporal Logic Synthesis

I wanted to show that Egel can be used to create small tools, and decided upon a temporal logic synthesizer. After reading a bit I trimmed that down to: do something useful with LTL and Z3. The first because LTL is relatively small, the latter since I want Z3 bindings for Advent of Code anyway.

Synthesis is a bit of a hopeless subject since the best you can hope for is to derive small automata with maybe five states. Moreover, it already exploded on translating LTL to negation normal form, the first ostensibly minor step in the whole process.

But I remain committed for now.

Tuesday, February 6, 2024

E-Graphs trivially in PSPACE

Some people pointed me at E-graphs, a manner for representing and then heuristically minimizing terms. Of course, Boolean circuits are terms too and minimizing circuits is PSPACE. Very much so, even.

So, it's unclear how much minimization you can actually achieve with E-graphs.