Tuesday, December 21, 2021
Compacting GC?
Sunday, December 12, 2021
For future reference
egel -e "import \"prelude.eg\";; using System;; using List;; foldl (+) 0 (from_to 0 1000000)"
/opt/homebrew/Cellar/llvm@12/12.0.1_1/bin/llvm-profdata merge -sparse default.profraw -o default.profdata
/opt/homebrew/Cellar/llvm@12/12.0.1_1/bin/llvm-profdata show -all-functions default.profraw
Tuesday, December 7, 2021
Testing 1.. 2.. 3..
// memory test
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>
#include <type_traits>
#include <utility>

// model values = atom int | array value*

static constexpr size_t roundup(size_t x, size_t m)
{
    return x % m == 0 ? x : (x + m) / m * m;
}

class Value;
typedef std::shared_ptr<Value> ValuePtr;

class Value {
public:
    Value() = default;
    virtual ~Value() {
    }
};

class Atom : public Value {
public:
    Atom(const int v) : _value(v) {
    }
    virtual ~Atom() {
    }
    static ValuePtr create(const int v) {
        return std::shared_ptr<Atom>(new Atom(v));
    }

private:
    int _value;
};

struct ValueField {
    ValuePtr _value;
};

class Array : public Value {
public:
    Array() = default;
    virtual ~Array() {
        while (_size-- > 0)
            std::addressof(operator[](_size))->~ValueField();
    }
    // the fields live in the memory directly after the object header
    ValueField &operator[](size_t i) {
        char *r0 = reinterpret_cast<char *>(this) + roundup(sizeof(*this), alignof(ValueField));
        ValueField *t0 = reinterpret_cast<ValueField *>(r0);
        return t0[i];
    }
    ValueField &at(size_t i) {
        if (i >= _size)
            throw std::out_of_range("duuude!");
        return operator[](i);
    }
    static ValuePtr create(size_t _sizes) {
        return std::shared_ptr<Array>(new (_sizes) Array(_sizes));
    }
    void *operator new(size_t count, size_t _sizes) {
        auto z = roundup(count, alignof(ValueField)) + sizeof(ValueField) * _sizes;
        return ::operator new(z);
    }
    void operator delete(void *x) {
        ::operator delete(x);
    }
    // not implemented for brevity reasons.
    // Also, it would have to handle (if only to throw an exception)
    // if someone attempted to move an Array(30) into an Array(20)
    /*
    Array(Array &&) = delete;
    void operator=(Array &&) = delete;
    */

private:
    Array(size_t n) {
        for (_size = 0; _size < n; ++_size)
            new (std::addressof(operator[](_size))) ValueField;
    }
    size_t _size = 0;
};

int main()
{
    Array fixed;
    auto flex = Array::create(20);
    // (*flex)[1]._value = 42;
}
Thursday, December 2, 2021
Serialization
I am more pondering than writing code at the moment. To get to mobile code I need to be able to serialize, and to serialize I need some representation independent of the runtime. When the serialized graph isn't independent it might, among other things, corrupt the runtime when it is being loaded.
It looks like I have little choice but to copy almost the complete runtime hierarchy into a serialization format. That sounds worse than it is: only four primitive types, arrays, and then combinators. And the combinators are the problem. Data is easy, bytecode can be (dis)assembled, internal objects are always loaded; the real problem is externally defined C++ combinators. Or I do everything on a per-module basis, that's unclear at the moment.
Then there's the question of what goes into the graph. Say I have a bytecode combinator, do I only write the symbol or also the bytecode? No idea yet but I am tempted to have that separate from the graph.
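To make the "write the symbol, not the bytecode" idea concrete, here is a minimal sketch of a runtime-independent tagged format. All names (Node, write, read, the tag set) are mine, not the interpreter's; combinators are serialized by symbol only and would be re-resolved at load time.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical runtime-independent node: a tag plus payload.
// Combinators carry only their fully qualified symbol name; the
// definition (bytecode or C++) is looked up again on load.
struct Node;
using NodePtr = std::shared_ptr<Node>;
struct Node {
    enum Tag { Int, Text, Combinator, Array } tag;
    int64_t i = 0;
    std::string s;                // text payload or combinator symbol
    std::vector<NodePtr> fields;  // array elements
};

static void write(std::ostream& os, const NodePtr& n) {
    os << n->tag << ' ';
    switch (n->tag) {
    case Node::Int:        os << n->i << ' '; break;
    case Node::Text:
    case Node::Combinator: os << n->s.size() << ' ' << n->s; break;
    case Node::Array:
        os << n->fields.size() << ' ';
        for (auto& f : n->fields) write(os, f);
        break;
    }
}

static NodePtr read(std::istream& is) {
    int tag; is >> tag;
    auto n = std::make_shared<Node>();
    n->tag = static_cast<Node::Tag>(tag);
    switch (n->tag) {
    case Node::Int: is >> n->i; break;
    case Node::Text:
    case Node::Combinator: {
        size_t len; is >> len; is.get();  // consume the separator
        n->s.resize(len); is.read(&n->s[0], len);
        break;
    }
    case Node::Array: {
        size_t len; is >> len;
        for (size_t k = 0; k < len; ++k) n->fields.push_back(read(is));
        break;
    }
    }
    return n;
}
```

Length-prefixed strings keep symbols with spaces intact, and the recursive array case covers arbitrary graphs as long as they are trees; sharing and cycles would need an extra indirection table.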
The good part, serialization makes me think hard about how the interpreter is organized. The bad part, now I want to refactor again and get rid of some warts (macros), introduce namespaces, and employ c++20 modules. It's annoying to code against old code.
Also, pretty sure I am going to write the mobile code on top of grpc, which brings other problems since that's a client/server architecture and I want bidirectionality and more general a cloud of nodes which can talk to each other. But ah well, not everything is perfect. Maybe I'll just use it anyway.
Sunday, November 28, 2021
Hadoop
While I work on the Python bridge code I am kind of thinking I am getting a bit too distracted from the original goal of the Egel language, which was to facilitate data-center programming.
At the same time, it doesn't seem worthwhile to reinvent the wheel in this. Apache Hadoop has a ton of experience behind it so I'll need to look at whether I can lift that with Egel bindings after the Python bridge. Or maybe Kubernetes integration?
Thursday, November 25, 2021
Futamura Projections
I was discussing Futamura projections and made two observations.
- Most of it remains true when you consider that partial evaluation doesn't have a lot to do with it at all. Just take an interpreter and an input program, combine them, and you already have a 'compiler.' The whole partial evaluation story is more the icing on the cake that makes you hope you can end up with an efficient solution.
- Regarding the efficient solution, it's likely the case that partial evaluation in the extreme is an undecidable or NP-hard problem. As an intuition, just take a SAT solver as an interpreter. Then Futamura predicts that a SAT solver plus a problem statement gives an efficient solver for that problem class, and presto, your specializer settled P versus NP. That doesn't mean you cannot try this approach heuristically, but it does imply you have less than no guarantees of arriving at an efficient solution.
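The first observation, that pairing an interpreter with its program already yields a 'compiler', can be sketched by closure capture. This is a toy of my own making (interpret, compile, and the Program type are invented for illustration), with no specialization happening at all:

```cpp
#include <functional>
#include <vector>

// A toy 'interpreter': the program is a list of constants, execution
// sums them together with the input.
using Program = std::vector<int>;
static int interpret(const Program& p, int input) {
    int acc = input;
    for (int c : p) acc += c;
    return acc;
}

// 'Compiling' without partial evaluation: just close the interpreter
// over the program. The result maps inputs to outputs like compiled
// code would, but nothing was specialized, so nothing got faster.
static std::function<int(int)> compile(Program p) {
    return [p = std::move(p)](int input) { return interpret(p, input); };
}
```

A partial evaluator would go further and residualize `interpret` against `p`, which is exactly where the hard (and, as argued above, possibly hopeless) part lives.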
Initial Python Bridge Results
Okay, great. After refactoring I have some initial bridge results.
marco@Marcos-Air build % egel ../test/test0.eg
Python initialized
egel (System::none) bridged (System::none) eq (System::true)
egel (System::false) bridged (System::false) eq (System::true)
egel (System::true) bridged (System::true) eq (System::true)
egel (7) bridged (7) eq (System::true)
egel (3.140000000000000) bridged (3.140000104904175) eq (System::false)
egel (a) bridged (a) eq (System::false)
egel (text) bridged (text) eq (System::true)
Python finalized
Thursday, November 4, 2021
Wednesday, November 3, 2021
Categorical Nomenclature
Corbin@libera is working on a language and made some observations regarding nomenclature. This is a very handy list; I might steal it at some point.
Sunday, October 31, 2021
Byte Code Player/Recorder/Optimizer
Right, so due to side-effecting code a range of optimizations is out of reach. For example, even beta reduction is unsafe under most circumstances. Like below.
(\f -> f . f) (print 'hello'; (\x -> x + 1)) 0
That gives one side effect strictly right-to-left reduced, but two when 'optimized' by first beta-reducing the term.
HOWEVER,
It should be safe to just observe what bytecode instructions were emitted during the normal evaluation and glue those together.
At some point, I am going to optimize with some kind of bytecode player/recorder.
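The beta-reduction hazard above can be mimicked in C++ (a sketch, not Egel semantics; effectful and the print counter are stand-ins I invented). Evaluating the operand once and applying the result twice gives one effect; substituting the operand at both use sites, as naive beta reduction would, gives two:

```cpp
#include <functional>

// Counts the side effect; stands in for print 'hello'.
static int g_prints = 0;

// The effectful operand from the term above: perform the effect,
// then yield the increment function (\x -> x + 1).
static std::function<int(int)> effectful() {
    ++g_prints;  // the 'print'
    return [](int x) { return x + 1; };
}

// Strict right-to-left evaluation: the operand is evaluated once
// and the resulting function applied twice. One side effect.
static int eval_strict() {
    g_prints = 0;
    auto f = effectful();
    return f(f(0));
}

// After naive beta reduction the operand appears at both use
// sites, so the effect happens twice.
static int eval_reduced() {
    g_prints = 0;
    return effectful()(effectful()(0));
}
```

Both variants compute 2, which is exactly why the duplication is easy to miss: only the effect count betrays the broken 'optimization'.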
Static Programming
I thought about dynamic programming, ... and then decided against it. Though I want people (me) to be able to inspect the runtime, I don't want people to be able to modify it (except maybe through loading modules).
Simple reason: when the meaning of a symbol doesn't change that shaves off a lookup over a table, plus you enable optimizations, and that implies way faster code.
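The lookup that gets shaved off looks roughly like this (a sketch with invented names; the real interpreter's symbol table is more involved):

```cpp
#include <map>
#include <string>

// A tiny 'runtime' where inc is a global combinator.
static long inc(long x) { return x + 1; }

using Fn = long (*)(long);

// Dynamic: the symbol may be redefined at any time, so every call
// site pays a lookup through the symbol table.
static long call_dynamic(const std::map<std::string, Fn>& table, long x) {
    return table.at("inc")(x);
}

// Static: the meaning of 'inc' never changes, so the call is direct
// and the compiler is free to inline it.
static long call_static(long x) {
    return inc(x);
}
```

Beyond the per-call cost, the static version is what unlocks the other optimizations: a binding that cannot change is a constant the compiler can reason about.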
Saturday, October 30, 2021
Friday, October 29, 2021
Searching is calculating, parsing is searching
Thursday, October 28, 2021
Calculations
## Calculate:return a - calculate further with value a
def return =
[ A S -> (A, S) ]
## Calculate:chain f g - first do f, then g
def chain =
[ F G S -> [(A, S) -> G A S] (F S) ]
## Calculate:run f s - run calculation f on state s
def run =
[ F S -> [(A, S) -> A] (F S) ]
Tuesday, October 26, 2021
Pre-release v0.1.1: hotfixed critical regression
So, a critical regression made the interpreter stall on large input. I followed a hunch: since the lexer became LL(n) instead of LL(1), the UnicodeString charAt32 method kept resetting an internal pointer, resulting in quadratic behavior. The fix worked.
Back at 50KLoc/sec parses on a synthetic example. It warranted a pre-release of v0.1.1 because this was a show-stopper for me.
Ah well. Neat, I guess.
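The pathology can be demonstrated without ICU (a mock of my own, not ICU's actual implementation): random access by code point into a variable-width encoding costs a scan from the start, so a lexer that re-indexes every position does quadratic work, while a cursor that only moves forward stays linear.

```cpp
#include <cstddef>
#include <string>

// Byte length of a UTF-8 sequence, from its lead byte.
static size_t seq_len(unsigned char b) {
    if (b < 0x80) return 1;
    if (b < 0xE0) return 2;
    if (b < 0xF0) return 3;
    return 4;
}

// charAt32-style random access: find code point cp_index by
// scanning from the start, counting the scan steps taken.
static size_t index_steps(const std::string& s, size_t cp_index) {
    size_t off = 0, steps = 0;
    for (size_t k = 0; k < cp_index; ++k) {
        off += seq_len(static_cast<unsigned char>(s[off]));
        ++steps;
    }
    return steps;
}

// An LL(n) lexer that re-indexes each position from scratch does
// 0 + 1 + ... + (n-1) scan steps: quadratic in the input length.
static size_t lex_by_index(const std::string& s, size_t n_points) {
    size_t steps = 0;
    for (size_t i = 0; i < n_points; ++i) steps += index_steps(s, i);
    return steps;
}

// An LL(1)-style cursor walks the input once: linear.
static size_t lex_by_cursor(const std::string& s) {
    size_t off = 0, steps = 0;
    while (off < s.size()) {
        off += seq_len(static_cast<unsigned char>(s[off]));
        ++steps;
    }
    return steps;
}
```

On a 1000 code point input the cursor does 1000 steps while re-indexing does roughly half a million, which is exactly the stall-on-large-input symptom.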
FreeBSD 13.0 compile and install
The Egel interpreter now works on FreeBSD, though you need to set the include environment variable. It took me a day to get FreeBSD+dwm running under Qemu on my MacBook Air M1, but it felt worth it.
Saturday, October 23, 2021
To do: Time & Date
So, time and date functions for Egel. These should probably come in the form of built-in primitives since you want to be able to natively parse and format them. Following the rule 'lift C++ to Egel' I am looking at that, but I find it hard; there's some kind of impedance mismatch going on that I find difficult to place.
Wednesday, October 20, 2021
Optimal Reduction. Levy, you serious?
Optimal probably involves never performing the same reduction twice during the lifetime of a program.
- Likely there's a result akin to the halting problem: deciding whether two similar reductions might occur during the lifetime of a program is undecidable.
- Optimal sharing can probably be viewed, or stated, as an optimal search problem, and then no optimal reduction can exist unless P=NP; i.e., optimal reduction would be optimal search, which would be a polynomial reduction strategy for NP problems.
Tuesday, September 28, 2021
Built-in combinators of the Egel interpreter
I fixed a quick and dirty script to generate a list of built-in combinators. The script and the list of combinators.
Monday, September 27, 2021
FSF CFarm results
FSF graciously granted me access to their compile farm. So far, the Egel interpreter doesn't compile on a lot of machines because it relies on a lot of modern software.
It needs cmake (3.13), libicu (65.0), and libfmt (8.0) and those are not installed on the majority of machines.
I also get wildly varying results back on the performance of the microbenchmark million.eg (summation of a list of the first million integers). Below, the results:
gcc185.fsffrance.org (centos, aarch64) - successful compile and local install. 22 sec on million
gcc202.fsffrance.org (debian, sparc64) - successful compile and local install. 1.22 min on million
gcc203.fsffrance.org (debian, powerpc64) - successful compile and local install. 25 sec on million
I'll try a number of other machines (some I am interested in seem to be down). Next steps are to create rpm and deb packages.
Thursday, September 9, 2021
Dynamic dispatch in Egel
An example of how to do dynamic dispatch in Egel. This is a cool language.
# Just a showcase how one could do dynamic dispatch in Egel over a table.
#
# This is of course an extremely slow implementation over a lookup table
# represented as a list. But moving the combinators to C++ would help.

import "prelude.eg"

using System
using List

namespace Dispatch (
    val dispatch_table = ref {}

    def dispatch_register =
        [ TAG FUNC ->
            let TABLE = get_ref dispatch_table in
            let TABLE = cons (TAG FUNC) TABLE in
                set_ref dispatch_table TABLE ]

    def dispatch_findrec =
        [ TAG nil -> throw (format "dispatch for {} failed" TAG)
        | TAG0 (cons (TAG1 FUNC) TABLE) ->
            if TAG0 == TAG1 then FUNC [ _ -> dispatch_findrec TAG0 TABLE ]
            else dispatch_findrec TAG0 TABLE ]

    def dispatch_on =
        [ TAG ->
            let TABLE = get_ref dispatch_table in
                dispatch_findrec TAG TABLE ]
)

def ++ = Dispatch:dispatch_on "++"

val plus_float_register =
    let FUNC =
        [ K EXPR0 EXPR1 ->
            if [ E::float -> true | E -> false ] EXPR0
            then (+) EXPR0 EXPR1 else K nop EXPR0 EXPR1
        | K -> K nop ]
    in Dispatch:dispatch_register "++" FUNC

val plus_text_register =
    let FUNC =
        [ K EXPR0 EXPR1 ->
            if [ E::text -> true | E -> false ] EXPR0
            then (+) EXPR0 EXPR1 else K nop EXPR0 EXPR1
        | K -> K nop ]
    in Dispatch:dispatch_register "++" FUNC

def main =
    print (1.0 ++ 2.0) "\n";
    print ("hello " ++ "world") "\n";
    print (1 ++ "failure")
Wednesday, September 8, 2021
Egel interpreter runs twice as fast in VM
Okay, so I ported the interpreter to Fedora 34 (arm64) in a virtual machine on the Air. Now the microbenchmark runs twice as fast as on native, 2.5 seconds.
I thought I might have messed up and the thing doesn't garbage collect, so tested that and:
[marco@fedora debug]$ valgrind ./egel ../examples/million.eg
==2653== Memcheck, a memory error detector
==2653== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2653== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==2653== Command: ./egel ../examples/million.eg
==2653==
500000500000
==2653==
==2653== HEAP SUMMARY:
==2653== in use at exit: 3,374 bytes in 4 blocks
==2653== total heap usage: 40,051,242 allocs, 40,051,238 frees, 2,275,766,965 bytes allocated
==2653==
==2653== LEAK SUMMARY:
==2653== definitely lost: 0 bytes in 0 blocks
==2653== indirectly lost: 0 bytes in 0 blocks
==2653== possibly lost: 0 bytes in 0 blocks
==2653== still reachable: 3,374 bytes in 4 blocks
==2653== suppressed: 0 bytes in 0 blocks
==2653== Rerun with --leak-check=full to see details of leaked memory
==2653==
==2653== For lists of detected and suppressed errors, rerun with: -s
==2653== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Tuesday, September 7, 2021
Moved to MacOS (arm64). Egel interpreter now twice as fast
The microbenchmark 'million.eg', which computes the sum of the first million numbers, ran in 16 seconds on my old Intel laptop. It's just a fold: foldl (+) 0 (fromto 0 1000000)
Observe:
egel % time egel examples/million.eg
500000500000
egel examples/million.eg 6.94s user 0.10s system 99% cpu 7.061 total