I have an extensive list of subjects I can work on in the Egel language. Among them: performance (it needs to become an order of magnitude faster), the introduction of quotation and local combinator definitions, fixing small unfinished corners, mobile code, and, last but not least, various language experiments around features not easily expressible in typed languages.
But I also want to migrate to C++20 modules and introduce an `egel` namespace, and I want to do that before all of the above, since it should simplify and sanitize the interpreter. Modules should allow for a more principled approach to the implementation, and I'll be visiting, and subsequently cleaning, a lot of code not touched in years. I prefer single-file (complete) modules for a bit of a silly reason: I mostly program in vim and it's just faster/easier to modify declarations that are not spread over different files.
Egel is intended to be an interpreted language implemented in C++, closely following the C++ standard and not much more (except for Unicode/libicu), and to be easily extendable and embeddable in C++. This already gives problems, since the C++20 compilers (gcc, clang, msvc) are at various stages of supporting the C++20 module system. This adds to the problems I already have supporting gcc/clang: the interpreter usually only builds on up-to-date systems due to various moving targets, i.e., cmake and libraries.
I am at the start of the process, and I decided to document a number of problems I encountered.
- Naming. A silly observation, but do I go for `.ixx`, `.cppm`, or `.cpp` files? I decided on `.ixx`. Rationale: I don't like four-letter extensions, and `.cpp` is too overloaded for my taste. This is weird because `.ixx` is the msvc convention and I don't support msvc at the moment; I only work with clang since my shift to a Mac M1, and I primarily want to support the venerable gcc.
- Macros. I make extensive use of macros. A number of them I have been phasing out (like casts), but there are two types of macros I would prefer to keep: assertions (like `PANIC(m)`, abort with a message) and multiple assignment helpers (like `SPLIT_LAMBDA(l, vv, t)`, split a lambda AST object into variables and a term). Then there are boilerplate macros (like `DYADIC_PREAMBLE`, which sets up all the boilerplate of a dyadic combinator). Where do they go? Do I need to include a `macro.hpp` header file in every module?
- Constants. A number of constants are defined as macros in the interpreter, like `#define LPAREN '('`. I can easily switch over to `constexpr`, but that implies I need to decide on a type, in my case `char` or `UChar32`. I would prefer not to. It gets worse for string constants: `char*`, `string`, or `icu::UnicodeString`? Sometimes it makes sense to treat something as text instead of a typed expression. This is a minor detail, but switching over to `constexpr` actually makes the code more tedious to support long term, instead of the opposite.
- Inclusion of other libraries. Where are the C++20 module equivalents of `<iostream>` and libicu? Do I include or do I import, and if I import, what do I import? Where do I find the information, i.e., what website documents where the objects are defined?
- CMake. I have no clue how to write CMake modules that support C++20 modules, let alone in a portable manner.
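On the constants question above: picking `char` for a `constexpr` constant does not force every use site to be `char`, because an ASCII `char` widens implicitly to `UChar32` (which ICU declares as `int32_t`). A sketch under that assumption; `RPAREN`, `LPAREN_TEXT`, `to_code_point`, and `is_lparen` are hypothetical names, and the `UChar32` alias is a stand-in so the example compiles without ICU:

```cpp
// Sketch only: every name except LPAREN is hypothetical, and the UChar32
// alias below stands in for ICU's typedef so this compiles without ICU.
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

namespace egel {

using UChar32 = std::int32_t;  // ICU declares UChar32 as int32_t

// Was: #define LPAREN '('
inline constexpr char LPAREN = '(';
inline constexpr char RPAREN = ')';

// One definition for text; callers convert to the string type they need.
inline constexpr std::string_view LPAREN_TEXT = "(";

// Explicit widening of a char constant to a code point. Only ASCII is
// accepted, because a lone non-ASCII UTF-8 byte is not a code point.
constexpr UChar32 to_code_point(char c) {
    assert(static_cast<unsigned char>(c) < 0x80);
    return static_cast<UChar32>(c);
}

// Compares a decoded code point against the char constant; the char is
// promoted implicitly, so no second UChar32 constant is needed.
constexpr bool is_lparen(UChar32 c) { return c == LPAREN; }

static_assert(is_lparen(UChar32{0x28}));  // code point input
static_assert(is_lparen('('));            // char input, widened implicitly
static_assert(to_code_point(LPAREN) == 0x28);

}  // namespace egel
```

For text, the `std::string_view` constant converts to `std::string` with an explicit `std::string(egel::LPAREN_TEXT)`, so one definition can serve several string types. Whether that ends up less tedious than the macro is a judgment call.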