Sunday, December 11, 2016

Log 121116

Just writing some stuff down since I didn't find a scheme yet to implement. So, what the interpreter now does is that it keeps all declarations in a nested map. All combinator definitions live fully qualified in the most outward range/symbol table; local ranges are used to reference renamed imported combinators or for variable declarations. Lookup and insertion are more or less O(1), more precise O(n*log(d)) where n is a small number denoting the nesting depth and d the number of declarations.

But when a namespace is entered or a using directive is encountered the global namespace is searched fully for declarations which match a prefix. I.e., quadratic behavior. In the case of the synthetic example with 6k namespaces and a total of 10k declarations, you end up with 60M lookups. (I knew that when I coded it, just didn't know what price I would pay for it in practice.)

This is, of course, utterly ridiculous. Good enough for a very short first-order approximation, not good enough to publish to Github.

Solution: Nested linked symbol tables. A namespace is nothing more than a named symbol table or range. So, I know there is an O(1) solution^1 where entering a namespace is as simple as adding a reference to a scope. I.e, four orders speedup on the synthetic example. But I guess I would need to implement a lot of nesting and I want it to be neat.

I assume this is also where namespaces originate from. It's just a trivial extension to languages which already have complex scoping rules, thus named symbol tables, like most OO languages.

^1: There is also an O(n) solution where entering a namespace, or encountering a using directive, is equivalent to importing the symbols from another scope. This is somewhat nicer since programmers are then warned for overlapping declarations. But I'll go for the O(1) behavior, thank you. I want it to be fast.

No comments:

Post a Comment