These notes are the very bare beginnings of a technical report. I felt
that it would be immoral to keep them to myself until I could publish
them, as that could take months or years; so, here they are. Enjoy,
and please get in touch if you have any comments.

  Eduardo Ochs
  http://angg.twu.net/
  [email protected]
  2001nov29


Bootstrapping a Forth-like language in 50 lines of Lua code
===========================================================

If we define a Forth-like language as being one in which the
interpreter parses a word, executes it immediately and repeats the
process indefinitely, then the code below is an implementation of a
Forth-like language:

  res = {}
  re = function( res, name, def )
      if res[name] then return res[name] end
      res[name] = regex(def)
      return res[name]
    end
  re(res, "getline",   "^([^\n]*)(\n?)")
  re(res, "getspaces", "^([^ \t]*)")
  re(res, "getword",   "^[ \t]*([^ \t\n]*)")

  program = {}
  program.string = readfile(arg[1])
  program.pos = 0
  getword = function( )
      local _, mall, m1 = regmatch(res.getword, program.string, program.pos)
      program.pos = program.pos + strlen(mall)
      return m1
    end
  getline = function( )
      local _, mall, m1, nl = regmatch(res.getline, program.string, program.pos)
      program.pos = program.pos + strlen(mall)
      if mall ~= "" then return m1 end
    end
  getuntilre = function( delimre )
      local offset, mdelim = regmatch(re(res, delimre, delimre),
                                      program.string, program.pos)
      local m1 = strsub(program.string, program.pos+1, program.pos+offset)
      program.pos = program.pos+offset+strlen(mdelim)
      return m1
    end

  dict = {}
  dict[""] = function( ) getline() end
  dict["lua-until"] = function( ) dostring(getuntilre(getword())) end

  while 1 do dict[getword()]() end

The last block is the main loop, which parses a word with getword(),
converts it to a function by looking it up in a dictionary, and
executes the function. The second-to-last block defines the only two
words with which the dictionary starts: "", which is executed every
time the parser reaches an end of line and simply advances the parser
pointer (stored in program.pos) past the end-of-line char; and
"lua-until", which parses a string up to a certain delimiter and
evaluates that string as Lua code. The idea is that we can use that
code to add more words to the dictionary, to replace the interpreter's
main loop by something else, or whatever; thus, "lua-until" is
essentially all that is needed to bootstrap a more powerful system.

The execution of lua-until is a bit tricky, so let's see it in detail.
Consider the following miniforth program:

  lua-until EOL
    print("Hello")
    exit()
  EOL
  this is not executed

The meaning of "lua-until" is given by

  dict["lua-until"] = function( ) dostring(getuntilre(getword())) end

so the execution of lua-until in the block above consists of parsing a
word ("EOL", in this case), then running getuntilre("EOL") to parse
everything up to its next occurrence -- getuntilre("EOL") will return
the string '\n  print("Hello")\n  exit()\n' -- and evaluating that
with dostring, which will print "Hello" and leave miniforth. Note that
the parser never touches what comes after the second EOL -- the "this
is not executed".

This is an example of a slightly less trivial miniforth program, in
which the lua-until block is used to define two new words:

  lua-until EOL
    dict["hello"] = function( ) print("hello") end
    dict["bye"] = exit
  EOL
  hello
  bye

This is another one, in which we define two words that themselves
parse the words that follow them (actually `#' parses all the rest of
the current line). Note that `p' evaluates the word as Lua code, and so
it is fairly versatile; "p exit()", for example, leaves miniforth.

  lua-until EOL
    dict["p"] = function( ) pa(eval(getword())) end
    dict["#"] = getline
  EOL
  p "Hello"
  p 1+2
  p dict
  # comment
  p exit()

And this is the classical ": square dup * ; : cube dup square * ;"
example -- but without bytecodes.

  lua-until EOL
    dstack = {}
    rstack = {program}
    dpush = function( val ) tinsert(dstack, 1, val) end
    dpop  = function( ) return tremove(dstack, 1) end
    rpush = function( prog ) tinsert(rstack, 1, prog); program = prog end
    rpop  = function( ) tremove(rstack, 1); program = rstack[1] end
    dict[""] = function( )
        getline()
        if program.pos == strlen(program.string) then rpop() end
      end
    f = function( code ) rpush({string=code, pos=0}) end
    re(res, ";;", "[ \t\n];;([ \t\n]|$)")
    dict["::"] = function( )
        local word, code = getword(), getuntilre(";;")
        dict[word] = function( ) f(%code) end
      end
    dict["::lua"] = function()
        local word, code = getword(), getuntilre(";;")
        dict[word] = dostring(format("return function() %s\nend", code))
      end
  EOL
  ::lua * dpush(dpop()*dpop()) ;;
  ::lua dup dpush(dstack[1]) ;;
  ::lua . pa(dpop()) ;;
  ::lua val dpush(eval(getword())) ;;
  :: square dup * ;;
  :: cube dup square * ;;
  val 5 cube .
  val exit()

# (find-fline "~/miniforth/")
# (find-fline "~/miniforth/miniforth1.lua")
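
The listings above are written for Lua 4.0 and lean on a few helpers
(regex, regmatch, readfile, pa, eval) that do not appear to be part of
its standard library. As a rough sketch only, and not the code of these
notes, the bootstrap interpreter might be ported to stock Lua 5.x along
the following lines. The names run_miniforth, output and load_chunk are
hypothetical; eval is a guess at what the original helper does;
delimiters are matched as plain strings rather than regular
expressions; positions are 1-based; and the main loop stops at the end
of the input instead of running forever.

```lua
-- Sketch of the bootstrap interpreter for stock Lua 5.x (assumptions:
-- run_miniforth, output, load_chunk and this eval are illustrative names).
load_chunk = loadstring or load   -- Lua 5.1 has loadstring, 5.2+ has load

program = { string = "", pos = 1 }  -- pos is the 1-based index of the next char
dict = {}
output = {}                         -- words append here instead of printing

-- Skip blanks and tabs, then return the next run of non-blank characters.
-- Returns "" when the parser is sitting on a newline or at end of input.
function getword()
  local _, e, word = string.find(program.string, "^[ \t]*([^ \t\n]*)", program.pos)
  program.pos = e + 1
  return word
end

-- Consume the rest of the current line, including its newline.
-- Returns nil when nothing at all could be consumed (end of input).
function getline()
  local start = program.pos
  local _, e, line = string.find(program.string, "^([^\n]*)\n?", start)
  program.pos = e + 1
  if e >= start then return line end
end

-- Return everything up to the next occurrence of delim (a plain string),
-- and move the parser past the delimiter.
function getuntil(delim)
  local s, e = string.find(program.string, delim, program.pos, true)
  assert(s, "delimiter not found: " .. delim)
  local text = string.sub(program.string, program.pos, s - 1)
  program.pos = e + 1
  return text
end

-- Evaluate a Lua expression given as a string.
function eval(expr)
  return assert(load_chunk("return " .. expr))()
end

-- The only two words the dictionary starts with.
dict[""] = function() getline() end
dict["lua-until"] = function()
  local chunk = assert(load_chunk(getuntil(getword())))
  chunk()
end

-- Main loop: parse a word, look it up, execute it, repeat.
function run_miniforth(source)
  program = { string = source, pos = 1 }
  while program.pos <= #program.string do
    local word = getword()
    local action = dict[word]
    if not action then error("unknown word: " .. word) end
    action()
  end
end

-- Usage: define two words from inside the program, then call them.
run_miniforth([[
lua-until EOL
  dict["hello"] = function() output[#output + 1] = "hello" end
  dict["p"] = function() output[#output + 1] = tostring(eval(getword())) end
EOL
hello
p 1+2
]])
print(table.concat(output, " "))
```

Everything that the words need (dict, getword, eval, output) is kept
global on purpose: chunks compiled from strings only see globals, and
that is what lets a lua-until block extend the running interpreter.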