Zachary W. Huang
Making a programming language is challenge which I have always wanted to try. I had made a few sporadic and unsuccessful attempts in the past, but during a trip to Texas, I decided to take up the challenge once again.
I decided to try and implement a simple language - BASIC, the same language found on old 8-bit computers and calculators.
I used the Julia programming language for its meta-programming capabilities.
My BASIC interpreter essentially just transpiles BASIC into the Julia AST representation, which is then evaluated by the Julia JIT JAOT Compiler.
Lexing is the process of turning pure source code into semantically important tokens. For example, lexing
IF x < y THEN GOTO 30
would result in something similar to:
[
Keyword("IF"),
Identifier("x"),
Operator("<"),
Identifier("y"),
Keyword("THEN"),
Keyword("GOTO"),
Integer(30)
]
A sequence of tokens can then be parsed into an Abstract Syntax Tree (AST) or some other sort of representation (such as bytecode). My interpreter used recursive descent parsing to convert the lexed tokens into the Julia AST, which is a lowered representation of Julia code.
For example, something like PRINT (2 + 2)
would be converted to Meta.Expr(:call, :print, Meta.expr(:call, :+, 2, 2))
.
This step was quite easy for me - I just called Julia’s eval
function on each AST expression.
After a bit more fiddling with GOTO
s, I had a decent BASIC interpreter.
Sure, it was missing FOR loops, functions, and data types other than Float, Boolean, and String, but it could do things like calculate square roots:
// An example program written for the basic1 interpreter found in `/src/basic1.jl`
10 LET x = 9999
20 LET dx = 0.0001
30 LET y = 1
// determines if y^2 is close to x
40 IF ((y * y) - x < dx) && ((y * y) - x > -dx) THEN GOTO 70 ELSE GOTO 50
// improves the guess y for y = x^2 using Newton's method
50 LET y = (y + (x / y)) / 2
60 GOTO 40
70 PRINT y \ PRINT "is approximately the square root of" \ PRINT x
Output:
99.99500012962753
is approximately the square root of
9999
All in all, my interpreter supported:
- Features:
- Standard math operators (* - + /) along with integer division (|) and string concatenation (++)
- Comparing values (< > == !=)
- Single line comments (//)
- Integer, float, string (".."), and boolean (#t #f) literals
- Each line can have a line label, which is either an integer or a valid identifier followed by a colon
- The first labeled line in a program is chosen as the entry point for the interpreter
- Multiple statements on a single line can be separated by backslashes ()
- Keywords:
- LET <identifier> = <expression>
- IF <expression> THEN <statement> [ELSE <statement>]
- WHILE <expression> THEN <statement>
- PRINT <expression>
- INPUT <identifier> will read an input line into the variable name
- NOP does nothing
- GOTO <line label>, can jump to a line number, a line label, or a variable containing either
- EXIT exits the program
I cannot emphasize how great the book Crafting Interpreters by Bob Nystrom is. It really helped me to understand the ideas and concepts that drive programming languages.