My language

I have been working on a programming language, which is still in progress. I am writing it in C so that it is fast and doesn’t use too much memory. So far, I have implemented functions, recursion, variables and strings, and am working towards generics. This is how it works:

We start with the compiler, which reads the contents of the given source file into a string. We then pass that string through a decommenting function, which iterates through the text and, on finding two consecutive / characters, removes everything from there to the end of the line. Once this is done, the source code is ready to be converted into bytecode.
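To make the decommenting step concrete, here is a minimal sketch of what such a function could look like. The name strip_line_comments is hypothetical, and the sketch makes two assumptions that may not match the real compiler: it edits the buffer in place, and it does not check whether the // sits inside a string literal.

```c
#include <stddef.h>

/* Hypothetical helper: removes `//` comments from a NUL-terminated buffer
 * in place. Everything from `//` up to (but not including) the newline is
 * dropped, so line structure is preserved.
 * Assumption: a `//` inside a string literal is not treated specially. */
void strip_line_comments(char *src) {
    char *out = src;
    const char *in = src;
    while (*in != '\0') {
        if (in[0] == '/' && in[1] == '/') {
            /* Skip to the end of the line, leaving the newline itself. */
            while (*in != '\0' && *in != '\n') {
                in++;
            }
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```

Given the text `a = 1; // set a` followed by a second line `b = 2;`, the comment is removed while the newline and the second line are left untouched.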

The resulting string is first tokenised, sorting its contents into numbers, identifiers, strings and punctuation. From there, we take a different approach to the conventional one: instead of forming an AST, we decide which kind of statement to compile based on the list of tokens (e.g. if the first token in the list is if, we are compiling an if statement). The tokens are split at each ; (marking the end of a statement), and each group is passed into a function which returns the bytecode for that statement. Statements in the bytecode used to be separated by newlines, but expressions can themselves produce newlines, which created unexpected new statements. Instead, we encode the length of each statement in the first 5 bytes of its string.
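The length prefix can be sketched as follows. This is only an illustration: the function name frame_statement is made up, and it assumes the 5 bytes hold the length as zero-padded ASCII decimal digits, which may differ from the actual encoding.

```c
#include <stdio.h>
#include <string.h>

#define LEN_DIGITS 5      /* assumed: five ASCII decimal digits */
#define MAX_BODY   99999  /* largest length that fits in five digits */

/* Hypothetical helper: writes "<5-digit length><body>" into out.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if the body is too long or the buffer is too small. */
int frame_statement(const char *body, char *out, size_t out_size) {
    size_t len = strlen(body);
    if (len > MAX_BODY || out_size < LEN_DIGITS + len + 1) {
        return -1;
    }
    snprintf(out, out_size, "%05zu", len);    /* e.g. "00011" */
    memcpy(out + LEN_DIGITS, body, len + 1);  /* body plus terminator */
    return (int)(LEN_DIGITS + len);
}
```

Because the reader knows how many bytes to consume, a newline inside the body no longer looks like the start of a new statement.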

Expressions are handled by iterating through the characters of the string backwards. Each character is compared against all of the operators, sorted by precedence. If a match is found, the process is repeated on the substrings before and after the operator, until only literals or identifiers remain. Brackets are handled by simply changing the apparent precedence of the operators inside them. Once this is done, the resulting expression tree is serialised using the length technique described earlier.
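Here is a rough sketch of that splitting idea. To keep it short it evaluates integer expressions directly instead of building and serialising a tree, and it only knows + - * / and non-negative integer literals (no identifiers or unary operators). The function names, the operator table and the BRACKET_BOOST constant are all assumptions made for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed constant: how much each bracket level raises the apparent
 * precedence. It only needs to exceed the highest base precedence. */
#define BRACKET_BOOST 10

/* Hypothetical operator table; 0 means "not an operator". */
static int base_precedence(char c) {
    switch (c) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

/* Evaluates src[start, end) by splitting at the operator with the lowest
 * apparent precedence, scanning from the right so that equal-precedence
 * operators group left to right. */
static long eval_range(const char *src, int start, int end) {
    int depth = 0, best_pos = -1, best_prec = 0;
    for (int i = end - 1; i >= start; i--) {
        char c = src[i];
        if (c == ')') { depth++; continue; }
        if (c == '(') { depth--; continue; }
        int p = base_precedence(c);
        if (p == 0) continue;
        p += depth * BRACKET_BOOST;  /* brackets raise apparent precedence */
        if (best_pos < 0 || p < best_prec) {
            best_pos = i;
            best_prec = p;
        }
    }
    if (best_pos < 0) {
        /* No operator left: skip brackets and spaces, read the literal. */
        while (start < end && !isdigit((unsigned char)src[start])) start++;
        if (start >= end) return 0;
        return strtol(src + start, NULL, 10);
    }
    long lhs = eval_range(src, start, best_pos);
    long rhs = eval_range(src, best_pos + 1, end);
    switch (src[best_pos]) {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    case '/': return rhs != 0 ? lhs / rhs : 0;  /* avoid dividing by zero */
    }
    return 0;
}

long eval_expr(const char *src) {
    return eval_range(src, 0, (int)strlen(src));
}
```

For (1+2)*3, the + sits one bracket level deep, so its apparent precedence is higher than that of the *. The string is therefore split at the * first, and the bracketed part is handled in the recursive call.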

Now that the statements have been converted into bytecode that the virtual machine can read and execute quickly, they are written to a file.

The virtual machine then executes the statements, determining the length and statement type of each from its first 6 characters. Unfortunately, type checking is currently performed in the VM rather than the compiler, and I would like to move it into the compiler at some point.
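As an illustration of how the VM might walk the bytecode, here is a small sketch of reading one statement header. The layout is an assumption: 5 ASCII decimal digits for the length, followed by a single type character, with the length counting only the bytes after that 6-character header. The Statement struct, the next_statement function and the type letters in the usage note are invented for the example.

```c
#include <stdlib.h>
#include <string.h>

#define LEN_DIGITS  5  /* assumed: length stored as five ASCII digits */
#define HEADER_SIZE 6  /* length digits plus one statement-type character */

/* Hypothetical view of one decoded statement. */
typedef struct {
    char        type;         /* statement kind, e.g. a made-up 'P' */
    const char *payload;      /* points into the bytecode buffer */
    size_t      payload_len;  /* bytes of payload after the header */
} Statement;

/* Reads the statement at *cursor and advances *cursor past it.
 * Returns 1 on success, 0 at the end of the bytecode or on a bad header. */
int next_statement(const char **cursor, const char *end, Statement *out) {
    const char *p = *cursor;
    if (end - p < HEADER_SIZE) return 0;

    char digits[LEN_DIGITS + 1];
    memcpy(digits, p, LEN_DIGITS);
    digits[LEN_DIGITS] = '\0';

    char *stop;
    long len = strtol(digits, &stop, 10);
    if (*stop != '\0' || len < 0) return 0;         /* not a valid length */
    if (end - (p + HEADER_SIZE) < len) return 0;    /* truncated payload */

    out->type = p[LEN_DIGITS];
    out->payload = p + HEADER_SIZE;
    out->payload_len = (size_t)len;
    *cursor = p + HEADER_SIZE + len;
    return 1;
}
```

With the bytecode string 00003Pabc00002Ixy, the first call yields type P with the payload abc, the second yields type I with xy, and the third returns 0 because nothing is left.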
