For those of you who do not know, I've been trying to create my own compiler. The first step in this is the lexical analyzer. Basically, a lexical analyzer splits the source code into pieces and converts certain words/characters into tokens. My lexical analyzer in this GUI program stores the tokens in an XML file in this format:
Code:
<source>
  <token>
    <name>token class name here</name>
    <value>token value here</value>
  </token>
  ....
  ....
</source>

Here is an image of the program in action:
The scanner tries the token classes in order, so the ordering matters. In the case above, I have my number class above my decimal class, and the identifier class very last. The number class comes before the decimal class because (in my case) the decimal pattern also matches any plain number, but the number pattern does not match every decimal. The identifier class is last because almost everything (in my case) can be interpreted as an identifier.
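To make the ordering point concrete, here is a minimal sketch in Python of a first-match scanner that writes its tokens in the XML layout shown earlier. It is not the code from the attached program: the class names, the regular expressions, the whitespace splitting, and the helper names (`classify`, `scan`, `to_xml`) are all assumptions made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical token classes, tried top to bottom. The names and patterns are
# illustrative assumptions, not the contents of the attached template.
TOKEN_CLASSES = [
    ("number", r"\d+"),
    ("decimal", r"\d+(\.\d+)?"),  # also matches plain numbers, so it sits below "number"
    ("operator", r"[+\-*/=]"),
    ("identifier", r"\S+"),       # matches almost anything, so it sits last
]

def classify(lexeme):
    """Return the name of the first class whose pattern matches the whole lexeme."""
    for name, pattern in TOKEN_CLASSES:
        if re.fullmatch(pattern, lexeme):
            return name
    return None

def scan(source):
    """Split on whitespace (an assumption) and pair each lexeme with its class."""
    return [(classify(lexeme), lexeme) for lexeme in source.split()]

def to_xml(tokens):
    """Serialize (class name, value) pairs as <source><token>...</token></source>."""
    root = ET.Element("source")
    for name, value in tokens:
        token = ET.SubElement(root, "token")
        ET.SubElement(token, "name").text = name
        ET.SubElement(token, "value").text = value
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(to_xml(scan("x = 42 + 3.14")))
```

With this order, "42" is caught by the number class before the decimal class gets a chance, while "3.14" falls through to decimal. Moving the identifier entry to the top of the list would make every lexeme come out as an identifier, which is why it goes last.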
Here is the program (minus the binaries):
lexical_analyzer.zip
Here is the template that I use for my language in case you don't want to create the token classes:
custom template.xml
Note that this program is not practical for building a custom compiler; it simply demonstrates one scanning technique.



