For those of you who don't know, I've been trying to create my own compiler. The first step is the lexical analyzer. Basically, a lexical analyzer splits the source code into pieces and converts certain words/characters into tokens. My lexical analyzer, in this GUI program, stores the tokens in an XML file in this format:
Code:
<source>
  <token>
    <name>token class name here</name>
    <value>token value here</value>
  </token>
  ....
  ....
</source>
Here is an image of the program in action:
Attachment 114945
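In case the layout isn't clear from the snippet, here is a rough Python sketch of writing tokens out in that shape. This is not the code from the attachment; it assumes tokens are simple (class name, value) pairs, and the function name tokens_to_xml is made up for illustration.

```python
import xml.etree.ElementTree as ET

def tokens_to_xml(tokens):
    """Serialize (class name, value) pairs into the <source>/<token> layout."""
    root = ET.Element("source")
    for name, value in tokens:
        token = ET.SubElement(root, "token")
        ET.SubElement(token, "name").text = name
        # ElementTree escapes characters like < and & in the value for us.
        ET.SubElement(token, "value").text = value
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(root, encoding="unicode")

print(tokens_to_xml([("identifier", "x"), ("number", "42")]))
```

Using an XML library instead of string concatenation means a token value such as < doesn't break the file.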
The scanner tries the token classes in order, so the ordering matters. In the case above, I have my number class above my decimal class, and the identifier class comes very last. The number class comes before the decimal class because (in my case) the decimal pattern also matches any plain number, while the number pattern does not match every decimal. The identifier class is last because almost everything (in my case) can be interpreted as an identifier, so it would swallow the other classes if it were tried earlier.
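To show why the order matters, here is a minimal Python sketch of a first-match-wins scanner. It is only an illustration, not the attached program: the three patterns are assumptions chosen to behave as described above, and splitting the source on whitespace is a simplification.

```python
import re

# Hypothetical token classes, listed in the order they are tried.
TOKEN_CLASSES = [
    ("number", r"\d+"),
    ("decimal", r"\d+(\.\d+)?"),  # assumed pattern; also matches plain integers
    ("identifier", r"\S+"),       # assumed catch-all; matches almost anything
]

def scan(source, classes=TOKEN_CLASSES):
    """Return (class name, value) pairs; the first matching class wins."""
    tokens = []
    for word in source.split():
        for name, pattern in classes:
            # fullmatch requires the whole word to fit the pattern.
            if re.fullmatch(pattern, word):
                tokens.append((name, word))
                break
    return tokens

print(scan("x 42 3.14"))
# Reversing the order puts identifier first, so it claims everything.
print(scan("x 42 3.14", list(reversed(TOKEN_CLASSES))))
```

With the original order, 42 is classified as a number and 3.14 as a decimal. With the reversed order, every word comes back as an identifier, which is the reason the catch-all class goes last.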
Here is the program (minus the binaries):
Attachment 114947
Here is the template that I use for my language in case you don't want to create the token classes:
Attachment 114949
Now, this program is not practical for creating a custom compiler; it simply shows one technique for writing a scanner.
