For those of you who do not know, I've been trying to create my own compiler. The first step is the lexical analyzer. Basically, a lexical analyzer splits source code into pieces and converts certain words/characters into tokens. My lexical analyzer in this GUI program stores the tokens in an XML file in this format:
Code:
<source>
    <token>
        <name>token class name here</name>
        <value>token value here</value>
    </token>
    ....
    ....
</source>
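To make the format concrete, here is a minimal Python sketch that builds the same layout from a list of (name, value) pairs. It is not the code from my program; the function name `tokens_to_xml` and the sample token class names are made up for illustration, and it only assumes the `<source>`/`<token>`/`<name>`/`<value>` structure shown above.

```python
import xml.etree.ElementTree as ET

def tokens_to_xml(tokens):
    """Build the <source>/<token>/<name>/<value> layout from (name, value) pairs.

    Hypothetical helper for illustration only.
    """
    root = ET.Element("source")
    for name, value in tokens:
        token = ET.SubElement(root, "token")
        ET.SubElement(token, "name").text = name    # token class name
        ET.SubElement(token, "value").text = value  # matched text
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")

# Sample token class names are assumptions, not taken from the real template.
xml_text = tokens_to_xml([("identifier", "x"), ("number", "42")])
print(xml_text)
```

The output is a single line without indentation; a real file would probably be pretty-printed, but the element structure is the same.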
Here is an image of the program in action:
[Image: image.jpg]

The scanner tries the token classes in order, so the ordering matters. In the case above, I have my number class above my decimal class, and the identifier class last. The number class comes before the decimal class because the decimal class (in my case) also matches any plain number, while the number class does not match every decimal. The identifier is last because almost everything (in my case) can be interpreted as an identifier.
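The effect of the ordering can be sketched in a few lines of Python. This is only an illustration under assumptions: the three regular expressions, the whitespace splitting, and the names `TOKEN_CLASSES`, `classify`, and `scan` are all hypothetical and are not taken from my program or template.

```python
import re

# Hypothetical token classes, tried top to bottom; the first full match wins.
TOKEN_CLASSES = [
    ("number", r"\d+"),          # digits only
    ("decimal", r"\d*\.?\d+"),   # also matches plain digits, so it must come later
    ("identifier", r"\S+"),      # matches almost anything, so it goes last
]

def classify(lexeme, classes=TOKEN_CLASSES):
    """Return the name of the first class whose pattern matches the whole lexeme."""
    for name, pattern in classes:
        if re.fullmatch(pattern, lexeme):
            return name
    return None

def scan(source, classes=TOKEN_CLASSES):
    """Split on whitespace (an assumption) and classify each piece in order."""
    return [(classify(word, classes), word) for word in source.split()]

tokens = scan("count 42 3.14")
print(tokens)

# Putting decimal first makes the plain number 42 come out as a decimal.
swapped = [TOKEN_CLASSES[1], TOKEN_CLASSES[0], TOKEN_CLASSES[2]]
print(scan("42", swapped))
```

With the order shown, `42` is classified as a number and `3.14` falls through to decimal. With decimal moved to the top, `42` is picked up as a decimal and the number class is never reached.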

Here is the program (minus the binaries):
lexical_analyzer.zip

Here is the template that I use for my language in case you don't want to create the token classes:
custom template.xml

Note that this program is not practical for creating a custom compiler; it simply demonstrates a scanning technique.