Lexical analysis using stack implementation
My goal is to scan a file and check whether the given file is well-formed or not.
for example
Code:
<test>
<yeah>
<yahoo> bla blabla </yahoo>
</yeah>
what ever <s> it is </s>
<br/> test
</test>
i have parsed every single tag into the stack now I'm at the point where I have to check whether all the tag is paired except the tag without closing like <br/>
now my stack look like the following
[8] = </test>
[7] = <br/>
[6] = </s>
[5] = <s>
[4] = </yeah>
[3] = </yahoo>
[2] = <yahoo>
[1] = <yeah>
[0] = <test>
.......
and so on.
Can you guys help me good algorithm to check whether or not all the tag is paired.
I'm not expecting the code to make this happen and I'm sure you won't give that to me, I just need good algorithm or steps that can make this happen. Been thinking for quite sometime but I cant figure the easy way to do this.
Thank you in advance
Re: Lexical analysis using stack implementation
Create another stack object.
Itterate through your list of tags
Every time you get an opening tag push it on the stack.
Every time you get a closing tag make sure there's it's opening tag on the top of the stack (if it isn't then the file isn't well-formed). If there is a corisponding opening tag on the stack then pop it off.
If you get a tag such as <br/> that appears on its own ignore it (or maybe do your own check specific to what you're parsing.
When you get to the end of the list of tags then if there's nothing on the stack the file is ok, if there's stuff still on the stack this means there are still open tags, so the file isn't well-formed.
Re: Lexical analysis using stack implementation
For your file listed the stack would have the following tags as you itterated through each one:
-Nothing-
Test
Yeah, Test
Yahoo, Yeah, Test
Yeah, Test
Test
s, Test
Test
-Nothing-