Jun 25th, 2009, 09:57 PM
Get the Text in between two words (such as HTML Tags) without RegEx
Since I've seen this asked a lot, here is a function for getting all the text in between two other strings (or tags, words, etc.).
Code:
Private Function GetTagContents(ByVal Source As String, ByVal startTag As String, ByVal endTag As String) As List(Of String)
    Dim StringsFound As New List(Of String)
    Dim Index As Integer = Source.IndexOf(startTag) + startTag.Length
    While Index <> startTag.Length - 1
        StringsFound.Add(Source.Substring(Index, Source.IndexOf(endTag, Index) - Index))
        Index = Source.IndexOf(startTag, Index) + startTag.Length
    End While
    Return StringsFound
End Function
Example Scenario:
If Source was set to "I {b}love{/b} the word {b}life{/b} don't you?" and you set "{b}" and "{/b}" as the starting and ending tags, the List {"love","life"} would be returned. If the tags don't appear at all in the Source string, then the list's Count will be 0.
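To try the scenario above, here is a small console sketch. The module name TagContentsDemo and the Sub Main wrapper are just assumptions for illustration; the function body is the one from the top of the post, only made Public so Main can call it.

```vbnet
Imports System
Imports System.Collections.Generic

Module TagContentsDemo
    ' Same logic as GetTagContents from the post, made Public for the demo.
    Public Function GetTagContents(ByVal Source As String, ByVal startTag As String, ByVal endTag As String) As List(Of String)
        Dim StringsFound As New List(Of String)
        Dim Index As Integer = Source.IndexOf(startTag) + startTag.Length
        While Index <> startTag.Length - 1
            StringsFound.Add(Source.Substring(Index, Source.IndexOf(endTag, Index) - Index))
            Index = Source.IndexOf(startTag, Index) + startTag.Length
        End While
        Return StringsFound
    End Function

    Sub Main()
        ' Two tagged words, so two entries come back.
        Dim found As List(Of String) = GetTagContents("I {b}love{/b} the word {b}life{/b} don't you?", "{b}", "{/b}")
        Console.WriteLine(String.Join(", ", found.ToArray()))   ' love, life

        ' No tags at all, so the list is empty.
        Dim none As List(Of String) = GetTagContents("no tags in here", "{b}", "{/b}")
        Console.WriteLine(none.Count)                            ' 0
    End Sub
End Module
```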
Explanation:
The first 2 lines are just variable declarations, although in the second one we go ahead and search for our first match with:
Code:
Source.IndexOf(startTag) + startTag.Length
As you can see, it's just a normal IndexOf, which gives us the index of the first start tag, but then I add startTag.Length. The reason for this addition is that we don't want the index of the startTag itself, we want the index of the text after it, so adding the length of startTag to its index gives us the index of what comes directly after it.
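If you want to see the numbers, here is a tiny sketch of that addition. TextStart is a made-up helper name for this example only, not part of the original function.

```vbnet
Imports System

Module StartIndexDemo
    ' Hypothetical helper: index of the first character after the first startTag.
    Public Function TextStart(ByVal source As String, ByVal startTag As String) As Integer
        Return source.IndexOf(startTag) + startTag.Length
    End Function

    Sub Main()
        Dim source As String = "I {b}love{/b} it"
        Console.WriteLine(source.IndexOf("{b}"))                    ' 2, where "{b}" begins
        Console.WriteLine(TextStart(source, "{b}"))                 ' 5, just past "{b}"
        Console.WriteLine(source.Chars(TextStart(source, "{b}")))   ' l
    End Sub
End Module
```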
Then comes the While Loop. Our condition is:
Code:
While Index <> startTag.Length - 1
As you know, when IndexOf can't find the string, it returns -1. We can't just put "While Index <> -1", though, because we always add startTag.Length to the result of IndexOf to get the index of the text after the tag. So the "not found" value stored in Index is really -1 + startTag.Length, which the code writes the other way around as startTag.Length - 1 to make it easier to read.
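A quick way to convince yourself of the shifted "not found" value is the sketch below. NotFound is a hypothetical helper that only exists for this example; it does the same comparison as the loop condition, just inverted.

```vbnet
Imports System

Module SentinelDemo
    ' Hypothetical helper: True when the stored index means "startTag not found".
    Public Function NotFound(ByVal index As Integer, ByVal startTag As String) As Boolean
        Return index = startTag.Length - 1
    End Function

    Sub Main()
        Dim startTag As String = "{b}"
        Dim source As String = "no tags here"
        ' IndexOf returns -1, so the stored value is -1 + 3.
        Dim index As Integer = source.IndexOf(startTag) + startTag.Length
        Console.WriteLine(index)                       ' 2
        Console.WriteLine(NotFound(index, startTag))   ' True
    End Sub
End Module
```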
Then comes the first line of the loop:
Code:
StringsFound.Add(Source.Substring(Index, Source.IndexOf(endTag, Index) - Index))
It starts off with StringsFound.Add, so as you can tell we're going to add the string we just found to the list. If no start tag was found, the loop never runs thanks to its condition. At this point we still don't have the string to add, just its starting index, so inside the Add call we also extract the text in between the tags. We take a Substring of Source starting at Index, which we already know from when we declared it. Substring's second argument is a length, not an end index, so we search for the endTag using IndexOf and subtract Index from its index. Notice that you don't add the endTag's length like we did with startTag; this is because the index of endTag is one past the last character of the string we need, so we don't need to change it. Keep in mind that this assumes every startTag is followed by an endTag: if one is missing, IndexOf returns -1, the length becomes negative and Substring throws an ArgumentOutOfRangeException.
Notice that there is an extra parameter in this IndexOf, though. That's because later on we're going to move on to the next set of tags, and we don't want to get the index of the same endTag every time! The second parameter is the index at which to start looking for the endTag. That part is easy: we want to start looking right where the text begins, so we just use Index.
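Here is a small sketch of that line on its own, using the second match from the example scenario. Between is a hypothetical helper name, and the index 26 is simply where "life" starts in that sample string.

```vbnet
Imports System

Module SubstringDemo
    ' Hypothetical helper: the text from index up to the next endTag.
    ' Like the original line, it assumes an endTag exists after index.
    Public Function Between(ByVal source As String, ByVal index As Integer, ByVal endTag As String) As String
        Dim endIndex As Integer = source.IndexOf(endTag, index)
        ' Substring takes (start, length), so the length is endIndex - index.
        Return source.Substring(index, endIndex - index)
    End Function

    Sub Main()
        Dim source As String = "I {b}love{/b} the word {b}life{/b} don't you?"
        ' Without a start position, IndexOf finds the first "{/b}" again.
        Console.WriteLine(source.IndexOf("{/b}"))        ' 9
        ' Starting at 26 it finds the one that closes "life".
        Console.WriteLine(source.IndexOf("{/b}", 26))    ' 30
        Console.WriteLine(Between(source, 26, "{/b}"))   ' life
    End Sub
End Module
```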
Then comes the last line of the loop:
Code:
Index = Source.IndexOf(startTag, Index) + startTag.Length
It's nearly the same as the line we declared Index on; in fact it does the same thing. Can you spot the difference? Yes, there is a second parameter for this IndexOf. We do that for the same reason as in the first line of the loop: we don't want to find the same startTag over and over again. Since Index points at the very beginning of the text in between, which comes right after the startTag we just found, searching from that same index gets us the next startTag after it.
And that's it for the loop. The last line simply returns the List(Of String), StringsFound, as the result of the function.
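If your input might contain a startTag with no endTag after it, or an empty tag string, one possible way to guard against that is sketched below. This is not the original function: the name GetTagContentsSafe, the early return for empty arguments, the choice to stop at the first unclosed tag, and the use of StringComparison.Ordinal are all assumptions on my part.

```vbnet
Imports System
Imports System.Collections.Generic

Module SafeTagContents
    ' Hypothetical guarded variant of GetTagContents (name and behaviour are assumptions).
    Public Function GetTagContentsSafe(ByVal source As String, ByVal startTag As String, ByVal endTag As String) As List(Of String)
        Dim found As New List(Of String)
        ' An empty startTag would match at every position and loop forever, so bail out.
        If String.IsNullOrEmpty(source) OrElse String.IsNullOrEmpty(startTag) OrElse String.IsNullOrEmpty(endTag) Then
            Return found
        End If

        Dim tagIndex As Integer = source.IndexOf(startTag, StringComparison.Ordinal)
        While tagIndex <> -1
            Dim textIndex As Integer = tagIndex + startTag.Length
            Dim endIndex As Integer = source.IndexOf(endTag, textIndex, StringComparison.Ordinal)
            ' Unclosed start tag: stop here instead of passing a negative length to Substring.
            If endIndex = -1 Then Exit While
            found.Add(source.Substring(textIndex, endIndex - textIndex))
            ' Like the original, continue the search from the start of the text just found.
            tagIndex = source.IndexOf(startTag, textIndex, StringComparison.Ordinal)
        End While
        Return found
    End Function

    Sub Main()
        ' The second "{b}" is never closed, so only the first match is returned.
        Dim result As List(Of String) = GetTagContentsSafe("{b}one{/b} and {b}two", "{b}", "{/b}")
        Console.WriteLine(result.Count)   ' 1
        Console.WriteLine(result(0))      ' one
    End Sub
End Module
```

Keeping the index of the tag itself in tagIndex means the loop can compare against plain -1, at the cost of one extra local variable.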
Last edited by Vectris; Sep 8th, 2009 at 07:35 PM.