-
Mar 30th, 2011, 01:29 AM
#1
Thread Starter
PowerPoster
[RESOLVED] Read line between specific tag (local html)
Hello,
How do I read the text between a specific pair of tags? For example, my HTML file contains:
<title>This is my Title</title>
I found some code, which I had to convert from C#, but it doesn't work:
Code:
Private Shared Function GetTitle(html As String) As String
    Dim r As New Regex("<title.*?>")
    Dim Title_start As Integer = 0
    Dim Title_end As Integer = 0
    For Each m As Match In r.Matches(html)
        If m.Value.ToLower = "<title>" Then
            Title_start = m.Index + m.Length
            Exit For
        End If
    Next
    r = New Regex("</title.*?>")
    For Each m As Match In r.Matches(html)
        If m.Value.ToLower = "</title>" Then
            Title_end = m.Index - Title_start
            Exit For
        End If
    Next
    Dim title As String = html.Substring(Title_start, Title_end)
    Return title
End Function
I have a textbox which gets its value from Url.ToString on DocumentCompleted. For example, when I have loaded a local HTML file, Url.ToString would be file:///C:/page.html, so I need to call it like:
Code:
GetTitle(textbox.text)
Unfortunately, it doesn't work.
I have already loaded the HTML text into a RichTextBox, so perhaps it is easier to extract it from there.
Thanks in advance.
Last edited by Radjesh Klauke; Mar 30th, 2011 at 02:03 AM.
-
Apr 2nd, 2011, 04:45 PM
#2
Hyperactive Member
Re: Read line between specific tag (local html)
Hi Radjesh,
This is a good site for working out regex syntax:
http://www.codeproject.com/KB/dotnet/regextutorial.aspx
Here's the regex I would use: "(?<=<(TagName).*>).*(?=<\/\1>)"
The function below shows how you can get the text between HTML tags using the regex string above. Note that you would need to pass the results back into the function to get nested tags.
Code:
'Requires "Imports System.Text.RegularExpressions" at the top of the file.
'You can give any TagName e.g. title, H1, div, head etc.
Private Function Get_HTMLTag(ByVal TagName As String, ByVal HTML As String) As List(Of String)
    Dim lMatch As New List(Of String) 'Get the results in a List of strings
    'RegexOptions.IgnoreCase allows case mismatch e.g. if TagName="title" results can include "title", "Title", "TITLE" etc.
    'RegexOptions.Singleline allows .* to see past line breaks (by default "." does not match a newline)
    Dim Tag As New Regex("(?<=<" & TagName & ">).*(?=<\/" & TagName & ">)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
    For Each rMatch As Match In Tag.Matches(HTML)
        lMatch.Add(rMatch.Value)
    Next
    Return lMatch
End Function
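One thing to watch with the pattern in the function above: .* is greedy, so if the same tag appears more than once, the single match runs from the first opening tag to the last closing tag. An opening tag with attributes (e.g. <div class="x">) is also not matched, because the function looks for the bare <TagName>. Below is a minimal sketch of a variant, assuming a capture group with a lazy .*? is acceptable for your pages. GetTagContents and TagSketch are hypothetical names used for illustration only, not part of the function above.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TagSketch
    'Hypothetical helper: returns the inner text of every <tagName ...>...</tagName> pair.
    Public Function GetTagContents(ByVal tagName As String, ByVal html As String) As List(Of String)
        Dim results As New List(Of String)
        'Regex.Escape guards against regex metacharacters in tagName.
        Dim safeName As String = Regex.Escape(tagName)
        '\b[^>]*> tolerates attributes in the opening tag.
        '(.*?) is lazy, so each match stops at the nearest closing tag.
        Dim pattern As String = "<" & safeName & "\b[^>]*>(.*?)</" & safeName & "\s*>"
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        For Each m As Match In Regex.Matches(html, pattern, opts)
            'Group 1 holds only the text between the tags.
            results.Add(m.Groups(1).Value)
        Next
        Return results
    End Function
End Module
```

This still does not handle tags of the same name nested inside each other, and a regex is only a reasonable choice for simple, well-formed pages.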
And this is how you would call the function:
Code:
Dim Tags As List(Of String) = Get_HTMLTag("title", TextBox1.Text)
'You can loop through the list to view all of the results
For Each Tag As String In Tags
    MsgBox(Tag)
Next
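Note that the second argument has to be the HTML text itself. In the first post the textbox holds the page URL (file:///C:/page.html), not the page content, so passing it in as-is gives the regex nothing to find. Here is a rough sketch of one way to get from a local file URL to the title, assuming the URL points at a readable file on disk. ReadLocalHtml, GetTitleFromHtml and LocalHtmlSketch are hypothetical names made up for this example.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module LocalHtmlSketch
    'Hypothetical helper: reads the HTML text behind a file:// URL.
    Public Function ReadLocalHtml(ByVal url As String) As String
        Dim pageUri As New Uri(url)
        If Not pageUri.IsFile Then
            Throw New ArgumentException("Expected a file:// URL.", "url")
        End If
        'LocalPath turns the file URL into an ordinary file system path.
        Return File.ReadAllText(pageUri.LocalPath)
    End Function

    'Hypothetical helper: returns the text of the first <title> element,
    'or an empty string when the HTML has no title.
    Public Function GetTitleFromHtml(ByVal html As String) As String
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Dim m As Match = Regex.Match(html, "<title\b[^>]*>(.*?)</title\s*>", opts)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return String.Empty
    End Function
End Module
```

With these in place the call would look like GetTitleFromHtml(ReadLocalHtml(TextBox1.Text)). If the page is already loaded in a WebBrowser control, its DocumentText property holds the HTML and DocumentTitle holds the title, so reading the file again may not be needed.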
-
Apr 3rd, 2011, 10:58 AM
#3
Thread Starter
PowerPoster
Re: Read line between specific tag (local html)
Thanks for the great tip and help. You really helped me. +REP
-
Apr 3rd, 2011, 03:51 PM
#4
Thread Starter
PowerPoster
Re: Read line between specific tag (local html)
Thanks. Works great and learned something new.