Thread: [RESOLVED] Read line between specific tag (local html)

  1. #1

    Thread Starter
    PowerPoster Radjesh Klauke
    Join Date
    Dec 2005
    Location
    Sexbierum (Netherlands)
    Posts
    2,244

    [RESOLVED] Read line between specific tag (local html)

    Ola,

    How do I read the text between two specific tags? For example:

    My HTML file contains
    <title>This is my Title</title>

    I found some code, which I had to convert from C#, but it doesn't work:

    Code:
    Private Shared Function GetTitle(html As String) As String
       Dim r As New Regex("<title.*?>")
       Dim Title_start As Integer = 0
       Dim Title_end As Integer = 0
       For Each m As Match In r.Matches(html)
          If m.Value.ToLower = "<title>" Then
             Title_start = m.Index + m.Length
             Exit For
          End If
       Next
       r = New Regex("</title.*?>")
       For Each m As Match In r.Matches(html)
          If m.Value.ToLower = "</title>" Then
             Title_end = m.Index - Title_start
             Exit For
          End If
       Next
       Dim title As String = html.Substring(Title_start, Title_end)
       Return title
    End Function
    I have a textbox which gets its value from Url.ToString on DocumentCompleted. For example, if I have loaded a local HTML file, Url.ToString would be file:///C:/page.html, so I call the function like this:
    Code:
    GetTitle(textbox.text)
    Unfortunately, it doesn't work.

    I have already loaded the HTML text into a RichTextBox, so perhaps it is easier to extract the title from there.

    Thanks in advance.
    Last edited by Radjesh Klauke; Mar 30th, 2011 at 02:03 AM.


    If you found my post helpful, please rate it.

    Codebank Submission: FireFox Browser (Gecko) in VB.NET, Load files, (sub)folders treeview with Windows icons

  2. #2
    Hyperactive Member
    Join Date
    Apr 2011
    Location
    England
    Posts
    421

    Re: Read line between specific tag (local html)

    Hi Radjesh,

    This is a good site for working out regex syntax:
    http://www.codeproject.com/KB/dotnet/regextutorial.aspx

    Here's the regex I would use: "(?<=<(TagName)\b[^>]*>).*?(?=<\/\1>)"

    The function below shows how you can get the text between HTML tags using the regex string above. Note that you would need to pass the results back into the function to get nested tags.
    Code:
        'Requires: Imports System.Text.RegularExpressions
        'You can give any TagName, e.g. title, h1, div, head.
        Private Function Get_HTMLTag(ByVal TagName As String, ByVal HTML As String) As List(Of String)
            Dim lMatch As New List(Of String) 'Collects the inner text of every match
    
            'RegexOptions.IgnoreCase allows case mismatch, e.g. TagName="title" also matches "Title" and "TITLE".
            'RegexOptions.Singleline lets . match line breaks, so the content may span several lines.
            '[^>]* lets the opening tag carry attributes; the lazy .*? stops at the first closing tag,
            'so repeated tags give separate matches instead of one long match.
            Dim safeName As String = Regex.Escape(TagName)
            Dim Tag As New Regex("(?<=<" & safeName & "\b[^>]*>).*?(?=</" & safeName & ">)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
            For Each rMatch As Match In Tag.Matches(HTML)
                lMatch.Add(rMatch.Value)
            Next
    
            Return lMatch
        End Function
    And this is how you would call the function:
    Code:
            Dim Tags As List(Of String) = Get_HTMLTag("title", TextBox1.Text)
    
            'You can loop through the list to view all of the results
            For Each Tag As String In Tags
                MsgBox(Tag)
            Next
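
    One caveat, going by the first post: if the textbox holds the page URL (file:///C:/page.html) rather than the markup, no regex will find a title in it, because the function needs the HTML text itself. Below is a minimal sketch of loading the markup first. It assumes the URL points to a readable local file; ReadLocalHtml and TitleDemo are hypothetical names, not something from this thread.

    ```vbnet
    Imports System
    Imports System.IO
    Imports System.Text.RegularExpressions

    Module TitleDemo
        ' Hypothetical helper: turns a file:/// URL into a local path and reads the markup.
        ' File.ReadAllText assumes UTF-8 unless the file starts with a byte order mark.
        Function ReadLocalHtml(ByVal fileUrl As String) As String
            Dim pageUri As New Uri(fileUrl)
            Return File.ReadAllText(pageUri.LocalPath)
        End Function

        Sub Main()
            ' Assumption: this file exists on the machine running the sketch.
            Dim html As String = ReadLocalHtml("file:///C:/page.html")

            ' Same idea as the function above: lookbehind for the opening tag,
            ' lazy match, lookahead for the closing tag.
            Dim m As Match = Regex.Match(html, "(?<=<title\b[^>]*>).*?(?=</title>)",
                                         RegexOptions.IgnoreCase Or RegexOptions.Singleline)
            If m.Success Then
                Console.WriteLine(m.Value)
            End If
        End Sub
    End Module
    ```

    With the function from this post, the call would then look like Get_HTMLTag("title", ReadLocalHtml(TextBox1.Text)).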

  3. #3

    Thread Starter
    PowerPoster Radjesh Klauke
    Join Date
    Dec 2005
    Location
    Sexbierum (Netherlands)
    Posts
    2,244

    Re: Read line between specific tag (local html)

    Thanks for the great tip and help. You really helped me. +REP



  4. #4

    Thread Starter
    PowerPoster Radjesh Klauke
    Join Date
    Dec 2005
    Location
    Sexbierum (Netherlands)
    Posts
    2,244

    Re: Read line between specific tag (local html)

    Thanks. It works great, and I learned something new.


