Thread: Problem with parsing on string!

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Feb 2011
    Posts
    151

    Problem with parsing on string!

    Hey, I've got a problem parsing text with a regex.

    Code:
        ' Requires: Imports System.Text.RegularExpressions
        Private Function Get_HTMLTag(ByVal TagName As String, ByVal HTML As String) As List(Of String)
            Dim lMatch As New List(Of String) 'Get the results in a List of strings

            'RegexOptions.IgnoreCase allows case mismatch, e.g. if TagName="title" results can include "title", "Title", "TITLE" etc.
            'RegexOptions.Singleline allows .* to see past newline characters
            Dim Tag As New Regex("(?<=" & TagName & ">).*(?=<\/" & "a></td></tr></table" & ">)", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
            For Each rMatch As Match In Tag.Matches(HTML)
                lMatch.Add(rMatch.Value)
            Next

            Return lMatch
        End Function
    Code:
        ' Requires: Imports System.Net
        Dim webClient As New WebClient
        Dim strings As String

        strings = webClient.DownloadString("http://google.com/trends")
        Dim Tags As List(Of String) = Get_HTMLTag("sa=X", strings)
        webClient.Dispose()

        'You can loop through the list to view all of the results
        For Each Tag As String In Tags
            MsgBox(Tag)
        Next
    I'm trying to parse google.com/trends into one list, but I'm getting this:

    the first hot search + the page source between them + the last hot search.

    But I want all the hot searches in one list, without the page source between them.

  2. #2
    Hyperactive Member
    Join Date
    Apr 2011
    Location
    England
    Posts
    421

    Re: Problem with parsing on string!

    Hi Lanibox,

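    Remove RegexOptions.Singleline. With that option set, . also matches line breaks, so the greedy .* runs from the first hot search all the way to the last one. Without it, each match stays on a single line.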
    Code:
    Dim Tag As New Regex("(?<=" & TagName & ">).*(?=<\/" & "a></td></tr></table" & ">)", RegexOptions.IgnoreCase)
    Or you could shorten it to:
    Code:
    Dim Tag As New Regex("(?<=<(a).*sa=X>).*(?=<\/\1>)", RegexOptions.IgnoreCase)
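
    If the page ever puts several links on one line, dropping Singleline alone may not be enough, because .* is still greedy within that line. As a rough sketch of one alternative, the pattern below uses [^<]* so a match cannot run past the next tag. GetLinkTexts and the sample HTML are made up for illustration and are not taken from the real page, so treat this as a starting point rather than a tested fix for google.com/trends.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LazyMatchSketch
    ' Hypothetical helper (not from the thread): returns the link text of every
    ' anchor whose opening tag ends with the given marker, e.g. "sa=X".
    Function GetLinkTexts(ByVal marker As String, ByVal html As String) As List(Of String)
        Dim results As New List(Of String)

        ' [^<]* cannot cross a "<", so each match ends at its own closing </a>,
        ' whether or not the HTML contains line breaks.
        Dim pattern As String = "(?<=" & Regex.Escape(marker) & ">)[^<]*(?=</a>)"

        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            results.Add(m.Value)
        Next

        Return results
    End Function

    Sub Main()
        ' Made-up sample modelled on the markup quoted in this thread;
        ' both links sit on the same line on purpose.
        Dim sample As String =
            "<td><a href=/trends/hottrends?q=one&sa=X>first search</a></td>" &
            "<td><a href=/trends/hottrends?q=two&sa=X>second search</a></td>"

        For Each linkText As String In GetLinkTexts("sa=X", sample)
            Console.WriteLine(linkText)
        Next
    End Sub
End Module
```

    With the sample above this should print "first search" and "second search" on separate lines. A lazy quantifier (.*?) in place of [^<]* would be another way to get the same effect.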

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Feb 2011
    Posts
    151

    Re: Problem with parsing on string!

    Thanks! :P It's working as it should.

  4. #4

    Thread Starter
    Addicted Member
    Join Date
    Feb 2011
    Posts
    151

    Re: Problem with parsing on string!

    Got a second problem =S

    I can't get the Hot Topics on google.com/trends the same way, because this is how they look in the source code:
    Code:
    <td><a href='#' id=hot_topic_1></a></td>
    And this is a Hot Search, which I can get with this approach:

    Code:
     <td><a href=/trends/hottrends?q=kate+upton&date=2011-4-5&sa=X>kate upton</a></td></tr></table>
    So... is there another way to get the Hot Topics on google.com/trends?
