I want to scrape the top 200 search result links from Google for a given keyword.
I am using HttpWebRequest.
Is there a simpler way to do it?
So far I have this.
This is what I have tried, but it freezes the program and does not add more than 200 pages.
Code:
Dim request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://www.google.com/search?num=100&q=" & TextBox1.Text)
Dim response As System.Net.HttpWebResponse = request.GetResponse
Dim stream As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim page As String = stream.ReadToEnd
Dim regexobj As Regex = New Regex("http://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\\^\\*\(\)_\-\=\+\\\/\?\.\:\;\,]*)?")
Dim matches As MatchCollection = regexobj.Matches(page)
For Each item As Match In matches
    If Not item.Value.Contains("google") And Not item.Value.Contains("wj") Then
        ListBox1.Items.Add(item.Value)
    End If
Next
How can I fix that?
Code:
Dim url As Integer = 1
Do Until url = 10
    For Each item As Match In matches
        If Not item.Value.Contains("google") And Not item.Value.Contains("wj") Then
            ListBox1.Items.Add(item.Value & url)
        End If
    Next
    url = url - 1
Loop
Any help would be appreciated.
Thanks
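
For what it's worth, the freeze looks like it comes from the second snippet: url starts at 1 and is decremented, so "Do Until url = 10" never becomes true, and the same matches are added to the ListBox over and over. The loop also never requests a new page. Below is a rough sketch of one way to page through results instead, not a definitive fix. It assumes Google's "start" query parameter (the zero-based result offset) still works alongside "num", and it swaps in a simpler URL regex than the original. The module and function names (GoogleLinkSketch, BuildSearchUrl, ExtractLinks, FetchTopLinks) are made up for illustration. Google may also block or alter responses for automated requests, so treat this as untested against the live site.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Net
Imports System.Text.RegularExpressions

Module GoogleLinkSketch

    ' Builds the URL for one page of results.
    ' Assumption: "start" is the zero-based offset of the first result.
    Function BuildSearchUrl(keyword As String, start As Integer, pageSize As Integer) As String
        Return String.Format("http://www.google.com/search?num={0}&start={1}&q={2}",
                             pageSize, start, Uri.EscapeDataString(keyword))
    End Function

    ' Pulls http/https links out of the page text, applying the same
    ' "google" and "wj" filters as the original code and skipping duplicates.
    Function ExtractLinks(page As String) As List(Of String)
        Dim links As New List(Of String)
        For Each item As Match In Regex.Matches(page, "https?://[^\s""'<>]+")
            Dim value As String = item.Value
            If Not value.Contains("google") AndAlso
               Not value.Contains("wj") AndAlso
               Not links.Contains(value) Then
                links.Add(value)
            End If
        Next
        Return links
    End Function

    ' Requests one page per 100 results until "total" results are covered.
    ' The loop counter only moves forward, so it always terminates.
    Function FetchTopLinks(keyword As String, total As Integer) As List(Of String)
        Const pageSize As Integer = 100
        Dim results As New List(Of String)
        For start As Integer = 0 To total - 1 Step pageSize
            Dim request As WebRequest = WebRequest.Create(BuildSearchUrl(keyword, start, pageSize))
            Using response As WebResponse = request.GetResponse()
                Using reader As New StreamReader(response.GetResponseStream())
                    results.AddRange(ExtractLinks(reader.ReadToEnd()))
                End Using
            End Using
        Next
        Return results
    End Function

End Module
```

In the form you could then call ListBox1.Items.AddRange(FetchTopLinks(TextBox1.Text, 200).ToArray()). GetResponse is a blocking call, so if the window still hangs while the pages download, running FetchTopLinks from a BackgroundWorker and filling the ListBox when it completes may help.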


