Web crawler


giodamelio
Jun 17th, 2010, 05:01 PM
I am attempting to build a web crawler and am wondering where to start. I already have a function that returns a List(Of String) of all the links on a page.



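' Requires "Imports System.Text.RegularExpressions" at the top of the file.
' Returns every email address matched in the given HTML text.
Private Function getemailsfromhtml(ByVal html As String) As List(Of String)
    Dim output As New List(Of String)

    ' Regex.Matches throws on Nothing, so return an empty list when there is no input.
    If String.IsNullOrEmpty(html) Then
        Return output
    End If

    ' Local part, "@", then a bracketed IP prefix or a dotted domain, then a TLD or final octet.
    Dim pattern As String = "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})"

    ' Collect the text of each match.
    For Each m As Match In Regex.Matches(html, pattern)
        output.Add(m.Value)
    Next

    Return output
End Function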