I am attempting to build a web crawler and was wondering where to start. I already have a function that returns a List(Of String) of all the links on a page.
vb Code:
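Imports System.Text.RegularExpressions

Private Function getemailsfromhtml(ByVal html As String) As List(Of String)
    Dim output As New List(Of String)

    ' Nothing to scan: return the empty list instead of running the regex.
    If String.IsNullOrEmpty(html) Then
        Return output
    End If

    ' Pattern for addresses with a domain name or a bracketed IP literal after the @.
    Dim pattern As String = "([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})"

    ' Collect the text of every match found in the HTML.
    For Each m As Match In Regex.Matches(html, pattern)
        output.Add(m.Value)
    Next

    Return output
End Function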

