-
How can a program parse HTML for <A> tags and then, inside each tag, find the string href="mailto:"? I need to build a database of people with e-mail addresses, so I need to isolate only the e-mail address, excluding "?subject=". For example, given
<A href="mailto:[email protected]?subject=hello">
how can VB return only "[email protected]"? Some code would help; I'm fairly new to VB.
-
You can use the InStr function to do this. For example:
Code:
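Private Function GetEMail(ByVal InputStr As String) As String
    Dim i As Long, b As Long, q As Long
    ' Find the "mailto:" prefix (vbTextCompare ignores case)
    i = InStr(1, InputStr, "mailto:", vbTextCompare)
    If i > 0 Then
        ' Skip past "mailto:" to the first character of the address
        i = i + Len("mailto:")
        ' The href value ends at its closing double quote, if present
        b = InStr(i, InputStr, """")
        ' A "?" before that quote starts the ?subject= part, so cut there
        q = InStr(i, InputStr, "?")
        If q > 0 And (b = 0 Or q < b) Then b = q
        If b > 0 Then
            GetEMail = Mid$(InputStr, i, b - i)
            Exit Function
        End If
    End If
    ' No "mailto:" or no terminator was found
    GetEMail = "Invalid Address"
End Function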
The GetEMail function accepts the original string and returns only the e-mail address.
You can modify this function to suit your needs.
-
Code:
Dim strHTML As String
Dim lngStartPos As Long
Dim lngEndPos As Long
strHTML = "<A href=""mailto:[email protected]?subject=hello"">"
' InStr returns the starting position of the "mailto:" string,
' or 0 if it is not found
lngStartPos = InStr(1, strHTML, "mailto:", vbTextCompare)
If lngStartPos > 0 Then
    ' Calculate the starting position of the email
    ' address by adding the length of "mailto:"
    lngStartPos = lngStartPos + Len("mailto:")
    ' I don't know much about HTML, so I'm going to assume
    ' that a "?" usually follows the email address,
    ' so this locates the "?"
    lngEndPos = InStr(lngStartPos, strHTML, "?")
    ' If the "?" is not there, the address runs up to the
    ' closing quote of the href value instead
    If lngEndPos = 0 Then lngEndPos = InStr(lngStartPos, strHTML, """")
    If lngEndPos > 0 Then
        MsgBox Mid$(strHTML, lngStartPos, lngEndPos - lngStartPos)
    End If
End If
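-
Both snippets above handle a single tag, while the question asks about a whole page. Here is a rough sketch of how the same InStr approach might be looped over a full HTML string. GetAllEMails and Demo are hypothetical names, the example.com and example.org addresses are placeholders, and the code assumes every href value is wrapped in double quotes. It also does not check that each "mailto:" really sits inside an <A> tag.

```vb
' Hypothetical helper (not from the posts above): collects every
' mailto: address found in an HTML string into a Collection.
Public Function GetAllEMails(ByVal strHTML As String) As Collection
    Dim colResult As New Collection
    Dim lngPos As Long      ' position of the current "mailto:"
    Dim lngStart As Long    ' first character of the address
    Dim lngEnd As Long      ' first character after the address
    Dim lngQuery As Long    ' position of "?", or 0 if none

    lngPos = InStr(1, strHTML, "mailto:", vbTextCompare)
    Do While lngPos > 0
        lngStart = lngPos + Len("mailto:")
        ' Assumption: the href value is closed by a double quote
        lngEnd = InStr(lngStart, strHTML, """")
        If lngEnd = 0 Then lngEnd = Len(strHTML) + 1
        ' Cut at "?" when it occurs before the closing quote
        lngQuery = InStr(lngStart, strHTML, "?")
        If lngQuery > 0 And lngQuery < lngEnd Then lngEnd = lngQuery
        If lngEnd > lngStart Then
            colResult.Add Mid$(strHTML, lngStart, lngEnd - lngStart)
        End If
        ' Continue searching after the "mailto:" just handled
        lngPos = InStr(lngStart, strHTML, "mailto:", vbTextCompare)
    Loop
    Set GetAllEMails = colResult
End Function

' Example use with made-up addresses: prints each address
' to the Immediate window.
Public Sub Demo()
    Dim colMail As Collection
    Dim v As Variant
    Set colMail = GetAllEMails( _
        "<A href=""mailto:alice@example.com?subject=hi"">A</A> " & _
        "<a href=""MAILTO:bob@example.org"">B</a>")
    For Each v In colMail
        Debug.Print v
    Next v
End Sub
```

Returning a Collection keeps the sketch short. To build the database you would replace the Debug.Print line with whatever stores each address.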