get rid of HTML in string


Jan 11th, 2000, 11:04 AM
I have a string full of HTML and I am only interested in the text that would be visible in a browser. How can I filter out the HTML code in this string?

Jan 11th, 2000, 11:49 AM
You need to filter out everything between the < and the > (including the signs themselves), except for <br> and </p>, which should be turned into line breaks.

Example:


Option Explicit

Private Sub Form_Load()
    MsgBox HTMLParser("<p><ul><b>This is the text you would see in the web browser.<br>Test.</b></p><br>This is the end")
End Sub

'Strips the HTML tags from sHTML; <br> and </p> become line breaks
Private Function HTMLParser(ByVal sHTML As String) As String
    Dim lCounter As Long
    Dim lTagEnd As Long
    Dim sTemp As String
    For lCounter = 1 To Len(sHTML)
        Select Case Mid$(sHTML, lCounter, 1)
            Case "<"
                Select Case LCase$(Mid$(sHTML, lCounter, 4))
                    Case "<br>", "</p>"
                        '<br> is a line break; </p> is the end of a paragraph
                        sTemp = sTemp & vbCrLf
                        lCounter = lCounter + 3
                    Case Else
                        'Jump to the closing ">"; stop if the tag is never closed,
                        'otherwise the search would run past the end of the string
                        lTagEnd = InStr(lCounter, sHTML, ">")
                        If lTagEnd = 0 Then Exit For
                        lCounter = lTagEnd
                End Select
            Case Else
                'Ordinary character: keep it
                sTemp = sTemp & Mid$(sHTML, lCounter, 1)
        End Select
    Next
    HTMLParser = sTemp
End Function
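
Stripping the tags still leaves character entities such as &amp; or &lt; in the result, and a browser would show those as & and <. Below is a minimal sketch of a follow-up step, assuming VB6 (it relies on the built-in Replace function). The name DecodeEntities is made up for this example, it covers only a handful of common entities, and turning &nbsp; into a plain space is a simplification.

```vb
'Hypothetical helper, not part of the original post:
'decodes a few common HTML entities in already tag-stripped text
Private Function DecodeEntities(ByVal sText As String) As String
    sText = Replace(sText, "&nbsp;", " ")    'simplification: plain space
    sText = Replace(sText, "&lt;", "<")
    sText = Replace(sText, "&gt;", ">")
    sText = Replace(sText, "&quot;", """")
    '&amp; goes last, so that "&amp;lt;" ends up as "&lt;" and not "<"
    sText = Replace(sText, "&amp;", "&")
    DecodeEntities = sText
End Function
```

Call it after the tags are removed, for example MsgBox DecodeEntities(HTMLParser(sHTML)). Doing it in that order means a decoded < is never mistaken for the start of a tag.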


------------------

Vincent van den Braken
EMail: azzmodan@azzmodan.demon.nl
ICQ: 15440110 (http://www.icq.com/15440110)
Homepage: http://www.azzmodan.demon.nl





[This message has been edited by Azzmodan (edited 01-11-2000).]