Well, almost solved, though I can't get rid of the darn JavaScript. Here's what I've got so far:
Add a reference to "Microsoft VBScript Regular Expressions 5.5".
On Form:
Place 2 text boxes, 1 Inet control, and 2 buttons.
VB Code:
Private Sub Command1_Click()
    Text1 = Inet1.OpenURL("http://www.yujunet.com/")
End Sub

Private Sub Command2_Click()
    Dim temp1 As String
    Dim temp2 As String
    Dim temp3 As String
    Dim newstring

    temp1 = RemoveLines(Text1)
    temp2 = RegExFind(temp1, "<script[^>]*>(.*)</script>")
    temp3 = RegExReplace(temp1, temp2, "")
    temp3 = RemoveHTML(temp3)
    Text2 = temp3
End Sub
In Module:
VB Code:
Function RemoveLines(myString As String)
    'convert multiline to single line string:
    myString = Replace(myString, vbTab, " ") 'removes Tabs
    myString = Replace(myString, Chr(13), " ")
    myString = Replace(myString, Chr(10), " ")
    myString = Replace(myString, vbCrLf, " ")
    myString = Replace(myString, vbNewLine, " ")
    RemoveLines = myString
End Function

Function RegExFind(myString As String, FindWhat As String)
    On Error Resume Next
    'Create objects.
    Dim objRegExp As RegExp
    Dim objMatch As Match
    Dim colMatches As MatchCollection
    Dim RetStr As String

    Set objRegExp = New RegExp
    objRegExp.Pattern = FindWhat
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    objRegExp.MultiLine = True

    If (objRegExp.Test(myString) = True) Then
        Set colMatches = objRegExp.Execute(myString)
        'keeps only the last match found
        For Each objMatch In colMatches
            RetStr = objMatch.Value
        Next
    Else
        RetStr = "" 'No matches
    End If
    RegExFind = RetStr
End Function

Function RegExReplace(myString As String, FindThis As String, ReplaceWithThis As String)
    On Error Resume Next
    'search string for item and then replace with new item:
    Dim sourse1 As String, resourse As Object
    sourse1 = myString

    Set resourse = New RegExp
    resourse.Pattern = FindThis
    resourse.Global = True
    resourse.IgnoreCase = True

    If resourse.Test(sourse1) = True Then
        myString = resourse.Replace(sourse1, ReplaceWithThis)
    End If
    RegExReplace = myString
End Function

Function RemoveHTML(strText As String)
    Dim RegEx
    Set RegEx = New RegExp
    RegEx.Pattern = "<[^>]*>"
    RegEx.Global = True
    RegEx.IgnoreCase = True
    'drop non-breaking space entities, then strip every remaining tag
    strText = Replace(strText, "&nbsp;", "")
    RemoveHTML = RegEx.Replace(strText, "")
End Function
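My guess at why the JavaScript survives: the greedy (.*) in the pattern runs from the first <script> tag to the last </script> on the page, and the matched text is then handed to RegExReplace as a pattern, so any parentheses, dots or brackets inside the script are treated as regex syntax instead of literal text (and On Error Resume Next hides whatever goes wrong). One thing I may try is replacing the script blocks directly in a single pass. This is only a rough, untested sketch: StripScripts is a hypothetical helper name that is not in the module above, and it assumes the same "Microsoft VBScript Regular Expressions 5.5" reference.

```vb
' Hypothetical helper (not part of the module above).
' Assumes a reference to "Microsoft VBScript Regular Expressions 5.5".
Function StripScripts(ByVal html As String) As String
    Dim re As RegExp
    Set re = New RegExp

    ' [\s\S] matches any character, line breaks included.
    ' *? is lazy, so each match ends at the nearest </script>
    ' instead of running on to the last one on the page.
    re.Pattern = "<script[^>]*>[\s\S]*?</script>"
    re.Global = True        ' remove every script block, not just the first
    re.IgnoreCase = True    ' also catches <SCRIPT> ... </SCRIPT>

    StripScripts = re.Replace(html, "")
End Function
```

If that works, Command2_Click would no longer need RegExFind or RegExReplace: run StripScripts on the downloaded text first, then pass the result to RemoveHTML.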
Any suggestions would really help!



