Gents,
I'm able to retrieve the source code of a web page and store it in a string variable. I would like to cast that string variable into an HTMLDocument if possible, to make parsing its elements much easier.
How can I do this?
You can declare a WebBrowser control in code and use it without adding it to a form:
vb Code:
Dim wb As New WebBrowser
wb.DocumentText = "your html string"
Dim doc As HtmlDocument = wb.Document
ah ok. i was close!
The problem I'm seeing is that my variable DOES hold valid HTML (an entire page's worth), but when I set DocumentText equal to it, everything gets stripped out and only "<HTML></HTML>" is left, so any later call on the document throws a NullReferenceException.
Here's my code:
vb Code:
Try
    'updates will hold the HTML Source code (verified working)
    updates = getUpdates(classURL)
    'Now scrape it
    Dim wb As New WebBrowser
    With wb
        .DocumentText = updates
        .ScriptErrorsSuppressed = True
    End With
    MsgBox(wb.DocumentText)
    Dim doc As HtmlDocument = wb.Document
    Dim element = doc.GetElementById("forumLabel" & txtClassID.Text).InnerText
    If element IsNot Nothing Then
        MsgBox(element)
    End If
Catch ex As Exception
    MsgBox("Appears something failed: " & ex.ToString)
End Try
What does this return?
vb Code:
MsgBox(wb.DocumentText)
That returns "<HTML></HTML>", but in the debugger, if I hover over "updates", I see ALL the HTML.
ok. try this:
vb Code:
Dim wb As New WebBrowser
wb.ScriptErrorsSuppressed = True
Dim doc As HtmlDocument = wb.Document.OpenNew(True)
doc.Write("your html string")
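A likely reason for the "<HTML></HTML>" result: setting DocumentText starts an asynchronous load, so reading DocumentText or Document on the very next line can still show the blank starting page. As an alternative to OpenNew/Write, here is a rough sketch that waits for the DocumentCompleted event before touching the document. The class name HtmlStringParser, the ParseHtml method, and the elementId parameter are made up for illustration, and it assumes this runs on the UI thread of a WinForms app, since the control needs a message loop to raise its events.
vb Code:
```vb
Imports System.Windows.Forms

' Hypothetical helper: loads an HTML string into an off-form WebBrowser
' and reads one element once the document has finished loading.
Public Class HtmlStringParser
    Private ReadOnly wb As New WebBrowser() With {.ScriptErrorsSuppressed = True}
    Private elementId As String

    Public Sub New()
        AddHandler wb.DocumentCompleted, AddressOf OnDocumentCompleted
    End Sub

    ' html: the page source already held in a string.
    ' id: the id of the element to look up (assumed name, not from the thread).
    Public Sub ParseHtml(html As String, id As String)
        elementId = id
        wb.DocumentText = html   ' kicks off the asynchronous load
    End Sub

    Private Sub OnDocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        Dim doc As HtmlDocument = wb.Document
        If doc Is Nothing Then Return

        ' Check the element itself before reading InnerText,
        ' so a missing id does not throw a NullReferenceException.
        Dim element As HtmlElement = doc.GetElementById(elementId)
        If element IsNot Nothing Then
            MessageBox.Show(element.InnerText)
        End If
    End Sub
End Class
```
Usage would be something like creating one HtmlStringParser and calling ParseHtml(updates, "forumLabel" & txtClassID.Text) from the form. Keep the parser in a field rather than a local variable so it is not disposed before the event fires.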