I have two rich text boxes, and two buttons on my screen. The first button grabs HTML from a URL and then converts the HTML to XML which resides in rich text box 1.
The second button is to grab the XML from the rich text box1 and then parse it to grab all the input elements by their ID.
My issue is that my parser isn't doing anything. My guess is that I'm not quite getting the XML from the first rich text box.
What would be the best way to grab the XML from a rich text box load it into memory and then parse the XML to grab all the ID tags?
Here is my code -- Thanks for any help.
Code:Imports mshtml Imports System.Text Imports System.Net Imports System.Xml Imports System.IO Imports System.Xml.XPath Public Class Scraper Private Sub Scraper_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load End Sub Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click ' Note: This example uses two Chilkat products: Chilkat HTTP ' and Chilkat HTML-to-XML. The "Chilkat Bundle" can be licensed ' at a price that is less than purchasing each product individually. ' The "Chilkat Bundle" provides licenses to all existing Chilkat components. Also, new-version upgrades are always free. Dim http As New Chilkat.Http() ' Any string argument automatically begins the 30-day trial. Dim success As Boolean success = http.UnlockComponent("30-day trial") If (success <> True) Then TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf Exit Sub End If Dim html As String html = http.QuickGetStr("http://www.quiltingboard.com/register.php") If (html = vbNullString) Then TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf Exit Sub End If Dim htmlToXml As New Chilkat.HtmlToXml() ' Any string argument automatically begins the 30-day trial. success = htmlToXml.UnlockComponent("30-day trial") If (success <> True) Then TextBox1.Text = TextBox1.Text & htmlToXml.LastErrorText & vbCrLf Exit Sub End If ' Indicate the charset of the output XML we'll want. htmlToXml.XmlCharset = "utf-8" ' Set the HTML: htmlToXml.Html = html ' Convert to XML: Dim xml As String xml = htmlToXml.ToXml() ' Save the XML to a file. ' Make sure your charset here matches the charset ' used for the XmlCharset property. htmlToXml.WriteStringToFile(xml, "out.xml", "utf-8") RichTextBox1.Text = xml End Sub Private Sub LoopThroughXmlDoc(ByVal nodeList As XmlNodeList) For Each elem As XmlElement In nodeList If elem.HasChildNodes Then LoopThroughXmlDoc(elem.ChildNodes) Else '' Extract the information If elem.HasAttribute("id") Then 'elem.Attributes("AssetID").Value.ToString() ElseIf elem.HasAttribute("name") Then 'elem.Attributes("AttributeID").Value.ToString() End If End If Next End Sub Private Sub Button2_Click_1(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click Dim doc As XmlDocument = New XmlDocument doc.Load("xmlFile.xml") Dim nodeList As XmlNodeList = doc.GetElementsByTagName("input") LoopThroughXmlDoc(nodeList) End Sub End Class
![]()


Reply With Quote
