-
Mar 10th, 2011, 04:12 PM
#1
Thread Starter
Junior Member
HTML Parse
I know this has probably been answered before, but I am having difficulty finding exactly what I need. This is ultimately what I would like to achieve: I have a form with the Inet control added as well as a web browser. I have it load a specific URL, then save the page to .html. What I would like is for it to parse out just the product names, as listed below.
http://krasikart.no-ip.org/parse.jpg
Just the product names. I then want to add them to a listbox for ease of copying. Can anyone point me in the right direction? This is the code I have currently.
Code:
Private Sub Command1_Click()
    Dim URL() As Byte
    'set protocol to HTTP
    Inet1.Protocol = icHTTP
    Inet1.URL = WebBrowser1.LocationURL
    ' Retrieve the HTML data into a byte array.
    URL() = Inet1.OpenURL(Inet1.URL, icByteArray)
    ' Create a file for the data.
    Open "google.htm" For Binary Access Write As #1
    Put #1, , URL()
    urlbox.Text = WebBrowser1.LocationURL
    Close #1
End Sub

Private Sub gotourl_Click()
    WebBrowser1.Navigate urlbox.Text
End Sub

Private Sub parse_Click()
    Text1.Text = WebBrowser1.Document.body.innerText
    List1.AddItem Text1.Text
End Sub
Thanks for any help you can offer!
-
Mar 10th, 2011, 08:43 PM
#2
Re: HTML Parse
The screenshot is helpful for seeing what you want parsed, but without some sample HTML to look at, no one can really offer any advice.
Do you get what you expected in your textbox? If not, more details/specifics are needed.
-
Mar 11th, 2011, 05:08 AM
#3
Thread Starter
Junior Member
Re: HTML Parse
Yes, I can post more details. Here is an HTML snippet.
Code:
<table border="1" cellpadding="2" cellspacing="0" align="center" width=700>
<THEAD class="dark">
<TD COLSPAN=4>Products </TD>
</THEAD>
<THEAD>
<TD> </TD>
<TD>Comments</TD>
<TD>Date Modified</TD></THEAD>
<TR>
<TD WIDTH=300>ACUVAIL - ketorolac tromethamine ophthalmic solution </TD>
<TD WIDTH=300> </TD>
<TD>10/2/2009 </TD>
</TR>
<TR>
<TD WIDTH=300>ALPHAGAN </TD>
<TD WIDTH=300> </TD>
<TD>10/2/2009 </TD>
</TR>
<TR>
<TD WIDTH=300>BOTOX </TD>
<TD WIDTH=300> </TD>
<TD>10/2/2009 </TD>
</TR>
<TR>
<TD WIDTH=300>JUVEDERM </TD>
<TD WIDTH=300> </TD>
<TD>10/2/2009 </TD>
</TR>
Here is a screenshot of the app and results from the parse_Click. I've also linked the entire saved page if that helps anyone.
http://krasikart.no-ip.org/parse2.jpg
http://krasikart.no-ip.org/parse.htm
-
Mar 11th, 2011, 08:38 AM
#4
Re: HTML Parse
My approach would be to convert the HTML to XML and then use the XML DOM to pull the values. Here is a quick sample.
Code:
Private Sub Command1_Click()
    Dim strURL As String
    Dim objHTML As Object
    Dim strHTML As String
    Dim lngPos As Long      ' Long, because InStr positions can exceed the Integer limit of 32,767
    Dim objDoc As Object
    Dim objNodeList As Object
    Dim objNode As Object
    Dim objTdNode As Object

    'Get the raw html
    strURL = "http://174.51.225.164:8080/parse.htm"
    Set objHTML = CreateObject("Microsoft.XMLHTTP")
    objHTML.open "GET", strURL, False
    objHTML.send
    strHTML = objHTML.responseText
    Set objHTML = Nothing

    'Remove everything before the start of the table
    lngPos = InStr(1, strHTML, "Products", vbTextCompare)
    lngPos = InStrRev(strHTML, "<table", lngPos, vbTextCompare)
    strHTML = Mid(strHTML, lngPos)

    'Remove everything after the table (+ 7 keeps the whole "</table>" tag)
    lngPos = InStr(1, strHTML, "</table>", vbTextCompare) + 7
    strHTML = Left(strHTML, lngPos)

    'Do some replaces to prevent things from breaking:
    'unquoted attributes and &nbsp; are not valid in XML
    strHTML = Replace(strHTML, " WIDTH=300", "")
    strHTML = Replace(strHTML, " width=700", "")
    strHTML = Replace(strHTML, " COLSPAN=4", "")
    strHTML = Replace(strHTML, "&nbsp;", "")

    'load the html as an xml document
    Set objDoc = CreateObject("Microsoft.XMLDOM")
    objDoc.async = False
    If Not objDoc.loadXML(strHTML) Then
        MsgBox "Could not load the table as XML: " & objDoc.parseError.reason
        Exit Sub
    End If

    ' Load the tr tags into a node list
    Set objNodeList = objDoc.selectNodes("//TR")

    ' Loop through the tr tags grabbing the contents of the first td tag
    For Each objNode In objNodeList
        Set objTdNode = objNode.selectSingleNode("TD")
        If Not objTdNode Is Nothing Then List1.AddItem objTdNode.Text
    Next objNode

    Set objTdNode = Nothing
    Set objNode = Nothing
    Set objNodeList = Nothing
    Set objDoc = Nothing
End Sub
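An alternative, since the form already hosts a WebBrowser control, is to walk the rows through the browser's own DOM and skip the XML conversion. This is only a sketch of a different technique, not what the sample above does. It assumes the page has finished loading in WebBrowser1, and the procedure name ListFirstCells is made up for the example.
Code:
Private Sub ListFirstCells()
    Dim objRow As Object

    ' Assumption: the page is fully loaded in WebBrowser1
    If WebBrowser1.ReadyState <> 4 Then Exit Sub    ' 4 = READYSTATE_COMPLETE

    List1.Clear
    For Each objRow In WebBrowser1.Document.getElementsByTagName("TR")
        ' Take the text of the first cell in each row, if the row has one
        If objRow.cells.length > 0 Then
            List1.AddItem Trim$(objRow.cells(0).innerText)
        End If
    Next objRow
End Sub
Note that this visits every row of every table on the page, so header rows or rows from other tables may need to be filtered out.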
-
Mar 11th, 2011, 04:03 PM
#5
Thread Starter
Junior Member
Re: HTML Parse
Thank you so much! It works perfectly. I just have one more question. I need to send the contents of List1 to another listbox on another form. I can't seem to find how to send the list from one listbox in one form to another listbox in another form. The answer will let me complete my project.
-
Mar 11th, 2011, 04:14 PM
#6
Re: HTML Parse
There is no function that will do it all at once. The simple solution is a loop:
Code:
Dim L As Long
Form2.List2.Clear
For L = 0 To List1.ListCount - 1
    Form2.List2.AddItem List1.List(L)
Next
-
Mar 11th, 2011, 04:16 PM
#7
Re: HTML Parse
If it is on another form you have to prefix the listbox name with the form name
Code:
Dim L As Long
Form2.List2.Clear
For L = 0 To List1.ListCount - 1
    Form2.List2.AddItem List1.List(L)
Next
-
Mar 11th, 2011, 04:18 PM
#8
Re: HTML Parse
MarkT, beat you to it! I realized my error & fixed it before you posted - ain't that slick
-
Mar 11th, 2011, 04:27 PM
#9
Thread Starter
Junior Member
Re: HTML Parse
Thanks, worked perfectly. Now I can complete my project!
-
Mar 11th, 2011, 05:29 PM
#10
Thread Starter
Junior Member
Last edited by Mr.Nemo; Mar 11th, 2011 at 06:49 PM.
-
Mar 11th, 2011, 07:43 PM
#11
Thread Starter
Junior Member
Re: HTML Parse
Sorry for the bump, but I had to revise my last question. I believe several If/Thens will be needed for this. I'm trying to get it to search for Products and Terms. If Products is listed but not Terms, pull the Product list. If both Products and Terms are there, pull both. And if Terms is there but not Products, pull Terms. Basically I need an And/Or, I think. Sorry for the confusion. I'm pretty tired and can't seem to wrap my brain around it this late in the morning. Thanks for all help!
Last edited by Mr.Nemo; Mar 12th, 2011 at 03:20 AM.
-
Mar 12th, 2011, 02:47 PM
#12
Thread Starter
Junior Member
-
Mar 12th, 2011, 02:52 PM
#13
Re: HTML Parse
Without having the HTML to work with, it is hard to tell what you need.
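That said, going only by the description in post #11, here is a rough sketch of the branching. It assumes each section is its own table and that the heading sits at the start of a cell, as in `<TD COLSPAN=4>Products </TD>`. The "Terms" heading text and the names GetTableFor, AddRowsToList and PullSections are made up for illustration. Two independent If tests cover all three cases (Products only, Terms only, both), so no And/Or is needed.
Code:
' Hypothetical helper: returns the <table>...</table> block that contains
' strHeading, or "" when the heading or its table cannot be found.
Private Function GetTableFor(ByVal strHTML As String, ByVal strHeading As String) As String
    Dim lngHead As Long
    Dim lngStart As Long
    Dim lngEnd As Long

    lngHead = InStr(1, strHTML, strHeading, vbTextCompare)
    If lngHead = 0 Then Exit Function       ' heading is not on the page

    lngStart = InStrRev(strHTML, "<table", lngHead, vbTextCompare)
    If lngStart = 0 Then Exit Function      ' no table opens before the heading

    lngEnd = InStr(lngStart, strHTML, "</table>", vbTextCompare)
    If lngEnd = 0 Then Exit Function        ' table is never closed

    GetTableFor = Mid$(strHTML, lngStart, lngEnd - lngStart + Len("</table>"))
End Function

' Hypothetical helper: same clean-up and XML DOM loop as in post #4.
' Assumption: a Terms table may carry other attributes that need replacing.
Private Sub AddRowsToList(ByVal strTable As String)
    Dim objDoc As Object
    Dim objNode As Object
    Dim objTd As Object

    strTable = Replace(strTable, " WIDTH=300", "")
    strTable = Replace(strTable, " width=700", "")
    strTable = Replace(strTable, " COLSPAN=4", "")
    strTable = Replace(strTable, "&nbsp;", "")

    Set objDoc = CreateObject("Microsoft.XMLDOM")
    objDoc.async = False
    If Not objDoc.loadXML(strTable) Then Exit Sub   ' not well-formed, nothing added

    For Each objNode In objDoc.selectNodes("//TR")
        Set objTd = objNode.selectSingleNode("TD")
        If Not objTd Is Nothing Then List1.AddItem Trim$(objTd.Text)
    Next objNode
End Sub

' Each section is tested on its own, so either one, both, or neither is pulled.
Private Sub PullSections(ByVal strHTML As String)
    Dim strTable As String

    strTable = GetTableFor(strHTML, ">Products")
    If Len(strTable) > 0 Then AddRowsToList strTable

    strTable = GetTableFor(strHTML, ">Terms")
    If Len(strTable) > 0 Then AddRowsToList strTable
End Sub
Searching for ">Products" rather than "Products" ties the match to the start of a cell, which may help if the plain word also appears elsewhere on the page.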
-
Mar 12th, 2011, 03:10 PM
#14
Thread Starter
Junior Member
Re: HTML Parse
Ah, right. I meant to upload some things. I hate that I bumped this, but I was working on it all night and have to have it running by Monday. I'll post some HTML as well as my project files. If someone opens my project, you will have to change the URL to point to the included files to test the function.
Project Files
Web Pages
When I change the search terms, and I know the searched term is in the document, it doesn't always find it. And sometimes it finds the term but doesn't pull the data. It's pretty strange.
-
Mar 12th, 2011, 08:28 PM
#15
Thread Starter
Junior Member
Re: HTML Parse
Does anyone have any enlightening ideas? I'm afraid to add too many If/Thens and complicate it. I think it's overcomplicated as is...
-
Mar 15th, 2011, 04:41 AM
#16
Thread Starter
Junior Member
Re: HTML Parse
I'm still having issues. Not sure what I can do now.