[Resolved] Webbrowser - MSHTML Library - Why Late Bound?
I'm working on a project that extracts table data from web pages.
I have observed that many properties and methods of the MSHTML library are late bound, so I have to discover their names myself.
For example, see the code below. Add a Command Button, a WebBrowser control, and a reference to the Microsoft HTML Object Library. It works perfectly: it extracts the table elements and prints them to the Debug window.
VB Code:
Option Explicit

Private Sub Command1_Click()
    Dim i As Long, j As Long
    Dim WebDoc As MSHTML.HTMLDocument
    Dim MyTbl As MSHTML.IHTMLTable2
    Dim MyTblR As MSHTML.IHTMLTableRow2

    'Navigate to the page and wait until fully loaded in the webbrowser
    WebBrowser1.Navigate2 "http://nitpu3.kar.nic.in/blrcustoms/pn_2005.htm"
    Do While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    For Each MyTbl In WebBrowser1.Document.getElementsByTagName("table")
        If MyTbl.cells.length > 4 Then
            If Trim(MyTbl.cells(0).[B]innerText[/B]) = "Public Notice No" _
                And Trim(MyTbl.cells(1).[B]innerText[/B]) = "Issue Date" _
                And Trim(MyTbl.cells(2).[B]innerText[/B]) = "Subject" Then
                ''This is the table I was looking for
                Exit For
            End If
        End If
    Next

    Set MyTblR = MyTbl.moveRow(0)
    Dim ub As Long
    ub = MyTbl.cells.length \ MyTblR.[B]cells.length[/B] - 2
    For i = 0 To ub
        Set MyTblR = MyTbl.moveRow(0)
        For j = 0 To MyTblR.[B]cells.length[/B] - 1
            Debug.Print MyTblR.[B]cells(j).innerText[/B],
        Next
        Debug.Print
    Next
    MsgBox "Done"
End Sub
The problem is:
1. The items marked in bold (wrapped in [B]...[/B] in the code above) don't appear in the IntelliSense list after typing a dot (.), though they work perfectly at run time. They seem to be late bound, even though I have declared the variables and set the references correctly. Why are they late bound?
2. I have to discover the properties and methods myself by trial and error, which is tedious and unreliable. Can someone provide a complete list of the methods and properties, or a link to one?
Thanks in advance!
Pradeep
Re: {{No one answers this?? :( :( :( }} Webbrowser - MSHTML Library - Why Late bound?
No answers ???? :( :( :( :(
Re: [No one answers this??] Webbrowser - MSHTML Library - Why Late bound?
Thanks dglienna,
The link really helped me discover some more late-bound properties, which I incorporated into my code. But my problem remains.
I've searched the whole reference, but my questions remain unanswered!
Any other help, please?
Pradeep :)
Re: Webbrowser - MSHTML Library - Why Late Bound?
I don't really understand. You are in fact early-binding. I get the Intellisense menus coming up, so I don't know what your problem is.
Re: Webbrowser - MSHTML Library - Why Late Bound?
It's a known problem, but I've never found much of a workaround. I use the reference David posted as well when I work with the DOM. Be aware that when the IntelliSense menus pop up, they don't always show all the possibilities. The MSDN reference is more reliable.
Re: Webbrowser - MSHTML Library - Why Late Bound?
OK guys, try this.
Open the Object Browser (F2). Right-click anywhere on the blank area at the top, right of the library and search boxes. Select "Show Hidden Members" from the context menu.
You will see all the hidden members of classes and objects exposed in a grey font, both in the Object Browser listing and in the Intellisense menus.
Hope this fixes the problem. The HTML listings all show up for me anyway, but there ya go. Interesting to note that with that option enabled, you can access things like StrPtr and IUnknown through the Intellisense.
Re: Webbrowser - MSHTML Library - Why Late Bound?
Interesting, penagate, I didn't know hidden members existed in VB6. Indeed, it now shows properties such as designMode. :)
But still, the menus don't always pop up. Check this example (add a reference to the HTML Object Library):
VB Code:
Dim doc As HTMLDocument
Dim img As HTMLImg
doc.images.Item(0).alt 'no image menu pops up
img.alt 'while here it does
Re: Webbrowser - MSHTML Library - Why Late Bound?
Easy answer to that one Mr Vader. The Object Browser is your friend :D
HTMLImg is a class, of which "alt" is an explicitly defined member.
Now if you look at the HTMLDocument class, you see that images is an IHTMLElementCollection. If you then look at that class, it is designed to hold any type of DOM object. Its Item member returns a late-bound Object rather than a specific type. That is why IntelliSense doesn't pick it up.
Re: Webbrowser - MSHTML Library - Why Late Bound?
Aha, I see. :) The Item method of IHTMLElementCollection is defined as returning Object; had it returned a general element type such as IHTMLElement, then at least the general properties would have shown.
So... if images had been a (non-existent) IHTMLImagesCollection whose Item returned an HTMLImg, then we would have had IntelliSense menus. Guess you learn something new every day. ;)
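A minimal sketch of how that plays out in practice, assuming a reference to the Microsoft HTML Object Library is set. PrintImageAlts is a hypothetical helper name used only for illustration. Because Item returns Object, the result is assigned to a variable declared As HTMLImg before its members are used:

```vb
' Requires a reference to "Microsoft HTML Object Library" (MSHTML).
' PrintImageAlts is a hypothetical helper name used only for illustration.
Private Sub PrintImageAlts(ByVal doc As MSHTML.HTMLDocument)
    Dim img As MSHTML.HTMLImg
    Dim i As Long

    For i = 0 To doc.images.length - 1
        ' Item returns Object, so assign it to a typed variable first.
        Set img = doc.images.Item(i)
        ' img is declared As HTMLImg, so IntelliSense lists alt, src, etc.
        Debug.Print img.alt
    Next i
End Sub
```

It could be called as PrintImageAlts WebBrowser1.Document once the page has finished loading.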
Re: Webbrowser - MSHTML Library - Why Late Bound?
Thanks a lot for this Penagate!
Didn't know that hidden members can be shown in the object browser and intellisense menu in this way.
But just have a look at this:
VB Code:
Dim MyTblR As MSHTML.IHTMLTableRow2
It shows only one member (height) in both the Object Browser and the IntelliSense menu,
even though it supports cells, as my code shows.
Interestingly, if the declaration is changed to this:
VB Code:
Dim MyTblR As MSHTML.IHTMLTableRow
It shows many other members in the object browser as well as intellisense menu including cells.
Why so?
Re: Webbrowser - MSHTML Library - Why Late Bound?
Well the prefix 'I' on the class name would suggest that they are interfaces to another class. Removing the 'I' and any suffix numbers gives you the class name 'HTMLTableRow' and if you look that one up in the OB then you will see *all* of the members listed, including both 'cells' and 'height'.
My guess is the interfaces with different suffixes are just each extensions to the original interface IHTMLTableRow which in itself is just a way of referencing an object with the class HTMLTableRow.
If you are working with a variable that you Set to an object in the DOM document, then use Dim MyTblR As HTMLTableRow instead.
Try it and see. HTH :)
Re: Webbrowser - MSHTML Library - Why Late Bound?
Hi Penagate,
I'm still a bit confused. Please correct me where I'm wrong.
I understand that IHTMLTableRow2 is a sort of derived class of IHTMLTableRow: it extends the IHTMLTableRow interface, so it contains all of its members plus some of its own.
If this is the case, there should be no reason for late binding, because the members are already known.
I still don't understand why the objects are late bound.
Pradeep
Re: Webbrowser - MSHTML Library - Why Late Bound?
Nothing is late bound :) In TheVader's example, doc.images.Item was an Object type. Because it was not explicitly typed (an Object can be almost anything), it was late bound: its members are only looked up at run time, when it can turn out to be any class at all. In your situation you are explicitly Dimming things with type names (even though they are really interfaces). So it is in fact early binding.
Did you try Dimming it as HTMLTableRow instead?
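To make the difference concrete, here is a small sketch, assuming a reference to the Microsoft HTML Object Library. CompareBinding and its parameter are hypothetical names, not part of anyone's code in this thread. The same row is held once in an Object variable and once in a typed variable:

```vb
' Requires a reference to "Microsoft HTML Object Library" (MSHTML).
' CompareBinding and its parameter are hypothetical names for illustration.
Private Sub CompareBinding(ByVal tbl As MSHTML.HTMLTable)
    Dim lateRow As Object               ' members resolved at run time
    Dim earlyRow As MSHTML.HTMLTableRow ' members resolved at compile time

    If tbl.rows.length = 0 Then Exit Sub

    Set lateRow = tbl.rows.Item(0)
    Set earlyRow = tbl.rows.Item(0)

    ' Late bound: no IntelliSense; a misspelled member name would only
    ' fail when this line executes.
    Debug.Print lateRow.cells.length

    ' Early bound: IntelliSense lists cells; a misspelled member name
    ' would be rejected by the compiler.
    Debug.Print earlyRow.cells.length
End Sub
```

Both lines print the same number; only the moment at which VB resolves the member differs.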
Re: Webbrowser - MSHTML Library - Why Late Bound?
Just compare it to a VBA Word macro, where you can go on indefinitely:
VB Code:
ThisDocument.Range.Paragraphs(2).Range.Copy ....
Re: Webbrowser - MSHTML Library - Why Late Bound?
oh yeaah....
brwBrowser.document.parentWindow.document.parentWindow.document.parentWindow.document.parentWindow
:lol: :lol:
But it's still early binding :)
Re: Webbrowser - MSHTML Library - Why Late Bound?
OK now I understand :)
I was using interfaces until now. (I'm surprised interfaces are exposed in this way.)
I should have declared the variables as classes instead. So the correct declarations would now be:
VB Code:
Dim WebDoc As MSHTML.HTMLDocument
Dim MyTbl As MSHTML.HTMLTable
Dim MyTblR As MSHTML.HTMLTableRow
I also notice that there are no classes with a "2" suffix, so there is nothing to be confused about.
One more thing: will the class include both interfaces? And what is the use of these interfaces? To create my own HTMLTableRow3 kind of interface and add more members, I suppose?
Pradeep
Re: Webbrowser - MSHTML Library - Why Late Bound?
I dunno. All I have seen interface classes used for is internal callback events, when for some reason you can't use class events, and have to avoid circular references.
Dim WebDoc As MSHTML.HTMLDocument
Dim MyTbl As MSHTML.HTMLTable
Dim MyTblR As MSHTML.HTMLTableRow
this is correct :thumb:
You can't really use an interface to 'expose' more members as such, because all that you can use are the ones in the class you are interfacing to. You could I suppose make an interface class with some of your own methods that manipulate the original class in some way. The point of that, I can't really see.
Glad you solved the original prob :)
What I do, which is what Vader showed me, is this:
Code:
Private mobjHTMLDoc As HTMLDocument
'
Private Sub brwWebBrowser_DocumentComplete(ByVal pDisp As Object, URL As Variant)
Set mobjHTMLDoc = brwWebBrowser.document
End Sub
That is I guess what you are doing with the tables and rows.
Re: Webbrowser - MSHTML Library - Why Late Bound?
Thank you so much, penagate, for the wonderful things you taught me :thumb:
So my modified code now looks like this. (It's still working! :) )
But some things (marked in bold with [B]...[/B]) are still being late bound (oops! should I still say that?)
VB Code:
Option Explicit

Private Sub Command1_Click()
    Dim i As Long, j As Long
    Dim WebDoc As MSHTML.HTMLDocument
    Dim MyTbl As MSHTML.HTMLTable
    Dim MyTblR As MSHTML.HTMLTableRow

    'Navigate to the page and wait until fully loaded in the webbrowser
    WebBrowser1.Navigate2 "http://nitpu3.kar.nic.in/blrcustoms/pn_2005.htm"
    Do While WebBrowser1.ReadyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    For Each MyTbl In WebBrowser1.Document.[B]getElementsByTagName("table")[/B]
        If MyTbl.cells.length > 4 Then
            If Trim(MyTbl.cells(0).[B]innerText[/B]) = "Public Notice No" _
                And Trim(MyTbl.cells(1).[B]innerText[/B]) = "Issue Date" _
                And Trim(MyTbl.cells(2).[B]innerText[/B]) = "Subject" Then
                ''This is the table I was looking for
                Exit For
            End If
        End If
    Next

    Set MyTblR = MyTbl.moveRow(0)
    Dim ub As Long
    ub = MyTbl.cells.length \ MyTblR.cells.length - 2
    For i = 0 To ub
        Set MyTblR = MyTbl.moveRow(0)
        For j = 0 To MyTblR.cells.length - 1
            Debug.Print MyTblR.cells(j).[B]innerText[/B],
            If MyTblR.cells(j).[B]children.length[/B] > 0 Then
                If MyTblR.cells(j).[B]children(0).tagName[/B] = "A" Then
                    Debug.Print MyTblR.cells(j).[B]children(0).href[/B],
                End If
            End If
        Next
        Debug.Print
    Next
    MsgBox "Done"
End Sub
Pradeep
Re: Webbrowser - MSHTML Library - Why Late Bound?
OK. You are right, basically MS doesn't seem to like early binding. So we must do it ourselves.
You haven't used this:
Dim WebDoc As MSHTML.HTMLDocument
you should put it here:
Code:
WebBrowser1.Navigate2 "http://nitpu3.kar.nic.in/blrcustoms/pn_2005.htm"
Do While WebBrowser1.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set WebDoc = WebBrowser1.document
and then use WebDoc instead of WebBrowser1.document.
Also, 'cells' is defined as an IHTMLElementCollection, which is basically a glorified collection of Objects. As you probably know, when you directly reference an item in such a collection, it is late bound, because Item returns Object. The workaround is to declare a variable of class 'HTMLTableCell' first and Set it to the cell you want to work with.
So, you could do this:
Code:
Dim objCell As HTMLTableCell
'...
Set objCell = MyTbl.cells(x)
But I doubt there would be much point, because having to do it three times (once for each index) would probably be slower than just using the object directly, late binding or not.
As for the other cases, 'children' is a member of an object in an IHTMLElementCollection, which is why it isn't recognised until run time, and 'getElementsByTagName' returns an IHTMLElementCollection.
So really, there isn't all that much you can do, save for the trick above, which as I said would probably be a lot slower than living with late binding.
Such is life :sick:
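If you do want IntelliSense and compile-time checking throughout, the typed-variable trick above can be applied at every step. The following is only a sketch under some assumptions: DumpTable is a hypothetical helper, the variable names are made up, and it walks the table's rows collection rather than calling moveRow as the original code does, so the table on the page is left unchanged.

```vb
' Requires a reference to "Microsoft HTML Object Library" (MSHTML).
' DumpTable is a hypothetical helper; it walks the rows collection instead
' of calling moveRow, so the table on the page is left unchanged.
Private Sub DumpTable(ByVal tbl As MSHTML.HTMLTable)
    Dim row As MSHTML.HTMLTableRow
    Dim cell As MSHTML.HTMLTableCell
    Dim kids As MSHTML.IHTMLElementCollection
    Dim child As MSHTML.IHTMLElement
    Dim link As MSHTML.HTMLAnchorElement
    Dim i As Long, j As Long

    For i = 0 To tbl.rows.length - 1
        Set row = tbl.rows.Item(i)          ' Object -> typed row
        For j = 0 To row.cells.length - 1
            Set cell = row.cells.Item(j)    ' Object -> typed cell
            Debug.Print cell.innerText,

            Set kids = cell.children        ' children is declared As Object
            If kids.length > 0 Then
                Set child = kids.Item(0)
                If child.tagName = "A" Then
                    Set link = child        ' typed anchor exposes href
                    Debug.Print link.href,
                End If
            End If
        Next j
        Debug.Print
    Next i
End Sub
```

Once the For Each loop has found the right table, DumpTable MyTbl would print every row, including the header row.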
Re: Webbrowser - MSHTML Library - Why Late Bound?
Yes Penagate,
I got every bit of that.
Thanks for all the help and for adding to my knowledge bank. :thumb:
The thread is now closed.
Pradeep :):):)