Thread: Extracting elements from Web Page using WebBrowser

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,074

    Extracting elements from Web Page using WebBrowser

    I found this code on the Net and it works for extracting hyperlinks from a Web Page but how do I do something similar for Images?

    Also, is there any reason why the author used Navigate2 instead of just Navigate? What's the difference?
    Code:
    ' Requires a reference to the Microsoft HTML Object Library
    Dim d As HTMLDocument

    Private Sub Command1_Click()
        WebBrowser1.Navigate2 "http://www.somewebsite.com/......./"
    End Sub

    Private Sub Command2_Click()
        Dim HyperLinks As String
        Dim i As Long
        Dim l As String

        ' Document is only usable once the page has finished loading
        Set d = WebBrowser1.Document

        For i = 0 To d.links.length - 1
            ' Read href explicitly instead of relying on the default property
            l = d.links.Item(i).href
            HyperLinks = HyperLinks & vbNewLine & "<a href='" & l & "'>" & l & "</a>" & "<br>"
        Next i

        Text1.Text = Text1.Text & HyperLinks
    End Sub

  2. #2
    Junior Member
    Join Date
    Jan 2008
    Posts
    16

    Re: Extracting elements from Web Page using WebBrowser

    In place of "links" you can use any of the other DHTML collections on the document, so you could use "images" to get at the images.

    Perhaps this helps regarding the use of Navigate2:

    The Navigate2 method extends the Navigate method to support browsing on special folders—such as Desktop and My Computer—that are represented by a pointer to an item identifier list (PIDL). However, this is not applicable to the Visual Basic programming language.
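
    For a plain URL string the two methods should be interchangeable. As far as I know the only difference in the signature is that Navigate2 takes the URL as a Variant (so it can also carry a PIDL) while Navigate takes a String. A minimal sketch, assuming a form with a WebBrowser control named WebBrowser1 and using "about:blank" as a stand-in address:

    ```vb
    ' Sketch only: WebBrowser1 is assumed to be a WebBrowser control on the form.
    Private Sub Command1_Click()
        ' URL passed as a String
        WebBrowser1.Navigate "about:blank"

        ' Same request through Navigate2, where the URL is a Variant:
        ' WebBrowser1.Navigate2 "about:blank"
    End Sub
    ```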

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,074

    Re: Extracting elements from Web Page using WebBrowser

    I did that. I used images in place of links and all I got back was something that read "[object]" and I didn't know what to do with that return value.

  4. #4
    VB For Fun Edgemeal's Avatar
    Join Date
    Sep 2006
    Location
    WindowFromPoint
    Posts
    4,255

    Re: Extracting elements from Web Page using WebBrowser

    Not sure if this helps, but I tried the code posted by iPrank in the thread below and it does get the images:
    Saving Web Page Images In Temp Folder!
    http://www.vbforums.com/showthread.php?p=2268392

  5. #5

    Thread Starter
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,074

    Re: Extracting elements from Web Page using WebBrowser

    Thanks, edgemeal, that link was perfect and I used the one posted by iPrank.

  6. #6
    VB For Fun Edgemeal's Avatar
    Join Date
    Sep 2006
    Location
    WindowFromPoint
    Posts
    4,255

    Re: Extracting elements from Web Page using WebBrowser

    Quote Originally Posted by jmsrickland
    Thanks, edgemeal, that link was perfect and I used the one posted by iPrank.
    Glad I could help.
    It would be nice to know whether it's possible to get the images directly from the WebBrowser control instead of having to download them again, since the images have already been downloaded, right? Anybody?

  7. #7

    Thread Starter
    PowerPoster
    Join Date
    Jan 2008
    Posts
    11,074

    Re: Extracting elements from Web Page using WebBrowser

    Yes, it would be nice. By default they are in the Temporary Internet Files folder.

    BTW: How does one know to add that .src parameter? That's where I get lost since it didn't show up when I browsed through the VB Object Browser.

    WebBrowser1.Document.images.Item(i).src
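
    For reference, this is roughly how the loop looks with .src. My understanding is that "[object]" is just what you get when an image element is converted to a string, whereas a link converts to its href, which is why the original links code worked without naming a property. A minimal sketch, assuming the page has finished loading and the project references the Microsoft HTML Object Library; Command3 is a made-up button name:

    ```vb
    ' Sketch only: assumes WebBrowser1 holds a loaded page and that the
    ' project references the Microsoft HTML Object Library (MSHTML).
    Private Sub Command3_Click()
        Dim doc As HTMLDocument
        Dim img As HTMLImg          ' typed variable, so .src is early-bound
        Dim i As Long
        Dim ImageUrls As String

        Set doc = WebBrowser1.Document

        For i = 0 To doc.images.length - 1
            Set img = doc.images.Item(i)
            ' src holds the address the image was loaded from
            ImageUrls = ImageUrls & img.src & vbNewLine
        Next i

        Text1.Text = ImageUrls
    End Sub
    ```

    Declaring the variable as HTMLImg instead of going through WebBrowser1.Document (which is a plain Object) should also make src show up in IntelliSense.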

  8. #8
    PoorPoster iPrank's Avatar
    Join Date
    Oct 2005
    Location
    In a black hole
    Posts
    2,729

    Re: Extracting elements from Web Page using WebBrowser

    Quote Originally Posted by Edgemeal
    Would be nice to know if it's possible to get the images directly from the WebBrowser control instead of having to download them again, since the images are already downloaded right? Anybody?
    You'll need to copy the files from the local Internet cache. Search for the following APIs at http://allapi.mentalis.org/
    FindFirstUrlCacheEntry
    FindNextUrlCacheEntry

    BTW, URLDownloadToFile uses IE cache. If the images were successfully loaded in the WB control, they will not be downloaded again.
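
    A minimal sketch of calling URLDownloadToFile from a form module. SaveImage is a hypothetical helper name, and the URL and target path are whatever you pass in (for example a .src value from the images collection):

    ```vb
    ' Sketch only: declaration for the ANSI version exported by urlmon.dll.
    Private Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

    ' Hypothetical helper: True when the API returns 0 (S_OK).
    Private Function SaveImage(ByVal Url As String, _
                               ByVal TargetPath As String) As Boolean
        SaveImage = (URLDownloadToFile(0, Url, TargetPath, 0, 0) = 0)
    End Function
    ```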

    Quote Originally Posted by jmsrickland
    How does one know to add that .src parameter? That's where I get lost since it didn't show up when I browsed through the VB Object Browser.
    You won't find it in the Object Browser.
    See my Webbrowser related post in this thread for links to some great tutorials: http://www.vbforums.com/showthread.php?t=380879

    Also, on your MSDN CD look for the IHTMLxxxx reference topics (such as IHTMLDocument2 or IHTMLImgElement). On older MSDN CDs you'll find them under 'Web Workshop'.
    Last edited by iPrank; Feb 7th, 2008 at 04:05 PM.

