Thread: Download text from web page

  1. #1

    Thread Starter
    Member
    Join Date
    Sep 2006
    Posts
    63

    Download text from web page

    I am trying to get VB 2005 to be able to download the text from a web page, either as a string or a file, to later be searched. I have found lots of similar questions but all are of little help.

    I've tried using the WebBrowser control and its Document.Body.InnerText property, but there are two problems. First, the WebBrowser.DocumentCompleted event fires multiple times on pages with multiple frames. Second, if I poll WebBrowser.ReadyState instead, I have to call Application.DoEvents to let the page load. That seems to use a lot of resources and keeps allocating memory. I can't get the memory released, although it is released when the app is minimized.

    I have also tried My.Computer.Network.DownloadFile and WebClient.DownloadString, but these don't download the proper information for URLs such as http://www.amazon.com/Microsoft-Visu...TF8&s=software.

    I'm thinking the WebBrowser control is the way to go; I just don't like the constant allocation of memory.
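
    One commonly used way to cope with the repeated DocumentCompleted events is to compare the URL reported by the event with the control's own URL and act only when they match. The following is a minimal sketch under that assumption, not a tested fix for the memory growth. The form name ScraperForm, the field name browser, and the example.com address are placeholders, and pages that redirect or load frames late may still need extra handling.

    ```vbnet
    Imports System
    Imports System.Windows.Forms

    ' Hypothetical form that hosts a WebBrowser and reads the page text once.
    Public Class ScraperForm
        Inherits Form

        Private WithEvents browser As New WebBrowser()

        Public Sub New()
            browser.Dock = DockStyle.Fill
            Controls.Add(browser)
            ' Placeholder address; substitute the page you want to read.
            browser.Navigate("http://www.example.com/")
        End Sub

        Private Sub browser_DocumentCompleted(ByVal sender As Object, _
                ByVal e As WebBrowserDocumentCompletedEventArgs) _
                Handles browser.DocumentCompleted
            ' Each frame raises this event with its own URL. Only continue
            ' when the URL is that of the top-level document.
            If Not e.Url.Equals(browser.Url) Then Return
            If browser.Document Is Nothing OrElse browser.Document.Body Is Nothing Then Return

            ' InnerText is the rendered text of the page without the markup.
            Dim pageText As String = browser.Document.Body.InnerText
            Console.WriteLine(pageText)
        End Sub
    End Class
    ```

    Because the work happens inside the event handler, no ReadyState polling loop or Application.DoEvents call is needed.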

    Greg

  2. #2
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    111,221

    Re: Download text from web page

    What exactly does WebClient.DownloadString not do correctly?
    Why is my data not saved to my database? | MSDN Data Walkthroughs
    VBForums Database Development FAQ
    My CodeBank Submissions: VB | C#
    My Blog: Data Among Multiple Forms (3 parts)
    Beginner Tutorials: VB | C# | SQL

  3. #3
    Fanatic Member Jumpercables's Avatar
    Join Date
    Jul 2005
    Location
    Colorado
    Posts
    592

    Re: Download text from web page

    If you are trying to read information from Amazon.com, I suggest using their Web Services (http://www.amazon.com/gp/browse.html?node=3435361). It makes searching and reading the information on their web pages much easier.

    Amazon Web Services lets you access several of its services for free.

    C# - .NET 1.1 / .NET 2.0

    "Take everything I say with a grain of salt, sometimes I'm right, sometimes I'm wrong but in the end we've both learned something."
    _____________________
    Regular Expressions Library
    Connection String
    API Functions
    Database FAQ & Tutorial

  4. #4

    Thread Starter
    Member
    Join Date
    Sep 2006
    Posts
    63

    Re: Download text from web page

    Amazon.com was just an example. I'm working with a wide variety of sites, not only commerce ones.

    I'm sorry. I guess I didn't state my original problem very well. I'm more interested in the text that's displayed on the website, not the entire HTML code.

    Both DownloadFile and DownloadString return the HTML. It's slightly different from the HTML in Document.Body.InnerHtml, but that may not matter.

    Is there an easy way to take the HTML returned by those methods and strip away the markup, leaving just the text that appears on screen?

    Greg

  5. #5
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    111,221

    Re: Download text from web page

    I believe that what you want to do is create a "screen scraper".

    http://pscode.com/vb/scripts/ShowCod...3332&lngWId=10
    http://www.eggheadcafe.com/articles/20010916.asp
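
    As a rough illustration of what a very simple screen scraper can look like, the sketch below downloads a page with WebClient.DownloadString and strips the markup with regular expressions. It is an assumption-laden example and not the code from the links above: the module name ScreenScraperSketch, the helper HtmlToText, and the example.com address are made up for illustration, and regular expressions will not cope with every malformed page.

    ```vbnet
    Imports System
    Imports System.Net
    Imports System.Text.RegularExpressions

    Module ScreenScraperSketch

        ' Hypothetical helper: reduces an HTML string to roughly the visible text.
        Function HtmlToText(ByVal html As String) As String
            Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

            ' Remove script and style elements together with their contents.
            Dim text As String = Regex.Replace(html, "<(script|style)\b[^>]*>.*?</\1\s*>", " ", options)
            ' Remove comments, then every remaining tag.
            text = Regex.Replace(text, "<!--.*?-->", " ", options)
            text = Regex.Replace(text, "<[^>]+>", " ")

            ' Decode a handful of common entities; &amp; goes last so that
            ' text such as "&amp;lt;" is not decoded twice.
            text = text.Replace("&nbsp;", " ").Replace("&lt;", "<").Replace("&gt;", ">")
            text = text.Replace("&quot;", """").Replace("&amp;", "&")

            ' Collapse runs of whitespace into single spaces.
            Return Regex.Replace(text, "\s+", " ").Trim()
        End Function

        Sub Main()
            Using client As New WebClient()
                ' Placeholder address; substitute the page you want to read.
                Dim html As String = client.DownloadString("http://www.example.com/")
                Console.WriteLine(HtmlToText(html))
            End Using
        End Sub

    End Module
    ```

    Two caveats: DownloadString only returns what the server sends, so text that a page builds with script will be missing, and System.Web.HttpUtility.HtmlDecode (with a reference to System.Web) handles far more entities than the hand-written replacements here.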
