
Thread: Using XmlDocument.Load to read in web pages

  1. #1

    Thread Starter
    New Member
    Join Date
    May 2006
    Posts
    2

    Using XmlDocument.Load to read in web pages

    I want to read a web page into an XmlDocument so I can use all those cool XML tools to examine it.

    But HTML doesn't quite match the XML spec. For example, <HTML>, <IMG>, and <BR> tags don't require ending tags, and attribute values don't have to be quoted:
    <Table Width=130>

    vs
    <Table Width='130'>

    Is there a way to get HTML pages into an XmlDocument without "correcting" all the anomalies?

    Thanks!

  2. #2
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339

    Re: Using XmlDocument.Load to read in web pages

    No, but try searching for the HTMLDocument object. There is a Document Object Model for HTML docs as well, which can make life easier depending on what you are doing.

  3. #3
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    111,221

    Re: Using XmlDocument.Load to read in web pages

    XML and HTML are two different languages. Just because they have a similar structure doesn't mean that something written to parse XML should be able to read HTML. HTML that complies with the rules of XML is called XHTML, and an XML parser will only be able to understand your Web pages if they conform to that spec. The chances of that are pretty slim. Use the right tool for the job, as Edneeis suggests.
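
To make the point above concrete, here is a minimal sketch. It is in Python rather than .NET, purely as an illustration, using only the standard library. The sample markup is built from the `<Table Width=130>` example in the first post; the names `HTML_SNIPPET`, `parses_as_xml`, `TagCollector`, and `collect_tags` are made up for this example. A strict XML parser rejects the unquoted attribute and the unclosed `<br>`, while a parser written for HTML reads the same text without complaint.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: unquoted attribute value and an unclosed <br>,
# both legal in HTML but not well-formed XML.
HTML_SNIPPET = "<Table Width=130><tr><td>cell<br></td></tr></Table>"


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag and its attributes as the HTML is read."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        self.tags.append((tag, dict(attrs)))


def collect_tags(text):
    """Feed the text to the HTML parser and return the tags it saw."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("Accepted by XML parser:", parses_as_xml(HTML_SNIPPET))
    print("Tags seen by HTML parser:", collect_tags(HTML_SNIPPET))
```

The same split applies in .NET: XmlDocument plays the role of the strict XML parser here, and an HTML-aware object model plays the role of the tolerant one.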

  4. #4

    Thread Starter
    New Member
    Join Date
    May 2006
    Posts
    2

    Re: Using XmlDocument.Load to read in web pages

    Terrific. Perfect solution.

    One small problem... I'm using .NET 1.1 and the HTMLDocument object was introduced in 2.0.

    Ok, so I need to upgrade. Is the upgrade I want Visual Studio 2005? I'm finding the Microsoft update info unhelpful since they talk about Visual Studio, not ".net".

  5. #5
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: Using XmlDocument.Load to read in web pages

    They dropped the ".NET" from the Visual Studio name with 2005, so yes, VS2005 is the next version (specifically, it ships with .NET Framework 2.0).

    -tg

  6. #6
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339

    Re: Using XmlDocument.Load to read in web pages

    You can use the HTMLDocument in any version via COM. The .NET version is based on the COM one anyway.
