Thread: [RESOLVED] Getting information from HTML

  1. #1

    Thread Starter
    Fanatic Member
    Join Date
    Mar 2008
    Posts
    519

    [RESOLVED] Getting information from HTML

    Hi!
    I need to retrieve information from a table in an HTML page.

    The page I need to retrieve the info from is not mine, so I can't change the HTML to make this easier.

    Here's an example of the HTML:
    Code:
    <html>
    <head>
    <title>Nana</title>
    </head>
    <body>
    <TABLE BORDER=0 CELLSPACING=1 CELLPADDING=4 WIDTH=100%>
    <TR BGCOLOR=#505050><TD COLSPAN=2 CLASS=white>
    <B>Character Information</B></TD></TR>
    <TR BGCOLOR=#F1E0C6><TD WIDTH=20%>Name:</TD><TD>Name Value</TD></TR>
    <TR BGCOLOR=#D4C0A1><TD>Sex:</TD><TD>Value of Sex</TD></TR>
    </TABLE>
    </body>
    </html>
    So what I need to do is get the value of the 'Name' row and the value of the 'Sex' row. How can I do this?

    Thanks!
    //Zeelia

  2. #2
    Hyperactive Member
    Join Date
    Apr 2009
    Posts
    358

    Re: Getting information from HTML

    You can use RegEx.

  3. #3
    Wait... what? weirddemon's Avatar
    Join Date
    Jan 2009
    Location
    USA
    Posts
    3,826

    Re: Getting information from HTML

    You should also be able to use the WebBrowser control and its Document.GetElementById method.

  4. #4
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: Getting information from HTML

    If the elements had IDs, that might work, but they don't. RegEx would work, and would probably be the most efficient option. Another possibility would be to load the page into an XmlDocument, BUT the HTML has to be well-formed XML (XHTML). If there's no guarantee that it is, I'd go with RegEx.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  5. #5

    Thread Starter
    Fanatic Member
    Join Date
    Mar 2008
    Posts
    519

    Re: Getting information from HTML

    Hi!
    techgnome, I like the XmlDocument idea, but to make an XML I need to get the values first of all. Can you explain more about XmlDocument, please?

    Also, could anyone help me a bit with RegEx? I can only think of a way to match the cell that says 'Name:', since that's static text, but I don't know how to get the next cell (the one with the value).

    Thanks.
    //Zeelia

  6. #6
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: Getting information from HTML

    "...but to make a XML i need to get the values first of all.. C..." Not true... IF and ONLY IF the HTML is well formed, you can actually treat it like XML...

    Here's a link to the XMLDocument overview: http://msdn.microsoft.com/en-us/libr...ldocument.aspx
    You'll need either the Load method: http://msdn.microsoft.com/en-us/libr...ment.load.aspx
    Or the LoadXML method: http://msdn.microsoft.com/en-us/libr...t.loadxml.aspx

    From there, you can use SelectNodes ( http://msdn.microsoft.com/en-us/libr...lectnodes.aspx ) to get to the nodes you need.

    -tg
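
    For readers following along, here is a minimal sketch of the XmlDocument route described above. It assumes the markup is well-formed (lowercase tags, quoted attributes, every tag closed), which the page in this thread may not be. The module name XmlTableDemo, the helper GetCellValue, and the sample markup are made up for illustration, and it uses SelectSingleNode rather than SelectNodes because only one node is wanted.

    ```vbnet
    Imports System
    Imports System.Xml

    Module XmlTableDemo
        ' Hypothetical helper: returns the text of the second cell in the row
        ' whose first cell equals label, or "" if there is no such row.
        ' Assumes label contains no apostrophe (it is pasted into the XPath).
        Function GetCellValue(ByVal doc As XmlDocument, ByVal label As String) As String
            Dim node As XmlNode = doc.SelectSingleNode("//tr[td[1]='" & label & "']/td[2]")
            If node Is Nothing Then Return ""
            Return node.InnerText
        End Function

        Sub Main()
            ' Well-formed stand-in for the table from the first post.
            Dim xhtml As String = "<html><body><table>" & _
                "<tr><td width=""20%"">Name:</td><td>Name Value</td></tr>" & _
                "<tr><td>Sex:</td><td>Value of Sex</td></tr>" & _
                "</table></body></html>"

            Dim doc As New XmlDocument()
            doc.LoadXml(xhtml) ' throws XmlException if the markup is not well-formed

            Console.WriteLine(GetCellValue(doc, "Name:")) ' prints "Name Value"
            Console.WriteLine(GetCellValue(doc, "Sex:"))  ' prints "Value of Sex"
        End Sub
    End Module
    ```

    XPath names are case-sensitive, so the expression only matches lowercase tr/td elements. If LoadXml throws on the real page, the markup is not well-formed and this approach does not apply as-is.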

  7. #7

    Thread Starter
    Fanatic Member
    Join Date
    Mar 2008
    Posts
    519

    Re: Getting information from HTML

    Hi!
    XmlDocument doesn't work here because the page is not well-formed; for example, it has unclosed tags.

    So if anyone could point me in the right direction on how to proceed with RegEx, I'd really appreciate it.

    Thanks!

    *EDIT*
    Okay, I solved the problem. I'm not very good at RegEx, so if you have a better solution, please share.
    Here's my solution:
    vb.net Code:
    ' Requires "Imports System.Text.RegularExpressions" at the top of the file.
    Public Function RunRegEx(ByVal inputhtml As String, ByVal fieldname As String) As String
        ' Match a cell containing fieldname followed by a second cell, e.g.
        ' <TD ...>Name:</TD><TD>Name Value</TD>. Group 3 captures the second cell's content.
        Dim rx As New Regex("<([A-Z][A-Z0-9]*)\b[^>]*>" & Regex.Escape(fieldname) & "</\1><([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\2>", RegexOptions.IgnoreCase)

        ' Only the first match is used.
        Dim m As Match = rx.Match(inputhtml)
        If m.Success Then
            ' Return the captured value instead of trimming the match by fixed offsets.
            Return m.Groups(3).Value
        End If

        ' No matching pair of cells was found.
        Return ""
    End Function


    //Zeelia
    Last edited by Zeelia; Jun 15th, 2009 at 06:13 PM.
