Thread: Parsing information from a webpage

  1. #1

    Thread Starter
    Junior Member
    Join Date
    Feb 2005
    Posts
    30

    Parsing information from a webpage

    Hello:
    I want to know how to parse (read) the information contained in a web page, using a VB applet that will refresh this information every 2 or 3 seconds. The web page is very simple: it contains no images, only a marquee (which can be eliminated) and text that displays temperature readings, etc.
    I know that I have to use the WebBrowser control and parse the document after it has loaded; what I don't know is how to parse it.
    Thanks

  2. #2
    PowerPoster Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Parsing information from a webpage

    What's the URL? (Or is it a local page?)

    We will need to see the source to be able to give you some code to do this.

    If this page is online, a refresh every 2 to 3 seconds may not be possible with the WebBrowser control. What can be done instead is to reload as soon as it's done parsing, which should still be very fast.
    JPnyc rocks!! (Just ask him!)
    If u have your answer please go to the thread tools and click "Mark Thread Resolved"

  3. #3

    Thread Starter

    Re: Parsing information from a webpage

    The webpage is local (it's being generated by a microcontroller board). I don't really have much of the source code, but I have read some of it. I also want to know if it is possible to extract just the information, ignoring a marquee that's on the page.
    Thanks

  4. #4
    PowerPoster Static's Avatar

    Re: Parsing information from a webpage

    Yes, but I need to see the full source and what you want to get from it.
    Using the WebBrowser control combined with a reference to the HTML Object Library,
    it should be a snap.
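
    A minimal sketch of that approach, not a finished solution. It assumes a form with a WebBrowser control named WebBrowser1, text boxes txtURL and txtSource, a button cmdGo, a Timer named tmrReload, and a project reference to the Microsoft HTML Object Library; all of these names are hypothetical. It reads the page text once the document has loaded, then schedules the next load shortly afterwards instead of reloading immediately:

    ```vb
    Option Explicit

    Private Sub cmdGo_Click()
        ' Start the first load; later loads are triggered by the timer.
        tmrReload.Enabled = False
        WebBrowser1.Navigate txtURL.Text
    End Sub

    Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        Dim objDocument As MSHTML.HTMLDocument

        ' Frames raise this event too; only act on the top-level document.
        If Not (pDisp Is WebBrowser1.Object) Then Exit Sub

        Set objDocument = WebBrowser1.Document

        ' innerText returns the visible text with the HTML tags stripped.
        txtSource.Text = objDocument.body.innerText

        ' Schedule the next load (2000 ms is an assumed value, adjust as needed).
        tmrReload.Interval = 2000
        tmrReload.Enabled = True
    End Sub

    Private Sub tmrReload_Timer()
        ' One-shot: switch the timer off until the next page has been parsed.
        tmrReload.Enabled = False
        WebBrowser1.Navigate txtURL.Text
    End Sub
    ```

    Re-arming the timer only after DocumentComplete means a slow response from the server cannot cause overlapping requests.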

    Can you post the full source? (Or attach the page.)

    Thanks!

  5. #5

    Thread Starter

    Re: Parsing information from a webpage

    OK, I will post the full source.

  6. #6
    Fanatic Member
    Join Date
    Aug 2005
    Location
    South Africa
    Posts
    760

    Re: Parsing information from a webpage

    It would be easiest using the webbrowser control, but I suggest you use the Inet control to increase the speed of the download (if that is a priority to you).

    Post the source code of the page as Static said, and we will be able to help you generate code to parse it.
    If I helped you out, please consider adding to my reputation!

    -- "The faulty interface lies between the chair and the keyboard" --

    VB6 Programs By Me:
    ** Dictionary, Thesaurus & Rhyme-Generator In One ** WMP Recent Files List Editor ** Pretty Impressive Clock ** Extract Firefox History **

  7. #7

    Thread Starter

    Re: Parsing information from a webpage

    Web Server for Embedded Applications\
    </I>\r\n\
    <BR>\
    <A HREF=http://www.violasystems.com>\
    www.violasystems.com - Embedding The Internet</A>\
    </BODY>";

    char Https_TestIndexPage[] = "HTTP/1.0 200 OK\r\n\
    Last-modified: Fri, 18 Oct 2002 12:04:32 GMT\r\n\
    Server: ESERV-10/1.0\nContent-type: text/html\r\n\
    Content-length: 400\r\n\
    \r\n\
    <HEAD>\
    <TITLE>UPRB Sistema de monitoreo de Sensores</TITLE></HEAD>\
    <BODY>\
    <DIV align=center>\
    <H2><MARQUEE behavior=scroll direction=right width=500> MCF5282 Microcontroller</DIV></MARQUEE></H2>\
    <BR>\
    <DIV align=center>\
    UPRB Salón 121 \
    <HR>\
    <BR> Temperatura Fahrenh = \
    <I>\r\n\
    <BR> Humedad Relativa = \
    <BR> Intensidad de luz = \
    <BR> Sensor de Humo = \
    <BR> Sensor de Movimiento = \
    <BR> Sensor de Puerta = \
    <BR> Servidor WEB <BR></I>\r\n\
    <BR>\
    <A HREF=http://www.uprb.edu>\
    www.uprb.edu</A>\
    <BR><BR>\
    <HR>\
    </DIV>\
    </BODY>";

  8. #8

    Thread Starter

    Re: Parsing information from a webpage

    This is the part of the C code that generates the web page.

  9. #9
    PowerPoster Static's Avatar

    Re: Parsing information from a webpage


    We need just the final result (the HTML), with data included, for testing.

  10. #10

    Thread Starter

    Re: Parsing information from a webpage

    The thing is that the variables that are displayed are filled in by another section of the microcontroller code. Can you suggest any code based on what you have so far? I will fill in the gaps.
    Where it says
    <BR> Temperatura Fahrenh = \
    <I>\r\n\
    <BR> Humedad Relativa = \
    <BR> Intensidad de luz = \
    <BR> Sensor de Humo = \
    <BR> Sensor de Movimiento = \
    <BR> Sensor de Puerta = \

    this is the part of the HTML that I want to read (it's in Spanish).
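
    A minimal sketch of how one of those readings might be pulled out of the page text. It assumes that each value appears on the same line as its label, directly after the "=" sign, which cannot be confirmed from the C source above because the values are filled in elsewhere. GetReading is a hypothetical helper name:

    ```vb
    ' Returns the text that follows strLabel up to the end of that line,
    ' or an empty string when the label is not found.
    Private Function GetReading(ByVal strPageText As String, _
                                ByVal strLabel As String) As String
        Dim lngStart As Long
        Dim lngEnd As Long

        lngStart = InStr(1, strPageText, strLabel, vbTextCompare)
        If lngStart = 0 Then Exit Function

        ' Skip past the label itself.
        lngStart = lngStart + Len(strLabel)

        ' The value is assumed to end at the next line break.
        lngEnd = InStr(lngStart, strPageText, vbCr)
        If lngEnd = 0 Then lngEnd = InStr(lngStart, strPageText, vbLf)
        If lngEnd = 0 Then lngEnd = Len(strPageText) + 1

        GetReading = Trim$(Mid$(strPageText, lngStart, lngEnd - lngStart))
    End Function
    ```

    It could then be called on the page's inner text, for example GetReading(strPageText, "Temperatura Fahrenh ="), where strPageText holds the text returned by innerText.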

  11. #11

    Thread Starter

    Re: Parsing information from a webpage

    OK, I have been able to parse the inner text out of the web page. Now I have another problem:
    when I try to reload the page after a variable (that I am controlling) changes, the information "read" by the program is not updated, and I know it's a problem with the cache.
    Any info on this would be appreciated. I will post the source code following this.

  12. #12

    Thread Starter

    Re: Parsing information from a webpage

    Option Explicit

    Private Sub cmdExit_Click()
        If MsgBox("Are you sure?", vbYesNo, "Exiting the application") = vbYes Then
            Unload Me
        End If
    End Sub

    Private Sub cmdGo_Click()
        ' Late-bound, so any element type returned by the links collection is accepted
        Dim objLink As Object
        Dim objMSHTML As New MSHTML.HTMLDocument
        Dim objDocument As MSHTML.HTMLDocument

        lblStatus.Caption = "Getting document via HTTP"

        ' This function requires Internet Explorer 5 or later
        Set objDocument = objMSHTML.createDocumentFromUrl(txtURL.Text, vbNullString)

        lblStatus.Caption = "Getting and parsing HTML document"

        ' Tricky: to make the function wait for the document to complete, since
        ' the transfer is usually asynchronous. Note that this string might be
        ' different if you have another language than English for Internet
        ' Explorer on the machine where the code is executed.
        While objDocument.readyState <> "complete"
            DoEvents
        Wend

        lblStatus.Caption = "Document completed"

        ' Copying the page text (HTML tags stripped) to the text box
        txtSource.Text = objDocument.documentElement.innerText
        DoEvents

        ' Copying the title of the page to the label
        lblTitle.Caption = "Title : " & objDocument.Title
        DoEvents

        lblStatus.Caption = "Extracting links"

        ' Processing the links collection of the HTMLDocument object
        For Each objLink In objDocument.links
            lstLinks.AddItem objLink.href
            lblStatus.Caption = "Extracted " & objLink.href
            DoEvents
        Next

        lblStatus.Caption = "Done"
        Beep
    End Sub

  13. #13

    Thread Starter

    Re: Parsing information from a webpage

    Is there a way to make the program always fetch the document from the web page instead of the cache?
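
    One thing that might work is to make every request URL unique, so that a cached copy never matches. This is only a sketch under a loud assumption: that the board's web server ignores a query string it does not recognise, which has not been verified here. NoCacheUrl and the "nocache" parameter name are hypothetical:

    ```vb
    ' Returns strUrl with a time-based dummy parameter appended, so that two
    ' requests made in different seconds never share the same URL.
    ' ASSUMPTION: the server ignores the extra "nocache" parameter.
    Private Function NoCacheUrl(ByVal strUrl As String) As String
        Dim strSeparator As String

        ' Use "&" when the URL already has a query string, "?" otherwise.
        If InStr(1, strUrl, "?") > 0 Then
            strSeparator = "&"
        Else
            strSeparator = "?"
        End If

        NoCacheUrl = strUrl & strSeparator & "nocache=" & _
                     Format$(Now, "yyyymmddhhnnss")
    End Function
    ```

    The call in cmdGo_Click would then pass NoCacheUrl(txtURL.Text) to createDocumentFromUrl instead of txtURL.Text. If the WebBrowser control is used instead, its Navigate method takes an optional Flags argument, and navNoReadFromCache is one of the defined values; whether it is honoured may depend on the installed Internet Explorer version.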
