Thread: [Resolved] Extracting Table Contents from HTML

  1. #1

    Thread Starter
    New Member
    Join Date
    Feb 2015
    Posts
    5

    [Resolved] Extracting Table Contents from HTML

    Hey guys, I have been searching for many hours, but I am not able to get the content from a website and display it in my form.

    Here's the code:

    Code:
    <td class='time' align='center'>4:48</td><td class='time' align='center'>6:05</td><td class='time' align='center'>11:47</td><td class='time' align='center'>3:05</td><td class='time' align='center'>5:32</td><td class='time' align='center'>7:02</td>
    Basically, I want to extract the times highlighted in red (4:48, 6:05, and so on).

    Thanks
    Last edited by abdullah143; May 8th, 2015 at 07:39 AM.

  2. #2
    Bad man! ident
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,390

    Re: Extracting Table Contents from HTML

    vb Code:
    Imports System.Linq
    Imports System.Text.RegularExpressions

    Public Class Form1

        Private Sub Foo()
            ' Sample markup; replace this with the page's actual source.
            Dim value As String = "<td class='time' align='center'>4:48</td><td class='time' align='center'>6:05</td><td class='time' align='center'>11:47</td><td class='time' align='center'>3:05</td><td class='time' align='center'>5:32</td><td class='time' align='center'>7:02</td>"
            ' Match a time such as 4:48 that sits between align='center'> and </td>.
            Dim rx As New Regex("(?<=align='center'>)\d{1,2}:\d{2}(?=</td>)")

            ' Show each matched time on its own line.
            MessageBox.Show(String.Join(vbNewLine, rx.Matches(value).Cast(Of Match).Select(Function(m) m.Value)))
        End Sub

    End Class

  3. #3

    Thread Starter
    New Member
    Join Date
    Feb 2015
    Posts
    5

    Re: Extracting Table Contents from HTML

    Thanks for the code. It does work, but it's complicated for me: I'm new to VB.NET and can't understand much of it. I'm also not able to use it with the webpage where the markup actually is :/ Sorry, but can you explain a bit more?

    Thanks

    edit: here's the website link
    Last edited by abdullah143; Feb 22nd, 2015 at 05:27 AM.

  4. #4
    Bad man! ident
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,390

    Re: Extracting Table Contents from HTML

    How does it not work with that website? It does.

  5. #5

    Thread Starter
    New Member
    Join Date
    Feb 2015
    Posts
    5

    Re: Extracting Table Contents from HTML

    Quote Originally Posted by ident
    How does it not work with that website? It does.
    Where exactly should I use that link, then? As the value string? I don't think so :/

  6. #6
    Bad man! ident
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,390

    Re: Extracting Table Contents from HTML

    You wouldn't :/ It's just an example string. You replace that with the page's source, however you previously retrieved it.

  7. #7

    Thread Starter
    New Member
    Join Date
    Feb 2015
    Posts
    5

    Re: Extracting Table Contents from HTML

    Quote Originally Posted by ident
    You wouldn't :/ It's just an example string. You replace that with the page's source, however you previously retrieved it.
    No, the source gets updated every day as the timings change, so I am not retrieving the source myself; I was using a WebBrowser control to navigate to the page. Also, what is regex? Any explanation would help, thanks.

  8. #8
    Bad man! ident
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,390

    Re: Extracting Table Contents from HTML

    It makes no difference if the page changes. You should not be using a WebBrowser control to get the page's source. Read the MSDN documentation for the WebClient class, paying particular attention to its DownloadString method.

    Explaining what regex is would be too much for one thread. If you google regex, you will find loads of answers.
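
    As a rough sketch of how WebClient.DownloadString and the regex could fit together (untested against the actual page): the module name, the function names, and the URL in the usage note are placeholders made up for illustration, and the pattern assumes the page uses exactly the markup posted in #1.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports System.Net
Imports System.Text.RegularExpressions

Module TimeScraper

    ' Pull every h:mm or hh:mm value that sits between align='center'> and </td>.
    Function ExtractTimes(html As String) As List(Of String)
        Dim rx As New Regex("(?<=align='center'>)\d{1,2}:\d{2}(?=</td>)")
        Return rx.Matches(html).Cast(Of Match)().Select(Function(m) m.Value).ToList()
    End Function

    ' Download the page's current source and extract the times from it.
    ' Because the source is fetched on every call, daily changes to the page
    ' are picked up automatically.
    Function DownloadTimes(url As String) As List(Of String)
        Using client As New WebClient()
            Dim html As String = client.DownloadString(url)
            Return ExtractTimes(html)
        End Using
    End Function

End Module
```

    You could then call something like DownloadTimes("http://example.com/timings") (a placeholder address, not the real link) from a button's Click handler and show the result with String.Join and MessageBox.Show, as in the earlier example.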

  9. #9
    Hyperactive Member
    Join Date
    Mar 2012
    Posts
    311

    Re: Extracting Table Contents from HTML

    Quote Originally Posted by ident
    Explaining what regex is would be too much for one thread. If you google regex, you will find loads of answers.
    Regex is the class that handles regular expressions in .NET, so you may want to read about regular expressions in general before reading the more specific material on the Regex class. Although Regex is one way of pulling data out of HTML, it isn't the only way. There are also classes that let you traverse the Document Object Model (DOM) of XML and, more specifically, HTML pages. If Regex (which is mostly used for matching patterns in text) is too complex or returns too much data (because your pattern is too general for the full HTML source), you may also want to read up on those other classes.
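
    To give an idea of the DOM-style alternative, here is a minimal sketch using LINQ to XML (XElement). It only works because the fragment from post #1 happens to be well-formed XML; a full real-world HTML page usually is not, and would need a proper HTML parser instead. The module and function names are made up for illustration.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module DomSketch

    ' Wrap the <td> cells in a single root element so the fragment parses as XML,
    ' then keep the text of every <td> whose class attribute is "time".
    Function ExtractTimesFromRow(fragment As String) As List(Of String)
        Dim row As XElement = XElement.Parse("<tr>" & fragment & "</tr>")
        Return row.Elements("td").
            Where(Function(td) td.Attribute("class") IsNot Nothing AndAlso
                               td.Attribute("class").Value = "time").
            Select(Function(td) td.Value).
            ToList()
    End Function

End Module
```

    The difference from the regex approach is that the cells are selected by element name and attribute value rather than by the surrounding text, so the order of attributes inside the tag no longer matters.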

  10. #10
    Bad man! ident
    Join Date
    Mar 2009
    Location
    Cambridge
    Posts
    5,390

    Re: Extracting Table Contents from HTML

    Please don't try to explain what regex is.

  11. #11

    Thread Starter
    New Member
    Join Date
    Feb 2015
    Posts
    5

    Re: Extracting Table Contents from HTML

    Quote Originally Posted by ident
    It makes no difference if the page changes. You should not be using a WebBrowser control to get the page's source. Read the MSDN documentation for the WebClient class, paying particular attention to its DownloadString method.

    Explaining what regex is would be too much for one thread. If you google regex, you will find loads of answers.
    Thanks a lot for all the help, dude. I'll look into it and get back if I have any other problems. You are the best.

    Quote Originally Posted by Pyth007
    Regex is the class that handles regular expressions in .NET, so you may want to read about regular expressions in general before reading the more specific material on the Regex class. Although Regex is one way of pulling data out of HTML, it isn't the only way. There are also classes that let you traverse the Document Object Model (DOM) of XML and, more specifically, HTML pages. If Regex (which is mostly used for matching patterns in text) is too complex or returns too much data (because your pattern is too general for the full HTML source), you may also want to read up on those other classes.
    Thanks for the explanation.
