
Thread: [RESOLVED] extract a "value" from html tag or a textbox

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Sep 2010
    Location
    Glogow, Poland
    Posts
    104

    [RESOLVED] extract a "value" from html tag or a textbox

    Hi,

    Please help me do something.

    The full story is very very long, but the bottom line is:

    I'm reading html tags from a webpage using Webbrowser and HtmlElementCollection class. Then I invoke specific members (link, buttons etc.) depending on their attributes.

    I got stuck at one point. The webpage has many similar images in one place. They can be distinguished only by a version number. I need to get this number somehow to be able to invoke the correct link/image.

    This is an example of the HTML code (it looks like this; I only removed personal information from it). Let's say I already have it saved in a textbox:
    Code:
    <TR>
    <TD>My text for label</TD>
    <TD>my text for description</TD>
    <TD>27 February 2011 12:12</TD>
    <TD>my username</TD>
    <TD>197</TD>
    <TD>&nbsp;</TD>
    <TD><NOBR><A id=snapshot_revert_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_revert_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title=Revert alt=Revert src="/servertype/images/icons/revert.gif"></A>&nbsp;&nbsp;<A id=snapshot_deploy_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_deploy_cms-company-3';document.forms['website']['store'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title=Deploy alt=Deploy src="/servertype/images/icons/deploy.gif"></A>&nbsp;&nbsp;<A id=snapshot_compare_to_previous_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_compare_to_previous_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title="Compare to Previous Snapshot" alt="Compare to Previous Snapshot" src="/servertype/images/icons/comparetoprevious.png"></A>&nbsp;&nbsp;<A id=snapshot_compare_to_any_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_compare_to_any_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; 
BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title="Compare to Any Snapshot" alt="Compare to Any Snapshot" src="/servertype/images/icons/comparetoany.png"></A></NOBR></TD></TR></TBODY></TABLE></DIV></TD></TR>
    Now, I'm after the number 197, which appears a couple of times in the code.

    I know the values of 'My text for label', 'my text for description' and 'my username'. Also, those links in the form of images always have the same structure, apart from the number in question: 197.

    How do I extract the number 197 on its own?

    Once I have it, I would use it to identify the link with the ID "id=snapshot_deploy_cms-company-3" and invoke it.

    Basically I want to do this:
    If a TR tag contains 'My text for label' and also contains 'my username', then
    read the number 197
    and... say... save the number into a textbox.
    Last edited by marl; Feb 27th, 2011 at 09:40 AM. Reason: removed some personal info

  2. #2
    VB Addict Pradeep1210's Avatar
    Join Date
    Apr 2004
    Location
    Inside the CPU...
    Posts
    6,614

    Re: extract a "value" from html tag or a textbox

    Not tested code, but will give you an idea how to do it:

    vb.net Code:
    ' Find the row whose text starts with the label and later contains the username.
    For Each tr As HtmlElement In WebBrowser1.Document.Body.GetElementsByTagName("TR")
        If tr.InnerText Like "My text for label*my username*" Then
            ' The version is the first purely numeric cell in that row.
            For Each td As HtmlElement In tr.GetElementsByTagName("TD")
                If IsNumeric(td.InnerText) Then
                    MsgBox(td.InnerText)
                    Exit For
                End If
            Next
            Exit For
        End If
    Next
    Pradeep, Microsoft MVP (Visual Basic)

  3. #3

    Thread Starter

    Re: extract a "value" from html tag or a textbox

    just spotted an answer in this thread: http://www.vbforums.com/showthread.php?t=642582

    Unfortunately, it isn't complete for my purposes.

    here is the code:
    Code:
    Imports System
    Imports System.Text.RegularExpressions
    Code:
    Dim mCollect As MatchCollection = Regex.Matches(TextBox1.Text.ToString, "(?<=<TD>my username</TD><TD>).*?(?=</TD><TD>&nbsp;</TD>)", RegexOptions.IgnoreCase)
    
            For Each m As Match In mCollect
                MsgBox(m.Value)
            Next
    This works well if the HTML code comes in one line, but it does not. In the HTML code there are line breaks between those </TD><TD> tags. How do I tell the regex to take those line breaks into account?
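    One possible approach is to allow optional whitespace (\s*) between the tags, because \s also matches carriage returns and line feeds. Below is a minimal sketch, assuming the row HTML looks like the sample in post #1. The module name LineBreakDemo, the helper ExtractVersion and the group name "version" are made up for illustration, and the code has not been run against the real page.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LineBreakDemo
    ' Hypothetical helper: returns the text of the cell that follows the
    ' username cell, or Nothing when the pattern does not match.
    Function ExtractVersion(html As String, userName As String) As String
        ' \s* matches any run of whitespace (including CR/LF) between the tags.
        Dim pattern As String = "<TD>" & Regex.Escape(userName) & "</TD>\s*<TD>(?<version>[^<]*)</TD>\s*<TD>&nbsp;</TD>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
        If m.Success Then Return m.Groups("version").Value.Trim()
        Return Nothing
    End Function

    Sub Main()
        ' Sample input with line breaks between the cells (an assumption about the real page).
        Dim html As String = "<TD>my username</TD>" & Environment.NewLine &
                             "<TD>197</TD>" & Environment.NewLine &
                             "<TD>&nbsp;</TD>"
        Console.WriteLine(ExtractVersion(html, "my username"))
    End Sub
End Module
```

    A named capture group is used here instead of the look-behind/look-ahead pair, so the value can be read from m.Groups("version") regardless of how much whitespace surrounds the tags.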

    @Pradeep1210 - thanks for the code. I will give it a go, but it may take me a while to get my head around it, as I'm reading the HTML in a different way, so I would need to adjust your code.

  4. #4

    Thread Starter

    Re: extract a "value" from html tag or a textbox

    ok - I got it working.

    It is lame, but does the job (apologies for messy code):
    Code:
    Try
        ' Find the row that contains the label, the username and the description,
        ' and store its HTML in a textbox.
        Dim theElementCollection As HtmlElementCollection
        theElementCollection = Form2.WebBrowser1.Document.GetElementsByTagName("tr")
        For Each curElement As HtmlElement In theElementCollection
            Dim rowHtml As String = curElement.OuterHtml
            If rowHtml.Contains("My text for label") AndAlso rowHtml.Contains("my username") AndAlso rowHtml.Contains("my text for description") Then
                Form5.TextBox1.Text = rowHtml
            End If
        Next

        ' Match whatever sits between  'version'].value='  and  ';document.forms
        ' (the regex metacharacters ] and . are escaped so they match literally).
        Dim mCollect As MatchCollection = Regex.Matches(Form5.TextBox1.Text, "(?<='version'\]\.value=').*?(?=';document\.forms)", RegexOptions.IgnoreCase)

        For Each m As Match In mCollect
            ' Every match holds the same version number, so overwriting is harmless.
            Form5.TextBox2.Text = m.Value
        Next

    Catch exc As Exception
        MsgBox(exc.Message)
    End Try
    So what it does: it reads the HTML code of those links ("a") and finds the string (.*?) that sits between two other unique strings ('version'].value=' and ';document.forms). This returns my number (197) several times, but I'm not bothered, as it is the same in each case. It puts the number into a text box (Form5.TextBox2.Text = m.Value).
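    If only one value is needed, the extraction step could also stop at the first match. This is a minimal sketch, assuming the onclick text has the shape shown in post #1. VersionDemo and FirstVersion are hypothetical names, and \d+ assumes the version is always a plain number.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module VersionDemo
    ' Hypothetical helper: returns the first value assigned to the 'version'
    ' form field in the given HTML, or Nothing when there is none.
    Function FirstVersion(rowHtml As String) As String
        ' [ ] and . are escaped so they match literally; \d+ captures the digits.
        Dim m As Match = Regex.Match(rowHtml, "\['version'\]\.value='(?<version>\d+)'")
        If m.Success Then Return m.Groups("version").Value
        Return Nothing
    End Function

    Sub Main()
        ' Shortened copy of the onclick text from the first post.
        Dim sample As String = "document.forms['website']['sandbox'].value='cms-company-3';" &
                               "document.forms['website']['version'].value='197';" &
                               "document.forms['website'].submit();return false;"
        Console.WriteLine(FirstVersion(sample))
    End Sub
End Module
```

    Regex.Match returns only the first occurrence, so there is no need to loop over a MatchCollection when all occurrences carry the same number.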

    Then I invoke the desired member using this code:
    Code:
    Try
        ' Click the deploy link whose HTML also contains the version number found above.
        Dim theElementCollection As HtmlElementCollection
        theElementCollection = Form2.WebBrowser1.Document.GetElementsByTagName("a")
        For Each curElement As HtmlElement In theElementCollection
            Dim linkHtml As String = curElement.OuterHtml
            If linkHtml.Contains("A id=snapshot_deploy_cms-company-3") AndAlso linkHtml.Contains(Form5.TextBox2.Text) Then
                curElement.InvokeMember("click")
            End If
        Next

    Catch exc As Exception
        MsgBox(exc.Message)
    End Try
    So this knows which element it is supposed to click, as it looks for 'A id=snapshot_deploy_cms-company-3' and also for the number 197, which in this case is held in 'Form5.TextBox2.Text'.
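    One caveat with a plain Contains on the number: "197" would also be found inside "1197" or "1970". A stricter check could look for the whole version assignment instead. This is only a sketch, assuming the link HTML keeps the unquoted "id=..." form shown in post #1. LinkMatchDemo and IsTargetLink are hypothetical names.

```vbnet
Imports System

Module LinkMatchDemo
    ' Hypothetical helper: True when the link HTML carries the given id and
    ' assigns exactly the given version to the 'version' form field.
    Function IsTargetLink(outerHtml As String, linkId As String, version As String) As Boolean
        If String.IsNullOrEmpty(outerHtml) OrElse String.IsNullOrEmpty(version) Then Return False
        ' Including the surrounding quotes keeps 197 from matching 1197 or 1970.
        Return outerHtml.Contains("id=" & linkId) AndAlso
               outerHtml.Contains("['version'].value='" & version & "'")
    End Function

    Sub Main()
        Dim link As String = "<A id=snapshot_deploy_cms-company-3 onclick=""document.forms['website']['version'].value='1197';"" href=""#"">"
        Console.WriteLine(IsTargetLink(link, "snapshot_deploy_cms-company-3", "197"))   ' False: the version is 1197
        Console.WriteLine(IsTargetLink(link, "snapshot_deploy_cms-company-3", "1197"))  ' True
    End Sub
End Module
```

    In the loop, the condition would then become IsTargetLink(curElement.OuterHtml, "snapshot_deploy_cms-company-3", Form5.TextBox2.Text).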

    Job done, but once again thank you Pradeep1210 for your reply.
