-
Feb 27th, 2011, 09:33 AM
#1
Thread Starter
Lively Member
[RESOLVED] extract a "value" from html tag or a textbox
Hi,
Please help me with something.
The full story is very long, but the bottom line is:
I'm reading HTML tags from a webpage using the WebBrowser control and the HtmlElementCollection class. Then I invoke specific members (links, buttons etc.) depending on their attributes.
I got stuck at one point. The webpage has many similar images in one place. They can be distinguished only by a version number. I need to get this number somehow to be able to invoke the correct link/image.
This is an example of the HTML code (it looks like this; I only removed personal information from it). Let's say I already have it saved in a textbox:
Code:
<TR>
<TD>My text for label</TD>
<TD>my text for description</TD>
<TD>27 February 2011 12:12</TD>
<TD>my username</TD>
<TD>197</TD>
<TD> </TD>
<TD><NOBR><A id=snapshot_revert_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_revert_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title=Revert alt=Revert src="/servertype/images/icons/revert.gif"></A> <A id=snapshot_deploy_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_deploy_cms-company-3';document.forms['website']['store'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title=Deploy alt=Deploy src="/servertype/images/icons/deploy.gif"></A> <A id=snapshot_compare_to_previous_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_compare_to_previous_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title="Compare to Previous Snapshot" alt="Compare to Previous Snapshot" src="/servertype/images/icons/comparetoprevious.png"></A> <A id=snapshot_compare_to_any_cms-company-3 onclick="document.forms['website']['website:act'].value='website:snapshot_compare_to_any_cms-company-3';document.forms['website']['sandbox'].value='cms-company-3';document.forms['website']['version'].value='197';document.forms['website'].submit();return false;" href="#"><IMG style="BORDER-RIGHT-WIDTH: 0px; BORDER-TOP-WIDTH: 0px; BORDER-BOTTOM-WIDTH: 0px; 
VERTICAL-ALIGN: -4px; BORDER-LEFT-WIDTH: 0px" title="Compare to Any Snapshot" alt="Compare to Any Snapshot" src="/servertype/images/icons/comparetoany.png"></A></NOBR></TD></TR></TBODY></TABLE></DIV></TD></TR>
Now, I'm after the number 197, which appears a couple of times in the code.
I know the values of 'My text for label', 'my text for description' and 'my username'. Also, those 3 links in the form of images always have the same structure, apart from the number in question (197).
How do I extract the number 197 alone?
Once I have it, I would use it to identify the link with ID "id=snapshot_deploy_cms-company-3" and invoke it.
Basically I want to do this:
If a TR tag contains 'My text for label' and also contains 'my username', then
read the number 197
and... say... save the number into a textbox.
Last edited by marl; Feb 27th, 2011 at 09:40 AM.
Reason: removed some personal info
-
Feb 27th, 2011, 09:53 AM
#2
Re: extract a "value" from html tag or a textbox
Not tested code, but will give you an idea how to do it:
vb.net Code:
For Each TR As HtmlElement In WebBrowser1.Document.Body.GetElementsByTagName("TR")
    If TR.InnerText Like "My text for label*my username*" Then
        For Each TD As HtmlElement In TR.GetElementsByTagName("TD")
            If IsNumeric(TD.InnerText) Then
                MsgBox(TD.InnerText)
                Exit For
            End If
        Next
        Exit For
    End If
Next
-
Feb 27th, 2011, 10:14 AM
#3
Thread Starter
Lively Member
Re: extract a "value" from html tag or a textbox
I just spotted an answer in this thread: http://www.vbforums.com/showthread.php?t=642582
Unfortunately it isn't complete for my purposes.
Here is the code:
Code:
Imports System
Imports System.Text.RegularExpressions
Code:
Dim mCollect As MatchCollection = Regex.Matches(TextBox1.Text.ToString, "(?<=<TD>my username</TD><TD>).*?(?=</TD><TD> </TD>)", RegexOptions.IgnoreCase)
For Each m As Match In mCollect
MsgBox(m.Value)
Next
This works well if the HTML code comes in one line, but it does not. In the HTML code there are line breaks between those </TD><TD> tags. How do I tell it to take those line breaks into account?
@Pradeep1210 - thanks for the code. I will give it a go, but it may take me a while to get my head around it, as I'm reading the HTML in a different way, so I would need to adjust your code.
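One possible way to handle the line breaks asked about above is to put \s* between the tags, since \s matches spaces, tabs and line breaks alike. The sketch below is only an illustration: the module name, the function name ExtractCellAfterUsername and the sample string are made up for the example, and it uses a capturing group in place of the lookarounds (the .NET regex engine also accepts \s* inside a lookbehind, so the original pattern could be adapted the same way).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LineBreakTolerantMatch
    ' Hypothetical helper: returns the contents of the <TD> cell that follows
    ' the "my username" cell, or Nothing when no such cell is found.
    Function ExtractCellAfterUsername(html As String) As String
        ' \s* allows any whitespace (including line breaks) between the tags.
        Dim pattern As String = "<TD>my username</TD>\s*<TD>(.*?)</TD>"
        Dim m As Match = Regex.Match(html, pattern,
                                     RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        ' Sample input with line breaks between the cells (assumed layout).
        Dim html As String = "<TD>my username</TD>" & Environment.NewLine &
                             "<TD>197</TD>" & Environment.NewLine &
                             "<TD> </TD>"
        Console.WriteLine(ExtractCellAfterUsername(html))
    End Sub
End Module
```

RegexOptions.Singleline is included so that the captured cell itself may span more than one line; if the cell is always on one line it can be left out.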
-
Feb 27th, 2011, 12:00 PM
#4
Thread Starter
Lively Member
Re: extract a "value" from html tag or a textbox
ok - I got it working.
It is lame, but does the job (apologies for messy code):
Code:
Try
    ' Find the table row that contains the label, username and description.
    Dim theElementCollection As HtmlElementCollection
    theElementCollection = Form2.WebBrowser1.Document.GetElementsByTagName("tr")
    For Each curElement As HtmlElement In theElementCollection
        Dim rowHtml As String = curElement.OuterHtml
        If rowHtml.Contains("My text for label") AndAlso rowHtml.Contains("my username") AndAlso rowHtml.Contains("my text for description") Then
            Form5.TextBox1.Text = rowHtml
        End If
    Next
    ' Pull the version number out of the onclick handlers in that row.
    Dim mCollect As MatchCollection = Regex.Matches(Form5.TextBox1.Text, "(?<='version'].value=').*?(?=';document.forms)", RegexOptions.IgnoreCase)
    For Each m As Match In mCollect
        Form5.TextBox2.Text = m.Value
    Next
Catch exc As Exception
    MsgBox(exc.Message)
End Try
What it does: it reads the HTML of the matching row (including the "a" links inside it) and finds the string (.*?) between two other unique strings ('version'].value=' and ';document.forms). This returns my number (197) several times, but I'm not bothered as it is the same in each case. It puts the number into a textbox (Form5.TextBox2.Text = m.Value).
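Since every match carries the same number, a possible alternative is to take only the first match with Regex.Match and a capturing group. This is a minimal sketch, not the code used above: the module name, the function name ExtractVersion and the sample string are assumptions for illustration, and the pattern assumes the version is always made of digits.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module VersionFromOnClick
    ' Hypothetical helper: returns the first version number found in an
    ' onclick handler, or Nothing when the row HTML contains none.
    Function ExtractVersion(rowHtml As String) As String
        ' The brackets and the dot are escaped so they match literally;
        ' (\d+) captures the digits between the single quotes.
        Dim m As Match = Regex.Match(rowHtml, "\['version'\]\.value='(\d+)'")
        Return If(m.Success, m.Groups(1).Value, Nothing)
    End Function

    Sub Main()
        ' Fragment of an onclick attribute in the same shape as the posted HTML.
        Dim sample As String =
            "document.forms['website']['version'].value='197';document.forms['website'].submit();"
        Console.WriteLine(ExtractVersion(sample))
    End Sub
End Module
```

Returning Nothing when there is no match lets the caller decide what to do instead of leaving an old value in the textbox.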
Then I invoke the desired member using this code:
Code:
Try
    ' Click the deploy link whose HTML also contains the extracted version number.
    Dim theElementCollection As HtmlElementCollection
    theElementCollection = Form2.WebBrowser1.Document.GetElementsByTagName("a")
    For Each curElement As HtmlElement In theElementCollection
        Dim linkHtml As String = curElement.OuterHtml
        If linkHtml.Contains("A id=snapshot_deploy_cms-company-3") AndAlso linkHtml.Contains(Form5.TextBox2.Text) Then
            curElement.InvokeMember("click")
        End If
    Next
Catch exc As Exception
    MsgBox(exc.Message)
End Try
This knows which element it is supposed to click because it looks for 'A id=snapshot_deploy_cms-company-3' and also for the number (197 in this case), represented by 'Form5.TextBox2.Text'.
Job done, but once again thank you Pradeep1210 for your reply.
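One caveat with checking Contains on the bare number: "197" would also be found inside "1197" or "1970", or anywhere else in the link's HTML. A stricter check could look for the whole 'version' assignment instead. The sketch below is only an illustration under that assumption; the module name, the function name IsDeployLinkForVersion and the sample string are made up, and it assumes the browser renders the id attribute without quotes, as in the posted HTML.

```vbnet
Imports System

Module DeployLinkFilter
    ' Hypothetical helper: True when the link HTML carries the given id and
    ' assigns exactly the given version in its onclick handler.
    Function IsDeployLinkForVersion(outerHtml As String, linkId As String, version As String) As Boolean
        Dim idToken As String = "id=" & linkId
        ' The closing quote keeps "19" from matching a handler that sets '197'.
        Dim versionToken As String = "['version'].value='" & version & "'"
        Return outerHtml.IndexOf(idToken, StringComparison.OrdinalIgnoreCase) >= 0 AndAlso
               outerHtml.Contains(versionToken)
    End Function

    Sub Main()
        ' Shortened link in the same shape as the posted HTML.
        Dim link As String =
            "<A id=snapshot_deploy_cms-company-3 onclick=""document.forms['website']['version'].value='197';return false;"" href=""#"">"
        Console.WriteLine(IsDeployLinkForVersion(link, "snapshot_deploy_cms-company-3", "197"))
        Console.WriteLine(IsDeployLinkForVersion(link, "snapshot_deploy_cms-company-3", "19"))
    End Sub
End Module
```

In the click loop, such a function could stand in for the two Contains checks, with curElement.OuterHtml passed as the first argument.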