-
[RESOLVED] WebBrowser - Html: Get values.
I want to get values from the site when I click the Button.
The site has code like this, for example:
1.
Code:
Name: <b>Daniel</b><BR>
Daniel is the value I want to put into TextBox1.Text
2. Same as the first:
Code:
<a onclick=" infowin(478488) " href="#"> ' It's not the full code
In the code:
Green: variable
Red: the text I want to get, i.e. what is inside the brackets (variable)
-
Re: WebBrowser - Html: Get values.
If those values are static, you can download the page and then just find the strings...
Look at this thread. It shows how to get the whole contents of the page; then you just need to search that string for the values you need. You can use regular expressions to do the search.
-
Re: WebBrowser - Html: Get values.
Quote:
Originally Posted by
mickey_pt
If those values are static, you can download the page and then just find the strings...
Look at this thread. It shows how to get the whole contents of the page; then you just need to search that string for the values you need. You can use regular expressions to do the search.
I do not want to download the page; I want to go through the WebBrowser control,
like:
Code:
TextBox1.Text = WebBrowser1.Document.GetElementById("ID").InnerText
-
Re: WebBrowser - Html: Get values.
loop through the <b> elements, if the OuterText.Contains "Name:" then that's the one you want. Get the InnerText of the <b> element that matches that condition.
for the other, loop through the anchor elements (<a>) and get the InnerHtml. Use a regular expression to get the "infowin(.*)" text. or try GetAttribute("onclick") to see if that returns what you want.
or view my blog in my sig to see other HTML scraping stuff...
-
Re: WebBrowser - Html: Get values.
Can you show me the code? I did not really understand.
-
Re: WebBrowser - Html: Get values.
Replace webpage_source_code_string with a string containing the page's source code.
Code:
Imports System
Imports System.Text.RegularExpressions
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Match whatever sits between <b> and </b>
        Dim mCollect As MatchCollection = Regex.Matches("webpage_source_code_string", "(?<=<b>).*?(?=</b>)", RegexOptions.IgnoreCase)
        For Each m As Match In mCollect
            MsgBox(m.Value)
        Next
    End Sub
End Class
-
Re: WebBrowser - Html: Get values.
That does not work for me (it gives me the source code of a different page). Only this approach can help me:
Quote:
Originally Posted by
stateofidleness
loop through the <b> elements, if the OuterText.Contains "Name:" then that's the one you want. Get the InnerText of the <b> element that matches that condition.
for the other, loop through the anchor elements (<a>) and get the InnerHtml. Use a regular expression to get the "infowin(.*)" text. or try GetAttribute("onclick") to see if that returns what you want.
or view my blog in my sig to see other HTML scraping stuff...
I just did not understand how to do it right.
-
Re: WebBrowser - Html: Get values.
Maybe if you make your question short and simple, I or the other forum users can solve it.
-
Re: WebBrowser - Html: Get values.
Hi mamrom,
I had a very similar problem recently.
I found a solution and I hope it will work for you as well.
Please bear in mind that some characters are not accepted by my solution.
This code is for your variable "Daniel":
Code:
Try
    Dim theElementCollection As HtmlElementCollection
    ' Find all HTML tags of type b (the tag for bold text)
    theElementCollection = WebBrowser1.Document.GetElementsByTagName("b")
    For Each curElement As HtmlElement In theElementCollection
        ' Look for the item we want by checking for text that is static within the HTML,
        ' in this example the parts before and after the variable
        If curElement.GetAttribute("OuterHtml").Contains("Name: <b>") AndAlso curElement.GetAttribute("OuterHtml").Contains("</b><BR>") Then
            ' If we found it, put the entire HTML tag into a textbox
            TextBox1.Text = curElement.GetAttribute("OuterHtml").ToString
        End If
    Next
    ' Now find the variable between the static parts of the HTML tag.
    ' The static parts are written out; the variable is represented by .*?
    Dim mCollect As MatchCollection = Regex.Matches(TextBox1.Text.ToString, "(?<=Name: <b>).*?(?=</b><BR>)", RegexOptions.IgnoreCase)
    For Each m As Match In mCollect
        ' Display the variable in a message box. This should return Daniel
        MsgBox(m.Value)
    Next
Catch exc As Exception
    MsgBox(exc.Message)
End Try
Basically, you convert the entire HTML tag into plain text and then look for whatever sits between two specified pieces of that text.
You provided just an example of the real code. I had a problem with this approach: I realised that some characters are not accepted in this line:
Dim mCollect As MatchCollection = Regex.Matches(TextBox1.Text.ToString, "(?<=Name: <b>).*?(?=</b><BR>)", RegexOptions.IgnoreCase)
So things like "" or '' or ;; etc. may in some cases not be accepted. I have not figured out the exact reasons for it yet.
Please refer to my thread for more info: http://www.vbforums.com/showthread.php?t=642625
Hope this helps.
regards
-
Re: WebBrowser - Html: Get values.
First of all, thanks for trying, but it did not help :(
I made an example of what I want:
Code:
<HTML>
<HEAD>
</HEAD>
<BODY>
Name: <b>Daniel</b><BR>
gender: <b>Male</b><BR>
</BODY>
</HTML>
1. Get the value "Daniel" into the text box (Daniel as variable)
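For the example page above, one possible way to pull out the value that follows a label is a regular expression with a capture group. This is only a sketch: the module name LabelSketch and the function GetLabeledValue are made up for illustration, and it assumes the value always sits in a <b> tag directly after "Label:", as in the sample.

```vbnet
Imports System.Text.RegularExpressions

Module LabelSketch
    ' Hypothetical helper: returns the text inside the first <b>...</b>
    ' that directly follows "label:", or Nothing when nothing matches.
    Function GetLabeledValue(ByVal html As String, ByVal label As String) As String
        ' Regex.Escape keeps special characters in the label from being read as pattern syntax
        Dim pattern As String = Regex.Escape(label) & ":\s*<b>(.*?)</b>"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then Return m.Groups(1).Value.Trim()
        Return Nothing
    End Function
End Module
```

With the page loaded in the WebBrowser control, a call such as TextBox1.Text = LabelSketch.GetLabeledValue(WebBrowser1.DocumentText, "Name") would then be expected to put Daniel into the text box.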
-
Re: WebBrowser - Html: Get values.
Am I being punked? Dude, post #6 is your solution, c'mon man.
Just replace "webpage_source_code_string" with the source code, which you can get with a WebBrowser control: navigate to the URL (IntelliSense will show you how) and get the source code (<html>...)
-
Re: WebBrowser - Html: Get values.
Downloading the source code and then searching it does not solve the problem; it gives me different source code.
-
Re: WebBrowser - Html: Get values.
mamrom post your current code.
moti's code will work perfectly, so i'm assuming an implementation problem at keyboard ;)
post your code so we can see/fix it.
-
Re: WebBrowser - Html: Get values.
It works, but the source code is wrong. It gives me the source code of a different page, not this one.
I use this code to download the source code:
Code:
Dim Client As New System.Net.WebClient
Dim Url As String
Url = WebBrowser1.Url.ToString
Dim html As String = Client.DownloadString(New Uri(Url))
RichTextBox1.Text = html
Do you have any idea what the problem could be?
Maybe it will work if I sign in to the server? (How do I do that?)
-
Re: WebBrowser - Html: Get values.
I forgot to say, the site only supports Internet Explorer
-
Re: WebBrowser - Html: Get values.
Since you are loading the web page in a WebBrowser control (say "WebBrowser1"), you can use WebBrowser1.DocumentText to get the source code of the page.
-
Re: WebBrowser - Html: Get values.
combining moti's code and aashish's last comment:
vb Code:
Imports System
Imports System.Text.RegularExpressions
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Search the HTML currently loaded in the WebBrowser control
        Dim mCollect As MatchCollection = Regex.Matches(WebBrowser1.DocumentText, "(?<=<b>).*?(?=</b>)", RegexOptions.IgnoreCase)
        For Each m As Match In mCollect
            MsgBox(m.Value)
        Next
    End Sub
End Class
-
Re: WebBrowser - Html: Get values.
thx!
It does not work (but the source code together with the code from marl does).
Thank you very much
-
Re: WebBrowser - Html: Get values.
gonna assume that means it's working? lol
-
Re: WebBrowser - Html: Get values.
There's one little problem with the encoding.
The source code looks like this:
http://img46.imageshack.us/img46/6301/29041057.png
"�"
(This is Hebrew)
-
Re: WebBrowser - Html: Get values.
we need to be clear here going forward.. when you use the phrase "Source code", indicate if you're talking about HTML code or the VB Source code you're using. It's a bit confusing.
Is the web page itself in Hebrew? Are you able to provide a link to the page you're trying this on?
-
Re: WebBrowser - Html: Get values.
Quote:
Originally Posted by
stateofidleness
we need to be clear here going forward.. when you use the phrase "Source code", indicate if you're talking about HTML code or the VB Source code you're using. It's a bit confusing.
Is the web page itself in Hebrew? Are you able to provide a link to the page you're trying this on?
I mean the HTML.
You gave me the code "WebBrowser1.DocumentText" to get the site's source code, and it works very well, but the encoding is wrong (Hebrew letters are displayed as question marks [?]).
Site
And the last question (really lol):
Moti's code works very well (I just tested it wrong before).
Code:
onclick='infowin(15555)'>
I want to take the variable (marked in red) from the website.
Now I get the variable, but I also get the brackets around it.
This is what I do:
Code:
Dim mCollect As MatchCollection = Regex.Matches(WebBrowser1.DocumentText, "(?<='infowin).*?(?='>)")
This is what it gives me:
(15555)
I need only the number, without the brackets.
Once again, thanks for the help, and I'm sorry for not being so clear...
-
Re: WebBrowser - Html: Get values.
Well, you can do it with the regular expression, but I'm not good at those, so... once you have the (15555) you can use Substring to drop the first and last character and get the number:
.Substring(1, numberVariable.Length - 2)
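If you would rather let the regular expression do the trimming, a capture group can return just the digits. This is a rough sketch, not tested against the real site: the names InfowinSketch and ExtractInfowinId are made up here, and it assumes the argument of infowin is always a plain number.

```vbnet
Imports System.Text.RegularExpressions

Module InfowinSketch
    ' Hypothetical helper: returns the digits inside infowin(...),
    ' or Nothing when the text contains no such call.
    Function ExtractInfowinId(ByVal html As String) As String
        ' \( and \) match the literal brackets; (\d+) captures only the number
        Dim m As Match = Regex.Match(html, "infowin\(\s*(\d+)\s*\)", RegexOptions.IgnoreCase)
        If m.Success Then Return m.Groups(1).Value
        Return Nothing
    End Function
End Module
```

Passing WebBrowser1.DocumentText (or the OuterHtml of a single <a> element) to ExtractInfowinId should then give 15555 without the brackets, so no Substring call is needed.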
-
Re: WebBrowser - Html: Get values.
"numberVariable" show error
And what about the encoding?
Maybe changing the encoding on the RichTextBox would help? (How is that possible?)
-
Re: WebBrowser - Html: Get values.
you need to store the "(15555)" in a variable. then, you use that variable where I put "numberVariable". You can name it whatever you want.
-
Re: WebBrowser - Html: Get values.
Quote:
Originally Posted by
stateofidleness
you need to store the "(15555)" in a variable. then, you use that variable where I put "numberVariable". You can name it whatever you want.
thx
What about the encoding? :(
-
Re: WebBrowser - Html: Get values.
I can't view the site at work (it gets blocked), but I'm still not sure what the question/problem is? is the HTML source code returned in the HEBREW language? Does it even matter? if you're just trying to click that button, it should work no?
-
Re: WebBrowser - Html: Get values.
The site has Hebrew in it. When I take/download the HTML source from the site, the Hebrew turns into question marks:
http://img46.imageshack.us/img46/6301/29041057.png
Create a new HTML file and paste the following code:
HTML Code:
<HTML>
<HEAD>
</HEAD>
<BODY id="bbody" background="/i/bg.jpg" bgcolor="#282828" onload="mess();">
שם: <b>דניאל</b><BR>
מין: <b>זכר</b><BR>
</BODY>
</HTML>
Open a new form and paste the following code:
*with: Button1, TextBox1 and WebBrowser1
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' Patch holds the path to the HTML file created above
    WebBrowser1.Navigate(Patch)
    TextBox1.Text = WebBrowser1.DocumentText
End Sub
Now instead of the Hebrew text you get the following sign: "�"
If you cannot see Hebrew on your computer, here is a picture so you will understand:
http://img819.imageshack.us/img819/3932/57966555.png
-
Re: WebBrowser - Html: Get values.
try:
vb.net Code:
Imports System
Imports System.Web 'New
Imports System.Text.RegularExpressions
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' HtmlDecode is a Shared method, so no HttpUtility instance is needed
        Dim NewSource As String = HttpUtility.HtmlDecode(WebBrowser1.DocumentText)
        Dim mCollect As MatchCollection = Regex.Matches(NewSource, "(?<=<b>).*?(?=</b>)", RegexOptions.IgnoreCase)
        For Each m As Match In mCollect
            MsgBox(m.Value)
        Next
    End Sub
End Class
-
Re: WebBrowser - Html: Get values.
yep, change
Imports System.Web
to
Imports System.Net.WebUtility
-
Re: WebBrowser - Html: Get values.
http://img222.imageshack.us/img222/6048/36815121.png
I tried:
VB.net Code:
Imports System
Imports System.Text.RegularExpressions
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim NewSource As String = Nothing
        ' Decode HTML entities in the page source before searching it
        NewSource = Net.WebUtility.HtmlDecode(WebBrowser1.DocumentText)
        Dim mCollect As MatchCollection = Regex.Matches(NewSource, "(?<=<b>).*?(?=</b>)", RegexOptions.IgnoreCase)
        For Each m As Match In mCollect
            MsgBox(m.Value)
        Next
    End Sub
End Class
But the problem with Hebrew has not changed.
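HtmlDecode only converts entities such as &amp;amp;, so it would not be expected to change how the page's bytes are turned into text. One thing that might be worth trying is to read WebBrowser1.DocumentStream yourself with an explicit encoding instead of using DocumentText. The sketch below assumes the page is served in a non-UTF-8 code page (windows-1255 is only a guess for a Hebrew site); EncodingSketch and ReadHtml are made-up names.

```vbnet
Imports System.IO
Imports System.Text

Module EncodingSketch
    ' Hypothetical helper: reads the whole stream as text using the named
    ' encoding (for example "windows-1255" or "utf-8").
    Function ReadHtml(ByVal source As Stream, ByVal encodingName As String) As String
        Dim enc As Encoding = Encoding.GetEncoding(encodingName)
        ' Start from the beginning when the stream allows it
        If source.CanSeek Then source.Position = 0
        ' The reader is not disposed here, so the caller keeps ownership of the stream
        Dim reader As New StreamReader(source, enc)
        Return reader.ReadToEnd()
    End Function
End Module
```

A call could look like TextBox1.Text = EncodingSketch.ReadHtml(WebBrowser1.DocumentStream, WebBrowser1.Document.Encoding), made once the page has finished loading. Document.Encoding should report the encoding the browser used; if that still shows question marks, passing "windows-1255" directly is the next guess.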
-
Re: WebBrowser - Html: Get values.
Daniel, you need to install Hebrew from the Windows disc, as the browser says, or install the Hebrew LANGUAGE PACK.
(I told him to install Hebrew.)
-
Re: WebBrowser - Html: Get values.
You can also add a RichTextBox and paste the HTML source code into it,
then in the code:
Code:
Imports System
Imports System.Text.RegularExpressions
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Search the HTML that was pasted into the RichTextBox
        Dim mCollect As MatchCollection = Regex.Matches(RichTextBox1.Text, "(?<=<b>).*?(?=</b>)", RegexOptions.IgnoreCase)
        For Each m As Match In mCollect
            MsgBox(m.Value)
        Next
    End Sub
End Class
-
Re: WebBrowser - Html: Get values.
By the way, thank you very much for your help. You solved one problem for me and I am very grateful for that (:
---------------------------------------------------------------------------
Ignore this comment
Unfortunately, Moti's last comment did not help me.
-
Re: WebBrowser - Html: Get values.
Maybe if we had the site's link.
-
Re: WebBrowser - Html: Get values.
Quote:
Originally Posted by
moti barski
maybe if we have the sites link
I wrote above :)
ClickOnMe