-
Sep 4th, 2017, 12:47 PM
#1
Thread Starter
New Member
Grab text from the web [how to?]
Hi, I'm not a rookie, but I'm not (at all) an expert in VB. A few months ago I created a simple program to auto-renew my books from my school's library. The code works well: I just use a WebBrowser control to navigate to the site and then click the buttons with a timer (which takes the internet speed into account by adjusting the start delay). E.g.: WebBrowser1.Document.GetElementById("login").SetAttribute("value", TextBox1.Text) or WebBrowser1.Document.GetElementById("btn_gravar").InvokeMember("click") ... simple as that.
But now I want (after the renewal) to extract some information from the page. I searched a lot of tutorials (some with almost the same problem as mine), but none of them worked. It looks so simple, but I can't find a way to make it work. Can you help me?
Here is the website code (since you can't log in to see it).
What I want is:
(1) Return until: 14/09/2017
(2) Total of renewals performed: 0
(3) Reference: MONTEIRO, Washington de Barros.
Code:
<div id="1b" style="">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody><tr>
<td class="box_do_detalhes">
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tbody><tr>
<td colspan="2" class="box_f7f7f7_c">Reference: MONTEIRO, Washington de Barros. <b> Curso de direito civil. </b> <b></b> 36. ed. São Paulo: Saraiva, 2001. 350 p. ISBN 8502020439 </td>
</tr>
<tr>
<td class="box_f7f7f7_c"><em><strong>Call number: 342.1 M775c 2001 (BU-JC)</strong></em></td>
<td class="box_fffff_c">Unidade de Informação source: Biblioteca </td>
</tr>
<tr>
<td class="box_f7f7f7_c">Type of loan: Estágio </td>
<td class="box_fffff_c">Description: v.2 , nº 4 </td>
</tr>
<tr>
<td class="box_f7f7f7_c">Date of loan: 31/08/2017 07:52:55</td>
<td class="box_fffff_c">Return until: <strong>14/09/2017</strong></td>
</tr>
<tr>
<td class="box_f7f7f7_c">Fine partial amount: $ 0</td>
<td class="box_fffff_c">Total of renewals performed: 0</td>
</tr>
</tbody></table>
</td>
</tr>
</tbody></table>
</div>
Here's a second example (a second book). The ids go on as 1b, 2b, 3b, 4b, 5b, 6b, and 7b (since 7 books is the maximum you may have on loan).
Code:
<div id="2b" style="">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tbody><tr>
<td class="box_do_detalhes">
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tbody><tr>
<td colspan="2" class="box_f7f7f7_c">Reference: FIUZA, César. <b> Direito civil: </b> curso completo. <b></b> 12. ed., rev., atual. e ampl. Belo Horizonte: Del Rey, 2008. xxiv, 1084 p. ISBN 9788573089868. </td>
</tr>
<tr>
<td class="box_f7f7f7_c"><em><strong>Call number: 342.1 F565d 2008 (BU-JC)</strong></em></td>
<td class="box_fffff_c">Unidade de Informação source: Biblioteca </td>
</tr>
<tr>
<td class="box_f7f7f7_c">Type of loan: Estágio </td>
<td class="box_fffff_c">Description: nº 6 </td>
</tr>
<tr>
<td class="box_f7f7f7_c">Date of loan: 31/08/2017 07:53:07</td>
<td class="box_fffff_c">Return until: <strong>14/09/2017</strong></td>
</tr>
<tr>
<td class="box_f7f7f7_c">Fine partial amount: $ 0</td>
<td class="box_fffff_c">Total of renewals performed: 0</td>
</tr>
</tbody></table>
</td>
</tr>
</tbody></table>
</div>
-
Sep 4th, 2017, 01:55 PM
#2
Re: Grab text from the web [how to?]
You can try something like this, and do any additional parsing using some basic string methods.
Code:
Dim elems As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("tr")
For Each tr As HtmlElement In elems
    Dim colTD As HtmlElementCollection = tr.GetElementsByTagName("td")
    For Each td As HtmlElement In colTD
        Debug.WriteLine(td.InnerText) ' add this text to a list or something.
    Next td
Next tr
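If only a few fields are needed, the loop above could be narrowed down with a small helper. This is a minimal, untested sketch: CellValue and the LoanCells module are hypothetical names, and it assumes the page has finished loading and that each book sits in a div with an id like "1b", as in the markup from post #1. Cells that wrap a nested table are skipped, because their InnerText contains the text of every cell inside them.

```vbnet
Imports System
Imports System.Windows.Forms

Module LoanCells
    ' Returns the text that follows `label` in the first leaf <td> under
    ' `container`, or "" when no cell starts with that label.
    ' (Hypothetical helper; not part of the WebBrowser API.)
    Function CellValue(container As HtmlElement, label As String) As String
        If container Is Nothing Then Return ""
        For Each td As HtmlElement In container.GetElementsByTagName("td")
            ' Skip wrapper cells that hold a nested table.
            If td.GetElementsByTagName("table").Count > 0 Then Continue For
            Dim text As String = If(td.InnerText, "").Trim()
            If text.StartsWith(label, StringComparison.Ordinal) Then
                Return text.Substring(label.Length).Trim()
            End If
        Next
        Return ""
    End Function
End Module
```

For the first book it might be called as CellValue(WebBrowser1.Document.GetElementById("1b"), "Return until:"). Note that for "Reference:" this returns the whole citation, not just the author, so the author would still need to be cut out with string methods.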
And here is one way to parse the HTML text to pull the data out:
Code:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    ' html_source_text holds the HTML source of the page.
    ' The date is wrapped in <strong> tags inside the "Return until:" cell.
    Dim tmp = TextBetween(html_source_text, ">Return until:", "</td>")
    Dim returnUntil = TextBetween(tmp, "<strong>", "</strong>").Trim
    MsgBox("Return until: " & returnUntil)
    Dim totalRenews = TextBetween(html_source_text, ">Total of renewals performed:", "</td>").Trim
    MsgBox("Total of renewals performed: " & totalRenews)
    ' The author's name ends where the bold title begins.
    Dim reference = TextBetween(html_source_text, ">Reference:", "<b>").Trim
    MsgBox("Reference: " & reference)
End Sub

' Returns the text between the first findFirst and the next findSecond,
' or "" if either marker is missing.
Private Function TextBetween(mainText As String, findFirst As String, findSecond As String) As String
    Dim l = mainText.IndexOf(findFirst)
    If l < 0 Then Return "" ' first marker not found
    l += findFirst.Length
    Dim r = mainText.IndexOf(findSecond, l)
    Return If(r > l, mainText.Substring(l, r - l), "")
End Function
Last edited by Edgemeal; Sep 5th, 2017 at 10:52 AM.
-
Sep 5th, 2017, 01:46 PM
#3
Thread Starter
New Member
Re: Grab text from the web [how to?]
Thank you, sir,
but there is one problem here: it says that html_source_text is not declared. How should I proceed?
-
Sep 5th, 2017, 11:47 PM
#4
Re: Grab text from the web [how to?]
html_source_text is not declared
You may replace that variable with the actual page source, such as the DocumentText property of the WebBrowser control.
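As a rough sketch of how that could look for all seven books, the source can be cut into one block per book id before pulling out the fields. The html string below is a trimmed-down stand-in for WebBrowser1.DocumentText, based on the markup in post #1; the LoanParser module name is hypothetical, and TextBetween is the helper from post #2 with a check for a missing first marker. It assumes the raw source quotes the ids the same way as the posted markup and that there are no nested divs inside a book block, which may not hold on the real page.

```vbnet
Imports System

Module LoanParser
    ' Returns the text between the first findFirst and the next findSecond,
    ' or "" if either marker is missing.
    Function TextBetween(mainText As String, findFirst As String, findSecond As String) As String
        Dim l As Integer = mainText.IndexOf(findFirst)
        If l < 0 Then Return ""
        l += findFirst.Length
        Dim r As Integer = mainText.IndexOf(findSecond, l)
        Return If(r > l, mainText.Substring(l, r - l), "")
    End Function

    Sub Main()
        ' Stand-in for WebBrowser1.DocumentText (assumption: same markup as post #1).
        Dim html As String =
            "<div id=""1b""><td>Return until: <strong>14/09/2017</strong></td>" &
            "<td>Total of renewals performed: 0</td></div>"

        For i As Integer = 1 To 7
            ' Cut out the block for book i; stop at the first id that is absent.
            Dim block As String = TextBetween(html, "<div id=""" & i & "b""", "</div>")
            If block = "" Then Exit For

            Dim cell As String = TextBetween(block, ">Return until:", "</td>")
            Dim due As String = TextBetween(cell, "<strong>", "</strong>").Trim()
            Dim renewals As String = TextBetween(block, ">Total of renewals performed:", "</td>").Trim()
            Console.WriteLine("Book " & i & ": due " & due & ", renewals " & renewals)
        Next
    End Sub
End Module
```

In the form itself, the html variable would be assigned from WebBrowser1.DocumentText once the page has finished loading, and the results shown in a MsgBox or a list instead of the console.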
- kgc