-
Feb 13th, 2020, 06:33 PM
#1
Thread Starter
New Member
Get regex from web source
Hi
I was wondering if there is a way to obtain data from a web page using regex.
I am using VB.NET in Visual Studio 2019 (Community edition).
I have used Regex Builder to obtain the regex for the dog's name ((?<=\<h4\>).+.+?(?=\<\/h4\>)), but how would I write the code to print the name to a textbox?
The web source is https://greyhoundsform.betfair.com/racingform
What I need is the VB code that uses the regex to obtain the dog's name (Glenside Ariel) and the trainer's name (D K Hurlock).
This is the HTML from the web source.
<th class="header dog1" colspan="11"><h4>Glenside Ariel</h4><div class="trainer"><strong>Trainer: </strong><span>D K Hurlock</span></div><div class="breeding">Superior Product - Lisneal Emily (Mar '17)</div></th>
Any assistance would be gratefully appreciated.
Thanks in advance
-
Feb 13th, 2020, 07:40 PM
#2
Re: Get regex from web source
Here's an example of using regex to parse an HTML page source:
https://www.dotnetperls.com/regex-vbnet
Parsing HTML is not the most reliable approach. The page source can and probably will change over time, leaving your code not working. If Betfair publishes an XML or JSON file, that would be a much more reliable source...
-
Feb 14th, 2020, 04:38 AM
#3
Thread Starter
New Member
Re: Get regex from web source
Originally Posted by .paul.
Here's an example of using regex to parse an HTML page source:
https://www.dotnetperls.com/regex-vbnet
Parsing HTML is not the most reliable approach. The page source can and probably will change over time, leaving your code not working. If Betfair publishes an XML or JSON file, that would be a much more reliable source...
Hello .paul
Many thanks for your reply and link.
I have copied and pasted the XML file into Notepad.
How would I use this to obtain the required information?
Thanks
-
Feb 14th, 2020, 05:07 AM
#4
Re: Get regex from web source
If you really do have an XML file, and not an HTML file as I suspect, there are loads of examples on Google of how to read an XML file.
-
Feb 14th, 2020, 07:00 AM
#5
Re: Get regex from web source
You could use a WebBrowser control to navigate to the page and display the content in a RichTextBox.
Add a WebBrowser control and a RichTextBox to your form.
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    WebBrowser1.Navigate("https://greyhoundsform.betfair.com/racingform")
End Sub
Code:
Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
    ' Document is Nothing until the page has finished loading
    If WebBrowser1.Document IsNot Nothing AndAlso WebBrowser1.Document.Body IsNot Nothing Then
        RichTextBox1.Text = WebBrowser1.Document.Body.InnerText
    End If
End Sub
The result in the RichTextBox:
Code:
Betfair Greyhound Racing Guide Select Meeting...Central Park, 14th FebruaryCrayford, 14th FebruaryHarlow, 14th FebruaryHenlow, 14th FebruaryHove, 14th FebruaryKinsley, 14th FebruaryMonmore, 14th FebruaryNewcastle, 14th FebruaryNottingham, 14th FebruaryPeterborough, 14th FebruaryRomford, 14th FebruarySheffield, 14th FebruarySunderland, 14th FebruarySwindon, 14th February Select Race...18:33 D2 265m18:49 A1 480m19:07 D4 265m19:23 HP 480m19:38 A4 480m19:54 A1 480m20:11 A5 480m20:27 A3 480m20:42 D1 265m20:57 A2 480m21:16 A1 480m Last six races form summary:
Winners at track:1, 3, 4, 6Quickest race time:2
Winners at grade:Quickest break:
Winners over distance:1, 3, 4, 6Most wins:6
Enable/Disable Greyhound form help : ONCentral Park, 14th February - 18:33 D2 265m
Casabuca
Trainer: J T FosterKinloch Brae - Carnis Kate (Oct '15)
07 Feb26520.006-5-4thmsd brk, rls16.85-10D217.19
30 Jan26520.0011111strls16.97-15T116.82
29 Dec26510.001-3-2ndep, crd1, ran on16.94-20D216.78
08 Dec26510.002-3-4thblk1, fcd to ck216.78-10D217.10
28 Nov26520.003-3-4thcrd116.90-15D216.91
Chesterfield Oak
Trainer: S MavriasKnockglass Billy - Madams Babe (Mar '17)
07 Feb26510.004-2-2ndmsd brk, rls16.85-10D216.98
03 Feb26510.003-1-2ndep, ld1- rn in, cm ag16.95-10D216.86
27 Jan26510.003-4-2ndcrd1, rls mid17.03-30D216.91
20 Jan26520.003-3-3rdcrd116.70-10D216.80
14 Jan26520.001-2-4thep, sn ld- blk217.01-20D216.96
Bellside Well
Trainer: G L DavidsonGreenwell Hulk - Bellside Merry (Aug '16)
10 Feb26530.002-2-2ndep, mid, ev ch16.89-25D216.72
03 Feb26540.004-6-6thbmp116.95-10D217.50
28 Jan26540.002-2-2ndmid17.20-30T217.08
31 Dec26540.0011111stmiddle17.13-10T117.03
22 Nov26530.003-4-4thmid wide, ev ch16.65-15D116.86
Ballykevin Davy
Trainer: M N FenwickDroopy Jet - Fortwilliam Hawk (Sep '16)
10 Feb26540.005-5-5thmid, bmp 1/216.89-25D216.94
04 Feb26540.006-6-6thmsd brk, mid16.97-10D217.10
29 Jan26540.006-6-6thcrowded1, baulked216.98-20D217.28
24 Jan26530.004-2-1step, ld116.97-15D316.82
16 Jan26540.002-3-3rdep, chl rn in16.96-15D316.86
Let Her Linger
Trainer: A M P CollettVans Escalade - Mongys Girl (Jul '16)
10 Feb26550.006-4-4ths aw, mid wide, bmp rn in16.89-25D216.92
03 Feb26550.006-5-5ths aw, mid wide16.95-10D217.06
28 Jan26550.004-4-3rdbadly crowded117.03-30D217.19
22 Jan26550.001-3-3rdq aw, ld-116.76-10D216.98
16 Jan26550.006-6-5ths aw, mid to wide16.82-15D216.99
Nuke Na Bansha
Trainer: J T FosterDroopys Jet - Nuke Lassie (Sep '16)
07 Feb26550.005-2-2ndwide, crd 1/216.86-10D217.09
28 Jan26550.005-6-6thblk& stb1, ck wide216.95-30D217.45
10 Jan26560.001-1-1stwide, crd& ld117.13-15T316.98
12 Dec26550.001-6-6thep, chl& blk 1/2, v wide217.20-15D217.40
06 Dec26560.001-1-1stwide, a led17.03-10T316.93
Information (such as trainer name, breeding, previous run information etc) is provided "as is" and is for guidance only.
Betfair does not guarantee the accuracy of this information and use of it to place bets is entirely at your own risk.
to hunt a species to extinction is not logical !
since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.
-
Feb 16th, 2020, 10:48 AM
#6
Thread Starter
New Member
Re: Get regex from web source
Hello ChrisE
Thank you for your reply.
Will I be able to use regex to parse the required data from the example that you have posted?
Thanks
-
Feb 16th, 2020, 10:57 AM
#7
Re: Get regex from web source
Regular expressions are perfectly sufficient for extracting the contents of a single tag. The whole argument from Stack Overflow was blown well out of context, since it was tongue-in-cheek. You are not parsing HTML, you are extracting from it.
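As a rough sketch of that kind of extraction, here is how the dog's name and trainer could be pulled out of the HTML fragment from post #1. The module and function names are made up for illustration, and the patterns assume the markup looks exactly like that fragment (an h4 for the name, and a span straight after the "Trainer:" label), so treat it as a starting point rather than a finished solution.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module DogExtractor

    ' Returns the text between <h4> and </h4>, or Nothing when there is no match.
    Function ExtractDogName(html As String) As String
        Dim m As Match = Regex.Match(html, "<h4>(.+?)</h4>")
        Return If(m.Success, m.Groups(1).Value.Trim(), Nothing)
    End Function

    ' Returns the text of the <span> that follows the "Trainer:" label,
    ' or Nothing when there is no match.
    Function ExtractTrainer(html As String) As String
        Dim m As Match = Regex.Match(html, "Trainer:\s*</strong>\s*<span>(.+?)</span>")
        Return If(m.Success, m.Groups(1).Value.Trim(), Nothing)
    End Function

    Sub Main()
        ' Shortened copy of the fragment posted in #1
        Dim html As String = "<th class=""header dog1"" colspan=""11""><h4>Glenside Ariel</h4>" &
            "<div class=""trainer""><strong>Trainer: </strong><span>D K Hurlock</span></div></th>"
        Console.WriteLine(ExtractDogName(html))  ' Glenside Ariel
        Console.WriteLine(ExtractTrainer(html))  ' D K Hurlock
    End Sub

End Module
```

On a form, the result would go to the textbox with something like TextBox1.Text = ExtractDogName(pageSource), where TextBox1 and pageSource are whatever names you use in your own project. To get every dog on the page rather than just the first, Regex.Matches with the same patterns returns all of the matches.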
-
Feb 16th, 2020, 11:11 AM
#8
Re: Get regex from web source
Originally Posted by ident
Regular expressions are perfectly sufficient for extracting the contents of a single tag. The whole argument from Stack Overflow was blown well out of context, since it was tongue-in-cheek. You are not parsing HTML, you are extracting from it.
HTML is not reliably consistent. What works today might not work tomorrow or next week...
-
Feb 16th, 2020, 11:25 AM
#10
Re: Get regex from web source
Originally Posted by BigDan
Hello ChrisE
Thank you for your reply.
Will I be able to use regex to parse the required data from the example that you have posted?
Thanks
What do you mean, extract from the RichTextBox?
I would save the result from the RichTextBox to a text file and then extract what I need.
Something like:
Code:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' Copy every line that mentions a trainer from the saved page text to a new file
    Using Writer As New System.IO.StreamWriter("D:\Trainers.txt") 'extract and write to file
        Using reader As New System.IO.StreamReader("D:\Dograce.txt") 'the text file with the data
            While Not reader.EndOfStream
                Dim line As String = reader.ReadLine()
                If line.Contains("Trainer") Then
                    Debug.Print(line)
                    Writer.WriteLine(line)
                End If
            End While
        End Using
    End Using
End Sub
Code:
debug output
Trainer: J T FosterKinloch Brae - Carnis Kate (Oct '15)
Trainer: S MavriasKnockglass Billy - Madams Babe (Mar '17)
Trainer: G L DavidsonGreenwell Hulk - Bellside Merry (Aug '16)
Trainer: M N FenwickDroopy Jet - Fortwilliam Hawk (Sep '16)
Trainer: A M P CollettVans Escalade - Mongys Girl (Jul '16)
Trainer: J T FosterDroopys Jet - Nuke Lassie (Sep '16)
hth
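If the dog's name is wanted as well, one possible variation is to remember the previous line while looping, since in the text dump from post #5 each "Trainer:" line sits directly under the dog's name. This is only a sketch under that assumption, and the module and function names are invented for the example. Note that InnerText drops the tags, so the trainer's name and the breeding run together on one line ("J T FosterKinloch Brae - ...") and cannot be split reliably from the text alone.

```vbnet
Imports System
Imports System.Collections.Generic

Module TrainerPairs

    ' Pairs every "Trainer:" line with the line directly above it,
    ' which in the InnerText dump holds the dog's name.
    Function PairDogsWithTrainerLines(lines As IEnumerable(Of String)) As List(Of KeyValuePair(Of String, String))
        Dim result As New List(Of KeyValuePair(Of String, String))()
        Dim previous As String = ""
        For Each line As String In lines
            If line.StartsWith("Trainer:") AndAlso previous <> "" Then
                ' Everything after the label: trainer and breeding, still run together
                Dim rest As String = line.Substring("Trainer:".Length).Trim()
                result.Add(New KeyValuePair(Of String, String)(previous.Trim(), rest))
            End If
            previous = line
        Next
        Return result
    End Function

    Sub Main()
        ' A few lines copied from the RichTextBox output in post #5
        Dim sample As String() = {
            "Casabuca",
            "Trainer: J T FosterKinloch Brae - Carnis Kate (Oct '15)",
            "07 Feb26520.006-5-4thmsd brk, rls16.85-10D217.19",
            "Chesterfield Oak",
            "Trainer: S MavriasKnockglass Billy - Madams Babe (Mar '17)"
        }
        For Each pair As KeyValuePair(Of String, String) In PairDogsWithTrainerLines(sample)
            Console.WriteLine(pair.Key & " | " & pair.Value)
        Next
    End Sub

End Module
```

In the real program the lines would come from System.IO.File.ReadLines on the saved text file instead of the hard-coded array.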
-
Feb 16th, 2020, 11:51 AM
#11
Re: Get regex from web source
Originally Posted by .paul.
HTML is not reliably consistent. What works today might not work tomorrow or next week...
That's not relevant to anything now, is it? Any HTML you read can change, so whatever method is used, the code could break at any time.
This famous Stack Overflow post has been transformed into utter garbage.
-
Feb 16th, 2020, 11:59 AM
#12
Re: Get regex from web source
That is very relevant. I like my code to continue working once I've written it...
-
Feb 16th, 2020, 03:42 PM
#13
Re: Get regex from web source
Then why can you not understand the difference between parsing HTML with regex and extracting elements using regex? Of course, if you forcefully change an element that should be matched, there will be no successful match. No one ever disputed this.
Extracting elements is not parsing.