regex help - first occurrence
I am trying to get my regex to match just one table row and return its value. Right now it isn't working at all, and I'm guessing that's because it isn't stopping at the first occurrence.
How do I write the regex to find the first <tr><td>Description</td><td></td></tr>?
HTML Code:
<tr style="background-color:#EFF3FF;"> <td style="background-color:#DEE8F1;font-weight:bold;">Item</td><td>BOX123</td> </tr><tr style="background-color:White;"> <td style="background-color:#DEE8F1;font-weight:bold;">Description</td><td>Box 1 with blah blah</td> </tr><tr style="background-color:#EFF3FF;"> <td style="background-color:#DEE8F1;font-weight:bold;">Color</td><td>White</td> </tr><tr style="background-color:White;"> <td style="background-color:#DEE8F1;font-weight:bold;">Quantity</td><td>4</td> </tr>
Code:
Function GetItem(ByVal sHTML As String, ByVal sLabel As String) As String
    Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Multiline
    Dim re As Regex = New Regex("<tr style="".*""> <td style="".*"">" & sLabel & "<\/td><td>(?<returnval>.*)<\/td> <\/tr>", options)
    Dim sRetVal As String = ""
    Dim mc As MatchCollection = re.Matches(sHTML)
    For Each m As Match In mc
        sRetVal = m.Groups("returnval").Value
    Next
    Return sRetVal
End Function
Re: regex help - first occurrence
This would be far easier to do using HTML elements. Is there a particular reason for using Regex?
Also it's far from clear exactly what you want to capture. Could you please indicate what part exactly of the line ...
<tr style="background-color:#EFF3FF;"> <td style="background-color:#DEE8F1;font-weight:bold;">Item</td><td>BOX123</td> </tr>
... you want returned?
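For what it's worth, here is a minimal sketch of the element-based approach, assuming the goal is to pass in a label such as Color and get back the second cell of that row. It uses XElement from System.Xml.Linq, which only works if the fragment is well-formed XML (closed tags, quoted attributes, lowercase tr/td names, no entities like &nbsp;). Real-world HTML often is not, and a dedicated HTML parser would be the safer choice there. GetItemByElements is a made-up name for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module ElementLookupSketch
    ' Hypothetical helper: returns the second cell of the first row whose
    ' first cell equals sLabel (ignoring case), or "" if there is no such row.
    Function GetItemByElements(ByVal sHTML As String, ByVal sLabel As String) As String
        ' Wrap the rows in a root element so the fragment parses as one document.
        ' Assumption: the markup is well-formed XML; XElement.Parse throws otherwise.
        Dim table As XElement = XElement.Parse("<table>" & sHTML & "</table>")
        For Each row As XElement In table.Elements("tr")
            Dim cells As List(Of XElement) = row.Elements("td").ToList()
            If cells.Count >= 2 AndAlso
               String.Equals(cells(0).Value.Trim(), sLabel, StringComparison.OrdinalIgnoreCase) Then
                Return cells(1).Value
            End If
        Next
        Return ""
    End Function

    Sub Main()
        Dim html As String = "<tr style=""background-color:#EFF3FF;""> <td style=""font-weight:bold;"">Item</td><td>BOX123</td> </tr>" &
                             "<tr style=""background-color:White;""> <td style=""font-weight:bold;"">Color</td><td>White</td> </tr>"
        Console.WriteLine(GetItemByElements(html, "Color"))
    End Sub
End Module
```

Because the lookup walks rows and cells, the style attributes never enter into it, so a change in row colors would not break anything.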
Re: regex help - first occurrence
I've just used regex before when trying to grab data from HTML source code.
What do I want to capture? Well, if you look at my regex code you will see I am trying to capture a specific row in the html source code.
If you strip out the table row styles this is how it would look:
HTML Code:
<tr><td>Item</td><td>BOX123</td></tr>
<tr><td>Description</td><td>Box 1 with blah blah</td></tr>
<tr><td>Color</td><td>White</td></tr>
<tr><td>Quantity</td><td>4</td></tr>
If you strip it down even further, I have 4 table rows with 2 columns.
Item|BOX123
Description|Box 1 with blah blah
Color|White
Quantity|4
What my function should do is take, say, 'Color' and return the value 'White'. Or I pass in 'Quantity' and it returns 4. I have more than just these 4 rows to get data from, but I wanted to keep it simple.
Do I need to use regex? No
Would it be cool to know how to get it to work? Yes
Could I do a substring with split and then other text manipulation? Yes, but I was thinking I could get the value I need with a single regular expression.
Just looking for the best solution.
Thanks in advance for any assistance.
Re: regex help - first occurrence
Try this. It's a little function I wrote to search text and narrow it down to the part I want.
Code:
Public Class Form1
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Find "Description", then the next "<td>", and show the text up to "</td>".
        MsgBox(strbetween(TextBox1.Text, "Description|<td>", "</td>"))
    End Sub

    Function strbetween(ByVal text As String, ByVal starttexts As String, ByVal endtext As String) As String
        strbetween = Nothing
        Dim starttextarray As String() = Split(starttexts, "|")
        Dim startpos As Integer = 1
        Try
            ' Walk through the start keys in order; each search begins where the previous key ended.
            For i As Integer = 0 To UBound(starttextarray)
                startpos = InStr(startpos, text, starttextarray(i)) + Len(starttextarray(i))
            Next
            ' InStr and Mid use 1-based positions.
            Dim endpos As Integer = InStr(startpos, text, endtext)
            strbetween = Mid(text, startpos, endpos - startpos)
        Catch ex As Exception
            ' On failure the exception message is returned in place of a value.
            strbetween = ex.Message
        End Try
    End Function
End Class
strbetween function explanation:
strbetween(FIRST PART, SECOND PART, THIRD PART)
First = the text to search
Second = the keywords to search for, in order, separated by the | symbol. If a single keyword is unique enough to find the spot, one is all you need.
Third = the ending word or symbol
It returns the text between the last keyword and the ending text.
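One thing to watch with the function above: InStr returns 0 when a keyword is not found, and the loop adds the keyword length to that and carries on, so a missing keyword does not stop the search. Here is a rough sketch of a variant that returns Nothing in that case. StrBetweenOrNothing is a made-up name, and it takes the keywords as an array and uses the 0-based String.IndexOf, which are my choices, not part of the original.

```vbnet
Imports System

Module StrBetweenSketch
    ' Hypothetical variant of strbetween: returns Nothing when any
    ' keyword or the end text cannot be found.
    Function StrBetweenOrNothing(ByVal text As String, ByVal startKeys As String(), ByVal endText As String) As String
        Dim pos As Integer = 0   ' String.IndexOf is 0-based, unlike InStr
        For Each key As String In startKeys
            Dim found As Integer = text.IndexOf(key, pos, StringComparison.Ordinal)
            If found < 0 Then Return Nothing
            pos = found + key.Length   ' continue searching after this keyword
        Next
        Dim endPos As Integer = text.IndexOf(endText, pos, StringComparison.Ordinal)
        If endPos < 0 Then Return Nothing
        Return text.Substring(pos, endPos - pos)
    End Function

    Sub Main()
        Dim html As String = "<td>Description</td><td>Box 1 with blah blah</td>"
        Console.WriteLine(StrBetweenOrNothing(html, New String() {"Description", "<td>"}, "</td>"))
    End Sub
End Module
```

The caller can then test the result with Is Nothing and does not have to tell an error message apart from a real value.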
Re: regex help - first occurrence
I believe I figured it out with regex:
Code:
Function GetItem(ByVal sHTML As String, ByVal sLabel As String) As String
    Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Multiline
    Dim re As Regex = New Regex("<tr .*?>.*?<td .*?>" & sLabel & "</td><td>(?<returnval>.*?)</td>.*?</tr>", options)
    Dim sRetVal As String = ""
    Dim mc As MatchCollection = re.Matches(sHTML)
    For Each m As Match In mc
        sRetVal = m.Groups("returnval").Value
    Next
    Return sRetVal
End Function
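What makes this version work is the lazy quantifier: .*? stops at the nearest text that lets the rest of the pattern match, where the greedy .* in the first attempt runs on to the last possible one. Two caveats, as far as I can tell: the For Each loop keeps overwriting sRetVal, so if a label appears in more than one row you get the last match, not the first, and RegexOptions.Multiline only changes how ^ and $ behave (it does not let . cross line breaks). Below is a rough sketch of a first-match variant. GetFirstItem, the simplified pattern, and the use of Regex.Escape and RegexOptions.Singleline are my own assumptions about what is wanted.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstMatchSketch
    ' Hypothetical variant of GetItem: returns the value from the first
    ' row whose label cell matches sLabel, or "" if nothing matches.
    Function GetFirstItem(ByVal sHTML As String, ByVal sLabel As String) As String
        ' Singleline lets "." match line breaks inside a cell.
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        ' Regex.Escape protects labels containing characters such as "." or "(".
        ' [^>]* skips any attributes without running past the end of the tag.
        Dim pattern As String = "<td[^>]*>\s*" & Regex.Escape(sLabel) & "\s*</td>\s*<td[^>]*>(?<returnval>.*?)</td>"
        ' Regex.Match returns only the first match in the input.
        Dim m As Match = Regex.Match(sHTML, pattern, options)
        If m.Success Then Return m.Groups("returnval").Value
        Return ""
    End Function

    Sub Main()
        Dim html As String = "<tr style=""a""> <td style=""b"">Item</td><td>BOX123</td> </tr>" & _
                             "<tr style=""a""> <td style=""b"">Color</td><td>White</td> </tr>" & _
                             "<tr style=""a""> <td style=""b"">Color</td><td>Red</td> </tr>"
        ' The label "Color" appears twice; only the first row's value is returned.
        Console.WriteLine(GetFirstItem(html, "Color"))
    End Sub
End Module
```

Anchoring on the two td cells instead of the whole tr keeps the pattern short and avoids depending on the row's style attribute.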