I am parsing an HTML file with regex and trying to extract the text between two patterns. If I mark the start of the match with a lookahead (?=), it works, but the opening tag is included in the result. If I mark it with a lookbehind (?<=) so that the match begins after the opening tag, it doesn't work.

Sample HTML Doc
Code:
    <table cellpadding="0" cellspacing="0" style="width:100%;" class="f-bold">
    
    
              <tr>
      
      <td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here
      </td>
      
            
            
    
      
      <td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here
      </td>
      
              </tr>
            
              <tr>
          <td colspan="2">
            <div class="divider"></div>
          </td>
        </tr>
            
    
                            <tr>
      
      <td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here
      </td>
      
            
            
    
      
      <td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here
      </td>
      
              </tr>

                .....
Using this code:
VB Code:
    Dim pattern As String = "(?=<td valign.*>).*?(?=</td>)"
    Dim reg As New Regex(pattern, RegexOptions.Singleline)
    Dim mc As MatchCollection = reg.Matches(pg)

mc has 4 values, each of which still includes the opening <td ...> tag:
Code:
<td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here

<td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here

<td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here

<td valign="top" class="pr-10" style="width:50%;">
        <a href="http://www.aUrl">First Value Here</a><br />
        Second Value Here
Next, I tried a lookbehind so that the opening <td ...> tag is not returned in the value:
VB Code:
    Dim pattern As String = "(?<=<td valign.*>).*?(?=</td>)"
    Dim reg As New Regex(pattern, RegexOptions.Singleline)
    Dim mc As MatchCollection = reg.Matches(pg)

The first entry is correct, but the next one is not:
Code:
        <a href="http://www.aUrl.com">First Value</a><br />
        Second Value

'then

      
            
            
    
      
      <td valign="top" class="pl-10" style="width:50%;">
       <a href="http://www.aUrl.com">First Value</a><br />
        Second Value
Notice all the whitespace, and the opening <td ...> tag, at the start of the second entry. What should I do to get only what is inside the <td ...> </td>?
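
One direction that might be worth trying, offered as a sketch rather than a confirmed fix: with RegexOptions.Singleline, the .* inside the lookbehind can run across several tags, so the lookbehind can succeed right after a closing </td> as well. Restricting it to [^>]* keeps it inside a single opening tag, and a capture group plus Trim() returns only the cell contents. The module name and the shortened pg string below are made up for illustration; the pattern is an assumption about what the cells look like, not a general HTML parser.

```vb
Imports System
Imports System.Text.RegularExpressions

Module TdContentSketch
    Sub Main()
        ' Hypothetical, shortened stand-in for the page text held in pg.
        Dim nl As String = Environment.NewLine
        Dim pg As String =
            "<tr><td valign=""top"" class=""pr-10"">" & nl &
            "  <a href=""http://www.aUrl"">First Value Here</a><br />" & nl &
            "  Second Value Here" & nl &
            "</td>" & nl &
            "<td valign=""top"" class=""pr-10"">" & nl &
            "  <a href=""http://www.aUrl"">First Value Here</a><br />" & nl &
            "  Second Value Here" & nl &
            "</td></tr>"

        ' [^>]* cannot cross the end of the opening tag, so each match
        ' starts at one <td valign ...> and stops at the nearest </td>.
        Dim pattern As String = "<td valign[^>]*>(.*?)</td>"
        Dim reg As New Regex(pattern, RegexOptions.Singleline)

        For Each m As Match In reg.Matches(pg)
            ' Group 1 is the text between the tags; Trim() drops the
            ' surrounding whitespace and line breaks.
            Console.WriteLine(m.Groups(1).Value.Trim())
            Console.WriteLine("----")
        Next
    End Sub
End Module
```

If the lookbehind form is preferred, the same [^>]* change should apply there too, i.e. "(?<=<td valign[^>]*>).*?(?=</td>)", with Trim() called on each match value.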