
Thread: [RESOLVED] [2005] - Help with regular expression

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    [RESOLVED] [2005] - Help with regular expression

    Hi All,

    I am trying to extract 2 values from a string and place them into separate groups.

    String:

    Code:
    1.6919</td>
    <td class="Normal" align="Right">1.6835</td>
    <td class="Normal" align="Right">27 Mar 08</td>
    <td class="Normal" align="Right">6 Monthly</td>
    Values are 1.6919 and 1.6835.

    Below is the code I am using:

    Code:
    Const myPattern As String = "(?<entryprice>[\d.{1}+[\d.{4}])" & _
                                ".*?" & _
                                "(?<entryprice>[\d.{1}+[\d.{4}])" & _
                                ".*?"
    
    For Each m As Match In Regex.Matches(mystring, myPattern)
        Dim EntryPriceE As String = m.Groups("entryprice").ToString
        Dim ExitPriceE As String = m.Groups("exitprice").ToString
    Next
    I have read a whole bunch of regex articles but still can't seem to get it correct.

  2. #2
    Fanatic Member vijy's Avatar
    Join Date
    May 2007
    Location
    India
    Posts
    548

    Re: [2005] - Help with regular expression

    Will the values always be in this same format, i.e. *.**** (like 1.6919 and 1.6835), with a decimal point?
    Visual Studio.net 2010
    If this post is useful, rate it


  3. #3

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Yes, that is correct.

    These values are coming from a stock market website so the values could even be:

    2.4512
    12.1451
    12.45
    321.4567
    4586.1287

    etc...

  4. #4
    Fanatic Member vijy's Avatar
    Join Date
    May 2007
    Location
    India
    Posts
    548

    Re: [2005] - Help with regular expression

    ^([0-9]+^).^([0-9]+^)

    This regex is enough to find the string in the file.


  5. #5

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Using either of the following patterns returns nothing:

    Code:
    Const myPattern As String = "(?<entryprice>^([0-9]+^).^([0-9]+^))"
    Code:
    Const myPattern As String = "(?<entryprice>^([0-9]+^).^([0-9]+^))" & _ '".*?"

  6. #6
    PowerPoster stanav's Avatar
    Join Date
    Jul 2006
    Location
    Providence, RI - USA
    Posts
    9,290

    Re: [2005] - Help with regular expression

    Please post a larger data sample that includes the text before the first value to be extracted.

  7. #7

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Hi,

    There is no text before the first value because I start the cut at
    Code:
    <td class="Normal" align="right">
    and end at
    Code:
    </td>
    .

    Looking at the source of www.morningstar.com.au, this is the exact data sample I am extracting from.

    Code:
    <tr valign="top" bgcolor="#f5f5f5">
    <td><a class="Normal">Sandhurst BMF - Sandhurst Industrial Share Fund</a></td>
    <td class="Normal" align="right">1.6821</td>
    <td class="Normal" align="right">1.6737</td>
    <td class="Normal" align="right">28 Mar 08</td>
    
    <td class="Normal" align="right">6 Monthly</td>
    </tr>

  8. #8
    Fanatic Member vijy's Avatar
    Join Date
    May 2007
    Location
    India
    Posts
    548

    Re: [2005] - Help with regular expression

    Here are three regexes, in three different syntaxes, that find the string. They work fine for me.


    Code:
    Ultra-edit:- ^([0-9]+^).^([0-9]+^)
    Perl:- \d+\.\d+
    VBA:- ([0-9]{1,}).([0-9]{1,})
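
    One caveat if the third pattern is used with .NET's Regex class: an unescaped "." matches any character, so the pattern can also match a run of digits that has no decimal point at all. A minimal sketch of the difference, assuming a plain console program (the module and variable names are made up for illustration):

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module DotPitfallSketch
        Sub Main()
            ' Pattern as posted: the "." is unescaped, so it matches any character.
            Dim loosePattern As String = "([0-9]{1,}).([0-9]{1,})"
            ' Same pattern with the dot escaped, so it matches only a literal ".".
            Dim strictPattern As String = "([0-9]{1,})\.([0-9]{1,})"

            Dim sample As String = "12345"   ' no decimal point at all

            Console.WriteLine(Regex.IsMatch(sample, loosePattern))         ' True
            Console.WriteLine(Regex.IsMatch(sample, strictPattern))        ' False
            Console.WriteLine(Regex.Match("1.6821", strictPattern).Value)  ' 1.6821
        End Sub
    End Module
    ```

    On the sample row in this thread both patterns find the same two prices, so the difference only matters if the page ever contains whole numbers of three or more digits.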


  9. #9

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Cheers vijy for the RegEx. Testing it on www.regexlib.com works flawlessly. The next question I guess is to separate those two values into strings etc... Any suggestions?

  10. #10
    PowerPoster stanav's Avatar
    Join Date
    Jul 2006
    Location
    Providence, RI - USA
    Posts
    9,290

    Re: [2005] - Help with regular expression

    The pattern vijy gave you only searches for positive decimal values. It won't pick up the negative sign if the value is negative. Besides, it'll match all decimal values regardless of whether they are enclosed by <td> tags or not.
    Try this:
    Code:
    'Use a WebClient to download the source from the link you gave above
            Dim wc As New System.Net.WebClient()
            Dim sourceHtml As String = wc.DownloadString("http://www.morningstar.com.au")
    
            'Declare a list to hold prices
            Dim priceList As New List(Of Decimal)
    
            Dim pattern As String = "(?i-msnx:(?<=<td class=""normal"" align=""right"">)[\s+-]?(\d+\.?\d*|\d*\.?\d+)\s?(?=</td>))"
            Dim matches As System.Text.RegularExpressions.MatchCollection = System.Text.RegularExpressions.Regex.Matches(sourceHtml, pattern)
            For Each match As System.Text.RegularExpressions.Match In matches
                priceList.Add(Decimal.Parse(match.Value))
            Next
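
    To put the two prices into separate named groups, as the original question asked, one option is to match two adjacent cells in a single pattern. This is only a sketch, assuming the two price cells always appear back to back as in the sample row. The module name, the helper strings cell and number, and the invariant-culture parsing are my own choices for illustration, not something taken from the site:

    ```vbnet
    Imports System
    Imports System.Globalization
    Imports System.Text.RegularExpressions

    Module NamedGroupsSketch
        Sub Main()
            ' Sample cells copied from the thread (fund-name cell omitted).
            Dim html As String = _
                "<td class=""Normal"" align=""right"">1.6821</td>" & Environment.NewLine & _
                "<td class=""Normal"" align=""right"">1.6737</td>" & Environment.NewLine & _
                "<td class=""Normal"" align=""right"">28 Mar 08</td>"

            ' Building blocks: the opening tag of a price cell, and a signed decimal number.
            Dim cell As String = "<td class=""Normal"" align=""right"">\s*"
            Dim number As String = "-?\d+\.\d+"

            ' Two consecutive cells, each captured in its own named group.
            Dim pattern As String = _
                cell & "(?<entryprice>" & number & ")\s*</td>\s*" & _
                cell & "(?<exitprice>" & number & ")\s*</td>"

            Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
            If m.Success Then
                Dim entryPrice As Decimal = Decimal.Parse(m.Groups("entryprice").Value, CultureInfo.InvariantCulture)
                Dim exitPrice As Decimal = Decimal.Parse(m.Groups("exitprice").Value, CultureInfo.InvariantCulture)
                Console.WriteLine(entryPrice.ToString(CultureInfo.InvariantCulture))  ' 1.6821
                Console.WriteLine(exitPrice.ToString(CultureInfo.InvariantCulture))   ' 1.6737
            Else
                Console.WriteLine("No price pair found")
            End If
        End Sub
    End Module
    ```

    Because both values come from one match, a row that has only one price (or none) simply fails to match, which gives a single place to log the problem.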

  11. #11

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Cheers stanav for the code. I tried your regular expression but it did not return any matches (using VS and RegExLib.com).

    I have now created a new solution that uses a list (from your posted code), vijy's regular expression and more regular expressions to remove HTML Tags.

    This is all working fine, but I have run into a problem with the "Matches" part. I know that each lookup must produce 2 values. If a user were to enter an invalid stock code, I would get only 0 or 1 matches (from my testing). How can I detect this and log the error? Below is the code I am using:

    Code:
    ' Read The Response Into A String
    
                    Dim strHTML As String = str.ReadToEnd
    
                    Dim strStartCut As String
                    Dim strEndCut As String
                    Dim strStartTag As String
                    Dim strEndTag As String
                    Dim strDesired As String
    
                    ' Cut Out Only What We Need
    
                    strStartTag = "<td class=""Normal"" align=""right"">"
                    strEndTag = "</tr>"
    
                    strStartCut = (InStr(1, strHTML, strStartTag, vbTextCompare) + Len(strStartTag))
                    strEndCut = (InStr(strStartCut, strHTML, strEndTag, vbTextCompare))
    
                    strDesired = Mid(strHTML, strStartCut, (strEndCut - strStartCut))
    
                    ' Strip HTML Tags
    
                    Dim Pattern As String = "<[^>]*>"
    
                    Dim R As New Regex(Pattern)
    
                    strDesired = R.Replace(strDesired, "")
    
                    ' Decode the &lt; entity back to "<"
                    Dim Pattern2 As String = "&lt;"
    
                    Dim S As New Regex(Pattern2)
    
                    strDesired = S.Replace(strDesired, "<")
    
                    ' Decode the &gt; entity back to ">"
                    Dim Pattern3 As String = "&gt;"
    
                    Dim T As New Regex(Pattern3)
    
                    strDesired = T.Replace(strDesired, ">")
    
                    'Declare a list to hold prices
    
                    Dim priceList As New List(Of Decimal)
    
                    Dim expression As String = "([0-9]{1,}).([0-9]{1,})"
    
                    Dim matches As System.Text.RegularExpressions.MatchCollection = System.Text.RegularExpressions.Regex.Matches(strDesired, expression)
    
                    For Each Match As System.Text.RegularExpressions.Match In matches
    
                        If Match.Value = 0 Or 1 Then
    
                            Me.RichTextBoxError.Text += Date.Now + " - Data Not Available For UNLM Code " + MorningStarCode + vbCrLf
    
                        Else
    
                            priceList.Add(Decimal.Parse(Match.Value))
                            'MessageBox.Show(Decimal.Parse(Match.Value))
    
                        End If
    
                    Next
    
                    'Dim ExtractedPrices As String = "2B" & vbNewLine & ("00" & MATECode) & vbNewLine & sDate & vbNewLine & ("000000" & priceList.Item(0) & "00") & vbNewLine & ("000000" & priceList.Item(1) & "00")
                    'Dim Lines() As String = Split(ExtractedPrices.ToString, vbCrLf)
                    'Me.DataGridViewDD.Rows.Add(Lines)

  12. #12
    PowerPoster stanav's Avatar
    Join Date
    Jul 2006
    Location
    Providence, RI - USA
    Posts
    9,290

    Re: [2005] - Help with regular expression

    First of all, your pattern is searching for numeric values in decimal format (123.456, for example), so none of the matches will be 0 or 1. Secondly, match.Value returns a String, while 0 and 1 are Integers. If you want to test a match against 0 or 1, you have to do this:
    Code:
    If match.value = "0" OrElse match.Value = "1" Then
    
    End If
    I strongly suggest you turn Option Strict On. It helps you catch a lot of silly errors like this at compile time, and it can also make your application run faster because it rules out late binding and hidden implicit conversions.
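
    To illustrate what Option Strict catches here: as far as I can tell, with Option Strict Off the original test is parsed as (Match.Value = 0) Or 1, because = binds tighter than Or, and the bitwise Or with 1 makes the condition true for every numeric match. With Option Strict On the same line does not compile at all. A small sketch, assuming a console program and a made-up input string:

    ```vbnet
    Option Strict On

    Imports System
    Imports System.Collections.Generic
    Imports System.Globalization
    Imports System.Text.RegularExpressions

    Module OptionStrictSketch
        Sub Main()
            Dim input As String = "1.6821 1.6737"   ' hypothetical cleaned-up text
            Dim priceList As New List(Of Decimal)

            For Each m As Match In Regex.Matches(input, "\d+\.\d+")
                ' "If m.Value = 0 Or 1 Then" is rejected under Option Strict On,
                ' because it compares a String with an Integer.
                ' Comparing String with String compiles:
                If m.Value = "0" OrElse m.Value = "1" Then
                    Console.WriteLine("Unexpected value: " & m.Value)
                Else
                    priceList.Add(Decimal.Parse(m.Value, CultureInfo.InvariantCulture))
                End If
            Next

            Console.WriteLine(priceList.Count)   ' 2
        End Sub
    End Module
    ```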

  13. #13

    Thread Starter
    Lively Member
    Join Date
    Jan 2006
    Posts
    111

    Re: [2005] - Help with regular expression

    Cheers for the suggestion to use "Option Strict On". It found a lot of errors in my code, which I was able to correct. As for determining whether the lookup returned two values, I used the following code:

    Code:
    For Each Match As System.Text.RegularExpressions.Match In matches
    
        priceList.Add(Decimal.Parse(Match.Value))
    
    Next
    
    ' Count is an Integer, so compare it with the number 2, not the string "2"
    If priceList.Count = 2 Then
    
        'add values to datagridview
    
    Else
    
        'Log error to richtextbox
    
    End If
