
Thread: [RESOLVED] Parse the Integers out of this string quickly and efficiently

  1. #1

    Thread Starter
    Hyperactive Member jazFunk's Avatar
    Join Date
    Dec 2008
    Location
    Palm Harbor
    Posts
    407

    [RESOLVED] Parse the Integers out of this string quickly and efficiently

    payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or payItemID = 1919 Or payItemID = 1403


    This string could be longer or shorter at times. I want to obtain a list (Dictionary, List(Of T), etc.) of just the Numeric values in this string. My understanding of the performance hits on the various ways I could come up with to get what I need is quite limited.

    So, I throw this out to the Forum to get input on the quickest and most effective way to do so.
    Things I've found useful:
    DateTime.ToString() Patterns | Retrieving and Saving Data in Databases | ADO.NET Data Containers: An Explanation

    Quote of the day from jmcilhinney:
    'Talking about Option Strict and Option Explicit in the same sentence is pointless unless it is to say that they should both be On.'

  2. #2
    Frenzied Member IanRyder's Avatar
    Join Date
    Jan 2013
    Location
    Healing, UK
    Posts
    1,232

    Re: Parse the Integers out of this string quickly and efficiently

    Hi,

    Let me throw this one in the pot for starters:-

    Code:
    Imports System.Text.RegularExpressions
    
    Public Class Form1
    
      Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim NumberMatchCriteria As New Regex("[0-9]{1,}")
        Dim myString As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or " & _
                                 "payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or " & _
                                 "payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or " & _
                                 "payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or " & _
                                 "payItemID = 1919 Or payItemID = 1403"
        Dim myListOfValues As List(Of Match) = NumberMatchCriteria.Matches(myString).Cast(Of Match).ToList
        MsgBox("Done!")
    
        'to check the list
        For Each myValue As Match In myListOfValues
          TextBox1.Text += myValue.Value.ToString & vbCrLf
        Next
      End Sub
    End Class
    I have not done any performance checks on this so I will leave that up to you.

    Cheers,

    Ian

  3. #3
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,106

    Re: Parse the Integers out of this string quickly and efficiently

    In the very little testing I have done, RegEx is not the fastest alternative. However, pretty nearly any string manipulation is going to suck to some extent. What I would suggest as a comparison would be this:

    I note that each number follows an = sign, but then is followed by other stuff. My first thought was just splitting on =, but in most cases that won't work. Alternatively, split on the space. Then iterate through the array, and the index following a slot with an = will be a number.
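
    A minimal sketch of the space-split idea, assuming the tokens are always separated by single spaces as in the sample string. The module and function names are made up for illustration:

    Code:
    Imports System
    Imports System.Collections.Generic

    Module SpaceSplitSketch
        ' Returns the integer that follows each "=" token.
        Function ParseAfterEquals(input As String) As List(Of Integer)
            Dim result As New List(Of Integer)
            Dim parts() As String = input.Split(" "c)
            ' Stop one short of the end so parts(i + 1) is always a valid index.
            For i As Integer = 0 To parts.Length - 2
                If parts(i) = "=" Then
                    Dim n As Integer
                    If Integer.TryParse(parts(i + 1), n) Then result.Add(n)
                End If
            Next
            Return result
        End Function
    End Module

    If the data ever arrives without spaces around the = sign, this version finds nothing, so it only fits input shaped like the sample.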
    My usual boring signature: Nothing

  4. #4
    PowerPoster
    Join Date
    Feb 2012
    Location
    West Virginia
    Posts
    14,206

    Re: Parse the Integers out of this string quickly and efficiently

    I would think using two splits would be the ticket.
    Split on =, then in a loop trim each piece and split it on the space; the first entry in each pass should be the numeric one.

    It also looks like you could split on =, then in a loop trim each piece, use IndexOf to find the first space, and use Substring to get the value up to that space, which should be your numeric value in all but the first and last pieces.
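
    A rough sketch of the IndexOf/Substring variant, assuming the input looks like the sample string. The names are hypothetical, and the first and last pieces are handled explicitly:

    Code:
    Imports System
    Imports System.Collections.Generic

    Module EqualsSplitSketch
        Function ParseBySplitOnEquals(input As String) As List(Of Integer)
            Dim result As New List(Of Integer)
            Dim chunks() As String = input.Split("="c)
            ' chunks(0) is the text before the first "=", so start at index 1.
            For i As Integer = 1 To chunks.Length - 1
                Dim chunk As String = chunks(i).Trim()
                Dim spaceAt As Integer = chunk.IndexOf(" "c)
                ' The last chunk has nothing after the number, so IndexOf returns -1.
                Dim candidate As String = If(spaceAt >= 0, chunk.Substring(0, spaceAt), chunk)
                Dim n As Integer
                If Integer.TryParse(candidate, n) Then result.Add(n)
            Next
            Return result
        End Function
    End Module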

  5. #5
    PowerPoster dunfiddlin's Avatar
    Join Date
    Jun 2012
    Posts
    8,245

    Re: Parse the Integers out of this string quickly and efficiently

    One split ...

    Dim ss = s.Replace("payItemID = ", "").Split({" Or "}, StringSplitOptions.RemoveEmptyEntries)
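
    That line leaves you with an array of strings. If integers are wanted, one possible follow-up (a sketch only, with made-up names) is to run each entry through TryParse:

    Code:
    Imports System
    Imports System.Collections.Generic

    Module OneSplitSketch
        Function ParseByReplaceAndSplit(s As String) As List(Of Integer)
            ' Same Replace/Split as above; Replace is case-sensitive, so the
            ' text must read "payItemID = " exactly.
            Dim ss() As String = s.Replace("payItemID = ", "").Split({" Or "}, StringSplitOptions.RemoveEmptyEntries)
            Dim result As New List(Of Integer)(ss.Length)
            For Each item As String In ss
                Dim n As Integer
                ' Entries that are not valid integers are skipped.
                If Integer.TryParse(item, n) Then result.Add(n)
            Next
            Return result
        End Function
    End Module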
    As the 6-dimensional mathematics professor said to the brain surgeon, "It ain't Rocket Science!"

    Reviews: "dunfiddlin likes his DataTables" - jmcilhinney

    Please be aware that whilst I will read private messages (one day!) I am unlikely to reply to anything that does not contain offers of cash, fame or marriage!

  6. #6

    Thread Starter
    Hyperactive Member jazFunk's Avatar
    Join Date
    Dec 2008
    Location
    Palm Harbor
    Posts
    407

    Re: Parse the Integers out of this string quickly and efficiently

    Cheers to you, Ian!

    Using the RegEx method you provided worked nearly instantaneously. I dig it.

    I'm reading up to understand Regular Expressions because I honestly couldn't tell you what's going on in your code... but I want to know!

    Would you mind expanding on the "...{1,}") part of the Regex Dim statement?

    What does this indicate?

    Thanks.

  7. #7
    Fanatic Member AceInfinity's Avatar
    Join Date
    May 2011
    Posts
    696

    Re: Parse the Integers out of this string quickly and efficiently

    For a string this small, I wouldn't resort to regex. We're not really looking for a pattern here, we're looking for numbers, so a split would do just fine. Evaluate the items with TryParse() if you want to use these values later, adding them to a List(Of Integer) or something in the process.

    vbnet Code:
    Dim numList As New List(Of Integer)
    Dim i As Integer
    For Each s As String In input.Split(" "c)
        If Integer.TryParse(s, i) Then
            numList.Add(i)
        End If
    Next
    Console.WriteLine(String.Join(", ", numList))
    <<<------------
    Improving Managed Code Performance | .NET Application Performance
    < Please if this helped you out. Any kind of thanks is gladly appreciated >


    .NET Programming (2012 - 2018)
    ®Crestron - DMC-T Certified Programmer | Software Developer
    <<<------------

  8. #8
    Frenzied Member IanRyder's Avatar
    Join Date
    Jan 2013
    Location
    Healing, UK
    Posts
    1,232

    Re: Parse the Integers out of this string quickly and efficiently

    Hi,

    There are plenty of other good ideas and examples on this thread now, so you will have lots to go at with regard to finding the one that returns the best performance.

    With regards to my original post however, the statement [0-9]{1,}, means to use the RegEx "Greedy" concept. That is, match at least one character from 0 to 9 and as many characters as possible thereafter which fit the same criteria.
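
    To make the quantifier concrete, here is a small sketch (the helper name is made up) that returns the matches for whatever pattern is passed in, so {1,} can be compared with the shorthand + and with the lazy form +?:

    Code:
    Imports System
    Imports System.Text.RegularExpressions

    Module QuantifierSketch
        ' Returns every match of the pattern in the input, in order.
        Function DigitRuns(pattern As String, input As String) As String()
            Dim matches As MatchCollection = Regex.Matches(input, pattern)
            Dim runs(matches.Count - 1) As String
            For i As Integer = 0 To matches.Count - 1
                runs(i) = matches(i).Value
            Next
            Return runs
        End Function
    End Module

    With the input "payItemID = 2085 Or payItemID = 885", both "[0-9]{1,}" and "[0-9]+" return 2085 and 885, because the two quantifiers mean the same thing and are both greedy. The lazy pattern "[0-9]+?" stops as early as it can and returns the seven digits one at a time.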

    Have a look here:-

    http://www.regular-expressions.info/reference.html

    Hope that helps.

    Cheers,

    Ian

  9. #9
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,106

    Re: Parse the Integers out of this string quickly and efficiently

    Pretty much ANYTHING will work nearly instantaneously for something like that. To even look at the efficiency between the various options you'd have to run each one in a loop for several thousand iterations to get the time taken high enough to be able to see a difference that even a Stopwatch object can see (which means milliseconds).

    When you said you were looking for efficiency, I figured that you were going to be doing this so often that it would matter. If RegEx is fast enough then that's great, but it still isn't all that efficient.
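
    Something along these lines is what I mean by running each option in a loop. It is only a sketch with made-up names; the iteration count is whatever gets the total high enough to compare:

    Code:
    Imports System
    Imports System.Diagnostics

    Module TimingSketch
        ' Runs the action the given number of times and returns the total elapsed milliseconds.
        Function TimeIt(action As Action, iterations As Integer) As Double
            action() ' warm-up call so one-time JIT cost is not counted
            Dim sw As Stopwatch = Stopwatch.StartNew()
            For i As Integer = 1 To iterations
                action()
            Next
            sw.Stop()
            Return sw.Elapsed.TotalMilliseconds
        End Function
    End Module

    You would call it once per candidate, for example TimingSketch.TimeIt(Sub() input.Split(" "c), 10000), and compare the totals rather than any single run.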

  10. #10
    Powered By Medtronic dbasnett's Avatar
    Join Date
    Dec 2007
    Location
    Jefferson City, MO
    Posts
    9,897

    Re: Parse the Integers out of this string quickly and efficiently

    $.02

    Code:
            Dim s As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or payItemID = 1919 Or payItemID = 1403"
            Dim nums As New List(Of Integer) 'the result
            Dim stpw As New Stopwatch
            stpw.Restart()
            'split the string
            Dim items() As String = s.ToLower.Split(New String() {" ", "=", "or"}, StringSplitOptions.RemoveEmptyEntries)
            For Each v As String In items 'look for integers
                Dim n As Integer
                If Integer.TryParse(v, n) Then
                    nums.Add(n) 'add to result list
                End If
            Next
            stpw.Stop()
            Debug.WriteLine(stpw.ElapsedTicks)
    My First Computer -- Documentation Link (RT?M) -- Using the Debugger -- Prime Number Sieve
    Counting Bits -- Subnet Calculator -- UI Guidelines -- >> SerialPort Answer <<

    "Those who use Application.DoEvents have no idea what it does and those who know what it does never use it." John Wein

  11. #11
    Fanatic Member AceInfinity's Avatar
    Join Date
    May 2011
    Posts
    696

    Re: Parse the Integers out of this string quickly and efficiently

    Quote Originally Posted by dbasnett View Post
    $.02

    You're making things more complicated than they need to be here, why not split by a space? Like I did here: http://www.vbforums.com/showthread.p...=1#post4336055

    Code:
    Dim stpw As New Stopwatch
    stpw.Restart()
    There's also:
    Code:
    Dim sw As Stopwatch = Stopwatch.StartNew()

  12. #12
    Powered By Medtronic dbasnett's Avatar
    Join Date
    Dec 2007
    Location
    Jefferson City, MO
    Posts
    9,897

    Re: Parse the Integers out of this string quickly and efficiently

    Quote Originally Posted by AceInfinity View Post
    You're making things more complicated than they need to be here, why not split by a space? Like I did here: http://www.vbforums.com/showthread.p...=1#post4336055



    There's also:
    Code:
    Dim sw As Stopwatch = Stopwatch.StartNew()
    I didn't want to make assumptions about the data, i.e. payItemID=1745 would also work. I also used the split to reduce the number of items being processed, though it may have been more efficient to parse "or" than to remove it. I didn't check.

    I know about Stopwatch.StartNew. I time multiple sections of code so often that I use .Restart out of habit, sometimes even when I have done StartNew.

  13. #13

    Thread Starter
    Hyperactive Member jazFunk's Avatar
    Join Date
    Dec 2008
    Location
    Palm Harbor
    Posts
    407

    Re: Parse the Integers out of this string quickly and efficiently

    Quote Originally Posted by Shaggy Hiker View Post
    Pretty much ANYTHING will work nearly instantaneously for something like that. To even look at the efficiency between the various options you'd have to run each one in a loop for several thousand iterations to get the time taken high enough to be able to see a difference that even a Stopwatch object can see (which means milliseconds).

    When you said you were looking for efficiency, I figured that you were going to be doing this so often that it would matter. If RegEx is fast enough then that's great, but it still isn't all that efficient.
    I don't have to do it all that often, the RegEx method seems to do the trick. With that in mind, I'm going to try a couple of the other suggestions in this thread and pay attention to the performance impact. I'll post back the results if they are significant.

  14. #14
    Fanatic Member AceInfinity's Avatar
    Join Date
    May 2011
    Posts
    696

    Re: Parse the Integers out of this string quickly and efficiently

    Quote Originally Posted by jazFunk View Post
    I don't have to do it all that often, the RegEx method seems to do the trick. With that in mind, I'm going to try a couple of the other suggestions in this thread and pay attention to the performance impact. I'll post back the results if they are significant.
    For small Regex operations, the overhead of instantiating the Regex alone makes it slower, because it is a class and not a method like Split(). There's a compiled Regex option flag that can be applied, but it makes the startup overhead much larger, and it's only appropriate if you're dealing with larger data, where it will be faster.

    I've seen someone try to compare Regex with the Split() method before, and that comparison was already flawed because, as I said, Regex is a class that needs to be instantiated first, while String.Split() is a function. They serve different purposes. MSDN recommends Regex over Split() if you're looking for a pattern, and Split() if you don't need to look at any kind of pattern to find a value. If you need to search for a pattern, that is where Split() would be tough to use.

    Regex can be much slower than Split() if you have the option to use one over the other, notwithstanding any of the other factors for Split() vs. Regex in this case.

    In some cases I've seen Regex up to 10 times slower than using Split(). The only thing with Split() is that it returns an array (of strings), so keep that in mind. Regex can be fast as well, so I'm not by any means saying don't use Regex. But I really doubt it in your case, and for such a small string too. I've seen where Regex can be faster than using Split(). It really boils down to just benchmarking your own situation to find out what makes sense sometimes.

    The same goes for different usages of the Regex class itself. Some patterns are faster than others, and various flags can add to or reduce performance too; compiled vs. non-compiled Regex is a good example. Mostly, performance comes down to the size of the data you're analyzing.
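
    If you do stay with Regex and call it repeatedly, one way to avoid paying the instantiation cost on every call is to build the Regex once and reuse it. This is only a sketch: the names are made up, and whether RegexOptions.Compiled helps for your data is something you would have to benchmark.

    Code:
    Imports System
    Imports System.Collections.Generic
    Imports System.Text.RegularExpressions

    Module SharedRegexSketch
        ' Built once and reused, so construction is not repeated on every call.
        ' RegexOptions.Compiled trades a slower startup for faster matching.
        Private ReadOnly DigitRun As New Regex("[0-9]+", RegexOptions.Compiled)

        Function ParseWithSharedRegex(input As String) As List(Of Integer)
            Dim result As New List(Of Integer)
            For Each m As Match In DigitRun.Matches(input)
                ' Assumes every digit run fits in an Integer; Parse throws otherwise.
                result.Add(Integer.Parse(m.Value))
            Next
            Return result
        End Function
    End Module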
    Last edited by AceInfinity; Feb 5th, 2013 at 11:03 PM.
