-
Feb 3rd, 2013, 08:17 AM
#1
Thread Starter
Hyperactive Member
[RESOLVED] Parse the Integers out of this string quickly and efficiently
payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or payItemID = 1919 Or payItemID = 1403
This string could be longer or shorter at times. I want to obtain a list (Dictionary, List(Of T), etc.) of just the Numeric values in this string. My understanding of the performance hits on the various ways I could come up with to get what I need is quite limited.
So, I throw this out to the Forum to get input on the quickest and most effective way to do so.
-
Feb 3rd, 2013, 08:58 AM
#2
Re: Parse the Integers out of this string quickly and efficiently
Hi,
Let me throw this one in the pot for starters:-
Code:
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim NumberMatchCriteria As New Regex("[0-9]{1,}")
        Dim myString As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or " & _
            "payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or " & _
            "payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or " & _
            "payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or " & _
            "payItemID = 1919 Or payItemID = 1403"
        Dim myListOfValues As List(Of Match) = NumberMatchCriteria.Matches(myString).Cast(Of Match).ToList
        MsgBox("Done!")
        'to check the list
        For Each myValue As Match In myListOfValues
            TextBox1.Text += myValue.Value.ToString & vbCrLf
        Next
    End Sub
End Class
I have not done any performance checks on this so I will leave that up to you.
Cheers,
Ian
-
Feb 3rd, 2013, 11:16 AM
#3
Re: Parse the Integers out of this string quickly and efficiently
In the very little testing I have done, RegEx is not the fastest alternative. However, pretty nearly any string manipulation is going to suck to some extent. What I would suggest as a comparison would be this:
I note that each number follows an = sign, but then is followed by other stuff. My first thought was just splitting on =, but in most cases that won't work. Alternatively, split on the space. Then iterate through the array, and the index following a slot with an = will be a number.
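A minimal sketch of the split-on-space idea described above, for comparison. The module and function names (SplitOnSpaceSketch, ParseAfterEquals) are made up for illustration, and it assumes every "=" stands alone as its own space-separated token, as it does in the sample string.

```vbnet
Imports System
Imports System.Collections.Generic

Module SplitOnSpaceSketch
    ' Hypothetical helper: returns every integer token that directly follows a standalone "=" token.
    Public Function ParseAfterEquals(source As String) As List(Of Integer)
        Dim result As New List(Of Integer)
        Dim tokens() As String = source.Split({" "c}, StringSplitOptions.RemoveEmptyEntries)
        ' Stop one short of the end so tokens(i + 1) is always in range.
        For i As Integer = 0 To tokens.Length - 2
            If tokens(i) = "=" Then
                Dim n As Integer
                If Integer.TryParse(tokens(i + 1), n) Then result.Add(n)
            End If
        Next
        Return result
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885"
        Console.WriteLine(String.Join(", ", ParseAfterEquals(sample))) ' prints 2085, 1745, 885
    End Sub
End Module
```

If the data ever arrives as payItemID=2085 with no spaces, this sketch finds nothing, so it is only as good as that spacing assumption.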
My usual boring signature: Nothing
 
-
Feb 3rd, 2013, 11:30 AM
#4
Re: Parse the Integers out of this string quickly and efficiently
I would think using two splits would be the ticket.
Split on =, then in a loop trim each piece and split it on space; the first entry in each pass should be the numeric one.
Alternatively, split on =, then in a loop trim each piece, use IndexOf to find the first space, and use Substring to get the value up to that space. That should be your numeric value in every piece except the first (which holds no number) and the last (which has no trailing space).
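The second variant (split on =, then IndexOf and Substring) might look roughly like this. It is only a sketch: SplitOnEqualsSketch and ParseAfterSplit are illustrative names, and the handling of the first and last pieces is one assumption about how to cover those edge cases.

```vbnet
Imports System
Imports System.Collections.Generic

Module SplitOnEqualsSketch
    ' Hypothetical helper: split on "=", then take the text up to the first space of each piece.
    Public Function ParseAfterSplit(source As String) As List(Of Integer)
        Dim result As New List(Of Integer)
        Dim parts() As String = source.Split("="c)
        ' parts(0) is the text before the first "=", so it holds no number; start at index 1.
        For i As Integer = 1 To parts.Length - 1
            Dim piece As String = parts(i).Trim()
            Dim spaceAt As Integer = piece.IndexOf(" "c)
            ' The last piece has no trailing " Or ...", so IndexOf returns -1 and the whole piece is used.
            Dim candidate As String = If(spaceAt >= 0, piece.Substring(0, spaceAt), piece)
            Dim n As Integer
            If Integer.TryParse(candidate, n) Then result.Add(n)
        Next
        Return result
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885"
        Console.WriteLine(String.Join(", ", ParseAfterSplit(sample))) ' prints 2085, 1745, 885
    End Sub
End Module
```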
-
Feb 3rd, 2013, 04:34 PM
#5
Re: Parse the Integers out of this string quickly and efficiently
One split ...
Dim ss = s.Replace("payItemID = ", "").Split({" Or "}, StringSplitOptions.RemoveEmptyEntries)
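The one-liner above leaves you with an array of strings. If a List(Of Integer) is wanted, something along these lines could follow it. This is a sketch only: the module and function names are invented, and it assumes the input always has exactly the "payItemID = n Or ..." shape, because Integer.Parse throws on anything else.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ReplaceThenSplitSketch
    ' Hypothetical helper: strip the field name, split on " Or ", convert each piece to Integer.
    Public Function ParseIds(source As String) As List(Of Integer)
        Dim pieces() As String = source.Replace("payItemID = ", "").Split({" Or "}, StringSplitOptions.RemoveEmptyEntries)
        ' Integer.Parse throws FormatException if a piece is not a plain integer.
        Return pieces.Select(Function(p) Integer.Parse(p.Trim())).ToList()
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885"
        Console.WriteLine(String.Join(", ", ParseIds(sample))) ' prints 2085, 1745, 885
    End Sub
End Module
```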
As the 6-dimensional mathematics professor said to the brain surgeon, "It ain't Rocket Science!"
Reviews: "dunfiddlin likes his DataTables" - jmcilhinney
Please be aware that whilst I will read private messages (one day!) I am unlikely to reply to anything that does not contain offers of cash, fame or marriage!
-
Feb 3rd, 2013, 05:27 PM
#6
Thread Starter
Hyperactive Member
Re: Parse the Integers out of this string quickly and efficiently
Cheers to you, Ian!
Using the RegEx method you provided worked nearly instantaneously. I dig it.
I'm reading up to understand Regular Expressions because I honestly couldn't tell you what's going on in your code... but I want to know!
Would you mind expanding on the "...{1,}")" part of the Regex Dim statement?
What does this indicate?
Thanks.
-
Feb 3rd, 2013, 10:44 PM
#7
Re: Parse the Integers out of this string quickly and efficiently
For a string this small, I wouldn't resort to regex. We're not really looking for a pattern here, we're looking for numbers, so a split would do just fine. Evaluate each item with TryParse() and, if you want to use the values later, add them to a List(Of Integer) or something similar in the process.
vbnet Code:
Dim numList As New List(Of Integer)
Dim i As Integer
For Each s As String In input.Split(" "c)
    If Integer.TryParse(s, i) Then
        numList.Add(i)
    End If
Next
Console.WriteLine(String.Join(", ", numList))
<<<------------
.NET Programming (2012 - 2018)
®Crestron - DMC-T Certified Programmer | Software Developer <<<------------
-
Feb 3rd, 2013, 11:23 PM
#8
Re: Parse the Integers out of this string quickly and efficiently
Hi,
There are plenty of other good ideas and examples on this thread now so you will have lots to go at with regards finding the one that returns the best performance.
With regards to my original post however, the {1,} in [0-9]{1,} is a quantifier meaning "one or more" (the same as +), and it uses the RegEx "Greedy" concept. That is, match at least one character from 0 to 9 and as many characters as possible thereafter which fit the same criteria.
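To see the greedy behaviour side by side with its lazy counterpart, a small sketch like the following may help. MatchValues is a made-up helper name, and the shortened sample string is just for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module GreedyQuantifierSketch
    ' Hypothetical helper: returns the text of every match of pattern in source.
    Public Function MatchValues(pattern As String, source As String) As List(Of String)
        Return Regex.Matches(source, pattern).Cast(Of Match)().Select(Function(m) m.Value).ToList()
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 885"
        ' {1,} and + both mean "one or more" and both are greedy: each match swallows the whole number.
        Console.WriteLine(String.Join(", ", MatchValues("[0-9]{1,}", sample)))  ' prints 2085, 885
        Console.WriteLine(String.Join(", ", MatchValues("[0-9]+", sample)))     ' prints 2085, 885
        ' Adding ? makes the quantifier lazy: each match stops after a single digit.
        Console.WriteLine(String.Join(", ", MatchValues("[0-9]{1,}?", sample))) ' prints 2, 0, 8, 5, 8, 8, 5
    End Sub
End Module
```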
Have a look here:-
http://www.regular-expressions.info/reference.html
Hope that helps.
Cheers,
Ian
-
Feb 4th, 2013, 11:34 AM
#9
Re: Parse the Integers out of this string quickly and efficiently
Pretty much ANYTHING will work nearly instantaneously for something like that. To even look at the efficiency between the various options you'd have to run each one in a loop for several thousand iterations to get the time taken high enough to be able to see a difference that even a Stopwatch object can see (which means milliseconds).
When you said you were looking for efficiency, I figured that you were going to be doing this so often that it would matter. If RegEx is fast enough then that's great, but it still isn't all that efficient.
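A rough sketch of that kind of timing loop is below. TimeIt is a hypothetical helper, the iteration count of 10000 is an arbitrary choice, and the numbers it prints will vary by machine, so treat it as a starting point rather than a proper benchmark.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module BenchmarkSketch
    ' Hypothetical helper: runs work once as a warm-up, then times the given number of iterations.
    Public Function TimeIt(iterations As Integer, work As Action) As TimeSpan
        work() ' warm-up so one-time start-up cost (e.g. JIT) is not counted
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            work()
        Next
        sw.Stop()
        Return sw.Elapsed
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885"
        Dim digits As New Regex("[0-9]+")

        Dim viaRegex As TimeSpan = TimeIt(10000, Sub()
                                                     ' Reading Count forces all matches to be evaluated.
                                                     Dim count As Integer = digits.Matches(sample).Count
                                                 End Sub)

        Dim viaSplit As TimeSpan = TimeIt(10000, Sub()
                                                     Dim n As Integer
                                                     For Each token As String In sample.Split(" "c)
                                                         Integer.TryParse(token, n)
                                                     Next
                                                 End Sub)

        Console.WriteLine("Regex: " & viaRegex.TotalMilliseconds & " ms")
        Console.WriteLine("Split: " & viaSplit.TotalMilliseconds & " ms")
    End Sub
End Module
```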
-
Feb 4th, 2013, 11:53 AM
#10
Re: Parse the Integers out of this string quickly and efficiently
$.02
Code:
Dim s As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885 Or payItemID = 726 Or payItemID = 1133 Or payItemID = 1135 Or payItemID = 952 Or payItemID = 728 Or payItemID = 886 Or payItemID = 729 Or payItemID = 290 Or payItemID = 1789 Or payItemID = 1792 Or payItemID = 1820 Or payItemID = 1829 Or payItemID = 1843 Or payItemID = 1844 Or payItemID = 1883 Or payItemID = 1893 Or payItemID = 1894 Or payItemID = 1895 Or payItemID = 1896 Or payItemID = 1897 Or payItemID = 1899 Or payItemID = 1900 Or payItemID = 1902 Or payItemID = 1904 Or payItemID = 1905 Or payItemID = 1906 Or payItemID = 1910 Or payItemID = 1911 Or payItemID = 1912 Or payItemID = 1913 Or payItemID = 1916 Or payItemID = 1917 Or payItemID = 1918 Or payItemID = 1919 Or payItemID = 1403"
Dim nums As New List(Of Integer) 'the result
Dim stpw As New Stopwatch
stpw.Restart()
'split the string
Dim items() As String = s.ToLower.Split(New String() {" ", "=", "or"}, StringSplitOptions.RemoveEmptyEntries)
For Each v As String In items 'look for integers
    Dim n As Integer
    If Integer.TryParse(v, n) Then
        nums.Add(n) 'add to result list
    End If
Next
stpw.Stop()
Debug.WriteLine(stpw.ElapsedTicks)
-
Feb 4th, 2013, 12:43 PM
#11
Re: Parse the Integers out of this string quickly and efficiently
 Originally Posted by dbasnett
You're making things more complicated than they need to be here. Why not split by a space, like I did here: http://www.vbforums.com/showthread.p...=1#post4336055
Code:
Dim stpw As New Stopwatch
stpw.Restart()
There's also:
Code:
Dim sw As Stopwatch = Stopwatch.StartNew()
-
Feb 4th, 2013, 01:47 PM
#12
Re: Parse the Integers out of this string quickly and efficiently
 Originally Posted by AceInfinity
Didn't want to make assumptions about the data, e.g. payItemID=1745 (no spaces) would still work. I also used the split to reduce the number of items being processed, though it may have been more efficient to parse "or" than remove it. Didn't check.
I know about StopWatch.StartNew. So often I time multiple sections of code that I use .Restart out of habit, sometimes even when I have done StartNew.
-
Feb 5th, 2013, 05:43 PM
#13
Thread Starter
Hyperactive Member
Re: Parse the Integers out of this string quickly and efficiently
 Originally Posted by Shaggy Hiker
I don't have to do it all that often, so the RegEx method seems to do the trick. With that in mind, I'm going to try a couple of the other suggestions in this thread and pay attention to the performance impact. I'll post back the results if they are significant.
-
Feb 5th, 2013, 10:58 PM
#14
Re: Parse the Integers out of this string quickly and efficiently
 Originally Posted by jazFunk
For small Regex operations, the overhead of instantiating the Regex alone makes it slower: Regex is a class that has to be constructed, not a method like Split(). There's a compiled option flag (RegexOptions.Compiled) that can be applied, but it makes the start-up overhead much higher, so it is only appropriate if you're dealing with larger data, where the faster matching pays for it.
I've seen someone try to compare Regex with the Split() method before, and that comparison was already flawed because, as I said, Regex is a class that needs to be instantiated first and String.Split() is a function. They serve different purposes. MSDN recommends Regex if you're looking for a pattern, and Split() if you don't need to look at any kind of pattern to find a value. If you need to search for a pattern, that is where Split() would be tough to use.
Regex can be much slower than Split() if you have the option to use one over the other, notwithstanding any of the other factors for Split() vs. Regex in this case.
In some cases I've seen Regex up to 10 times slower than Split(). The only thing with Split() is that it returns an array of strings, so keep that in mind. Regex can be fast as well, and I've seen cases where it is faster than Split(), so I'm not by any means saying don't use Regex. But I really doubt it would be faster in your case, for such a small string. It really boils down to benchmarking your own situation to find out what makes sense.
The same goes for different usages of the Regex class. Some patterns are faster than others, and various flags can increase or decrease performance too; compiled vs. non-compiled Regex is a good example. Put simply, most of the performance difference comes down to the size of the data that you're analyzing.
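One way to keep the instantiation cost from being paid on every call is to construct the Regex once and reuse it. The sketch below assumes the parsing happens often enough for that to matter; CachedRegexSketch and ParseIds are illustrative names, and whether RegexOptions.Compiled actually helps is something to measure for your own data.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CachedRegexSketch
    ' Built once for the lifetime of the program. RegexOptions.Compiled costs more
    ' up front and is only worth it when the same pattern is matched many times.
    Private ReadOnly Digits As New Regex("[0-9]+", RegexOptions.Compiled)

    ' Hypothetical helper: returns every run of digits in source as an Integer.
    Public Function ParseIds(source As String) As List(Of Integer)
        Dim result As New List(Of Integer)
        For Each m As Match In Digits.Matches(source)
            Dim n As Integer
            ' TryParse skips digit runs too long to fit in an Integer instead of throwing.
            If Integer.TryParse(m.Value, n) Then result.Add(n)
        Next
        Return result
    End Function

    Sub Main()
        Dim sample As String = "payItemID = 2085 Or payItemID = 1745 Or payItemID = 885"
        Console.WriteLine(String.Join(", ", ParseIds(sample))) ' prints 2085, 1745, 885
    End Sub
End Module
```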
Last edited by AceInfinity; Feb 5th, 2013 at 11:03 PM.