Click to See Complete Forum and Search --> : Can you do it faster??
*PsyKE1*
Jan 12th, 2011, 06:37 PM
Hi all, here is my way to extract numbers of text .
Can you do it faster without RegExp? :confused:
Option Explicit
Option Base 0
Private Declare Function ArrayPtr Lib "msvbvm60" Alias "VarPtr" (ByRef Ptr() As Any) As Long
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
Private Function GetNumbersbyStringI(ByRef sText As String) As String()
Dim intAsc() As Integer
Dim lngAscHeader(5) As Long
Dim lngPos As Long
Dim strNum As String
Dim strTempArr() As String
Dim lngTextLen As Long
Dim Q As Long
lngTextLen = LenB(sText) \ 2 + 1
If lngTextLen > 1 Then
ReDim strTempArr$(0)
lngAscHeader(0) = 1
lngAscHeader(1) = 2
lngAscHeader(3) = StrPtr(sText)
lngAscHeader(4) = lngTextLen
PutMem4 ArrayPtr(intAsc), VarPtr(lngAscHeader(0))
Do Until lngPos = lngTextLen
Do
If intAsc(lngPos) < 48 Then Exit Do
If intAsc(lngPos) > 57 Then Exit Do
strNum = strNum + ChrW$(intAsc(lngPos))
lngPos = lngPos + 1
Loop Until lngPos = lngTextLen
If LenB(strNum) Then
strTempArr(Q) = strNum
strNum = vbNullString
Q = Q + 1
ReDim Preserve strTempArr$(Q)
End If
lngPos = lngPos + 1
Loop
PutMem4 ArrayPtr(intAsc), 0
GetNumbersbyStringI = strTempArr
End If
End Function
Private Sub Form_Load()
Const s As String = "hi 345 vivan 09453 las456 r4n4s 1 lov3 vbf0rum5"
Const sLine As String = "--------------------------------"
Dim vItem As Variant
Debug.Print sLine; "Array", Time$
For Each vItem In GetNumbersbyStringI(s)
Debug.Print vItem
Next vItem
End Sub
Give me advice, I want to improve.;)
Thanks
Ellis Dee
Jan 12th, 2011, 09:07 PM
String concatenation (which should use &, not +, btw) is slow. ReDim Preserve is very slow, especially inside a loop. I'd rewrite the logic to remove both of these elements.
Here's a pure native VB6 solution (no API) that runs a hair slower than your solution. It uses the native Split() function, which is slow. No doubt implementing one of Merri's faster Split() variations would speed it up.Public Function GetNumbers(ByVal pstrText As String) As String()
Dim strChar As String
Dim lngPos As Long
Dim blnDelimit As Boolean
Dim blnStarted As Boolean
Dim i As Long
For i = 1 To Len(pstrText)
strChar = Mid$(pstrText, i, 1)
Select Case AscW(strChar)
Case 48 To 57
blnStarted = True
If blnDelimit Then
lngPos = lngPos + 1: Mid$(pstrText, lngPos, 1) = " "
blnDelimit = False
End If
lngPos = lngPos + 1: Mid$(pstrText, lngPos, 1) = strChar
Case Else: blnDelimit = blnStarted
End Select
Next
GetNumbers = Split(Left$(pstrText, lngPos), " ")
End Function
penagate
Jan 12th, 2011, 11:03 PM
A hair slower?
Also (splitting hairs) the original code doesn't use any API calls. ArrayPtr and PutMem4 are both VB6 runtime functions.
*PsyKE1*
Jan 13th, 2011, 02:15 AM
Hi, thanks Ellis Dee for the tips. ;)
Option Explicit
Option Base 0
Private Declare Function ArrayPtr Lib "msvbvm60" Alias "VarPtr" (ByRef Ptr() As Any) As Long
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
Private Function GetNumbersbyStringII(ByRef sText As String) As String()
Dim intAsc() As Integer
Dim lngAscHeader(5) As Long
Dim lngPos As Long
Dim strNum As String
Dim strTempArr() As String
Dim lngTextLen As Long
Dim Q As Long
lngTextLen = LenB(sText) \ 2 + 1
If lngTextLen > 1 Then
ReDim strTempArr$(lngTextLen - 1)
lngAscHeader(0) = 1
lngAscHeader(1) = 2
lngAscHeader(3) = StrPtr(sText)
lngAscHeader(4) = lngTextLen
PutMem4 ArrayPtr(intAsc), VarPtr(lngAscHeader(0))
Do Until lngPos = lngTextLen
Do
If intAsc(lngPos) < 48 Then Exit Do
If intAsc(lngPos) > 57 Then Exit Do
strNum = strNum & ChrW$(intAsc(lngPos))
lngPos = lngPos + 1
Loop Until lngPos = lngTextLen
If LenB(strNum) Then
strTempArr(Q) = strNum
strNum = vbNullString
Q = Q + 1
End If
lngPos = lngPos + 1
Loop
PutMem4 ArrayPtr(intAsc), 0
ReDim Preserve strTempArr$(Q)
GetNumbersbyStringII = strTempArr
End If
End Function
I modified the code, using only one time ReDim Preserve() but i tested it and is slower than the previous one did. :S
Why?
Thanks! :D
Ellis Dee
Jan 13th, 2011, 03:32 AM
A hair slower?Yes, according to the benchmark program I tossed together. I ran his sample string through both algorithms 100,000 times and got:
His: 1.094
Mine: 1.171
What were your results?
Also (splitting hairs) the original code doesn't use any API calls. ArrayPtr and PutMem4 are both VB6 runtime functions.Yes it does use API calls. If you comment out the two declarations at the top of his code it'll throw a "Sub or Function not defined" compilation error. By definition, that's not native VB6.
Ellis Dee
Jan 13th, 2011, 03:41 AM
I modified the code, using only one time ReDim Preserve() but i tested it and is slower than the previous one did. :S
Why?
Thanks! :DYeah, your code confused me by being counter-intuitively fast. My tips are true in the general sense, but I think your particular implementation is the exception that proves the rule.
I think it's because the tokens are so very short that concatenation is plenty fast -- what are we thinking, an average of 2-3 digits per token? -- and the difference between the amount of memory that needs to be cleared by redimming the maximum possible space is much larger than redim preserving as needed inside the loop.
If your strings were 1000 characters long and your numbers were 20 digits long I think your second implementation would be faster. That's my best guess, at least.
I'm only useful for optimizing native code. Others that post regularly are orders of magnitude better with esoteric optimization methods using API, so keep an eye on the thread over the next couple days.
EDIT: We're now in Code It Better, which gets very little traffic. You might consider linking to this thread from your previous one to get the big guns to come over and play. (Merri in particular enjoys squeezing the most speed possible out of VB6.)
*PsyKE1*
Jan 13th, 2011, 06:01 AM
Yes, I'm afraid this section we'll be alone... :(
Well i tested it using this way:
Using CTiming.cls
Private Sub Form_Load()
Dim tmr As New CTiming
Dim s As String
Dim x As Long
Me.Show: DoEvents
Me.AutoRedraw = True
Me.Print "Wait...": DoEvents
For x = 0 To 100000
s = s & ChrW$(Rnd * 255)
Next
Me.Cls
tmr.Reset
For x = 0 To 100
GetNumbers s
Next
Me.Print "Ellis Dee", tmr.sElapsed
tmr.Reset
For x = 0 To 100
GetNumbersbyStringI s
Next
Me.Print "Mr.Frog", tmr.sElapsed
End Sub
It returns:
Ellis Dee 1.623,141 msec
Mr.Frog 942,073 msec
Hack
Jan 13th, 2011, 06:08 AM
The question is not "How do I do this"?
The question is "How do I do this FASTER?"
As such, it is perfectly appropriate for this thread to be in the "Code It Better" section.
Ellis Dee
Jan 13th, 2011, 06:20 AM
This forum is for intellectual exercises, not functional solutions. The OP is trying to implement actual code, not run a contest.
It was inappropriate to move the thread here. It should be moved back.Well i tested it using this way:Yeah, as I said above, the native Split() function is slow. That problem is magnified by larger strings.
My test was with the string used in the OP, 100,000 times.
*PsyKE1*
Jan 13th, 2011, 06:28 AM
Hi Hack, Why you said that?
Look the tittle:Can you do it faster??
Ook, I'll post more this section. :D
I think is good way to learn... :P
@Ellis Dee
Ook friend :)
MarMan
Jan 13th, 2011, 07:47 AM
Try this one:
Public Function GetNumbers(ByVal strText As String) As Integer()
Dim r As Integer
Dim bHaveDigit As Boolean
Dim intChar As Integer
Dim intCounter As Integer = -1
Dim intExponent As Integer = -1
Dim intNumbers As Integer() = Nothing
Array.Resize(intNumbers, 9)
For r = Len(strText) To 1 Step -1
intChar = Asc(Mid(strText, r, 1))
Select Case intChar
Case 48 To 57
If Not bHaveDigit Then
intCounter += 1
If intCounter / 10 = Int(intCounter / 10) Then
Array.Resize(intNumbers, intCounter + 10)
End If
End If
bHaveDigit = True
intExponent += 1
intNumbers(intCounter) += CType((intChar - 48) * 10 ^ intExponent, Integer)
Case Else
If bHaveDigit Then
bHaveDigit = False
End If
intExponent = -1
End Select
Next
Array.Resize(intNumbers, intCounter)
Array.Reverse(intNumbers)
Return intNumbers
End Function
*PsyKE1*
Jan 13th, 2011, 07:51 AM
MarMan, I'm afraid it is VB.Net... :P
We're trying in vb6.
MarMan
Jan 13th, 2011, 08:29 AM
OK, then try this:
Public Function GetNumbers(ByVal strText As String) As Integer()
Dim r As Integer
Dim bHaveDigit As Boolean
Dim intChar As Integer
Dim intCounter As Integer
Dim intExponent As Integer
Dim intNumbers() As Integer
intCounter = -1
intExponent = -1
redim intNumbers (9)
For r = Len(strText) To 1 Step -1
intChar = Asc(Mid(strText, r, 1))
Select Case intChar
Case 48 To 57
If Not bHaveDigit Then
intCounter = intCounter + 1
If intCounter / 10 = Int(intCounter / 10) Then
redim preserve intNumbers(intCounter + 10)
End If
End If
bHaveDigit = True
intExponent = intExponent + 1
intNumbers(intCounter) = intNumbers(intCounter) + (intChar - 48) * 10 ^ intExponent
Case Else
If bHaveDigit Then
bHaveDigit = False
End If
intExponent = -1
End Select
Next
redim preserve intNumbers (intCounter)
'Array.Reverse(intNumbers)
GetNumbers = intNumbers
End Function
I don't have VB6 anymore and haven't used it in awhile so let me know if it doesn't compile.
Also the numbers are stored in the array in opposite order which can easily be corrected if the speed increase is significant.
Merri
Jan 13th, 2011, 10:32 AM
Without API:Public Function GetNumberM(ByVal Text As String) As String()
Dim B() As Byte, i As Long, M As Boolean, L As Long, P As Long, PC As Long, PT As Long
PT = 1
If LenB(Text) Then
B = Text
For i = 0 To UBound(B) - 1 Step 2
Select Case B(i)
Case 48 To 57
M = B(i + 1) = 0
Case Else
M = False
End Select
If M Then
If P = 0 Then P = (i \ 2) + 1
ElseIf P Then
PC = (i \ 2) + 1
L = PC - P
If PT < P Then Mid$(Text, PT, L + 1) = Mid$(Text, P, L) & " " Else Mid$(Text, PC, 1) = " "
PT = PT + L + 1
P = 0
End If
Next i
If P Then
PC = (i \ 2) + 1
L = PC - P
If PT < P Then Mid$(Text, PT, L + 1) = Mid$(Text, P, L)
PT = PT + L + 1
End If
GetNumberM = Split(Left$(Text, PT - 2), " ")
Else
GetNumberM = Split(vbNullString)
End If
End Function
Don't have the time for API-doped version just right now.
*PsyKE1*
Jan 13th, 2011, 01:51 PM
Hi Merri!
Thanks for post! :D
I wanna your API-doped version!! :)
Private Sub Form_Load()
Dim tmr As New CTiming
Dim s As String
Dim x As Long
Me.Show: DoEvents
Me.AutoRedraw = True
Me.Print "Wait...": DoEvents
For x = 0 To 100000
s = s & ChrW$(Rnd * 255)
Next
Me.Cls
tmr.Reset
For x = 0 To 100
GetNumbersE s
Next
Me.Print "Ellis Dee", tmr.sElapsed
tmr.Reset
For x = 0 To 100
GetNumbersbyStringI s
Next
Me.Print "Mr.Frog", tmr.sElapsed
'Error ---> OverFlow in : For r = Len(strText) To 1 Step -1
'tmr.Reset
'For x = 0 To 100
' GetNumbers s
'Next
'Me.Print "MarMan", tmr.sElapsed
tmr.Reset
For x = 0 To 100
GetNumberM s
Next
Me.Print "Merri", tmr.sElapsed
End Sub
Result:
Ellis Dee 4.440,700 msec
Mr.Frog 3.437,803 msec
Merri 3.670,915 msec
MarMan
Jan 13th, 2011, 01:59 PM
Didn't test mine? Must be scared:D
Merri
Jan 13th, 2011, 02:00 PM
Testing under IDE does not tell the whole truth.
MarMan: your function is slowest when compiled.
*PsyKE1*
Jan 13th, 2011, 02:39 PM
Merri put the result here, now i'm not in my own computer and i have the portable version wich can't compilate... :(
I'm seeing yor new functions.
It would better if you put the Ubound() of your array in a varible, if you doo this the For...Next will not have to calculate it every time.
I love your way to do it but i have some questions that i will post tomorrow :D
---------------------------------------------------------------------------------------------------------------------------
Edit:
Any advice about my function Merri?
Ellis Dee
Jan 13th, 2011, 10:06 PM
Attached is a benchmark program that includes the four methods posted to the thread. My solution had a bug where I forgot to specify what I was splitting the return value by, and it appears Merri had the same bug. I corrected that for both of us in the benchmark program.
I included a "validate" button to verify the results, which revealed the limitation of MarMan's method of using a numeric array: leading zeroes are lost. It also revealed that Psyke's method leaves a trailing blank in the returned array. Here's the validation results:
Psyke : 345-09453-456-4-4-1-3-0-5-
Ellis : 345-09453-456-4-4-1-3-0-5
Merri : 345-09453-456-4-4-1-3-0-5
MarMan: 5-0-3-1-4-4-456-9453-345
The benchmark results:
Merri : 0.891
Psyke : 1.094
Ellis : 1.188
MarMan: 1.391
The benchmark program:
Lord Orwell
Jan 14th, 2011, 01:47 AM
This forum is for intellectual exercises, not functional solutions.
not according to this:
The Purpose Of This Forum (by Brad Jones) (http://www.vbforums.com/showthread.php?t=272023)
anyway, here's my attempt. I haven't programmed in vb6 in a while, and needed the challenge.
smaller code is "usually" faster but i'll let someone else benchmark it. I tried to keep mine as a direct replacement for the first function. If i were to write my own function, i would have made use of a string and split it into an array instead of using a method that required a variant.
Private Function GetNumbersbyStringI(ByRef s As String) As String()
Dim cl As Long, Tempchar As String, tempchar2 As String, TempNumString As String
Dim myarray() As String, arraycount As Long
For cl = 1 To Len(s)
Tempchar = Mid(s, cl, 1)
If IsNumeric(Tempchar) Then
Do
TempNumString = TempNumString + Tempchar
cl = cl + 1
Tempchar = Mid(s, cl, 1)
If Not IsNumeric(Tempchar) Then
arraycount = arraycount + 1
ReDim Preserve myarray(1 To arraycount)
myarray(arraycount) = TempNumString
TempNumString = ""
Exit Do
End If
Loop
End If
Next cl
GetNumbersbyStringI = myarray
End Function
Edit: Don't bother. It's about 50% slower. I actually managed to get this down to about 5 lines of code and every single thing i did (except getting rid of the redim preserve) slowed it down. I give up :rolleyes:
*PsyKE1*
Jan 14th, 2011, 04:05 AM
Ook, I'll correct it later... :P
@Ellis Dee
You have to test with big strings too.
Merri
Jan 14th, 2011, 06:09 AM
Upgrading the benchmarker I have with this:Private Sub Form_Load()
Dim S As String
S = "1A2B3C4D5E6F7G8H9"
Combo1.AddItem S
S = Space$(10241)
Mid$(S, 1, 18) = "1A2B3C4D5E6F7G8H9I"
Mid$(S, 19) = S
Combo1.AddItem S
S = "123456789"
Combo1.AddItem S
S = Space$(10241)
Mid$(S, 1, 10) = "123456789A"
Mid$(S, 11) = S
Combo1.AddItem S
Combo1.ListIndex = 0
End Sub
In output it now it creates:
a short list of very short strings
a long list of very short strings
a single item of a long string (for a number nine digits is pretty long in regular use)
a long list of long strings
So this benchmark tests the extreme cases, not your avarage ones.
Compiled benchmark one:
Psyke: 73 ms
Ellis Dee: 70 ms
MarMan: 53 ms
Merri 1: 53 ms
Merri 2: 30 ms
Compiled benchmark two:
Psyke: 5555 ms
Ellis Dee: 3544 ms
MarMan: 2940 ms
Merri 1: 2333 ms
Merri 2: 1508 ms
Compiled benchmark three:
Psyke: 32 ms
Ellis Dee: 28 ms
MarMan: 37 ms
Merri 1: 20 ms
Merri 2: 14 ms
Compiled benchmark four:
Psyke: 2427 ms
Ellis Dee: 1736 ms
MarMan: 3114 ms
Merri 1: 553 ms
Merri 2: 357 ms
As my faster one was missed being only in the attachment:Private Declare Function ArrPtr Lib "msvbvm60" Alias "VarPtr" (Arr() As Any) As Long
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
Public Sub GetNumberM2(Text As String, O() As String)
Dim LA() As Long, LH(0 To 5) As Long, LP As Long
Dim IA() As Integer, IH(0 To 5) As Long, IP As Long
Dim C As Integer, i As Long, P As Long, T As Long
LP = ArrPtr(LA)
LH(0) = 1: LH(1) = 4: LH(4) = &H3FFFFFFF
PutMem4 LP, VarPtr(LH(0))
IP = ArrPtr(IA)
IH(0) = 1: IH(1) = 2: IH(3) = StrPtr(Text): IH(4) = Len(Text)
LH(3) = IP: LA(0) = VarPtr(IH(0))
ReDim O(0 To Len(Text) \ 2)
For i = 0 To UBound(IA)
C = IA(i)
If C > 57 Then
If P Then
O(T) = Mid$(Text, P, i - P + 1)
T = T + 1
P = 0
End If
ElseIf C < 48 Then
If P Then
O(T) = Mid$(Text, P, i - P + 1)
T = T + 1
P = 0
End If
ElseIf P = 0 Then
P = i + 1
End If
Next i
If P Then
O(T) = Mid$(Text, P, i - P + 1)
ReDim Preserve O(T)
ElseIf T Then
ReDim Preserve O(T - 1)
Else
O = Split(vbNullString)
End If
LH(3) = IP: LA(0) = 0
LH(3) = LP: LA(0) = 0
End Sub
The attached project also contains fixed code for MarMan (Integer -> Long) and my first function.
Merri
Jan 14th, 2011, 06:18 AM
smaller code is "usually" faster
This isn't true in VB6. Smaller code usually means you are using built-in functions, for example Split, to keep your code short. With longer code that I write (and it is often the longest solution) I do a lot of small tweaking, which results in a long code. But it is often the fastest as well.
The world of .NET may be a lot different though, as they probably have optimized a lot of the functions. In VB6 only a few built-in functions have been truly optimized for speed.
Merri
Jan 14th, 2011, 08:43 AM
MarMan: here is an optimized version of your function:Public Function GetNumbers(strText As String) As Long()
Dim LA() As Long, LH(0 To 5) As Long, LP As Long
Dim IA() As Integer, IH(0 To 5) As Long, IP As Long
Dim r As Long
Dim bHaveDigit As Boolean
Dim intChar As Integer
Dim intCounter As Long
Dim intValue As Long
Dim intNumbers() As Long
LP = ArrPtr(LA)
LH(0) = 1: LH(1) = 4: LH(4) = &H3FFFFFFF
PutMem4 LP, VarPtr(LH(0))
IP = ArrPtr(IA)
IH(0) = 1: IH(1) = 2: IH(3) = StrPtr(strText): IH(4) = Len(strText)
LH(3) = IP: LA(0) = VarPtr(IH(0))
ReDim intNumbers(255)
For r = 0 To UBound(IA)
intChar = IA(r) - 48
If intChar >= 0 And intChar <= 9 Then
intValue = intValue * 10 + intChar
bHaveDigit = True
ElseIf bHaveDigit Then
intNumbers(intCounter) = intValue
intCounter = intCounter + 1
If (intCounter Mod 256) = 0 Then
ReDim Preserve intNumbers(intCounter + 255)
End If
intValue = 0
bHaveDigit = False
End If
Next
If bHaveDigit Then intNumbers(intCounter) = intValue: intCounter = intCounter + 1
If intCounter > 0 Then
ReDim Preserve intNumbers(intCounter - 1)
GetNumbers = intNumbers
End If
LH(3) = IP: LA(0) = 0
LH(3) = LP: LA(0) = 0
End Function
Exponent is slow, so I turned the logic around to multiply by 10. This also cures the problem of having results in the reversed order. I also removed Select Case structure to a somewhat simpler If structure, which in this case executes faster (yes, I tested this). Finally I added the Integer array instead for Asc & Mid combination which makes this solution the fastest. If you want a pure VB6 solution instead then the change is quite easy to do:Public Function GetNumbers(strText As String) As Long()
Dim r As Long
Dim bHaveDigit As Boolean
Dim intChar As Integer
Dim intCounter As Long
Dim intValue As Long
Dim intNumbers() As Long
ReDim intNumbers(255)
For r = 1 To Len(Text)
intChar = AscW(Mid$(strText, r, 1)) - 48
If intChar >= 0 And intChar <= 9 Then
intValue = intValue * 10 + intChar
bHaveDigit = True
ElseIf bHaveDigit Then
intNumbers(intCounter) = intValue
intCounter = intCounter + 1
If (intCounter Mod 256) = 0 Then
ReDim Preserve intNumbers(intCounter + 255)
End If
intValue = 0
bHaveDigit = False
End If
Next
If bHaveDigit Then intNumbers(intCounter) = intValue: intCounter = intCounter + 1
If intCounter > 0 Then
ReDim Preserve intNumbers(intCounter - 1)
GetNumbers = intNumbers
End If
End Function
But this slows it down quite a bit, because Mid$ creates a new string for each character. The earlier code just reads the strText data directly as an Integer array.
MarMan
Jan 14th, 2011, 08:58 AM
Edit: Please disregard this. I didn't read the above post.:blush:
My results weren't very good.:( (optimized for short strings with sequential digits does poorly with single digits interspersed with other characters in very large strings)
I tried to improve it to work well with different sized strings. Thanks for testing it for me!;)
Public Function GetNumbers(ByVal strText As String) As Integer()
Dim r As Integer
Dim bHaveDigit As Boolean
Dim intChar As Integer
Dim intCounter As Integer
Dim intExponent As Integer
Dim intNumbers() As Integer
Dim lngGuessCount As Long
intCounter = -1
intExponent = -1
lngGuessCount = 10 * (Len(Str(Len(strText))) - 2)
if lngGuessCount < 10 then
lngGuessCount = 10
elseif lngGuessCount > 1000000 then
lngGuessCount = 1000000
endif
redim intNumbers (lngGuessCount - 1)
For r = Len(strText) To 1 Step -1
intChar = Asc(Mid(strText, r, 1))
Select Case intChar
Case 48 To 57
If Not bHaveDigit Then
intCounter = intCounter + 1
If intCounter / lngGuessCount = Int(intCounter / lngGuessCount) Then
redim preserve intNumbers(intCounter + lngGuessCount)
End If
End If
bHaveDigit = True
intExponent = intExponent + 1
intNumbers(intCounter) = intNumbers(intCounter) + (intChar - 48) * 10 ^ intExponent
Case Else
If bHaveDigit Then
bHaveDigit = False
End If
intExponent = -1
End Select
Next
redim preserve intNumbers (intCounter)
'Array.Reverse(intNumbers)
GetNumbers = intNumbers
End Function
Lord Orwell
Jan 14th, 2011, 09:58 AM
This isn't true in VB6. Smaller code usually means you are using built-in functions, for example Split, to keep your code short. With longer code that I write (and it is often the longest solution) I do a lot of small tweaking, which results in a long code. But it is often the fastest as well.
The world of .NET may be a lot different though, as they probably have optimized a lot of the functions. In VB6 only a few built-in functions have been truly optimized for speed.
my first idea was to convert to a byte array and work with it that way, but you had already done that and i didn't want to rehash your work.
Merri
Jan 14th, 2011, 10:38 AM
Well, my first byte array solution isn't the fastest possible, so feel free to improve on the byte array method :)
Atm I have two functions quite finished, GetNumbersToLong & GetNumbersToString – I guess there isn't much that could still be improved so I'll post them:Option Explicit
Private Declare Function ArrPtr Lib "msvbvm60" Alias "VarPtr" (Arr() As Any) As Long
Private Declare Sub PutMem4 Lib "msvbvm60" (ByVal Ptr As Long, ByVal Value As Long)
Private Declare Function SafeArrayRedim Lib "oleaut32" (ByVal saPtr As Long, Optional saBound As Currency) As Long
Public Function GetNumbersToLong(Text As String) As Long()
' temporary safe array variables
Dim LA() As Long, LH(0 To 5) As Long, LP As Long
Dim IA() As Integer, IH(0 To 5) As Long, IP As Long
' other variables
Dim C As Long, D As Long, I As Long, N As Boolean, O() As Long, V As Long
' create a temporary Long array to replace the need for PutMem4
LP = ArrPtr(LA)
' create safe array header for Long array
LH(0&) = 1&: LH(1&) = 4&: LH(4&) = &H3FFFFFFF
' this is the only PutMem4 call we need, accessing a Long array is much faster than calling PutMem4!
PutMem4 LP, VarPtr(LH(0&))
' create a temporary Integer array to access the contents of Text
IP = ArrPtr(IA)
' create safe array header for Integer array
IH(0&) = 1&: IH(1&) = 2&: IH(3&) = StrPtr(Text): IH(4&) = Len(Text)
' does the same as the PutMem4 line above, but without calling PutMem4 :)
LH(3&) = IP: LA(0&) = VarPtr(IH(0&))
' estimate the absolute maximum amount of items
V = IH(4&) \ 2&
If V > 255& Then
' if over 256 items then limit ReDim to 256 items
ReDim O(255&)
Else
' otherwise we only ReDim what we will absolutely need
' note: if this line executes then we won't ever call ReDim Preserve within the For loop
ReDim O(V)
End If
' then we loop through all characters
For I = 0& To UBound(IA)
' convert from Integer to Long for better speed and drop highest bit (= negative indicator)
D = (CLng(IA(I)) - 48&) And &H7FFFFFFF
' are we processing numbers?
If Not N Then
' is this the first number?
If D <= 9& Then
' first number!
V = D
' enter "processing numbers" mode
N = True
End If
' we are in "processing numbers" mode, see if we need to add a new digit
ElseIf D <= 9& Then
V = V * 10 + D
' we must end "processing numbers" mode
Else
' store the final number into array
O(C) = V
' increase counter
C = C + 1&
' see if we are in danger of going out of buffer, reserve 256 new items for us if so
If (C Mod 256&) = 0& Then ReDim Preserve O(C + 255&)
' end the "processing numbers" mode
N = False
End If
Next
' if we are in "processing numbers" mode then we still must add the final item to the array
If N Then O(C) = V: C = C + 1&
' did we get any items?
If C > 0& Then
' set ubound
C = C - 1&
' do we need to resize the array?
If UBound(O&) > C Then ReDim Preserve O(C)
Else
' remove all items from the array (LBound = 0, UBound = -1)
SafeArrayRedim Not Not O
' VB6 IDE has a bug calling Not for an array, must call this to get rid of it
Debug.Assert App.hInstance
End If
' remove temporary Integer array
LH(3&) = IP: LA(0&) = 0&
' remove temporary Long array
LH(3&) = LP: LA(0&) = 0&
' return the resulting array
GetNumbersToLong = O
End Function
Public Function GetNumbersToString(Text As String) As String()
' temporary safe array variables
Dim LA() As Long, LH(0 To 5) As Long, LP As Long
Dim IA() As Integer, IH(0 To 5) As Long, IP As Long
' other variables
Dim C As Long, D As Long, I As Long, O() As String, V As Long
' create a temporary Long array to replace the need for PutMem4
LP = ArrPtr(LA)
' create safe array header for Long array
LH(0&) = 1&: LH(1&) = 4&: LH(4&) = &H3FFFFFFF
' this is the only PutMem4 call we need, accessing a Long array is much faster than calling PutMem4!
PutMem4 LP, VarPtr(LH(0&))
' create a temporary Integer array to access the contents of Text
IP = ArrPtr(IA)
' create safe array header for Integer array
IH(0&) = 1&: IH(1&) = 2&: IH(3&) = StrPtr(Text): IH(4&) = Len(Text)
' does the same as the PutMem4 line above, but without calling PutMem4 :)
LH(3&) = IP: LA(0&) = VarPtr(IH(0&))
' estimate the absolute maximum amount of items
D = IH(4&) \ 2&
If D > 255& Then
' if over 256 items then limit ReDim to 256 items
ReDim O(255&)
Else
' otherwise we only ReDim what we will absolutely need
' note: if this line executes then we won't ever call ReDim Preserve within the For loop
ReDim O(D)
End If
' then we loop through all characters
For I = 0 To UBound(IA)
' convert from Integer to Long for better speed and drop highest bit (= negative indicator)
D = (CLng(IA(I)) - 48&) And &H7FFFFFFF
' are we processing numbers?
If V = 0& Then
' if this the first number then enter "processing numbers" mode
If D <= 9& Then V = I + 1&
' do we have to end "processing numbers" mode?
ElseIf D > 9& Then
' store the string into array
If V > 1& Then
O(C) = Mid$(Text, V, I + 1& - V)
Else
O(C) = Left$(Text, I + 1& - V)
End If
' increase counter
C = C + 1&
' see if we are in danger of going out of buffer, reserve 256 new items for us if so
If (C Mod 256&) = 0& Then ReDim Preserve O(C + 255&)
' end the "processing numbers" mode
V = 0&
End If
Next
' if we are in "processing numbers" mode then we still must add the final item to the array
If V > 0& Then O(C) = Right$(Text, I + 1& - V): C = C + 1&
' did we get any items?
If C > 0& Then
' set ubound
C = C - 1&
' do we need to resize the array?
If UBound(O) > C Then ReDim Preserve O(C)
Else
' remove all items from the array (LBound = 0, UBound = -1)
SafeArrayRedim Not Not O
' VB6 IDE has a bug calling Not for an array, must call this to get rid of it
Debug.Assert App.hInstance
End If
' remove temporary Integer array
LH(3&) = IP: LA(0) = 0&
' remove temporary Long array
LH(3&) = LP: LA(0) = 0&
' return the resulting array
GetNumbersToString = O
End Function
Long version is much faster, but it is limited to nine digits (2 147 483 647 = max for Long).
Screenshot from compiled benchmarker.
MarMan
Jan 14th, 2011, 11:13 AM
Nice job!
Merri
Jan 14th, 2011, 02:36 PM
Any advice about my function Merri?
lngTextLen = LenB(sText) \ 2 + 1
Len is faster than LenB(sText) \ 2 – I don't know why.
ReDim strTempArr$(0)
This would be equal to thinking "I expect to have only one item, it is rare to have more than one"
Do
If intAsc(lngPos) < 48 Then Exit Do
If intAsc(lngPos) > 57 Then Exit Do
strNum = strNum + ChrW$(intAsc(lngPos))
lngPos = lngPos + 1
Loop Until lngPos = lngTextLen
While you have optimized the check for string length (lngPos = lngTextLen) it doesn't help much as you're doing a string concatenation. String concatenation is actually relatively fast with short strings, but it is always faster to create the final full size string if possible.
If LenB(strNum) Then
Q = Q + 1
ReDim Preserve strTempArr$(Q)
End If
You do ReDim Preserve each time you add a new item. This is too often and is the biggest performance bottleneck in your code.
Finally, your code doesn't account for the empty extra item at the end of your array.
I've been playing around with the idea of reserving one big space in memory for all the BSTRs that are created and then manually fill this space. Well, I found out that I can't do this. BSTRs are reserved from process heap, you can validate this with HeapSize(GetProcessHeap, 0, StrPtr(strText) - 4) – and this reveals the fact that each string is a single item in the heap, and one such item has always a specific size. So you can't have one continuous, minimal block of memory that has all the output strings there, because that could be only one big item in the heap. The system will not be able to follow this and you'll eventually get a crash, because it expects each BSTR to be a single item in the heap.
So the conclusion is that it'll be very hard to optimize the string solution, because creating strings via API calls will be slower than native VB6 code (such as Mid$).
*PsyKE1*
Jan 15th, 2011, 03:52 PM
Hi Merri!
Thanks for the tips men! ;)
I have two little questions for you:
1.-Are you sure that Len() is faster that LenB()\2 ?
I think that i tested it days ago and if i compile the program LenB()\2 was faster... I think... :P
2.-I can´t create 2 intarrays using PutMem4 api...
Imagine that i have 2 string variables in the call, how is the way to convert it to Integer? my vb crash when I try do something... :(
Thanks friend, now i haven't much time, but in a few days i'll look your new functions.
@MarMan
This:
Int(intCounter / lngGuessCount)
is the same of this:
intCounter \ lngGuessCount
Merri
Jan 15th, 2011, 03:59 PM
1) It is quite minor so as it isn't in a loop it doesn't really matter – in this particular use Len() would win as it is shorter to write. Many speed optimizations only really matter within a loop.
2) Probably better to just post code, but I guess it would be a matter of another topic (to keep this one cleaner).
MarMan
Jan 17th, 2011, 09:10 AM
@PsyKe1
This:
Int(intCounter / lngGuessCount)
is the same of this:
intCounter \ lngGuessCount
I keep forgetting about that. Is the latter faster?
Merri
Jan 17th, 2011, 09:17 AM
Yes, roughly twice faster.
However that part of the code:
If intCounter / lngGuessCount = Int(intCounter / lngGuessCount) Then
Could also be dealt this way:
If (intCounter Mod lngGuessCount) = 0 Then
I think it should be faster, I don't bother to test.
Also if you used a value that is a multiple of 2 then you could use And, ie. in case lngGuessCount = 1024:
If (intCounter And (lngGuessCount - 1)) = 0 Then
It would trigger every 1024th, because 1023 would match all the lower bits. But only works with multiple of 2 (4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 etc.) – and this would certainly be the fastest.
MarMan
Jan 17th, 2011, 09:24 AM
Very clever.. Thank you!
So that's why you used 256 in post #24, I couldn't figure out the significance till your last explanation.
Merri
Jan 17th, 2011, 09:40 AM
Well, I did use Mod there anyway as I didn't stop to really think about And until my last post :) Mod works with any number.
Instead I used 256 as, without testing really, I thought it may work more optimally with memory. The array would always grow with 1024 bytes when more space is needed. However I'm not certain whether this in itself would be a good reason for better speed, but using And would definately give a nice boost over Mod.
If to be compared with memory allocation benchmark it has to be remembered that each computer is a bit different. Something that works great with one computer in comparison to another method may not give equal kind of difference on another computer. So for me allocation style is something that comes with another kind of priority: minimizing memory usage. Increasing 1 kB step-by-step is quite minimal waste, but if size of array would be doubled each time then it would be likely the array would at one point have a lot of waste. Worst case scenario: 100000 items in array, ReDim Preserve to 200000 items, whoops, the next item is the last one to be added, ReDim Preserve back to 100001 items...
So if one wanted to make the allocation more intelligent then the remaining amount of max. possibilities could be accounted for before doing such waste. Or one could allocate more items based on avarage findings until current position... this kind of optimizations could be done, but they'd require new test strings that are not as straightforward as the ones that for example I used in my test.
Tell me about getting lost in my own thoughts!
vbforums.com
Copyright Internet.com Inc., All Rights Reserved.