-
Latest of my project ... Need for speed
My source code is below
Private Sub Command1_Click()
MsgBox FilterData("0012-1189-4567", 1, "12")
MsgBox FilterData("0012-1189-4567", 3, "126")
End Sub
Private Function FilterData(strDg As String, ipos As Byte, strcut As String) As String
Dim idg() As String
Dim i As Integer
Dim icnt As Integer
Dim icntcut As Integer
Dim icut As Byte
Dim ilen As Byte
Dim tmpStr As String
If Trim(strDg) <> "" Then
idg = Split(strDg, "-")
tmpStr = ""
icnt = 1
icntcut = 0
For i = 0 To UBound(idg)
If InStr(1, strcut, Mid(idg(i), ipos, 1), vbTextCompare) <> 0 And InStr(1, tmpStr, idg(i), vbTextCompare) = 0 Then
If icnt <= 9 Then
tmpStr = tmpStr & Trim(idg(i)) & "-"
Else
tmpStr = tmpStr & Trim(idg(i)) & vbCrLf
icnt = 0
End If
icnt = icnt + 1
icntcut = icntcut + 1
End If
Next
FilterData = tmpStr
Else
FilterData = ""
End If
End Function
How to modify function FilterData to very fast if strdg = "0000-0001-...-9999" (10000 sets)
-
Re: Latest of my project ... Need for speed
I can't tell what your filtering is supposed to be doing. Could you describe it in words?
Here's the code from the OP with highlight tags:
vb Code:
Private Sub Command1_Click()
MsgBox FilterData("0012-1189-4567", 1, "12")
MsgBox FilterData("0012-1189-4567", 3, "126")
End Sub
Private Function FilterData(strDg As String, ipos As Byte, strcut As String) As String
Dim idg() As String
Dim i As Integer
Dim icnt As Integer
Dim icntcut As Integer
Dim icut As Byte
Dim ilen As Byte
Dim tmpStr As String
If Trim(strDg) <> "" Then
idg = Split(strDg, "-")
tmpStr = ""
icnt = 1
icntcut = 0
For i = 0 To UBound(idg)
If InStr(1, strcut, Mid(idg(i), ipos, 1), vbTextCompare) <> 0 And InStr(1, tmpStr, idg(i), vbTextCompare) = 0 Then
If icnt <= 9 Then
tmpStr = tmpStr & Trim(idg(i)) & "-"
Else
tmpStr = tmpStr & Trim(idg(i)) & vbCrLf
icnt = 0
End If
icnt = icnt + 1
icntcut = icntcut + 1
End If
Next
FilterData = tmpStr
Else
FilterData = ""
End If
End Function
-
Re: Latest of my project ... Need for speed
You won't get much faster. EVERY program would struggle with that many iterations.. unless it were written in optimized-binary and running on NASA computers..
The best you'll get is having some sort of strange lookup table.. so you don't have to keep calling Mid(), Instr(), etc. Other than that.. thats the best you'll get for speed.
chem
-
Re: Latest of my project ... Need for speed
I disagree. Mid() and InStr() can be tinkered with to improve speed; they don't necessarily cause a routine to be slow.
String concatenation using the ampersand (&) operator, on the other hand, is extremely slow and can be greatly speeded up.
I could help with that if I could understand what the filtering is intended to do.
-
Re: Latest of my project ... Need for speed
If you want to go that far you might aswell start using byte arrays and 32-bit types to increase speed.
chem
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by Ellis Dee
I disagree. Mid() and InStr() can be tinkered with to improve speed; they don't necessarily cause a routine to be slow.
String concatenation using the ampersand (&) operator, on the other hand, is extremely slow and can be greatly speeded up.
I could help with that if I could understand what the filtering is intended to do.
For example
1. MsgBox FilterData("0012-1189-4567", 1, "12")
That mean select data that have 1 or 2 in First position (please see red character that is a first position). Result is "1189"
2. MsgBox FilterData("0012-1189-4567", 3, "126")
That mean select data that have 1 or 2 or 6 in third position (please see blue character that is a third position). Result is "0012-4567"
-
Re: Latest of my project ... Need for speed
It looks like you're filtering out duplicates. Is that the case?
-
Re: Latest of my project ... Need for speed
Data 10000 sets not duplicates then after filter it will not duplicates too.
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by standardusr
Data 10000 sets not duplicates then after filter it will not duplicates too.
In that case, this will be much, much faster:
Code:
Private Function FilterData(pstrText As String, plngPosition As Long, ByVal pstrFilter As String) As String
Dim strToken() As String
Dim lngCount As Long
Dim lngLen As Long
Dim i As Long
Dim lngOldPos As Long
Dim lngNewPos As Long
If Len(pstrText) = 0 Then Exit Function
strToken = Split(LCase$(pstrText), "-")
FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9)
pstrFilter = LCase$(pstrFilter)
lngNewPos = 1
lngOldPos = 1
For i = 0 To UBound(strToken)
If Len(strToken(i)) >= plngPosition And InStr(pstrFilter, Mid$(strToken(i), plngPosition, 1)) <> 0 Then
lngCount = lngCount + 1
If lngCount < 11 Then
Mid$(FilterData, lngNewPos, 1) = "-"
lngNewPos = lngNewPos + 1
Else
Mid$(FilterData, lngNewPos, 2) = vbNewLine
lngNewPos = lngNewPos + 2
lngCount = 1
End If
Mid$(FilterData, lngNewPos, Len(strToken(i))) = Mid$(pstrText, lngOldPos, Len(strToken(i)))
lngNewPos = lngNewPos + Len(strToken(i))
End If
lngOldPos = lngOldPos + Len(strToken(i)) + 1
Next
Erase strToken
FilterData = Mid$(FilterData, 2, lngNewPos - 2)
End Function
-
Re: Latest of my project ... Need for speed
The edited code above should do what you want. I ran a quick benchmark against 10,000 4-digit numbers:
Old way: 1.57 seconds (5100 chars)
New way: 0.08 seconds (5098 chars)
Note that the new way doesn't include the trailing delimiter, which is either a dash (-) or vbNewLine. For the benchmark, the final delimiter was vbNewLine, which is why the new way is two characters shorter.
-
Re: Latest of my project ... Need for speed
-
Re: Latest of my project ... Need for speed
Please stop PM'ing me; I get email notifications whenever the thread gets bumped.
In neither you PM or your last post have you explained what is wrong. If you post the function call (even if as an attached project) that causes the problem, I'd be happy to take a look.
-
Re: Latest of my project ... Need for speed
Expert tip: Turn off email notifications :p That used to annoy the hell out of me when I had free time at work.. go on a posting spree.. 15 minutes later my inbox was filled with emails from this forum :p
chem
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by Ellis Dee
Please stop PM'ing me; I get email notifications whenever the thread gets bumped.
In neither you PM or your last post have you explained what is wrong. If you post the function call (even if as an attached project) that causes the problem, I'd be happy to take a look.
OK, I am sorry. I will not send PM to you.
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by chemicalNova
Expert tip: Turn off email notifications :p That used to annoy the hell out of me when I had free time at work.. go on a posting spree.. 15 minutes later my inbox was filled with emails from this forum :p
I like email notifications.
-
Re: Latest of my project ... Need for speed
You're crazy :p
This ends my spam in this thread.
chem
-
Re: Latest of my project ... Need for speed
Did the results shown in post 11 come from my code? If so, those linefeeds look all wrong.
I am unable to reproduce the problem; maybe attach a text file containing the source string plus the actual function call and I'll be able to figure it out.
-
1 Attachment(s)
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by Ellis Dee
Did the results shown in post 11 come from my code? If so, those linefeeds look all wrong.
I am unable to reproduce the problem; maybe attach a text file containing the source string plus the actual function call and I'll be able to figure it out.
Please see full data in attatchment
After that test by call function
debug.print FilterData(strFulldata,1,"1235")
-
Re: Latest of my project ... Need for speed
Couple quick things:
Your original code appears to produce the same bug. Weird.
Did you want to clean up the linefeeds so that the filtered results puts 10 items per line?
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by Ellis Dee
Couple quick things:
Your original code appears to produce the same bug. Weird.
Did you want to clean up the linefeeds so that the filtered results puts 10 items per line?
Yes, I want to show results 10 items per line.
-
Re: Latest of my project ... Need for speed
Ah, I figured it out.
The tokens are being split by dashes (-), but not all the tokens are delimited by dashes. Consider these two lines:
5990-5991-5992-5993-5994-5995-5996-5997-5998-5999
6000-6001-6002-6003-6004-6005-6006-6007-6008-6009
Since there is no dash between 5999 and 6000, they are being read as a single token equivalent to:
"5999" & vbNewLine & "6000"
The left 1 character of this expression is 5, so it qualifies as a valid token. That's why 6000 is appearing in both versions.
The fix is simple enough, and it will correct the output formatting as well.
Code:
Private Function FilterData(ByVal pstrText As String, plngPosition As Long, ByVal pstrFilter As String) As String
Dim strToken() As String
Dim lngCount As Long
Dim lngLen As Long
Dim i As Long
Dim lngOldPos As Long
Dim lngNewPos As Long
If Len(pstrText) = 0 Then Exit Function
pstrText = Replace$(pstrText, vbNewLine, "-")
strToken = Split(LCase$(pstrText), "-")
FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9)
pstrFilter = LCase$(pstrFilter)
lngNewPos = 1
lngOldPos = 1
For i = 0 To UBound(strToken)
If Len(strToken(i)) >= plngPosition And InStr(pstrFilter, Mid$(strToken(i), plngPosition, 1)) <> 0 Then
lngCount = lngCount + 1
If lngCount < 11 Then
Mid$(FilterData, lngNewPos, 1) = "-"
lngNewPos = lngNewPos + 1
Else
Mid$(FilterData, lngNewPos, 2) = vbNewLine
lngNewPos = lngNewPos + 2
lngCount = 1
End If
Mid$(FilterData, lngNewPos, Len(strToken(i))) = Mid$(pstrText, lngOldPos, Len(strToken(i)))
lngNewPos = lngNewPos + Len(strToken(i))
End If
lngOldPos = lngOldPos + Len(strToken(i)) + 1
Next
Erase strToken
FilterData = Mid$(FilterData, 2, lngNewPos - 2)
End Function
-
Re: Latest of my project ... Need for speed
Thank you very much Ellis Dee. You are expert programmer.
-
Re: Latest of my project ... Need for speed
I have some question to you Ellis Dee
When I call function below
MsgBox FilterData("0123", 1, "01")
Why it is show "012" and how to fix it?
-
Re: Latest of my project ... Need for speed
Quote:
Originally Posted by standardusr
When I call function below
MsgBox FilterData("0123", 1, "01")
Why it is show "012" and how to fix it?
Looks like the initial buffer isn't large enough. To fix it, make the following change near the top of the function:
FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9 + 1)
This bug crept in because I changed the way I handled the first matching token. Originally, I identified inside the loop whether or not the token was the first match found, and if so, I didn't add a "-" prefix. Later, I decided that that was wasted cpu cycles inside the loop, so I changed it to always stuff a dash before every matching token, and then at the very end strip off the first character.
In that buffer allocation line above, you'll note that it's adding Ubound() \ 9 instead of UBound() \ 10. It could probably change to \ 10; I set it to \ 9 to stay on the safe side and ensure there was enough space to handle the linefeeds.
That means that this bug you found would only surface if the following two conditions were met:- Every token is a match
- The total number of tokens <= 10
So that's bug I never would have noticed. Kudos to you on your debugging.
-
Re: Latest of my project ... Need for speed
Thank you very much. Now I not found bug in my program.