Results 1 to 25 of 25

Thread: Latest of my project ... Need for speed

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Latest of my project ... Need for speed

    My source code is below

    Private Sub Command1_Click()
    MsgBox FilterData("0012-1189-4567", 1, "12")
    MsgBox FilterData("0012-1189-4567", 3, "126")
    End Sub

    Private Function FilterData(strDg As String, ipos As Byte, strcut As String) As String
    Dim idg() As String
    Dim i As Integer
    Dim icnt As Integer
    Dim icntcut As Integer
    Dim icut As Byte
    Dim ilen As Byte
    Dim tmpStr As String
    If Trim(strDg) <> "" Then
    idg = Split(strDg, "-")
    tmpStr = ""
    icnt = 1
    icntcut = 0
    For i = 0 To UBound(idg)
    If InStr(1, strcut, Mid(idg(i), ipos, 1), vbTextCompare) <> 0 And InStr(1, tmpStr, idg(i), vbTextCompare) = 0 Then
    If icnt <= 9 Then
    tmpStr = tmpStr & Trim(idg(i)) & "-"
    Else
    tmpStr = tmpStr & Trim(idg(i)) & vbCrLf
    icnt = 0
    End If
    icnt = icnt + 1
    icntcut = icntcut + 1
    End If
    Next
    FilterData = tmpStr
    Else
    FilterData = ""
    End If
    End Function

    How to modify function FilterData to very fast if strdg = "0000-0001-...-9999" (10000 sets)

  2. #2
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    I can't tell what your filtering is supposed to be doing. Could you describe it in words?

    Here's the code from the OP with highlight tags:
    vb Code:
    1. Private Sub Command1_Click()
    2.     MsgBox FilterData("0012-1189-4567", 1, "12")
    3.     MsgBox FilterData("0012-1189-4567", 3, "126")
    4. End Sub
    5.  
    6. Private Function FilterData(strDg As String, ipos As Byte, strcut As String) As String
    7.     Dim idg() As String
    8.     Dim i As Integer
    9.     Dim icnt As Integer
    10.     Dim icntcut As Integer
    11.     Dim icut As Byte
    12.     Dim ilen As Byte
    13.     Dim tmpStr As String
    14.     If Trim(strDg) <> "" Then
    15.         idg = Split(strDg, "-")
    16.         tmpStr = ""
    17.         icnt = 1
    18.         icntcut = 0
    19.         For i = 0 To UBound(idg)
    20.             If InStr(1, strcut, Mid(idg(i), ipos, 1), vbTextCompare) <> 0 And InStr(1, tmpStr, idg(i), vbTextCompare) = 0 Then
    21.                 If icnt <= 9 Then
    22.                     tmpStr = tmpStr & Trim(idg(i)) & "-"
    23.                 Else
    24.                     tmpStr = tmpStr & Trim(idg(i)) & vbCrLf
    25.                     icnt = 0
    26.                 End If
    27.                 icnt = icnt + 1
    28.                 icntcut = icntcut + 1
    29.             End If
    30.         Next
    31.         FilterData = tmpStr
    32.     Else
    33.         FilterData = ""
    34.     End If
    35. End Function

  3. #3
    G&G Moderator chemicalNova's Avatar
    Join Date
    Jun 2002
    Location
    Victoria, Australia
    Posts
    4,246

    Re: Latest of my project ... Need for speed

    You won't get much faster. EVERY program would struggle with that many iterations.. unless it were written in optimized-binary and running on NASA computers..

    The best you'll get is having some sort of strange lookup table.. so you don't have to keep calling Mid(), Instr(), etc. Other than that.. thats the best you'll get for speed.

    chem

    Visual Studio 6, Visual Studio.NET 2005, MASM

  4. #4
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    I disagree. Mid() and InStr() can be tinkered with to improve speed; they don't necessarily cause a routine to be slow.

    String concatenation using the ampersand (&) operator, on the other hand, is extremely slow and can be greatly speeded up.

    I could help with that if I could understand what the filtering is intended to do.

  5. #5
    G&G Moderator chemicalNova's Avatar
    Join Date
    Jun 2002
    Location
    Victoria, Australia
    Posts
    4,246

    Re: Latest of my project ... Need for speed

    If you want to go that far you might aswell start using byte arrays and 32-bit types to increase speed.

    chem

    Visual Studio 6, Visual Studio.NET 2005, MASM

  6. #6

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by Ellis Dee
    I disagree. Mid() and InStr() can be tinkered with to improve speed; they don't necessarily cause a routine to be slow.

    String concatenation using the ampersand (&) operator, on the other hand, is extremely slow and can be greatly speeded up.

    I could help with that if I could understand what the filtering is intended to do.

    For example
    1. MsgBox FilterData("0012-1189-4567", 1, "12")
    That mean select data that have 1 or 2 in First position (please see red character that is a first position). Result is "1189"

    2. MsgBox FilterData("0012-1189-4567", 3, "126")
    That mean select data that have 1 or 2 or 6 in third position (please see blue character that is a third position). Result is "0012-4567"

  7. #7
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    It looks like you're filtering out duplicates. Is that the case?

  8. #8

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Data 10000 sets not duplicates then after filter it will not duplicates too.

  9. #9
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by standardusr
    Data 10000 sets not duplicates then after filter it will not duplicates too.
    In that case, this will be much, much faster:
    Code:
    Private Function FilterData(pstrText As String, plngPosition As Long, ByVal pstrFilter As String) As String
        Dim strToken() As String
        Dim lngCount As Long
        Dim lngLen As Long
        Dim i As Long
        Dim lngOldPos As Long
        Dim lngNewPos As Long
        
        If Len(pstrText) = 0 Then Exit Function
        strToken = Split(LCase$(pstrText), "-")
        FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9)
        pstrFilter = LCase$(pstrFilter)
        lngNewPos = 1
        lngOldPos = 1
        For i = 0 To UBound(strToken)
            If Len(strToken(i)) >= plngPosition And InStr(pstrFilter, Mid$(strToken(i), plngPosition, 1)) <> 0 Then
                lngCount = lngCount + 1
                If lngCount < 11 Then
                    Mid$(FilterData, lngNewPos, 1) = "-"
                    lngNewPos = lngNewPos + 1
                Else
                    Mid$(FilterData, lngNewPos, 2) = vbNewLine
                    lngNewPos = lngNewPos + 2
                    lngCount = 1
                End If
                Mid$(FilterData, lngNewPos, Len(strToken(i))) = Mid$(pstrText, lngOldPos, Len(strToken(i)))
                lngNewPos = lngNewPos + Len(strToken(i))
            End If
            lngOldPos = lngOldPos + Len(strToken(i)) + 1
        Next
        Erase strToken
        FilterData = Mid$(FilterData, 2, lngNewPos - 2)
    End Function
    Last edited by Ellis Dee; Sep 8th, 2007 at 01:33 AM.

  10. #10
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    The edited code above should do what you want. I ran a quick benchmark against 10,000 4-digit numbers:

    Old way: 1.57 seconds (5100 chars)
    New way: 0.08 seconds (5098 chars)

    Note that the new way doesn't include the trailing delimiter, which is either a dash (-) or vbNewLine. For the benchmark, the final delimiter was vbNewLine, which is why the new way is two characters shorter.

  11. #11

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed


  12. #12
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Please stop PM'ing me; I get email notifications whenever the thread gets bumped.

    In neither you PM or your last post have you explained what is wrong. If you post the function call (even if as an attached project) that causes the problem, I'd be happy to take a look.

  13. #13
    G&G Moderator chemicalNova's Avatar
    Join Date
    Jun 2002
    Location
    Victoria, Australia
    Posts
    4,246

    Re: Latest of my project ... Need for speed

    Expert tip: Turn off email notifications That used to annoy the hell out of me when I had free time at work.. go on a posting spree.. 15 minutes later my inbox was filled with emails from this forum

    chem

    Visual Studio 6, Visual Studio.NET 2005, MASM

  14. #14

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by Ellis Dee
    Please stop PM'ing me; I get email notifications whenever the thread gets bumped.

    In neither you PM or your last post have you explained what is wrong. If you post the function call (even if as an attached project) that causes the problem, I'd be happy to take a look.
    OK, I am sorry. I will not send PM to you.

  15. #15
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by chemicalNova
    Expert tip: Turn off email notifications That used to annoy the hell out of me when I had free time at work.. go on a posting spree.. 15 minutes later my inbox was filled with emails from this forum
    I like email notifications.

  16. #16
    G&G Moderator chemicalNova's Avatar
    Join Date
    Jun 2002
    Location
    Victoria, Australia
    Posts
    4,246

    Re: Latest of my project ... Need for speed

    You're crazy

    This ends my spam in this thread.

    chem

    Visual Studio 6, Visual Studio.NET 2005, MASM

  17. #17
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Did the results shown in post 11 come from my code? If so, those linefeeds look all wrong.

    I am unable to reproduce the problem; maybe attach a text file containing the source string plus the actual function call and I'll be able to figure it out.

  18. #18

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by Ellis Dee
    Did the results shown in post 11 come from my code? If so, those linefeeds look all wrong.

    I am unable to reproduce the problem; maybe attach a text file containing the source string plus the actual function call and I'll be able to figure it out.
    Please see full data in attatchment

    After that test by call function
    debug.print FilterData(strFulldata,1,"1235")
    Attached Files Attached Files

  19. #19
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Couple quick things:

    Your original code appears to produce the same bug. Weird.

    Did you want to clean up the linefeeds so that the filtered results puts 10 items per line?

  20. #20

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by Ellis Dee
    Couple quick things:

    Your original code appears to produce the same bug. Weird.

    Did you want to clean up the linefeeds so that the filtered results puts 10 items per line?
    Yes, I want to show results 10 items per line.

  21. #21
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Ah, I figured it out.

    The tokens are being split by dashes (-), but not all the tokens are delimited by dashes. Consider these two lines:

    5990-5991-5992-5993-5994-5995-5996-5997-5998-5999
    6000-6001-6002-6003-6004-6005-6006-6007-6008-6009

    Since there is no dash between 5999 and 6000, they are being read as a single token equivalent to:

    "5999" & vbNewLine & "6000"

    The left 1 character of this expression is 5, so it qualifies as a valid token. That's why 6000 is appearing in both versions.

    The fix is simple enough, and it will correct the output formatting as well.
    Code:
    Private Function FilterData(ByVal pstrText As String, plngPosition As Long, ByVal pstrFilter As String) As String
        Dim strToken() As String
        Dim lngCount As Long
        Dim lngLen As Long
        Dim i As Long
        Dim lngOldPos As Long
        Dim lngNewPos As Long
        
        If Len(pstrText) = 0 Then Exit Function
        pstrText = Replace$(pstrText, vbNewLine, "-")
        strToken = Split(LCase$(pstrText), "-")
        FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9)
        pstrFilter = LCase$(pstrFilter)
        lngNewPos = 1
        lngOldPos = 1
        For i = 0 To UBound(strToken)
            If Len(strToken(i)) >= plngPosition And InStr(pstrFilter, Mid$(strToken(i), plngPosition, 1)) <> 0 Then
                lngCount = lngCount + 1
                If lngCount < 11 Then
                    Mid$(FilterData, lngNewPos, 1) = "-"
                    lngNewPos = lngNewPos + 1
                Else
                    Mid$(FilterData, lngNewPos, 2) = vbNewLine
                    lngNewPos = lngNewPos + 2
                    lngCount = 1
                End If
                Mid$(FilterData, lngNewPos, Len(strToken(i))) = Mid$(pstrText, lngOldPos, Len(strToken(i)))
                lngNewPos = lngNewPos + Len(strToken(i))
            End If
            lngOldPos = lngOldPos + Len(strToken(i)) + 1
        Next
        Erase strToken
        FilterData = Mid$(FilterData, 2, lngNewPos - 2)
    End Function

  22. #22

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Thank you very much Ellis Dee. You are expert programmer.

  23. #23

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    I have some question to you Ellis Dee

    When I call function below

    MsgBox FilterData("0123", 1, "01")

    Why it is show "012" and how to fix it?

  24. #24
    PowerPoster Ellis Dee's Avatar
    Join Date
    Mar 2007
    Location
    New England
    Posts
    3,530

    Re: Latest of my project ... Need for speed

    Quote Originally Posted by standardusr
    When I call function below

    MsgBox FilterData("0123", 1, "01")

    Why it is show "012" and how to fix it?
    Looks like the initial buffer isn't large enough. To fix it, make the following change near the top of the function:

    FilterData = Space$(Len(pstrText) + (UBound(strToken) + 1) \ 9 + 1)

    This bug crept in because I changed the way I handled the first matching token. Originally, I identified inside the loop whether or not the token was the first match found, and if so, I didn't add a "-" prefix. Later, I decided that that was wasted cpu cycles inside the loop, so I changed it to always stuff a dash before every matching token, and then at the very end strip off the first character.

    In that buffer allocation line above, you'll note that it's adding Ubound() \ 9 instead of UBound() \ 10. It could probably change to \ 10; I set it to \ 9 to stay on the safe side and ensure there was enough space to handle the linefeeds.

    That means that this bug you found would only surface if the following two conditions were met:
    1. Every token is a match
    2. The total number of tokens <= 10
    So that's bug I never would have noticed. Kudos to you on your debugging.
    Last edited by Ellis Dee; Sep 8th, 2007 at 09:04 PM.

  25. #25

    Thread Starter
    Addicted Member
    Join Date
    Aug 2007
    Posts
    227

    Re: Latest of my project ... Need for speed

    Thank you very much. Now I not found bug in my program.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width