dcsimg
Page 2 of 2 FirstFirst 12
Results 41 to 74 of 74

Thread: [RESOLVED] Fast count of words in a list

  1. #41
    PowerPoster
    Join Date
    Feb 2012
    Location
    West Virginia
    Posts
    13,023

    Re: Fast count of words in a list

    What do you mean by not sorted? I would expect VB to sort that as widow, widow's, windowhood, widows, widows'


    For the simple fact that the ASCII value of ' is 39 which is lower than the value of any alpha character.

  2. #42
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    @Olaf
    In my machine is 12sec for binary search with auto sorting, 20 if I use the old one that compares only equal length words, and 90sec if I use binary search and quick sort for every entry.

    As I said before there are routines to do much better, and yours is very good, but the task here for me is not to stay to that solutions, which are a black box, but explore ways that I can understand how it works, and share to others. You share the ways to do something, that is good, but you didn't give the deep knowledge.

    I thing that I have to use a hash table. I never make one and I will try.
    Merry Christmas
    George
    Last edited by georgekar; Dec 21st, 2014 at 08:09 PM.

  3. #43
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    @DataMiser,
    perhaps you can find that this
    ? strcomp("Aaronites","Aaron's")
    and this
    ? strcomp("Aaronites","Aaron's",vbtextCompare)
    give totally different output...

  4. #44
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,514

    Re: Fast count of words in a list

    Quote Originally Posted by georgekar View Post
    @DataMiser,
    perhaps you can find that this
    ? strcomp("Aaronites","Aaron's")
    and this
    ? strcomp("Aaronites","Aaron's",vbtextCompare)
    give totally different output...
    George, maybe you noticed, that the "make it case-insensitive"-Flag of the
    cSortedDictionary is a different Enum-Type - and named TextCompare instead of vbTextCompare.

    So it is not surprising, that you will get results which are a bit differently sorted.

    If you want to play around with all kinds of different Sort- and Compare-Schemes, the cSortedDictionary
    supports that as well, over its SetStringCompareFlags method, which is quite versatile.

    E.g. if you add a new small function to the CodeBase as:
    Code:
    Private Function ReSortDictionaryKeys(WD As cSortedDictionary, Flags As CmpFlags, Optional ByVal LCID As LangIDs) As cSortedDictionary
    Dim i As Long
      Set ReSortDictionaryKeys = New_c.SortedDictionary(, False)
          ReSortDictionaryKeys.SetStringCompareFlags Flags, LCID
    
      For i = 0 To WD.Count - 1
        ReSortDictionaryKeys.Add WD.KeyByIndex(i), WD.ItemByIndex(i)
      Next i
    End Function
    You could then apply it after the counting went through - by adding an additional line into the
    List-Click-Event:

    Code:
      ...
      New_c.Timing True
        ParseWords App.Path & "\Bible13.txt", WD, WordCountTotal, lstBooks.ListIndex
        Set WD = ReSortDictionaryKeys(WD, cmpIgnoreCase)
      lblReport.Caption = lstBooks.Text & ": -> Words Total: " & WordCountTotal & ", unique Words: " & WD.Count & ", Time needed:" & New_c.Timing
      ...
    The blue line above will cause the performance to dropy by ~10% (now about 0.37-0.38 seconds instead of the ~0.34sec without it).
    But those who want to sort the results by whatever LCID (or SubSorting-Schemes) of their specific countries, can do so now.

    Intellisense for playing around with that is supported - and I can easily choose e.g.:
    cSortedDictionary.SetStringCompareFlags cmpIgnoreCase, German_Germany, SORT_GERMAN_PHONE_BOOK
    (in case I want the special treatment for german umlauts, according to german phonebook-rules).

    Olaf
    Last edited by Schmidt; Dec 21st, 2014 at 09:55 PM.

  5. #45
    PowerPoster
    Join Date
    Jun 2013
    Posts
    4,514

    Re: Fast count of words in a list

    Quote Originally Posted by georgekar View Post
    As I said before there are routines to do much better,
    There are some approaches which *could* do better, but not by a very high margin -
    and neither your word-splitter-routine is currently up to the task - nor your current
    Binary-Sorting-Whilst-adding approach (which is BTW also used within the cSortedDictionary).

    Quote Originally Posted by georgekar View Post
    ...the task here for me is not to stay to that solutions, which are a black box, ...
    That's kind of a hypocrisy, because - when you apply that consequently, then you should refrain
    also from using the CompareStringW-call in your code (which sits in the "blackbox"-kernel32.dll).

    Please re-implement all its functionality in your own procedures first, to understand better.

    Quote Originally Posted by georgekar View Post
    You share the ways to do something, that is good, but you didn't give the deep knowledge.
    That's something one could really take offense with - George, the amount of code I shared
    with the community over the last two decades is immense. And if you have specific questions
    about something, you only need to ask (but perhaps better in another, dedicated thread).

    And it's again not really decent behaviour from you, that you not even *considered* looking at
    (or working with) the Word-Splitting routine I've posted, instead you stubbornly insist on using
    your own one further, although it is slower by a magnitude.

    For fast String-routines in VB, you have to "avoid using Strings" - and as "oxymoronic" as that
    sounds, it is nevertheless true - please look at the fastest performing StringRoutines at:
    http://www.xbeat.net/vbspeed/

    See, I could even point you to some older postings of mine, where I explained how to implement
    something like a Sorting Dictionary (with code) - but given your ignorance of the code I *did* post
    here already - I'm not sure, whether that would make any sense.

    Olaf

  6. #46
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    @Olaf
    Ok, I will look better your library. Not to replicated it

  7. #47
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    @Schmidt
    Your example is very fast but from using safearry isn't too much gain of speed.
    To prove that please insert this line :
    WD.SetStringCompareFlags cmpIgnoreCase, English_United_States, SORT_DEFAULT

    After these lines

    WordCountTotal = 0 'reset the WordTotal-Counter
    WB(-2) = 0 'set the initial StrLenB to zero
    Set WD = New_c.SortedDictionary(TextCompare)
    ..............
    WD.SetStringCompareFlags cmpIgnoreCase, English_United_States, SORT_DEFAULT

    in the original your example. As you see now need 3 times more to finish.

    Can you describe what exactly do in the TextCompare...option in a SortedDictionary?
    Last edited by georgekar; Dec 22nd, 2014 at 08:30 PM.

  8. #48
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    Sam, or is it SOB?

    Trying to wade thru all these posts. Let me see if I got this right?

    1) Both lists are sorted
    2) You want to count how many times a word is repeated in List1
    3) Then you want to add that word, with number of repeats, to List2, deleting the previous List2 entry (if one exists?)

    If I got the above correct, I see one flaw in your original routine & correct me if I'm wrong:
    Code:
    If LCase(List2.List(i)) = LCase(List2.List(totListItms)) Then...
    Since each item in List2 has an occurrence count appended to the word, are you only removing the entry if they have the same occurrence count as before? In other words, lets say List2 already has an entry: "The - 5 Occurrences". And after processing List1, you have "The - 3 Occurrences". What is suppose to happen. Is the previous entry to be removed, or only removed if the previous and new entry match exactly. Equal sign in your code is a problem if trying to match just on the word in List2

    Either way, I see this being done looping thru List1 just once. And using APIs, not looping thru List2 at all.

    Edited: Here's an example using the API to locate a match just on the word, not the entire entry in List2. It could be what you are looking for a close enough to modify. The API usage prevents us from having to loop thru List2 every iteration until we found a 'match' or exhausted the list
    Code:
    Private Declare Function SendMessage Lib "user32.dll" Alias "SendMessageA" (ByVal hwnd As Long, ByVal wMsg As Long, ByVal wParam As Long, ByRef lParam As Any) As Long
    Private Const LB_FINDSTRING As Long = &H18F
    
    Private Sub Command1_Click()
        Dim l1 As Long, l2 As Long, nrRepeat As Long
        Dim lLast As Long, sEntry as String
        
        nrRepeat = 1
        For l1 = 1 To List1.ListCount - 1
            If StrComp(List1.List(l1), List1.List(lLast), vbTextCompare) = 0 Then ' match
                nrRepeat = nrRepeat + 1
            Else
                sEntry = List1.List(lLast) & " -"
                l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
                sEntry = sEntry & nrRepeat & " Occurrences"
                If l2 = -1 Then             ' no match found on the word, add it
                    List2.AddItem sEntry
                Else                        ' match found, replace it
                    List2.List(l2) = sEntry
                End If
                nrRepeat = 1
                lLast = l1
            End If
        Next
        sEntry = List1.List(lLast) & " - "
        l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
        sEntry = sEntry & nrRepeat & " Occurrences"
        If l2 = -1 Then             ' no match found on the word, add it
            List2.AddItem sEntry
        Else                        ' match found, replace it
            List2.List(l2) = sEntry
        End If
    End Sub
    I didn't add a check to see if List1 is empty. You should probably do that
    If an exact match is what you need, then replace that API constant with this one
    Code:
    Private Const LB_FINDSTRINGEXACT As Long = &H1A2
    Last edited by LaVolpe; Dec 22nd, 2014 at 08:56 PM.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  9. #49
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    I Have 6 examples. We have to put the Bible13.txt.
    Because some examples cannot work with books, I make all examples to count all books. I put an example with a HashTable class (from Francesco Balena, changed slight by me). I put my binary search routine, and my first slow routine...with a sort routine. So I have three more. One using MS Dictionary without using the Olaf's safearray. One using Olaf's RC5 cSortedDictionary but with proper sorting by using
    WD.SetStringCompareFlags cmpIgnoreCase, English_United_States, SORT_DEFAULT
    (It seems that TextCompare ignore case but has not the right sorting order unless we set these flags)
    And finaly we can click on the list of books (any item gives all books and introduction text).

    I do all of this to find out what factor add speed. In my computer in first version the binary search way of counting took 12000 ms. The last version took 1400ms (my slow routine start from 20000ms and end to 7500ms). These two routines and Olafs with RC5 give sorted result. Olafs without CompareFlags settings took 550ms...with these settings 1550. He propose to do make a sort after the routine run, and with that optimization took 760ms (200ms more, because the sort routine is before we check timer). But this is not the same, as for the other routines that do the proper comparison.

    HashTable has no sorted output. Dictionary output started with a click in listbox, are 200ms more quick from my implementation -started from a button Dictionary- (is like Olaf's without the safearray). So safe array add speed.

    From all that experiments...I have to notice that Olaf has do very good work with cSortedDictionary but must clarify some things about the options for that. I look help and only the syntax described, not what we can do, no specifications. (and these are the same for all of classes in RC5...so not only is a black box, but also has no embedded info or examples for each class, method, property. So if I need a life to understand...how this RC5 can be useful?


    6examples.zip
    Last edited by georgekar; Dec 23rd, 2014 at 06:33 AM.

  10. #50

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: Fast count of words in a list

    From LV:
    Sam, or is it SOB?

    Trying to wade thru all these posts. Let me see if I got this right?

    1) Both lists are sorted
    2) You want to count how many times a word is repeated in List1
    3) Then you want to add that word, with number of repeats, to List2, deleting the previous List2 entry (if one exists?)
    1- it is Sam!!!!!!!!!!!!! (Actually , Samuel Oscar Brown (Shorted to Sam Oscar Brown) is simply a moniker I created LONG LONG time ago...."Sam Brown" has a unique/funny history behind it (used to have it posted on my Website (MacIsIrish.com) but have since discontinued the site). I added Oscar, yes, to make it SOB. I hope I don't appear that way on this site....my life has changed considerably since the event occurred wherein I started using this alias.
    2-Yes, 1, 2 and 3 is precisely how I had approached the problem, and yesterday (very busy day), used an array instead of list1---speed about the same.
    I have not yet had time to experiment any further...hopefully today....with the many suggestions above.

    Here is what it looks like (and works fine (it seems)) except for the speed, when I was using two lists...

    Attachment 122007

    I'll respond to more of your post in a few..but the 'answers' appear correct with my orginal code. (Again, I have not had the opportunity to play around with others' suggestions yet.) From the image, you can see I used chapter 1 of Genesis in the NIV version---I clicked on the right-hand side GO button. (The left GO button would do the entire BOOK of, in this case, Gensis, and takes around 5 minutes or so). (Just an image to show how I originally approached this).

  11. #51

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: Fast count of words in a list

    LV---using YOUR code, it only took me 0.593 seconds to separate and count the words in Genesis. (13082 words in the NIV).

    NOW, to experiment with others' suggestions, and THEN, the hard part, try to understand the logic and code ya'll presented.

    SOB......ooops, I Mean, Sam

  12. #52
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    Quote Originally Posted by SamOscarBrown View Post
    LV---using YOUR code, it only took me 0.593 seconds to separate and count the words in Genesis. (13082 words in the NIV).

    NOW, to experiment with others' suggestions, and THEN, the hard part, try to understand the logic and code ya'll presented.
    Don't think you'll get much faster using a listbox, a tad, if uber-optimizing. Since the items are already sorted, looping thru once is as fast as you can hope for. Probably more time taken, adding the words to the listbox

    Any massive improvement will be thru the use of other methods than the sorted listbox.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  13. #53
    PowerPoster
    Join Date
    Aug 2011
    Location
    B.C., Canada
    Posts
    2,887

    Re: Fast count of words in a list

    Quote Originally Posted by LaVolpe View Post
    Sam, or is it SOB?

    Trying to wade thru all these posts. Let me see if I got this right?

    1) Both lists are sorted
    2) You want to count how many times a word is repeated in List1
    3) Then you want to add that word, with number of repeats, to List2, deleting the previous List2 entry (if one exists?)

    If I got the above correct, I see one flaw in your original routine & correct me if I'm wrong:
    Code:
    If LCase(List2.List(i)) = LCase(List2.List(totListItms)) Then...
    Since each item in List2 has an occurrence count appended to the word, are you only removing the entry if they have the same occurrence count as before? In other words, lets say List2 already has an entry: "The - 5 Occurrences". And after processing List1, you have "The - 3 Occurrences". What is suppose to happen. Is the previous entry to be removed, or only removed if the previous and new entry match exactly. Equal sign in your code is a problem if trying to match just on the word in List2

    Either way, I see this being done looping thru List1 just once. And using APIs, not looping thru List2 at all.

    Edited: Here's an example using the API to locate a match just on the word, not the entire entry in List2. It could be what you are looking for a close enough to modify. The API usage prevents us from having to loop thru List2 every iteration until we found a 'match' or exhausted the list
    Code:
    Private Declare Function SendMessage Lib "user32.dll" Alias "SendMessageA" (ByVal hwnd As Long, ByVal wMsg As Long, ByVal wParam As Long, ByRef lParam As Any) As Long
    Private Const LB_FINDSTRING As Long = &H18F
    
    Private Sub Command1_Click()
        Dim l1 As Long, l2 As Long, nrRepeat As Long
        Dim lLast As Long, sEntry as String
        
        nrRepeat = 1
        For l1 = 1 To List1.ListCount - 1
            If StrComp(List1.List(l1), List1.List(lLast), vbTextCompare) = 0 Then ' match
                nrRepeat = nrRepeat + 1
            Else
                sEntry = List1.List(lLast) & " -"
                l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
                sEntry = sEntry & nrRepeat & " Occurrences"
                If l2 = -1 Then             ' no match found on the word, add it
                    List2.AddItem sEntry
                Else                        ' match found, replace it
                    List2.List(l2) = sEntry
                End If
                nrRepeat = 1
                lLast = l1
            End If
        Next
        sEntry = List1.List(lLast) & " - "
        l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
        sEntry = sEntry & nrRepeat & " Occurrences"
        If l2 = -1 Then             ' no match found on the word, add it
            List2.AddItem sEntry
        Else                        ' match found, replace it
            List2.List(l2) = sEntry
        End If
    End Sub
    I didn't add a check to see if List1 is empty. You should probably do that
    If an exact match is what you need, then replace that API constant with this one
    Code:
    Private Const LB_FINDSTRINGEXACT As Long = &H1A2
    It takes 2 seconds (lb_findstring) or 3 seconds (lb_findstringexact) to complete (modified a bit Lavolpe code).
    I know this can probably look better but I am in a hurry this morning

    Code:
    Private Sub PopList4()
    Dim l1 As Long
    Dim l2 As Long
    Dim sTinme As Long
    Dim nrRepeat As Long
    Dim sEntry As String
    Dim S() As String
    
      
        sTime = Timer
        
        ReDim S(List1.ListCount - 1)
        
        For i = 0 To List1.ListCount - 1
          S(i) = List1.List(i)
        Next
        
        List1.Visible = False
        List2.Visible = False
        
        nrRepeat = 0
        Do While List1.ListCount > 0
          DoEvents
          For l1 = List1.ListCount - 1 To 0 Step -1
            If List1.List(l1) = " " Or List1.List(l1) = "" Then
              If l1 < List1.ListCount Then List1.RemoveItem l1
              Exit For
            End If
            sEntry = LCase(List1.List(l1))
            Do Until l2 = -1
              DoEvents
              l2 = SendMessage(List1.hwnd, LB_FINDSTRINGEXACT, -1, ByVal sEntry)
              If l2 = -1 Then             ' no match found on the word, add it
                If Not sEntry = "" Then List2.AddItem sEntry & " - " & nrRepeat & " Occurance"
                nrRepeat = 0
                l2 = 0
                Exit Do
              Else                        ' match found, replace it
                List1.RemoveItem l2
                nrRepeat = nrRepeat + 1
              End If
            Loop
          Next
        Loop
        
        List1.Visible = True
        List2.Visible = True
        
        For i = 0 To UBound(S)
          List1.AddItem S(i)
        Next
        
        Debug.Print Timer - sTime
    End Sub
    list1 contains the bible13.txt...... split by space (Split(text1.text, " "))
    list2 is sorted = true (not necessary)

  14. #54
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    Max, a couple things.

    1) When Sam said 1/2 second, I'm sure he wasn't counting the time taken to get the words in the list
    2) DoEvents is probably slowing your routine down
    3) Do While is a tad slower. It has to re-evaluate List1.ListCount each iteration
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  15. #55
    PowerPoster
    Join Date
    Aug 2011
    Location
    B.C., Canada
    Posts
    2,887

    Re: Fast count of words in a list

    Lavolpe, I added DoEvents (i always add it for testing) because if the do-loop crashes i lose all the code... it can be removed, there was no errors or inifinite looping in my example.

    I did this on the fly, but yes it can be done a lot better, the look and the speed.
    just wanted to add the looping to find/add every word in one go

    Edit:
    You do save .1 to .2 seconds... 100ms to 200ms speed gain by removing the doevents

  16. #56

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: Fast count of words in a list

    Correct, I was not counting the pop of list1...however, that was very rapid as well (2.63 secs)....(for a total of 3.223 seconds for both lists to be popped)
    Now, when I don't use a list1 at all, but populate an array instead (was 'instantaneous' in populating that array instead of2.63 seconds to pop the list1)
    and use the array in lieu of list1 to pop list2, the total time do pop the array (0) and list2 was 5.787 seconds....very interesting...I thought that perhaps using an array in lieu of list1 would be faster populating list2....but no, a total of ~2.5 seconds MORE.
    So, back to using two lists, I am MORE THAN PLEASED with the total of 3.222 seconds when MY times originally were in the MINUTES.

  17. #57
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    Sam, the 5.787 seconds might have been an anomaly? Did you try it more than once without unloading the form? Maybe some initialization was going between VB and List2, something that happens just once?
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  18. #58
    PowerPoster
    Join Date
    Feb 2012
    Location
    West Virginia
    Posts
    13,023

    Re: Fast count of words in a list

    When you were using the array were you by chance using redim preserve in a loop? It so that could explain the slow down.

  19. #59

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: Fast count of words in a list

    LV, DM: Here's the code (LV's---only thing changed was I substituted wordArray() for list1) (word array, in Genesis, has 13081 'words')
    No redimming, simply looping through the array once.
    Times are pretty much consistent between 5.7 and 6.6 seconds, (first time, second click, etc) as compared to using list1, ~ 3.2 seconds.

    Code:
     StartTiming
     List2.Clear
        Dim l1 As Long, l2 As Long, nrRepeat As Long
        Dim lLast As Long, sEntry As String
        
        nrRepeat = 1
        For l1 = 1 To UBound(wordArray()) - 1
            If StrComp(wordArray(l1), wordArray(lLast), vbTextCompare) = 0 Then ' match
                nrRepeat = nrRepeat + 1
            Else
                sEntry = wordArray(lLast) & " -"
                l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
                sEntry = sEntry & nrRepeat & " Occurrences"
                If l2 = -1 Then             ' no match found on the word, add it
                    List2.AddItem sEntry
                Else                        ' match found, replace it
                    List2.List(l2) = sEntry
                End If
                nrRepeat = 1
                lLast = l1
            End If
        Next
        sEntry = wordArray(lLast) & " - "
        l2 = SendMessage(List2.hwnd, LB_FINDSTRING, -1, ByVal sEntry)
        sEntry = sEntry & nrRepeat & " Occurrences"
        If l2 = -1 Then             ' no match found on the word, add it
            List2.AddItem sEntry
        Else                        ' match found, replace it
            List2.List(l2) = sEntry
        End If
        MsgBox GetTime

  20. #60
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    That's a head-scratcher. You'd figure that compare List1.List(l1) to List1.List(lLast) would be slower since calls to the listbox would be needed to extract the strings for comparison, unless VB is optimizing something under the covers. What also may be a reason is that by using an array, I'd imagine twice the bytes are being compared: wordArray() is a unicode string where List1.List(x) is an ANSI string? A head-scratcher
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  21. #61
    PowerPoster
    Join Date
    Aug 2011
    Location
    B.C., Canada
    Posts
    2,887

    Re: Fast count of words in a list

    Shouldn't you be looping from 0 to ubound(array)
    And not 1 to ubound(array) - 1?

    Just a thought

  22. #62
    PowerPoster
    Join Date
    Aug 2011
    Location
    B.C., Canada
    Posts
    2,887

    Re: Fast count of words in a list

    What is in the StartTiming and GetTime function i would use
    Code:
    Dim sTime as Long
    sTime = Timer
    'execute code here
    Debug.Print Timer - sTime

  23. #63
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: Fast count of words in a list

    Quote Originally Posted by Max187Boucher View Post
    Shouldn't you be looping from 0 to ubound(array)
    And not 1 to ubound(array) - 1?
    Nope. The way it is written, the 1st "word" starts out with occurrence count of 1. No need to process it, one less string comparison
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  24. #64
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: Fast count of words in a list

    Describe better...The first item, in position 0, has a variable used as offset to start, named ILast. So the next will be 1, next 2, so there is a line to advance the offset to next one lLast=L1.

  25. #65

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: Fast count of words in a list

    Quote Originally Posted by Max187Boucher View Post
    What is in the StartTiming and GetTime function i would use
    Code:
    Dim sTime as Long
    sTime = Timer
    'execute code here
    Debug.Print Timer - sTime
    These are those functions....I was going to use timer but have done it this way before....
    Again, tho, the results are about the same ~6 seconds (WITH ARRAY VICE LISTBOX).

    Code:
    Private Declare Function GetTickCount Lib "kernel32" () As Long
    Private m_StartTime As Long
    Private Sub StartTiming()
        m_StartTime = GetTickCount()
    End Sub
    Private Function GetTime() As Single
        Dim StopTime As Long
        StopTime = GetTickCount()
        GetTime = (StopTime - m_StartTime) / 1000
    End Function
    I'm going to go ahead and mark as Resolves (as it works fast now), but will keep experimenting with those other options provided. Yes, curious how the same code using a listbox was faster than the array. Oh well.

    THANKS TO ALL for educating this old man....still learnin'

    "SAM"

  26. #66
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: [RESOLVED] Fast count of words in a list

    I think windows process in high priority messages for controls...so the search in a listbox is very fast.

  27. #67

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: [RESOLVED] Fast count of words in a list

    Actually, PROBABLY, most likely, is that I did not have the array sorted first, like I did list1. THAT is probably the reason for the extra time....I may go in and sort the array first, and then test the speed. OPINIONS?

  28. #68
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: [RESOLVED] Fast count of words in a list

    I think a sorted list 1 give a sorted list2. If you have list 2 auto sorted then a non sorted input maybe is a lost of speed, but not much. So use not sorted list 2 and sorted list 1.

  29. #69

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: [RESOLVED] Fast count of words in a list

    Yup, as suspected....If I sort my array first (as I had list1 sorted first), the time to count the words and populate list2 was reduced from ~ 5 seconds to various instances between 0.007 and 0.99 secs...big difference.

  30. #70

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: [RESOLVED] Fast count of words in a list

    So, I suppose, if I eliminate BOTH listboxes, and use two arrays, it will be much faster to populate that second array as to populate list2.
    If I try that, I still need to show on the screen, but what I may do is take that second array and populate a msflexgrid with two columns, the first being the count, the second being the word, and sort it by count (highest first). That would make the most sense to me if I wanted to see the most common words. I then will probably have a search routine to find the grid's row for any given word. Sounds like a fun little challenge.
    Last edited by SamOscarBrown; Dec 23rd, 2014 at 01:57 PM.

  31. #71
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Re: [RESOLVED] Fast count of words in a list

    No, if you use an array as the second list you slow the process...because you have to implement the search routine...So use ms dictionary...to use a fast search...

  32. #72

    Thread Starter
    PowerPoster SamOscarBrown's Avatar
    Join Date
    Aug 2012
    Location
    NC, USA
    Posts
    7,268

    Re: [RESOLVED] Fast count of words in a list

    Well, you may be right....what I did was kept the second listbox, split the text values of that listbox into word and number of occurrences, then populated a table in my database, then retrieved the information back into a grid with an sql which brought back the most occurrences first, etc. WHOLE process (including reading the data in the first time, until the grid was fully populated) took only 1.15 seconds.

    Of course now my code is all commented out in many, many places, so now have to go back and clean it up...and, once again, fully understand how the populating of the second listbox actually occurred so much more rapidly than my first attempt.

  33. #73
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    18,572

    Re: [RESOLVED] Fast count of words in a list

    Quote Originally Posted by SamOscarBrown View Post
    Yup, as suspected....If I sort my array first (as I had list1 sorted first), the time to count the words and populate list2 was reduced from ~ 5 seconds to various instances between 0.007 and 0.99 secs...big difference.
    That could explain the speed hit. Not only that, the routines are dependent on a sorted source list. Without it being sorted, you are not only getting false occurrence counts, but are updating List2 way too often
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  34. #74
    Fanatic Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    948

    Seven Examples...

    I do some practice...with examples..to optimize them. I see that making big arrays from the start is worst scenario from making them at steps. So here is the 7 example program...I conform to the proposal for later sorting on Olaf's super speed vbRichClient, so now has the full speed with later appropriate sorting.
    The 7th example is the myDoc class (for documents) that has "on board" the binary search routine. So if we have a document (also mydoc is combined with the text viewer - an editor I have in M2000 - look code), with a setting to a reference the text viewer just use the data from mydoc). I make some functions such as to count in ms the number of words (very fast), and we can get all the words in an array or in a listbox, or we can count the unique words and if we want we can use later the saved words to fill a listbox or an array.

    Binary search routine count the words in the same time that sort the words. Now has comparable speed (the stand alone routine need 1.3 seconds to find from 792367 words 13231 unique words and the number of appearance. Ms Dictionary with a stage for sorting need 1.118, Olaf's routine 0.574 sec (very fast, including second sort), and 1.156 if we apply sort and include the time to final (this is for the Ms Dictionary -clicking in the left listbox, with books). (we must hide all other windows to get a good measure...). Binary search 1.330, Hashtable 1.515 including the sort routine. The slow routine 10.500 (including the sorting routine). Finally the 7th routine (has the same binary routine but get the data from Document class and run in the space of the class, 1.480ms. So except the slow routine all routines are between 0.574ms and 1.515. The binary search routine in the Document class is more versatile, because works for unicode (and sort for that). We can pass the Licd, so we can sort for any country (although I have no manage to fit the sub regions..is something too much...for now).

    That's all...
    (Now I place the new Document class myDoc.class to M2000, so I make new commands, to count words and read the number of appearance for any word.)
    Attached Files Attached Files

Page 2 of 2 FirstFirst 12

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width