Thread: [RESOLVED] Need a clever way to manipulate strings

  1. #1

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    [RESOLVED] Need a clever way to manipulate strings

    OK, here's where I am.
    I have a list of text strings stored in a DB table.
    I am reading an Excel workbook and trying to see whether what I read from Excel matches any of the strings in my DB table.
    The problem is that I find what should be a match, but the wording is slightly different.
    For example ...

    Spreadsheet text is: "Contractors to read and sign safety instructions in branch log book"

    DB text equivalent is: "Contractors to read and sign safety instructions in the front of the branch log book"

    As it stands, I am using the InStr function to search the DB text for the spreadsheet text. Because of the slight wording change, it's not an exact substring, so the function returns zero.

    What I need to do is be a bit cleverer with how I manipulate the strings. I thought of maybe comparing only the first 15 or 20 characters, but the wording could be slightly different at the start, in the middle, or towards the end, so that would only get me so far.

    Can anyone think of a more structured way I could tackle this?
    Maybe someone has done something similar before?

    Thanks in advance.

    p.s. The INSTR function is the last thing you need when nursing the mother of all hangovers
    Last edited by Hack; Jun 30th, 2006 at 05:46 AM. Reason: Added [RESOLVED] to thread title Last edited by TheBionicOrange : Today at 05:50 AM.

  2. #2
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Need a clever way to manipulate strings

    Some things:
    • Every call to the Excel object to fetch a string does a lot more work than you might think. Store the strings in a string array instead for faster access.
    • Apply LCase$ to both the strings in the new string array and the strings you are about to use from the database.
    • You could split the DB strings on the space character and then compare word by word: how many of the words are found, and in which order.
    • InStr is actually fast at what it does. The speed problems come from other things.


    That should get you started on getting your problem solved.
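
    The split-and-compare idea from the list above could be sketched roughly like this. It is only a minimal sketch: the function name WordMatchRatio and the module are made up for illustration, punctuation is not stripped, and what counts as "close enough" (say 0.8 or higher) is an assumption you would have to tune against your own data.

    ```vb
    ' Module1.bas (hypothetical helper, not from the original posts)
    Option Explicit

    ' Returns the fraction (0 to 1) of the words in strShort
    ' that also occur as whole words in strLong.
    Public Function WordMatchRatio(ByVal strShort As String, ByVal strLong As String) As Double
        Dim strWords() As String
        Dim strPadded As String
        Dim lngA As Long
        Dim lngTotal As Long
        Dim lngFound As Long

        ' pad with spaces so only whole words match ("in" must not match inside "sign")
        strPadded = " " & LCase$(strLong) & " "
        strWords = Split(LCase$(Trim$(strShort)), " ")

        For lngA = 0 To UBound(strWords)
            ' skip empty entries caused by double spaces
            If Len(strWords(lngA)) > 0 Then
                lngTotal = lngTotal + 1
                If InStr(strPadded, " " & strWords(lngA) & " ") > 0 Then
                    lngFound = lngFound + 1
                End If
            End If
        Next lngA

        ' an empty strShort leaves the result at 0
        If lngTotal > 0 Then WordMatchRatio = lngFound / lngTotal
    End Function
    ```

    With the two example sentences from the first post (spreadsheet text as strShort, DB text as strLong), every spreadsheet word is found in the DB text, so the ratio is 1 even though InStr on the whole string returns zero.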

  3. #3

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: Need a clever way to manipulate strings

    I'm actually using UCASE, so I know why you suggested LCASE.
    I thought about using SPLIT; in fact, comparing word counts is about as rigid as I can get, I guess. It's still not going to be perfect, but I'll see where it gets me.

    I am also already storing all Excel text in a string array. I know how slow Excel is/can be !
    I made a project once using .ACTIVECELL.OFFSET(x,y).SELECT statements everywhere ! Not something you would ever do twice !
    (That one is now re-written).

    Cheers Merri.

  4. #4

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: Need a clever way to manipulate strings

    ha ha ha .. just as I was going to compare Agilaz statement to Merris ... he deleted his reply !

    Spoilsport !

    That was going to be a practical example of me practicing more string comparisons :)

  5. #5
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Need a clever way to manipulate strings

    Do you want to know of a technique that lets you access string data as an Integer array? That way you could create a very fast and clean character-by-character checking function that could work pretty much the way you wish. No need for Split or any slow string handling.

    The drawback is that the technique is crash prone if you hit stop or pause and if you make an error in handling array space, you get a crash as well. Basically you'd need to create the function in a separate project and test it real well before adding it into your main project. You also need a good bit of time to understand what you're actually doing. But if you're interested, I can post up some code (although I don't have VB6 on this computer).

  6. #6

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: Need a clever way to manipulate strings

    Well, that's a bit too time consuming for what I am doing right now (I'm supposed to be finished with this today), but I am all for learning new things, and if it's going to be faster I'm all for that too.
    I could always amend it at a later date.

    If you get the time, that would be much appreciated.

  7. #7
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Need a clever way to manipulate strings

    VB Code:
    ' in Form1.frm
    Option Explicit

    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (ByRef lpvDest As Any, ByRef lpvSrc As Any, ByVal cbLen As Long)
    ' a regular VarPtr can't take an array variable, so we add support for that with this declaration
    Private Declare Function VarPtrArray Lib "msvbvm60.dll" Alias "VarPtr" (Var() As Any) As Long

    ' structure of a one-dimensional safearray
    Private Type SafeArray1D
        Dimensions As Integer       ' dimensions, should always be 1 in this case
        FeatureFlags As Integer     ' features (zero is good enough in basic use)
        ElementSize As Long         ' size of one element in bytes (1 = Byte, 2 = Integer or Boolean...)
        LockCount As Long           ' number of locks currently held on the array (zero here)
        DataPtr As Long             ' pointer to location in memory where the data resides
        Elements1D As Long          ' number of elements in the first dimension
        LBound1D As Long            ' lower bound of the first dimension (zero is good and fast)
    End Type

    ' declare the SA variable that holds the header of the array we create ourselves
    Dim SA As SafeArray1D
    ' declare an empty MyArray variable, which we trick into using our own safearray header
    Dim MyArray() As Integer

    VB Code:
    ' Form1.frm
    Private Sub Form_Load()
        ' initialize our own safearray
        With SA
            ' one dimension
            .Dimensions = 1
            ' one element is two bytes (we use Integer)
            .ElementSize = 2
            ' number of elements: insane (biggest positive Long value)
            .Elements1D = &H7FFFFFFF
            ' we set the number of elements this way so we do not need to change it constantly
        End With
        ' set MyArray to use our own safearray header!
        CopyMemory ByVal VarPtrArray(MyArray), VarPtr(SA), 4&
    End Sub

    Private Sub Form_Unload(Cancel As Integer)
        ' restore the original state of the MyArray variable, otherwise VB will crash
        ' important! zero must be a Long, the & character must be there or you'll crash!
        CopyMemory ByVal VarPtrArray(MyArray), 0&, 4&
        ' we didn't initialize MyArray before we set a custom header for it (i.e. by using a ReDim)
        ' thus it didn't point anywhere in memory: the pointer value was 0
    End Sub

    VB Code:
    ' Form1.frm
    Private Sub Command1_Click()
        Dim strTesti As String
        Dim lngA As Long

        ' set some text to our test variable
        strTesti = "BBB! Terve!"

        ' show the situation before any handling
        MsgBox strTesti, , "Before"

        ' note: one character is always two bytes, that's why SA.ElementSize is two!

        ' now we cheat a bit: set our MyArray to point into this string!
        SA.DataPtr = StrPtr(strTesti)

        ' let's change all B characters to A characters
        For lngA = 0 To Len(strTesti) - 1
            ' B's character code is 66, A is 65
            If MyArray(lngA) = 66 Then MyArray(lngA) = 65
        Next lngA

        ' let's see the end result!
        MsgBox strTesti, , "After"
    End Sub

    Translated the code from Finnish, hope I didn't miss anything
    I had written an article with it, but I don't have the time to translate that right now.

  8. #8

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: Need a clever way to manipulate strings

    Brill. Thanks a lot for that. I'll have a look through it as soon as I get 5 minutes.

    Thanks again

  9. #9
    Lively Member Agilaz's Avatar
    Join Date
    Jun 2006
    Posts
    98

    Re: Need a clever way to manipulate strings

    Quote Originally Posted by TheBionicOrange
    ha ha ha .. just as I was going to compare Agilaz statement to Merris ... he deleted his reply !

    Spoilsport !

    That was going to be a practical example of me practicing more string comparisons :)

    Well, after hitting submit I saw that Merri had already suggested the same thing (splitting and counting matching words).

    You can probably increase the accuracy by checking whether the matching words are in the same order.
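
    One way to check the order, as a rough sketch only: count the longest run of words that appear in both texts in the same order, allowing extra words in between (the classic longest common subsequence, applied to words instead of characters). The function name WordsInOrder is made up for this example, and it assumes words are separated by single spaces with no punctuation handling.

    ```vb
    ' Module1.bas (hypothetical helper, not from the original posts)
    Option Explicit

    ' Returns how many words occur in both strings in the same relative order
    ' (longest common subsequence of the two word lists).
    Public Function WordsInOrder(ByVal strA As String, ByVal strB As String) As Long
        Dim strWordsA() As String
        Dim strWordsB() As String
        Dim lngTable() As Long
        Dim lngA As Long
        Dim lngB As Long

        strWordsA = Split(LCase$(Trim$(strA)), " ")
        strWordsB = Split(LCase$(Trim$(strB)), " ")

        ' Split returns an array with UBound = -1 for an empty string
        If UBound(strWordsA) < 0 Or UBound(strWordsB) < 0 Then Exit Function

        ' row 0 and column 0 stay zero and act as the "no words yet" case
        ReDim lngTable(0 To UBound(strWordsA) + 1, 0 To UBound(strWordsB) + 1)

        For lngA = 1 To UBound(strWordsA) + 1
            For lngB = 1 To UBound(strWordsB) + 1
                If strWordsA(lngA - 1) = strWordsB(lngB - 1) Then
                    ' same word: extend the best result found before both words
                    lngTable(lngA, lngB) = lngTable(lngA - 1, lngB - 1) + 1
                ElseIf lngTable(lngA - 1, lngB) >= lngTable(lngA, lngB - 1) Then
                    lngTable(lngA, lngB) = lngTable(lngA - 1, lngB)
                Else
                    lngTable(lngA, lngB) = lngTable(lngA, lngB - 1)
                End If
            Next lngB
        Next lngA

        WordsInOrder = lngTable(UBound(strWordsA) + 1, UBound(strWordsB) + 1)
    End Function
    ```

    For the two example sentences from the first post this gives 11, which is every word of the spreadsheet text. Dividing the result by the word count of the shorter text gives a score between 0 and 1 that drops when the words are shuffled.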

  10. #10

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: Need a clever way to manipulate strings

    Thanks both for your help. Now putting this thread to bed.
    Agilaz ... I would have left it. Sometimes it's good that people reiterate a point. It helps when other people read it.

  11. #11
    Lively Member Agilaz's Avatar
    Join Date
    Jun 2006
    Posts
    98

    Re: [RESOLVED] Need a clever way to manipulate strings

    OK, I'll consider that next time.

    And sorry for reactivating and hijacking your thread, but Merri's "Integer String" impressed me, and while playing around with it I had some ideas...

    @Merri, first of all... great idea, hats off.

    As you already mentioned, the problem is that you easily risk a crash if you do something wrong. I think the biggest problem is that you get an access violation when you try to access an element of the array that is outside your app's memory space.


    VB Code:
    Debug.Print MyArray(&H7FFFFFFF)

    This is guaranteed to crash.

    To avoid that, you have to update Elements1D with the length of the string every time you assign a string to the array...

    VB Code:
    SA.DataPtr = StrPtr(strTesti)
    SA.Elements1D = Len(strTesti)

    This way VB can handle the error, and you only get runtime error 9 (Subscript out of range) when you try to access an element that is outside the string.

    Another advantage is that UBound(MyArray) will return the correct value.

    You also have to avoid accessing the array after manipulating the string (strTesti) directly, because the string can be relocated, which will cause the array to point to random data or to an address outside your app's memory.

    So every time you manipulate the string (not the array), you will have to repeat this immediately...

    VB Code:
    SA.DataPtr = StrPtr(strTesti)
    SA.Elements1D = Len(strTesti)

    I think the best idea would be to wrap the whole thing up in a class to make it safe to use. That should be possible.
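
    A rough sketch of what such a class wrapper might look like is below. It is untested and purely illustrative: the class name IntString and its members Text, Length and CharCode are invented here, and going through property calls gives up part of the speed advantage of direct array access. The class keeps its own copy of the string, so nothing outside the class can relocate it, and it re-points the array whenever the text changes.

    ```vb
    ' IntString.cls (hypothetical class name, a sketch only)
    Option Explicit

    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (ByRef lpvDest As Any, ByRef lpvSrc As Any, ByVal cbLen As Long)
    Private Declare Function VarPtrArray Lib "msvbvm60.dll" Alias "VarPtr" (Var() As Any) As Long

    Private Type SafeArray1D
        Dimensions As Integer
        FeatureFlags As Integer
        ElementSize As Long
        LockCount As Long
        DataPtr As Long
        Elements1D As Long
        LBound1D As Long
    End Type

    Private m_SA As SafeArray1D
    Private m_Chars() As Integer
    Private m_Text As String        ' private copy of the string the array points into

    Private Sub Class_Initialize()
        m_SA.Dimensions = 1
        m_SA.ElementSize = 2
        ' Elements1D stays 0 until a text is assigned, so any access raises error 9
        CopyMemory ByVal VarPtrArray(m_Chars), VarPtr(m_SA), 4&
    End Sub

    Private Sub Class_Terminate()
        ' detach the fake header before VB cleans up the array variable
        CopyMemory ByVal VarPtrArray(m_Chars), 0&, 4&
    End Sub

    Public Property Get Text() As String
        Text = m_Text
    End Property

    Public Property Let Text(ByVal NewText As String)
        m_Text = NewText
        ' the string may have moved, so always refresh pointer and length together
        m_SA.DataPtr = StrPtr(m_Text)
        m_SA.Elements1D = Len(m_Text)
    End Property

    Public Property Get Length() As Long
        Length = m_SA.Elements1D
    End Property

    ' zero-based character code access; out-of-range indexes raise error 9
    ' as long as array bounds checks are not compiled out
    Public Property Get CharCode(ByVal Index As Long) As Integer
        CharCode = m_Chars(Index)
    End Property

    Public Property Let CharCode(ByVal Index As Long, ByVal NewCode As Integer)
        m_Chars(Index) = NewCode
    End Property
    ```

    Used from a form, you would create an instance, assign Text once, loop from 0 to Length - 1 reading or writing CharCode, and read Text back at the end. The form itself no longer needs the CopyMemory calls in Form_Load and Form_Unload.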

  12. #12

    Thread Starter
    Frenzied Member TheBionicOrange's Avatar
    Join Date
    Apr 2001
    Location
    Cardiff, UK
    Posts
    1,818

    Re: [RESOLVED] Need a clever way to manipulate strings

    ... And you sound like JUST the man for the job
    If you'd like to give myself and Merri a shout on completion, I'm sure we could shower your post with gratitude

  13. #13
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: [RESOLVED] Need a clever way to manipulate strings

    You can use the technique safely even without making sure of things that way. I've been using the technique with speed in mind, so setting anything that is not strictly required is just overhead. For someone still getting used to the technique, it might be good to set those extra things for safety.

    Basically it is all up to what you prefer: ease of use or speed. Ease of use slows the code down, but makes developing a bit faster. Aiming for speed makes things a bit more complicated, but sometimes your eyeballs pop out of your head when you figure out a cool algorithm that works hand in hand with the technique.


    Also, the technique can get very dangerous when you compile the program with all optimizations turned on, because then the code no longer checks for valid array ranges at all. You can severely screw things up in memory.
    Last edited by Merri; Jun 30th, 2006 at 07:47 AM.
