
Thread: [RESOLVED] "Wild" Cards in String manipulation

  1. #1

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    [RESOLVED] "Wild" Cards in String manipulation

    Is it possible please to introduce "Wild Card" characters into the INSTR and REPLACE methods?

    For example If Instr(String,"\****\") then ...................

    Here I would want to test for any group of 6 characters of which the first and last are "\" and the middle 4 are anything.

    or Replace(STRING,"\****\","")

    Here I would want to delete any 6 character set which starts and ends with a "\".

    Is there a character which I can use in place of the * of the above code lines which will be interpreted as any character at all? I am currently solving this by using Mid$ to look 5 characters ahead of a "\" and, if this is a "\", to parse out the 6 character group, place this in a second string variable and then work with this. However this method is rather cumbersome. A "Wild Card" character would solve the problem much more elegantly.

    camoore

    Wales, UK

  2. #2
    PowerPoster RhinoBull
    Join Date
    Mar 2004
    Location
    New Amsterdam
    Posts
    24,132

    Re: "Wild" Cards in String manipulation

    Try using the Like operator. Here is one way of many:
    Code:
    Private Sub Command1_Click()
    Dim someText As String
    Dim pos1 As Long
    Dim pos2 As Long
    
        someText = "abc\jkasdfj\defg"
        If someText Like "*\*\*" Then
            pos1 = InStr(1, someText, "\")
            pos2 = InStrRev(someText, "\")
            someText = VBA.Left(someText, pos1 - 1) & VBA.Mid(someText, pos2 + 1)
        End If
        Debug.Print someText
    
    End Sub
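
    A note on the pattern above: "*" matches any run of characters, so "*\*\*" is True for any text with at least two backslashes, whatever lies between them. For exactly four characters between the backslashes, as in the original question, "?" can be used instead. A minimal sketch, assuming a made-up sample string and a hypothetical Sub name:

```vb
Private Sub DemoLikeFourChars()
    Dim someText As String

    ' Made-up sample: "\jkas\" is a backslash, four characters, a backslash
    someText = "abc\jkas\defg"

    ' Each "?" stands for exactly one character, so "\????\" needs
    ' exactly four characters between the two backslashes.
    ' The leading and trailing "*" allow any text around the group.
    If someText Like "*\????\*" Then
        Debug.Print "Found a six-character \....\ group"
    Else
        Debug.Print "No such group"
    End If
End Sub
```

    This only reports whether such a group exists somewhere in the string; Like does not say where it is.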

  3. #3
    PowerPoster RhinoBull
    Join Date
    Mar 2004
    Location
    New Amsterdam
    Posts
    24,132

    Re: "Wild" Cards in String manipulation

    Here is more information:
    [quote]
    Like Operator

    Used to compare two strings.

    Syntax

    result = string Like pattern

    The Like operator syntax has these parts:

    Part       Description
    result     Required; any numeric variable.
    string     Required; any string expression.
    pattern    Required; any string expression conforming to the pattern-matching conventions described in Remarks.


    Remarks

    If string matches pattern, result is True; if there is no match, result is False. If either string or pattern is Null, result is Null.

    The behavior of the Like operator depends on the Option Compare statement. The default string-comparison method for each module is Option Compare Binary.

    Option Compare Binary results in string comparisons based on a sort order derived from the internal binary representations of the characters. Sort order is determined by the code page. In the following example, a typical binary sort order is shown:

    A < B < E < Z < a < b < e < z < À < Ê < Ø < à < ê < ø

    Option Compare Text results in string comparisons based on a case-insensitive, textual sort order determined by your system's locale. When you sort the same characters using Option Compare Text, the following text sort order is produced:

    (A=a) < (À=à) < (B=b) < (E=e) < (Ê=ê) < (Z=z) < (Ø=ø)

    Built-in pattern matching provides a versatile tool for string comparisons. The pattern-matching features allow you to use wildcard characters, character lists, or character ranges, in any combination, to match strings. The following table shows the characters allowed in pattern and what they match:

    Characters in pattern    Matches in string
    ?                        Any single character.
    *                        Zero or more characters.
    #                        Any single digit (0–9).
    [charlist]               Any single character in charlist.
    [!charlist]              Any single character not in charlist.


    A group of one or more characters (charlist) enclosed in brackets ([ ]) can be used to match any single character in string and can include almost any character code, including digits.

    Note To match the special characters left bracket ([), question mark (?), number sign (#), and asterisk (*), enclose them in brackets. The right bracket (]) can't be used within a group to match itself, but it can be used outside a group as an individual character.

    By using a hyphen (–) to separate the upper and lower bounds of the range, charlist can specify a range of characters. For example, [A-Z] results in a match if the corresponding character position in string contains any uppercase letters in the range A–Z. Multiple ranges are included within the brackets without delimiters.

    The meaning of a specified range depends on the character ordering valid at run time (as determined by Option Compare and the locale setting of the system the code is running on). Using the Option Compare Binary example, the range [A–E] matches A, B and E. With Option Compare Text, [A–E] matches A, a, À, à, B, b, E, e. The range does not match Ê or ê because accented characters fall after unaccented characters in the sort order.

    Other important rules for pattern matching include the following:

    An exclamation point (!) at the beginning of charlist means that a match is made if any character except the characters in charlist is found in string. When used outside brackets, the exclamation point matches itself.


    A hyphen (–) can appear either at the beginning (after an exclamation point if one is used) or at the end of charlist to match itself. In any other location, the hyphen is used to identify a range of characters.


    When a range of characters is specified, they must appear in ascending sort order (from lowest to highest). [A-Z] is a valid pattern, but [Z-A] is not.


    The character sequence [] is considered a zero-length string ("").

    In some languages, there are special characters in the alphabet that represent two separate characters. For example, several languages use the character "æ" to represent the characters "a" and "e" when they appear together. The Like operator recognizes that the single special character and the two individual characters are equivalent.

    When a language that uses a special character is specified in the system locale settings, an occurrence of the single special character in either pattern or string matches the equivalent 2-character sequence in the other string. Similarly, a single special character in pattern enclosed in brackets (by itself, in a list, or in a range) matches the equivalent 2-character sequence in string.
    [/quote]
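
    Like only answers True or False for a whole string, so it cannot replace InStr on its own. One way to get a position out of it is to slide a fixed-width window along the text and test each window with Like. The sketch below assumes the pattern always matches a fixed number of characters; InStrLike and its parameter names are hypothetical, not a built-in function.

```vb
' Hypothetical helper: returns the 1-based position of the first
' substring of sText that matches sPattern under Like, or 0 if none.
' Assumes sPattern always matches exactly lPatLen characters
' (for example "\????\" always matches 6 characters).
Private Function InStrLike(ByVal sText As String, _
                           ByVal sPattern As String, _
                           ByVal lPatLen As Long) As Long
    Dim i As Long

    For i = 1 To Len(sText) - lPatLen + 1
        ' Test one window of lPatLen characters at a time
        If Mid$(sText, i, lPatLen) Like sPattern Then
            InStrLike = i
            Exit Function
        End If
    Next i
    ' No window matched: the function returns its default value, 0
End Function

Private Sub DemoInStrLike()
    Dim sText As String
    Dim lPos As Long

    sText = "abc\1234\def"              ' made-up sample string
    lPos = InStrLike(sText, "\????\", 6)

    If lPos > 0 Then
        ' Cut the six-character group out of the string
        sText = Left$(sText, lPos - 1) & Mid$(sText, lPos + 6)
    End If
    Debug.Print sText
End Sub
```

    Calling the helper in a loop until it returns 0 would remove every such group. It will not work for patterns containing "*", because those have no fixed length.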

  4. #4

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    Re: "Wild" Cards in String manipulation

    Thank you RhinoBull for such a rapid and comprehensive reply. You have pointed me towards the LIKE operator. I looked this up in my several textbooks and found reference to it under SQL. However, while mention is made of the ability to use "?" to represent ANY character, there is no sample syntax provided as to how to do so.

    Suppose I want STRING = Replace(STRING,"\????\",""), i.e. I want to delete any group within STRING consisting of 6 characters of which the first and last are "\". Can you please indicate how to introduce the LIKE operator into this code? As written above, it will only delete a group of six which is literally "\????\" rather than treat the "?"s as wildcards.

    Sorry if I seem slow to understand this technique. I have not used it before and the scant reference to it in my books lacks meat and detail.


    camoore

    Wales, UK

  5. #5
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: "Wild" Cards in String manipulation

    Using a regular expression is probably the easiest way to do this.
    Code:
    Private Sub Command1_Click()
    Dim objReg
    Dim strFilter As String
    Dim strTestString As String
    Dim strReplaceString As String
    
        Set objReg = CreateObject("VBScript.RegExp")
        objReg.Global = True
        objReg.IgnoreCase = True
    
        'Matches a backslash followed by 4
        'characters and then a backslash
        objReg.Pattern = "\\....\\"
        
        strTestString = "testing\rege\xpression"
        strReplaceString = ""
           
        Debug.Print objReg.Replace(strTestString, strReplaceString)
        
        Set objReg = Nothing
    End Sub
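
    The same RegExp object can also stand in for InStr: its Execute method returns a collection of matches, and each match exposes FirstIndex, Length and Value. A minimal sketch, reusing the pattern and test string from the code above; the Sub and variable names are only illustrative:

```vb
Private Sub DemoFindMatches()
    Dim objReg As Object
    Dim colMatches As Object
    Dim objMatch As Object

    Set objReg = CreateObject("VBScript.RegExp")
    objReg.Global = True            ' find every match, not just the first

    ' A backslash, any 4 characters, then a backslash
    objReg.Pattern = "\\....\\"

    Set colMatches = objReg.Execute("testing\rege\xpression")

    If colMatches.Count = 0 Then
        Debug.Print "No match"
    Else
        For Each objMatch In colMatches
            ' FirstIndex is zero-based; add 1 to compare with InStr
            Debug.Print objMatch.FirstIndex + 1, objMatch.Length, objMatch.Value
        Next objMatch
    End If

    Set objReg = Nothing
End Sub
```

    Note that "." in this pattern matches any character except a line break, so a group that spans two lines would not be found.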

  6. #6

  7. #7
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: "Wild" Cards in String manipulation

    Quote Originally Posted by RhinoBull
    It might be as long as you can define proper pattern.

    Getting the pattern right is always the hard part with regular expressions, but in this case it is a pretty easy one. Is the pattern I'm using the best one for this example? I'm not sure; I'm far from an expert on regexp.

  8. #8

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    Re: "Wild" Cards in String manipulation

    Mark T. Thank you for the code example. However I can not get it to work, despite trying several obvious variations of it. I notice that you Dim StrFilter as a string, but then do not seem to use it?

    Could you please have another look at it. My source string will be CONTENT and suppose I want to delete from this any character group consisting of \????\ where ? represents ANY character.

    camoore

    Wales, UK

  9. #9
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: "Wild" Cards in String manipulation

    Can you post an example of the content string?

  10. #10

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    Re: "Wild" Cards in String manipulation

    What I am doing is writing a program to parse out the valid text from a .dbx (e-mail) file. The result is displayed in a large RichTextBox, and the text content is also stored in a string variable called CONTENT.

    The program is almost working.

    What I am left with is that at random locations within the text I get sequences of up to 58 gash characters such as this :

    Mary had a little lamb\'14\'f3\'00\'00\'00\'00$\'00\'00its feet were white as snow.

    The gash character sets always start and end with \'. Most of the sets are character pairs, but some are 3 characters (as 00$ in this example).

    So what I want to do is to run through the text (CONTENT) looking for \'??\' or \'???\' and then to set them all to null (where ? means any character). This will then leave me with \'00 at the end. These gash character sets always end with "\'00" so then I can detect and remove just that sequence by a replace statement Content = Replace(Content,"\'00", "")

    I hope I have described this little problem adequately.

    camoore

  11. #11
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: "Wild" Cards in String manipulation

    OK. From your original post it sounded like you were looking to replace anything that started with a \, had 4 random characters and then ended with a \ with an empty string. Applied to the string you provided,
    Mary had a little lamb\'14\'f3\'00\'00\'00\'00$\'00\'00its feet were white as snow.

    would become
    Mary had a little lamb\'14\'f3\'00\'00\'00'00\'00its feet were white as snow.

    Based on the post above, I don't think this is what you are looking for. So, based on the string provided, what should the string look like after doing the replace on it?

  12. #12

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    Re: "Wild" Cards in String manipulation

    Hi MarkT.

    I am looking to write a routine which would reduce the quoted specimen string to just \'00 (ie, just the last character set which is not contained within a \' group). I can then use REPLACE to eliminate that final \'00 group. Hence I need to look through CONTENT for character sets of the form \'??\' or \'???\' and "simply" get rid of them (where ? represents ANY character). I realise that I may need to take two runs through CONTENT, one to deal with a '?? group and another to deal with a '??? group.

    Thank you very much for your assistance in this fairly trivial issue. However I hope that you will see how a "wild card" approach might solve it - hence my feeling that the answer could be of general interest to Forum readers.

    camoore

    Wales, UK

  13. #13
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: "Wild" Cards in String manipulation

    How about this?
    Code:
    Private Sub Command1_Click()
    Dim objReg
    Dim strTestString As String
    Dim strReplaceString As String
    Dim i As Integer
    
        Set objReg = CreateObject("VBScript.RegExp")
        objReg.Global = True
        objReg.IgnoreCase = True
        
        strTestString = "Mary had a little lamb\'14\'f3\'00\'00\'00\'00$\'00\'00its feet were white as snow."
        strReplaceString = "\"
        
        For i = 3 To 4
            objReg.Pattern = "\\" & String(i, ".") & "\\"
            Do While objReg.test(strTestString)
                strTestString = objReg.Replace(strTestString, strReplaceString)
            Loop
        Next i
        
    '    'Uncomment the lines below to have the \'00 removed also
    '    objReg.Pattern = "\\'00"
    '    strReplaceString = " "
    '    strTestString = objReg.Replace(strTestString, strReplaceString)
        
        Debug.Print strTestString
        
        Set objReg = Nothing
    End Sub
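
    As a possible alternative to looping over two patterns, the two- and three-character cases can be folded into one pattern with a quantifier and a lookahead. This is only a sketch: it assumes the junk groups always look like the sample (\' followed by two or three characters and then another \'), and the Sub name is hypothetical.

```vb
Private Sub DemoSinglePattern()
    Dim objReg As Object
    Dim strTestString As String

    Set objReg = CreateObject("VBScript.RegExp")
    objReg.Global = True

    strTestString = "Mary had a little lamb\'14\'f3\'00\'00\'00\'00$\'00\'00its feet were white as snow."

    ' \\'      a literal backslash followed by an apostrophe
    ' .{2,3}   two or three characters of any kind (except a line break)
    ' (?=\\')  lookahead: another \' must come next, but it is not
    '          consumed, so it can start the next match in the same pass
    objReg.Pattern = "\\'.{2,3}(?=\\')"
    strTestString = objReg.Replace(strTestString, "")

    ' The last group has no \' after it, so the lookahead skips it.
    ' On this sample it is \'00, which a plain Replace can remove.
    strTestString = Replace(strTestString, "\'00", "")

    Debug.Print strTestString

    Set objReg = Nothing
End Sub
```

    Because the lookahead leaves the following \' in place, a single global Replace handles consecutive groups and no Do While loop is needed. The plain Replace at the end removes every literal \'00 still in the string, not only a trailing one.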

  14. #14

    Thread Starter
    Hyperactive Member
    Join Date
    Oct 2006
    Posts
    403

    Re: "Wild" Cards in String manipulation

    Thank you again Mark T. I have used your code to write a subroutine called "CLEANUP1". My code follows. I have included lots of comments so that it will serve as a set of notes in my records as to how to implement a "wildcard" function in string manipulation and the replace method.


    I can find no reference whatsoever to RegExp / objreg anywhere in any of my VB6 textbooks. The more one uses VB6 the more "hidden" attributes of it one discovers, thanks largely to this Forum.

    In my code, I wanted to make the test for a character set to be deleted rather more discerning than \???\ or \????\ in case such a sequence were to crop up as valid text. Therefore I have, as you will see, refined the search criteria to \'??\' or \'???\' It is less likely that such groups will ever appear in valid text (unless I were writing a description of this very technique!!).

    The problem is solved. I hope that your code and my slight modification of it may be of use to other Forum users to implement "Wildcards". This code now works fine for me. I will mark the thread RESOLVED.

    Am greatly obliged to you.


    camoore

    Wales, UK



    Code:
    Private Sub CLEANUP1()
        'This sub cleans out of string CONTENT any character set of the form
        '\'??\' or \'???\'. The variables CONTENT and CHAR have been
        'dimensioned as Strings at Option Explicit.
        'Finally it cleans out any remaining \'00 group.
        Dim objReg
        Dim i As Integer

        Set objReg = CreateObject("VBScript.RegExp")
        objReg.Global = True
        objReg.IgnoreCase = True

        CHAR = "\'" 'set CHAR to \'

        For i = 2 To 3
            'ie. we are looking for "\'??\'" or "\'???\'"
            objReg.Pattern = "\\'" & String(i, ".") & "\\'"
            Do While objReg.test(CONTENT)
                'if we have found either "\'??\'" or "\'???\'"
                'we replace them by just "\'"
                CONTENT = objReg.Replace(CONTENT, CHAR)
            Loop
        Next i

        'Now to have the final remaining \'00 group removed also
        objReg.Pattern = "\\'00"
        CHAR = ""
        CONTENT = objReg.Replace(CONTENT, CHAR)

        Debug.Print CONTENT

        Set objReg = Nothing

        RichTextBox1.Text = CONTENT
    End Sub
    Last edited by camoore; Aug 8th, 2009 at 04:46 AM. Reason: Typo
