Thread: [RESOLVED] Need help with replace issue

  1. #1

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    [RESOLVED] Need help with replace issue

    Hi guys,

    say I have the following data in 4 array elements:

    Code:
    www.forum.co.za
    
    www.vbforums.com
    
    http://www.vbforums.com
    
    [email protected]
    I loop through each item in the array like this:

    Code:
    strEmailIn = Split(stremail, ",")
         
         Call BubbleSort1(strEmailIn)
         
         For C = 0 To UBound(strEmailIn())
    
           If C = 0 Then
            
             If Not InStr(strEmailIn(C), "@") = 0 Then
                strHTML = Replace(strHTML, strEmailIn(C), "<a href=""mailto:" & strEmailIn(C) & """> " & strEmailIn(C) & "</a>")
             Else
                strHTML = Replace(strHTML, strEmailIn(C), "<a  href=""http://" & Replace(strEmailIn(C) & """ target=""_blank""", "http://", "") & """> " & Replace(strEmailIn(C), "http://", "") & "</a> ")
             End If
           Else
              
                If (strEmailIn(C) <> strEmailIn(C - 1)) Then
                    If Not InStr(strEmailIn(C), "@") = 0 Then
                        strHTML = Replace(strHTML, strEmailIn(C), "<a href=""mailto:" & strEmailIn(C) & """> " & strEmailIn(C) & "</a>")
                    Else
                        strHTML = Replace(strHTML, strEmailIn(C), "<a  href=""http://" & Replace(strEmailIn(C) & """ target=""_blank""", "http://", "") & """> " & Replace(strEmailIn(C), "http://", "") & "</a> ")
    
                       End If
                End If
           
           
          End If
          
          
         Next C
    I'm doing an RTB to HTML conversion. The problem is, for example:

    www.vbforums.com has been replaced with:

    Code:
    <a  href="http://www.vbforums.com" target="_blank""> www.vbforums.com</a>
    but when the loop reaches:

    http://www.vbforums.com it replaces the previous www.vbforums.com with another:

    Code:
    <a  href="http://www.vbforums.com" target="_blank""> www.vbforums.com</a>
    and thus my HTML breaks.

    How can I work around this?

  2. #2
    Fanatic Member
    Join Date
    Jan 2007
    Location
    Middletown, CT
    Posts
    948

    Re: Need help with replace issue

    I'm not sure, but I think it is because when you replace a string with a larger string and continue searching the string from the last character position, you're actually searching the replacement.

    For example, let's say I want to replace the "test.com" with "www.test.com" in the string "http://test.com".
    Using your method, I would search through the string one character at a time and arrive at "test.com" at position 8. Replacing "test.com" with "www.test.com", I would get this string: "http://www.test.com". The loop would increment by one, so we'd be looking for "test.com" starting at position 9. Position 9, however, would be the second "w" of the "www" portion. Therefore, when we get to position 12, we're at the start of the next "test.com", which is actually the middle of the replacement.

    To fix this, there are a few options. You can set up a do loop block outside the for next loop and loop until a flag gets set to true. When you find the string and replace it, you can exit the for loop. You wouldn't set the flag until you found no further replacements. The beginning of the for next loop would have to be a variable that would be changed when you replace text to be the position after the text. By doing this, you'd be exiting the for next loop then reinitializing it with the new parameters.

    I'm not sure of any other way to do that, as modifying the array in your code would change the UBound, and thus it would change the parameters of the For...Next loop, which would have to be reinitialized.

  3. #3
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    Before doing the replace, first tag the text you're going to process later in the loop, e.g. replace it with an HTML comment that carries the array index as its value, or with some other unique key that the data in the array cannot affect.

    The Replace in the loop would then become:

    strHTML = Replace(strHTML, "<!--" & C & "-->", "<a href=""mailto:" & strEmailIn(C) & """> " & strEmailIn(C) & "</a>")

    And if you end up with "<!--" & C & "-->" tags after processing then there's something wrong with the logic as it missed replacing these items.
    Last edited by leinad31; Jun 11th, 2008 at 09:56 PM.

  4. #4

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    thanks for the suggestions guys. Leinad31, please explain your post to me. It seems like a great idea but i'm confused

  5. #5
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    On the first pass through the array you convert instances of the text to <!--0-->, <!--1-->, etc. The second pass through the array converts these comments to links. Since they are in comment format, you won't accidentally replace other text (unless <!--0--> or similar already exists in strHTML before you begin any processing).

  6. #6
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Need help with replace issue

    leinad, you always give great advice, but sometimes an example is worth a thousand words.

  7. #7

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    correct me if i'm wrong. But I replaced all instances of "http://" with "" before running my loops and it works great

  8. #8
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    strHTML = "www.forum.co.za___www.vbforums.com___http://[email protected]"

    Note that strHTML contains strings in your array. You iterate through your array and replace with comment format. You end up with

    strHTML = "<!--0-->___<!--1-->___<!--2-->___<!--3-->"

    You iterate again through the array to replace comment format with link format. Such as

    strHTML = Replace(strHTML, "<!--" & C & "-->", "<a href=""mailto:" & strEmailIn(C) & """> " & strEmailIn(C) & "</a>")

    If C was zero then it would replace <!--0--> and you'll get

    strHTML = "<a href=""mailto:www.forum.co.za"">www.forum.co.za</a>___<!--1-->___<!--2-->___<!--3-->"

    Just continue the process.

    You can make other variations (or use other string tokens instead of a comment)... Just bear in mind the central idea, which is the use of tokens. I used the comment form of token to keep the sample simple.

    The downside of the comment form is that if a token is not converted to a link successfully, the text is no longer visible on the page. An alternative would be to use the token "<a>" & strEmailIn(C) & "</a>" and search for that when replacing.
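
    A minimal sketch of the two passes, for anyone who wants to see them side by side. LinkifyWithTokens is a hypothetical name, and the sketch assumes that no array item is a substring of another item and that strHTML does not already contain comments of the form <!--n-->.

    Code:
    ' Sketch only: LinkifyWithTokens is a hypothetical helper.
    ' Assumes no item in strItems is a substring of another item.
    Public Function LinkifyWithTokens(ByVal strHTML As String, strItems() As String) As String
        Dim C As Long
        Dim strBare As String
        Dim strLink As String
    
        ' Pass 1: swap each item for a token that no other item can match.
        For C = LBound(strItems) To UBound(strItems)
            strHTML = Replace(strHTML, strItems(C), "<!--" & C & "-->")
        Next C
    
        ' Pass 2: swap each token for the finished link.
        For C = LBound(strItems) To UBound(strItems)
            If InStr(strItems(C), "@") > 0 Then
                strLink = "<a href=""mailto:" & strItems(C) & """>" & strItems(C) & "</a>"
            Else
                strBare = Replace(strItems(C), "http://", "")
                strLink = "<a href=""http://" & strBare & """ target=""_blank"">" & strBare & "</a>"
            End If
            strHTML = Replace(strHTML, "<!--" & C & "-->", strLink)
        Next C
    
        LinkifyWithTokens = strHTML
    End Function
    Duplicate array items need no special handling in this form: the first copy consumes every occurrence in pass 1, so the token of the second copy never appears in strHTML and its Replace in pass 2 simply finds nothing.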

  9. #9
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    Quote Originally Posted by Nitesh
    correct me if i'm wrong. But I replaced all instances of "http://" with "" before running my loops and it works great
    That will work until a set of data you didn't anticipate comes along. Consider www.google.com.ph followed later by www.google.com.uk, and finally by www.google.com, which affects the links of the previous two.

    Use tokens.
    Last edited by leinad31; Jun 12th, 2008 at 08:27 PM.

  10. #10
    Fanatic Member
    Join Date
    Jan 2007
    Location
    Middletown, CT
    Posts
    948

    Re: Need help with replace issue

    What's a token?

  11. #11
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    An atomic object in parsing http://en.wikipedia.org/wiki/Token

  12. #12
    Head Hunted anhn's Avatar
    Join Date
    Aug 2007
    Location
    Australia
    Posts
    3,669

    Re: Need help with replace issue

    The method suggested by leinad is the best way for your case. With this method you don't need to use sorting.

    You can clearly see the problem when one item is a substring of one or more other items.

    For example, with
    str = "ADBCDEF"
    now you want to replace "CD" with "xCDy" and replace "D" with "Dx" and want to have final as: str = "ADxBxCDyEF"

    Method 1 (2 steps):
    str = Replace(str, "CD", "xCDy") '-- "ADBxCDyEF"
    str = Replace(str, "D", "Dx") '-- "ADxBxCDxyEF" : wrong!

    Method 2 (2 steps):
    str = Replace(str, "D", "Dx") '-- "ADxBCDxEF"
    str = Replace(str, "CD", "xCDy") '-- "ADxBxCDyxEF" : wrong!

    Method 3 (4 steps): with #1 and #2 as temp tokens
    str = Replace(str, "CD", "#1") '-- "ADB#1EF"
    str = Replace(str, "D", "#2") '-- "A#2B#1EF"
    str = Replace(str, "#1", "xCDy") '-- "A#2BxCDyEF"
    str = Replace(str, "#2", "Dx") '-- "ADxBxCDyEF" : correct!
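
    Method 3 can be wrapped in a small helper, sketched below under some assumptions. MultiReplace is a hypothetical name, not a built-in. The tokens are written as "#n#" instead of "#n" so that "#1" cannot match inside "#10" once there are ten or more items, and the input and the replacement texts are assumed not to contain anything of the form "#n#" already.

    Code:
    ' Sketch only: MultiReplace is a hypothetical helper.
    ' vFind and vRepl are arrays with the same bounds; list longer
    ' search strings before the shorter ones they contain.
    Public Function MultiReplace(ByVal strText As String, vFind As Variant, vRepl As Variant) As String
        Dim i As Long
    
        ' Steps 1..n: every search string becomes its own closed token.
        For i = LBound(vFind) To UBound(vFind)
            strText = Replace(strText, vFind(i), "#" & i & "#")
        Next i
    
        ' Steps n+1..2n: every token becomes its final replacement.
        For i = LBound(vFind) To UBound(vFind)
            strText = Replace(strText, "#" & i & "#", vRepl(i))
        Next i
    
        MultiReplace = strText
    End Function
    With the example above, MultiReplace("ADBCDEF", Array("CD", "D"), Array("xCDy", "Dx")) goes through "ADB#0#EF" and "A#1#B#0#EF" and returns "ADxBxCDyEF".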
    • Don't forget to use [CODE]your code here[/CODE] when posting code
    • If your question was answered please use Thread Tools to mark your thread [RESOLVED]
    • Don't forget to RATE helpful posts

    • Baby Steps a guided tour
    • IsDigits() and IsNumber() functions • Wichmann-Hill Random() function • >> and << functions for VB • CopyFileByChunk

  13. #13

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    this is getting hectic. Say I have these within a string:

    Code:
    this is a test
    www.vbforums.com
    http://www.vbforums.com
    Using my regexp function I extract www.vbforums.com and http://www.vbforums.com.
    I now add these two strings to an array, separated by commas.

    Then I loop through the array and replace the first item with its array index so:

    www.vbforums.com becomes <--0--> and
    http://www.vbforums.com becomes <--1-->

    Now how will I go about replacing that.

    Please help

  14. #14
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    Quote Originally Posted by Nitesh
    this is getting hectic. Say I have these within a string:

    Code:
    this is a test
    www.vbforums.com
    http://www.vbforums.com
    Using my regexp function I extract www.vbforums.com and http://www.vbforums.com.
    I now add these two strings to an array, separated by commas.

    Then I loop through the array and replace the first item with its array index so:

    www.vbforums.com becomes <--0--> and
    http://www.vbforums.com becomes <--1-->

    Now how will I go about replacing that.

    Please help
    If you can Replace() A's with B's then you can Replace() B's with A's. Convert instances of domain to token:

    strHTML = Replace(strHTML, strEmailIn(C), "<!--" & C & "-->")

    Then as already explained convert tokens to links. Don't convert directly from domains into links.

  15. #15

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    sorry for my ignorance,

    but using this text

    www.google.com

    www.google.com.au

    and www.google.com.ph

    and this code:

    Code:
     For B = 0 To UBound(strEmailIn())
           
            tempstr = stremailin(b)
    
            strHTML = Replace(strHTML, tempStr, "<-- " & B & " -->")
    
    
         Next B
    www.google.com becomes <--0-->

    www.google.com.au becomes <--0-->.au

    etc. Is this right?

  16. #16
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    Yup that would happen, really depends on the data. Please post your regex code. I think it would be best to implement introduction of token there.

  17. #17

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    but then that takes me back to square 1 .

    this is my regex code: please advise

    Code:
    Public Function rgxExtract(Optional ByVal Target As Variant, Optional Pattern As String = "", Optional ByVal Item As Long = 0, Optional CaseSensitive As Boolean = False, Optional FailOnError As Boolean = True, Optional Persist As Boolean = False) As Variant
    
      Dim arrEmails() As String
    
      Const rgxPROC_NAME = "rgxExtract"
    
      Static oRE As Object 'VBScript_RegExp_55.RegExp
        'Static declaration means we don't have to create
        'and compile the RegExp object every single time
        'the function is called.
      Dim oMatches As Object 'VBScript_RegExp_55.MatchCollection
      Dim retstring As String 'comma-separated list of matches
      Dim j As Long
    
      On Error GoTo ErrHandler
    
      rgxExtract = Null 'Default return value
        'NB: if FailOnError is false, returns Null on error
    
      If IsMissing(Target) Then
        'This is the signal to dispose of oRE
        Set oRE = Nothing
        Exit Function 'with default value
      End If
    
      'Create the RegExp object if necessary
      If oRE Is Nothing Then
        Set oRE = CreateObject("VBScript.Regexp")
      End If
    
      With oRE
        'Check whether the current arguments (other than Target)
        'are different from those stored in oRE, and update them
        '(thereby recompiling the regex) only if necessary.
        If CaseSensitive = .IgnoreCase Then
          .IgnoreCase = Not .IgnoreCase
        End If
    
        .Global = True
        .MultiLine = True
    
        If Pattern <> .Pattern Then
          .Pattern = Pattern
        End If
    
        'Finally, execute the match
        If IsNull(Target) Then
          rgxExtract = Null
        Else
          Set oMatches = oRE.Execute(Target)
    
          If oMatches.count > 0 Then
            'Build a comma-separated list of all matches
            retstring = ""
            For j = 0 To oMatches.count - 1
              retstring = retstring & oMatches(j) & ","
            Next j
    
            If retstring <> "" Then
              'Drop the trailing comma
              retstring = Left(retstring, Len(retstring) - 1)
              rgxExtract = retstring
              Exit Function
            End If
          Else
            rgxExtract = Null
          End If
        End If
      End With
    
      'Tidy up and normal exit
      If Not Persist Then Set oRE = Nothing
      Exit Function
    
    ErrHandler:
        Set oRE = Nothing
    
    End Function
    this is how I call it:

    Code:
        stremail = rgxExtract(strHTML, "(https?://)?(([0-9a-z_!~*'().&=+$%-]+: )?[0-9a-z_!~*'().&=+$%-]+@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!~*'()-]+\.)*([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)*", , True, False, False)

  18. #18

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    OK, I'm finally understanding.

    But now this is my output:

    www.google.com

    www.google.com and then .au separately.

    How can I fix that?

  19. #19
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    My VB regex is kind of fuzzy and I won't have time to research... but doesn't it have a property that returns the start position and length of each match? You can then get the text before, after and between the matches (that is, before, after and between the domains) and rebuild strHTML with the tokens inserted by string concatenation (or a better alternative to concatenation) rather than by using Replace.

    Are there performance issues such as processing lots of HTML text?

  20. #20

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    Hi again,

    Thank you so much for helping me. I'm almost there. I haven't had any performance issues so far. I usually send fairly small bits of text. I will try the regex properties you spoke about and let you know what happens.

  21. #21

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    Hi Leinad31,

    Regex doesn't have those properties

  22. #22
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    You might be referring to the match collection instead of the match object. I did some research: http://www.regular-expressions.info/vbscript.html

    For this I would use a specialized procedure rather than a generic regex wrapper... yes, you'll always pass a pattern, execute, and get a match collection, but what you do next depends on what you're trying to accomplish... a CSV return will not always be applicable. You shouldn't only be returning matches... you also need to return the string with the matches replaced by tokens.

    Logic would have been something like this.

    - Get match collection and resize a string array, say arrTmp, from 0 to count

    - Check if there are matches via matchColl.Count property. If so then following is code under IF

    - If there are matches then ; initialize to 1 a variable, say lastPos, for tracking end of last match then iterate thru match collection For arrIdx = 0 To matchColl.Count - 1

    - In loop - based on lastPos and matchColl(arrIdx).FirstIndex property you can Mid(strInput, lastPos, matchColl(arrIdx).FirstIndex - lastPos +1) the part of the string that didn't match. Store this in arrTmp(arrIdx) with token "<!--" & arrIdx & "-->" concatenated. EDIT: actually you can already concat link instead
    - In loop - after storing non-matched string with trailing token to array, also update your CSV if you want to retain that method: retstring = retstring & oMatches(j) & ","
    - In loop - don't forget to update lastPos to character position after match or matchColl(arrIdx).FirstIndex + 1 + matchColl(arrIdx).Length, so next iteration extracts next non-matched text into array arrTmp.
    - After loop, with matches - Concatenate trailing non-matched text to last array element in arrTmp. Check first if lastPos <= strInput length before trying to do a Mid() or Right().

    - This is the ELSE part , or no matches were found for pattern - simply assign strInput to arrTmp(0)

    - END IF

    - strOutput or string with tokens in place would be Join(arrTmp, ""), also return your CSV list of matches.
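
    The steps above can be sketched roughly as follows, going straight to links as mentioned in the EDIT. LinkifyMatches is a hypothetical name and strPattern stands for whatever URL/e-mail pattern is already in use. The sketch relies on the FirstIndex (0-based), Length and Value properties of the VBScript Match object, and it assumes strInput contains no existing links that the pattern would match a second time.

    Code:
    ' Sketch only: LinkifyMatches is a hypothetical helper.
    Public Function LinkifyMatches(ByVal strInput As String, ByVal strPattern As String) As String
        Dim oRE As Object       'VBScript_RegExp_55.RegExp
        Dim oMatches As Object  'VBScript_RegExp_55.MatchCollection
        Dim oMatch As Object    'VBScript_RegExp_55.Match
        Dim arrTmp() As String
        Dim arrIdx As Long
        Dim lastPos As Long     '1-based position just after the previous match
        Dim strText As String
        Dim strLink As String
    
        Set oRE = CreateObject("VBScript.RegExp")
        oRE.Global = True
        oRE.IgnoreCase = True
        oRE.MultiLine = True
        oRE.Pattern = strPattern
    
        Set oMatches = oRE.Execute(strInput)
        ReDim arrTmp(0 To oMatches.Count)
        lastPos = 1
    
        For arrIdx = 0 To oMatches.Count - 1
            Set oMatch = oMatches.Item(arrIdx)
            strText = oMatch.Value
    
            If Len(strText) = 0 Then
                strLink = ""    'ignore empty matches
            ElseIf InStr(strText, "@") > 0 Then
                strLink = "<a href=""mailto:" & strText & """>" & strText & "</a>"
            ElseIf InStr(strText, "://") > 0 Then
                strLink = "<a href=""" & strText & """ target=""_blank"">" & strText & "</a>"
            Else
                strLink = "<a href=""http://" & strText & """ target=""_blank"">" & strText & "</a>"
            End If
    
            ' Non-matched text since the previous match, followed by the link.
            ' FirstIndex is 0-based and Mid$ is 1-based, hence the + 1.
            arrTmp(arrIdx) = Mid$(strInput, lastPos, oMatch.FirstIndex + 1 - lastPos) & strLink
            lastPos = oMatch.FirstIndex + 1 + oMatch.Length
        Next arrIdx
    
        ' Trailing text after the last match (the whole input if nothing matched).
        If lastPos <= Len(strInput) Then arrTmp(oMatches.Count) = Mid$(strInput, lastPos)
    
        LinkifyMatches = Join(arrTmp, "")
    End Function
    Because each match is consumed by position rather than searched for again by value, www.google.com and www.google.com.au cannot interfere with each other, and no sorting or token pass is needed.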
    Last edited by leinad31; Jun 13th, 2008 at 01:43 PM.

  23. #23
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    On second thought, don't use rgxExtract()... from what I described above you can already place the links rather than doing a Replace() later. The one-size-fits-all approach you're trying with rgxExtract() isn't applicable... it would have been better off as a wrapper that returns the match collection object, since further processing of the match collection is case by case, depending on what you're trying to accomplish. Also, the use of rgxExtract() shifted the focus back to basic string manipulation functions on its return value. The match collection and match object were forgotten, and the information their properties provide was not taken advantage of.
    Last edited by leinad31; Jun 13th, 2008 at 01:46 PM.

  24. #24
    Head Hunted anhn's Avatar
    Join Date
    Aug 2007
    Location
    Australia
    Posts
    3,669

    Re: Need help with replace issue

    We have gone a bit too far. There is an easy way that is also error-free:
    * Sort the domain names by length, longest first. This makes sure that if domain name A is a substring of domain name B, B will be processed first.
    * Replace each domain name, in sorted order, with a "token" as mentioned, such as "[!@#$%--" & c & "--!@#$%]" where c is the index.
    * After finishing all the domain names, replace all "tokens" with the corresponding link.
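
    One possible way to do the first two steps without reordering the array, so that each token keeps the item's original index. This is only a sketch: TokenizeLongestFirst is a hypothetical name, and it assumes a 0-based array such as the one Split returns.

    Code:
    ' Sketch only: TokenizeLongestFirst is a hypothetical helper.
    ' Replaces every item with its token, always taking the longest
    ' item that has not been processed yet.
    Public Function TokenizeLongestFirst(ByVal strHTML As String, strItems() As String) As String
        Dim blnDone() As Boolean
        Dim n As Long
        Dim i As Long
        Dim best As Long
    
        ReDim blnDone(LBound(strItems) To UBound(strItems))
    
        For n = LBound(strItems) To UBound(strItems)
            ' Find the longest item that has not been tokenized yet.
            best = -1
            For i = LBound(strItems) To UBound(strItems)
                If Not blnDone(i) Then
                    If best = -1 Then
                        best = i
                    ElseIf Len(strItems(i)) > Len(strItems(best)) Then
                        best = i
                    End If
                End If
            Next i
    
            blnDone(best) = True
            strHTML = Replace(strHTML, strItems(best), "[!@#$%--" & best & "--!@#$%]")
        Next n
    
        TokenizeLongestFirst = strHTML
    End Function
    With www.google.com, www.google.com.au and www.google.com.ph at indexes 0 to 2, the two longer names are tokenized first, so by the time www.google.com is processed only its standalone occurrences are left. The third step is then a plain loop that replaces "[!@#$%--" & c & "--!@#$%]" with the link for strItems(c).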

  25. #25
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: Need help with replace issue

    Quote Originally Posted by anhn
    We have gone a bit too far. There is an easy way that is also error-free:
    * Sort the domain names by length, longest first. This makes sure that if domain name A is a substring of domain name B, B will be processed first.
    * Replace each domain name, in sorted order, with a "token" as mentioned, such as "[!@#$%--" & c & "--!@#$%]" where c is the index.
    * After finishing all the domain names, replace all "tokens" with the corresponding link.
    He's using regex after all, so might as well take advantage of that fact. We can consider the results of regex as tokens themselves.

  26. #26

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    Thanks guys,

    I've added this code to my module. Please check it for me. I'm sorting by length. Please point out any possible issues for me.

    Code:
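    Public Sub SortByLen(DomArray As Variant)
        Dim j As Long
        Dim jMin As Long
        Dim jMax As Long
        Dim temp As Variant
        Dim blnSwap As Boolean
    
        jMin = LBound(DomArray)
        jMax = UBound(DomArray) - 1
    
        Do
            blnSwap = False
    
            For j = jMin To jMax
                ' Longest first: swap when the left item is shorter.
                If Len(DomArray(j)) < Len(DomArray(j + 1)) Then
                    temp = DomArray(j)
                    DomArray(j) = DomArray(j + 1)
                    DomArray(j + 1) = temp
                    blnSwap = True
                End If
            Next j
    
            ' The shortest remaining item is now in place, so shrink the range
            ' once per pass. Decrementing inside the For loop would leave the
            ' next pass with nothing to scan and end the sort after one pass.
            jMax = jMax - 1
        Loop Until Not blnSwap
    End Sub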
    It worked well with the sample in my previous post.

  27. #27

    Thread Starter
    PowerPoster Nitesh's Avatar
    Join Date
    Mar 2007
    Location
    Death Valley
    Posts
    2,556

    Re: Need help with replace issue

    Thanks everyone who helped me. I really appreciate it.

  28. #28
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: [RESOLVED] Need help with replace issue

    That holds as long as there are no cases of the same domain appearing with and without a leading www (if http:// was not included), such as www.xyz.com and xyz.com.

  29. #29
    Head Hunted anhn's Avatar
    Join Date
    Aug 2007
    Location
    Australia
    Posts
    3,669

    Re: [RESOLVED] Need help with replace issue

    Glad to see it works for you now.
