
Thread: is Replace() faster than Regex.replace() ?

  1. #1

    Thread Starter
    Frenzied Member wengang's Avatar
    Join Date
    Mar 2000
    Location
    Beijing, China
    Posts
    1,604

    is Replace() faster than Regex.replace() ?

    hi all.
    I've been implementing regular expressions throughout my project this week, and I've been surprised by how much faster it processes string tasks.

    But this time I found one where Replace is working much faster.

    I have an array of 5000 different characters that will be replaced by 5000 other characters.

    The loop looks like:
    VB Code:
    For t = 1 To 5000
        'strText = Replace(strText, Old(t), New(t))
        'strText = strText.Replace(Old(t), New(t))
        strText = Regex.Replace(strText, Old(t), New(t))
    Next
    With the first Replace, it's a bit slow (1 minute for a 1 MB document).
    The second one does the same file in about 20 seconds,
    but with Regex.Replace, after more than 1 minute I stopped the app and it was only at t=240 or so.
    Why?
    everywhere else I've tried, the regex.replace was faster.
    Could it be the character encoding?

    Second question....
    I am trying to do a regex.replace (different one) where the find string may span across more than one line of text
    so I tried:
    sText = Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Multiline)
    but that does not do it.
    I tried:
    sText = Replace(sText, Chr(10), "")
    sText = Replace(sText, Chr(13), "")
    sText = Regex.Replace(sText, "<ut .+?ut>", "")
    and that does work. But I read that regex could search and replace across lines, and I assumed this parameter was it.
    How can I get this to replace when the target Text looks like this:
    <ut sdfdssd>fdsfsdfsdfsdfsdfsdfsdfsdf
    </ut>dsfsdfsdfsdfsd

    ?

    Thanks
    Wengang
    Last edited by wengang; Jun 5th, 2005 at 09:53 PM.
    Wen Gang, Programmer
    VB6, QB, HTML, ASP, VBScript, Visual C++, Java

  2. #2
    type Woss is new Grumpy; wossname's Avatar
    Join Date
    Aug 2002
    Location
    #!/bin/bash
    Posts
    5,682

    Re: is Replace() faster than Regex.replace() ?

    You should stop using legacy methods like Replace(,,).



    Anyway, Regex is .Net compliant and is a highly sophisticated string processing system; its performance is usually good.

    Strings are 'immutable' meaning that if you alter a string, then you are actually making a NEW version of that string (with changes) in memory and replacing the old one with it. This is very costly in CPU time and resources. Regex is pretty quick in most cases if you know how to get the best out of it.

    Further performance gains may be had if you use StringBuilder instead of String to replace characters. StringBuilder objects are not immutable; they are mutable, so all changes occur to the same object ... much faster.
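
    A minimal sketch of the StringBuilder idea applied to the 5000-character case: instead of 5000 passes over the text, walk it once and look each character up in a table. The names MapChars and CharMapSketch are made up for illustration, and Dictionary(Of Char, Char) assumes a framework version with generics (2.0 or later). Note that this is not strictly equivalent to the loop in post #1: there, a character produced by an early replacement can be replaced again by a later one, whereas here every original character is mapped exactly once.

    ```vbnet
    Imports System
    Imports System.Collections.Generic
    Imports System.Text

    Module CharMapSketch
        ' Hypothetical helper: maps every character of the text through a lookup table in one pass.
        Function MapChars(ByVal text As String, ByVal map As Dictionary(Of Char, Char)) As String
            Dim sb As New StringBuilder(text.Length)
            For Each c As Char In text
                Dim replacement As Char
                If map.TryGetValue(c, replacement) Then
                    sb.Append(replacement)   ' character has an entry in the table
                Else
                    sb.Append(c)             ' no entry: copy the character unchanged
                End If
            Next
            Return sb.ToString()
        End Function

        Sub Main()
            Dim map As New Dictionary(Of Char, Char)
            map.Add("a"c, "x"c)
            map.Add("b"c, "y"c)
            Console.WriteLine(MapChars("abcab", map))   ' prints "xycxy"
        End Sub
    End Module
    ```

    The whole job then costs one pass over the document and one final string allocation, however many characters are in the table.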
    I don't live here any more.

  3. #3

    Thread Starter
    Frenzied Member wengang's Avatar
    Join Date
    Mar 2000
    Location
    Beijing, China
    Posts
    1,604

    Re: is Replace() faster than Regex.replace() ?

    I'm trying to make the switch on string methods, just takes getting used to (anyway, conceptually, substring is harder to reason out than right,left, mid)

    clearly I'll have to read up on stringbuilder

    but can you see the problem I have right now?
    The opener and closer for the target string are on different lines and regex is not replacing them
    current workaround is:
    VB Code:
    sText = sText.Replace(Chr(10), "*chr10*")
    sText = sText.Replace(Chr(13), "*chr13*")
    sText = Regex.Replace(sText, "<ut .+?ut>", "")
    sText = sText.Replace("*chr10*", Chr(10))
    sText = sText.Replace("*chr13*", Chr(13))

    but that just solves the symptom, not the actual problem

  4. #4
    type Woss is new Grumpy; wossname's Avatar
    Join Date
    Aug 2002
    Location
    #!/bin/bash
    Posts
    5,682

    Re: is Replace() faster than Regex.replace() ?

    You'll need to modify your regex pattern then.

    O'Reilly's Mastering Regular Expressions is a quality book, and it supports .Net too. I recommend that anyway.
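
    For reference, one way the match could be made to span lines without touching the text first. RegexOptions.Multiline only changes what ^ and $ match; it is RegexOptions.Singleline that lets "." match newline characters. The sample string below is shortened from post #1, and the module name is just for illustration.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module SinglelineSketch
        Sub Main()
            ' Opening and closing tags sit on different lines, as in the original question.
            Dim sText As String = "<ut sdfdssd>fdsfsdf" & Environment.NewLine & "</ut>dsfsdf"

            ' Singleline makes "." match line breaks too, so the lazy .+? can run across them.
            Dim result As String = Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Singleline)

            Console.WriteLine(result)   ' prints "dsfsdf"
        End Sub
    End Module
    ```

    The same option can be switched on inside the pattern itself with the inline form "(?s)<ut .+?ut>".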

  5. #5
    Lively Member skv_noida's Avatar
    Join Date
    May 2005
    Location
    Noida, India
    Posts
    76

    Re: is Replace() faster than Regex.replace() ?

    is Replace() faster than But i think we should use Regex.replace()




  6. #6
    type Woss is new Grumpy; wossname's Avatar
    Join Date
    Aug 2002
    Location
    #!/bin/bash
    Posts
    5,682

    Re: is Replace() faster than Regex.replace() ?

    Quote Originally Posted by skv_noida
    is Replace() faster than But i think we should use Regex.replace()
    Something tells me you aren't really MS certified like your signature suggests.

    Replace(,,) is deprecated as it is a legacy VB6 function and should be treated as obsolete.

    Use either String.replace() or Regex.

  7. #7
    Frenzied Member maged's Avatar
    Join Date
    Nov 2002
    Location
    Egypt
    Posts
    1,040

    Re: is Replace() faster than Regex.replace() ?

    Something tells me you aren't really MS certified like your signature suggests
    Agree

  8. #8

    Thread Starter
    Frenzied Member wengang's Avatar
    Join Date
    Mar 2000
    Location
    Beijing, China
    Posts
    1,604

    Re: is Replace() faster than Regex.replace() ?

    I have that book. Haven't started on it yet :P
    but I will

  9. #9
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339

    Re: is Replace() faster than Regex.replace() ?

    Why do you say Replace is a VB6 function? You don't need the VB6 Runtime files to call the .NET Replace function. Replace comes from the Microsoft.VisualBasic namespace not the Microsoft.VisualBasic.Compatibility namespace. If you reflect it then it just ends up calling the .NET String commands.

    I prefer the newer style using String.Replace or Regex but I wouldn't call it legacy. Just my opinion.

    http://www.builderau.com.au/architec...0269716,00.htm

  10. #10
    Frenzied Member <ABX's Avatar
    Join Date
    Jul 2002
    Location
    Canada eh...
    Posts
    1,622

    Re: is Replace() faster than Regex.replace() ?

    I would think that System.String.Replace() would be much faster than Microsoft.VisualBasic.Strings.Replace(), because System.String.Replace() is an internal call, while Microsoft.VisualBasic.Strings.Replace() calls a bunch of methods in Microsoft.VisualBasic.Strings before using String.IndexOf, Join, Substring, and a few other String methods.

    If you used a compiled Regex, I would expect its performance to equal or exceed String.Replace.
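
    A rough sketch of what "compiled" means here: build one Regex instance with RegexOptions.Compiled and reuse it. The literal "a.b" and the sample lines are invented for the example. Regex.Escape is included because characters such as "." or "*" in a search string would otherwise be read as pattern syntax. Compiling has an up-front cost, so it is likely to pay off only when the same pattern is applied many times, which is not the situation in the 5000-different-patterns loop from post #1.

    ```vbnet
    Imports System
    Imports System.Text.RegularExpressions

    Module CompiledRegexSketch
        Sub Main()
            ' Escape the literal so "." is matched as a dot, not as "any character".
            Dim pattern As String = Regex.Escape("a.b")

            ' Build the Regex once and reuse the same instance for every input.
            Dim rx As New Regex(pattern, RegexOptions.Compiled)

            Dim lines() As String = {"a.b and axb", "no match here", "a.b a.b"}
            For Each line As String In lines
                Console.WriteLine(rx.Replace(line, "Z"))
            Next
        End Sub
    End Module
    ```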

    I think wossname is correct in that Microsoft.VisualBasic.* (except for CompilerServices) should be avoided. From my experience, the functions it provides can be implemented much better by yourself, because your own versions don't have to keep the same signatures as the VB6 methods.

    I'm still very much stuck on MsgBox() but I hardly ever write anything that uses it in a production application.
    Tips:
    • Google is your friend! Search before posting!
    • Name your thread appropriately... "I Need Help" doesn't cut it!
    • Always post your code!!!! We can't read your mind!!! (well, at least most of us!)
    • Always include the name and line of the exception (if one is occurring!)
    • If it is relevant, state the version of Visual Studio/.Net Framework you are using (2002/2003/2005)


    If you think I was helpful, rate my post
    IRC Contact: Rizon/xous ChakraNET/xous Freenode/xous

  11. #11

    Thread Starter
    Frenzied Member wengang's Avatar
    Join Date
    Mar 2000
    Location
    Beijing, China
    Posts
    1,604

    Re: is Replace() faster than Regex.replace() ?

    It was faster, about 3 times faster (String.Replace vs. the Microsoft.VisualBasic Replace()), at least in this case.

    The thing about these new string methods is that they all regard the first character in a string as position 0. Coming in from the old right,left,mid,instr methods, it's insanity.

    I finally wrote some public functions called leftstr,rightstr,midstr, etc. that do the .NET string methods but allow me to write them in the old fashion. For sanity's sake, no more counting from zero~!!!!!
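
    Something along these lines, presumably. This is only a guess at what such wrappers might look like, not the actual functions; the names LeftStr, RightStr and MidStr follow the post, and clamping over-long lengths the way the old functions did is an assumption.

    ```vbnet
    Imports System

    Module OneBasedStrings
        ' First "length" characters (fewer if the string is shorter).
        Function LeftStr(ByVal s As String, ByVal length As Integer) As String
            Return s.Substring(0, Math.Min(length, s.Length))
        End Function

        ' Last "length" characters (fewer if the string is shorter).
        Function RightStr(ByVal s As String, ByVal length As Integer) As String
            Dim n As Integer = Math.Min(length, s.Length)
            Return s.Substring(s.Length - n, n)
        End Function

        ' "start" is 1-based here; Substring itself counts from 0.
        Function MidStr(ByVal s As String, ByVal start As Integer, ByVal length As Integer) As String
            Dim index As Integer = start - 1
            If index >= s.Length Then Return ""
            Return s.Substring(index, Math.Min(length, s.Length - index))
        End Function

        Sub Main()
            Console.WriteLine(LeftStr("Beijing", 3))     ' prints "Bei"
            Console.WriteLine(RightStr("Beijing", 4))    ' prints "jing"
            Console.WriteLine(MidStr("Beijing", 2, 3))   ' prints "eij"
        End Sub
    End Module
    ```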

    btw, what would you be using in place of msgbox?

    And off topic, I'm still looking for the .NET equivalent of UnloadMode to tell me in the Form_Closing event whether the form was closed by a click on the X or not. Anybody?
    VB Code:
    Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
        If UnloadMode = 0 Then Cancel = 1 'prevent closing by X
    End Sub

  12. #12
    Frenzied Member Asgorath's Avatar
    Join Date
    Sep 2004
    Location
    Saturn
    Posts
    2,036

    Re: is Replace() faster than Regex.replace() ?

    Hi
    Set e.Cancel = True in the Form.Closing event.

    Regards
    Jorge
    "The dark side clouds everything. Impossible to see the future is."

  13. #13
    Frenzied Member <ABX's Avatar
    Join Date
    Jul 2002
    Location
    Canada eh...
    Posts
    1,622

    Re: is Replace() faster than Regex.replace() ?

    MessageBox.Show("Message HERE")

    What's wrong with zero-based numbering? I don't mind it, but I guess you get used to it.

  14. #14
    Frenzied Member <ABX's Avatar
    Join Date
    Jul 2002
    Location
    Canada eh...
    Posts
    1,622

    Re: is Replace() faster than Regex.replace() ?

    Just out of curiosity I made a test in NUnit (more as a reason to try it out than anything).

    For the test I used an MSN log of 35,706 bytes.

    The tests were almost the same except for the use of the two different versions.

    VB Code:
    Imports NUnit.Framework

    <TestFixture()> _
    Public Class TestNet

        Private Shared Value As String

        ' Runs before each test: loads the sample log into Value.
        <SetUp()> _
        Public Sub Init()
            Dim sr As New System.IO.StreamReader("c:\TestSample.txt")

            Value = sr.ReadToEnd

            sr.Close()

        End Sub

        <Test(Description:="Implementation of the .Net Version String.Replace()")> _
        Public Sub RunTest()
            ' Replace each character code 0..254 with the next one up.
            For I As Integer = 0 To 254
                Value = Value.Replace(Chr(I), Chr(I + 1))
            Next
        End Sub

    End Class

    <TestFixture()> _
    Public Class TestVB6

        Private Shared Value As String

        ' Runs before each test: loads the sample log into Value.
        <SetUp()> _
        Public Sub Init()
            Dim sr As New System.IO.StreamReader("c:\TestSample.txt")

            Value = sr.ReadToEnd

            sr.Close()

        End Sub

        <Test(Description:="Implementation of the VB6 Version Strings.Replace()")> _
        Public Sub RunTest()
            ' Same loop, but through Microsoft.VisualBasic.Strings.Replace().
            For I As Integer = 0 To 254
                Value = Replace(Value, Chr(I), Chr(I + 1))
            Next
        End Sub

    End Class

    According to NUnit the results were:
    TestVB6: 199.6588056 seconds
    .Net: 0.1093764 seconds

    Why is this test so biased?

    This can't be correct...

    Edit: tried with a timer class that uses QueryPerformanceCounter and it gave similar values.

    .Net time: 0.0708956026534099
    vb6 time: 184.826250263651
    Last edited by <ABX; Jun 7th, 2005 at 03:43 AM.

  15. #15

    Thread Starter
    Frenzied Member wengang's Avatar
    Join Date
    Mar 2000
    Location
    Beijing, China
    Posts
    1,604

    Re: is Replace() faster than Regex.replace() ?

    Hey Asgorath,
    thanks, but I have the cancel part no problem
    what I'm trying to determine is whether the form is closing because a subroutine closed it or because the user clicked the form's X.
    in vb6 unloadMode was one of the parameters of the QueryUnload and you could distinguish the two
    but in .net, I haven't found a way to distinguish it.
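
    One possible approach, sketched under the assumption of a Windows Forms app: clicking the X (or pressing Alt+F4) reaches the form as a WM_SYSCOMMAND message carrying SC_CLOSE, while a call to Me.Close() from code does not, so the form can note the message in WndProc and check the note when it is closing. The class name MainForm and the closedByUser field are made up for the example; the two constants are the standard Win32 values. Later framework versions also add a FormClosing event whose arguments carry a CloseReason.

    ```vbnet
    Imports System.ComponentModel
    Imports System.Windows.Forms

    Public Class MainForm
        Inherits Form

        ' Standard Win32 values for the system-menu "close" command.
        Private Const WM_SYSCOMMAND As Integer = &H112
        Private Const SC_CLOSE As Integer = &HF060

        ' Hypothetical flag: True when the close request came from the X button or Alt+F4.
        Private closedByUser As Boolean = False

        Protected Overrides Sub WndProc(ByRef m As Message)
            ' The low four bits of wParam are used internally by Windows, so mask them off.
            If m.Msg = WM_SYSCOMMAND AndAlso (m.WParam.ToInt64() And &HFFF0L) = SC_CLOSE Then
                closedByUser = True
            End If
            MyBase.WndProc(m)
        End Sub

        Protected Overrides Sub OnClosing(ByVal e As CancelEventArgs)
            If closedByUser Then
                e.Cancel = True        ' prevent closing by X, as in the VB6 snippet
                closedByUser = False   ' reset so a later Me.Close() from code still works
            End If
            MyBase.OnClosing(e)
        End Sub
    End Class
    ```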
