is Replace() faster than Regex.replace() ?
hi all.
I've been implementing regular expressions throughout my project this week, and I've been surprised by how much faster it processes string tasks.
But this time I found one where Replace is working much faster.
I have an array of 5000 different characters that will be replaced by 5000 other characters.
The loop looks like:
VB Code:
For t = 1 To 5000
    'strText = Replace(strText, Old(t), New(t))
    'strText = strText.Replace(Old(t), New(t))
    strText = Regex.Replace(strText, Old(t), New(t))
Next
With the first Replace, it's a bit slow (1 minute for a 1 MB document).
The second one does the same file in about 20 seconds.
But with Regex.Replace, when I stop the app after more than 1 minute, it is only at t=240 or so.
Why?
Everywhere else I've tried, Regex.Replace was faster.
Could it be the character encoding?
Second question....
I am trying to do a regex.replace (different one) where the find string may span across more than one line of text
so I tried:
sText = Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Multiline)
but that does not do it.
I tried:
sText = Replace(sText, Chr(10), "")
sText = Replace(sText, Chr(13), "")
sText = Regex.Replace(sText, "<ut .+?ut>", "")
and that does work. But I read that regex could search and replace across lines, and I assumed this parameter was it.
How can I get this to replace when the target Text looks like this:
<ut sdfdssd>fdsfsdfsdfsdfsdfsdfsdfsdf
</ut>dsfsdfsdfsdfsd
?
Thanks
Wengang
Re: is Replace() faster than Regex.replace() ?
You should stop using legacy methods like Replace(,,). :)
Anyway, Regex is part of the .NET Framework and is a highly sophisticated string processing system; performance is usually good.
Strings are 'immutable' meaning that if you alter a string, then you are actually making a NEW version of that string (with changes) in memory and replacing the old one with it. This is very costly in CPU time and resources. Regex is pretty quick in most cases if you know how to get the best out of it.
Further performance gains may be had if you use StringBuilder instead of String to replace characters. StringBuilder objects are mutable, so all changes occur to the same object ... much faster.
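A minimal sketch of the StringBuilder idea applied to the 5000-character case from the question. Instead of making 5000 passes that each build a new copy of the document, it builds a lookup table once and walks the text a single time. The names ReplaceChars, oldChars and newChars are hypothetical, and the sketch assumes every entry is a single character. One difference from the original loop: a character that has already been replaced is not replaced again by a later entry.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module SinglePassReplace
    ' Hypothetical helper: maps each character through a lookup table in one pass.
    Function ReplaceChars(ByVal input As String, ByVal oldChars As Char(), ByVal newChars As Char()) As String
        ' Build the lookup table once: old character -> new character.
        Dim map As New Dictionary(Of Char, Char)
        For i As Integer = 0 To oldChars.Length - 1
            map(oldChars(i)) = newChars(i)
        Next

        ' Walk the text once, appending the mapped character when there is one
        ' and the original character otherwise.
        Dim sb As New StringBuilder(input.Length)
        For Each c As Char In input
            Dim replacement As Char
            If map.TryGetValue(c, replacement) Then
                sb.Append(replacement)
            Else
                sb.Append(c)
            End If
        Next
        Return sb.ToString()
    End Function
End Module
```

Usage would look like strText = ReplaceChars(strText, oldChars, newChars), with the two arrays filled from the existing replacement tables.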
Re: is Replace() faster than Regex.replace() ?
I'm trying to make the switch on string methods, just takes getting used to (anyway, conceptually, substring is harder to reason out than right,left, mid)
clearly I'll have to read up on stringbuilder
but can you see the problem I have right now?
The opener and closer for the target string are on different lines and regex is not replacing them
current workaround is:
VB Code:
sText = sText.Replace(Chr(10), "*chr10*")
sText = sText.Replace(Chr(13), "*chr13*")
sText = Regex.Replace(sText, "<ut .+?ut>", "")
sText = sText.Replace("*chr10*", Chr(10))
sText = sText.Replace("*chr13*", Chr(13))
but that just solves the symptom, not the actual problem
Re: is Replace() faster than Regex.replace() ?
You'll need to modify your regex pattern then.
O'Reilly's Mastering Regular Expressions is a quality book, and it supports .Net too. I recommend that anyway.
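For the multi-line case asked about above, one likely fix is a different option rather than a different pattern. RegexOptions.Multiline only changes where ^ and $ match; it is RegexOptions.Singleline that lets "." match newline characters. A small sketch, keeping the pattern from the question (RemoveUtBlocks is a hypothetical helper name):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripUtTags
    ' Hypothetical helper; the pattern itself is the one from the question.
    Function RemoveUtBlocks(ByVal sText As String) As String
        ' Singleline lets "." match line breaks, so ".+?" can run from
        ' the opening "<ut " on one line to the closing "ut>" on another.
        Return Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Singleline)
    End Function
End Module
```

The same effect can be had inline by starting the pattern with (?s), which avoids the need to replace Chr(10) and Chr(13) with placeholders first.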
Re: is Replace() faster than Regex.replace() ?
Is Replace() faster? But I think we should use Regex.Replace().
Re: is Replace() faster than Regex.replace() ?
Quote:
Originally Posted by skv_noida
Is Replace() faster? But I think we should use Regex.Replace().
Something tells me you aren't really MS certified like your signature suggests.
Replace(,,) is deprecated as it is a legacy VB6 function and should be treated as obsolete.
Use either String.Replace() or Regex.
Re: is Replace() faster than Regex.replace() ?
Quote:
Something tells me you aren't really MS certified like your signature suggests
Agree
Re: is Replace() faster than Regex.replace() ?
I have that book. Haven't started on it yet :P
but I will
Re: is Replace() faster than Regex.replace() ?
Why do you say Replace is a VB6 function? You don't need the VB6 Runtime files to call the .NET Replace function. Replace comes from the Microsoft.VisualBasic namespace not the Microsoft.VisualBasic.Compatibility namespace. If you reflect it then it just ends up calling the .NET String commands.
I prefer the newer style using String.Replace or Regex but I wouldn't call it legacy. Just my opinion.
http://www.builderau.com.au/architec...0269716,00.htm
Re: is Replace() faster than Regex.replace() ?
I would think that System.String.Replace() would be much faster than Microsoft.VisualBasic.Strings.Replace(), because System.String.Replace() is an internal call, while Microsoft.VisualBasic.Strings.Replace() calls a bunch of methods in Microsoft.VisualBasic.Strings before using String.IndexOf, Join, Substring, and a few other String methods.
If you used a compiled Regex, I would expect its performance to equal or exceed String.Replace.
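A rough sketch of what the compiled approach could look like for the replacement loop in the first post. Two things are worth noting about that loop: Regex.Replace treats each Old(t) as a pattern, so an entry such as "." or "*" is a metacharacter unless it is escaped, and every distinct pattern has to be parsed before it can run. Whether that fully explains the slowdown is a guess. The sketch below escapes the keys, joins them into one alternation, builds the regex once, and makes a single pass. ReplaceWithRegex and the map parameter are hypothetical names, and the map is assumed to be non-empty.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CompiledRegexReplace
    ' Hypothetical helper: replaces every key of map with its value in one pass.
    Function ReplaceWithRegex(ByVal input As String, ByVal map As Dictionary(Of String, String)) As String
        ' Escape each key so characters such as "." or "*" are matched literally,
        ' then join the keys into a single alternation: key1|key2|key3 ...
        Dim parts As New List(Of String)
        For Each key As String In map.Keys
            parts.Add(Regex.Escape(key))
        Next
        Dim pattern As String = String.Join("|", parts.ToArray())

        ' Build the regex once. RegexOptions.Compiled costs more up front
        ' in exchange for faster matching afterwards.
        Dim rx As New Regex(pattern, RegexOptions.Compiled)

        ' The evaluator looks up the replacement for each match in the map.
        Dim evaluator As MatchEvaluator = Function(m As Match) map(m.Value)
        Return rx.Replace(input, evaluator)
    End Function
End Module
```

As with any single-pass approach, text that has already been replaced is not scanned again, which differs from running 5000 replacements one after another.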
I think wossname is correct in that Microsoft.VisualBasic.* (except for CompilerServices) should be avoided. From my experience, the functions it provides can be implemented much better by yourself, because your versions don't have to keep the same signatures as the VB6 methods.
I'm still very much stuck on MsgBox() but I hardly ever write anything that uses it in a production application.
Re: is Replace() faster than Regex.replace() ?
It was faster, about 3 times faster (String.Replace vs. the Microsoft.VisualBasic Replace), at least in this case.
The thing about these new string methods is that they all regard the first character in a string as position 0. Coming in from the old right,left,mid,instr methods, it's insanity.
I finally wrote some public functions called leftstr, rightstr, midstr, etc. that wrap the .NET string methods but let me call them the old way. For sanity's sake, no more counting from zero!
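The actual functions were not posted, so the following is only a guess at what such wrappers might look like. It assumes lengths are non-negative and that the start position for MidStr is 1 or greater, mirroring the old Left, Right and Mid.

```vbnet
Imports System

Module StringHelpers
    ' First "length" characters, like the old Left().
    Function LeftStr(ByVal s As String, ByVal length As Integer) As String
        If length >= s.Length Then Return s
        Return s.Substring(0, length)
    End Function

    ' Last "length" characters, like the old Right().
    Function RightStr(ByVal s As String, ByVal length As Integer) As String
        If length >= s.Length Then Return s
        Return s.Substring(s.Length - length)
    End Function

    ' "start" is 1-based, like the old Mid(); it is shifted to the
    ' zero-based index that Substring expects.
    Function MidStr(ByVal s As String, ByVal start As Integer, ByVal length As Integer) As String
        Dim index As Integer = start - 1
        If index >= s.Length Then Return ""
        Return s.Substring(index, Math.Min(length, s.Length - index))
    End Function
End Module
```

Keeping the index shift in one place means the rest of the code can stay 1-based without scattering "- 1" adjustments around.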
btw, what would you be using in place of msgbox?
And off topic, I'm still looking for the .NET equivalent of UnloadMode to tell me in the Form_Closing event whether the form was closed by a click on the X or not. Anybody?
VB Code:
Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
    If UnloadMode = 0 Then Cancel = 1 'prevent closing by X
End Sub
Re: is Replace() faster than Regex.replace() ?
Hi
Set e.Cancel = True in the Form.Closing event.
Regards
Jorge
Re: is Replace() faster than Regex.replace() ?
MessageBox.Show("Message HERE")
What's wrong with zero-based numbering? I don't mind it, but I guess you get used to it.
Re: is Replace() faster than Regex.replace() ?
Just out of curiosity I made a test in NUnit (more as a reason to try it out than anything).
For the test I used an msn log of 35,706 bytes.
The tests were almost the same except for the use of the two different versions
VB Code:
<TestFixture()> _
Public Class TestNet
    Private Shared Value As String

    <SetUp()> _
    Public Sub Init()
        Dim sr As New System.IO.StreamReader("c:\TestSample.txt")
        Value = sr.ReadToEnd
        sr.Close()
    End Sub

    <Test(Description:="Implementation of the .Net Version String.Replace()")> _
    Public Sub RunTest()
        For I As Integer = 0 To 254
            Value = Value.Replace(Chr(I), Chr(I + 1))
        Next
    End Sub
End Class

<TestFixture()> _
Public Class TestVB6
    Private Shared Value As String

    <SetUp()> _
    Public Sub Init()
        Dim sr As New System.IO.StreamReader("c:\TestSample.txt")
        Value = sr.ReadToEnd
        sr.Close()
    End Sub

    <Test(Description:="Implementation of the VB6 Version Strings.Replace()")> _
    Public Sub RunTest()
        For I As Integer = 0 To 254
            Value = Replace(Value, Chr(I), Chr(I + 1))
        Next
    End Sub
End Class
According to NUnit the results were:
TestVB6: 199.6588056 seconds
.Net: 0.1093764 seconds
Why is this test so biased?
This can't be correct...
Edit: tried with a timer class that uses QueryPerformanceCounter and it gave similar values
.Net time: 0.0708956026534099
vb6 time: 184.826250263651
Re: is Replace() faster than Regex.replace() ?
Hey Asgorath,
thanks, but I have the cancel part no problem
what I'm trying to determine is whether the form is closing because a subroutine closed it or because the user clicked the form's X.
in vb6 unloadMode was one of the parameters of the QueryUnload and you could distinguish the two
but in .net, I haven't found a way to distinguish it.