-
Jun 5th, 2005, 09:46 PM
#1
Thread Starter
Frenzied Member
is Replace() faster than Regex.replace() ?
hi all.
I've been implementing regular expressions throughout my project this week, and I've been surprised by how much faster it processes string tasks.
But this time I found one where Replace is working much faster.
I have an array of 5000 different characters that will be replaced by 5000 other characters.
The loop looks like:
VB Code:
For t = 1 To 5000
    'strText = Replace(strText, Old(t), New(t))
    'strText = strText.Replace(Old(t), New(t))
    strText = Regex.Replace(strText, Old(t), New(t))
Next
With the first Replace, it's a bit slow (1 minute for a 1 MB document).
The second one does the same file in about 20 seconds,
but with Regex.Replace, after more than 1 minute I stopped the app and it was only at t = 240 or so.
Why?
Everywhere else I've tried, Regex.Replace was faster.
Could it be the character encoding?
Second question....
I am trying to do a regex.replace (different one) where the find string may span across more than one line of text
so I tried:
sText = Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Multiline)
but that does not do it.
I tried:
sText = Replace(sText, Chr(10), "")
sText = Replace(sText, Chr(13), "")
sText = Regex.Replace(sText, "<ut .+?ut>", "")
and that does work. But I read that regex could search and replace across lines, and I assumed this parameter was it.
How can I get this to replace when the target Text looks like this:
<ut sdfdssd>fdsfsdfsdfsdfsdfsdfsdfsdf
</ut>dsfsdfsdfsdfsd
?
Thanks
Wengang
Last edited by wengang; Jun 5th, 2005 at 09:53 PM.
Wen Gang, Programmer
VB6, QB, HTML, ASP, VBScript, Visual C++, Java
-
Jun 6th, 2005, 03:03 AM
#2
Re: is Replace() faster than Regex.replace() ?
You should stop using legacy methods like Replace(,,).
Anyway, Regex is .NET compliant and is a highly sophisticated string processing system; performance is usually good.
Strings are 'immutable', meaning that if you alter a string, you are actually making a NEW version of that string (with changes) in memory and replacing the old one with it. This is very costly in CPU time and resources. Regex is pretty quick in most cases if you know how to get the best out of it.
Further performance gains may be had if you use StringBuilder instead of String to replace characters. StringBuilder objects are mutable, so all changes occur to the same object ... much faster.
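To make the StringBuilder idea concrete, here is a minimal sketch that walks the text once and maps each character through a lookup table, instead of rescanning the whole string 5000 times. It assumes every Old(t)/New(t) entry is a single Char; the names ReplaceChars and map are made up for illustration and are not from the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module SinglePassReplace
    ' Hypothetical helper: copies text into a StringBuilder, swapping any
    ' character that has an entry in the lookup table.
    Function ReplaceChars(ByVal text As String, ByVal map As Dictionary(Of Char, Char)) As String
        Dim sb As New StringBuilder(text.Length)
        For Each c As Char In text
            Dim replacement As Char
            If map.TryGetValue(c, replacement) Then
                sb.Append(replacement)
            Else
                sb.Append(c)
            End If
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Tiny stand-in for the 5000-entry Old/New arrays.
        Dim map As New Dictionary(Of Char, Char)
        map.Add("a"c, "b"c)
        map.Add("b"c, "c"c)
        Console.WriteLine(ReplaceChars("abba", map)) ' prints bccb
    End Sub
End Module
```

One thing to watch: this is not identical to the loop. The loop applies the replacements one after another, so a character produced by an early replacement can be changed again by a later one (the loop would turn "abba" into "cccc"). The single pass maps each original character exactly once. Which behaviour is wanted depends on the data.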
I don't live here any more.
-
Jun 6th, 2005, 04:27 AM
#3
Thread Starter
Frenzied Member
Re: is Replace() faster than Regex.replace() ?
I'm trying to make the switch on string methods; it just takes getting used to (anyway, conceptually, Substring is harder to reason out than Right, Left, Mid).
Clearly I'll have to read up on StringBuilder,
but can you see the problem I have right now?
The opener and closer for the target string are on different lines and Regex is not replacing them.
The current workaround is:
VB Code:
sText = sText.Replace(Chr(10), "*chr10*")
sText = sText.Replace(Chr(13), "*chr13*")
sText = Regex.Replace(sText, "<ut .+?ut>", "")
sText = sText.Replace("*chr10*", Chr(10))
sText = sText.Replace("*chr13*", Chr(13))
but that just solves the symptom, not the actual problem
-
Jun 6th, 2005, 04:30 AM
#4
Re: is Replace() faster than Regex.replace() ?
You'll need to modify your regex pattern then.
O'Reilly's Mastering Regular Expressions is a quality book, and it supports .Net too. I recommend that anyway.
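For what it's worth, the option that matters here is probably RegexOptions.Singleline rather than RegexOptions.Multiline. Multiline only changes where ^ and $ match; Singleline is the one that lets "." match newline characters. A minimal sketch using the sample text from the first post (the module name is made up):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SinglelineDemo
    Sub Main()
        ' Opening and closing tags sit on different lines, as in the question.
        Dim sText As String = "<ut sdfdssd>fdsfsdfsdf" & Environment.NewLine & "</ut>dsfsdfsdfsdfsd"

        ' Singleline makes "." match line breaks too, so the lazy .+? can
        ' run across the newline until it reaches the first "ut>".
        Dim result As String = Regex.Replace(sText, "<ut .+?ut>", "", RegexOptions.Singleline)

        Console.WriteLine(result) ' prints dsfsdfsdfsdfsd
    End Sub
End Module
```

The same switch can be written inline at the start of the pattern as "(?s)<ut .+?ut>", which avoids passing the options argument.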
I don't live here any more.
-
Jun 6th, 2005, 04:34 AM
#5
Lively Member
Re: is Replace() faster than Regex.replace() ?
is Replace() faster than But i think we should use Regex.replace()
-
Jun 6th, 2005, 04:40 AM
#6
Re: is Replace() faster than Regex.replace() ?
 Originally Posted by skv_noida
is Replace() faster than But i think we should use Regex.replace()
Something tells me you aren't really MS certified like your signature suggests.
Replace(,,) is deprecated, as it is a legacy VB6 function and should be treated as obsolete.
Use either String.Replace() or Regex.
I don't live here any more.
-
Jun 6th, 2005, 06:49 AM
#7
Frenzied Member
Re: is Replace() faster than Regex.replace() ?
Something tells me you aren't really MS certified like your signature suggests
Agree
-
Jun 6th, 2005, 08:23 PM
#8
Thread Starter
Frenzied Member
Re: is Replace() faster than Regex.replace() ?
I have that book. Haven't started on it yet :P
but I will
-
Jun 7th, 2005, 12:37 AM
#9
Re: is Replace() faster than Regex.replace() ?
Why do you say Replace is a VB6 function? You don't need the VB6 Runtime files to call the .NET Replace function. Replace comes from the Microsoft.VisualBasic namespace not the Microsoft.VisualBasic.Compatibility namespace. If you reflect it then it just ends up calling the .NET String commands.
I prefer the newer style using String.Replace or Regex but I wouldn't call it legacy. Just my opinion.
http://www.builderau.com.au/architec...0269716,00.htm
-
Jun 7th, 2005, 01:22 AM
#10
Re: is Replace() faster than Regex.replace() ?
I would think that System.String.Replace() would be much faster than Microsoft.VisualBasic.Strings.Replace(), because System.String.Replace() is an internal call, while Microsoft.VisualBasic.Strings.Replace() calls a bunch of methods in Microsoft.VisualBasic.Strings before using String.IndexOf, Join, Substring, and a few other String methods.
If you used a compiled Regex, I would expect its performance to equal or exceed String.Replace.
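A rough sketch of what "compiled" could look like for the original problem, under some assumptions: the replacement pairs live in a dictionary (the names map and Lookup are invented here), every search string is escaped with Regex.Escape so characters like "." are taken literally, and one Regex is built once and reused instead of calling the static Regex.Replace with a fresh pattern on each iteration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CompiledRegexReplace
    ' Hypothetical lookup table standing in for the Old/New arrays.
    Private map As New Dictionary(Of String, String)

    ' Called once per match; the returned text is inserted literally.
    Private Function Lookup(ByVal m As Match) As String
        Return map(m.Value)
    End Function

    Sub Main()
        map.Add(".", "!")
        map.Add("a", "b")

        ' Escape each key so regex metacharacters are treated as plain text,
        ' then join the keys into a single alternation.
        Dim parts As New List(Of String)
        For Each key As String In map.Keys
            parts.Add(Regex.Escape(key))
        Next
        Dim pattern As String = String.Join("|", parts.ToArray())

        ' Build the Regex once; RegexOptions.Compiled trades start-up time
        ' for faster matching.
        Dim rx As New Regex(pattern, RegexOptions.Compiled)
        Console.WriteLine(rx.Replace("a.a", AddressOf Lookup)) ' prints b!b
    End Sub
End Module
```

Whether this actually beats String.Replace with 5000 alternatives is something to measure rather than assume. Note also that the unescaped loop in the first post treats each Old(t) as a pattern, so any entry that happens to be a metacharacter would not match literally.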
I think wossname is correct in that Microsoft.VisualBasic.* (except for CompilerServices) should be avoided. In my experience, you can implement the functions it provides much better yourself, because your versions don't have to keep the same signatures as the VB6 methods.
I'm still very much stuck on MsgBox() but I hardly ever write anything that uses it in a production application.
Tips:
- Google is your friend! Search before posting!
- Name your thread appropriately... "I Need Help" doesn't cut it!
- Always post your code!!!! We can't read your mind!!! (well, at least most of us!)
- Allways Include the Name and Line of the Exception (if one is occuring!)
- If it is relevant state the version of Visual Studio/.Net Framwork you are using (2002/2003/2005)
If you think I was helpful, rate my post  IRC Contact: Rizon/xous ChakraNET/xous Freenode/xous
-
Jun 7th, 2005, 02:26 AM
#11
Thread Starter
Frenzied Member
Re: is Replace() faster than Regex.replace() ?
it was faster, about 3 times faster. (ms.vb.replace() vs string.replace) at least in this case.
The thing about these new string methods is that they all regard the first character in a string as position 0. Coming in from the old Right, Left, Mid, InStr methods, it's insanity.
I finally wrote some public functions called leftstr, rightstr, midstr, etc. that call the .NET string methods but let me write them the old-fashioned way. For sanity's sake, no more counting from zero~!!!!!
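For anyone wanting to do the same, wrappers along these lines might look roughly like this. It is only a sketch: the actual functions were not posted, so the signatures and the clamping behaviour are guesses, and negative lengths or start values below 1 are not handled.

```vbnet
Imports System

Module OneBasedStrings
    ' Hypothetical wrappers; only the names come from the post.
    Function LeftStr(ByVal s As String, ByVal length As Integer) As String
        Return s.Substring(0, Math.Min(length, s.Length))
    End Function

    Function RightStr(ByVal s As String, ByVal length As Integer) As String
        Dim n As Integer = Math.Min(length, s.Length)
        Return s.Substring(s.Length - n)
    End Function

    ' start is 1-based, as in VB6 Mid; Substring itself is 0-based.
    Function MidStr(ByVal s As String, ByVal start As Integer, ByVal length As Integer) As String
        Dim index As Integer = start - 1
        If index >= s.Length Then Return ""
        Return s.Substring(index, Math.Min(length, s.Length - index))
    End Function

    Sub Main()
        Console.WriteLine(LeftStr("wengang", 3))   ' prints wen
        Console.WriteLine(RightStr("wengang", 4))  ' prints gang
        Console.WriteLine(MidStr("wengang", 2, 3)) ' prints eng
    End Sub
End Module
```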
btw, what would you be using in place of msgbox?
And off topic, I'm still looking for the .NET equivalent of UnloadMode to tell me in the Form_Closing event whether the form was closed by a click on the X or not. Anybody?
VB Code:
Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
    If UnloadMode = 0 Then Cancel = 1 ' prevent closing by X
End Sub
-
Jun 7th, 2005, 03:17 AM
#12
Re: is Replace() faster than Regex.replace() ?
Hi
Set e.Cancel = True in the Form.Closing event.
Regards
Jorge
"The dark side clouds everything. Impossible to see the future is."
-
Jun 7th, 2005, 03:21 AM
#13
Re: is Replace() faster than Regex.replace() ?
MessageBox.Show("Message HERE")
What's wrong with zero-based numbering? I don't mind it, but I guess you get used to it.
-
Jun 7th, 2005, 03:33 AM
#14
Re: is Replace() faster than Regex.replace() ?
Just out of curiosity I made a test in NUnit (more as a reason to try it out than anything).
For the test I used an MSN log of 35,706 bytes.
The two tests are identical except for which version of Replace they call.
VB Code:
Imports NUnit.Framework

<TestFixture()> _
Public Class TestNet
    Private Shared Value As String

    ' Reload the sample file before each test.
    <SetUp()> _
    Public Sub Init()
        Dim sr As New System.IO.StreamReader("c:\TestSample.txt")
        Value = sr.ReadToEnd
        sr.Close()
    End Sub

    <Test(Description:="Implementation of the .Net Version String.Replace()")> _
    Public Sub RunTest()
        ' Each pass replaces Chr(I) with Chr(I + 1), so the changes cascade.
        For I As Integer = 0 To 254
            Value = Value.Replace(Chr(I), Chr(I + 1))
        Next
    End Sub
End Class

<TestFixture()> _
Public Class TestVB6
    Private Shared Value As String

    ' Reload the sample file before each test.
    <SetUp()> _
    Public Sub Init()
        Dim sr As New System.IO.StreamReader("c:\TestSample.txt")
        Value = sr.ReadToEnd
        sr.Close()
    End Sub

    <Test(Description:="Implementation of the VB6 Version Strings.Replace()")> _
    Public Sub RunTest()
        ' Same cascading loop, using Microsoft.VisualBasic.Strings.Replace.
        For I As Integer = 0 To 254
            Value = Replace(Value, Chr(I), Chr(I + 1))
        Next
    End Sub
End Class
According to NUnit the results were:
TestVB6: 199.6588056 seconds
.Net: 0.1093764 seconds
Why is this test so biased ?????
This can't be correct...
Edit: tried with a timer class that uses QueryPerformanceCounter and it gave similar values
.Net time: 0.0708956026534099
vb6 time: 184.826250263651
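If anyone wants to repeat the comparison without NUnit or a custom timer class, a sketch along these lines should do. It assumes .NET 2.0 or later for System.Diagnostics.Stopwatch, uses a made-up stand-in string instead of the MSN log, and uses ChrW rather than Chr, so the characters above 127 may differ from the test above.

```vbnet
Imports System
Imports System.Diagnostics
Imports Microsoft.VisualBasic

Module ReplaceTiming
    Sub Main()
        ' Stand-in input (assumption): 35,706 identical characters,
        ' not the original log file.
        Dim sample As New String("a"c, 35706)

        ' Time System.String.Replace with the same cascading loop.
        Dim netValue As String = sample
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 0 To 254
            netValue = netValue.Replace(ChrW(i), ChrW(i + 1))
        Next
        sw.Stop()
        Console.WriteLine("String.Replace:  " & sw.Elapsed.TotalSeconds.ToString() & " s")

        ' Time Microsoft.VisualBasic.Strings.Replace on a fresh copy.
        Dim vbValue As String = sample
        sw = Stopwatch.StartNew()
        For i As Integer = 0 To 254
            vbValue = Strings.Replace(vbValue, ChrW(i), ChrW(i + 1))
        Next
        sw.Stop()
        Console.WriteLine("Strings.Replace: " & sw.Elapsed.TotalSeconds.ToString() & " s")
    End Sub
End Module
```

Keep in mind that the loop cascades: every character replaced in one pass is replaced again in the next, so the later iterations touch almost the whole string. That is close to a worst case for any implementation that does per-match work, and the numbers will vary with the framework version.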
Last edited by <ABX; Jun 7th, 2005 at 03:43 AM.
-
Jun 7th, 2005, 08:45 AM
#15
Thread Starter
Frenzied Member
Re: is Replace() faster than Regex.replace() ?
Hey Asgorath,
Thanks, but I have the cancel part, no problem.
What I'm trying to determine is whether the form is closing because a subroutine closed it or because the user clicked the form's X.
In VB6, UnloadMode was one of the parameters of QueryUnload, so you could distinguish the two,
but in .NET I haven't found a way to distinguish them.
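One possible approach, sketched under assumptions: in .NET 2.0 and later the FormClosing event passes a FormClosingEventArgs whose CloseReason property says why the form is closing. As far as I know, a plain Me.Close() call from code is also reported as CloseReason.UserClosing, so the sketch adds its own flag to tell the two apart. The names MainForm, closingFromCode and CloseFromCode are made up.

```vbnet
Imports System
Imports System.Windows.Forms

Public Class MainForm
    Inherits Form

    ' Hypothetical flag: set by code paths that close the form on purpose.
    Private closingFromCode As Boolean = False

    ' Call this instead of Me.Close() when the program itself wants to close.
    Public Sub CloseFromCode()
        closingFromCode = True
        Me.Close()
    End Sub

    Private Sub MainForm_FormClosing(ByVal sender As Object, ByVal e As FormClosingEventArgs) Handles MyBase.FormClosing
        ' Block only closes that did not come through CloseFromCode.
        ' Checking CloseReason lets Windows shutdown and Application.Exit
        ' go through untouched.
        If Not closingFromCode AndAlso e.CloseReason = CloseReason.UserClosing Then
            e.Cancel = True ' prevent closing by X
        End If
    End Sub
End Class
```

On .NET 1.x, where only the Closing event with CancelEventArgs is available, the same flag idea should still work without the CloseReason check.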