|
-
Sep 14th, 2005, 01:59 PM
#61
Re: instr Count
Here you have something that can take a very big string and still be fast (the ones by you and penagate keep getting slower and slower with bigger strings much more easily):
Code:
Option Explicit
Private Declare Sub RtlMoveMemory Lib "ntdll.dll" (ByRef lpvDest As Any, ByRef lpvSrc As Any, ByVal cbLen As Long)
'Private Declare Function VarPtrArray Lib "msvbvm50.dll" Alias "VarPtr" (Var() As Any) As Long
Private Declare Function VarPtrArray Lib "msvbvm60.dll" Alias "VarPtr" (Var() As Any) As Long
Private BufStrHeader(5) As Long
Private BufFindHeader(5) As Long
Private BufStr() As Integer
Private BufFind() As Integer
Private OldStr As Long
Private OldFind As Long
Public Sub SisicInitialize()
BufStrHeader(0) = 1
BufStrHeader(1) = 2
BufStrHeader(4) = &H7FFFFFFF
BufFindHeader(0) = 1
BufFindHeader(1) = 2
BufFindHeader(4) = &H7FFFFFFF
OldStr = 0
OldFind = 0
End Sub
Public Sub SisicTerminate()
RtlMoveMemory ByVal VarPtrArray(BufStr), 0&, 4
RtlMoveMemory ByVal VarPtrArray(BufFind), 0&, 4
End Sub
Public Function SisicM(pStr As Long, pFind As Long, lenStr As Long, lenFind As Long) As Long
Dim i As Long
Dim j As Long
Dim k As Long
Dim l As Long
Dim Flag As Long
If OldStr <> pStr Then
BufStrHeader(3) = pStr
RtlMoveMemory ByVal VarPtrArray(BufStr), VarPtr(BufStrHeader(0)), 4
OldStr = pStr
End If
If OldFind <> pFind Then
BufFindHeader(3) = pFind
RtlMoveMemory ByVal VarPtrArray(BufFind), VarPtr(BufFindHeader(0)), 4
OldFind = pFind
End If
If lenFind = 1 Then
j = BufFind(0)
For i = lenStr - 1 To 0 Step -1
k = BufStr(i)
If k = j Then SisicM = SisicM + 1
Next i
Else
lenFind = lenFind - 1
For i = lenStr - 1 To lenFind Step -1
For j = lenFind To 0 Step -1
k = BufFind(j)
l = BufStr(i - (lenFind - j))
If Not (k = l) Then Flag = 1: Exit For
Next j
If Flag = 0 Then SisicM = SisicM + 1 Else Flag = 0
Next i
End If
End Function
Usage:
VB Code:
SisicInitialize
Do
SisicM StrPtr(SEARCHSTRING), StrPtr(KEYWORD), Len(SEARCHSTRING), Len(KEYWORD)
Loop
SisicTerminate
I haven't even done my main optimizations yet
-
Sep 14th, 2005, 07:25 PM
#62
Re: instr Count
 Originally Posted by |2eM!x
Why couldnt you use isntr to count the number of times a letter or word repeats? Seems the fastest to me..
god im so retarded..i meant use Split() function..damn i wasted a whole bunch of peopls time posting that.
Why couldnt you use SPLIT() to count the number of times a letter or word repeats? Seems the fastest to me..
-
Sep 14th, 2005, 09:13 PM
#63
Re: instr Count
It isn't. Split isn't of the fastest functions around. Otherwise we would use it.
-
Sep 14th, 2005, 09:18 PM
#64
Re: instr Count
Okay, i just thought i was crazy smart and the only one to think of it. Nevermind then
-
Sep 15th, 2005, 02:46 AM
#65
Frenzied Member
Re: instr Count
 Originally Posted by Merri
Here you have something that can take a very big string and still be fast (the ones by you and penagate keep getting slower and slower with bigger strings much more easily):
Code:
Option Explicit
Private Declare Sub RtlMoveMemory Lib "ntdll.dll" (ByRef lpvDest As Any, ByRef lpvSrc As Any, ByVal cbLen As Long)
'Private Declare Function VarPtrArray Lib "msvbvm50.dll" Alias "VarPtr" (Var() As Any) As Long
Private Declare Function VarPtrArray Lib "msvbvm60.dll" Alias "VarPtr" (Var() As Any) As Long
Private BufStrHeader(5) As Long
Private BufFindHeader(5) As Long
Private BufStr() As Integer
Private BufFind() As Integer
Private OldStr As Long
Private OldFind As Long
Public Sub SisicInitialize()
BufStrHeader(0) = 1
BufStrHeader(1) = 2
BufStrHeader(4) = &H7FFFFFFF
BufFindHeader(0) = 1
BufFindHeader(1) = 2
BufFindHeader(4) = &H7FFFFFFF
OldStr = 0
OldFind = 0
End Sub
Public Sub SisicTerminate()
RtlMoveMemory ByVal VarPtrArray(BufStr), 0&, 4
RtlMoveMemory ByVal VarPtrArray(BufFind), 0&, 4
End Sub
Public Function SisicM(pStr As Long, pFind As Long, lenStr As Long, lenFind As Long) As Long
Dim i As Long
Dim j As Long
Dim k As Long
Dim l As Long
Dim Flag As Long
If OldStr <> pStr Then
BufStrHeader(3) = pStr
RtlMoveMemory ByVal VarPtrArray(BufStr), VarPtr(BufStrHeader(0)), 4
OldStr = pStr
End If
If OldFind <> pFind Then
BufFindHeader(3) = pFind
RtlMoveMemory ByVal VarPtrArray(BufFind), VarPtr(BufFindHeader(0)), 4
OldFind = pFind
End If
If lenFind = 1 Then
j = BufFind(0)
For i = lenStr - 1 To 0 Step -1
k = BufStr(i)
If k = j Then SisicM = SisicM + 1
Next i
Else
lenFind = lenFind - 1
For i = lenStr - 1 To lenFind Step -1
For j = lenFind To 0 Step -1
k = BufFind(j)
l = BufStr(i - (lenFind - j))
If Not (k = l) Then Flag = 1: Exit For
Next j
If Flag = 0 Then SisicM = SisicM + 1 Else Flag = 0
Next i
End If
End Function
Usage:
VB Code:
SisicInitialize
Do
SisicM StrPtr(SEARCHSTRING), StrPtr(KEYWORD), Len(SEARCHSTRING), Len(KEYWORD)
Loop
SisicTerminate
I haven't even done my main optimizations yet 
I think you'll find that your algorithm (which is almost identical to penegates, and mine) has just the same linear performance problem associated with longer strings.
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." - Albert Einstein
It's turtles! And it's all the way down
-
Sep 15th, 2005, 05:00 AM
#66
Re: instr Count
No, because your versions used CopyMemory to copy the whole string in memory from place A to B. This doesn't copy anything, it just moves an array starting point to point to the string and start working from there. If you think about it: is it first faster to copy 50 MB of string data to byte array or start directly handling the 50 MB?
The algorithm itself was almost identical because I based it on your code. I just wanted to get it working. The next step would be to apply some nice optimizations and probably try adding Boyer-Moore and a support for TextCompare.
-
Sep 15th, 2005, 05:30 AM
#67
Re: instr Count
Here are the "some nice optimizations". Enjoy.
VB Code:
Public Function SisicM(ByRef pStr As Long, ByRef pFind As Long, ByRef lenStr As Long, ByRef lenFind As Long) As Long
Dim lngA As Long, lngB As Long, lngC As Long
Dim intFind As Integer, intStr As Integer
Dim intFirst As Integer, intLast As Integer, lngCounter As Long, lngFlag As Long
If OldStr <> pStr Then
BufStrHeader(3) = pStr
RtlMoveMemory ByVal VarPtrArray(BufStr), VarPtr(BufStrHeader(0)), 4
OldStr = pStr
End If
If OldFind <> pFind Then
BufFindHeader(3) = pFind
RtlMoveMemory ByVal VarPtrArray(BufFind), VarPtr(BufFindHeader(0)), 4
OldFind = pFind
End If
If lenFind = 1 Then
intFirst = BufFind(0)
For lngA = lenStr - 1 To 0 Step -1
intStr = BufStr(lngA)
If intFirst = intStr Then lngCounter = lngCounter + 1
Next lngA
ElseIf lenFind = 2 Then
lenFind = 1
intFirst = BufFind(0)
intLast = BufFind(lenFind)
For lngA = lenStr - 1 To lenFind Step -1
intStr = BufStr(lngA)
If intLast = intStr Then
intStr = BufStr(lngA - lenFind)
If intFirst = intStr Then lngCounter = lngCounter + 1: lngA = lngA - lenFind
End If
Next lngA
Else
lenFind = lenFind - 1
intFirst = BufFind(0)
intLast = BufFind(lenFind)
For lngA = lenStr - 1 To lenFind Step -1
intStr = BufStr(lngA)
If intLast = intStr Then
intStr = BufStr(lngA - lenFind)
If intFirst = intStr Then
lngC = lngA - 1
For lngB = lenFind - 1 To 1 Step -1
intFind = BufFind(lngB)
intStr = BufStr(lngC)
If Not (intFind = intStr) Then lngFlag = 1: Exit For
lngC = lngC - 1
Next lngB
If lngFlag = 1 Then lngFlag = 0 Else lngCounter = lngCounter + 1: lngA = lngC
End If
End If
Next lngA
End If
SisicM = lngCounter
End Function
Increases the speed remarkably with longer keywords. Didn't benchmark the difference, I only know it thanks to previous experience.
Edit Updated variable names ... and a bug in counting two character long keywords.
Last edited by Merri; Sep 15th, 2005 at 05:52 AM.
-
Sep 15th, 2005, 06:42 AM
#68
Frenzied Member
Re: instr Count
Hmmm. Pretty good stuff.
I wouldn't use the Boyer-Moore algorithm, though. This is good for long strings - especially ones that have multiple searches applied.
But for short ones which are only going to be searched once, then the creation of the distance table make the use of the algorithm inefficient.
I suppose you could optimise further by having some sort of threshold where the Boyer-Moore would take over . . .
(BTW - Nice touch avoiding the copy!)
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." - Albert Einstein
It's turtles! And it's all the way down
-
Sep 15th, 2005, 08:31 AM
#69
Re: instr Count
Boyer-Moore for keywords longer than four or five characters might be ok. I also made quickly a InBArrBMCount (ridiculous function name, but oh well) and it gets faster in keywords of that length.
Edit Oh... and Boyer-Moore for TextCompare is all good starting from three characters long keywords, I guess you can figure out why.
Last edited by Merri; Sep 15th, 2005 at 08:39 AM.
-
Sep 15th, 2005, 08:33 AM
#70
Frenzied Member
Re: instr Count
All you need to do now is to slot the algorithm into a fully inherited object hierarchy and you too can lose all of that perfomance
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." - Albert Einstein
It's turtles! And it's all the way down
-
Sep 15th, 2005, 08:40 AM
#71
Re: instr Count
Object Orientated Inheritable ASM would rock!
... NOT!
-
Sep 15th, 2005, 08:41 AM
#72
Frenzied Member
Re: instr Count
I think someone's already tried that. If I recall correctly they called it MSIL . . .
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." - Albert Einstein
It's turtles! And it's all the way down
-
Sep 15th, 2005, 09:48 AM
#73
Re: instr Count
I love power failures. They have the magical touch of striking right before you have saved the last major changes you've done. Looks like I won't work out Boyer-Moore after all, doing it the second time today would be bothersome.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|