Re: The 1001 questions about vbRichClient5 (2020-01-12)
I know you reported that VBPCRE2 is slower than VBScript's RegExp, but I think that's due to inefficiencies in my wrapper code that is designed to emulate VBScript's RegExp objects as a drop-in replacement. I will try to revisit this to speed things up, but I've got a lot of other stuff to do right now.
That said, if you use the declares in MPcre2.bas and make calls directly against PCRE2 without doing stuff I'm doing in my wrapper (like building objects for matches, raising events for matches, etc...) the performance should be much better. Might be worth a try.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Originally Posted by dreammanor
I need a high-performance RegExp engine
Below is a minimal implementation of my PCRE2 wrapper without all the bells and whistles that are slowing things down in the full VBPCRE2 project. Right now it only has a RegexMatch method (so no splitting on a regex), but I think it would be useful for benchmarking against identical regex matching with VBSCRIPT and JSCRIPT9.
To use it you call the RegexMatch method and pass the text to search and regex to match. It will return a "Match" UDT with the following members:
FoundMatch (Boolean) - True if a match was found, otherwise False.
Match (String) - The matched text (if found)
SubMatchCount (Long) - 0 for no sub-matches, otherwise the count of sub-matches.
SubMatches (String Array) - 1-based array of sub-matched strings. For example, .SubMatches(1) returns the first sub-match, .SubMatches(2) returns the second, etc...
I'd be very interested to see the results of your benchmarks - perhaps you could post the benchmarking code too so we can run comparisons on different systems.
Example Usage:
Code:
With RegexMatch("123abcdefghijklmnop456", "[a-z]+(efg)[a-z]+")
Debug.Print .FoundMatch ' Should print TRUE
Debug.Print .Match ' Should print "abcdefghijklmnop"
Debug.Print .SubMatchCount ' Should print 1
Debug.Print .SubMatches(1) ' Should print "efg"
End With
NOTE: You'll of course need to have pcre2-16.dll in your path for the above code to work (VBPCRE2.dll is not required). You'll now also need RC5 installed, since I am doing speed comparisons between PCRE2, VBScript, and JSCRIPT9 via RC5 ActiveScript.
Last edited by jpbro; Jan 13th, 2020 at 02:37 PM.
Reason: Added tests
Re: The 1001 questions about vbRichClient5 (2020-01-12)
I've just updated the minimal PCRE2 source code demo in post #162 with a new RegexSplit function. I've had to modify the Match UDT to include MatchStart and MatchLen parameters to accommodate this new feature.
NOTE: This is all very lightly tested, so there may be bugs. Please report any issues and I will take a look as soon as I can.
Example:
Code:
Debug.Print UBound(RegexSplit("abc123def345ghi", "[0-9]+")) ' Should print 2 for string array with elements "abc", "def", and "ghi"
REMINDER PS: This is getting off-topic for the thread since RC5 is not required/used by the minimal PCRE2 implementation, so maybe the benchmarking stuff should be moved to a new thread and the conversation can be continued there.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Hi jpbro, I'm writing test code to carefully test and compare your VBPCRE2 and Olaf's JScript9. I'll upload the test results to this thread tomorrow. Much appreciated !
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Looks like I've got a memory leak, so you might want to hold off on any testing for now. I'm trying to figure out where I've gone wrong and I will post updated sources ASAP.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
OK, I fixed the leak and updated post #162. I wasn't cleaning up properly in the minimal/standard module version because the full Class based version took care of this in Class_Terminate. That oversight has been corrected.
I've also added a module called MTest with a method called TestSpeed. It will run a Regex speed test for 3 seconds for PCRE2 and VBSCRIPT and output the matches/s for each.
The 2 libraries seem to be almost identical in speed for short regexes, but VBSCRIPT starts to lag behind when the strings to search get larger (up to half the speed of PCRE2 in some tests). It looks like the comparisons between libraries may depend quite a bit upon the regexes & search strings you are using.
PS: If you are going to test my code, make sure to compile the project for a more accurate comparison.
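The "run for a fixed time and report matches/s" idea can be sketched in a few lines. This is not the MTest code from the attachment; it is a minimal sketch in plain modern JavaScript (as run by Node.js, not JScript9 via RC5), and the helper name `matchesPerSecond`, the sample strings and the 100 ms budget are all assumptions made for illustration.

```javascript
// Hypothetical helper: count how many successful regex matches fit into a time budget.
function matchesPerSecond(regex, text, durationMs) {
  const start = Date.now();
  let count = 0;
  let elapsed = 0;
  do {
    // Run a batch between clock reads so the timer overhead does not dominate.
    for (let i = 0; i < 1000; i++) {
      if (regex.exec(text) !== null) count++;
    }
    elapsed = Date.now() - start;
  } while (elapsed < durationMs);
  return count / (elapsed / 1000);
}

const pattern = /[0-9]+/; // no "g" flag, so every exec() starts at index 0
const shortText = "123abcdefghijklmnop456";
const longText = "a".repeat(1000) + "123def456ghi"; // the match sits at the very end

console.log("short text:", Math.round(matchesPerSecond(pattern, shortText, 100)), "matches/s");
console.log("long text: ", Math.round(matchesPerSecond(pattern, longText, 100)), "matches/s");
```

The absolute numbers depend entirely on the machine and the engine, so only the ratio between runs on the same system says anything useful.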
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Updated MTest in post #162 to include a RC5 JSCRIPT9 regex test. I'm not sure I'm using RC5 CActiveScript in the most efficient way for this test, it's my first time using it. Perhaps Olaf will chime in with a better test.
Consider the following search string:
String$(1000, "a") & "123def456ghi"
And the following regex string:
"[0-9]+"
I get the following matches/second when compiled for PCRE2, VBScript, and JScript9 via RC5 ActiveScript:
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Hi jpbro, thank you for your test program. Yes, JScript9 is the fastest, which is strange and exciting. Maybe the JavaScript RegExp parser in Chrome would be faster.
In addition, I encountered some difficulties when writing the RegExp test program. I want to implement the following features in the test program:
Use RegExp to remove all extra spaces, blank lines, and carriage returns and line breaks in JavaScript code to achieve the purpose of compacting JavaScript code.
But no simple solution has yet been found.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
In my test above PCRE2 is clearly the fastest (higher number of matches/s is better). So I think it depends on the workload, although I may not be using JSCRIPT9 efficiently. What sample text and regex patterns are you using that show JSCRIPT9 as the fastest? Can you post some test code?
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Originally Posted by jpbro
In my test above PCRE2 is clearly the fastest (higher number of matches/s is better). So I think it depends on the workload, although I may not be using JSCRIPT9 efficiently. What sample text and regex patterns are you using that show JSCRIPT9 as the fastest? Can you post some test code?
Sorry, I misread the result. I mistakenly thought the biggest value was the slowest. On my computer, the result looks like this:
Edit:
In addition, I'm writing a test program that will use RegExp_Split and RegExp_Replace more. If PCRE2 can perform better in RegExp_Split and RegExp_Replace, then PCRE2 will be the champion.
The test program will use RegExp to remove all extra spaces, blank lines, and carriage returns and line breaks in JavaScript code to achieve the purpose of compressing JavaScript code.
The test program may take several days to complete.
Last edited by dreammanor; Jan 15th, 2020 at 07:44 PM.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Hi jpbro, I'm sorry that my test program has been delayed for a week. Today I finally took the time to complete a test program.
The test program mainly consists of the following steps:
(1) Remove the comment lines in the JS code
(2) Remove empty lines in the JS code
(3) Replace the keyword "let" in the JS code with "var"
(4) Remove extra spaces in the JS code (this step has not been completed)
The test results are as follows:
(1) The speed of JScript9 seems to be close to VBScript.RegExp.
(2) JScript9 doesn't remove empty lines. I don't know why.
(3) Your VBPCRE2 got an error while processing the regular expression of the comment lines.
Edit:
The test code moved to the post #175.
Edit2:
After adding the parameter "/mg" to RegExp, now JScript9 can remove empty lines.
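For reference, the effect of the "m" and "g" flags on the three completed steps can be sketched like this. The patterns below are hypothetical stand-ins (the patterns used in the test program are not shown in this post), the function name `compactJs` is made up, and the sketch is written for a modern JavaScript engine rather than JScript9.

```javascript
// Hypothetical patterns for illustration only.
const COMMENT_LINE = /^[ \t]*\/\/.*\n?/gm; // whole-line "//" comments
const EMPTY_LINE = /^[ \t]*\n/gm;          // lines holding only spaces or tabs
const LET_KEYWORD = /\blet\b/g;            // the word "let" (also hits "let" inside strings!)

function compactJs(source) {
  // Normalize line endings first so the patterns only have to deal with "\n".
  let text = source.replace(/\r\n?/g, "\n");
  text = text.replace(COMMENT_LINE, "");   // step 1: remove comment lines
  text = text.replace(EMPTY_LINE, "");     // step 2: remove empty lines
  text = text.replace(LET_KEYWORD, "var"); // step 3: replace "let" with "var"
  return text;
}

const sample = [
  "// header comment",
  "let a = 1;",
  "",
  "   // indented comment",
  "   ",
  "let b = a + 1;",
].join("\r\n");

console.log(compactJs(sample));

// Without "m", "^" only matches at the very start of the input,
// so a blank line in the middle of the text is never found.
const blankInMiddle = "a\n\nb";
console.log(JSON.stringify(blankInMiddle.replace(/^[ \t]*\n/g, "")));  // text is unchanged
console.log(JSON.stringify(blankInMiddle.replace(/^[ \t]*\n/gm, ""))); // blank line is removed
```

Normalizing the line endings first is a deliberate choice in this sketch: in multiline mode "^" also matches right after a lone "\r", so patterns written for "\n" can misbehave on "\r\n" input.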
Last edited by dreammanor; Jan 22nd, 2020 at 10:12 PM.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Originally Posted by dreammanor
(3) Your VBPCRE2 got an error while processing the regular expression of the comment lines.
This updated MRegex.bas should fix the error you were getting:
Code:
Option Explicit
Public Type Matches
FoundMatch As Boolean
MatchStart As Long
MatchLen As Long
Match As String
SubMatchCount As Long
SubMatches() As String
End Type
Private Declare Sub win32_CopyMemory Lib "kernel32.dll" Alias "RtlMoveMemory" (ByRef Destination As Any, ByRef Source As Any, ByVal Length As Long)
Public Function RegexSplit(ByVal p_TextToSplit As String, Optional ByVal p_RegexToMatch As String) As String()
Dim la_Split() As String
Dim l_NextIndex As Long
Do
With RegexMatch(p_TextToSplit, p_RegexToMatch)
If .FoundMatch Then
' Found a match
' Make sure we have enough space for text in our result array
If l_NextIndex = 0 Then
ReDim la_Split(99)
ElseIf l_NextIndex > UBound(la_Split) Then
ReDim Preserve la_Split(l_NextIndex * 2)
End If
la_Split(l_NextIndex) = Left$(p_TextToSplit, .MatchStart - 1)
p_TextToSplit = Mid$(p_TextToSplit, .MatchStart + .MatchLen)
l_NextIndex = l_NextIndex + 1
Else
' No match found, exit loop
Exit Do
End If
End With
Loop
If l_NextIndex = 0 Then
ReDim la_Split(0)
la_Split(0) = p_TextToSplit
Else
If Len(p_TextToSplit) Then
If UBound(la_Split) < l_NextIndex Then ReDim Preserve la_Split(l_NextIndex)
la_Split(l_NextIndex) = p_TextToSplit
l_NextIndex = l_NextIndex + 1
End If
ReDim Preserve la_Split(l_NextIndex - 1)
End If
RegexSplit = la_Split
End Function
Public Function RegexMatch(ByVal p_TextToSearch As String, Optional ByVal p_RegexToMatch As String) As Matches
' Returns a Matches UDT
' If .FoundMatch = False then no matches were found
' If .FoundMatch = True then:
' A match was found (with possible submatches depending on the regex).
' The full matched text will be stored in .Match as a string
' If there are sub-matches, then SubMatch count will be > 0
' You can retrieve sub-matches from the .SubMatches member using one-based indexing
' so .SubMatches(1) will return sub-match #1, .SubMatches(2) will return sub-match #2, etc...
' If .SubMatchCount = 0 then .SubMatches will not be dimensioned, so do not try to access it.
Dim l_CompiledContextHandle As Long
Dim l_CompiledRegexHandle As Long
Dim l_MatchDataHandle As Long
Dim l_MatchContextHandle As Long
Dim l_ErrorNumber As Long
Dim l_ErrorDesc As String
Dim l_MatchCount As Long
Dim l_OvectorPtr As Long
Dim la_Ovector() As Long
Dim l_StrPtr As Long
Dim l_ErrorCode As Long
Dim l_ErrorPosition As Long
Dim l_MatchStart As Long
Dim l_MatchLen As Long
Dim ii As Long ' Loop counter
On Error GoTo ErrorHandler
l_CompiledContextHandle = pcre2_compile_context_create(0)
If l_CompiledContextHandle = 0 Then Err.Raise vbObjectError, , "Could not create PCRE compile context! Last DLL Error: " & Err.LastDllError
l_CompiledRegexHandle = pcre2_compile(StrPtr(p_RegexToMatch), Len(p_RegexToMatch), PCRE_CO_MULTILINE, l_ErrorCode, l_ErrorPosition, l_CompiledContextHandle)
If l_CompiledRegexHandle = 0 Then Err.Raise vbObjectError, , "Could not compile regex! Regex: " & p_RegexToMatch & vbNewLine & "Errorcode: " & l_ErrorCode & ", Error Position: " & l_ErrorPosition
l_MatchDataHandle = pcre2_match_data_create_from_pattern(l_CompiledRegexHandle, 0)
If l_MatchDataHandle = 0 Then Err.Raise vbObjectError, , "Could not allocate match data! Last DLL Error: " & Err.LastDllError
l_StrPtr = StrPtr(p_TextToSearch)
If l_StrPtr = 0 Then l_StrPtr = StrPtr("")
l_MatchCount = pcre2_match(l_CompiledRegexHandle, l_StrPtr, Len(p_TextToSearch), 0, 0, l_MatchDataHandle, l_MatchContextHandle)
Select Case l_MatchCount
Case PCRE2_ERROR_NOMATCH
' No matches, that's normal :)
Case Is > 0
' Number of matches, store information about matches
l_OvectorPtr = pcre2_get_ovector_pointer(l_MatchDataHandle)
If l_OvectorPtr = 0 Then
' Shouldn't happen!
Err.Raise vbObjectError, , "Ovector pointer could not be retrieved!"
End If
win32_CopyMemory l_MatchStart, ByVal l_OvectorPtr, 4
win32_CopyMemory l_MatchLen, ByVal (l_OvectorPtr + 4), 4
l_MatchLen = l_MatchLen - l_MatchStart
ReDim la_Ovector(2 * l_MatchCount - 1)
win32_CopyMemory la_Ovector(0), ByVal l_OvectorPtr, 2 * l_MatchCount * 4
With RegexMatch
.FoundMatch = l_MatchCount
.MatchStart = la_Ovector(0) + 1
.MatchLen = la_Ovector(1) - la_Ovector(0)
.Match = Mid$(p_TextToSearch, .MatchStart, .MatchLen)
.SubMatchCount = l_MatchCount - 1
If l_MatchCount > 1 Then
ReDim .SubMatches(1 To l_MatchCount - 1)
For ii = 1 To l_MatchCount - 1
l_MatchStart = la_Ovector(ii * 2) + 1
l_MatchLen = la_Ovector(ii * 2 + 1) - l_MatchStart + 1
If l_MatchStart > 0 And l_MatchLen > 0 Then
.SubMatches(ii) = Mid$(p_TextToSearch, l_MatchStart, l_MatchLen)
End If
Next ii
End If
End With
Case Else
' Uhoh! We need to handle these
Err.Raise vbObjectError - l_MatchCount, , "PCRE Match Error: " & l_MatchCount
End Select
Cleanup:
'On Error Resume Next
' Free match data if necessary
If l_MatchContextHandle <> 0 Then pcre2_match_context_free l_MatchContextHandle: l_MatchContextHandle = 0
If l_MatchDataHandle <> 0 Then pcre2_match_data_free l_MatchDataHandle: l_MatchDataHandle = 0
If l_CompiledRegexHandle <> 0 Then pcre2_code_free l_CompiledRegexHandle: l_CompiledRegexHandle = 0
'Free compile context before exiting
If l_CompiledContextHandle <> 0 Then pcre2_compile_context_free l_CompiledContextHandle: l_CompiledContextHandle = 0
If l_ErrorNumber <> 0 Then
If IsPcre2ErrorCode(l_ErrorNumber) Then
l_ErrorDesc = l_ErrorDesc & vbNewLine & "PCRE2 Error Message: " & GetPcre2ErrorMessage(l_ErrorNumber)
Else
If IsPcre2ErrorCode(vbObjectError - l_ErrorNumber) Then
l_ErrorDesc = l_ErrorDesc & vbNewLine & "PCRE2 Error Message: " & GetPcre2ErrorMessage(vbObjectError - l_ErrorNumber)
End If
End If
On Error GoTo 0
Err.Raise l_ErrorNumber, , l_ErrorDesc
End If
Exit Function
ErrorHandler:
l_ErrorNumber = Err.Number
l_ErrorDesc = Err.Description
Debug.Assert False
Resume Cleanup
End Function
Private Function IsPcre2ErrorCode(ByVal p_ErrorCode As Long) As Boolean
IsPcre2ErrorCode = (p_ErrorCode <= [_PCRE_RC_ERROR_FIRST] And p_ErrorCode >= [_PCRE_RC_ERROR_LAST])
End Function
Private Function GetPcre2ErrorMessage(ByVal p_ErrorCode As Long) As String
Dim l_BufferLength As Long
Dim l_Buffer As String
Dim l_MessageLength As Long
l_BufferLength = 256
Do
l_Buffer = Space$(l_BufferLength)
l_MessageLength = pcre2_get_error_message(p_ErrorCode, StrPtr(l_Buffer), l_BufferLength)
If l_MessageLength < 0 Then
Select Case l_MessageLength
Case PCRE_RC_ERROR_NOMEMORY
' Buffer too small
l_BufferLength = l_BufferLength * 2
Case PCRE_RC_ERROR_BADDATA
' Bad error code
Exit Do
Case Else
Debug.Assert False
Exit Do
End Select
End If
Loop While l_MessageLength < 0
If l_MessageLength < 0 Then
GetPcre2ErrorMessage = "Unknown error #" & p_ErrorCode & ", PCRE2 error message result #" & l_MessageLength
Else
GetPcre2ErrorMessage = Left$(l_Buffer, l_MessageLength)
End If
End Function
I also added the PCRE_CO_MULTILINE flag to the regex compile so that it will detect and remove empty lines. JSCRIPT9 regex probably has some way to configure it for multiline mode too.
VBPCRE2 is slower than the other two engines. I guess the reason is that I used the Join function for the returned string array. I'll try to test without the Join function.
In addition, the performance of RC5.ActiveScript has been improved. I guess the reason is that all three steps are completed inside ActiveScript and output the results at one time. Maybe Olaf could further optimize my test code.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Originally Posted by dreammanor
VBPCRE2 is slower than the other two engines. I guess the reason is that I used the Join function for the returned string array. I'll try to test without the Join function.
Try again using the following wrapper for the pcre2_substitute method with the PCRE2_SUBSTITUTE_GLOBAL flag set (so all the work takes place inside pcre2-16.dll):
Code:
Public Function RegexSubstitute(ByVal p_TextToSearch As String, ByVal p_ReplaceWithText As String, ByVal p_RegexToMatch As String) As String
' Returns a string with requested substitutions made (if found).
Dim l_CompiledContextHandle As Long
Dim l_CompiledRegexHandle As Long
Dim l_MatchDataHandle As Long
Dim l_MatchContextHandle As Long
Dim l_ErrorNumber As Long
Dim l_ErrorCode As Long
Dim l_ErrorDesc As String
Dim l_ErrorPosition As Long
Dim l_OutputBuffer As String
Dim l_OutputBufferLength As Long
Dim l_ReplaceResult As Long
Dim l_StrPtrSearch As Long
Dim l_StrPtrReplace As Long
On Error GoTo ErrorHandler
l_CompiledContextHandle = pcre2_compile_context_create(0)
If l_CompiledContextHandle = 0 Then Err.Raise vbObjectError, , "Could not create PCRE compile context! Last DLL Error: " & Err.LastDllError
l_CompiledRegexHandle = pcre2_compile(StrPtr(p_RegexToMatch), Len(p_RegexToMatch), PCRE_CO_MULTILINE, l_ErrorCode, l_ErrorPosition, l_CompiledContextHandle)
If l_CompiledRegexHandle = 0 Then Err.Raise vbObjectError, , "Could not compile regex! Regex: " & p_RegexToMatch & vbNewLine & "Errorcode: " & l_ErrorCode & ", Error Position: " & l_ErrorPosition
l_MatchDataHandle = pcre2_match_data_create_from_pattern(l_CompiledRegexHandle, 0)
If l_MatchDataHandle = 0 Then Err.Raise vbObjectError, , "Could not allocate match data! Last DLL Error: " & Err.LastDllError
l_StrPtrSearch = StrPtr(p_TextToSearch)
If l_StrPtrSearch = 0 Then l_StrPtrSearch = StrPtr("")
' Prepare the output buffer (start at 2X size for a better chance to avoid insufficient space)
l_OutputBuffer = Space$(Len(p_TextToSearch) * 2)
l_OutputBufferLength = Len(l_OutputBuffer)
l_StrPtrReplace = StrPtr(p_ReplaceWithText)
If l_StrPtrReplace = 0 Then l_StrPtrReplace = StrPtr("")
' Attempt substitution
Do
l_ReplaceResult = pcre2_substitute(l_CompiledRegexHandle, l_StrPtrSearch, Len(p_TextToSearch), 0, PCRE2_SUBSTITUTE_GLOBAL, l_MatchDataHandle, 0, l_StrPtrReplace, Len(p_ReplaceWithText), StrPtr(l_OutputBuffer), l_OutputBufferLength)
Select Case l_ReplaceResult
Case PCRE_RC_ERROR_NOMEMORY
' Buffer too small - double its size and try again.
l_OutputBufferLength = Len(l_OutputBuffer) * 2
If l_OutputBufferLength > 0 Then
l_OutputBuffer = Space$(l_OutputBufferLength)
End If
' A zero length means the input text was empty, so the loop below exits.
Case Is >= 0
' Finished
Case Else
Err.Raise vbObjectError - l_ReplaceResult, , "Replace error #" & l_ReplaceResult
End Select
Loop While (l_ReplaceResult = PCRE_RC_ERROR_NOMEMORY) And (l_OutputBufferLength > 0)
RegexSubstitute = Left$(l_OutputBuffer, l_OutputBufferLength)
NoErrorCleanup:
On Error Resume Next
l_ErrorNumber = 0
l_ErrorDesc = ""
Cleanup:
On Error Resume Next
' Free match data and context if necessary
If l_MatchContextHandle <> 0 Then pcre2_match_context_free l_MatchContextHandle: l_MatchContextHandle = 0
If l_MatchDataHandle <> 0 Then pcre2_match_data_free l_MatchDataHandle: l_MatchDataHandle = 0
If l_CompiledRegexHandle <> 0 Then pcre2_code_free l_CompiledRegexHandle: l_CompiledRegexHandle = 0
'Free compile context before exiting
If l_CompiledContextHandle <> 0 Then pcre2_compile_context_free l_CompiledContextHandle: l_CompiledContextHandle = 0
If l_ErrorNumber <> 0 Then
If IsPcre2ErrorCode(l_ErrorNumber) Then
l_ErrorDesc = l_ErrorDesc & vbNewLine & "PCRE2 Error Message: " & GetPcre2ErrorMessage(l_ErrorNumber)
Else
If IsPcre2ErrorCode(vbObjectError - l_ErrorNumber) Then
l_ErrorDesc = l_ErrorDesc & vbNewLine & "PCRE2 Error Message: " & GetPcre2ErrorMessage(vbObjectError - l_ErrorNumber)
End If
End If
On Error GoTo 0
Err.Raise l_ErrorNumber, , l_ErrorDesc
End If
Exit Function
ErrorHandler:
Debug.Assert False
l_ErrorNumber = Err.Number
l_ErrorDesc = Err.Description
Resume Cleanup
End Function
And the updated test code:
Code:
Private Function CompressJS_VBPCRE2() As String
Dim sSource As String
Dim arrSplits() As String
sSource = mSource
sSource = RegexSubstitute(sSource, "", mPattern_Comment) '--- remove comments
sSource = RegexSubstitute(sSource, "", mPattern_EmptyLine) '--- remove empty lines
sSource = RegexSubstitute(sSource, "var", mPattern_Let) '--- replace "let" with "var"
CompressJS_VBPCRE2 = sSource
End Function
I think you'll be pleasantly surprised
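The difference between the two approaches (split the text and Join the pieces, versus one global substitution done entirely inside the engine) can be illustrated outside of VB6 as well. The following is only a sketch in plain JavaScript with made-up function names; it says nothing about the actual timings of VBPCRE2.

```javascript
// Approach 1: split on the pattern, then join the pieces with the replacement.
// This builds an intermediate array of strings before the result is assembled.
function replaceViaSplitJoin(input, regex, replacement) {
  return input.split(regex).join(replacement);
}

// Approach 2: let the regex engine perform every substitution in a single call.
function replaceViaGlobalReplace(input, regex, replacement) {
  return input.replace(regex, replacement);
}

const text = "abc123def345ghi";
const digits = /[0-9]+/g;

console.log(text.split(digits).length);                 // number of pieces
console.log(replaceViaSplitJoin(text, digits, "-"));     // pieces glued with "-"
console.log(replaceViaGlobalReplace(text, digits, "-")); // same text, one call
```

The two functions only agree for simple cases like this one: a pattern with capture groups changes what split returns, and a replacement containing "$" sequences is interpreted by replace but not by join.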
Last edited by jpbro; Jan 22nd, 2020 at 09:12 PM.
Reason: Code cleanup
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Not too shabby! I thought you'd like that
There are 2 other things I'm investigating to see if they can improve things further. First, I've been meaning to understand how to create a TLB for a long time, now I have a good reason to dig into it. Second, PCRE2 supports a JIT feature that I think can improve performance. I have to do a bit more work to understand how to use it.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Is this thread still about RichEdit problems? If so, I can offer one more bug that seems to be present in all DLL versions: EM_STREAMIN is very, very slow when replacing an existing long text with a new long text.
Re: The 1001 questions about vbRichClient5 (2020-01-12)
Originally Posted by jj2007
Is this thread still about RichEdit problems? If so, I can offer one more bug that seems to be present in all DLL versions: EM_STREAMIN is very, very slow when replacing an existing long text with a new long text.
Hi jj2007, The RE I mentioned above means Regular-Expression, not RichEdit.
Last edited by dreammanor; Feb 8th, 2020 at 04:04 AM.
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Just for the experiment I just impl a VbPeg based JS tokenizer w/ the requirements of the RE "competition" but also one that strips whitespace outside of string literals (minifies JS) and it turned quite fast
Re: Question 38: A strange problem with ActiveScript
Originally Posted by dreammanor
Question 38: A strange problem with ActiveScript
If a function in ActiveScript has no input parameters, the function cannot return the function value, but it can be obtained using CallByName. E.g:
Code:
Set SC = New_c.ActiveScript("JScript9", False, False)
SC.AddCode "function Test() { " & _
" return arguments.length;" & _
"}"
Set CO = SC.CodeObject
MsgBox CO.Test()
This is not a problem I can fix easily in RC5...
It is a general problem with the CodeObject of the MS ActiveScripting-support in JScript-mode...
Also the MS-ScriptControl shows this behaviour in JScript-mode, as the following test shows...:
Code:
Private Sub Form_Load()
With CreateObject("ScriptControl")
.Language = "JScript"
.AddCode "function Test(){ return 42 }"
MsgBox .CodeObject.Test() 'this returns the "whole Function" instead of 42
MsgBox .Run("Test") 'whilst this will return the correct answer
End With
With CreateObject("ScriptControl")
.Language = "VBScript"
.AddCode "Function Test(): Test = 42: End Function"
MsgBox .CodeObject.Test() 'this will return the correct answer
MsgBox .Run("Test") 'this will return the correct answer
End With
End Sub
As an explanation might serve, that JS allows to "pass functions around as normal Objects",
when you leave out the parentheses...
Here some DemoCode, which shows that the "weird behaviour" (as seen in the JS-CodeObject-call without arguments),
makes perfect sense in that "function-passing-context"...
Code:
Private Sub Form_Load()
Dim SC As cActiveScript, CO As Object, MyObj As Object
Set SC = New_c.ActiveScript("JScript9", False, False)
Set CO = SC.CodeObject
'let's say, you have a predefined Object in your JSCode like this one (MyObj)
SC.AddCode "var MyObj = {}; " & _
" MyObj.OnClick=null;" & _
" MyObj.fireOnClick = function(info){ if (this.OnClick) return this.OnClick(info)}"
'it contains an EventHandler-slot (OnClick) which is not (yet) defined
'but also (in the last line) a pre-implemented Method which "fires the Event to the Handler" (if there is one)
Set MyObj = CO.MyObj 'now, for convenience at the VB-COM-side, we store this JS-Obj in a VB-Object-Variable
'...
'later on in your code, you might want to define an Event-Handler for MyObj.OnClick
'so you will add a JS-function which implements such an EventHandler with your own specific code
SC.AddCode "function MyOnClickHandler(info){ return 'from inside my handler: ' + info }"
'what you can do now (and where the previously seen "weird behaviour" comes into play),
'is a "direct assignment of the function itself" (via the CodeObject)
Set MyObj.OnClick = CO.MyOnClickHandler '<- so here the "no passed arguments return the whole function"-case makes sense
MsgBox MyObj.fireOnClick("Hello World") 'test it
End Sub
I've marked the line which shows "why this stuff is, as it is" in dark-red above...
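Seen purely from the JS side (without the COM CodeObject layer in between), the distinction looks like the sketch below. It mirrors the names from the demo above but is meant for a plain modern JavaScript engine, so it only illustrates the language semantics, not how the ActiveScript host dispatches the call.

```javascript
function Test() {
  return 42;
}

const reference = Test; // no parentheses: the function object itself
const result = Test();  // parentheses: the function is called

const MyObj = {
  OnClick: null, // handler slot, not yet defined
  fireOnClick: function (info) {
    if (this.OnClick) return this.OnClick(info);
  },
};

function MyOnClickHandler(info) {
  return "from inside my handler: " + info;
}

console.log(typeof reference);                // "function"
console.log(result);                          // 42
console.log(MyObj.fireOnClick("no handler")); // undefined, the slot is still null

MyObj.OnClick = MyOnClickHandler;             // assign the function itself, do not call it
console.log(MyObj.fireOnClick("Hello World"));
```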
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Originally Posted by wqweto
Just for the experiment I just impl a VbPeg based JS tokenizer w/ the requirements of the RE "competition" but also one that strips whitespace outside of string literals (minifies JS) and it turned quite fast
Nice...
FWIW, here's a performance-improvement for the ActiveScript-JSCode based replacements
(which also corrects a few errors in the reg-expressions):
With these changes it is now faster than the (VB)Scripting-Regex-Code -
(also note, that "JScript" works - surprisingly - a bit faster than the newer "JScript9" engine).
Re: Question 38: A strange problem with ActiveScript
Originally Posted by Schmidt
This is not a problem I can fix easily in RC5...
It is a general problem with the CodeObject of the MS ActiveScripting-support in JScript-mode...
Also the MS-ScriptControl shows this behaviour in JScript-mode, as the following test shows...:
...
...
As an explanation might serve, that JS allows to "pass functions around as normal Objects",
when you leave out the parentheses...
Here some DemoCode, which shows that the "weird behaviour" (as seen in the JS-CodeObject-call without arguments),
makes perfect sense in that "function-passing-context"...
...
...
I've marked the line which shows "why this stuff is, as it is" in dark-red above...
HTH
Olaf
I learned some valuable knowledge from you again. Much appreciated.
Last edited by dreammanor; Feb 9th, 2020 at 06:44 AM.
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Originally Posted by wqweto
Just for the experiment I just impl a VbPeg based JS tokenizer w/ the requirements of the RE "competition" but also one that strips whitespace outside of string literals (minifies JS) and it turned quite fast
...
...
VbPeg produced a 500 lines recursive-descent parser as a VB6 class that is included in the attachment.
cheers,
</wqw>
It's wonderful, thank you very much, wqweto. I'll take a closer look at why your cCompressPEG has such high performance. I'll upload a new test program in a while.
Also, I'd like to know if your PEG could generate a RegExp parser instead of VBScript.RegExp? Thanks again.
Last edited by dreammanor; Feb 9th, 2020 at 06:43 AM.
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Here is the new test program. It uses Olaf's better and more accurate regular expression patterns.
Olaf's regular expression patterns not only improve the performance of RC5.ActiveScript, but also the performance of VBScript.RegExp. Now, the performance difference between RC5.ActiveScript and VBScript.RegExp is very small.
But it is strange that jpbro's method becomes about 3 times slower with the new regular expression patterns.
Re: The 1001 questions about vbRichClient5 (2020-02-08)
@dreammanor, I can't reproduce your timings (though I still come in last place unsurprisingly considering the competition).
Using your latest demo compiled (that uses pcre2_substitute method instead of the older, slower Split and loop approach that @wqweto's timings were based on) I'm getting the following:
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Originally Posted by jpbro
@dreammanor, I can't reproduce your timings (though I still come in last place unsurprisingly considering the competition).
Using your latest demo compiled (that uses pcre2_substitute method instead of the older, slower Split and loop approach that @wqweto's timings were based on) I'm getting the following:
Re: The 1001 questions about vbRichClient5 (2020-02-08)
This is really strange. I restarted my Windows 10 computer and tested again with the same result as post #190. That is, after using the new regular expression patterns, your method is 3-4 times slower on my computer. I guess it's because the new regular expression patterns split the Comment pattern into three separate patterns.
Note: I also tested it on an XP computer, and the test results were the same.
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Originally Posted by dreammanor
Also, I'd like to know if your PEG could generate a RegExp parser instead of VBScript.RegExp?
I don't think it's possible as most regexp engines use back-tracking while PEG does not (although it can recurse). Usually to get rid of back-tracking one has to re-write some of the rules of the grammar in question. There are some whitepapers that attempt mapping regexp to PEG augmented grammars, that is PEG with some more features (not the original B. Ford implementation). VbPeg is already an enhanced PEG generator as it understands custom actions in VB6 and allows special rules for error handling, but it's not meant as a regexp replacement (it cannot produce parsers at run-time).
Also note that it's very hard to come up with a regexp that can remove whitespace *only* outside of string literals in JS. It is probably at this point that most folks give up on regexp and start exploring real lexers/parsers, as their needs outgrow regexp capabilities.
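To make the point concrete, here is a minimal hand-written scanner that collapses whitespace only outside of string literals. It is not the VbPeg-generated parser from the attachment, just a sketch under simplifying assumptions: the function name is made up, only single- and double-quoted strings are recognized, and comments, regex literals and template strings are not handled at all.

```javascript
// Hypothetical sketch: collapse whitespace runs to one space, but copy string literals verbatim.
function collapseWhitespaceOutsideStrings(source) {
  let out = "";
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (ch === '"' || ch === "'") {
      // Find the closing quote, skipping over backslash-escaped characters.
      let j = i + 1;
      while (j < source.length && source[j] !== ch) {
        j += source[j] === "\\" ? 2 : 1;
      }
      out += source.slice(i, j + 1); // the literal, including both quotes
      i = j + 1;
    } else if (/\s/.test(ch)) {
      // Skip the whole run of whitespace and emit a single space for it.
      while (i < source.length && /\s/.test(source[i])) i++;
      out += " ";
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const input = 'var  s   =  "a   b";\n\n   var t = 1;';
console.log(collapseWhitespaceOutsideStrings(input));
```

The key difference from a regexp is the explicit state: once the scanner is inside a literal it stops looking at whitespace until the matching quote, which is exactly the context a single pattern struggles to track.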
Re: The 1001 questions about vbRichClient5 (2020-02-08)
Originally Posted by wqweto
I don't think it's possible as most regexp engines use back-tracking while PEG does not (although it can recurse). Usually to get rid of back-tracking one has to re-write some of the rules of the grammar in question. There are some whitepapers that attempt mapping regexp to PEG augmented grammars, that is PEG with some more features (not the original B. Ford implementation). VbPeg is already an enhanced PEG generator as it understands custom actions in VB6 and allows special rules for error handling, but it's not meant as a regexp replacement (it cannot produce parsers at run-time).
Understand. Thank you for your detailed explanation.
Originally Posted by wqweto
Also note that it's very hard to come up with a regexp that can remove whitespace *only* outside of string literals in JS. It is probably at this point that most folks give up on regexp and start exploring real lexers/parsers, as their needs outgrow regexp capabilities.
Agree completely. I'm already trying to write a lexer/parser. But I still need to use some regular expressions in the lexer/parser.
Since VB6.Array has some limitations (details :Add an element to an arbitrary array), I need to encapsulate an Array object of my own. Now I have two options:
(1) Encapsulating an array of VB variables, for example: Private mItems () as Variant ...
(2) Encapsulate vbRichClient5.ArrayList, for example: Private mItems As vbRichClient5.cArrayList
This array object will be used a lot as a base object (to replace VB.Array). I wonder if this array object is built on the basis of vbRichClient5.ArrayList, will it take up a lot of memory? Hope to hear Olaf's suggestions, thanks.
MyArrayObject:
Code:
Option Explicit
Private mItems() As Variant
'Or
'Private mItems As vbRichClient5.cArrayList '???
Public Property Get Item(ByVal Index As Long) As Variant
Item = mItems(Index)
End Property
Public Property Let Item(ByVal Index As Long, NewVal As Variant)
mItems(Index) = NewVal
End Property
Public Function NewEnum() As IUnknown
Set NewEnum = mItems.[_NewEnum]
End Function
...
...
...
...
Private Sub Class_Initialize()
mItems = Array()
End Sub
Last edited by dreammanor; Feb 14th, 2020 at 03:24 AM.
Re: The 1001 questions about vbRichClient5 (2020-02-14)
Originally Posted by dreammanor
Question 39: VB.Array or RC5.ArrayList?
Since VB6.Array has some limitations..., I need to encapsulate an Array object of my own. Now I have two options:
(1) Encapsulating an array of VB variables, for example: Private mItems () as Variant ...
(2) Encapsulate vbRichClient5.ArrayList, for example: Private mItems As vbRichClient5.cArrayList
This array object will be used a lot as a base object (to replace VB.Array). I wonder if this array object is built on the basis of vbRichClient5.ArrayList, will it take up a lot of memory? Hope to hear Olaf's suggestions, thanks.
MyArrayObject:
Code:
Option Explicit
Private mItems() As Variant
'Or
'Private mItems As vbRichClient5.cArrayList '???
Public Property Get Item(ByVal Index As Long) As Variant
Item = mItems(Index)
End Property
Public Property Let Item(ByVal Index As Long, NewVal As Variant)
mItems(Index) = NewVal
End Property
Public Function NewEnum() As IUnknown
Set NewEnum = mItems.[_NewEnum]
End Function
Private Sub Class_Initialize()
mItems = Array()
End Sub
If it's only about additional "For Each support", then the cArrayList can now be used directly (without extra-wrapper-class),
because this is supported now in version 5.0.75...
Otherwise (when you have to write your own wrapper anyway, due to some other missing methods on cArrayList),
the wrapping of a "normal VB-Array" should do just fine (and saves a few extra method calls when accessing its contents).
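The shape of such a wrapper (a private plain array, indexed access, insertion at an arbitrary position, and enumeration support) is sketched below in JavaScript, purely to illustrate the design. The class and method names are invented for this example; it is not the cArrayList interface and not a translation of the VB6 class above.

```javascript
// Hypothetical wrapper around a plain array.
class ArrayWrapper {
  constructor() {
    this.items = []; // the wrapped "normal" array
  }

  get count() {
    return this.items.length;
  }

  item(index) {
    if (index < 0 || index >= this.items.length) throw new RangeError("index out of range: " + index);
    return this.items[index];
  }

  setItem(index, value) {
    if (index < 0 || index >= this.items.length) throw new RangeError("index out of range: " + index);
    this.items[index] = value;
  }

  // Insert at an arbitrary position, shifting later elements up by one.
  insert(index, value) {
    if (index < 0 || index > this.items.length) throw new RangeError("index out of range: " + index);
    this.items.splice(index, 0, value);
  }

  // Enumeration support (the counterpart of For Each): delegate to the wrapped array.
  [Symbol.iterator]() {
    return this.items[Symbol.iterator]();
  }
}

const list = new ArrayWrapper();
list.insert(0, "b");
list.insert(0, "a");
list.insert(1, "x");
for (const value of list) console.log(value);
```

Delegating enumeration to the wrapped array keeps the wrapper thin, which is the same trade-off as above: every extra layer adds a method call per element access.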