[RESOLVED] Text parser written in VB6 are thousands of times slower than JavaScript.
I'm writing a text/lexical parser. I translated a piece of JavaScript code into VB6 and found that the VB6 version was thousands of times slower than the JavaScript. I was extremely shocked. The test project is attached.
Note:
(1) SpeedTest_00 is translated from JavaScript code without any optimization.
(2) SpeedTest_01 has a simple optimization, which stores Len(oParser.Source) in a variable.
(4) The test program simply "loops" the chars in the text file. If I add regular expressions to the test project, it will be thousands of times slower than JavaScript.
I'd like to know how to make VB6 code reach or approach the speed of JavaScript without changing the program structure.
Any suggestions and solutions would be greatly appreciated.
Last edited by dreammanor; Mar 2nd, 2020 at 12:58 PM.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
I think a copy of your source text is being made from your class property for each CharCodeAt call. If you pass m_sSource instead of oParser.Source, then the speed improves dramatically.
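The cost described here can be sketched in JavaScript as an analogy. This is only a model, not the VB6 runtime: the hypothetical `CopyingParser` class deliberately copies its buffer on every read of `source`, standing in for a VB6 Property Get that returns a String, and `copies` counts how often that happens.

```javascript
// Analogy only: a getter that copies its buffer on every read, standing in
// for a VB6 Property Get that returns a String. All names are hypothetical.
class CopyingParser {
  constructor(text) {
    this.buffer = new Uint16Array(text.length);
    for (let i = 0; i < text.length; i++) this.buffer[i] = text.charCodeAt(i);
    this.copies = 0; // how many full copies of the buffer were made
  }
  get source() {
    this.copies++;
    return this.buffer.slice(); // full copy, O(n) per read
  }
}

// Reads the property inside the loop: one full copy per character.
function scanViaProperty(parser) {
  let spaces = 0;
  const n = parser.buffer.length;
  for (let i = 0; i < n; i++) {
    if (parser.source[i] === 32) spaces++;
  }
  return spaces;
}

// Reads the property once and keeps a local reference: one copy in total.
function scanViaLocal(parser) {
  let spaces = 0;
  const src = parser.source;
  for (let i = 0; i < src.length; i++) {
    if (src[i] === 32) spaces++;
  }
  return spaces;
}

const slow = new CopyingParser("a b c d");
const fast = new CopyingParser("a b c d");
const slowSpaces = scanViaProperty(slow);
const fastSpaces = scanViaLocal(fast);
console.log(slowSpaces, slow.copies, fastSpaces, fast.copies);
```

Both scans count the same spaces, but the first makes one copy per character while the second makes a single copy, so the total work grows quadratically in the first case.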
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by jpbro
I think a copy of your source text is being made from your class property for each CharCodeAt call. If you pass m_sSource instead of oParser.Source, then the speed improves dramatically.
Strange, using m_sSource didn't speed things up when I tried it - had to use a local copy of the string instead (Edit: Because I was thinking about cParser.m_sSource, not the form's m_sSource). Either way, there's a definite problem with ByRef passing of the large string between the form and the class, and it's not obvious why there should be a performance hit. Even if the Property Get is going to return a value, surely there is some native way to return a string out of a class ByRef?
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by yereverluvinuncleber
Can we see the javascript too? Where do you run the .js?
Hi yereverluvinuncleber, here is a simple JavaScript code:
Code:
var Parser = function Parser(src) {
    this.pos = 0;
    this.source = String(src);
}
Parser.prototype.parse = function parse() {
    while (this.pos < this.source.length) {
        var ch = this.source.charCodeAt(this.pos);
        switch (ch) {
            case 32:
                ++this.pos;
                break
            case 13:
                if (this.source.charCodeAt(this.pos + 1) === 10) {
                    ++this.pos;
                }
            case 10:
                ++this.pos;
                break
            default:
                ++this.pos;
        }
    }
}
var s = readFile();
var p = new Parser(s);
p.parse();
I run this js in Chrome. JavaScript is extremely efficient at processing strings, far beyond our imagination.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by jpbro
I think a copy of your source text is being made from your class property for each CharCodeAt call. If you pass m_sSource instead of oParser.Source, then the speed improves dramatically.
Oh, these days I often forget important details. Thanks for the reminder, jpbro.
After replacing oParser.Source with Form1.m_sSource, the speed did increase significantly.
Originally Posted by ahenry
Strange, using m_sSource didn't speed things up when I tried it - had to use a local copy of the string instead (Edit: Because I was thinking about cParser.m_sSource, not the form's m_sSource). Either way, there's a definite problem with ByRef passing of the large string between the form and the class, and it's not obvious why there should be a performance hit. Even if the Property Get is going to return a value, surely there is some native way to return a string out of a class ByRef?
After replacing oParser.Source with Form1.m_sSource, the speed did increase significantly. When cParser.m_sSource is read directly from outside the class, it is still a copy of the string.
Last edited by dreammanor; Mar 3rd, 2020 at 09:30 PM.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by Schmidt
Yep, and a second time for each iteration... in SpeedTest_00 (in the Len-call)
Only VB-For-loops evaluate their "loop-bounds" only once,
whereas in Do/While loops, the conditions are re-evaluated every time.
What slows it down in addition, is the call-overhead of all the Method-calls in these loops.
It makes perfect sense, thank you, Olaf.
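The difference between a bound that is read once and a condition that is re-evaluated on every pass can be made visible by counting calls. This is a small JavaScript sketch; `makeSource` and its `length()` method are hypothetical stand-ins for an expensive call such as Len(oParser.Source).

```javascript
// Counts how often a loop bound is evaluated. `makeSource` and `length()`
// are hypothetical stand-ins for an expensive call in the loop condition.
function makeSource(text) {
  return {
    calls: 0,
    length() {
      this.calls++;
      return text.length;
    },
  };
}

// The condition is re-evaluated before every iteration.
function loopWithCallInCondition(src) {
  let pos = 0;
  while (pos < src.length()) pos++;
  return pos;
}

// The bound is read once and cached, which is what a VB6 For loop does
// for its upper bound.
function loopWithCachedBound(src) {
  const n = src.length();
  let pos = 0;
  while (pos < n) pos++;
  return pos;
}

const inCondition = makeSource("hello world");
const cached = makeSource("hello world");
loopWithCallInCondition(inCondition);
loopWithCachedBound(cached);
console.log(inCondition.calls, cached.calls);
```

For an 11-character string the first loop evaluates the bound 12 times (11 passes plus the final failing check), the second only once.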
Originally Posted by Schmidt
Yep - the whole looping should better be placed in the Class itself (using its local vars directly).
I put the whole loop outside the class mainly to simulate JavaScript's prototype-based programming. I want to see whether that style is worth learning from. For example:
Code:
var Parser = function Parser(src) {
    this.pos = 0;
    this.source = String(src);
}
Parser.prototype.parse = function parse() {
    while (this.pos < this.source.length) {
        var ch = this.source.charCodeAt(this.pos);
        switch (ch) {
            case 32:
                ++this.pos;
                break
            case 13:
                if (this.source.charCodeAt(this.pos + 1) === 10) {
                    ++this.pos;
                }
            case 10:
                ++this.pos;
                break
            default:
                ++this.pos;
        }
    }
}
var s = readFile();
var p = new Parser(s);
p.parse();
IMO, JavaScript's prototype mechanism lets a class keep only its most basic, core properties while leaving it open to extension later. This is an excellent programming model that is well worth learning. I'd like to know whether there are clever ways to simulate JavaScript's prototype style in VB6. I especially want to hear your suggestions on this question.
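A minimal sketch of the extension mechanism being praised here: a method is attached to the prototype after the constructor (and even an instance) already exists. The method name `skipSpaces` is hypothetical and chosen only for this illustration.

```javascript
// A core constructor that holds only state, as in the code above.
function Parser(src) {
  this.pos = 0;
  this.source = String(src);
}

const p = new Parser("   abc");

// Later, possibly in another file: add a method to the prototype.
// `skipSpaces` is a hypothetical name chosen for this sketch.
Parser.prototype.skipSpaces = function skipSpaces() {
  // charCodeAt returns NaN past the end, so the loop also stops there.
  while (this.source.charCodeAt(this.pos) === 32) ++this.pos;
  return this.pos;
};

// The instance created before the extension can call the new method,
// because method lookup goes through the shared prototype at call time.
const skippedTo = p.skipSpaces();
console.log(skippedTo);
```

This works because methods are looked up at call time, which is exactly the kind of late binding a compiled VB6 class does not offer for its own methods.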
Last edited by dreammanor; Mar 3rd, 2020 at 12:01 PM.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by Schmidt
If the Class remains as it is (only a Property-Store), then it's better to use an UDT instead.
Ok, ...just did that with:
Code:
Private Type tParser
    Pos As Long
    Source As String
End Type
at Form-Level...
and then simply replaced:
Dim oParser As tParser
in the test-routines... no speed-issues anymore...
Yes, using the type tParser seems to be a better option. I'm considering whether the structure of my text/lexical parser should change from "class + module" to "type + module".
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by dreammanor
I'd like to know if there are some clever ways to simulate the prototype programming mode of JavaScript.
It's really hard for "typed" (and compiled) languages, to compete with Scripting
(which allows much more "degrees of freedom" at runtime).
To be able to "dynamically extend Function-slots" with "some method, defined in a plain string" -
that's the realm of Scripting - not of (pre-)compiled languages.
As for your comment about the JS-StringPerformance...
You are quite right...
What the current two "big Browsers" (Firefox and Chrome) deliver there (via their recent "JITers"),
is "full native speed".
I thought, with decently optimized VB6-code (native-compiled, with all options),
we would beat the JS-code you've shown by at least "factor 5 or so" - but no...
Fully native compiled VB6-code only achieves nearly identical timings - not better ones.
Here's my performance-test-results (for a single run over the 100.txt file):
And here the VB6-Code (after moving the Parse-Method into the cParser-Class):
Code:
Option Explicit

Private Declare Sub MemCopy Lib "kernel32" Alias "RtlMoveMemory" (pDst As Any, pSrc As Any, ByVal CB&)

Private this_Pos As Long
Private this_Lines As Long
Private this_Words As Long
Private this_Source() As Integer

Private Sub Class_Initialize()
    ReDim this_Source(0 To 0)
End Sub

Public Property Get Pos() As Long: Pos = this_Pos: End Property
Public Property Let Pos(ByVal RHS As Long): this_Pos = RHS: End Property

Public Property Get Lines() As Long: Lines = this_Lines: End Property
Public Property Let Lines(ByVal RHS As Long): this_Lines = RHS: End Property

Public Property Get Words() As Long: Words = this_Words: End Property
Public Property Let Words(ByVal RHS As Long): this_Words = RHS: End Property

Public Property Get Source() As String
    If UBound(this_Source) = 0 Then Exit Property
    Source = Space$(UBound(this_Source))
    MemCopy ByVal StrPtr(Source), this_Source(0), LenB(Source)
End Property
Public Property Let Source(RHS As String)
    ReDim this_Source(0 To Len(RHS)) 'note, that this allocates one char more than needed, to ensure a zero-char at the end
    Dim pS As Long: pS = StrPtr(RHS)
    If pS Then MemCopy this_Source(0), ByVal pS, LenB(RHS)
End Property

Public Property Get LenSource() As Long
    LenSource = UBound(this_Source)
End Property

Public Sub Parse()
    Dim Pos As Long 'a local loop-var is ca 20% faster than using the Private Var: this_Pos
    For Pos = 0 To LenSource - 1
        Select Case this_Source(Pos)
            Case 32
                this_Words = this_Words + 1
            Case 13
                this_Lines = this_Lines + 1
                'skip the next linefeed-char, if there is one
                If this_Source(Pos + 1) = 10 Then Pos = Pos + 1
            Case 10 'here we only come in, when standalone LF-Chars were used
                this_Lines = this_Lines + 1
            Case Else 'here we do nothing so far...
        End Select
    Next
    this_Pos = Pos 'reflect this again in this_Pos
End Sub
It implements Line- and (rough) Word-Counting (which I've added to the JS-code as well),
just to ensure that the optimizers were not able to completely "skip" a branch in the select cases.
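For comparison, the same approach can be sketched in JavaScript: char codes are copied once into a typed array with one spare zero slot at the end, and the loop uses a local position variable. This is not the adapted test code used for the timings; `toCodes` and `countLinesAndWords` are hypothetical names for this illustration.

```javascript
// Sketch of the same approach: char codes in a typed array, one spare
// zero slot at the end, and a local loop variable. Names are hypothetical.
function toCodes(text) {
  const codes = new Uint16Array(text.length + 1); // last slot stays 0
  for (let i = 0; i < text.length; i++) codes[i] = text.charCodeAt(i);
  return codes;
}

function countLinesAndWords(text) {
  const codes = toCodes(text);
  const len = codes.length - 1; // number of real characters
  let lines = 0;
  let words = 0; // rough count: one per space character
  for (let pos = 0; pos < len; pos++) {
    switch (codes[pos]) {
      case 32:
        words++;
        break;
      case 13:
        lines++;
        // Safe even at the last character: codes[len] is the zero sentinel.
        if (codes[pos + 1] === 10) pos++;
        break;
      case 10:
        lines++;
        break;
    }
  }
  return { lines, words };
}

const counts = countLinesAndWords("a b\r\nc d\ne\r");
console.log(counts.lines, counts.words);
```

The sample input contains one CR LF pair, one lone LF and one trailing CR, so it yields three lines and two spaces; the trailing CR exercises the sentinel slot.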
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by Schmidt
It's really hard for "typed" (and compiled) languages, to compete with Scripting
(which allows much more "degrees of freedom" at runtime).
To be able to "dynamically extend Function-slots" with "some method, defined in a plain string" -
that's the realm of Scripting - not of (pre-)compiled languages.
As for your comment about the JS-StringPerformance...
You are quite right...
What the current two "big Browsers" (Firefox and Chrome) deliver there (via their recent "JITers"),
is "full native speed".
I understand. Thank you for the explanation, Olaf.
Originally Posted by Schmidt
I thought, with decently optimized VB6-code (native-compiled, with all options),
we would beat the JS-code you've shown by at least "factor 5 or so" - but no...
Fully native compiled VB6-code only achieves nearly identical timings - not better ones.
Here's my performance-test-results (for a single run over the 100.txt file):
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by Schmidt
And here the VB6-Code (after moving the Parse-Method into the cParser-Class):
Very nice example, which has extremely high performance. Much appreciated.
Now there is another question: if I put all functions and methods into the Parser class, will the class become too large? It may have tens of thousands of lines of code. If I distribute this code across modules, I need to pass the entire Parser object as a function/method parameter, and a copy of the large string properties will be made.
One of the benefits of JavaScript prototyping is that it is easy to split a huge object into small modules.
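As a rough illustration of that splitting, groups of methods can live in separate plain objects (each standing in for its own source file) and be mixed into one prototype with `Object.assign`. The module and method names below are hypothetical.

```javascript
function Parser(src) {
  this.pos = 0;
  this.source = String(src);
}

// Each object stands in for a separate source file ("module").
// The module and method names are hypothetical.
const whitespaceModule = {
  skipSpaces() {
    while (this.source.charCodeAt(this.pos) === 32) ++this.pos;
  },
};

const numberModule = {
  readDigits() {
    const start = this.pos;
    let ch = this.source.charCodeAt(this.pos);
    // 48..57 are the char codes of "0".."9".
    while (ch >= 48 && ch <= 57) ch = this.source.charCodeAt(++this.pos);
    return this.source.slice(start, this.pos);
  },
};

// Mix both modules into the shared prototype.
Object.assign(Parser.prototype, whitespaceModule, numberModule);

const parser = new Parser("  42 rest");
parser.skipSpaces();
const digits = parser.readDigits();
console.log(digits, parser.pos);
```

Every method reaches the parser state through `this`, so the source string is shared by reference rather than passed around between the modules.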
Last edited by dreammanor; Mar 3rd, 2020 at 11:05 PM.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by dreammanor
Hi yereverluvinuncleber, here is a simple JavaScript code:
Code:
Parser.prototype.parse = function parse() {
    while (this.pos < this.source.length) {
        var ch = this.source.charCodeAt(this.pos);
        switch (ch) {
            case 32:
                ++this.pos;
                break
            case 13:
                if (this.source.charCodeAt(this.pos + 1) === 10) {
                    ++this.pos;
                }
            case 10:
                ++this.pos;
                break
            default:
                ++this.pos;
        }
    }
}
Doesn't this code have the chance of getting caught in an endless loop? If charCodeAt = 13 and then next charCodeAt <> 10 then the loop counter is never updated and you're stuck.
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by MarkT
Doesn't this code have the chance of getting caught in an endless loop? If charCodeAt = 13 and then next charCodeAt <> 10 then the loop counter is never updated and you're stuck.
No, it will get updated in the case-10 branch - due to a "fall-through".
(case 13 does not contain a break-statement -
admittedly easy to miss in these switch-constructs, which I dislike for that reason).
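The fall-through can be checked directly. The sketch below wraps the original loop in a function that returns the final position; `finalPos` and the `maxSteps` guard are additions for this illustration only.

```javascript
// The original loop from the thread, wrapped in a function that returns the
// final position. `maxSteps` is a safety guard added for this sketch only.
function finalPos(source, maxSteps = 1000) {
  let pos = 0;
  let steps = 0;
  while (pos < source.length) {
    if (++steps > maxSteps) throw new Error("loop did not terminate");
    switch (source.charCodeAt(pos)) {
      case 32:
        ++pos;
        break;
      case 13:
        if (source.charCodeAt(pos + 1) === 10) {
          ++pos; // step onto the LF of a CR LF pair
        }
      // no break: execution falls through into case 10
      case 10:
        ++pos;
        break;
      default:
        ++pos;
    }
  }
  return pos;
}

const loneCr = finalPos("a\rb");   // CR not followed by LF
const crLf = finalPos("a\r\nb");   // CR LF pair
console.log(loneCr, crLf);
```

In both cases the position reaches the end of the input, so a CR without a following LF still advances the counter.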
Here is the adapted JS-TestCode I've used in the different Browser-versions:
(as html-content ... just place this in a *.html file - and drag it onto the Browser in question):
Code:
<html>
<body>
  <input type="file" id="fileinp" />
  <script>
    function readFile() {
      var reader = new FileReader();
      reader.onload = function(){ parsefilecontent(this.result) };
      reader.readAsText(this.files[0]);
    }
    document.getElementById('fileinp').addEventListener('change', readFile);

    function parsefilecontent(s){
      var t = performance.now();
      var p = new Parser(s);
      p.parse();
      alert(p.pos +' '+ p.words +' '+ p.lines +' '+ (performance.now()-t))
    }

    var Parser = function Parser(src) {
      this.lines = 0;
      this.words = 0;
      this.pos = 0;
      this.source = String(src);
    }
    Parser.prototype.parse = function parse() {
      while (this.pos < this.source.length) {
        var ch = this.source.charCodeAt(this.pos);
        switch (ch) {
          case 32:
            ++this.words; //increment the words-counter
            break
          case 13: //handles Mac- and Windows-lineseparators
            ++this.lines; //increment the lines-counter
            //skip the next linefeed-char (if there is one)
            if (this.source.charCodeAt(this.pos + 1) === 10) ++this.pos;
            break
          case 10: //handles Unix-lineseparators
            ++this.lines; //increment the lines-counter
            break
          default:
            //nothing to do here, so far
        }
        ++this.pos; //better placed outside the switch-block
      }
    }
  </script>
</body>
</html>
Re: Text parser written in VB6 are thousands of times slower than JavaScript.
Originally Posted by Schmidt
No, it will get updated in the case-10 branch - due to a "fall-through".
(case 13 does not contain a break-statement -
admittedly easy to miss in these switch-constructs, which I dislike for that reason).