@LaVolpe: Does building an in-memory index with rs.Fields(sSortField).Properties("Optimize").Value = True have any effect on rs.Sort = sSortField?
cheers,
</wqw>
It does, but just creating the index is a large speed hit on large recordsets. Additionally, once a field is optimized, the index seems to slow down new inserts, probably because the index needs to be modified or rebuilt. And here is the kicker: in a disconnected recordset, the Optimize setting does not transfer to a clone, i.e., rs.Clone.
Edited: FYI, setting Optimize on my test recordset (500K records) took nearly 12 seconds. However, searching afterwards with the Find/Filter methods was significantly faster. I tried this method first, but was disappointed that the property didn't carry over to clones.
P.S. You don't want to set that before populating a large recordset:
Without Optimize, inserting 500K records: 6 seconds
With Optimize set before inserting 500K records: 127 seconds
Update: And now I'll have to look at the Optimize property again. Creating a clone after the rs was optimized did not carry over the property. But searching/filtering with the clone must use the optimized indexes too, since there was significant improvement there as well; that's good news. Unfortunately, the bad news is that ADO is kicking out "Unspecified Error" when optimizing multiple fields and/or mixing in clones. That error breaks navigation (EOF & BOF become True while RecordCount=500K). The size of the recordset might also be an issue. Too many unknowns to confidently use the Optimize property.
Last edited by LaVolpe; Dec 31st, 2018 at 12:09 PM.
Insomnia is just a byproduct of, "It can't be done"
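For anyone who wants to reproduce the Optimize measurements, here is a minimal sketch rather than the scanner's actual code. It assumes a project reference to Microsoft ActiveX Data Objects; the function name, field names and row count are made up for illustration. It populates the recordset first and only then builds the index, since the timings above suggest inserts are much slower once Optimize is set.

```vb
' Requires a project reference to Microsoft ActiveX Data Objects (2.x).
' DemoOptimize, the field names and the row count are illustrative assumptions.
Public Function DemoOptimize(Optional ByVal rowCount As Long = 100000) As Boolean
    Dim rs As ADODB.Recordset
    Dim i As Long
    Dim t As Double

    Set rs = New ADODB.Recordset
    rs.CursorLocation = adUseClient      ' Optimize is a dynamic property of the client cursor
    rs.LockType = adLockOptimistic
    rs.Fields.Append "ID", adInteger
    rs.Fields.Append "Name", adVarWChar, 50
    rs.Open                              ' fabricated recordset, no connection

    ' Populate BEFORE optimizing
    For i = 1 To rowCount
        rs.AddNew
        rs.Fields("ID").Value = i
        rs.Fields("Name").Value = "Item" & i
        rs.Update
    Next

    t = Timer
    rs.Fields("ID").Properties("Optimize").Value = True   ' build the in-memory index
    Debug.Print "index build: "; Timer - t; " seconds"

    t = Timer
    rs.MoveFirst
    rs.Find "ID = " & rowCount           ' last row: worst case for an unindexed scan
    Debug.Print "find: "; Timer - t; " seconds"

    DemoOptimize = Not rs.EOF            ' True when the row was located
    rs.Close
End Function
```

Running it once with the Optimize line commented out and once with it active should show whether the index pays for itself at a given recordset size.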
@Elroy and Dreammanor
When you guys find the time, can you run this against your very large project(s) and tell me the speed difference?
Test Steps
1. Create new folder and backup your current scanner code into that folder
2. Then download this zip and overwrite the files in that new folder
3. Open both projects
4. In your original project, add these lines. The replacement files in zip below already have this code. But don't overwrite your original scanner project files.
in frmMain.pvLoadProject just before the line: If m_Project.ParseFile(sFileName, Me) = False Then
Code:
Dim t As Double: t = Timer
in the same procedure just after the "End If" block, add this
Code:
Debug.Print "time: "; Timer - t; " seconds"
5. Now scan your large project using both versions. The immediate result will indicate the time taken
DO NOT run validations with the patched files below -- I need to make changes to them also. However, I want to know what kind of improvement we are looking at before I continue. In my tests, nearly 50% faster, but on much smaller projects. I don't own a 500+ code file project
Thanks to both of you in advance...
Last edited by LaVolpe; Dec 31st, 2018 at 09:27 PM.
Okay, I just re-downloaded the ZIP from post #1 as my starting place. And then, I followed all your directions in post #122. Here are my results from the two runs:
Code:
time: 100.393343750002 seconds (old) VBP
Time 81.6086250000008 seconds (new) VBP
So, you've made some improvement, but I'm not sure it's enough to get terribly excited about. Personally, either way, with the benefits your project provides, I don't see this as a huge hindrance. But maybe it's become a challenge for you.
Also, this was run on my main VBP file (which has many modules and is quite large). I've also got a VBG file which loads all the ActiveX DLLs as well. But I almost never load the project that way. But I suppose I could run your scanner against that VBG file if you'd like. Just let me know and I'll do it.
Happy New Year,
Elroy
EDIT: Hmmm, I'm not sure I can explain this. I went ahead and ran both of your scanner versions on my VBG file. This time, I ran the new version first (just because my Explorer was sitting in that folder). Each time, I just opened your VBP and then ran the scan (from the IDE). This time, here are the results:
Code:
Time 84.7072187500016 seconds (new) VBG
time: 86.6345937500009 seconds (old) VBG
I've got no idea why the old way on VBG was faster than old way on VBP. That makes no sense to me. I copied the form from one of the scans just to show you that it actually loaded all the projects:
Also, the difference between the two scanner versions was much less this time. I've got no explanation for that either.
Last edited by Elroy; Jan 1st, 2019 at 11:25 AM.
Any software I post in these forums written by me is provided “AS IS” without warranty of any kind, expressed or implied, and permission is hereby granted, free of charge and without restriction, to any person obtaining a copy. Please understand that I’ve been programming since the mid-1970s and still have some of that code. My contemporary VB6 project is approaching 1,000 modules. In addition, I have a “VB6 random code folder” that is overflowing. I’ve been at this long enough to truly not know with absolute certainty from whence every single line of my code has come, with much of it coming from programmers under my employ who signed intellectual property transfers. I have not deliberately attempted to remove any licenses and/or attributions from any software. If someone finds that I have inadvertently done so, I sincerely apologize, and, upon notice and reasonable proof, will re-attach those licenses and/or attributions. To all, peace and happiness.
Question: is your original scanner project the most recent one I updated in post #1 last week? The reason I ask is that I had to make many modifications for DBCS, and that version may be slower than the older version you have if you haven't downloaded it. Your timings are curious if you are using the recent version; I did expect a significant improvement between that version and the modified routines I gave you in my previous post.
I've got two versions of your scanner stored away (as ZIP files) in my library somewhere. However, I didn't use either of those for my tests.
To start (at, let's say 9am Central time, a few minutes ago [actually, a couple of hours now]), I just re-downloaded the project in post #1. And, from there, I performed the tests you outlined in post #122.
Would you like me to do something different? You've given me AMPLE help through the years (including this project), that I'd be delighted to do more tests for you. Please just outline what you'd like.
Take Care,
Elroy
Last edited by Elroy; Jan 1st, 2019 at 12:10 PM.
Yes. Can you use the most recent version as the baseline test, and then use a copy of it with the replacement files posted above as the potential-improvement test? Thank you.
Last edited by LaVolpe; Dec 25th, 2018 at 11:55 AM.
I feel like that's what I did. By "most recent version", you mean the version in post #1, correct?
I'll do it again though. I'll make a subsequent post with the repeat results. Sorry if I'm mis-understanding something.
Ok, maybe all of the scans in post #123 were against the VBG file, as it does appear before the VBP file.
However, this time, I was VERY careful. (I thought I was before, but it is New Year's Day.)
This time, ALL of my scans were on my primary VBP file, with the following showing after the scan:
Now, this time, here's the order I did things:
1) Open the scanner VBP file from post #1 (with timing code inserted).
2) Run, and then open/scan my primary VBP file, and let it report timings.
3) Immediately re-scan this file, again reporting timings.
4) Close down that scanner project.
5) Open the scanner VBP file from post #1 with the files from post #122 unzipped and used as replacements.
6) Run, and then open/scan my primary VBP file, and let it report timings.
7) Immediately re-scan this file, again reporting timings.
Here are all the timings with my annotations:
Code:
time: 38.7498124999984 seconds (original, first scan)
time: 39.5207500000033 seconds (original, re-scan)
Time 38.5684374999983 seconds (modified, first scan)
Time 40.4113437499982 seconds (modified, re-scan)
I'm also going to try it with your originally posted version of the scanner. I'm hoping those timing lines will go in at the same spot.
Ok, going back to the first version you posted (the one that shows the modules in the treeview as it's scanning), there's a definite improvement (and this time, I'm certain I scanned my VBP file).
Code:
time: 53.474374999998 seconds
And, even after it reported that, there was a good 30 second delay while it refreshed the treeview before it released the thread back to me (err, Windows).
If you want me to test the timing of some interim version (between your first version and the one found in post #1), you'd better attach it to a post, just so we know we're on the same page.
Take Care,
Elroy
Hmm, confusing for me. You should have seen some improvement nonetheless. When I ran it on a relatively small group file (6 projects), it was nearly a 50% improvement. Simply removing the recordset sorting on string/text fields should be a no-brainer speed increase. Confused.
Also, you have 507 code files, dreammanor has 537 (I think) but about 3x the executable lines. Your project takes ~40 seconds, his takes 425 seconds. If we simply extrapolate 3x your 40 seconds, that's 120 for his project not 425. Even that cannot be a fair comparison, because extra procedure code doesn't necessarily slow down the process a lot in the pre-validation phase of the scan. Something else is in play here.
And yes, 40 seconds for 500+ files doesn't seem too unreasonable. Don't know how many total code files are in your vbg project, but having 7 projects, 80 seconds doesn't seem too bad either. Still looking at speeding up the validation process though.
Edited: think we were posting against each other. 53 seconds reduced to 39/40 seconds is a fair improvement. Thanks for the clarification. Regarding the 425 seconds for dreammanor's project, that still has me confused.
I do have another interim version, but it's significantly different code than any previous version. I'm still playing with it as a replacement that should really improve validation speed. That version wasn't in any of our conversation to this point
Last edited by LaVolpe; Jan 1st, 2019 at 12:54 PM.
And, even after it reported that, there was a good 30 second delay while it refreshed the treeview before it released the thread back to me (err, Windows).
Still? I thought I resolved that. If you run it again, don't expand any nodes and hit the IDE pause button, can you report back the following?
In the immediate window, do this: ? frmMain.tvItems(0).Nodes.Count
I'm expecting that to be a small value of 14-ish. And if so, why would Windows add a 30-second delay for 14 nodes? Before, when the entire tree was populated, you'd have potentially 1000's of nodes and then I can understand it, but not with a dozen or so.
Last edited by LaVolpe; Jan 1st, 2019 at 01:24 PM.
No no, in post #129, I reached back and grabbed the original-first-version of your project. You have resolved that in subsequent versions.
But, to be clear, everything in post #128 was done with your latest.
Also, regarding differences between dreammanor and me, I'm running on a pretty good machine:
CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz.
GPU (if it matters for this): NVIDIA GeForce GTX 1070.
C drive: HGST WD Travelstar 1TB, 7200rpm, 12ms seek.
RAM: 16GB
That might explain at least some of the difference.
EDIT1: Also, I'll bundle up the main VBP and send it to you, if you like. It'd take a bit to get it to where you could load and execute it, but I don't think you'd need that to run your scanner. Just let me know.
Last edited by Elroy; Jan 1st, 2019 at 05:52 PM.
Certain Windows APIs, like GetKeyState, return the short type (VB Integer) instead of int (VB Long).
I presume that project scanner currently has no actual checking against api declarations?
Correct, and that false positive is explained in the description (Help menu). It is more or less a warning/reminder to double-check. There is no intent to identify all APIs, across all libraries, that legitimately use those types in their signatures. The problem is that code copied from elsewhere is hardly ever checked by users for proper API parameter/return vartypes. Again, it is meant just as a reminder.
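To illustrate the GetKeyState case: the Win32 prototype is SHORT GetKeyState(int nVirtKey), so a declaration with an Integer return type is legitimate, and a warning on it is exactly the kind of false positive discussed here. The helper name IsKeyDown below is hypothetical and only there to show the return value in use.

```vb
' Win32 prototype: SHORT GetKeyState(int nVirtKey);
' SHORT is 16 bits (VB6 Integer); int is 32 bits (VB6 Long).
Private Declare Function GetKeyState Lib "user32" (ByVal nVirtKey As Long) As Integer

' Hypothetical helper, for illustration only.
Private Function IsKeyDown(ByVal vKey As Long) As Boolean
    ' The high-order bit is set while the key is down,
    ' which makes the signed 16-bit result negative.
    IsKeyDown = (GetKeyState(vKey) < 0)
End Function
```

Declaring the return value As Long here would be the actual mistake, because the upper 16 bits of the result would then be undefined.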
I'm back. Sorry for the late reply.
I added test time t2:
Code:
Private Sub pvLoadProject(Source As Variant)
    ' called from menu Load/Reload
    Dim sFileName As String
    If VarType(Source) = vbString Then ' else existing database imported, but not yet displayed
        sFileName = Source
        If LenB(sFileName) = 0 Then
            Dim cBrowser As New CmnDialogEx
            With cBrowser
                .Filter = "Projects|*.vbp;*.vbg|VB Files|*.cls;*.ctl;*.dsr;*.bas;*.frm;*.pag;*.dob;*.vbp;*.vbg|" & _
                          "Classes|*.cls|Designers|*.dsr|Forms|*.frm|Modules|*.bas|Property Pages|*.pag|" & _
                          "User Controls|*.ctl|User Documents|*.dob|All Files|*.*"
                .FlagsDialog = DLG__BaseOpenDialogFlags
            End With
            If cBrowser.ShowOpen(Me.hWnd, , , m_OpenGUID) = False Then Exit Sub
            sFileName = cBrowser.FileName
            Set cBrowser = Nothing
            mnuFile(miRescan).Tag = sFileName
        End If
    End If
    tvItems(0).Nodes.Clear ' clear
    pvSetContextMenuItems False ' disable some context menu items
    If mnuValidate(miToggleView).Checked = True Then _
        Call mnuValidate_Click(miToggleView) ' show primary treeview
    tvItems(0).Nodes.Clear: tvItems(0).Enabled = False ' make ready to populate
    tvItems(1).Nodes.Clear: tvItems(1).Enabled = False
    m_State = 0 ' reset
    pvUpdateStatus chrsLoad, "Scanning Project"
    If LenB(sFileName) <> 0 Then ' else recordset to be processed
        m_State = 2
        Set m_Project = New clsPrjFile
        m_Project.ValidateEvents = mnuOpt(0).Checked
        Dim t As Double: t = Timer ' start timing the parse phase
        If m_Project.ParseFile(sFileName, Me) = False Then
            Set m_Project = Nothing ' prevent options like saving/exporting
            If sFileName = mnuFile(miRescan).Tag Then mnuFile(miRescan).Tag = vbNullString
            pvUpdateStatus vbNullString, vbNullString
            If (m_State And 4) = 0 Then
                m_State = 0
            Else
                m_State = 0
                Unload Me
            End If
            Exit Sub
        End If
        Debug.Print "Time " & Timer - t; " seconds"
        m_State = m_State Xor 2
    End If
    tvItems(0).Enabled = True
    tvItems(1).Enabled = True
    Dim t2 As Double: t2 = Timer ' start timing the tree view phase (added for this test)
    Set m_Records = m_Project.Records
    If Not m_Data Is Nothing Then m_Data.Close
    Set m_Data = m_Records.Data.Clone
    pvUpdateStatus "Project scanned...", "Preparing tree view"
    pvItemsAdd_RootNodes (LenB(sFileName) = 0)
    If tvItems(0).Nodes.Count <> 0 Then
        Set tvItems(0).SelectedItem = tvItems(0).Nodes(1)
        tvItems(0).SelectedItem.EnsureVisible
    End If
    pvUpdateStatus vbNullString, vbNullString
    With m_Project.Records.Data
        .Filter = SetQuery(constFldType, qryIs, ptSourceFile)
        sFileName = UCase$(Right$(.Fields(constFldAttr2).Value, 3))
        If sFileName = "VBG" Then
            mnuCtx(miVbg).Visible = True
            mnuCtx(miVbp).Caption = "Show VBP File"
        Else
            mnuCtx(miVbg).Visible = False
            mnuCtx(miVbp).Caption = "Show " & sFileName & " File"
        End If
        mnuValidate(miVbgUnhide).Visible = mnuCtx(miVbg).Visible
        .Filter = adFilterNone
    End With
    Debug.Print "Time2 " & Timer - t2; " seconds"
End Sub
My VBP is stored in the U-Disk (USB flash disk). I copied the VBP from the U-Disk to the hard disk and re-tested it. The test results did not change much:
In addition, there are 507 files in Elroy's project and 567 in mine. IMO, the scanning time depends not only on the speed of the computer and the number of project files, but also on the number of lines of source code. Krool's ComCtlsDemo.vbp has 107 files, which is about one-fifth of my project, but it takes only one-twentieth of the time my project takes.
Last edited by dreammanor; Jan 4th, 2019 at 05:54 AM.
Hi LaVolpe, use the Seek method instead; it is much faster (100 times faster or more).
If it is still not fast enough, then consider switching to DAO.
Thanx for the thought. But if the MSDN documentation is correct, this will not be an option: Seek requires a server-side cursor, whereas this project uses a disconnected recordset (no DB), and a disconnected recordset requires a client-side cursor.
Seek: This method is supported only with server-side cursors. Seek is not supported when the Recordset object's CursorLocation property value is adUseClient.
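The restriction quoted above can be checked at run time with Recordset.Supports. The following is only a sketch under assumptions: FindById is a hypothetical helper, "ID" is an assumed field name and "PrimaryKey" an assumed index name. On a client-side (disconnected) recordset the Supports test should fail and the code falls back to Find.

```vb
' Requires a project reference to Microsoft ActiveX Data Objects (2.x).
' FindById, the "ID" field and the "PrimaryKey" index name are assumptions.
Public Function FindById(rs As ADODB.Recordset, ByVal id As Long) As Boolean
    If rs.EOF And rs.BOF Then Exit Function          ' empty recordset

    If rs.Supports(adSeek) And rs.Supports(adIndex) Then
        ' Only reachable with a server-side cursor on a provider that exposes indexes
        rs.Index = "PrimaryKey"
        rs.Seek id, adSeekFirstEQ
    Else
        ' Client-side cursor: Seek is unavailable, so scan (or use an optimized field)
        rs.MoveFirst
        rs.Find "ID = " & id
    End If

    FindById = Not (rs.EOF Or rs.BOF)                ' EOF means no match
End Function
```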
Ah. Then I would consider creating a temporary database file. Seek really is far faster.
Already tried that. The overhead of inserting records into the DB was significantly slower than using the disconnected RS. A sample project took over 2 seconds vs just 1/2 second for a RS. And yes, I used SQL Insert to add records, not the recordset.AddNew method.
P.S. I did get a PM from dreamManor. The main speed issue is parsing the procedures in his case. When he skipped that part of the process, the time needed was about 1/2. However, that was just to find where the main speed hit was occurring. I do have a faster routine to get through the procedures for the initial scan, but haven't offered that up yet. I will shortly and ask dreamManor to be kind enough to run that one for me. It's not ready yet though.
My experience is that adding an index does not cause much overhead, and if the index is the primary key then the index is already there.
But I use DAO, I don't know with ADO.
DreamManor, I will continue to finish my latest version of this project.
But I don't think I'll be able to significantly improve the speed for you. In one of your recent posts, you mentioned that running Krool's 107 file ComCtlsDemo project, it took 21.20 seconds on your system. On my system, which is 10 years old, it took 8.2 seconds. I don't know how to resolve that large difference and do not think I can do it with code.
The improvements I made to prevent filtering on string/text recordset fields appear to work really well. In that same post, you mentioned that, with the previous version, it took 1.62 seconds from the end of the project scan to the point of displaying the nodes. After you used the updated files, that time dropped to 0.19 seconds (8.5x faster). That part of the code is where most of the filtering occurs.
Last edited by LaVolpe; Jan 5th, 2019 at 09:07 AM.
For grins, I took Krool's latest (here) and scanned it with the latest scanner (in OP #1 of this thread, and with the timer lines from post #122 above), and got these results:
Code:
time: 6.03321875000256 seconds
EDIT: I re-scanned a few times and got the following:
Elroy, obviously your PC is much faster than mine. Regarding faster times when re-running that parsing... Think that has to do with disc read/write smart caching. I see that all the time too.
I have identified the bottleneck in the code. Time needed is directly related to how many "event"-related items and procedures exist in the project. In Krool's project, there are 950 items tested for events: 730 controls, 98 classes, 90 Implementations, and 32 variables declared WithEvents. Among those 950 items there were 1,435 total events found.
This means 950 queries were executed using a "Like [itemName]_%" clause to find those 1,435 events, i.e., Form1_Load. Designing a different/faster way of looking for events, reducing number of queries executed, would significantly increase performance. Have to think about this because there is no direct/easy way to look at a procedure name and say "Hey, that's an event for some object/item", other than positively excluding a procedure, as an event, if it doesn't have an underscore in its name. The remaining procedures, those with an underscore, still need to be verified as an event to prevent stuff like Sub MyCustomProc_VersionA and Sub MyCustomProc_VersionZ being interpreted as events.
Edited: Of those 950 items, there are only 85 unique "items". For example, there could be 25 different textbox items in the project, but each is of the same class: VB.TextBox. Looking at designing a faster method that requires 85 queries vs 950 queries. Doable, but gonna need to track some additional stuff to get there and compare overall speed when the new overhead is introduced.
Last edited by LaVolpe; Jan 5th, 2019 at 01:16 PM.
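The two ideas in the post above (rule out procedures with no underscore, and query once per class instead of once per item) could be sketched roughly as follows. This is not the scanner's code: every name here is hypothetical, and the parallel arrays of item names and class names are an assumed input format.

```vb
' All names below are hypothetical; none come from the scanner's source.

' Cheap pre-check: without an underscore that has text on both sides,
' a procedure cannot be an event handler, so no query is needed for it.
' A True result still has to be verified (e.g. MyCustomProc_VersionA).
Public Function CouldBeEvent(ByVal procName As String) As Boolean
    Dim pos As Long
    pos = InStr(1, procName, "_")
    CouldBeEvent = (pos > 1 And pos < Len(procName))
End Function

' Map each distinct class (e.g. "VB.TextBox") to the items of that class,
' so event names can be looked up once per class rather than once per item.
Public Function GroupItemsByClass(itemNames() As String, itemClasses() As String) As Object
    Dim groups As Object
    Dim members As Collection
    Dim i As Long

    Set groups = CreateObject("Scripting.Dictionary")
    groups.CompareMode = vbTextCompare               ' class names are case-insensitive
    For i = LBound(itemNames) To UBound(itemNames)
        If groups.Exists(itemClasses(i)) Then
            Set members = groups.Item(itemClasses(i))
        Else
            Set members = New Collection
            groups.Add itemClasses(i), members
        End If
        members.Add itemNames(i)
    Next
    Set GroupItemsByClass = groups
End Function
```

With the figures quoted above, the returned dictionary would hold 85 keys for the 950 items, and each key's collection lists the item names whose [itemName]_[event] procedures need checking against that class's event list.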
LaVolpe, you have done a lot for us; much appreciated.
Your project has been great, and for the vast majority of people, its performance is good enough.
I got a lot from this forum and the only thing I can do for you is to do some testing for you. I'm very happy that my big projects can be good test samples. I like to do some extreme tests, for example: load a file containing 1 million lines of code into my golang code editor, although this won't happen in reality.
I always have a lot of work, I hope that one day I can make my own contribution to this forum just like you.
In addition, we have an internal development rule that the performance of the computer used by the developer must be lower than that of the users, so that the software we develop can run smoothly on the user's computer. This is why prjScan scans for a long time on the computer I used for development.
On my test computer (win10), prjScan scans Krool's latest ComCtlsDemo.vbp very fast: (re-scanned a few times)
Code:
Time 1.53790624999965 seconds
Time 1.34859375000087 seconds
Time 1.42934375000186 seconds
Time 1.40837499999907 seconds
Time 1.37512499999866 seconds
Edit:
jpbro's VB6SourceProcessor scans very fast on the computer I used for development, but I haven't had time to carefully research and compare the source code of VB6SourceProcessor and prjScan.
Last edited by dreammanor; Jan 5th, 2019 at 11:23 PM.
Hmmm, I just scanned Krool's latest ComCtlsDemo.vbp yet again, with the latest LaVolpe scanner (but before the replacement files in post #122), and got the following:
Everything looks fine on my end. It was my understanding that, when the replacement files were inserted, this was just to be used for a speed-test.
Last edited by Elroy; Jan 6th, 2019 at 09:28 AM.
@dreamManor. If you still have that one line commented out to prevent processing procedures (previous PM), that would explain your speed increase from 24 seconds to 1 second. Commenting that line was only to help locate speed bumps in the code; not processing procedures is not really an option.
Regarding no statistics, there is a small bug which would prevent them from being added to the tree in some cases. I found that bug last week. However, in your case, I think the code version you are running has been modified (due to testing) too much. Not only do you not have statistics shown, you don't have the base path shown, nor any imported DLLs shown.
Originally Posted by dreamManor
jpbro's VB6SourceProcessor scans very fast on the computer I used for development, but I haven't had time to carefully research and compare the source code of VB6SourceProcessor and prjScan.
Can't really compare jpbro's project with this one. I haven't tried his project because it uses references I don't want to install and requires backing up files to prevent corruption. However, from reading his description of the project, it basically "parses" lines of code. That is about the only thing our two projects have in common. Again, it is the process of identifying events that is the bottleneck. If we didn't try to identify events, we would have amazing parsing times. But events are handled differently from other procedures during validation, and not identifying them is not an option. Without identifying them, you would get many false-positive validation warnings, especially during zombie checks, but with other checks too. To see what I mean, jump to the bottom of clsPrjFile.pvProcessProject and comment out the 4 lines of code that call the 4 pvValidateEvents_[xxx] routines.
@All. I'm still going to post one newer version of this and pretty much call it done unless bugs are found down the road. I probably could get better speed for very large projects, but that would likely require starting over on much of the core logic -- don't want to invest that much time. Same reason I don't think I'll move this into a VB add-in. If it were an add-in, I would be able to use the VB IDE to locate information for me instead of parsing it out manually.
Do note that all this talk about speed issues is particular to very large projects, those with hundreds of source code files. For your average project, say fewer than 30 code files, I feel this project's speed is sufficient.
Last edited by LaVolpe; Jan 6th, 2019 at 12:30 PM.
Insomnia is just a byproduct of, "It can't be done"
Yes, the discussion about performance should stop; the performance of prjScan is good enough now.
If I have time in the future, I'll try to improve the scanning speed of prjScan for large projects. I'll optimize it from two aspects:
(1) Try to use Sqlite-MemDB
(2) Use a faster array or collection instead of VB.Collection
Looking forward to the new version of prjScan, thank you LaVolpe.
Last edited by dreammanor; Jan 7th, 2019 at 06:37 AM.
I'll post some test code files in a day or two. Already fixed the bottleneck with identifying control events. I can now scan Krool's project in less than 4 seconds; that's 1/2 the time it took me before. Now that I know the new logic works well, I'll get that a bit faster by applying it to the other 3 types of events: WithEvents, Implementation Events, and Class Events.
P.S. VB.Collection isn't used often in that project. My updated pvValidateEvents_[xxxx] routines no longer use them. Elsewhere in the project where they are used, they are not that prominent and do not have many entries. I use them more for convenience for little stuff. When possible I prefer a sorted array for binary searching.
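To illustrate the "sorted array for binary searching" idea, here is a minimal sketch. It is written in Python rather than VB6 just to keep it short, and it is not taken from prjScan; the procedure names and the `contains` helper are made up for the example. The point is only that a lookup in a sorted array needs about log2(N) comparisons.

```python
from bisect import bisect_left

def contains(sorted_names, name):
    """Binary search: True if name is present in the already-sorted list."""
    i = bisect_left(sorted_names, name)          # leftmost position where name could go
    return i < len(sorted_names) and sorted_names[i] == name

# Hypothetical procedure names, lower-cased because VB6 identifiers are
# case-insensitive. The list is sorted once, then searched many times.
names = sorted(n.lower() for n in
               ["Form_Load", "cmdOK_Click", "Timer1_Timer", "Class_Initialize"])
```

Sorting costs something up front, so this pays off when the array is built once and queried often, which is the usual situation when validating names after a scan.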
Thanks for putting this out LaVolpe; it caught a bunch of Variant declarations that I thought I had fixed years ago!
I'm trying to think of suggestions for further validations to do, but most things just fall under "style guide" type items. I suppose there are some internationalization / localization checks that might be possible (checking for string date literals), but none of the automated checks I can come up with would be particularly useful. Can you think of a way to scan for missing On Error handling in event handler routines?
Edit: It would be nice to have some total statistics for a project group.
Another thing; I don't remember this getting reported before:
I validated a project group with the "Variant vs. String Functions" check and without the Duplicate String Literals check. The Variant vs String results kept getting copied from one project to the next; the last project in the group showed the Variant vs String results of all of the previous projects. This didn't happen with any other validations.
@aHenry. I'll take a look at the Variant vs String oddity you mentioned. See if I can find out why you experienced that. Each project, in a group, creates its own temporary recordset for those types of discrepancies. So, I think it must be a filter I'm using to display the discrepancies and maybe I forgot to ensure the filter included the specific project. That could explain it. Instead of showing them by project, it showed all of them one project to the next. Funny, but not really.
I did consider "On Error" handling reporting; however, I believe that's more of a personal style. Obviously not all procedures need error handling, and I think such a check is truly valuable only to those who are in the habit of adding error handling to all routines, something like what MZTools did. The idea of group project statistics is a good one -- this way we don't have to do mental math.
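For anyone who does want such a check, a naive version could look something like the sketch below. It is in Python rather than VB6, it is not part of prjScan, and it makes big assumptions: the input is already one statement per logical line, an "event handler" is simply any Sub whose name contains an underscore, and any `On Error` statement counts as handling (even `On Error GoTo 0`). All names here are hypothetical.

```python
import re

# Assumed heuristic: a Sub named Something_Something is treated as an event handler.
PROC_START = re.compile(
    r"^\s*(?:(?:Private|Public|Friend)\s+)?(?:Static\s+)?Sub\s+(\w+_\w+)\s*\(", re.I)
PROC_END = re.compile(r"^\s*End\s+Sub\b", re.I)
ON_ERROR = re.compile(r"^\s*On\s+Error\b", re.I)

def subs_without_on_error(lines):
    """Return names of handler-looking Subs that contain no On Error statement."""
    missing = []
    current = None           # name of the Sub being scanned, if any
    has_handler = False
    for line in lines:
        if current is None:
            m = PROC_START.match(line)
            if m:
                current, has_handler = m.group(1), False
        elif PROC_END.match(line):
            if not has_handler:
                missing.append(current)
            current = None
        elif ON_ERROR.match(line):
            has_handler = True
    return missing
```

The underscore heuristic will flag plenty of ordinary procedures and miss nothing clever, which is exactly why reliably identifying real events is the hard part.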
@All. Think I got that speed bump resolved. This is the issue we've been discussing for the past several days. Scanning Krool's project (104 files now vs. 107), before I made changes, took 8.2 seconds on my PC. Now it takes 2.82 seconds, nearly 3x faster. Much smaller projects that took 1-2 seconds are now done in under 1 second -- much improved with a different way of identifying event procedures. I'll post the new version before the end of the month. Still gotta revamp some of the validation portion now that I've gone and rebuilt my core statement & word parser.
@dreamManor and Elroy. Tomorrow, I'll post another zip of replacement files for you guys to play with. When you find the time, please report back on your impression of the speed gain or no gain? Thank you in advance.
Originally Posted by LaVolpe
Can't really compare jpbro's project with this one. I haven't tried his project because it uses references I don't want to install and requires backing up files to prevent corruption.
FYI - I know you're firmly in the anti-closed source third party library camp, so I don't expect you to register anything, but I do want to clarify the "requires backing up files to prevent corruption" comment. As my project currently stands, backups are not a requirement. The project doesn't do any writing back to disk on its own. It does, however, allow users to modify their source code and subsequently write it back to disk, which is why I provided the warning as a precaution/reminder that you can muck things up if you aren't careful. In its current form though, there is as close to 0 risk as possible with anything that touches your source files at all.
That said, I always recommend backing up your source files regularly, but especially before touching them with third party software.
Originally Posted by LaVolpe
However, from reading his description of the project, it basically "parses" lines of code. That is about the only thing our two projects have in common.
That's correct - right now my project simply splits code files into single physical lines (although logical line support is 90% complete and coming soon). You can then perform regex (or regular InStr) matches against lines of source and make substitutions on those matches as required. The main goal is to create a "Source Object Model" for VB6 code, I guess. I personally use it for doing things similar to your project (finding stylistic issues, gotchas, common mistakes, etc...) but the difference with my project is that devs have to write their own rules/detection routines, and they can automate changes and write them back to disk.
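As a rough picture of what a "rule" run against physical lines might look like, here is a small sketch. It is in Python, not VB6, and it is not VB6SourceProcessor's actual API; `RULE` and `apply_rule` are hypothetical names. The example rule swaps a few Variant-returning string functions for their `$` versions and reports which lines changed, without writing anything to disk. It is naive in that it does not skip string literals or comments.

```python
import re

# Hypothetical rule: Left( / Right( / Mid( / Trim( become Left$( etc.
# \b keeps LTrim( and RTrim( from matching on their "Trim(" tail.
RULE = re.compile(r"\b(Left|Right|Mid|Trim)\(", re.IGNORECASE)

def apply_rule(lines):
    """Return (1-based numbers of changed lines, rewritten copy of the lines)."""
    hits, rewritten = [], []
    for number, line in enumerate(lines, start=1):
        new_line = RULE.sub(r"\1$(", line)
        if new_line != line:
            hits.append(number)
        rewritten.append(new_line)
    return hits, rewritten
```

Keeping detection (the hit list) separate from the rewritten copy lets a dev review the matches first and only then decide whether to save the changes.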
Originally Posted by LaVolpe
Do note that all this talk about speed issues is particular to very large projects, those with hundreds of source code files. For your average project, say fewer than 30 code files, I feel this project's speed is sufficient.
I'd like to add that I don't think raw speed is really all that important with projects like this - IMO the idea is that you run them before doing final builds before public distribution to catch potential problems and fix them. Even if it takes 30 minutes, that's not really a big deal as detected problems will save you a tonne of time down the road.
@jpbro. Regarding logical lines, if you want any "lessons learned" from me, I can certainly offer a few. Looking over your source code I did see some logic that could fail with legitimate logical lines; although it would take someone with a pretty unusual coding style to break a better-than-basic parser.
Logical lines are the reason why I chose not to use InStr, RegEx and other forms of parsing. Not that those can't be used, but in some cases they would require going back over the code time and again to catch all the legitimate variations that a logical line can be written in, i.e., continuations, etc. I wanted a one-pass solution, which is the reason this project is parsed the way it is.
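For readers unfamiliar with the term, the simplest part of the logical-line problem is joining physical lines that end in a line continuation (a space followed by an underscore). The sketch below does only that, in one pass. It is in Python rather than VB6, it is not how prjScan actually parses, and `join_logical_lines` is a hypothetical name; colon-separated statements and other variations are deliberately ignored.

```python
def join_logical_lines(physical_lines):
    """Merge physical lines ending in ' _' into single logical lines, in one pass."""
    logical = []
    pending = []                      # pieces of the logical line being assembled
    for raw in physical_lines:
        line = raw.rstrip()
        continued = line.endswith(" _")
        if continued:
            line = line[:-2].rstrip() # drop the continuation marker
        if pending:
            line = line.lstrip()      # continuation pieces lose their indentation
        pending.append(line)
        if not continued:
            logical.append(" ".join(pending))
            pending = []
    if pending:                       # input ended in the middle of a continuation
        logical.append(" ".join(pending))
    return logical
```

A string literal cannot span physical lines in VB6, so a line that ends in ` _` is not ending inside a string; that is what makes this end-of-line test workable for the simple case.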
Originally Posted by LaVolpe
@dreamManor and Elroy. Tomorrow, I'll post another zip of replacement files for you guys to play with. When you find the time, please report back on your impression of the speed gain or no gain? Thank you in advance.
Ok, look forward to your new files. Thank you very much, LaVolpe.
Originally Posted by jpbro
That's correct - right now my project simply splits code files into single physical lines (although logical line support is 90% complete and coming soon). You can then perform regex (or regular InStr) matches against lines of source and make substitutions on those matches as required. The main goal is to create a "Source Object Model" for VB6 code, I guess. I personally use it for doing things similar to your project (finding stylistic issues, gotchas, common mistakes, etc...) but the difference with my project is that devs have to write their own rules/detection routines, and they can automate changes and write them back to disk.
Hi jpbro, for some time to come I will need to do some parsing of text files (such as HTML and JavaScript). I have a question: is regex more efficient than the VB string functions (InStr, Replace, etc.)?
Last edited by dreammanor; Jan 7th, 2019 at 08:51 PM.