-
Feb 18th, 2021, 10:44 PM
#1
Thread Starter
Fanatic Member
In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
I use vbRichClient5.dll for sqlite.
Let's say I create and populate a temporary table like this:
Code:
sql = ""
sql = sql & "create temp table if not exists T1 ( "
sql = sql & "TID INTEGER Primary Key, "
sql = sql & "CUST_FNAME TEXT, "
sql = sql & "CUST_LNAME TEXT ) "
Cnn.Execute sql
sql = "create index if not exists idx1 on T1 ( CUST_FNAME ) "
Cnn.Execute sql
sql = "create index if not exists idx2 on T1 ( CUST_LNAME ) "
Cnn.Execute sql
sql = "Insert Into T1 ( CUST_FNAME, CUST_LNAME ) values ('John', 'Smith')"
Cnn.Execute sql
sql = "Insert Into T1 ( CUST_FNAME, CUST_LNAME ) values ('Robert', 'Ashton')"
Cnn.Execute sql
The above code successfully creates and populates the table.
But then the next query fails:
Code:
sql = "select TID, CUST_FNAME, CUST_LNAME from T1 indexed by idx2 where CUST_LNAME <> ''"
Set Rs = Cnn.OpenRecordset(sql)
Right on the OpenRecordset statement it gives me a runtime error "No query solution"
If I remove the indexed by clause, the query works fine.
Even an update query has the same problem:
Code:
sql = "update T1 indexed by idx2 set CUST_LNAME = upper(CUST_LNAME) where CUST_LNAME <> ''"
Cnn.Execute sql
Right on the Cnn.Execute statement it gives me the same runtime error "No query solution"
Again, if I remove the indexed by clause, the query works fine.
Most strangely of all: In the first query (select), if I limit the list of selected columns to only TID even with the indexed by clause in place (like this):
Code:
sql = "select TID from T1 indexed by idx2 where CUST_LNAME <> ''"
Set Rs = Cnn.OpenRecordset(sql)
The query works fine.
I really don't understand why the indexed by clause sometimes gives a runtime error.
Why does this happen?
And how can I avoid this?
Please help.
Thanks.
-
Feb 19th, 2021, 04:14 AM
#2
Re: In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
Originally Posted by IliaPreston
I really don't understand why the indexed by clause sometimes gives a runtime error.
Why does this happen?
And how can I avoid this?
Good question, let me try...
Here is a "boiled down" example, which tries to shed some more light on this:
Code:
Private Sub Form_Load()
With New_c.Connection(, DBCreateInMemory)
.Execute "Create Table T(ID Integer Primary Key, A Text, B Text)"
.Execute "Create Index iA On T(A)"
.Execute "Insert Into T Values(1, 'A 1', 'B 1')"
.Execute "Insert Into T Values(2, 'A 2', 'B 2')"
1 Caption = .OpenRecordset("Select ID From T Indexed By iA Where A <> ''").RecordCount
2 Caption = .OpenRecordset("Select A From T Indexed By iA Where A <> ''").RecordCount
3 Caption = .OpenRecordset("Select B From T Indexed By iA Where A >= ''").RecordCount
4 Caption = .OpenRecordset("Select B From T Indexed By iA Where A <= ''").RecordCount
5 Caption = .OpenRecordset("Select B From T Indexed By iA Where A > ''").RecordCount
6 Caption = .OpenRecordset("Select B From T Indexed By iA Where A < ''").RecordCount
7 Caption = .OpenRecordset("Select B From T Indexed By iA Where A = ''").RecordCount
8 Caption = .OpenRecordset("Select B From T Indexed By iA Where A <> ''").RecordCount
End With
End Sub
The above (blue) lines 1 to 7 will all work - whilst only the (red) line 8 will fail -
and here is why...
First, imagine an index (in this case the Index iA on Table T) as a:
- separate "Mini-Table"
- consisting of "one Column more, than the indexed Field" (in our case two columns, A and -> ID)
- Column 1 being the "ordered Key-Field" (in this case Field A of Table T)
- Column 2 being the RecordID (Primary Key) of the Table (in this case the ID-Field of Table T)
Secondly, when you execute a query using "Indexed By", you will always *force* SQLite, to use:
- the "MiniTable of the specified Index-Name", as the *primary lookup-table for a planned scan*
And such a "primary-lookup-table" has to follow the rule (in SQLite) - to "do this scan only once" -
(either upwards, or downwards - in case cross-referencing to the parent-table is involved).
So, here is the explanations, why 1 to 7 will work (using Index iA as the "primary-lookup-table" PLT), categorized in 3 groups:
FullTable-scans on the PLT:
1 - will work, because the PLT already contains both specified columns, Col A (in the Where Clause) and Col ID (in the FieldList)
2 - will work, basically for the same reason as 1 (A being contained in the PLT, allowing a FullTable-Scan directly on the PLT, no cross-referencing to T required)
One-Directional Scans on the PLT (cross-referencing via the ID-Field of the PLT to the main-table T):
3 to 6 will work, because the second criterion is fullfilled (that being: a "one-directional-scan" on the PLT - due to the > or < operators)
SubSelect Optimizations in case of an "equality-comparison on an indexed field"
7 - will work, because Where A='' is replaced by an internal optimization, which is more like: Where T.ID In (Select ID From PLT Where A='')
And finally...
8 will not work, because SQLite reads the <> comparison-condition as "forcing a two-directional scan on the PLT"
.. (everything larger than the searchstring "downwards" and everything smaller than the search-string "upwards")
And as said, a used index (a PLT) in SQLite wants to "move the cursor on a PLT" in only one direction.
Long story short - please avoid the <> Operator in Col-comparisons, in case you *forced* SQLite to use the PLT of that column.
HTH
Olaf
Last edited by Schmidt; Feb 19th, 2021 at 05:11 AM.
-
Feb 20th, 2021, 08:19 PM
#3
Thread Starter
Fanatic Member
Re: In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
Thanks for your help.
There are a couple of issues in here:
1- This type of problem does not exist in Microsoft Sql. It only exists in Sqlite.
I can force the same index on the same query in Microsoft Sql, and it will work without a hitch.
Does this mean that this is a bug or a design flaw in Sqlite?
Or, do you believe that this behaviour of Sqlite is justifiable?
2- Doesn't Microsoft Sql use the same kind of PLT?
If it does, then why this problem doesn't happen in Microsoft Sql?
3- And I don't really get the necessity of that "two-directional" scan:
8 will not work, because SQLite reads the <> comparison-condition as "forcing a two-directional scan on the PLT"
.. (everything larger than the searchstring "downwards" and everything smaller than the search-string "upwards")
And as said, a used index (a PLT) in SQLite wants to "move the cursor on a PLT" in only one direction.
Why does a <> operator force the Sqlite engine to believe that a two-directional scan is the ONLY possible scan?
I understand that in the case of a <> operator, ONE way of scanning is a two-directional scan (actually the worst way of scanning).
But, another way of scanning would be scanning the mini-table from the first to the last record (PERFECTLY one-directionally) and choosing any record that meets condition A <> ''
Just imagine if you had a hard-copy print of a list of names in your hand, and you were to choose all the names that were unequal to 'George' in that list, you would scan that list in a one-directional way.
So, why do you think that (doing a similar task), Sqlite scans the list in a two-directional way?
4- And also, from a PRACTICAL perspective, when I use my best judgement, and try to make a logical analysis (regardless of whether Sqlite in its current version allows it or not), for
query 8 in your example, the index iA is the best index to use.
So, if Sqlite doesn't allow that, then how best do you recommend that I write that query (query 8 in your example) or how to create another index?
Maybe, you would say that in the case of query 8, I should leave that indexed by clause out and let Sqlite decide what to do.
In that case Sqlite will scan the main table instead of using the index !!!!!! which means the worst possible performance.
So, how can I speed up that query?
Logically (regardless of whether Sqlite in its current version allows it or not), the best way to guarantee good performance is to create that index iA on T(A), (whether in the main query I explicitly force it by an indexed by clause or not).
But, if creating that index is USELESS for speeding up query 8, then what kind of action is useful?
Please advise.
Thanks a lot.
Ilia
-
Feb 22nd, 2021, 04:41 AM
#4
Re: In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
First off...
The documentation for "Indexed By"-behaviour in SQLite: https://sqlite.org/lang_indexedby.html
states the following:
The INDEXED BY clause of SQLite is not a hinting mechanism and it should not be used as such. The INDEXED BY clause does not give the optimizer hints about which index to use; it gives the optimizer a requirement of which index to use.
If the query optimizer is unable to use the index specified by the INDEXED BY clause, then the query will fail with an error.
So, what you currently get, is exactly described in the documentation:
- the SQLite Query-Optimizer telling you via an Error-Message, that in case of <> it will not make use of requested Index
- due to a better matching "FullTable-Scan-Fallback" on the original Table
.. (because cross-referencing from a "FullTableScanned PLT" is slower than the FullTableScan on the referenced table itself).
So it's not really "a Bug" (when it conforms to the documentation, which warned you about the possibilities of error-throwing).
That said, there is of course an "inconsistency", because the successful executed query:
Select ID From T Indexed By iA Where A <>'' (using iA in a FullTableScan as the "sole, covering index, ignoring T", because the ID-Field is contained in the Index itself)
... led you to believe, that:
Select B From T Indexed By iA Where A <>'' (using iA as a "sole, covering index for T" failed, because B is not contained in iA)
... would work as well.
So, maybe this inconsistency is already corrected in SQLite (RC5 is currently a few versions "behind" of the official release).
Ok ... just tested this with the most recent SQLite-version (which comes with the recent RC6-version 6.0.4) -
and the example-code in #2 is now working without any thrown error.
So, the SQLite-team fixed "the inconsistency" in the meantime.
Originally Posted by IliaPreston
1- This type of problem does not exist in Microsoft Sql. It only exists in Sqlite.
...
2- Doesn't Microsoft Sql use the same kind of PLT?
If it does, then why this problem doesn't happen in Microsoft Sql
Microsoft SQL(Server) is not defining "the world-standard" regarding SQL ...
(normal MS-behaviour is more like "ignoring" or "undermining" existing world-standards).
SQLite is more "modeled" after "PostgreSQL", than MS-SQLServer or Oracle.
Originally Posted by IliaPreston
3- And I don't really get the necessity of that "two-directional" scan:
Why does a <> operator force the Sqlite engine to believe that a two-directional scan is the ONLY possible scan?
If you want to have any benefits from the pre-sorted index, then you can:
- try to find the value(range) in the sorted index-list
- and then scan upwards from the found value(range) to index-number 0
- followed by scanning downwards to index-number "count-1"
Alternatively you can resort to a "single Full-Scan from the top" ...
(but then you will ignore any benefits the sorted index-list gives you ... and doing the full-scan on the referenced table itself is the better choice in that case)
Originally Posted by IliaPreston
4- And also, from a PRACTICAL perspective, when I use my best judgement, and try to make a logical analysis (regardless of whether Sqlite in its current version allows it or not), for
query 8 in your example, the index iA is the best index to use.
No, it isn't.
For <> based Operators on a Field in a Where-Clauses, the fastest way is not always to use the index
the fastest way to produce the result in this case, is often the FullTable-scan on the original table...
The QueryPlanner "usually knows better than you", whether the use of an index makes sense in a query or not.
(which was the reason why you got the error thrown at you in the first place).
As for "speeding up that query" ... do you have a concrete scenario you can share? ... as e.g.:
- does this Field of yours really contain "empty string-values" (for most of the records)?
- and how many records are there in that table?
Olaf
-
Mar 4th, 2021, 06:03 PM
#5
Thread Starter
Fanatic Member
Re: In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
Thanks for your help.
The INDEXED BY clause of SQLite is not a hinting mechanism and it should not be used as such. The INDEXED BY clause does not give the optimizer hints about which index to use; it gives the optimizer a requirement of which index to use.
If the query optimizer is unable to use the index specified by the INDEXED BY clause, then the query will fail with an error.
So, basically can I conclude from the above that when an indexed by is used and Sqlite does not throw an error, then Sqlite will for sure use that index no matter what (even if that index may even slow down the query)?
Probably the answer should be yes, but it would be better to double-check to make sure.
The QueryPlanner "usually knows better than you", whether the use of an index makes sense in a query or not.
Does that mean that in any and all situations, it would be better to NOT use the indexed by clause?
In what situations do you think the indexed by clause should be used?
As for "speeding up that query" ... do you have a concrete scenario you can share? ... as e.g.:
- does this Field of yours really contain "empty string-values" (for most of the records)?
- and how many records are there in that table?
That field is always trimmed and never null. In rare cases it might have a blank value, but in more than 95% of records it is not blank.
The number of records in the table is ususlly less than 10,000 records, but in rare cases (about 5% of the time) it could be more. But never more than 100,000 records.
Thanks for all your help and advice.
Ilia
-
Mar 4th, 2021, 07:29 PM
#6
Re: In vbRichClient5 sqlite, the "indexed by" clause sometimes gives a runtime error
Originally Posted by IliaPreston
If the query optimizer is unable to use the index specified by the INDEXED BY clause, then the query will fail with an error.
So, basically can I conclude from the above that when an indexed by is used and Sqlite does not throw an error, then Sqlite will for sure use that index no matter what (even if that index may even slow down the query)?
Yes - that's how I understand the Documentation.
Originally Posted by IliaPreston
The QueryPlanner " usually knows better than you", whether the use of an index makes sense in a query or not.
Does that mean that in any and all situations, it would be better to NOT use the indexed by clause?
Well, I wrote "usually" - the SQLite-QueryPlanner is written "lean and mean" - so there's (rare) cases, where the wrong optimization or index is choosen.
Originally Posted by IliaPreston
In what situations do you think the indexed by clause should be used?
I would use it only for testing-purposes (in the rare cases, when the QueryPlanner does not deliver optimal results).
Here's described what one can do before resorting to "indexed by", in case the QueryPlanner works suboptimal:
https://sqlite.org/queryplanner-ng.html#howtofix
Originally Posted by IliaPreston
That (indexed) field is always trimmed and never null. In rare cases it might have a blank value, but in more than 95% of records it is not blank.
And that's exactly the case, where a FullTableScan on the original Table is faster than using an Index
(for your use-case of "Filtering for Txt-Values, which are <> EmptyString").
The SQLite-QueryPlanner "knows that":
Code for RC6 version 6.0.4 (which comes with the newest SQLite-Engine).
Code:
Private Sub Form_Load()
With New_c.MemDB
.Exec "Create Table T(ID Integer Primary Key, Neg Integer, Txt Text Not Null Collate NoCase)"
Dim i As Long, S As String, Rs As cRecordset
For i = 1 To 100000
If i Mod 20 Then S = "Txt_" & i Else S = "" '95% of the Txt-Values will get a String <> ""
.ExecCmd "Insert Into T(Neg, Txt) Values(?,?)", -i, S
Next
.Exec "Create Index i_T_Txt On T(Txt)" 'now create an index on the Txt-Col of the filled table
New_c.Timing True 'the performance is suboptimal, when the index is enforced...
Set Rs = .GetRs("Select Count(*), Min(Neg) From T Indexed By i_T_Txt Where Txt <> ''")
Debug.Print "Forced Index: "; New_c.Timing, Rs(0), Rs(1)
New_c.Timing True 'and twice as good when leaving it to the QPlanner (which prefers a FullTableScan instead)
Set Rs = .GetRs("Select Count(*), Min(Neg) From T Where Txt <> ''")
Debug.Print "Avoided Index: "; New_c.Timing, Rs(0), Rs(1)
End With
End Sub
The above gives the following Debug-Outputs:
Code:
Forced Index: 15.33msec 95000 -99999
Avoided Index: 8.66msec 95000 -99999
HTH
Olaf
Last edited by Schmidt; Mar 4th, 2021 at 07:36 PM.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|