|
-
Apr 10th, 2013, 08:18 AM
#1
Thread Starter
Hyperactive Member
[RESOLVED] Sqlite and db locking? 2 programs, 1 writing, 1 reading...
I am using sqlite in my programs. I did some reading and noticed that sqlite supports writing to the db file from one place and other places can read from the same db file (if they attempt to write, the writing isn't executed).
So, with that in mind, I created this:
Code:
using (SqliteConnection sqlConn = new SqliteConnection(@"Data Source=" + pathgoeshere + ";Pooling=false;synchronous=0;temp_store=2;count_changes=0"))
{
//command
using (SqliteCommand command = sqlConn.CreateCommand())
{
command.CommandText = ""; //Creation of Tables
command.ExecuteNonQuery();
SqliteTransaction sqltrans = sqlConn.BeginTransaction();
command.Transaction = sqltrans;
//open StreamReader
using (StreamReader sr = new StreamReader(pathgoeshere + ".txt")))
{
while ((line = sr.ReadLine()) != null)
{
command.CommandText = ""; //Insert or Ignore, also updating if need to, from 10,000+ records to 2.1 million records (varies every time it runs)
command.ExecuteNonQuery();
}
}
sqltrans.Commit();
}
}
Then, in another program I have this (didn't even bother writing my query yet for it):
Code:
using(SqliteConnection con = new SqliteConnection(@"Data Source=" + pathgoeshere + ";Pooling=false;synchronous=0;temp_store=2;count_changes=0"))
{
if(con.State != System.Data.ConnectionState.Open)
con.Open();
using(SqliteCommand command = new SqliteCommand("SELECT Field1,Field2 FROM Table",con))
{
using(SqliteDataReader reader = command.ExecuteReader())
{
if(reader.HasRows)
{
}
}
}
}
The problem is the 2nd portion of code. Apparently anytime the first program is running through the sqlite code and the 2nd program attempts to run its sqlite code, the 2nd program crashes because the db file is locked. Maybe it's because of the transactions in the first code, but without it, the first code takes too long to run.
The 2nd part is also only going to attempt a simple select.
Does anybody have any suggestions on the best way to handle this?
-
Apr 10th, 2013, 01:56 PM
#2
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
If you don't mind getting "somewhat old" data, you can use NOLOCK to do "dirty reads".
SELECT Field1,Field2 FROM Table(nolock)
You might not get all the data. When the first portion is being executed you will only get records that have been committed. But at least your application will not throw an error.
-
Apr 11th, 2013, 07:33 AM
#3
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Thanks for the suggestion.
One thing my top (insert) code does is run on a thread. On disposing of the program, I stop the thread and call a join on it. I didn't include it in the post, apologies for that. Basically, the commit always will get ran, so the data won't really ever be rolled back.
A couple things to note.
In the connection strings of them, I tried adding "Journal Mode=WAL" and "journal_mode=WAL" but either didn't seem to make a difference.
In the second query (select), I tried after "FROM Table" adding variations of: "WITH NOLOCK", "WITH (NOLOCK)", "(WITH NOLOCK)", "(NOLOCK)", "NOLOCK", but it didn't seem to make a difference either.
This leaves me to believe then, that the problem lies in the connection strings (synchronous?, pooling?, etc?) or with the first transaction setup.
I'm thinking for a test, maybe I will comment out the transaction scope and let it run each query like normal and see how the 2 programs react together.
-
Apr 12th, 2013, 04:07 AM
#4
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
The second program shouldn't crash as a result of being locked out, it should simply wait until the lock clears then process it's select as normal. That makes me think you're heading up the wrong garden path. You haven't given any detail on the error message you're getting though so it's impossible for us to help further.
I'm troubled by this statement though:-
Maybe it's because of the transactions in the first code, but without it, the first code takes too long to run
I can't think of any reason why opening a transaction would cause your sql to run faster. That's not what it's for. The purpose of a transaction is to make sure that a bunch of operations complete in their entirety or not at all and to prevent other users from seeing some but not all of those operations if they interleave (the dirty reads serge referred to). By including the transaction you're locking any newly inserted records (and that may escalate to the table) which might slow down the second program but there's no reason it would speed up the first.
I think you've got a fundamental miss-understanding of transactions and it's probably causing you to trip over your own feet.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 12th, 2013, 06:46 AM
#5
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
I think you've got a fundamental miss-understanding of transactions and it's probably causing you to trip over your own feet.
It probably is and for that I aplogize. What I gathered is that each time I called command.ExecuteNonQuery() without a transaction it does everything needed to fully insert/update a record in the db. With a transaction, it inserts/updates a record, but isn't fully inserted/updated until a .Commit() is ran.
So, I just ran a test (on the first program):
1. Without my 3 lines of transaction code, it is still inserting/updating after 1 minute. This test is a total of 49,239 rows.
2. With my 3 lines of transaction code, it finishes inserting/updating in about 10 seconds.
-
Apr 12th, 2013, 07:10 AM
#6
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
So, I just ran a test (on the first program):
1. Without my 3 lines of transaction code, it is still inserting/updating after 1 minute. This test is a total of 49,239 rows.
2. With my 3 lines of transaction code, it finishes inserting/updating in about 10 seconds.
None the less, a transaction is the wrong solution because it's not intended as a tool to improve performance. As such it's having unwanted side affects... it's doing what it's meant to do, not what you want it to do. It might be making your inserts go quicker but I sincerely doubt the data is actually making it into the datatables any quicker. In fact it's probably slower. What's probably happening is the operations are making it into the transaction log quicker so it will appear to you to be quicker. However, those operation don't move data into the tables until your commit fully completes and that's going to require exactly the same processing as it would have done if you didn't have the transaction in the first place.
The unwanted side effect is that it's locking your table for the whole duraction of the operation which is preventing your second program from running properly... possibly. I can't tell for sure if a locked table is the problem because you still haven't said what error you're getting. Your crash could (and should) be completely unrelated because the standard behaviour when encoutering a lock is not to crash, it's to wait. Perhaps your commit is taking a long time, leaving your resources locked and causing a timeout but that's just a guess. What is the error you're getting?
I'm particularly curious about what you said here:-
One thing my top (insert) code does is run on a thread. On disposing of the program, I stop the thread and call a join on it. I didn't include it in the post, apologies for that. Basically, the commit always will get ran, so the data won't really ever be rolled back.
what do you mean by that. Is the code you posted what's actually in use or have you deferred the commit until the program is disposed? Because that would cause the problem I described above.
Why do you care if your inserts take a minute? If each is atomically correct then they can take as long as they like. Each is an implicit transaction which get's immediately committed, releasing any resources and pereventing the conflicts you're (possibly) experiencing in program 2. If you really do need faster performance on the inserts then you probably want some form of bulk insert. I'm not sure what tools come with SQLLite but I'd suggest googling SQLLite Bulk Insert to see if you can find anything.
Finally, If you want any real help you're going to have to post your code. Your REAL code, including sql statements. Because there's also a chance that you're doign something else wrong in your SQL that's having an effect. E.g. The fact that your inserting updating or ignoring implies you're probably reading from the table to decide what to do... which would put a read lock on it...
edit>Here you go, a link on how to bulk insert (or import) into SQLLite:-
http://stackoverflow.com/questions/6...ta-into-sqlite
Last edited by FunkyDexter; Apr 12th, 2013 at 07:19 AM.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 12th, 2013, 08:19 AM
#7
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Funky, I really thank you for the taking the time to explain all of that.
The exact error I get in program 2, I'm not sure if you mean what gets reported in the debugger? Running program 2 through VS results in the program entering debugging mode with a green debug line (not yellow) that says "Database file is locked" or "Cannot access a locked database file" or something like that. I'll get the exact message when I get home later tonight.
As for the other infomration that might be helpful:
Ok, so here is the entire processing method. This is the only code in program 1 that touches sqlite. This entire method is started on a thread. If this thread is still running when Dispose is called, the variable Running is set to false, breaks out of the while loop for inserting/updating. Either when the while loop is done or broken out of, the transaction is commited.
*Please note I am using other libraries and other functions as well*
The only part of my code I changed is the connection string path and streamreader paths. Also, some of the lines in the file are different lengths, which is why I have a line checking the .Length of the split array being 19. Because of this, I don't think BULK inserts/imports could work.
Code:
private void Processing()
{
using (SqliteConnection sqlConn = new SqliteConnection(@"Data Source=" + path + ";Pooling=false;synchronous=0;temp_store=2;count_changes=0;journal_mode=WAL"))
{
//command
using (SqliteCommand command = sqlConn.CreateCommand())
{
command.CommandText = "CREATE TABLE IF NOT EXISTS [GeoNames]([GeoNameID] INTEGER PRIMARY KEY, [Name] TEXT, [ASCIIName] TEXT, [AlternateNames] TEXT, [Latitude] TEXT, [Longitude] TEXT, [Feature Class] TEXT, [Feature Code] TEXT, [Country Code] TEXT, [CC2] TEXT, [Admin1 Code] TEXT, [Admin2 Code] TEXT, [Admin3 Code] TEXT, [Admin4 Code] TEXT, [Population] TEXT, [Elevation] TEXT, [DEM] TEXT, [TimeZone] TEXT, [Modification Date] TEXT);";
if (sqlConn.State != System.Data.ConnectionState.Open)
sqlConn.Open();
command.ExecuteNonQuery();
//stream the contents of the file...
string newname = Misc.CountryCodeNames(Name.Substring(0, Name.LastIndexOf(".")));
if (newname == "")
newname = Name.Substring(0, Name.LastIndexOf("."));
bool eof = false;
string line;
string[] linesplit = new string[19];
int i = 0;
//get total row count
int totalLineCount = 0;
using (StreamReader sr = new StreamReader(path), Name.Substring(0, Name.LastIndexOf(".")) + ".txt")))
{
while (sr.ReadLine() != null)
{
totalLineCount += 1;
}
}
using (StreamReader sr = new StreamReader(path), Name.Substring(0, Name.LastIndexOf(".")) + ".txt")))
{
SqliteTransaction sqltrans = sqlConn.BeginTransaction();
command.Transaction = sqltrans;
while ((line = sr.ReadLine()) != null)
{
if (sr.EndOfStream)
eof = true;
if (i >= ProcessNumber)
{
linesplit = line.Split('\t');
if (linesplit.Length == 19)
{
if (Running)
{
ProcessNumber = i;
command.CommandText = @"INSERT OR IGNORE INTO [GeoNames] VALUES ('" + General.escape(linesplit[0]) + "','" + General.escape(linesplit[1]) + "','" + General.escape(linesplit[2]) + "','" + General.escape(linesplit[3]) + "','" + General.escape(linesplit[4]) + "','" + General.escape(linesplit[5]) + "','" + General.escape(linesplit[6]) + "','" + General.escape(linesplit[7]) + "','" + General.escape(linesplit[8]) + "','" + General.escape(linesplit[9]) + "','" + General.escape(linesplit[10]) + "','" + General.escape(linesplit[11]) + "','" + General.escape(linesplit[12]) + "','" + General.escape(linesplit[13]) + "','" + General.escape(linesplit[14]) + "','" + General.escape(linesplit[15]) + "','" + General.escape(linesplit[16]) + "','" + General.escape(linesplit[17]) + "','" + General.escape(linesplit[18]) + "');" + Environment.NewLine +
"UPDATE [GeoNames] SET [Name]='" + General.escape(linesplit[1]) + "', [ASCIIName]='" + General.escape(linesplit[2]) + "', [AlternateNames]='" + General.escape(linesplit[3]) + "',[Latitude]='" + General.escape(linesplit[4]) + "', [Longitude]='" + General.escape(linesplit[5]) + "', [Feature Class]='" + General.escape(linesplit[6]) + "', [Feature Code]='" + General.escape(linesplit[7]) + "', [Country Code]='" + General.escape(linesplit[8]) + "', [CC2]='" + General.escape(linesplit[9]) + "', [Admin1 Code]='" + General.escape(linesplit[10]) + "', [Admin2 Code]='" + General.escape(linesplit[11]) + "', [Admin3 Code]='" + General.escape(linesplit[12]) + "', [Admin4 Code]='" + General.escape(linesplit[13]) + "', [Population]='" + General.escape(linesplit[14]) + "', [Elevation]='" + General.escape(linesplit[15]) + "', [DEM]='" + General.escape(linesplit[16]) + "', [TimeZone]='" + General.escape(linesplit[17]) + "', [Modification Date]='" + General.escape(linesplit[18]) + "' WHERE changes() = 0 AND GeoNameID='" + linesplit[0] + "';";
command.ExecuteNonQuery();
ProcessingNotification.Header = "Processing: " + newname;
ProcessingNotification.Text = "( " + i.ToString() + " / " + totalLineCount + " )";
}
else
break;
}
}
else
{
ProcessingNotification.Text = "( Attemping To Continue, Please Wait )";
}
i += 1;
}
sqltrans.Commit();
//if here, done with processing file
if (eof)
{
Running = false;
eof = false;
if (ProcessingFinished != null)
ProcessingFinished();
}
}
}
}
}
The reason I cared about the timing of how long it takes is that there could anywhere from 10,000 records to 2.1mil+ records per text file. There could be a total of 200+ text files.
-
Apr 12th, 2013, 08:51 AM
#8
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
At a cursory glance there doesn't appear to be anything wrong with that but it's difficult to be sure because you're insert doesn't specify the columns so it's hard to see what data is going where and how it corresponds to the update. also Insert Or Ignore isn't Ansii standardso I can't speak for it's performance or effects. I can't find a proper explanation of it on the SQL Lite website either but if they're supporting it you'd expect it to work OK.
If I'm reading that correctly you're pulling a bunch of data out a file and "laying it out" as a record which you then insert if it's not going to break any unique constraints. You're then updating the table but I can't see why. Are you marking the records that have changed perhaps?
Here's an alternative suggestion, use the import command I linked to to dump all of the data into a staging table. This should be pretty quick as that's what it's designed to do. Once it's in the staging table you can write a single insert statement, based on a select from the staging table, to write any records that aren't in the main table. something like this:-
Code:
Insert Into geoName (PrimaryKey, Field1, Field2, Field3...)
Select PrimaryKey, Field1, Field2, Field3...
From StagingTable
Left Join GeoName
on StagingTable.PrimaryKey = GeoName.PrimaryKey
Where GeoName.PrimaryKey is null
Because the Select will only return records that don't already exist in GeoNames it will only insert new records, fulfilling the need for an Insert Or Ignore. Alternatively you might be able to use the Insert Or Ignore itself based on a select. Something like:-
Code:
Insert Or Ignore Into geoName (PrimaryKey, Field1, Field2, Field3...)
Select PrimaryKey, Field1, Field2, Field3...
From StagingTable
Which would probably be easier but I'm not familiar enough with it's syntax to know for sure that'll work.
The key point here is that the entire insert is now a single operation which means it's completely contained within a single implicit transaction. There's no longer any need for you to manage the transactions yourself. It'll also be MUCH quicker than looping through the record set throwing millions of inserts at the DB because that requires it to repeatedly do all it's handshaking etc between your app and the DB.
You should be able to do something similar to satisfy the Update. Unfortunately SQLLite doesn't support Update FROM which is what you really want but you can grab values from sub queries. Here's an example from anouther forum:-
http://stackoverflow.com/questions/3...ate-in-sqlite3
Again though, you want to make it a single update instead of a bunch of individual ones if you possibly can. It's not the locking that's degrading your performance it's the looping and the constant aquiring and releasing of loocks. You only really need to do any of that once.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 12th, 2013, 09:48 AM
#9
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
insert doesn't specify the columns so it's hard to see what data is going where and how it corresponds to the update
The idea behind that is that every field gets a value if an insert happens. The primary key is not an auto increment on my end, it's the primary key/id of the provided lines of data.
If I'm reading that correctly you're pulling a bunch of data out a file and "laying it out" as a record which you then insert if it's not going to break any unique constraints. You're then updating the table but I can't see why. Are you marking the records that have changed perhaps?
Yes, you're reading that correctly. The INSERT OR IGNORE is supposed to insert the record if the primary key (GeoNameID) doesn't exist. If the primary key does exist, nothing is inserted. The second UPDATE line is supposed to only update when no insert from above has happend via this: WHERE changes() = 0.
I'll work at using the .import feature of sqlite into a stagingtable, then I'll perform the updates and inserts based on selects and joins.
The text file I have has different lengths of data on some of the lines, so I wonder if the .import feature will NULL out the non-existant fields for those lines. About 95% of the files have the full amount of fields per line though. It wouldn't be hard to copy to a new text file while omitting the shorter lines and import from that.
Thanks Funky for your help so far. I'll report back later tonight when I test it all out.
-
Apr 12th, 2013, 10:12 AM
#10
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
The idea behind that is that every field gets a value if an insert happens. The primary key is not an auto increment on my end, it's the primary key/id of the provided lines of data.
Heading slightly off topic here but it's a good idea to specify them anyway. It just makes for more robust code. That wasn't really the point of me saying it though, I just want to highlight that it made it difficult for me to tell what information was going into what fields so I couldn't give a "proper" insert statement in the answer.
The second UPDATE line is supposed to only update when no insert from above has happend via this: WHERE changes() = 0.
That will be slightly harder to do using a staging table because you won't be able to leverage that changes function. There are two basic aproaches I couls suggest.
1. Do the insert then update everything. Any record that you previously inserted will get updated but that won't matter because it will just get updated to the same values it already has. or
2. Do the update first. Because the records you're going to insert won't exist in the geoNames table they won't get updated (obviously) and then the insert will still just work againt any record that aren't in the table to start with.
Of the two I think I'd recommend option 2 becaue it allows the DB to do as little work as possible.
It wouldn't be hard to copy to a new text file while omitting the shorter lines and import from that.
That's probably a better aproach than trying to clean it up once it's in the database. It's OK to clean things up in a staging table if you're sitting there supervising it. If you're trying to automate an import it's much better to sanitise the source of the data before you let it into the database. The point is to touch the database as lightly as possible because that's your point of contension.
I was thinking about the "database is locked" issue you're getting in program 2. That might just be the timeout error in which case speeding up the import is likely to resolve it because the wait time will probably be sufficient to allow the import to complete before Program 2's request times out. But assuming that SQLLite just immediately throws an error (which would be RUBBISH functionality) then you can trivially code your own simple timeout in program 2 itself. Just trap the error, wait e.g. 10 seconds and retry the query. You'd probably want to limit the number of time it tries to make sure you don't get stuck in an endless loop.
Last edited by FunkyDexter; Apr 12th, 2013 at 10:18 AM.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 12th, 2013, 12:21 PM
#11
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Another thing to think about is your transaction log. If you insert 2mil records in a loop your transaction log file gets bigger and bigger. The bigger the transaction log file, the slower the database operations are. If your inserts are a 1 time deal, it fine and you can backup transaction log with truncate after your operation, if it's a constant thing then I would schedule a transaction log backup with truncate.
-
Apr 12th, 2013, 06:36 PM
#12
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
While doing some googling for sqlite bulk insert, I came across multiple threads/examples that say a parameterized query can do about 100k inserts a second... However, I couldn't get it to work any faster, including using just an insert or ignore without the update line. I'll leave the code I tried here just in case.
For now, I will be trying the sqlite import feature Funky mentioned to see how that reacts when I spare some more time.
Code:
private void Processing()
{
//connection
using (SqliteConnection sqlConn = new SqliteConnection(@"Data Source=" + OpenMobile.Path.Combine(OM.Host.DataPath, "OMGPS") + ";Pooling=false;synchronous=0;temp_store=2;count_changes=0;journal_mode=WAL"))
{
if (sqlConn.State != System.Data.ConnectionState.Open)
sqlConn.Open();
//transaction
using (SqliteTransaction sqltrans = sqlConn.BeginTransaction())
{
//command
using (SqliteCommand command = sqlConn.CreateCommand())
{
command.CommandText = "CREATE TABLE IF NOT EXISTS [GeoNames]([GeoNameID] INTEGER PRIMARY KEY, [Name] TEXT, [ASCIIName] TEXT, [AlternateNames] TEXT, [Latitude] TEXT, [Longitude] TEXT, [Feature Class] TEXT, [Feature Code] TEXT, [Country Code] TEXT, [CC2] TEXT, [Admin1 Code] TEXT, [Admin2 Code] TEXT, [Admin3 Code] TEXT, [Admin4 Code] TEXT, [Population] TEXT, [Elevation] TEXT, [DEM] TEXT, [TimeZone] TEXT, [Modification Date] TEXT);";
command.ExecuteNonQuery();
//stream the contents of the file...
string newname = Misc.CountryCodeNames(Name.Substring(0, Name.LastIndexOf(".")));
if (newname == "")
newname = Name.Substring(0, Name.LastIndexOf("."));
bool eof = false;
string line;
string[] linesplit = new string[19];
int i = 0;
//get total row count
int totalLineCount = 0;
using (StreamReader sr = new StreamReader(OpenMobile.Path.Combine(OpenMobile.Path.Combine(OM.Host.PluginPath, "OMGPS", "Process"), Name.Substring(0, Name.LastIndexOf(".")) + ".txt")))
{
while (sr.ReadLine() != null)
{
totalLineCount += 1;
}
}
command.Transaction = sqltrans;
using (StreamReader sr = new StreamReader(OpenMobile.Path.Combine(OpenMobile.Path.Combine(OM.Host.PluginPath, "OMGPS", "Process"), Name.Substring(0, Name.LastIndexOf(".")) + ".txt")))
{
//command.CommandText = @"INSERT OR IGNORE INTO [GeoNames] VALUES (@p0,@p1,@p2,@p3,@p4,@p5,@p6,@p7,@p8,@p9,@p10,@p11,@p12,@p13,@p14,@p15,@p16,@p17,@p18);" + Environment.NewLine +
// "UPDATE [GeoNames] SET [Name]=@p1, [ASCIIName]=@p2, [AlternateNames]=@p3,[Latitude]=@p4, [Longitude]=@p5, [Feature Class]=@p6, [Feature Code]=@p7, [Country Code]=@p8, [CC2]=@p9, [Admin1 Code]=@p10, [Admin2 Code]=@p11, [Admin3 Code]=@p12, [Admin4 Code]=@p13, [Population]=@p14, [Elevation]=@p15, [DEM]=@p16, [TimeZone]=@p17, [Modification Date]=@p18 WHERE changes() = 0 AND GeoNameID=@p0;";
command.CommandText = @"INSERT OR IGNORE INTO [GeoNames] VALUES (@p0,@p1,@p2,@p3,@p4,@p5,@p6,@p7,@p8,@p9,@p10,@p11,@p12,@p13,@p14,@p15,@p16,@p17,@p18);";
SqliteParameter p0 = new SqliteParameter("@p0");
SqliteParameter p1 = new SqliteParameter("@p1");
SqliteParameter p2 = new SqliteParameter("@p2");
SqliteParameter p3 = new SqliteParameter("@p3");
SqliteParameter p4 = new SqliteParameter("@p4");
SqliteParameter p5 = new SqliteParameter("@p5");
SqliteParameter p6 = new SqliteParameter("@p6");
SqliteParameter p7 = new SqliteParameter("@p7");
SqliteParameter p8 = new SqliteParameter("@p8");
SqliteParameter p9 = new SqliteParameter("@p9");
SqliteParameter p10 = new SqliteParameter("@p10");
SqliteParameter p11 = new SqliteParameter("@p11");
SqliteParameter p12 = new SqliteParameter("@p12");
SqliteParameter p13 = new SqliteParameter("@p13");
SqliteParameter p14 = new SqliteParameter("@p14");
SqliteParameter p15 = new SqliteParameter("@p15");
SqliteParameter p16 = new SqliteParameter("@p16");
SqliteParameter p17 = new SqliteParameter("@p17");
SqliteParameter p18 = new SqliteParameter("@p18");
command.Parameters.Add(p0);
command.Parameters.Add(p1);
command.Parameters.Add(p2);
command.Parameters.Add(p3);
command.Parameters.Add(p4);
command.Parameters.Add(p5);
command.Parameters.Add(p6);
command.Parameters.Add(p7);
command.Parameters.Add(p8);
command.Parameters.Add(p9);
command.Parameters.Add(p10);
command.Parameters.Add(p11);
command.Parameters.Add(p12);
command.Parameters.Add(p13);
command.Parameters.Add(p14);
command.Parameters.Add(p15);
command.Parameters.Add(p16);
command.Parameters.Add(p17);
command.Parameters.Add(p18);
while ((line = sr.ReadLine()) != null)
{
if (sr.EndOfStream)
eof = true;
if (i >= ProcessNumber)
{
linesplit = line.Split('\t');
if (linesplit.Length == 19)
{
if (Running)
{
ProcessNumber = i;
p0.Value = General.escape(linesplit[0]);
p1.Value = General.escape(linesplit[1]);
p2.Value = General.escape(linesplit[2]);
p3.Value = General.escape(linesplit[3]);
p4.Value = General.escape(linesplit[4]);
p5.Value = General.escape(linesplit[5]);
p6.Value = General.escape(linesplit[6]);
p7.Value = General.escape(linesplit[7]);
p8.Value = General.escape(linesplit[8]);
p9.Value = General.escape(linesplit[9]);
p10.Value = General.escape(linesplit[10]);
p11.Value = General.escape(linesplit[11]);
p12.Value = General.escape(linesplit[12]);
p13.Value = General.escape(linesplit[13]);
p14.Value = General.escape(linesplit[14]);
p15.Value = General.escape(linesplit[15]);
p16.Value = General.escape(linesplit[16]);
p17.Value = General.escape(linesplit[17]);
p18.Value = General.escape(linesplit[18]);
command.ExecuteNonQuery();
command.Parameters.Clear();
OMGPS.ProcessingNotification.Header = "Processing: " + newname;
OMGPS.ProcessingNotification.Text = "( " + i.ToString() + " / " + totalLineCount + " )";
}
else
break;
}
}
else
{
OMGPS.ProcessingNotification.Text = "( Attemping To Continue, Please Wait )";
}
i += 1;
}
sqltrans.Commit();
//if here, done with processing file
if (eof)
{
//IsProcessing = false;
Running = false;
eof = false;
if (ProcessingFinished != null)
ProcessingFinished();
}
}
}
}
}
}
-
Apr 13th, 2013, 03:38 AM
#13
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
a parameterized query can do about 100k inserts a second
Parametised queries are better than nuilding up a query string for a number of reasons. They're more readable, they're less prone to injection attacks and, in SQLServer at least, they do perform a bit bit better because the DBMS can more easily recognise them and use a pre-cached execution plan... I have no idea whether that applies to SQLLite though. Even so, you'd still be better off doing a single insert that inserts all the records at once. It removes all the handshaking stuff that goes on with each individual query.
I'm never quite sure I trust figures that say "this aproach will achieve x number of records per second". Performance is too highly dependent on your hardware to draw simple conclusions like that.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 25th, 2013, 08:11 AM
#14
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Bringing this thread back again as I've been away from the project for a bit.
Anyways, as it currently stands, I wrote code to do a 2 database approach. I now have a [GeoNames-Staging] db and a [GeoNames] db. I originally did this b/c it would allow the main db to be open for reading until it updated from the staging db.
I found out that the sqlite dll I have is made to be compatible with Mono and therefore isn't the latest. It doesn't support the "journal_mode=WAL" that I tried before. Also, reading on the import command, I'm not sure I can really use that either. I don't have an install of sqlite, just the library file. Since it could be in different locations on my machine, let alone others machines, I'm not sure I can start a command line and navigate to where the dll is...
With that in mind, I set out to write the update and insert queries based on attaching the staging db.
The INSERT query is fully functional. The UPDATE query is having problems with an incorrect syntax near ".". I'm going to post both of those queries here just in case someone with a fresh eye might be able to see the simplest mistake I can't seem to see right now.
Code:
//Attaching, works...
command1.CommandText = "ATTACH DATABASE '" + path+ "OMGPS-Staging" + "' AS [db1]";
command1.ExecuteNonQuery();
//Non-working..
command1.CommandText = "UPDATE GeoNames SET [Name] = db1.[GeoNames-Staging].[Name-Staging], [ASCIIName] = db1.[GeoNames-Staging].[ASCIIName-Staging], [AlternateNames] = db1.[GeoNames-Staging].[AlternateNames-Staging], [Latitude]=db1.[GeoNames-Staging].[Latitude-Staging], [Longitude] = db1.[GeoNames-Staging].[Longitude-Staging], [Feature Class] = db1.[GeoNames-Staging].[Feature Class-Staging], [Feature Code] = db1.[GeoNames-Staging].[Feature Code-Staging], [Country Code] = db1.[GeoNames-Staging].[Country Code-Staging], [CC2] = db1.[GeoNames-Staging].[CC2-Staging], [Admin1 Code] = db1.[GeoNames-Staging].[Admin1 Code-Staging], [Admin2 Code] = db1.[GeoNames-Staging].[Admin2 Code-Staging], [Admin3 Code] = db1.[GeoNames-Staging].[Admin3 Code-Staging], [Admin4 Code] = db1.[GeoNames-Staging].[Admin4 Code-Staging], [Population] = db1.[GeoNames-Staging].[Population-Staging], [Elevation] = db1.[GeoNames-Staging].[Elevation-Staging], [DEM] = db1.[GeoNames-Staging].[DEM-Staging], [TimeZone] = db1.[GeoNames-Staging].[TimeZone-Staging], [Modification Date] = db1.[GeoNames-Staging].[Modification Date-Staging] WHERE GeoNameID = db1.[GeoNames-Staging].[GeoNameID-Staging]";
//command1.ExecuteNonQuery();
//Working...
command1.CommandText = "INSERT INTO GeoNames (GeoNameID, Name, ASCIIName, AlternateNames, Latitude, Longitude, [Feature Class], [Feature Code], [Country Code], [CC2], [Admin1 Code], [Admin2 Code], [Admin3 Code], [Admin4 Code], Population, Elevation, DEM, TimeZone, [Modification Date]) SELECT db1.[GeoNames-Staging].[GeoNameID-Staging], db1.[GeoNames-Staging].[Name-Staging], db1.[GeoNames-Staging].[ASCIIName-Staging], db1.[GeoNames-Staging].[AlternateNames-Staging], db1.[GeoNames-Staging].[Latitude-Staging], db1.[GeoNames-Staging].[Longitude-Staging], db1.[GeoNames-Staging].[Feature Class-Staging], db1.[GeoNames-Staging].[Feature Code-Staging], db1.[GeoNames-Staging].[Country Code-Staging], db1.[GeoNames-Staging].[CC2-Staging], db1.[GeoNames-Staging].[Admin1 Code-Staging], db1.[GeoNames-Staging].[Admin2 Code-Staging], db1.[GeoNames-Staging].[Admin3 Code-Staging], db1.[GeoNames-Staging].[Admin4 Code-Staging], db1.[GeoNames-Staging].[Population-Staging], db1.[GeoNames-Staging].[Elevation-Staging], db1.[GeoNames-Staging].[DEM-Staging], db1.[GeoNames-Staging].[TimeZone-Staging], db1.[GeoNames-Staging].[Modification Date-Staging] FROM db1.[GeoNames-Staging] LEFT JOIN GeoNames ON db1.[GeoNames-Staging].[GeoNameID-Staging] = GeoNames.GeoNameID WHERE GeoNames.GeoNameID IS NULL";
command1.ExecuteNonQuery();
Parametised queries are better than nuilding up a query string for a number of reasons. They're more readable, they're less prone to injection attacks and, in SQLServer at least, they do perform a bit bit better because the DBMS can more easily recognise them and use a pre-cached execution plan... I have no idea whether that applies to SQLLite though.
From what I've been reading, parameterized queries are the way to go regardless of which SQL version being used. I've changed all my beginning code to use them and I don't see an increase nor a decrease. I think my bottleneck might actually stem from all the disk reads of the StreamReader I'm using. The text files "could" be too big so I can't read everything into memory.
My 2 db approach is working well enough. Once I can pinpoint the UPDATE query fixes I'll be good and I'll mark the thread resolved. Thanks for all the help so far guys!
-
Apr 25th, 2013, 09:11 AM
#15
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
From what I've been reading, parameterized queries are the way to go regardless of which SQL version being used.
I'd second that. You may or may not see a perfomance increase but they're more readable and less vulnerable to clever hacking techniques so there's really no good argument not to use them. The main objective I was pushing you towards before was handling the whole insert as a single command which you've done
Not sure I understand why you can't use the import command but a staging db is always a good idea for imports so if that's working for you, go with it.
I can't see anything obviously wrong with your update but it's very hard to read in one big lump as you got it at the moment. My suggestion would be to copy just the sql part (ie not the: command1.CommandText = "" bit) and lay it out in a more readablae format. My preference is something like this:-
Code:
Update MyDestinationTable
Set Field1 = MySourceTable.Field1,
Field2 = MySourceTable.Field2,
...
From MyDestinationTable
join MySourceTable
on MyDestinationTable.PrimaryKey = MySourceTable.PrimaryKey
It's quite likely that you'll spot the problem yourself as you lay it out but, if you don't, it'll certainly make it easier for the rest of us to spot it.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 25th, 2013, 10:21 AM
#16
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
I'll have to have another round of testing it tonight, but here's the formatted code:
Code:
UPDATE GeoNames
SET [Name] = db1.[GeoNames-Staging].[Name-Staging],
[ASCIIName] = db1.[GeoNames-Staging].[ASCIIName-Staging],
[AlternateNames] = db1.[GeoNames-Staging].[AlternateNames-Staging],
[Latitude]=db1.[GeoNames-Staging].[Latitude-Staging],
[Longitude] = db1.[GeoNames-Staging].[Longitude-Staging],
[Feature Class] = db1.[GeoNames-Staging].[Feature Class-Staging],
[Feature Code] = db1.[GeoNames-Staging].[Feature Code-Staging],
[Country Code] = db1.[GeoNames-Staging].[Country Code-Staging],
[CC2] = db1.[GeoNames-Staging].[CC2-Staging],
[Admin1 Code] = db1.[GeoNames-Staging].[Admin1 Code-Staging],
[Admin2 Code] = db1.[GeoNames-Staging].[Admin2 Code-Staging],
[Admin3 Code] = db1.[GeoNames-Staging].[Admin3 Code-Staging],
[Admin4 Code] = db1.[GeoNames-Staging].[Admin4 Code-Staging],
[Population] = db1.[GeoNames-Staging].[Population-Staging],
[Elevation] = db1.[GeoNames-Staging].[Elevation-Staging],
[DEM] = db1.[GeoNames-Staging].[DEM-Staging],
[TimeZone] = db1.[GeoNames-Staging].[TimeZone-Staging],
[Modification Date] = db1.[GeoNames-Staging].[Modification Date-Staging]
WHERE GeoNameID = db1.[GeoNames-Staging].[GeoNameID-Staging]
INSERT INTO GeoNames
(GeoNameID,
Name,
ASCIIName,
AlternateNames,
Latitude,
Longitude,
[Feature Class],
[Feature Code],
[Country Code],
[CC2],
[Admin1 Code],
[Admin2 Code],
[Admin3 Code],
[Admin4 Code],
Population,
Elevation,
DEM,
TimeZone,
[Modification Date])
SELECT db1.[GeoNames-Staging].[GeoNameID-Staging],
db1.[GeoNames-Staging].[Name-Staging],
db1.[GeoNames-Staging].[ASCIIName-Staging],
db1.[GeoNames-Staging].[AlternateNames-Staging],
db1.[GeoNames-Staging].[Latitude-Staging],
db1.[GeoNames-Staging].[Longitude-Staging],
db1.[GeoNames-Staging].[Feature Class-Staging],
db1.[GeoNames-Staging].[Feature Code-Staging],
db1.[GeoNames-Staging].[Country Code-Staging],
db1.[GeoNames-Staging].[CC2-Staging],
db1.[GeoNames-Staging].[Admin1 Code-Staging],
db1.[GeoNames-Staging].[Admin2 Code-Staging],
db1.[GeoNames-Staging].[Admin3 Code-Staging],
db1.[GeoNames-Staging].[Admin4 Code-Staging],
db1.[GeoNames-Staging].[Population-Staging],
db1.[GeoNames-Staging].[Elevation-Staging],
db1.[GeoNames-Staging].[DEM-Staging],
db1.[GeoNames-Staging].[TimeZone-Staging],
db1.[GeoNames-Staging].[Modification Date-Staging]
FROM db1.[GeoNames-Staging]
LEFT JOIN GeoNames ON db1.[GeoNames-Staging].[GeoNameID-Staging] = GeoNames.GeoNameID WHERE GeoNames.GeoNameID IS NULL
-
Apr 25th, 2013, 04:19 PM
#17
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
This is the error I just got when running it as is, regardless of the Update coming before or after the Insert.
Code:
no such column: db1.GeoNames-Staging.Name-Staging
Yet, the Insert isn't causing any issues and I can even view the data using a sqlite browser. I can't see any differences in the naming of the columns.
So, I tried taking out the first column (Name to be updated from Name-Staging) and the next column (ASCIIName -> ASCIIName-Staging) is bringing up the same exact error. I'm unsure why the Insert would be working with the same column structure/code.
-
Apr 26th, 2013, 02:49 AM
#18
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Look closely at your Update query, there's nothing in there that introduces the db1.[GeoNames-Staging] table. You're trying to reference it's fields but you've never said what the table is. Your insert statement is different because you select from the tables (look at your join). Normally, when you want to update one table with values from a join, you use an Update From Join syntax. Unfortunately a quick google says that SQLLite doesn't support it.
I think your only option is to use sub-queries. Something like:-
Code:
UPDATE GeoNames SET
[Name] = (Select Name From db1.[GeoNames-Staging] where db1.[GeoNames-Staging].GeoNameID = GeoNames.GeoNameID),
[ASCIIName] = (Select ASCIIName From db1.[GeoNames-Staging] where db1.[GeoNames-Staging].GeoNameID = GeoNames.GeoNameID),
...
It's not pretty but it should work.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Apr 26th, 2013, 07:19 AM
#19
Thread Starter
Hyperactive Member
Re: Sqlite and db locking? 2 programs, 1 writing, 1 reading...
Funky, I can't thank you enough! I can't believe it's working! The sub queries are fully working and updating the main database.
My code now resides as a paramterized query for fulfilling a 2nd sqlite database file, and when that is done, I update/insert into the 1st database file using the new sub queries for updating and inserting where the ID's don't exist. The 2nd program queries simple selects from the 1st database file, so in theory/testing I am not experiencing locks that crash the programs.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|