will it be faster if use database?
Hi,
I have an application that loads data from several folders. Each folder contains many CSV files. Basically, they are index files in this form:
index, value
index, value
....
The program loads the data from all the files (everything is held in variables of type List(Of Integer)) and then does some calculations. During those calculations, it needs to search the indexes to get some parameters.
It works fine. However, as time goes on, we have more and more index files. Sometimes the data under ONE folder can contain several million indexes.
Now the program runs slowly even on a PC with 8 GB of RAM (although the results are still correct), and we have more data coming in.
My question is: if I change this to use a SQL Express database, it will surely use much less memory, but what about performance? Will it run much faster? Or does it really depend, so that I have to build a working version to compare?
Is there a general rule, such as: once the data reaches a certain size, a database will be much better?
(In most cases, it will only use SELECT to query the table.)
Any suggestions?
Thanks, bear
Re: will it be faster if use database?
If you put this in a database and expect to select a million indexes at once, you are basically in trouble. You would have to change your queries to shorter ones that return fewer rows.
I guess a database would be faster if you currently open-read-close, open-read-close the CSV files. I say "I guess" rather than "I'm sure" because I have never used (and never will use) your approach.
Re: will it be faster if use database?
No, maybe you misunderstood. The data has several million rows, but each SELECT will return only one row, not millions of rows.
Re: will it be faster if use database?
Yes, I thought you were selecting a million rows at once.
So if you are saying that adding files degrades performance even when you select only one row, then you can easily run a test.
Insert one million test rows into a SQL Server table and pick one to see how much time it takes.
I'm guessing the time trouble you have is because the system, even when selecting from one file, still has to search through millions of entries. As I've said, I've never tried what you are doing.
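To give an idea of what such a test could look like, here is a rough sketch. It is only an illustration under some assumptions: it uses SQLite from Python's standard library as a stand-in for SQL Server, and the table name params, the columns idx and value, and both helper functions are made up for the example. Timings from an in-memory SQLite table will not match a real SQL Server instance, so treat it as a template for the experiment rather than a result.

```python
import random
import sqlite3
import time


def build_test_table(row_count=1_000_000):
    """Create an in-memory table of (idx, value) rows, like the CSV lines."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: INTEGER PRIMARY KEY makes idx the lookup key.
    conn.execute("CREATE TABLE params (idx INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO params (idx, value) VALUES (?, ?)",
        ((i, i * 2) for i in range(row_count)),  # made-up test values
    )
    conn.commit()
    return conn


def time_single_lookup(conn, idx):
    """Run one single-row SELECT and return (value, elapsed_seconds)."""
    start = time.perf_counter()
    row = conn.execute(
        "SELECT value FROM params WHERE idx = ?", (idx,)
    ).fetchone()
    elapsed = time.perf_counter() - start
    # row is None when the index does not exist in the table.
    return (row[0] if row else None), elapsed


if __name__ == "__main__":
    conn = build_test_table()
    target = random.randrange(1_000_000)
    value, elapsed = time_single_lookup(conn, target)
    print(f"idx={target} value={value} took {elapsed * 1e6:.0f} microseconds")
```

Running the lookup many times with different idx values and averaging would give a steadier number than a single measurement.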
Re: will it be faster if use database?
Setting indexes properly in the DB should provide a significant performance improvement over what you are doing now, which sounds like a brute-force search of non-indexed records. The test that Sapator suggested would be a good one to run, and a pretty quick one, but it is only valid if you take the effort to set up the indexes. I would guess that one way of searching non-indexed data performs much like any other, except that you might get some benefit from the DB because the searching all happens inside one optimized engine.
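To illustrate why the indexes matter for that test, here is a small sketch. Again it rests on assumptions: SQLite (via Python's standard library) stands in for whatever database is actually used, and the names params, idx, value, params_idx, and both functions are hypothetical. It asks the engine for its query plan before and after creating an index on the lookup column. SQL Server has its own tools for viewing execution plans, so this only shows the general idea.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan descriptions for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return "; ".join(row[-1] for row in rows)


def plans_before_and_after_index(row_count=100_000):
    """Compare the plan for a single-row lookup without and with an index."""
    conn = sqlite3.connect(":memory:")
    # No primary key and no index yet, so a lookup by idx has to scan rows.
    conn.execute("CREATE TABLE params (idx INTEGER, value INTEGER)")
    conn.executemany(
        "INSERT INTO params (idx, value) VALUES (?, ?)",
        ((i, i * 2) for i in range(row_count)),  # made-up test values
    )
    lookup = "SELECT value FROM params WHERE idx = ?"
    before = query_plan(conn, lookup, (42,))
    # Add the index on the column used in the WHERE clause.
    conn.execute("CREATE INDEX params_idx ON params (idx)")
    after = query_plan(conn, lookup, (42,))
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("without index:", before)
    print("with index:   ", after)
```

Without the index the plan should describe a scan of the whole table, and with it a search that uses params_idx. The exact wording of the plan text varies between SQLite versions.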