How to retrieve data from large files in database applications?
I have about 25 text files containing ~15,000 names/descriptions in this format:
1234 Anthony Hopkins :
Multiline or multipage description.
5678 Leonardo da Vinci :
Multiline or multipage description.
Let's say that I could extract the names/serial numbers and put them in a database, where each record has these attributes:
idnum, name, file_name
What is the best way to retrieve the description for a given name?
Do I store the start/end character positions of the description, so that I need two additional attributes like sPos and ePos?
Re: How to retrieve data from large files in database applications?
How much freedom do you have to manipulate these files? The most obvious solution would be to move the descriptions into the database itself. After all, that would allow you to get the data most quickly, without resorting to offsets.
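As a rough sketch of what moving the descriptions into the database could look like (Python with SQLite is used here purely for illustration; the table name `entries` and the helper names are assumptions, not anything from the original files or application):

```python
import sqlite3

def create_store(conn):
    # Hypothetical schema: one row per person, description kept as plain text.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "idnum INTEGER PRIMARY KEY, name TEXT NOT NULL, description TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_entries_name ON entries (name)")

def add_entry(conn, idnum, name, description):
    # Parameterized insert; the description may span many lines.
    conn.execute(
        "INSERT INTO entries (idnum, name, description) VALUES (?, ?, ?)",
        (idnum, name, description),
    )

def get_description(conn, name):
    # Returns the description for an exact name match, or None if absent.
    row = conn.execute(
        "SELECT description FROM entries WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

With this layout a lookup is a single query on the name, and no offsets into the text files have to be tracked.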
If you can't alter the files, or have to maintain the data in the files, then storing a starting line would be convenient, but ONLY if you can be certain that the file will NEVER be changed. If any change is made anywhere in the file, every offset stored in the database for text after the change may become wrong. This makes the use of saved offsets, such as you suggest, fairly problematic.
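If the files really are frozen, the saved-offset idea might look something like the sketch below. It is in Python rather than VB.NET, and it assumes every record starts with a header line of the form `1234 Anthony Hopkins :` and that no description line happens to look like a header. The function names are made up for illustration; `s_pos` and `e_pos` correspond to the sPos/ePos attributes from the question, measured in bytes.

```python
import re

# Assumed header shape: digits, a name, then a trailing colon.
HEADER = re.compile(rb"^(\d+)\s+(.+?)\s*:\s*$")

def build_index(path):
    """Return a list of (idnum, name, s_pos, e_pos) byte offsets for one file."""
    entries = []
    offset = 0
    with open(path, "rb") as f:  # binary mode keeps byte offsets exact
        for line in f:
            m = HEADER.match(line.rstrip(b"\r\n"))
            if m:
                if entries:
                    entries[-1][3] = offset  # previous description ends here
                start = offset + len(line)   # description starts after the header
                entries.append([int(m.group(1)), m.group(2).decode("utf-8"), start, None])
            offset += len(line)
    if entries:
        entries[-1][3] = offset  # last description runs to end of file
    return [tuple(e) for e in entries]

def read_description(path, s_pos, e_pos):
    """Jump straight to s_pos and read only the bytes of one description."""
    with open(path, "rb") as f:
        f.seek(s_pos)
        return f.read(e_pos - s_pos).decode("utf-8")
```

The index only has to be built once per file; after that each lookup is a seek plus one read. The catch is the one described above: any edit to the file shifts every offset that follows it.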
Another solution would be to save the descriptions into individual files. The database would hold nothing but the file name, as the file itself would be the rest of the information. This would prevent accidental or intentional changes from corrupting the entire database.
Re: How to retrieve data from large files in database applications?
The problem is that the descriptions can be very large, running to pages, so I can't include them in the database.
Another problem is that saving each description in its own file will produce about 15,000 files.
I can only go the hard way, i.e. saving the position. Unfortunately there is no way, as far as I know, in VB.NET to read directly from a particular line and save its number.
Re: How to retrieve data from large files in database applications?
What is the problem with 15k files? Different files will not mean much more total memory. Sure it is a daunting prospect, but 15k records in a database is hardly worth worrying about, so why worry about 15k files?
On the other hand, can you be certain that the data in the files will not change and invalidate your referencing scheme? It sounds like you are assuming that. If so, it seems to me that you need a function that will read the file to line x, then read the file to line y, then return the text in between.
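A function of that kind could be sketched as follows (Python for illustration only; the name `read_lines` and the 1-based, inclusive line numbering are assumptions):

```python
from itertools import islice

def read_lines(path, first_line, last_line, encoding="utf-8"):
    """Return lines first_line..last_line (1-based, inclusive) as one string."""
    with open(path, encoding=encoding) as f:
        # islice skips the first (first_line - 1) lines and stops after last_line.
        return "".join(islice(f, first_line - 1, last_line))
```

Note that this still reads the file sequentially up to line y; it simply discards everything before line x.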
It would be nicer if you could come up with a way to not need that line y. It sounds like your files are laid out in such a way that you might be able to determine where the end of a description is, which would mean that an end point would not be needed. This would very slightly reduce the likelihood of editing causing problems.
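One way to drop the end point, assuming each description simply runs until the next header line (the header pattern below is a guess based on the sample in the first post, and the function name is hypothetical):

```python
import re

# Assumed header shape: digits, a name, then a trailing colon.
HEADER = re.compile(r"^\d+\s+.+?\s*:\s*$")

def read_from_start_line(path, start_line, encoding="utf-8"):
    """start_line is the 1-based line number of a record's header line.

    Collects the lines after it until the next header or the end of the file.
    """
    lines = []
    with open(path, encoding=encoding) as f:
        for number, line in enumerate(f, start=1):
            if number <= start_line:
                continue  # skip everything up to and including the header
            if HEADER.match(line.rstrip("\r\n")):
                break     # next record begins, so this description is done
            lines.append(line)
    return "".join(lines)
```

Only the starting line has to be stored. If a description ever contains a line that itself looks like a header, this will cut the text short, so the pattern would need tightening for real data.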
It also sounds like the beginning of a chunk of data is standardized enough that you could write something that looks for the beginning of each chunk and synchronizes the database with the files.
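A synchronization check along those lines might be sketched like this. The header pattern, the function names, and the shape of `records` (idnum mapped to the name and header line number stored in the database) are all assumptions made for the example:

```python
import re

# Assumed header shape: digits, a name, then a trailing colon.
HEADER = re.compile(r"^(\d+)\s+(.+?)\s*:\s*$")

def scan_headers(path, encoding="utf-8"):
    """Map idnum -> (name, header line number) for every record found in one file."""
    found = {}
    with open(path, encoding=encoding) as f:
        for number, line in enumerate(f, start=1):
            m = HEADER.match(line.rstrip("\r\n"))
            if m:
                found[int(m.group(1))] = (m.group(2), number)
    return found

def find_stale(records, path):
    """Return the idnums whose stored (name, line) no longer matches the file."""
    current = scan_headers(path)
    return sorted(
        idnum for idnum, stored in records.items() if current.get(idnum) != stored
    )
```

Running `find_stale` at startup, or whenever a file's modification time changes, would show which database rows need their positions refreshed.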
However, I still prefer the idea of a massive database rather than a massive set of unattached and easily edited/deleted files.
Re: How to retrieve data from large files in database applications?
First, thank you very much, Shaggy Hiker and Aaron Young.
If I go with using files instead of adding the descriptions to the database, then these files won't be changed: I will make them read-only and hidden.
[QUOTE=Aaron Young]Have you considered using XML?
I've attached a working example.
[/QUOTE]
Actually that was my next question!! This would be like including the description in the database, and users would not have to install additional database software.
I guess the code was created using Visual Studio 2003, because it can't be opened properly in my VS 2002, but I worked it out.
Now I need to practice dealing with more XML.
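For anyone practicing along, here is a minimal sketch of the XML idea. It is not the attached example from the quoted post, just one possible layout; the element name `entry`, the attribute names, and the helper functions are assumptions, and Python's standard `xml.etree.ElementTree` stands in for whatever XML classes the VB.NET project uses.

```python
import xml.etree.ElementTree as ET

def build_xml(entries):
    """Build a tree with one <entry> element per (idnum, name, description)."""
    root = ET.Element("entries")
    for idnum, name, description in entries:
        entry = ET.SubElement(root, "entry", idnum=str(idnum), name=name)
        entry.text = description  # multi-line text is stored as element content
    return ET.ElementTree(root)

def find_description(tree, name):
    """Return the description for an exact name match, or None if absent."""
    for entry in tree.getroot().iter("entry"):
        if entry.get("name") == name:
            return entry.text
    return None
```

A tree built this way can be saved with `tree.write("entries.xml", encoding="utf-8")` (the file name is just an example) and loaded again with `ET.parse`, so no separate database software is needed on the user's machine.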