-
Quicker way To Open a File
What's the quicker way to open a file?
I have always used the Open function, but it's too slow for my application: I have measured that opening 12,000 files takes 160 seconds, which is too much. I am trying the lopen API, but the time seems to be more or less the same.
Another question: is it possible to open a file and keep it open for a long time, let's say a week? And how many files can a VB6 application open? Using the lopen API I have been able to open 160,000 files simultaneously, but I am afraid that Windows can become unstable. What do you think about it?
Thanks in advance
-
Re: Quicker way To Open a File
:confused: :eek: What in the world are you doing that requires you to open 12,000 files at the same time?
And, why would you want to keep a file open for a week?
-
Re: Quicker way To Open a File
Well, my application needs to access these files, and it has to access them quickly. I have measured that most of the time is taken by opening and closing the files, not by reading the data. So I was thinking of keeping the files always open, in order to avoid the opening/closing. Consider that I have a storage of more or less 150k files, and the application needs to open these files many times during the day. The files are deleted from the HD after a week (more or less), and new ones arrive from another storage. These new files remain on the machine for another week (more or less) and are accessed by the application, then deleted, and so on.
Any ideas on how to speed up the file opening/closing?
-
Re: Quicker way To Open a File
Can you define what kind of processing you are doing? There may be better options than opening an enormous number of files, which may not even be needed. That's my opinion; I'm thinking of using a database, .dat or .ini files for this. So I need you to explain it in detail.
-
Re: Quicker way To Open a File
A database was the first thing that occurred to me too. Any time you need to access large amounts of data, that's the best way to do it.
@jackmoros
The time needed to open a file is dependent on Windows - VB (and the API) just calls a Windows fileopen function - so there's no way a program can open a file faster.
-
Re: Quicker way To Open a File
The amount of data stored in the files is too great to be stored in the DB. We are talking about an application that runs on 20 machines, and every machine has to manage more or less 5 billion records stored in the files, so we are talking about more or less 100 billion records. We have already thought about a data warehouse, but that would be for the future; for now all this info is in the files, and sadly I need to open these files very quickly.
@Erroneous: Why are you thinking of a .ini or .dat file? Are they treated in a special way by Windows?
Thanks
-
Re: Quicker way To Open a File
Too much data for a database? :ehh:
What is the total size of all this data?
Could you load it all -or partly- into memory?
What is the purpose of this data?
Why is speed so crucial?
What kind of data is it?
Is it read only?
What kind of machine does it run on?
Can you invest in hardware?
You could rewrite your code to use API instead of native VB to handle file I/O.
But I'm not sure if that would give a lot of extra speed.
-
Re: Quicker way To Open a File
Quote:
Originally Posted by jackmoros
The amount of data stored in the files is too great to be stored in the DB
Hmmm, Maybe all those WORLD applications are doing it all wrong with databases :p :p :p :eek2:
Perhaps the stock market should switch to flat files...:D :cool:
-
Re: Quicker way To Open a File
There are plenty of databases that limit the size of the database to the size of the medium. IOW, if you have a terabyte of storage, you're limited to a terabyte of data. Need to store more data? Buy more storage.
But if it can be stored on a computer, it can be stored in a database on that computer.
@randem: Hollerith cards? Stone tablets?
-
Re: Quicker way To Open a File
Very well said Al42.
@jackmoros
You're defeating the purpose of a database. If it were just text files, there would be no problem.
-
Re: Quicker way To Open a File
The other approach is multiple databases. Hmmm... that theory has been applied somewhere already :eek:
Hmmm... Hollerith cards. I haven't heard that term in a long while... Perhaps it will work ;)
-
Re: Quicker way To Open a File
If he has a warehouse large enough. ;)
-
Re: Quicker way To Open a File
OK guys, very funny, but the application already has a database that it connects to and uses to store the data extracted from these files. What I meant in my last post was that it's not possible, right now, to store all that data in that DB or in another one, for a lot of reasons; first and foremost, the way the application works would have to be radically changed, and that could take a lot of time. Excuse me if I am too sharp, but what I am looking for now is a quick solution to my problem: how to open a file quickly, and how many files can be opened in Windows.
@joren:
The customer where the problem arose has, at the moment, 20 Windows Server 2003 machines running my application, each one serving up to 10 GB of files a day, so more or less 70 GB a week. Other customers have up to 150 machines, but the problem arose at the one with 20, so I need to focus on that one. Many files, the more recent ones, are already stored in the memory of the task so that they can be accessed more quickly, but you will agree that it's not possible to put all that data in memory.
The speed is crucial because the application must respond very quickly to a request from a DB: open the files the request is interested in and insert the retrieved data into the DB. From the measurements we made, most of the time is taken by the open/close of the files, not by other operations such as inserts/updates in the DB or searches in the files. The requests from the DB can arrive at any time of the day, and the application must respond in "real time". The files are read-only, i.e. the application does not need to write to them, only to read. I can't invest in hardware because the application has already been delivered. I need a solution in a few days; we are investigating the problem and trying to figure out how to resolve it, and in the meantime any help is welcome :)
Thanks in advance.
Consider that here in Italy it's Saturday morning and I'm forced to stay here in the office working on this problem instead of going to the beach :sick: , man, it's very sad :)
-
Re: Quicker way To Open a File
Well, VB's file numbers are limited to 511 open files. If you use the APIs, you could possibly open as many files as a Long value can count.
I can't imagine having that many files open at once; the memory constraints would be a killer just for having the files open, without even doing any work.
BTW: You can't have a file open for a week unless your app is running for a week and holding it open.
-
Re: Quicker way To Open a File
@randem: Using the lopen API yesterday I have opened 150k files simultaneously, but Windows becomes unstable after a few minutes.
The application is always up; it's a Windows service (I have used the nsvc OCX to do it). If it crashes for any reason, the handles should be automatically released, and when it is automatically restarted by Windows after a minute it should be able to re-open the files.
-
Re: Quicker way To Open a File
Unfortunately with that design you'll have to give one way or the other... at the moment better hardware (especially secondary storage, get very fast ones) is the easiest option.
Keeping the files open is only feasible while there are still file handles available... use them all up and you'll run into other problems.
Loading file data into memory sounds good until you realize that it gets placed into the swap file anyway, so it just returns to secondary storage managed as part of memory. That is also probably the cause of your OS going unstable: memory addresses can only go so high and are not infinite.
Consolidating the files will give you fewer files to worry about, but their sizes will increase, and I don't think there'll be much improvement even with binary access.
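For what it's worth, the "keep files open, but don't use up all the handles" trade-off can be sketched as a small cache that keeps only the most recently used files open and closes the oldest one when a limit is reached. This is only an illustration in Python rather than VB6, not something anyone in the thread has built; the class name `HandleCache` and the `max_open` limit of 256 are assumptions made up for the example, and a sensible limit would have to be measured on the real machines.

```python
from collections import OrderedDict


class HandleCache:
    """Keep at most max_open files open, closing the least recently used one."""

    def __init__(self, max_open=256):
        # max_open is a hypothetical limit chosen for illustration only.
        self.max_open = max_open
        self._handles = OrderedDict()  # path -> open file object, oldest first

    def get(self, path):
        """Return an open read-only handle for path, reusing it if cached."""
        if path in self._handles:
            # Mark as most recently used and reuse the existing handle.
            self._handles.move_to_end(path)
            return self._handles[path]
        if len(self._handles) >= self.max_open:
            # Evict the least recently used handle to stay under the limit.
            _, oldest = self._handles.popitem(last=False)
            oldest.close()
        handle = open(path, "rb")
        self._handles[path] = handle
        return handle

    def close_all(self):
        """Close every cached handle, e.g. at service shutdown."""
        for handle in self._handles.values():
            handle.close()
        self._handles.clear()

    def __len__(self):
        return len(self._handles)
```

Whether this helps depends on how often the same files are requested again within a short time; if every request touches a different file, the cache only adds overhead.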
-
Re: Quicker way To Open a File
Yes, Windows will be unstable for the reason I stated... That's why I can't imagine that many files being open...
-
Re: Quicker way To Open a File
I have never in all my years known that number of files to be needed open at the same time. If there were 10 files open at the same time, that was huge. What kind of app is this?
-
Re: Quicker way To Open a File
It's an application for the revenue assurance of the tlc operators. It receives, via ftp, files called "charging files" from the network elements of the network of the customer and then verifies if the files are correct.
-
Re: Quicker way To Open a File
So what is the problem with them going into a database?
Why do so many files need to be open?
What kind of processing is done?
-
Re: Quicker way To Open a File
Why not have some sort of queue in place? Opening that many files would be so ram consuming it would make vb.net look better than vb6 :( and we don't want that now do we ;)
-
Re: Quicker way To Open a File
Following up post #4, why don't you just open the file that is needed? Can you post a system flow?
-
Re: Quicker way To Open a File
Queue would only add to the delay... and he's still not considering the database.
@jackmoros: another option, which takes advantage of the latest CPUs, is multithreading your application. Unfortunately that can't be implemented easily (and, if I'm not mistaken, can't be implemented safely) with VB. But if you go in that direction (writing code for internal app use and maintaining all the I/O details involved), it would be like creating your own database engine, as in the old days before relational databases... so you might as well consider again using an enterprise-level database.
-
Re: Quicker way To Open a File
Other than creating an ActiveX EXE to handle the files, so that each instance can open a series of files, the database is your only option.
-
Re: Quicker way To Open a File
I'm guessing that the bottleneck is because you're putting way too many files in a single directory. I ran into this when I had a project that involved tens of thousands of image files.
How many files are in the directory from which you're opening files? If it's more than a thousand or so, that's your bottleneck. Windows is not a fan of having a ton of files in the same folder.
If this is the case, I'd suggest you come up with some sort of simple method for determining a random folder for each file to naturally reside in. In the project where I had to deal with this, each image file was named as the database record's unique ID. I originally wanted to simply use the first digit in the ID, but it turned out that 75% of them began with the same digit. So instead I used the last digit, which had a nice even distribution, allowing me to disperse the images into 10 different subfolders evenly.
Adding this logic was very simple; all I had to do was write a function that you'd send an ID, and it would return the complete path and filename of the image. Then I replaced all references to files with that function. Worked like a charm.
The biggest performance gain will come from the algorithm you choose to disperse the files among the different folders. In the end, simply opening a file is normally very fast, but this speed degrades exponentially once some critical mass of files is reached. So first come up with the number of folders you need to split the files into, then come up with an effective way of evenly distributing them based on nothing more than the filenames themselves, and you should be good to go.
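A minimal sketch of that kind of path function, in Python rather than VB6 for brevity (the logic ports directly). The names `file_path` and `digit_spread`, the `images` root folder and the `.jpg` extension are assumptions for illustration, not the original project's code. The second helper just counts how IDs spread over a given digit position, which is how you would notice that the first digit is lopsided while the last one is even.

```python
import os
from collections import Counter


def file_path(record_id, root="images", ext=".jpg"):
    """Return the full path of a file stored in a subfolder named after
    the last digit of its record ID (10 subfolders: 0..9)."""
    last_digit = str(record_id)[-1]
    if not last_digit.isdigit():
        raise ValueError(f"record ID must end in a digit: {record_id!r}")
    return os.path.join(root, last_digit, f"{record_id}{ext}")


def digit_spread(record_ids, position=-1):
    """Count how many IDs fall into each bucket for the digit at `position`
    (-1 = last digit, 0 = first digit) to check how even the split is."""
    return Counter(str(record_id)[position] for record_id in record_ids)
```

For example, `file_path(10457)` puts the file in subfolder `7`, and comparing `digit_spread(ids, 0)` with `digit_spread(ids, -1)` on a real list of IDs shows which digit gives the more even distribution.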
-
Re: Quicker way To Open a File
As a simple test, do the same time trial you describe in the OP, first with all 12,000 files in a single directory, then with them dispersed evenly among 10 different folders, just to see how much (or even if) it helps. I'm curious to see how it compares to the 160 seconds you mention.
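Something along these lines would do for the trial; it is a rough sketch in Python rather than VB6, and the function names and the one-byte dummy files are made up for the example. Note that it times freshly created files, which are likely still in the OS cache, so the absolute numbers will not match a production machine; only the comparison between the two layouts is of interest.

```python
import os
import tempfile
import time


def make_files(root, count, folders):
    """Create `count` small files under root, spread over `folders`
    subfolders (folders=1 keeps them all in root). Returns their paths."""
    paths = []
    for i in range(count):
        sub = os.path.join(root, str(i % folders)) if folders > 1 else root
        os.makedirs(sub, exist_ok=True)
        path = os.path.join(sub, f"{i}.dat")
        with open(path, "wb") as f:
            f.write(b"x")
        paths.append(path)
    return paths


def time_open_close(paths):
    """Return the seconds taken to open and immediately close every file."""
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb"):
            pass
    return time.perf_counter() - start


def run_trial(count=12000):
    """Time one flat directory against the same files split over 10 folders."""
    with tempfile.TemporaryDirectory() as flat, \
            tempfile.TemporaryDirectory() as split:
        flat_seconds = time_open_close(make_files(flat, count, 1))
        split_seconds = time_open_close(make_files(split, count, 10))
    return flat_seconds, split_seconds
```

Calling `run_trial()` returns the two timings as a pair (flat, split); running it a few times and on the real disks gives a fairer picture than a single run.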
-
Re: Quicker way To Open a File
The files are already spread across directories, according to an algorithm that assigns each file to the proper dir, and the hard disk of every machine is defragmented each day at 3 am to ensure a low fragmentation level.