-
Jan 16th, 2000, 11:27 PM
#1
Thread Starter
Junior Member
I have 2 lists of file paths that I need to compare to get the difference. The lists are in text files. The first one is about 10474 lines long, and the other ranges from 10474 to 37000 lines long (the longest one I've seen so far). Right now, the program reads the first file into an array. Then the second file is opened and read one line at a time. Each line read is compared against the array, and if it is not found, it is added to a listbox. Right now, it takes about 4 to 5 minutes to do the 37000-line compare on a PII 400 with 128 MB of RAM. This program needs to run on a variety of PCs ranging from a 386 to a PII, so running the compare on a 25 MHz 386 would take 30 minutes.
Does anyone know of a faster way to compare the 2 files?
Thanks
------------------
Joe Handal
Workstation Engineer
[email protected]
-
Jan 16th, 2000, 11:53 PM
#2
Instead of reading the first file into an array and then sequentially searching the array, create a keyed Collection. That way you can tell with just one lookup whether the string from the second file exists in the first. I think this will be much faster. See my response to this post for an example of the technique. I would be most interested in knowing the results. Please e-mail me.
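To make the idea concrete, here is a rough sketch. It is written in Python rather than VB, and a Python set stands in for the keyed Collection, so treat it as an illustration of the technique and not as the actual code from the linked post. The function name and the sample paths are made up for the example, and the comparison here is exact, including case.

```python
def lines_missing_from_first(first_lines, second_lines):
    """Return the lines of second_lines that do not appear in first_lines."""
    # Build the keyed lookup once (stand-in for a keyed Collection).
    known = set(first_lines)
    # Each membership test is a single keyed lookup, not a scan of the list.
    return [line for line in second_lines if line not in known]


# Hypothetical sample data, only for illustration.
first = ["C:\\DOS\\EDIT.COM", "C:\\AUTOEXEC.BAT"]
second = ["C:\\AUTOEXEC.BAT", "C:\\TEMP\\NEW.TXT", "C:\\DOS\\EDIT.COM"]

print(lines_missing_from_first(first, second))
```

The point is that the first file is loaded once, and every line of the second file then costs one lookup instead of a pass over thousands of array entries.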
------------------
Marty
-
Jan 17th, 2000, 12:28 PM
#3
Is your file sorted in any way?
If so, you can start in the middle and only search in one direction.
For example:
If the word was "monkey",
you would compare it to the middle string of your array.
If it is less than the middle string, search backward from the middle of the file; if not, search forward.
Also, you can terminate your loop once a match is found.
Just suggestions
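A rough sketch of the idea, in Python rather than VB. It takes the suggestion one step further than described above by repeating the "compare to the middle" step on whichever half remains (a binary search), and it assumes the list is already sorted. The function name and the sample words are made up for the example.

```python
def found_in_sorted(sorted_lines, target):
    """Return True if target is in sorted_lines (which must be sorted)."""
    low, high = 0, len(sorted_lines) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_lines[mid] == target:
            return True              # stop as soon as a match is found
        if target < sorted_lines[mid]:
            high = mid - 1           # target can only be before the middle
        else:
            low = mid + 1            # target can only be after the middle
    return False


# Hypothetical sample data, only for illustration.
words = sorted(["zebra", "apple", "monkey", "lemur"])
print(found_in_sorted(words, "monkey"))
print(found_in_sorted(words, "mongoose"))
```

Each pass throws away half of what is left, so a list of about 10000 lines needs only around 14 comparisons per lookup.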
------------------
Boothman
There is a war out there and it is about who controls the information, it's all about the information.
-
Jan 17th, 2000, 12:47 PM
#4
-
Jan 17th, 2000, 10:05 PM
#5
Thread Starter
Junior Member
I tried MartinLiss's suggestion using Collections and it was very fast.
I timed the old way and the new way. The old way took 6.5 minutes, while the new way using a collection took only 25 seconds. WOW!
The only problem is that each item in the collection takes up 16 bytes of memory no matter what data type is stored in it. Other than that, collections are great.
Thanks a lot for your help