-
May 17th, 2008, 05:08 PM
#1
Thread Starter
Frenzied Member
How can you remove duplicates in a textfile that is 200mb big?
Anyone know how I can do this in VB? Is it possible?
-
May 17th, 2008, 05:15 PM
#2
Re: How can you remove duplicates in a textfile that is 200mb big?
That's a massive text file. Just my two cents, but removing duplicates from it would be a RAM hog.
-
May 17th, 2008, 05:56 PM
#3
Re: How can you remove duplicates in a textfile that is 200mb big?
I would like to think it is possible, and I think it could be done without using too much RAM; it just might not be that quick.
If the general approach to removing duplicates involves comparing every item with every other item, this version would compare one chunk of items against all the others, a chunk at a time.
Can you show us a small sample of your big text file and confirm what you mean by duplicates?
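To make the chunk-at-a-time idea concrete, here is a rough sketch. It is in Python rather than VB, and it assumes "duplicates" means identical lines, which the thread starter has not confirmed. The function name `dedupe_in_chunks` and the `chunk_lines` parameter are made up for illustration. Only one chunk is held in memory at a time; the price is that the output file is re-read once per chunk, so it trades speed for RAM.

```python
import itertools


def dedupe_in_chunks(src_path, dst_path, chunk_lines=100_000):
    """Copy src_path to dst_path, keeping only the first copy of each line.

    Assumption: a "duplicate" is an exact duplicate line.
    Memory use is bounded by chunk_lines, not by the file size.
    """
    # Start with an empty output file.
    open(dst_path, "w", encoding="utf-8").close()

    with open(src_path, "r", encoding="utf-8") as src:
        while True:
            # Read the next chunk of lines from the input.
            chunk = list(itertools.islice(src, chunk_lines))
            if not chunk:
                break

            # dict.fromkeys removes duplicates inside the chunk and
            # keeps the first occurrences in their original order.
            pending = dict.fromkeys(line.rstrip("\n") for line in chunk)

            # Compare the chunk against everything written so far and
            # drop any line that is already in the output.
            with open(dst_path, "r", encoding="utf-8") as done:
                for line in done:
                    pending.pop(line.rstrip("\n"), None)

            # Append whatever is genuinely new.
            with open(dst_path, "a", encoding="utf-8") as out:
                for line in pending:
                    out.write(line + "\n")
```

Something like `dedupe_in_chunks("big.txt", "big_nodupes.txt")` would be the call, with both file names as placeholders. A larger `chunk_lines` means fewer passes over the output file but more RAM per pass.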
-
May 17th, 2008, 08:46 PM
#4
Re: How can you remove duplicates in a textfile that is 200mb big?
Duplicate what? Lines? How big is each item, etc.?
-
May 17th, 2008, 08:56 PM
#5
Re: How can you remove duplicates in a textfile that is 200mb big?
It's possible, but it's probably a better use of resources to break the file down into chunks of, say, 25 MB, delete the dupes from those, then recombine them into a new file and check that for duplicates again. As long as there are a lot of duplicates in the smaller files, you will have a much smaller final file to work on.
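A rough sketch of that split, dedupe, recombine, dedupe-again approach, in Python rather than VB. It assumes duplicates are identical lines, and it sizes the pieces by line count instead of by megabytes to keep the example short. The names `unique_lines`, `dedupe_by_pieces`, and `piece_lines` are hypothetical. Note that the final pass still keeps every distinct line of the combined file in memory, so it only helps if the first pass shrinks the file a lot.

```python
import itertools


def unique_lines(lines):
    """Yield each line (without its newline) the first time it appears."""
    seen = set()
    for line in lines:
        key = line.rstrip("\n")
        if key not in seen:
            seen.add(key)
            yield key


def dedupe_by_pieces(src_path, combined_path, dst_path, piece_lines=250_000):
    """Dedupe src_path piece by piece, then dedupe the recombined result."""
    # Pass 1: remove duplicates inside each piece and append the
    # survivors to one combined file.
    with open(src_path, "r", encoding="utf-8") as src, \
            open(combined_path, "w", encoding="utf-8") as combined:
        while True:
            piece = list(itertools.islice(src, piece_lines))
            if not piece:
                break
            for line in unique_lines(piece):
                combined.write(line + "\n")

    # Pass 2: the combined file can still contain lines repeated
    # across pieces, so check it for duplicates again.
    with open(combined_path, "r", encoding="utf-8") as combined, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in unique_lines(combined):
            dst.write(line + "\n")
```

The intermediate `combined_path` file is kept on disk here so it can be inspected; it can be deleted once the final file is written.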