
Thread: Hi Performance File Read/Compare

  1. #1

    Thread Starter
    New Member
    Join Date
    Aug 2009
    Posts
    7

    Hi Performance File Read/Compare

    Hey everyone,

    This time I'm looking for a faster-than-fast way to read binary files
    into a string/buffer/whatever, like Delphi's BlockRead does...

    I need this for an application that checks a list of folders/files,
    looks for files that exist more than once, and handles them according
    to specific criteria.

    I know there are standard solutions for this, like DuplicateFileRemover
    (gotta love that sheep!), but I need something custom-made,
    because I want to handle the duplicates automatically too, and I'm talking
    about gigabytes of data that I have to clean out periodically this way...

    So far, the algorithm I use to determine if 2 files are equal is this:

    compare both file sizes; if they match, then
    compare both file dates; if they match, then
    read and compare the first 256 bytes of each file; if they match, then
    read and compare the rest of the two files...
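
    The staged check above could be sketched roughly as follows. This is a Python illustration rather than VB.NET, and the names `files_equal`, `CHUNK_SIZE`, `HEADER_SIZE` and the `check_dates` switch are assumptions made for the example, not part of any existing tool.

```python
import os

CHUNK_SIZE = 64 * 1024  # assumed block size for the full comparison; tune as needed
HEADER_SIZE = 256       # the "first 256 bytes" quick check described in the post


def files_equal(path_a, path_b, check_dates=True):
    """Return True if both files pass every stage of the staged comparison."""
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)

    # Stage 1: files of different size can never be identical.
    if stat_a.st_size != stat_b.st_size:
        return False

    # Stage 2: modification dates (optional; compared at whole-second resolution).
    if check_dates and int(stat_a.st_mtime) != int(stat_b.st_mtime):
        return False

    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # Stage 3: cheap check on the first 256 bytes.
        if fa.read(HEADER_SIZE) != fb.read(HEADER_SIZE):
            return False

        # Stage 4: compare the remainder block by block, stopping at the
        # first block that differs.
        while True:
            block_a = fa.read(CHUNK_SIZE)
            block_b = fb.read(CHUNK_SIZE)
            if block_a != block_b:
                return False
            if not block_a:  # both files reached end-of-file together
                return True
```

    One thing the sketch makes visible: with the date stage enabled, two files with identical content but different timestamps are reported as different. Calling `files_equal(a, b, check_dates=False)` skips that stage.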

    Is this algorithm sufficient, or am I missing something important?
    (So far, I ignore the file names by design... should I not?)

    I compare 2 strings/file buffers like this:

    start with char #1 and compare char by char, until either all
    chars of the string have been compared, or a char does not match
    its counterpart in the other buffer...

    Does it make sense to write this particular procedure in assembler,
    or is the Do-Loop-Until fast enough?
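
    For reference, the loop described above might look like this as a small Python sketch. The function name `first_mismatch` and the convention of returning -1 for equal buffers are assumptions made for illustration.

```python
def first_mismatch(buf_a, buf_b):
    """Return the index of the first differing byte, or -1 if the buffers are equal."""
    limit = min(len(buf_a), len(buf_b))
    # Walk both buffers in step and stop at the first position that differs.
    for i in range(limit):
        if buf_a[i] != buf_b[i]:
            return i
    # No difference in the shared prefix: the buffers are equal only if
    # their lengths match; otherwise they differ where the shorter one ends.
    return -1 if len(buf_a) == len(buf_b) else limit
```

    In Python, a plain `buf_a == buf_b` on two `bytes` objects runs in native code and is typically much faster than this explicit loop, so the loop is mainly useful when the position of the mismatch is needed.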

    Thanx in advance, any help/suggestion/thought is appreciated...

    Mike aka StoneTheIceman

  2. #2
    PowerPoster stanav's Avatar
    Join Date
    Jul 2006
    Location
    Providence, RI - USA
    Posts
    9,290

    Re: Hi Performance File Read/Compare

    I take a different approach: use a Dictionary(Of String, String) where the key is the file's full path and the value is the MD5 hash of that file. Basically, you get all the files in a directory (System.IO.Directory.GetFiles method) and loop through them. For each file, you compute its MD5 hash and add it to the dictionary. When that is done, you examine the values in the dictionary for duplicates: any two paths with the same hash are candidate duplicates.
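
    A rough sketch of this idea, written in Python instead of VB.NET for brevity. The helper names `md5_of_file` and `find_duplicates` are hypothetical, and grouping the paths by hash at the end is one assumed way of "examining the values for duplicates".

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=64 * 1024):
    """Hash a file block by block so large files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_duplicates(directory):
    """Return lists of paths in `directory` whose contents share an MD5 hash."""
    # Key: full file path, value: MD5 hash (the dictionary from the reply).
    hashes = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            hashes[path] = md5_of_file(path)

    # Invert the dictionary: paths that share a hash are candidate duplicates.
    by_hash = defaultdict(list)
    for path, file_hash in hashes.items():
        by_hash[file_hash].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

    Note that every file is read in full once with this approach, whereas a size check beforehand would let files with a unique size be skipped without hashing.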
    Let us have faith that right makes might, and in that faith, let us, to the end, dare to do our duty as we understand it.
    - Abraham Lincoln -
