Thread: Easiest and quick way to check if files are identical?

  1. #1

    Thread Starter
    Member
    Join Date
    Nov 2005
    Posts
    62

    Easiest and quick way to check if files are identical?

    What would be the easiest way to check if two files are identical? Mind you, this is for server/client checks. Basically the server sends some code for the file plus something about the file (let's say the file size), and then the client checks whether its copy is the same. If it is different, the client downloads the file. But this isn't about getting the file sent over.

    Simply put, what would be the best way to detect if a file is different? I have been trying modified dates, but from what I've seen many installers change the modified date depending on timezone, and I've also had one change it by mere seconds!

    I've done file size before, but I've seen that the file size sometimes doesn't change at all if the change is something minor (like a couple of edits to an image).

    I've heard of hash checks spitting out a 16 number code or something like that, but I've not been able to find it anywhere; Google isn't my friend today.

    Thanks!

  2. #2
    Hyperactive Member
    Join Date
    Jun 2006
    Posts
    372

    Re: Easiest and quick way to check if files are identical?

    Google MD5.

  3. #3
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    CRC is another option. The link below is an extremely fast CRC32 class and, in my opinion, it is the fastest way to see if one file is the same as another.

    A quick file length check before the CRC32 can speed things up (if you're comparing, say, a 4GB file with a 1MB file, computing the CRC would be a waste of time).

    http://www.pscode.com/vb/scripts/Sho...12638&lngWId=1

  4. #4

    Thread Starter
    Member
    Join Date
    Nov 2005
    Posts
    62

    Re: Easiest and quick way to check if files are identical?

    Thank you, rated as well.

    That's what I needed, CRC checks, totally forgot about them! Great!

  5. #5
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    CRC is not reliable to see if files are the same, just to see if a file has changed.

    Two files with the same characters in a different order will produce the same CRC.

  6. #6
    PowerPoster Code Doc's Avatar
    Join Date
    Mar 2007
    Location
    Omaha, Nebraska
    Posts
    2,354

    Re: Easiest and quick way to check if files are identical?

    This might work:

    Code:
    ' Read both files fully into strings and compare them in one go.
    ' Both files are held in memory at once.
    Dim FirstFileData As String, SecondFileData As String
    Open "FirstFile" For Binary As #1
    FirstFileData = Space(LOF(1))
    Get #1, , FirstFileData
    Open "SecondFile" For Binary As #2
    SecondFileData = Space(LOF(2))
    Get #2, , SecondFileData
    If FirstFileData = SecondFileData Then
        MsgBox "Files are Identical."
    Else
        MsgBox "Files are NOT Identical."
    End If
    Close
    Before FileCopy showed up, I used to copy files like this:
    Code:
    ' Copy a file by reading it whole and writing it back out.
    Dim FirstFileData As String
    Open "FirstFile" For Binary As #1
    FirstFileData = Space(LOF(1))
    Get #1, , FirstFileData
    Open "SecondFile" For Binary As #2
    Put #2, , FirstFileData
    Close
    Last edited by Code Doc; Mar 20th, 2007 at 09:06 AM.
    Doctor Ed

  7. #7
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    CRC is not reliable to see if files are the same, just to see if a file has changed.

    Two files with the same characters in a different order will produce the same CRC.
    That's why it's usually a good idea to take at least a couple CRCs from different parts of the file and compare them.

  8. #8
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    That won't make a bit of difference if those are not the places the file was changed at. It only takes one bit changed in a file to make it not the same...

  9. #9
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    That won't make a bit of difference if those are not the places the file was changed at. It only takes one bit changed in a file to make it not the same...
    If even one byte of the file is different, doing a CRC on the whole file will produce different results.

  10. #10
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    That's my point exactly... It is not reliable. It only takes one BIT of information, one binary digit, to throw the whole thing off. That will determine that they are not identical, but take a file with "AB", change it to "BA", and do a CRC, and it will say that the files are identical when they are not.

  11. #11
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    That's my point exactly... It is not reliable. It only takes one BIT of information, one binary digit, to throw the whole thing off. That will determine that they are not identical, but take a file with "AB", change it to "BA", and do a CRC, and it will say that the files are identical when they are not.
    Uh, no. What CRC code are you using that gives you that result?

  12. #12
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Basic CRC is done by adding a byte to a word and having it rollover or add a word to a dword and having it rollover. Either way if you exchange a byte for the first one or a word for the second the results will remain the same.

  13. #13
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    Basic CRC is done by adding a byte to a word and having it rollover or add a word to a dword and having it rollover. Either way if you exchange a byte for the first one or a word for the second the results will remain the same.
    Not with (a good) CRC algorithm. The one I'm using (CRC32) does not produce those results.

  14. #14
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    It has to; that is the meaning of Cyclical Redundancy Check. In CRC32, all that means is that you are operating on 32-bit words at a time. Try switching binary information on a word boundary.

  15. #15
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    It has to; that is the meaning of Cyclical Redundancy Check. In CRC32, all that means is that you are operating on 32-bit words at a time. Try switching binary information on a word boundary.
    I'm not going to try and claim I'm an expert at CRC algorithms but all I'm saying is I've never encountered one where if you simply switch the bytes it will produce the same result. I even tested that several times on the CRC class I'm using now.

    I would imagine if just switching the byte order of the data produced the same results, CRC wouldn't be of much good for anything (checking for packet loss/corruption in a network program, for example).

    CRC is not perfect and might not be the best option for checking if a file is the same as another, although it seems accurate enough for me and is definitely faster than comparing every byte of the file. Maybe you can suggest a better method?

  16. #16
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Switching a byte in CRC32 would generate a "not the same" result, but switching a word (two bytes on a word boundary) would generate "the same". CRC is basically good for determining if a file has been changed, and it will produce good results most of the time. But it's the times when it doesn't that cause the most problems. CRC was mainly used in communications, when one side would send data to the other side and it needed to be verified that the same data was received. That was for transmission purposes, in which byte or word swapping was not an issue. The issue then was dropped bits or scrambled bits.

  17. #17
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    Switching a byte in CRC32 would generate a "not the same" result, but switching a word (two bytes on a word boundary) would generate "the same". CRC is basically good for determining if a file has been changed, and it will produce good results most of the time. But it's the times when it doesn't that cause the most problems. CRC was mainly used in communications, when one side would send data to the other side and it needed to be verified that the same data was received. That was for transmission purposes, in which byte or word swapping was not an issue. The issue then was dropped bits or scrambled bits.
    So you're saying that some CRC algorithms use 2-byte word boundaries (meaning they work with 2 bytes at a time), which is why reversing the 2 bytes (AB to BA) produces the same checksum?

    But if the CRC algorithm worked with 1 byte at a time (a 1-byte word boundary?), then switching 2 bytes in the file would produce different checksums.

    Because I tried the AB/BA test on the CRC algorithm I'm using and it produced different checksums.
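
    For anyone who wants to repeat the AB/BA test, here is a minimal sketch of a table-driven CRC-32 using the common reflected polynomial &HEDB88320. It is not the class from the link above; the names BuildCrcTable, Crc32 and mlngTable are made up for illustration, and the code assumes it is pasted into a form with a Command1 button.

    ```vb
    Option Explicit

    Private mlngTable(0 To 255) As Long
    Private mblnReady As Boolean

    ' Build the 256-entry lookup table for the reflected polynomial &HEDB88320.
    Private Sub BuildCrcTable()
        Dim i As Long, j As Long, lngCrc As Long
        For i = 0 To 255
            lngCrc = i
            For j = 1 To 8
                If (lngCrc And 1) <> 0 Then
                    ' Unsigned shift right by one bit, then XOR in the polynomial.
                    lngCrc = (((lngCrc And &HFFFFFFFE) \ 2) And &H7FFFFFFF) Xor &HEDB88320
                Else
                    lngCrc = ((lngCrc And &HFFFFFFFE) \ 2) And &H7FFFFFFF
                End If
            Next j
            mlngTable(i) = lngCrc
        Next i
        mblnReady = True
    End Sub

    ' CRC-32 of a byte array, processed one byte at a time.
    Public Function Crc32(bytData() As Byte) As Long
        Dim i As Long, lngCrc As Long
        If Not mblnReady Then BuildCrcTable
        lngCrc = &HFFFFFFFF
        For i = LBound(bytData) To UBound(bytData)
            ' Unsigned shift right by 8 bits, XOR with the table entry.
            lngCrc = (((lngCrc And &HFFFFFF00) \ &H100) And &HFFFFFF) _
                     Xor mlngTable((lngCrc And &HFF) Xor bytData(i))
        Next i
        Crc32 = Not lngCrc
    End Function

    ' Compare the CRC-32 of "AB" with the CRC-32 of "BA".
    Private Sub Command1_Click()
        Dim bytA() As Byte, bytB() As Byte
        bytA = StrConv("AB", vbFromUnicode)
        bytB = StrConv("BA", vbFromUnicode)
        MsgBox Hex$(Crc32(bytA)) & " vs " & Hex$(Crc32(bytB))
    End Sub
    ```

    The two values shown come out different. A CRC-32 always gives different results for two inputs of the same length whose differences all fall within a span of 32 bits, and swapping two adjacent bytes is such a change.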

  18. #18
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    The Qualities of the CRC-32

    CRCMAN uses the CRC-32 algorithm to generate a 32 bit number for any given file. We then treat this 32 bit number as a somewhat unique "fingerprint" for that file. This fingerprint differs somewhat from the human fingerprint. It often said that no two people have identical fingerprints. This can't be the case for our CRC fingerprint. Since there are more than 4,294,967,296 different files in the world, it is a foregone conclusion that some of them must have identical checksums.

    However, the CRC-32 does have attributes that make it very attractive for the verification of files. These include the following:

    You can read more here http://www.dogma.net/markn/articles/crcman/crcman.htm

  19. #19
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    BTW: in your example you are using a CRC16 example in CRC32 computations. That won't produce the same results.

  20. #20
    PowerPoster
    Join Date
    Feb 2006
    Location
    East of NYC, USA
    Posts
    5,691

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    Basic CRC is done by adding a byte to a word and having it rollover or add a word to a dword and having it rollover.
    That's a checksum, not a CRC.
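
    A small sketch of the difference, assuming a plain "add the bytes and let the total roll over" checksum. The name ByteSum is hypothetical, and the 16-bit wrap is only one possible width. Because addition does not care about order, this kind of sum cannot see swapped bytes.

    ```vb
    Option Explicit

    ' Simple additive checksum: sum every byte and wrap the total at 16 bits.
    ' This is the "add and roll over" scheme, not a CRC.
    Public Function ByteSum(bytData() As Byte) As Long
        Dim i As Long, lngSum As Long
        For i = LBound(bytData) To UBound(bytData)
            lngSum = (lngSum + bytData(i)) And &HFFFF&
        Next i
        ByteSum = lngSum
    End Function

    Private Sub Command1_Click()
        Dim bytA() As Byte, bytB() As Byte
        bytA = StrConv("AB", vbFromUnicode)
        bytB = StrConv("BA", vbFromUnicode)
        ' Both sums are 65 + 66 = 131, so the swap goes unnoticed.
        MsgBox ByteSum(bytA) & " vs " & ByteSum(bytB)
    End Sub
    ```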

  21. #21
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    What is produced from CRC32 is exactly a checksum; it's just a different way of achieving the same thing. The basic point is that it is still not 100% reliable for comparing files. A text file and a pure binary file could have the same checksum and be totally different.

  22. #22
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Now this could be most embarrassing in a production environment when you produce the same checksums from two different files and say that they are the same when after checking the first byte of each file it would have told you they were different.

    And that would be 100% reliable...

  23. #23
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Well, to be honest CRC seems to work in MOST cases. And I really can't think of a better solution. I don't know of a better/more reliable algorithm out there. MD5 might be an option.

    The only sure-fire way is a byte-by-byte comparison which can be extremely slow if the files are large and they are the same. If there was an ASM/C DLL written for that it might be faster.

    Either way, AFAIK, CRC32 seems to be the best option. Maybe you know a better way?

  24. #24
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    I use byte by byte comparison and it is not slow at all. It may be possible that the routine you would use could be tweaked but I use it all the time.

  25. #25
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    I use byte by byte comparison and it is not slow at all. It may be possible that the routine you would use could be tweaked but I use it all the time.
    I personally don't have any use for an algorithm like this. I could write an optimized byte-by-byte comparison routine, but if 2 large files (~2GB) are the same, it would be slow as hell.

    If they were different, the loop could just exit on the first byte that is not the same, and it would be pretty fast. But looping through 2GB worth of data (if the files are the same) would take a while.

    Maybe you can post your function for the OP?

  26. #26
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    That is the correct way to determine if a file is not the same... by exiting on the first non match. I will post when I get time to extract it.

  27. #27
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Here's one I wrote real quick. If you're comparing 2 files that are the same, and they are really large (~2GB), then it will take a long time because it has to scan through the whole file.

    It might be possible to make it faster by loading more than 1 byte from the file at a time, but this is just an example:

    vb Code:
    Option Explicit

    'Check if 2 files are the same.
    Private Function FilesSame(ByVal FilePath1 As String, ByVal FilePath2 As String) As Boolean
        Dim intFF1 As Integer, intFF2 As Integer
        Dim byt1 As Byte, byt2 As Byte
        Dim lonL1 As Long, lonL2 As Long
        Dim lonCurByte As Long, bolDiff As Boolean

        If Len(FilePath1) = 0 Or Len(FilePath2) = 0 Then Exit Function
        lonL1 = FileLen(FilePath1)
        lonL2 = FileLen(FilePath2)

        'Different lengths can never match.
        If lonL1 <> lonL2 Then Exit Function
        'Both files are empty, so they are identical.
        If lonL1 = 0 Then FilesSame = True: Exit Function

        lonCurByte = 1

        intFF1 = FreeFile
        Open FilePath1 For Binary Access Read As #intFF1
        intFF2 = FreeFile
        Open FilePath2 For Binary Access Read As #intFF2

        Do
            Get #intFF1, lonCurByte, byt1
            Get #intFF2, lonCurByte, byt2

            'Stop at the first byte that differs.
            If byt1 <> byt2 Then
                bolDiff = True
                Exit Do
            End If

            lonCurByte = lonCurByte + 1
        Loop Until lonCurByte > lonL1

        'Close each file exactly once, whether or not a difference was found.
        Close #intFF2
        Close #intFF1

        FilesSame = Not bolDiff
    End Function

    'Test the function.
    Private Sub Command1_Click()
        Dim str1 As String, str2 As String

        str1 = "C:\1.txt"
        str2 = "C:\2.txt"

        MsgBox FilesSame(str1, str2)
    End Sub

  28. #28
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    You would never attempt to load one byte at a time... That would take a very long time.

  29. #29
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    You would never attempt to load one byte at a time... That would take a very long time.
    Either way, even if you load say, 1KB at a time, you still need to compare that 1KB packet. You could convert it to string with StrConv() and compare it that way, but it's better to keep it as a byte array when comparing binary files.
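
    A rough sketch of that idea, reading both files in chunks and comparing the chunks as byte arrays. The name FilesMatch and the 64 KB CHUNK_SIZE are assumptions for illustration, error handling (missing files and so on) is left out, and FileLen/LOF return a Long, so this will not cope with files of 2GB or more.

    ```vb
    Option Explicit

    Private Const CHUNK_SIZE As Long = 65536   ' assumed buffer size; tune as needed

    ' Compare two files chunk by chunk, keeping the data in Byte arrays.
    Public Function FilesMatch(ByVal strPath1 As String, ByVal strPath2 As String) As Boolean
        Dim intFF1 As Integer, intFF2 As Integer
        Dim bytBuf1() As Byte, bytBuf2() As Byte
        Dim lngLeft As Long, lngRead As Long, i As Long

        ' Different sizes can never be identical, so skip the read entirely.
        If FileLen(strPath1) <> FileLen(strPath2) Then Exit Function

        intFF1 = FreeFile
        Open strPath1 For Binary Access Read As #intFF1
        intFF2 = FreeFile
        Open strPath2 For Binary Access Read As #intFF2

        FilesMatch = True
        lngLeft = LOF(intFF1)
        Do While lngLeft > 0 And FilesMatch
            ' The last chunk may be shorter than CHUNK_SIZE.
            If lngLeft < CHUNK_SIZE Then lngRead = lngLeft Else lngRead = CHUNK_SIZE
            ReDim bytBuf1(0 To lngRead - 1)
            ReDim bytBuf2(0 To lngRead - 1)
            Get #intFF1, , bytBuf1
            Get #intFF2, , bytBuf2
            For i = 0 To lngRead - 1
                If bytBuf1(i) <> bytBuf2(i) Then
                    FilesMatch = False   ' stop at the first mismatch
                    Exit For
                End If
            Next i
            lngLeft = lngLeft - lngRead
        Loop

        Close #intFF1
        Close #intFF2
    End Function
    ```

    Two empty files count as identical here, because the sizes match and the loop never runs.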

  30. #30
    PowerPoster Code Doc's Avatar
    Join Date
    Mar 2007
    Location
    Omaha, Nebraska
    Posts
    2,354

    Question Re: Easiest and quick way to check if files are identical?

    Does anyone recall my earlier post on this thread?
    Code:
    Dim FirstFileData As String, SecondFileData As String
    
    Private Sub Form_Load()
    Open "SAT 1" For Binary As #1
    FirstFileData = Space(LOF(1))
    Get 1, , FirstFileData
    Open "SAT 1.Bak" For Binary As #2
    SecondFileData = Space(LOF(2))
    Get 2, , SecondFileData
    If FirstFileData = SecondFileData Then
        MsgBox "Files are Identical."
    Else: MsgBox "Files are NOT Identical."
    End If
    Close
    End Sub
    To test it, I used two files, one a copy of the other, each 14 MB in size. Execution time < 0.2 seconds. Result: "Files are Identical".

    Then I changed the absolute last visible character of the backup file. Both files were still the same byte size. Execution time < 0.2 seconds. Result: "Files are NOT Identical".

    Am I missing something? My case rests.
    Doctor Ed

  31. #31
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by Code Doc
    Does anyone recall my earlier post on this thread?
    Code:
    Dim FirstFileData As String, SecondFileData As String
    
    Private Sub Form_Load()
    Open "SAT 1" For Binary As #1
    FirstFileData = Space(LOF(1))
    Get 1, , FirstFileData
    Open "SAT 1.Bak" For Binary As #2
    SecondFileData = Space(LOF(2))
    Get 2, , SecondFileData
    If FirstFileData = SecondFileData Then
        MsgBox "Files are Identical."
    Else: MsgBox "Files are NOT Identical."
    End If
    Close
    End Sub
    To test it, I used two files, one a copy of the other, each 14 MB in size. Execution time < 0.2 seconds. Result: "Files are Identical".

    Then I changed the absolute last visible character of the backup file. Both files were still the same byte size. Execution time < 0.2 seconds. Result: "Files are NOT Identical".

    Am I missing something? My case rests.
    Try testing that on a 2GB file.

    1. You're using a string variable (at least you're buffering it with Space() but it's still slower).

    2. You're loading the entire file into memory! If comparing 2 files, each of them 2GB, that's 4GB being loaded into memory.

    My case rests.

  32. #32
    PowerPoster Code Doc's Avatar
    Join Date
    Mar 2007
    Location
    Omaha, Nebraska
    Posts
    2,354

    Re: Easiest and quick way to check if files are identical?

    "You're loading the entire file into memory! If comparing 2 files, each of them 2GB, that's 4GB being loaded into memory."

    Sorry, DigiRev. I saw nowhere on the thread that either you or the original poster was working with files that big. At least I got your attention.
    Doctor Ed

  33. #33
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by Code Doc
    "You're loading the entire file into memory! If comparing 2 files, each of them 2GB, that's 4GB being loaded into memory."

    Sorry, DigiRev. I saw nowhere on the thread that either you or the original poster was working with files that big. At least I got your attention.
    If he's not working with large files, either of our methods will work. If he is, though, then so far CRC is the fastest option, in my opinion.

  34. #34
    PowerPoster Code Doc's Avatar
    Join Date
    Mar 2007
    Location
    Omaha, Nebraska
    Posts
    2,354

    Re: Easiest and quick way to check if files are identical?

    "If he's not working with large files, either of our methods will work."
    --------------
    Agreed. You can still use my method by reading the files in chunks of 30 MB or so and comparing each chunk. Move the pointer in each iteration. Exit the loop when a chunk fails to compare. Done.

    If my timer is correct, it will cost you at most 2 seconds on average for a pair of 2 GB files. The worst case is my example, when the last chunk fails. I'm only running at 1.7 GHz.
    Doctor Ed

  35. #35
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Easiest and quick way to check if files are identical?

    An alternative method using the rather quick native InStr function (quick if we take into account that it does compare every single byte). It also reads in bigger chunks to improve performance, although admittedly strings aren't the best datatype to use. I used a chunk size of 50 kB (50,000 bytes).

    I also added some more error checks (in comparison to DigiRev's solution) so you can throw pretty much anything at it and it'll do its job without throwing an error.
    Code:
    Public Function IsFilesSame(ByVal File1 As String, ByVal File2 As String) As Boolean
        Dim intFF1 As Integer, intFF2 As Integer, blnIsSame As Boolean
        Dim lngLen1 As Long, lngLen2 As Long
        Dim str1 As String, str2 As String
        ' ensure strings contain something
        If LenB(File1) = 0 Or LenB(File2) = 0 Then Exit Function
        ' ensure not same filename
        If File1 = File2 Then IsFilesSame = True: Exit Function
        ' ensure files exist
        If LenB(Dir$(File1, vbHidden Or vbSystem)) = 0 Then Exit Function
        If LenB(Dir$(File2, vbHidden Or vbSystem)) = 0 Then Exit Function
        ' get file lengths
        lngLen1 = FileLen(File1)
        lngLen2 = FileLen(File2)
        ' compare file lengths
        If lngLen1 = lngLen2 Then
            ' see if zero length
            blnIsSame = (lngLen1 = 0)
            ' if not zero length
            If Not blnIsSame Then
                blnIsSame = True
                ' read files in chunks
                intFF1 = FreeFile
                Open File1 For Binary Access Read As #intFF1
                intFF2 = FreeFile
                Open File2 For Binary Access Read As #intFF2
                Do While blnIsSame And lngLen1 > 50000
                    str1 = Input$(50000, #intFF1)
                    str2 = Input$(50000, #intFF2)
                    lngLen1 = lngLen1 - 50000
                    ' compare
                    blnIsSame = (InStr(str1, str2) = 1)
                Loop
                If blnIsSame Then
                    str1 = Input$(lngLen1, #intFF1)
                    str2 = Input$(lngLen1, #intFF2)
                    ' compare
                    blnIsSame = (InStr(str1, str2) = 1)
                End If
                Close #intFF1
                Close #intFF2
            End If
            ' what is the result...
            IsFilesSame = blnIsSame
        End If
    End Function
    Oh, and the reason optimizing for something other than strings doesn't gain much in this case is that you're reading from disk to memory anyway, which is many times slower than processing only in memory. A good chunk size can still make a surprisingly big difference in some cases, but the optimal size often varies from computer to computer.

    Aww heck, and I'm awfully tired now... why oh why I spent my time on this...


    Edit!
    I'll never learn out of vbcode tags.

  36. #36
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Here is something I cooked up so that you can actually see and compare the differences.
    Attached Files

  37. #37
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    byte by byte comparison is the way to do it and relatively fast too.

  38. #38
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    but make sure the files are read into a buffer and then processed rather than loading the whole file into ram in one hit.

  39. #39
    Banned learning c's Avatar
    Join Date
    Mar 2007
    Location
    canberra (australia's capital)
    Posts
    198

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by questioner
    byte by byte comparison is the way to do it and relatively fast too.
    Another way would be to divide the file into x parts, byte sum each part, and do that for both files. However, I don't think you can beat the byte-by-byte approach for accuracy.

  40. #40
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Easiest and quick way to check if files are identical?

    Just for improving the speed for InStr method in what randem posted:
    Code:
    Private Function InstrCompareFiles() As Boolean
    Dim FirstFileData As String, SecondFileData As String
    Dim bytBuffer1(65535) As Byte
    Dim bytBuffer2(65535) As Byte
    
    
        Seek #Fnum, 1
        Seek #Fnum1, 1
        
        InstrCompareFiles = True
        StartTime = GetTickCount
    
        Do While Not EOF(Fnum)
    
            Get Fnum, , bytBuffer1
            
            Get Fnum1, , bytBuffer2
            
            If InStr(bytBuffer1, bytBuffer2) = 0 Then
                InstrCompareFiles = False
                Exit Do
            End If
            
        Loop
        
        EndTime = GetTickCount
        
        Msg = Msg & vbCrLf & vbCrLf & "InstrCompareFiles Elapsed time - " & EndTime - StartTime & " ms"
            
    End Function
    In this form it takes less than one third of the time it did (on my computer that means about three seconds vs. about ten seconds). All I did was replace the string processing with byte arrays and take away the recreation and freeing of memory blocks in a loop...

    Taking this into account, randem's sample is flawed: too much time goes into something that shouldn't be happening (in a benchmark of something).


    Edit!
    I also ticked on all the advanced optimizations, after which ByteArrayCompareFiles and ByteArrayXorXCompareFiles became the two fastest solutions in randem's original sample (but are slower than the chunked InStr above, because reading chunks is more efficient).
    Last edited by Merri; Mar 22nd, 2007 at 05:29 PM.
