
Thread: Easiest and quick way to check if files are identical?

  1. #41
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    questioner,

    How can you make a statement like that after the code I posted????

  2. #42
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by randem
    questioner,

    How can you make a statement like that after the code I posted????
    hi randem

    your code wouldn't run on my machine due to a file path issue!

    was the conclusion that loading into a string and doing a string comparison was the fastest approach? applying a buffer read to that approach would make it scalable too
    if God killed everyone on earth in the flood of Noah, then he killed hundreds of millions of innocent lives. He could have saved all the good souls; he didn't though. Isn't condemning the souls of innocents, the work of the devil? Is Jesus the real God?

  3. #43
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    So, change the path to a file that exists on your machine...

  4. #44
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    ok, good comparison, string comparison and instr both won from time to time. must be that there were other instructions being executed by the cpu slowing down or speeding up the execution of each procedure.

  5. #45
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    btw i tried to use it to compare 2 different files like this

    Private Sub Form_Load()

    ' Filename = "E:\Server Data\Randem\Jazz Drive\Jazz Drive.rar"
    ' Filename1 = Filename
    Dim filename2 As String

    Filename = "c:\temp\temp\mov\001.mpg"
    Filename1 = Filename
    filename2 = "c:\temp\temp\mov\002.mpg"

    Fnum = FreeFile
    Open Filename For Binary Access Read Shared As #Fnum

    Fnum1 = FreeFile
    Open filename2 For Binary Access Read Shared As #Fnum1

    hTL = GetThreadLocale()

    End Sub



    and got this:
    files are equal

    ByteArrayCompareFiles Elapsed time - 172 ms

    ByteArrayXorCompareFiles Elapsed time - 360 ms

    StringCompareFiles Elapsed time - 15 ms

    InstrCompareFiles Elapsed time - 16 ms

    APICompareFiles Elapsed time - 62 ms
    Last edited by questioner; Mar 23rd, 2007 at 01:06 AM.

  6. #46
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    And....???

  7. #47
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    are you sure that code compares properly?

  8. #48
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Quote Originally Posted by questioner
    are you sure that code compares properly?
    This coming from a person that did not change the filename to point to a file already on their computer

    Of course...

  9. #49
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Here are a few minor tweaks to the code as suggested...
    Attached Files
    Last edited by randem; Mar 23rd, 2007 at 02:18 AM.

  10. #50
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    ok, well now did you do something to the instr comparison? it seems faster now although the string comparison still wins some of the time.

  11. #51
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Yes, that was one of the tweaks Merri suggested.

  12. #52
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    so the process gets interrupted sometimes making the string comparison win?

  13. #53
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    What????

  14. #54
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    well why does the string comparison win sometimes, and the instr win at other times?

  15. #55
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    On my system Instr wins every time. Are you changing filenames between runs?

    The byte buffer used for the Instr compare stays at its full length (65535) at the end of the file, even though the last part of the file could be much smaller (say 128 bytes). It will still compare the whole buffer, which takes longer. This may be the reason for the variance...
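
    One way to avoid comparing leftover bytes in the last block is to shrink the buffer to the number of bytes actually left. A minimal sketch, assuming both files are already open for Binary access and are under 2 GB (LOF returns a Long); ChunkedByteCompare, CHUNK_SIZE, Fn1 and Fn2 are made-up names, not taken from the attached project:

```vb
' Sketch only: the names below are hypothetical, not from the attached project.
Private Const CHUNK_SIZE As Long = 65536

Private Function ChunkedByteCompare(ByVal Fn1 As Integer, ByVal Fn2 As Integer) As Boolean
    Dim Buf1() As Byte, Buf2() As Byte
    Dim Remaining As Long, ThisChunk As Long, i As Long

    ' Files of different length can never be identical
    If LOF(Fn1) <> LOF(Fn2) Then Exit Function

    Remaining = LOF(Fn1)
    Seek #Fn1, 1
    Seek #Fn2, 1

    ThisChunk = CHUNK_SIZE
    ReDim Buf1(0 To ThisChunk - 1)
    ReDim Buf2(0 To ThisChunk - 1)

    Do While Remaining > 0
        ' Shrink both buffers for the final, partial chunk
        If Remaining < ThisChunk Then
            ThisChunk = Remaining
            ReDim Buf1(0 To ThisChunk - 1)
            ReDim Buf2(0 To ThisChunk - 1)
        End If

        Get #Fn1, , Buf1
        Get #Fn2, , Buf2

        ' Stop at the first byte that differs (result stays False)
        For i = 0 To ThisChunk - 1
            If Buf1(i) <> Buf2(i) Then Exit Function
        Next i

        Remaining = Remaining - ThisChunk
    Loop

    ChunkedByteCompare = True
End Function
```

    The plain byte loop is used here only to keep the sketch simple; the same buffer sizing would apply to the Instr or string compares.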

  16. #56
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    another way of looking at it; why does the same code take a different amount of time to run every time?

  17. #57
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    It depends on what else is running on your computer at the time and the speed of your hard drive.

  18. #58
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    exactly, the thread of your program is constantly being interrupted by other threads that are running on the pc at the same time.

  19. #59
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    And that's a problem how???

    On my computer the Instr compare is almost half the time of the string compare

  20. #60
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    no problem.

    have updated the project to a higher resolution timer
    Attached Files

  21. #61
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    And what is that supposed to accomplish and add to or help the original problem?

  22. #62
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    randem

    if you look at my post above you will see that the string comparison and the instr comparison take almost the same number of milliseconds; the higher resolution timer allows one to distinguish performance in such a circumstance. it also highlights that the same code usually takes a different amount of time on each run. understanding this allows a proper analysis of each algorithm.
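
    For reference, a higher resolution timer in VB6 is usually built on the QueryPerformanceCounter API. This is only a sketch of that common approach, not necessarily what the attached project does; StartTimer, ElapsedMs and m_Start are made-up names:

```vb
' Sketch only: StartTimer, ElapsedMs and m_Start are hypothetical names.
' Currency is used because it is VB6's only 64-bit integer-like type.
Private Declare Function QueryPerformanceCounter Lib "kernel32" (lpPerformanceCount As Currency) As Long
Private Declare Function QueryPerformanceFrequency Lib "kernel32" (lpFrequency As Currency) As Long

Private m_Start As Currency

' Records the counter value at the start of the measured code
Private Sub StartTimer()
    QueryPerformanceCounter m_Start
End Sub

' Returns the milliseconds elapsed since StartTimer was called
Private Function ElapsedMs() As Double
    Dim Finish As Currency, Freq As Currency

    QueryPerformanceCounter Finish
    QueryPerformanceFrequency Freq

    ' Currency scales values by 10000; the scale cancels out in the ratio
    If Freq <> 0 Then ElapsedMs = (Finish - m_Start) / Freq * 1000#
End Function
```

    Call StartTimer just before the routine and ElapsedMs right after it. Since the same code takes a different time on each run, it helps to repeat each test several times and compare the results rather than trust a single run.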

  23. #63
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Does not... It is just for informational purposes only... If you need that degree of accuracy in instruction timing you are using the wrong language and it will be MOOT!!!

  24. #64
    Member questioner's Avatar
    Join Date
    Mar 2007
    Location
    question land
    Posts
    49

    Re: Easiest and quick way to check if files are identical?

    those same issues hold true of other languages too

  25. #65
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Easiest and quick way to check if files are identical?

    There is no need for high accuracy in this case; a few milliseconds here or there do not matter in the long run. Basically, what was to be shown was that reading chunks and comparing those is more efficient with bigger files than reading and comparing one byte at a time (which is faster with small files).

    A single code snippet as such is not important. Instead, it is important for a programmer to understand what choices he has and how they affect what he is trying to accomplish, so that he can do things in a good way.

    So in this case we can conclude that in general it is good to recommend, or give an example of, a chunked file comparison, because 1) if somebody happens to use the code for a big file, or writes his own code based on the given information, it will perform well enough, and 2) even though chunked is slower with small files, the files are so small that the slightly worse performance doesn't matter.

    There is also the option of reading the whole file at once, but it is the worst possible choice: the file may be big, and since the comparison only begins once the file has been read entirely, in the worst case the very first byte differs and reading all the data into memory was a waste of time.

    If we were giving an example for a specialized case where we know the file is always small, it would be a different story: it would make little difference whether one byte at a time or chunked reading was used.


    Yay for informational posts.

  26. #66
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    The best scenario would be to determine the size of the file and then decide whether to read the whole file at once or in chunks. I have never heard of reading one byte at a time being faster. I will put that in the test code to disprove or verify.
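
    A rough sketch of that size-first decision, assuming the files are under 2 GB (FileLen returns a Long). SameLength, PickBufferSize and both constants are made-up names, and the limits are placeholders rather than measured values:

```vb
' Sketch only: all names and both limits below are hypothetical placeholders.
Private Const WHOLE_FILE_LIMIT As Long = 1048576   ' read smaller files in one go
Private Const CHUNK_BYTES As Long = 16384          ' chunk size for bigger files

' Files with different lengths cannot be identical, so nothing needs to be read
Private Function SameLength(ByVal Path1 As String, ByVal Path2 As String) As Boolean
    SameLength = (FileLen(Path1) = FileLen(Path2))
End Function

' Returns the buffer size to use for a file of FileSize bytes.
' Returns 0 for an empty file, so the caller should skip reading in that case.
Private Function PickBufferSize(ByVal FileSize As Long) As Long
    If FileSize <= WHOLE_FILE_LIMIT Then
        PickBufferSize = FileSize
    Else
        PickBufferSize = CHUNK_BYTES
    End If
End Function
```

    Checking SameLength before opening either file means two files of different size are rejected without reading any data at all.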

  27. #67
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    As expected reading byte by byte is the absolute slowest of them all. Snails passed it as it was working... it's at least 32 times slower than the slowest routine...
    Attached Files
    Last edited by randem; Mar 24th, 2007 at 03:53 AM.

  28. #68
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Easiest and quick way to check if files are identical?

    It depends on the file size, on whether the files are different, and on how early they differ. But it is far from ideal. It seemed the fastest when files were very small (and thus when being the fastest matters least).

    Also, reading the whole file at once becomes awfully slow when there is not enough memory to hold the file. Swapping kills all the performance.

    Code:
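    Private Function LongCompareFiles() As Boolean
        ' Compares the two open files 16 kB at a time (4096 Longs x 4 bytes).
        ' Fnum, Fnum1, StartTime, EndTime and Text1 come from the test project.
        ' File lengths are not checked here; compare LOF(Fnum) and LOF(Fnum1) first.
        Dim lngData1(4095) As Long, lngData2(4095) As Long, lngA As Long
        Dim lngLow As Long, lngHigh As Long, lngCompare As Long
    
        Seek #Fnum, 1
        Seek #Fnum1, 1
        
        lngLow = LBound(lngData1)
        lngHigh = UBound(lngData1)
        
        lngCompare = 0
        StartTime = GetTickCount
        Do While Not EOF(Fnum) And lngCompare = 0
            Get #Fnum, , lngData1
            Get #Fnum1, , lngData2
            
            ' Accumulate with Or so that a difference found early in the chunk
            ' is not overwritten by later matching values
            For lngA = lngLow To lngHigh
                lngCompare = lngCompare Or (lngData1(lngA) Xor lngData2(lngA))
            Next lngA
        Loop
    
        EndTime = GetTickCount
        
        ' Any set bit means at least one Long differed
        LongCompareFiles = (lngCompare = 0)
        
        Text1.Text = Text1.Text & vbCrLf & vbCrLf & "LongCompareFiles Elapsed time - " & EndTime - StartTime & " ms"
        DoEvents
    End Function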
    To test the speed of this function, compile the program: Make Project1.exe > Options... > Compile > Advanced Optimizations... > Check all > OK > OK > OK

    It should be twice as fast as the previous fastest routine.


    Edit!
    Improved by keeping the data in the Long datatype as much as possible. It should now be over two times faster with big files than the earlier fastest routine.

    Edit #2
    Also noticed that the 16 kB chunk size that I used performs far better than other chunk sizes: with a 1 GB file the difference can be about 300-400 ms compared to 64 kB chunks. Of course, this varies depending on hardware.
    Last edited by Merri; Mar 24th, 2007 at 09:23 AM.

  29. #69
    Banned randem's Avatar
    Join Date
    Oct 2002
    Location
    Maui, Hawaii
    Posts
    11,385

    Re: Easiest and quick way to check if files are identical?

    Here is an update to the program along with Merri's new function. It also tests different buffer sizes for maximum efficiency.
    Attached Files
