
Thread: Help achieving CPU's max potential

  1. #1

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Help achieving CPU's max potential

    Hey, I'm writing a small program that compares images, and my problem is this:
    I need to compare over 1,300 images.
    The method I'm using is incredibly slow for that many images.
    Can anyone help me improve this code?

    Code:
     For i = 0 To files.Length - 3
                For j = i + 1 To files.Length - 2
                    p1.Image = Image.FromFile(files(i))
                    p2.Image = Image.FromFile(files(j))
    
                    If AreSameImage(p1.Image, p2.Image, counter) Then
                        f(f.Length - 1) = files(i)
                        Dim tmp() As String = f
                        f = New String(f.Length) {}
    
                        For k = 0 To tmp.Length - 1
                            f(k) = tmp(k)
                        Next
                    Else
                        Dim percentual As Double = (100 * counter) / (p1.Height * p1.Width)
    
                        If percentual > 90 Then
                            f(f.Length - 1) = files(i)
                            Dim tmp() As String = f
                            f = New String(f.Length) {}
    
                            For k = 0 To tmp.Length - 1
                                f(k) = tmp(k)
                            Next
                        End If
                    End If
    
                    counter = 0
                    p1.Image.Dispose()
                    p2.Image.Dispose()
                Next
            Next
    I guess the rest of the code is irrelevant, but if needed I'll post it here.

  2. #2
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: Help achieving CPU's max potential

    Instead of using a For loop, you might try calling Parallel.For. That will use the ThreadPool to execute multiple iterations in parallel. I've never actually written nested parallel loops before so give me a bit and I'll post an example.

  3. #3
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: Help achieving CPU's max potential

    This code:
    vb.net Code:
    Imports System.Threading

    Module Module1

        Sub Main()
            Dim items = {"Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven"}
            Dim timer As New Stopwatch

            timer.Start()

            For i = 0 To items.Length - 2
                For j = i + 1 To items.Length - 1
                    Thread.Sleep(500)
                    Console.WriteLine("{0}: {1}, {2}", timer.ElapsedMilliseconds, items(i), items(j))
                Next
            Next

            Console.WriteLine("Total time: " & timer.Elapsed.ToString())
            Console.ReadLine()
        End Sub

    End Module
    produced this output:
    500: Zero, One
    1014: Zero, Two
    1515: Zero, Three
    2016: Zero, Four
    2517: Zero, Five
    3018: Zero, Six
    3519: Zero, Seven
    4020: One, Two
    4521: One, Three
    5022: One, Four
    5523: One, Five
    6024: One, Six
    6525: One, Seven
    7026: Two, Three
    7527: Two, Four
    8028: Two, Five
    8529: Two, Six
    9030: Two, Seven
    9531: Three, Four
    10032: Three, Five
    10533: Three, Six
    11034: Three, Seven
    11535: Four, Five
    12036: Four, Six
    12537: Four, Seven
    13038: Five, Six
    13539: Five, Seven
    14040: Six, Seven
    Total time: 00:00:14.0408164
    while this code:
    vb.net Code:
    Imports System.Threading

    Module Module1

        Sub Main()
            Dim items = {"Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven"}
            Dim timer As New Stopwatch

            timer.Start()

            Parallel.For(0,
                         items.Length - 1,
                         Sub(i, outerState)
                             Parallel.For(i + 1,
                                          items.Length,
                                          Sub(j, innerState)
                                              Thread.Sleep(500)
                                              Console.WriteLine("{0}: {1}, {2}", timer.ElapsedMilliseconds, items(i), items(j))
                                          End Sub)
                         End Sub)

            Console.WriteLine("Total time: " & timer.Elapsed.ToString())
            Console.ReadLine()
        End Sub

    End Module
    produced this output:
    524: Zero, One
    531: Zero, Two
    531: One, Two
    532: Two, Three
    535: One, Three
    1025: Zero, Three
    1032: Zero, Four
    1032: One, Four
    1033: Two, Four
    1036: One, Five
    1526: Zero, Five
    1533: One, Six
    1533: Zero, Six
    1534: Two, Five
    1537: One, Seven
    2027: Zero, Seven
    2035: Two, Six
    2035: Three, Four
    2038: Four, Five
    2038: Five, Six
    2528: Six, Seven
    2554: Three, Five
    2554: Five, Seven
    2554: Two, Seven
    2554: Four, Six
    3056: Three, Seven
    3056: Three, Six
    3058: Four, Seven
    Total time: 00:00:03.0595208
    That's a performance increase of about 4.6 times. Note that your mileage may vary, depending on what you're actually doing and also from run to run. It's important to note that the order of execution is going to vary from run to run as well, so you need to be sure that the order of execution is not important.

    EDIT: The code has been updated from my original post because I had made a mistake in the parallel loop limits. I thought I'd point it out here because it's an easy mistake to make. Note that the upper limits for the standard For loops are 'items.Length - 2' and 'items.Length - 1' respectively. Those upper limits are inclusive, whereas the upper limits in the Parallel.For calls are exclusive: they are 'items.Length - 1' and 'items.Length' respectively. I originally used inclusive upper limits in both cases, so not all intended iterations were executed in the parallel case.
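
    The inclusive/exclusive difference can be checked in isolation. The following is only a minimal sketch that counts loop iterations; the module and function names (BoundsDemo, CountSequential, CountParallel) are made up for illustration and are not part of the code above.

```vbnet
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module BoundsDemo
    ' Counts how many times the body of a standard For loop runs.
    Function CountSequential(first As Integer, last As Integer) As Integer
        Dim count = 0
        For i = first To last              ' "To" includes the upper limit
            count += 1
        Next
        Return count
    End Function

    ' Counts how many times the body of a Parallel.For runs.
    Function CountParallel(fromInclusive As Integer, toExclusive As Integer) As Integer
        Dim count = 0
        Parallel.For(fromInclusive, toExclusive,
                     Sub(i As Integer)
                         Interlocked.Increment(count)   ' thread-safe increment
                     End Sub)
        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountSequential(0, 7))   ' 8 iterations: 0 through 7
        Console.WriteLine(CountParallel(0, 7))     ' 7 iterations: 0 through 6
        Console.WriteLine(CountParallel(0, 8))     ' 8 iterations: 0 through 7
    End Sub
End Module
```

    Passing the same two numbers to both loops therefore drops the last iteration in the parallel version, which is exactly the mistake described in the EDIT.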

  4. #4
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,344

    Re: Help achieving CPU's max potential

    I have updated the code example to fix an error in the parallel loops.

  5. #5
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: Help achieving CPU's max potential

    The routine AreSameImage is likely to be a major bottleneck and have huge potential for improvement, so it would be a good idea to show us that.

    There are various ways of working with image data, and most of them are slow... using something like FastPix (in our CodeBank forum) can provide a big speed boost in most cases.

    I also get the impression that minor changes to AreSameImage could reduce the size of the code you showed us by eliminating duplication of the tmp/f parts of the code.
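
    As a rough illustration of removing the tmp/f duplication: a List(Of String) grows on demand, so the copy-to-a-bigger-array step disappears. This is a sketch under assumptions, not the poster's code. CollectMatches and its isMatch parameter are hypothetical stand-ins for the real comparison, and the loop bounds assume a plain array with no spare trailing slot.

```vbnet
Imports System
Imports System.Collections.Generic

Module ListDemo
    ' Compares every pair once and records the first file of each matching pair.
    ' isMatch is a hypothetical stand-in for the real image comparison.
    Function CollectMatches(files As String(),
                            isMatch As Func(Of String, String, Boolean)) As List(Of String)
        Dim matches As New List(Of String)      ' replaces the manually regrown array
        For i = 0 To files.Length - 2
            For j = i + 1 To files.Length - 1
                If isMatch(files(i), files(j)) Then matches.Add(files(i))
            Next
        Next
        Return matches
    End Function

    Sub Main()
        Dim files = {"a.png", "b.png", "A.PNG"}
        ' Placeholder rule: names that differ only by case count as a match.
        Dim matches = CollectMatches(files,
            Function(x, y) String.Equals(x, y, StringComparison.OrdinalIgnoreCase))
        Console.WriteLine(matches.Count)   ' 1
        Console.WriteLine(matches(0))      ' a.png
    End Sub
End Module
```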

  6. #6

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Quote Originally Posted by si_the_geek View Post
    The routine AreSameImage is likely to be a major bottleneck and have huge potential for improvement, so it would be a good idea to show us that.

    There are various ways of working with image data, and most of them are slow... using something like FastPix (in our CodeBank forum) can provide a big speed boost in most cases.

    I also get the impression that minor changes to AreSameImage could reduce the size of the code you showed us by eliminating duplication of the tmp/f parts of the code.
    I made a few changes to the code after posting it.
    Here's the updated version, including the AreSameImage function.

    Code:
    Sub Main()
            Dim args As String() = Environment.GetCommandLineArgs
    
            Dim mpath As String = args(0).Substring(0, args(0).LastIndexOf("\"))
    
            Directory.CreateDirectory(mpath & "\Duplicatas")
    
            Dim counter As Integer = 0
            Dim f As String() = New String(0) {"Nenhum arquivo duplicado"}
            Dim p1, p2 As New PictureBox
    
    inicio:
            Dim ftmp As String() = Directory.GetFiles(mpath)
            Dim files As String() = New String(0) {"Nenhum arquivo de imagem"}
            Dim counterr As Integer = 0
    
            For a = 0 To ftmp.Length - 1
                If (ftmp(a).EndsWith(".jpg") Or ftmp(a).EndsWith(".png") Or ftmp(a).EndsWith(".bmp")) = True Then
                    files(counterr) = ftmp(a)
                    Dim tmp() As String = files
    
                    counterr += 1
                    files = New String(counterr) {}
    
                    For b = 0 To tmp.Length - 1
                        files(b) = tmp(b)
                    Next
                End If
            Next
    
            If files.Length = 1 Then
                If files(0) = "Nenhum arquivo de imagem" Then
                    Console.WriteLine(files(0))
                    Console.ReadKey()
                    Exit Sub
                End If
            End If
    
            For i = 0 To files.Length - 3
                For j = i + 1 To files.Length - 2
                    p1.Image = Image.FromFile(files(i))
                    p2.Image = Image.FromFile(files(j))
    
                    If AreSameImage(p1.Image, p2.Image, counter) Then
                        counter = 0
                        p1.Image.Dispose()
                        p2.Image.Dispose()
    
                        Shell("move /y " & files(i) & " " & mpath & "\Duplicatas" & files(i).Substring(files(i).LastIndexOf("\")))
                        Shell("move /y " & files(j) & " " & mpath & "\Duplicatas" & files(j).Substring(files(j).LastIndexOf("\")))
                        Console.WriteLine(files(i) & "   -   " & files(j))
                        Console.WriteLine("")
                        GoTo inicio
                    Else
                        Dim percentual As Double = (counter / (p1.Image.Height * p1.Image.Width)) * 100
    
                        If percentual > 90 Then
                            counter = 0
                            p1.Image.Dispose()
                            p2.Image.Dispose()
    
                            Dim str1 As String = "move /y """ & files(i) & """ """ & mpath & "\Duplicatas" & files(i).Substring(files(i).LastIndexOf("\") & """")
                            Dim str2 As String = "move /y """ & files(j) & """ """ & mpath & "\Duplicatas" & files(j).Substring(files(j).LastIndexOf("\") & """")
    
                            Shell(str1)
                            Shell(str2)
                            Console.WriteLine(files(i) & "   -   " & files(j))
                            Console.WriteLine("")
                            GoTo inicio
                        End If
                    End If
                    counter = 0
                Next
            Next
        End Sub
    
        Public Function AreSameImage(ByVal I1 As Image, ByVal I2 As Image, ByRef c As Integer) As Boolean
            Dim BM1 As Bitmap = I1
            Dim BM2 As Bitmap = I2
    
            If BM1.Width <> BM2.Width Or BM1.Height <> BM2.Height Then
                Return False
            End If
    
            For X = 0 To BM1.Width - 1
                For y = 0 To BM2.Height - 1
                    If BM1.GetPixel(X, y) <> BM2.GetPixel(X, y) Then
                        c += 1
                    End If
                Next
            Next
    
            If c > 0 Then
                Return False
            Else
                Return True
            End If
        End Function
    This is the entire code so far.
    I'll give Parallel.For a try later.

  7. #7
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,138

    Re: Help achieving CPU's max potential

    One glaring issue with your AreSameImage function is in the nested For loops. Once you detect one difference, there is no reason to keep looping, but the current code carries on through every remaining pixel. Try the version below and see how much difference that change alone makes.

    Code:
        Public Function AreSameImage(ByVal I1 As Image, ByVal I2 As Image, ByRef c As Integer) As Boolean
            Dim BM1 As Bitmap = I1
            Dim BM2 As Bitmap = I2
    
            If BM1.Width <> BM2.Width Or BM1.Height <> BM2.Height Then
                Return False
            End If
    
            For X = 0 To BM1.Width - 1
                For y = 0 To BM2.Height - 1
                    If BM1.GetPixel(X, y) <> BM2.GetPixel(X, y) Then
                        Return False
                    End If
                Next
            Next
    
            Return True
        End Function

  8. #8

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Quote Originally Posted by OptionBase1 View Post
    One glaring significant issue with your AreSameImage function is in your nested for loops. Once you detect one difference, there is no reason to continue processing the for loops at all, which you currently are. Try the below and see how much just that change makes.

    Code:
        Public Function AreSameImage(ByVal I1 As Image, ByVal I2 As Image, ByRef c As Integer) As Boolean
            Dim BM1 As Bitmap = I1
            Dim BM2 As Bitmap = I2
    
            If BM1.Width <> BM2.Width Or BM1.Height <> BM2.Height Then
                Return False
            End If
    
            For X = 0 To BM1.Width - 1
                For y = 0 To BM2.Height - 1
                    If BM1.GetPixel(X, y) <> BM2.GetPixel(X, y) Then
                        Return False
                    End If
                Next
            Next
    
            Return True
        End Function
    This is also meant to measure what percentage of the pixels are equal...
    That's why I don't exit the loop at the first pixel that's different.
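
    An early exit and a tolerance are not mutually exclusive: the loop can stop as soon as the number of differing pixels exceeds what the tolerance allows. The sketch below assumes pixels are available as packed Integer values in two arrays; IsNearDuplicate and maxDifferentPercent are illustrative names, not part of the posted code. It is also worth double-checking the code in #6: counter is incremented when pixels differ, so percentual is the share of differing pixels, and the "> 90" test looks inverted relative to the stated intent.

```vbnet
Imports System

Module ToleranceDemo
    ' Returns True when at most maxDifferentPercent of the pixels differ.
    ' Assumption: pixels are packed ARGB values in two arrays.
    Function IsNearDuplicate(a As Integer(), b As Integer(),
                             maxDifferentPercent As Double) As Boolean
        If a.Length <> b.Length Then Return False

        ' Largest number of differing pixels that still counts as a match.
        Dim budget = CInt(Math.Floor(a.Length * maxDifferentPercent / 100.0))
        Dim different = 0

        For i = 0 To a.Length - 1
            If a(i) <> b(i) Then
                different += 1
                ' Stop as soon as the outcome can no longer change.
                If different > budget Then Return False
            End If
        Next
        Return True
    End Function

    Sub Main()
        Dim original = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        Dim edited = {1, 2, 3, 4, 5, 6, 7, 8, 9, 99}    ' 1 of 10 pixels changed
        Dim other = {0, 0, 3, 4, 5, 6, 7, 8, 9, 10}     ' 2 of 10 pixels changed

        Console.WriteLine(IsNearDuplicate(original, edited, 10.0))  ' True
        Console.WriteLine(IsNearDuplicate(original, other, 10.0))   ' False
    End Sub
End Module
```

    With a 10% tolerance, two images that are nothing alike are rejected after roughly a tenth of the pixels have been examined instead of all of them.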

  9. #9
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: Help achieving CPU's max potential

    If OptionBase1's version gives you the right results, it should be faster to use that rather than using FastPix etc (for most images anyway).


    I'm a bit worried about the addition of "GoTo inicio", as it seems to me that will just cause lots of extra work (every time you find matching images, restart the entire process). I'm not sure of your intent there, but there is probably a much better way to deal with it.

  10. #10
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,138

    Re: Help achieving CPU's max potential

    Quote Originally Posted by Darkratos View Post
    This is meant to detect also the percentage of how much they are equal...
    That's why I don't end the loop in the first pixel that's different.
    Yeah, sorry about that. I see that now looking closer at the rest of your code. Good luck.

  11. #11
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    39,038

    Re: Help achieving CPU's max potential

    What is the point of the percentage of how much they are equal? You are comparing the individual pixels in an image. That's potentially comparing 32-bit images. They could be images of identical things, taken from the same vantage point, and only seconds apart, yet have 0% similarity. All that would have to happen would be a slight (even one-pixel) shift in the image, and none of the pixels may line up. So, if the goal is some kind of image recognition, it won't do that. If they are rendered pictures, then there's a better chance, and if they are really simple, then there's a much better chance.
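
    A toy case shows how far a one-pixel shift can push a position-by-position comparison. This is only an illustration on a single row of made-up pixel values, with a hypothetical PercentEqual helper; real photographs will usually land somewhere between the two extremes.

```vbnet
Imports System

Module ShiftDemo
    ' Percentage of positions holding the same value in both rows.
    ' Assumes both rows have the same, non-zero length.
    Function PercentEqual(a As Integer(), b As Integer()) As Double
        Dim equal = 0
        For i = 0 To a.Length - 1
            If a(i) = b(i) Then equal += 1
        Next
        Return 100.0 * equal / a.Length
    End Function

    Sub Main()
        ' One row of alternating dark and light pixels.
        Dim row = {0, 255, 0, 255, 0, 255, 0, 255}
        ' The same pattern moved sideways by a single pixel.
        Dim shifted = {255, 0, 255, 0, 255, 0, 255, 0}

        Console.WriteLine(PercentEqual(row, row))       ' 100
        Console.WriteLine(PercentEqual(row, shifted))   ' 0
    End Sub
End Module
```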

  12. #12

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Okay, I've been reading your comments, so let me answer them:

    si_the_geek: the point of "GoTo inicio" is to restart the folder listing. My first version of the program took hashes and added the matches to a list, but that involves a lot more code. I thought that moving the files straight to where they belong would make the program smaller, though also slower.

    Shaggy: Some of the images have just a few pixels edited (not sure why), but they don't change much. That's the reason for the 90% margin.


    I'm also having trouble with the move command:

    Whether I use Shell or File.Move, I get the following:
    System.IO.IOException: 'O processo não pode acessar o arquivo porque ele está sendo usado por outro processo.'
    (translation: The process cannot access the file because it is being used by another process)

    I can't seem to get rid of it, even after disposing the image.
    Last edited by Darkratos; Mar 24th, 2018 at 07:58 PM.
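
    One possible cause, offered as a guess rather than a diagnosis: Image.FromFile keeps the file locked until the Image is disposed, and in the code posted in #6 the images are only disposed on the branches that move the files, not on the non-matching path. Wrapping each image in a Using block guarantees disposal on every path. The sketch below uses made-up temporary file names and simply shows that File.Move succeeds once the Using block has ended.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.IO

Module UsingDemo
    ' Opens an image, reads its width, then moves the file.
    ' Returns True if the move succeeded.
    Function ReadThenMove(source As String, target As String) As Boolean
        Dim width As Integer
        Using img = Image.FromFile(source)
            width = img.Width          ' work with the image inside the block
        End Using                      ' the file lock is released here

        File.Move(source, target)      ' would throw IOException if still locked
        Return File.Exists(target)
    End Function

    Sub Main()
        ' Hypothetical file names in the temp folder, used only for this demo.
        Dim source = Path.Combine(Path.GetTempPath(), "lock_demo_source.png")
        Dim target = Path.Combine(Path.GetTempPath(), "lock_demo_target.png")

        Using bmp As New Bitmap(2, 2)
            bmp.Save(source, ImageFormat.Png)
        End Using
        If File.Exists(target) Then File.Delete(target)

        Console.WriteLine(ReadThenMove(source, target))   ' True
        File.Delete(target)
    End Sub
End Module
```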

  13. #13
    Addicted Member Goggy's Avatar
    Join Date
    Oct 2017
    Posts
    196

    Re: Help achieving CPU's max potential

    The GetPixel method is very slow.

    Try looking into the BitmapData class.

    here
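
    For reference, a minimal sketch of what a BitmapData-based comparison might look like. It assumes System.Drawing is available (.NET Framework, or System.Drawing.Common on Windows), locks both bitmaps as 32bpp ARGB, and copies one row at a time so that it also works when Stride is negative. CountDifferentPixels is a hypothetical name, not something from the linked article.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.Runtime.InteropServices

Module LockBitsDemo
    ' Counts differing pixels by reading whole rows from locked bitmap memory
    ' instead of calling GetPixel once per pixel.
    Function CountDifferentPixels(bm1 As Bitmap, bm2 As Bitmap) As Integer
        If bm1.Width <> bm2.Width OrElse bm1.Height <> bm2.Height Then
            Throw New ArgumentException("Images must have the same dimensions.")
        End If

        Dim rect As New Rectangle(0, 0, bm1.Width, bm1.Height)
        ' Request 32bpp ARGB so that each pixel is exactly one Integer.
        Dim d1 = bm1.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb)
        Dim d2 As BitmapData = Nothing
        Dim different = 0
        Try
            d2 = bm2.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb)
            Dim row1(rect.Width - 1) As Integer
            Dim row2(rect.Width - 1) As Integer
            For y = 0 To rect.Height - 1
                ' Stride is the byte offset between rows; it can be negative.
                Marshal.Copy(IntPtr.Add(d1.Scan0, y * d1.Stride), row1, 0, rect.Width)
                Marshal.Copy(IntPtr.Add(d2.Scan0, y * d2.Stride), row2, 0, rect.Width)
                For x = 0 To rect.Width - 1
                    If row1(x) <> row2(x) Then different += 1
                Next
            Next
        Finally
            ' Always unlock, even if something above throws.
            If d2 IsNot Nothing Then bm2.UnlockBits(d2)
            bm1.UnlockBits(d1)
        End Try
        Return different
    End Function

    Sub Main()
        Using a As New Bitmap(4, 4), b As New Bitmap(4, 4)
            b.SetPixel(1, 2, Color.Red)
            Console.WriteLine(CountDifferentPixels(a, b))   ' 1
        End Using
    End Sub
End Module
```

    The two bitmaps must be different objects, since a bitmap that is already locked cannot be locked a second time.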

  14. #14

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Quote Originally Posted by Goggy View Post
    The GetPixel method is very slow.

    Try looking in to the bitmapdata class.

    here
    Okay, I've messed around a bit more and decided to get rid of a few things, as suggested:

    1- As soon as it encounters a different pixel it returns false

    2- Got rid of percentage

    3- I changed the move and GoTo inicio to do it in the end, after the loops are completed

    4- Changed the for loop to the Parallel.For

    But with Parallel.For it gives an error with the image: System.ArgumentException: 'Invalid Parameter.'
    StackTrace: at System.Drawing.Image.get_Width()
    at Comparador_de_Imagens_2._0.Module1.AreSameImage(Image I1, Image I2) in F:\Users\Darkratos\Desktop\Projetos VB\Comparador de Imagens 2.0\Comparador de Imagens 2.0\Module1.vb:line 95
    at Comparador_de_Imagens_2._0.Module1._Closure$__0-1._Lambda$__1(Int32 j, ParallelLoopState innerState) in F:\Users\Darkratos\Desktop\Projetos VB\Comparador de Imagens 2.0\Comparador de Imagens 2.0\Module1.vb:line 52
    at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()

    If we can get rid of this error, my task here is complete and the performance will be a lot better.

  15. #15
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: Help achieving CPU's max potential

    There are many potential causes of that error, but the chances are it is due to something in your code... but as we can't see your code, we can't really help you correct it.

  16. #16

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Quote Originally Posted by si_the_geek View Post
    There are many potential causes of that error, but the chances are it is due to something in your code... but as we can't see your code, we can't really help you correct it.
    No problem in seeing the code:

    Code:
    Imports System.Drawing, System.IO, System.Windows.Forms, System.Threading
    
    Module Module1
    
        Sub Main()
            Dim args As String() = Environment.GetCommandLineArgs
    
            Dim mpath As String = args(0).Substring(0, args(0).LastIndexOf("\"))
    
            Directory.CreateDirectory(mpath & "\Duplicatas")
    
            Dim f As String() = New String(0) {"Nenhum arquivo duplicado"}
            Dim p1, p2 As New PictureBox
    
            Dim ftmp As String() = Directory.GetFiles(mpath)
            Dim files As String() = New String(0) {"Nenhum arquivo de imagem"}
            Dim counterr As Integer = 0
    
            For a = 0 To ftmp.Length - 1
                If (ftmp(a).EndsWith(".jpg") Or ftmp(a).EndsWith(".png") Or ftmp(a).EndsWith(".bmp")) = True Then
                    files(counterr) = ftmp(a)
                    Dim tmp() As String = files
    
                    counterr += 1
                    files = New String(counterr) {}
    
                    For b = 0 To tmp.Length - 1
                        files(b) = tmp(b)
                    Next
                End If
            Next
    
            If files.Length = 1 Then
                If files(0) = "Nenhum arquivo de imagem" Then
                    Console.WriteLine(files(0))
                    Console.ReadKey()
                    Exit Sub
                End If
            End If
    
            Parallel.For(0,
                         files.Length - 1,
                         Sub(i, outerState)
                             Parallel.For(i + 1,
                                          files.Length,
                                          Sub(j, innerState)
                                              Console.WriteLine("Verificando arquivo {0} contra arquivo {1}", i, j)
                                              Console.WriteLine()
                                              p1.Image = Image.FromFile(files(i))
                                              p2.Image = Image.FromFile(files(j))
    
                                              If AreSameImage(p1.Image, p2.Image) Then
                                                  f(f.Length - 1) = files(i)
                                                  Dim tmp() As String = f
                                                  f = New String(f.Length) {}
    
                                                  For k = 0 To tmp.Length - 1
                                                      f(k) = tmp(k)
                                                  Next
    
    
                                              End If
                                              p1.Image.Dispose()
                                              p2.Image.Dispose()
                                          End Sub)
                         End Sub)
        End Sub
    
        Public Function AreSameImage(ByVal I1 As Image, ByVal I2 As Image) As Boolean
            Dim BM1 As Bitmap = I1
            Dim BM2 As Bitmap = I2
    
            If BM1.Width <> BM2.Width Or BM1.Height <> BM2.Height Then
                Return False
            End If
    
            For X = 0 To BM1.Width - 1
                For y = 0 To BM2.Height - 1
                    If BM1.GetPixel(X, y) <> BM2.GetPixel(X, y) Then
                        Return False
                    End If
                Next
            Next
    
            Return True
        End Function
    End Module

  17. #17
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,930

    Re: Help achieving CPU's max potential

    I'm thinking the variables p1 and p2 are the problem... they are declared outside of the Parallel.For, so it is likely that they are being shared between the different parallel iterations in a way that is not safe.

    If you move the declarations of those variables inside the Parallel.For instead (where you assign values to them) it is likely to correct the problem. Also note that you don't need to use a PictureBox (as you aren't going to use it to show the image on screen), so you can just declare a System.Drawing.Image object instead.
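
    A rough sketch of that suggestion, under a few assumptions: the images are declared inside the parallel body with Using (so each iteration owns and disposes its own), only the outer loop is parallel so the first image is loaded once per outer index, and matches go into a ConcurrentBag because a shared array is not safe to resize from several threads. FindDuplicates is a hypothetical name, and AreSameImage here is a simplified stand-in for the function posted above.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Drawing
Imports System.IO
Imports System.Threading.Tasks

Module LocalImagesDemo
    ' Simplified stand-in for the AreSameImage function from the thread.
    Function AreSameImage(bm1 As Bitmap, bm2 As Bitmap) As Boolean
        If bm1.Width <> bm2.Width OrElse bm1.Height <> bm2.Height Then Return False
        For x = 0 To bm1.Width - 1
            For y = 0 To bm1.Height - 1
                If bm1.GetPixel(x, y) <> bm2.GetPixel(x, y) Then Return False
            Next
        Next
        Return True
    End Function

    Function FindDuplicates(files As String()) As ConcurrentBag(Of String)
        Dim duplicates As New ConcurrentBag(Of String)   ' safe to Add from many threads
        Parallel.For(0, files.Length - 1,
            Sub(i As Integer)
                ' Declared inside the lambda: nothing is shared between iterations.
                Using img1 As New Bitmap(files(i))
                    For j = i + 1 To files.Length - 1
                        Using img2 As New Bitmap(files(j))
                            If AreSameImage(img1, img2) Then duplicates.Add(files(i))
                        End Using
                    Next
                End Using
            End Sub)
        Return duplicates
    End Function

    Sub Main(args As String())
        ' Usage: pass a folder path; lists the first file of each matching pair.
        If args.Length = 0 Then Return
        For Each name In FindDuplicates(Directory.GetFiles(args(0), "*.png"))
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

    Different threads may open the same file at the same time here; that works because each Bitmap is a separate object, but the order in which matches are reported will vary from run to run.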

  18. #18

    Thread Starter
    New Member
    Join Date
    Mar 2017
    Posts
    10

    Re: Help achieving CPU's max potential

    Quote Originally Posted by si_the_geek View Post
    I'm thinking the variables p1 and p2 are the problem... they are declared outside of the Parallel.For, so it is likely that they are being shared between the different parallel iterations in a way that is not safe.

    If you move the declarations of those variables inside the Parallel.For instead (where you assign values to them) it is likely to correct the problem. Also note that you don't need to use a PictureBox (as you aren't going to use it to show the image on screen), so you can just declare a System.Drawing.Image object instead.
    Thanks for the tip, going to test it and bring the results!

  19. #19
    Powered By Medtronic dbasnett's Avatar
    Join Date
    Dec 2007
    Location
    Jefferson City, MO
    Posts
    9,764

    Re: Help achieving CPU's max potential

    Quote Originally Posted by Darkratos View Post
    Hey, I'm writing a small program that compares images and my problem is:
    I need to compare over a 1300 images...
    What about using tasks to compare files???
    Code:
            Dim ctRun As Long = 0L
            Dim maxRun As Long = Environment.ProcessorCount - 1 ' 4L ' 2L * 16L ' controls the number of tasks to run at once
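            If maxRun < 1 Then maxRun = 1 ' ProcessorCount - 1 is 0 on a single-core machine, which would make the wait loop below spin forever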
    
            Dim compTasks As New Concurrent.BlockingCollection(Of Task)
    
            Parallel.For(0, files.Length, (Sub(oidx)
                                               Dim t As Task(Of Boolean)
                                               t = New Task(Of Boolean)(Function() As Boolean
                                                                            Dim isEqual As Boolean = False
                                                                            Dim ofile As String = files(oidx)
                                                                            Dim p1 As Image = Image.FromFile(ofile)
                                                                            Dim ifile As String
                                                                            For idx As Integer = 0 To files.Length - 1
                                                                                If idx <> oidx Then
                                                                                    ifile = files(idx)
                                                                                    Dim p2 As Image = Image.FromFile(ifile)
                                                                                    'compare p1 and p2 here '<<<<<<<<<<<<
    
                                                                                    'then
                                                                                    p2.Dispose()
                                                                                Else
                                                                                    Debug.WriteLine(ofile)
                                                                                End If
                                                                            Next
                                                                            p1.Dispose()
                                                                            Threading.Interlocked.Decrement(ctRun)
    
                                                                            Return isEqual
                                                                        End Function)
    
                                               Threading.Interlocked.Increment(ctRun)
                                               compTasks.Add(t)
                                               t.Start()
                                               Do While Threading.Interlocked.Read(ctRun) >= maxRun
                                                   Threading.Thread.Sleep(10)
                                               Loop
                                           End Sub))
    
            Task.WaitAll(compTasks.ToArray)
    Not sure what goes at 'compare p1 and p2 here '<<<<<<<<<<<<, and what the actual result is. Just a thought.
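
    If the main aim of the counter and the Sleep loop is to cap how many comparisons run at once, ParallelOptions.MaxDegreeOfParallelism may be a simpler way to get the same effect. The sketch below is an alternative under that assumption, not a rewrite of the code above: RunThrottled is a hypothetical helper, and Thread.Sleep stands in for the image comparison. It returns the highest number of iterations it saw running at the same time.

```vbnet
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module ThrottleDemo
    ' Runs "count" iterations with at most maxWorkers at once and
    ' returns the highest concurrency actually observed.
    Function RunThrottled(count As Integer, maxWorkers As Integer) As Integer
        Dim options As New ParallelOptions With {.MaxDegreeOfParallelism = maxWorkers}
        Dim running = 0
        Dim peak = 0
        Dim gate As New Object

        Parallel.For(0, count, options,
            Sub(i As Integer)
                Dim current = Interlocked.Increment(running)
                SyncLock gate
                    If current > peak Then peak = current
                End SyncLock
                Thread.Sleep(20)                 ' stand-in for comparing images
                Interlocked.Decrement(running)
            End Sub)
        Return peak
    End Function

    Sub Main()
        ' Leave one core free, but never go below one worker.
        Dim maxWorkers = Math.Max(1, Environment.ProcessorCount - 1)
        Console.WriteLine(RunThrottled(16, maxWorkers) <= maxWorkers)   ' True
    End Sub
End Module
```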
