-
image processing
I'm writing an app to process images. Some of the images I will be processing are over 16 MB in size, so the processing takes quite a while. The majority of my processing code is done in four for loops (1, 2, 3 and 4 below).
At the moment I'm trying to think of ways to improve the performance of the code. (Depending on the colours in an image, it can take over 30 minutes.)
One thing I thought I could do is make each of the four for loops a thread (because they don't need access to any shared resources). Would this be a good thing to do? If so, could someone give me some pointers on how to do it? I've looked at a few tutorials on free threading, but I'm still not 100% sure how to go about it.
Code:
public void ProcessSelection()
{
    //1
    for (yIndex = 0; yIndex != beginY; yIndex++)
    {
        for (xIndex = 0; xIndex != xSize; xIndex++)
        {
            testCol = myBitmap.GetPixel(xIndex, yIndex);
            //fullRGBString = testCol.R.ToString() + testCol.G.ToString() + testCol.B.ToString();
            fullRGBString = testCol.ToArgb();
            // try setting searched bits to white automatically
            myBitmap.SetPixel(xIndex, yIndex, Color.White);
            if (backgroundValues.Contains(fullRGBString))
            {
                // if already in ArrayList then do nothing
            }
            else
            {
                backgroundValues.Add(fullRGBString);
            }
        }
    }
    Console.WriteLine("1 done");
    //2
    for (yIndex = endY; yIndex != ySize; yIndex++)
    {
        for (xIndex = 0; xIndex != xSize; xIndex++)
        {
            testCol = myBitmap.GetPixel(xIndex, yIndex);
            //fullRGBString = testCol.R.ToString() + testCol.G.ToString() + testCol.B.ToString();
            fullRGBString = testCol.ToArgb();
            // try setting searched bits to white automatically
            myBitmap.SetPixel(xIndex, yIndex, Color.White);
            if (backgroundValues.Contains(fullRGBString))
            {
                // if already in ArrayList then do nothing
            }
            else
            {
                backgroundValues.Add(fullRGBString);
            }
        }
    }
    Console.WriteLine("2 done");
    //3
    for (yIndex = beginY; yIndex != endY; yIndex++)
    {
        for (xIndex = 0; xIndex != beginX; xIndex++)
        {
            testCol = myBitmap.GetPixel(xIndex, yIndex);
            //fullRGBString = testCol.R.ToString() + testCol.G.ToString() + testCol.B.ToString();
            fullRGBString = testCol.ToArgb();
            // try setting searched bits to white automatically
            myBitmap.SetPixel(xIndex, yIndex, Color.White);
            if (backgroundValues.Contains(fullRGBString))
            {
                // if already in ArrayList then do nothing
            }
            else
            {
                backgroundValues.Add(fullRGBString);
            }
        }
    }
    Console.WriteLine("3 done");
    //4
    for (yIndex = beginY; yIndex != endY; yIndex++)
    {
        for (xIndex = endX; xIndex != xSize; xIndex++)
        {
            testCol = myBitmap.GetPixel(xIndex, yIndex);
            //fullRGBString = testCol.R.ToString() + testCol.G.ToString() + testCol.B.ToString();
            fullRGBString = testCol.ToArgb();
            // try setting searched bits to white automatically
            myBitmap.SetPixel(xIndex, yIndex, Color.White);
            if (backgroundValues.Contains(fullRGBString))
            {
                // if already in ArrayList then do nothing
            }
            else
            {
                backgroundValues.Add(fullRGBString);
            }
        }
    }
    Console.WriteLine("4 done");
    // selected area: whiten every pixel whose colour was seen outside the selection
    for (xIndex = beginX; xIndex != endX; xIndex++)
    {
        for (yIndex = beginY; yIndex != endY; yIndex++)
        {
            testCol = myBitmap.GetPixel(xIndex, yIndex);
            //fullRGBString = testCol.R.ToString() + testCol.G.ToString() + testCol.B.ToString();
            fullRGBString = testCol.ToArgb();
            if (fullRGBString != Color.White.ToArgb())
            {
                if (backgroundValues.Contains(fullRGBString))
                {
                    myBitmap.SetPixel(xIndex, yIndex, Color.White);
                }
            }
        }
    }
    Console.WriteLine("Done Selected");
}
-
The bitmap itself is a shared resource. Besides, unless you're on a multiprocessor system, threads gain you next to nothing and may even cost you.
What is this doing anyway? You're basically collecting all existing colors in an image and setting it completely to white. For some reason, you're doing this for one of three parts, then the last, then the second.
After the third loop, the whole image is white. The fourth and fifth loop accomplish nothing.
backgroundValues seems to be an ArrayList (from the comments). This is BAD! I'm willing to bet that this accounts for 70% of your loop runtime, likely more, especially when there are many colors. Make it some sort of Set, preferably one that accepts only unique values but doesn't throw an exception on a duplicate, so that you don't need the Contains check.
Hmmm, a check with the reference confirms my fear that .Net doesn't provide such a thing. What you need is the equivalent of the Java class java.util.TreeSet. Maybe you can find such a thing.
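For readers on a later framework than the one discussed here: System.Collections.Generic.HashSet<T> (added in .NET 3.5) behaves the way the post asks for, although it is hash-based rather than a tree like java.util.TreeSet. Its Add method returns false on a duplicate instead of throwing, so no Contains check is needed while collecting. The sketch below is only an illustration under that assumption; the class and method names are made up for the example.

```csharp
using System.Collections.Generic;

public static class ColourSet
{
    // Collects the distinct ARGB values from a sequence of pixel values.
    // HashSet<int>.Add ignores duplicates (it returns false and leaves the
    // set unchanged), so no separate Contains call is required.
    public static HashSet<int> CollectUnique(IEnumerable<int> argbValues)
    {
        HashSet<int> seen = new HashSet<int>();
        foreach (int argb in argbValues)
        {
            seen.Add(argb);
        }
        return seen;
    }
}
```

Calling ColourSet.CollectUnique on the values 1, 2, 2, 3, 1 gives a set holding three entries.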
-
Cheers for the reply.
The company I work for does a lot of work with product images. Images are taken and then the background is stripped from them using Photoshop. This is a manual and costly process. I'm trying to write an app that automatically removes the background from an image. All the user has to do is make a simple selection of the product within the image with a selection rectangle.
This is what the four for loops do:
1 analyses the top strip of the image, from (0,0) to (end of image, start of selected area)
2 analyses the bottom strip of the image, from (0, end of y selection) to (end of image, end of image)
3 and 4 then analyse the areas to the left and right of the selected area.
Once these have finished I am left with a big ArrayList, which I then use to process the selected area. I check every pixel within the selected area and if it matches one in the 'background colour' ArrayList I turn it white.
Using this process I can erase nearly 100% of the background from an image.
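The process described above can be sketched roughly as follows. This is not the poster's code: it assumes the pixels have already been copied into a flat, row-major int[] of ARGB values (a stand-in for the bitmap), and it swaps the ArrayList for a HashSet<int>, which needs .NET 3.5 or later. BackgroundRemover and Process are hypothetical names; beginX, beginY, endX and endY mirror the selection variables in the original loops, with the end values exclusive.

```csharp
using System.Collections.Generic;

public static class BackgroundRemover
{
    // Same value as Color.White.ToArgb(): opaque white, 0xFFFFFFFF.
    private const int White = -1;

    // pixels: row-major ARGB values, length width * height (assumed layout).
    public static void Process(int[] pixels, int width, int height,
                               int beginX, int beginY, int endX, int endY)
    {
        HashSet<int> background = new HashSet<int>();

        // Pass 1: every pixel outside the selection is treated as background.
        // Record its colour, then blank it. This covers the four strips
        // (top, bottom, left, right) in a single scan.
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                bool inside = x >= beginX && x < endX && y >= beginY && y < endY;
                if (inside) continue;
                int i = y * width + x;
                background.Add(pixels[i]);
                pixels[i] = White;
            }
        }

        // Pass 2: inside the selection, blank any pixel whose colour was
        // also seen outside it.
        for (int y = beginY; y < endY; y++)
        {
            for (int x = beginX; x < endX; x++)
            {
                int i = y * width + x;
                if (background.Contains(pixels[i]))
                {
                    pixels[i] = White;
                }
            }
        }
    }
}
```

Because every colour is read before its pixel is overwritten, blanking the outside area during the scan does not lose any information needed for the second pass.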
I had thought of not using an ArrayList, as I know it will be less efficient than an array, but there are two reasons why I use it:
1) I don't know how big the list of background colours will be.
2) I really need to have something like the 'Contains' method, because after I analyse the top, bottom, left and right areas I then need to compare every pixel of the selected area against the list.
If there is a collection that only accepts unique values, that would be useful.
Bearing in mind what I've told you, can you think of any suggestions to improve the performance?
Thanks in advance.
-
ArrayList or Array, it doesn't matter, both of them are equally unsuited for the job.
I'm currently doing algorithms and datastructures at university and I think you could greatly benefit from a bit of reading there.
The thing is, the lookup in a sequential container (such as ArrayList and an array) takes at worst as many steps as there are elements, formally written O(n). This is bad, very bad as n (the number of elements) gets large.
What you need is a better container, and one such is a binary search tree as implemented by the Java class java.util.TreeSet or the C++ class std::set. Such a class has a worst-case lookup speed of O(log n).
Just as an example, with 100 colors in your list, when you come to the 101st, you get worst-case lookup (looking for something nonexistent is always worst case). In an ArrayList, you need 100 comparisons. In a balanced binary tree, you need 7.
A 24bpp image can (in theory) contain 16,777,216 distinct colors. With 16,777,215 of them already stored, a worst-case lookup is 16,777,215 checks for a sequence and 24 checks for a balanced tree. I take it you get the idea ;)
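The figures above can be reproduced with a few lines of arithmetic. This is a sketch that assumes a perfectly balanced tree, where the worst case is one comparison per level, i.e. floor(log2 n) + 1 for n stored values; real self-balancing trees can be somewhat deeper. LookupCost and its methods are names invented for the example.

```csharp
public static class LookupCost
{
    // Worst case for a sequential container: every element is compared.
    public static int SequentialWorstCase(int n)
    {
        return n;
    }

    // Worst case for a perfectly balanced binary search tree with n nodes:
    // the number of levels, which equals the number of bits needed to
    // write n in binary (floor(log2 n) + 1 for n > 0).
    public static int BalancedTreeWorstCase(int n)
    {
        int levels = 0;
        while (n > 0)
        {
            levels++;
            n >>= 1;
        }
        return levels;
    }
}
```

For 100 stored colours this gives 100 against 7, and for 16,777,215 stored colours it gives 16,777,215 against 24.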
About your loops, this line that occurs in every loop makes them quite useless because they erase the image while they analyze it.
myBitmap.SetPixel(xIndex, yIndex, Color.White );
-
My 2 cents...
Recursive methods are widely used in image processing. I'm not sure how this would fit your code, but it should speed it up and shorten it considerably.
-
In this case recursion is not applicable except by brute force, in which case it would slow the app down a lot and likely crash it for lack of stack space.
-
Thanks for your replies.
I'm not sure whether there is a tree-set style thing in C# like the one you recommend, but I'll have a look.
At the moment I'm thinking the realtime part of the app will just be the user selecting an image and selecting the product within the image. I can then write the relevant x,y info to a database, and have a separate app that intermittently processes the information from the database.
That way I can give the user the realtime workflow, and have the hardcore processing done at some other point.
Despite all this, the app does work quite well on smaller images: a 500x500 PNG takes between 5-10 seconds to be processed. It's the 2365x2365 TIFFs that take ages (10-39 mins).
"About your loops, this line that occurs in every loop makes them quite useless because they erase the image while they analyze it.
myBitmap.SetPixel(xIndex, yIndex, Color.White );"
That's not the case: every iteration of the loop reads the ARGB value into an int and then sets the pixel to white.
This only happens for those pixels outside of the product selection area.
The code does work, believe me.
-
I won't believe it 'til I see the final output. :p
About the tree. I know there is no such thing in the standard framework. I can't find a proper one with Google either. It's crazy!
You can use SortedList for now; it should be considerably faster. But I think I'll write a balanced binary search tree in C#.
-
I'm now looking into improving performance by using unsafe code.
I can lock the bitmap in memory and start fiddling with the bytes that make up the pixels.
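A minimal sketch of locking the bitmap and working on its raw pixel data is below. It is not the poster's code, and it swaps in Marshal.Copy for unsafe pointers so that it compiles without the /unsafe switch; the idea of locking the bits once instead of calling GetPixel and SetPixel per pixel is the same. It assumes the bitmap can be locked as 32bpp ARGB with no row padding. BitmapBuffer, ReadPixels and WritePixels are names invented for the example.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class BitmapBuffer
{
    // Copies the whole bitmap into a row-major int[] of ARGB values.
    public static int[] ReadPixels(Bitmap bmp)
    {
        Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadOnly,
                                       PixelFormat.Format32bppArgb);
        try
        {
            // Assumption: 4 bytes per pixel and no padding at the row ends.
            if (data.Stride != bmp.Width * 4)
                throw new NotSupportedException("Unexpected stride.");
            int[] pixels = new int[bmp.Width * bmp.Height];
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
            return pixels;
        }
        finally
        {
            bmp.UnlockBits(data);
        }
    }

    // Writes a row-major int[] of ARGB values back into the bitmap.
    public static void WritePixels(Bitmap bmp, int[] pixels)
    {
        if (pixels.Length != bmp.Width * bmp.Height)
            throw new ArgumentException("Pixel count does not match bitmap size.");
        Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.WriteOnly,
                                       PixelFormat.Format32bppArgb);
        try
        {
            if (data.Stride != bmp.Width * 4)
                throw new NotSupportedException("Unexpected stride.");
            Marshal.Copy(pixels, 0, data.Scan0, pixels.Length);
        }
        finally
        {
            bmp.UnlockBits(data);
        }
    }
}
```

Each int in the array has the same value that GetPixel(x, y).ToArgb() would return for the pixel at index y * Width + x, so colour comparisons can stay as they are.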
-
That's just a tiny part of the problem though.
There's a rule programmers have. Whatever you THINK your program spends time on is likely wrong. Use a profiler to find out where the program spends time.
This means that in several places in your loop you should measure tick counts and see where you waste all the time.
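As a rough illustration of measuring where the time goes, the helper below times one section of work. It assumes System.Diagnostics.Stopwatch (available from .NET 2.0) rather than raw tick counts, and SectionTimer and Measure are names invented for the example. A real profiler gives a more complete picture.

```csharp
using System;
using System.Diagnostics;

public static class SectionTimer
{
    // Runs the given work once, prints how long it took under the given
    // label, and returns the elapsed time in milliseconds.
    public static long Measure(string label, Action work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        Console.WriteLine(label + ": " + watch.ElapsedMilliseconds + " ms");
        return watch.ElapsedMilliseconds;
    }
}
```

Wrapping the pixel reads, the Contains calls and the pixel writes in separate Measure calls would show which of them dominates the runtime.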
-
Attached is a C# source file containing a binary tree and two classes that use it. TreeDictionary is an IDictionary implementation based on the tree, while TreeSet implements my ISet interface and is what you should try. ISet derives from ICollection and provides three additional methods:
Add(object): adds a value to the set.
Remove(object): removes an object from the set.
Contains(object): checks if an object is in the set.
The point is that it is much much faster than an ArrayList for the Contains call.
-
I saw an article that said that by using unsafe code in image processing you could save more than 30% of the processing time. But CornedBee is right: what is slowing things down most is the ArrayList, not the image processing.
-
Using unsafe code got it down from 30 minutes to 15.
I used a Hashtable instead of my ArrayList and that reduced the image processing time to between 30-40 seconds.
There is some chaff in my algorithm (like calling the same method twice in a loop, when I only need to call it once and hold the return value in a variable) that should shave off a few more seconds once I remove it.
But 30 seconds instead of 30 minutes is a reduction I'm more than happy with.
Cheers for all the help, I really appreciate it.
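For anyone finding this thread later, the Hashtable swap mentioned above might look something like the sketch below. The poster's actual code was not shown, so this is only a guess at the shape: the ARGB value is used as the key and the stored value is a dummy. BackgroundTable, Build and IsBackground are hypothetical names.

```csharp
using System.Collections;

public static class BackgroundTable
{
    // Builds a Hashtable whose keys are the distinct ARGB values.
    // Assigning through the indexer overwrites an existing key instead of
    // throwing, so duplicates need no Contains check.
    public static Hashtable Build(int[] argbValues)
    {
        Hashtable table = new Hashtable();
        foreach (int argb in argbValues)
        {
            table[argb] = true; // dummy value, only the key matters
        }
        return table;
    }

    // Hash lookup: roughly constant time on average, however many colours
    // are stored, unlike ArrayList.Contains which scans the whole list.
    public static bool IsBackground(Hashtable table, int argb)
    {
        return table.ContainsKey(argb);
    }
}
```

This average constant-time lookup is consistent with the drop from minutes to seconds reported above, since the per-pixel Contains call no longer scans every stored colour.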