PDA

Click to See Complete Forum and Search --> : OCR Find unique pixel signature (perhaps matrices)


Phenix
Mar 14th, 2006, 10:22 AM
Given a simple font whose width is W pixels and height is H pixels, and the pixels which make up the numbers 0,1,2...9 are known. Determine the minimum amount of pixels which uniquely identifies the number in a given WxH grid. A type of optical character recognition.

Let's say I have a 5x7 grid where zero is made up of
XXX
X X
X X
X X
X X
X X
XXX

eight is:
XXX
X X
X X
XXX
X X
X X
XXX

If I was just dealing with 0 and 8, I could test if a central pixel was set and assume I had an 8. But I want to deal with all digits 0 through 9.

Once I came across a font where all I had to do was check the middle horizontal row of pixels. I would like to have a general algorithm to apply to other simple blocky fonts for numbers.

Another font I came across is 3x5. 0 and 1 are shown. X is a set pixel, b is an unset or blank pixel. For just 0 and 1, I could check the second row to uniquely identify a 0 or 1. But I would not be able to only check the first top row as they are the same bit pattern. However, I would like 0 through 9. Also for just this 0 and 1, I could test only 1 pixel (second row from top, first column).
bXb bXb
XbX bXb
XbX bXb
XbX bXb
bXb bXb

The output from the algorithm may even give asymetric unique signatures. What I mean is I may know which number I have by testing 1 pixel, then for others I may have to test more pixels. Hypothetically something like this contrived example:
If pixel at 0,0 is set{
//I have a 2
}else{
If pixel at 1,4 is set{
//May have 0,3,6,8,9
If pixel at 0,1 is set{
//May have 3 or 8
If pixel at 3,5 is not set{
//Have 5
}else{
//Have 3
}
}else{
...

Another idea to come up with such an algorithm is to visualize overlaying a WxH grid on top of another WxH grid and seeing which pixels "shine through". Perhaps using boolean math or, and, xor, etc.

There would be 2 algorithms:
1) The development algorithm which is used to find the algorithm to be used in production
2) The production algorithm

The above "Hypothetically contrived example" would be a production algorithm. The production algorithm must be fast.

The development algorithm to come up with the production algorithm can be slow and memory intensive, perhaps using the full grids of each matrix and performing tests on every pixel in the grid.

Thoughts?

NotLKH
Mar 15th, 2006, 05:47 PM
I would think that you would have to check a minimum of 4 bits, err, pixels in the worst case scenario, if you can identify 4 such shared pixels that can binarily define the numbers 0 thru 9, then it should be simple and fast.

Can you map all the numbers as X and O here, as they will appear using the ultimate font that you want?

:wave:
-Lou

Phenix
Mar 26th, 2006, 04:52 PM
Sorry for my delay.

I was looking for a general solution for a given font for the ten digits 0 through 9. For a general font, there may be multiple solutions with minimum pixel tests. But here is my current "ultimate font" (or working font) whose width is 5 pixels and height is 7 pixels (X is set; O is not set):


OXXXO OOOXO OXXXO OXXXO OOOXO OXXXX OXXXO XXXXX OXXXO OXXXO
XOOOX OOXXO XOOOX XOOOX OOXXO OXOOO XOOOX OOOOX XOOOX XOOOX
XOOOX OXOXO OOOOX OOOOX OXOXO XXXXO XOOOO OOOXO XOOOX XOOOX
XOOOX OOOXO OOOXO OOXXO OXOXO XOOOX XXXXO OOOXO OXXXO OXXXX
XOOOX OOOXO OOXOO OOOOX XOOXO OOOOX XOOOX OOXOO XOOOX OOOOX
XOOOX OOOXO OXOOO XOOOX XXXXX XOOOX XOOOX OOXOO XOOOX XOOOX
OXXXO OOOXO XXXXX OXXXO OOOXO OXXXO OXXXO OOXOO OXXXO OXXXO


Ahh, this looks a little better. I used the Fixedsys font (for the above X and O) and colored the X and O.