Phenix
Mar 14th, 2006, 10:22 AM
Given a simple font whose width is W pixels and height is H pixels, and the pixels which make up the numbers 0,1,2...9 are known. Determine the minimum amount of pixels which uniquely identifies the number in a given WxH grid. A type of optical character recognition.
Let's say I have a 5x7 grid where zero is made up of
XXX
X X
X X
X X
X X
X X
XXX
eight is:
XXX
X X
X X
XXX
X X
X X
XXX
If I was just dealing with 0 and 8, I could test if a central pixel was set and assume I had an 8. But I want to deal with all digits 0 through 9.
Once I came across a font where all I had to do was check the middle horizontal row of pixels. I would like to have a general algorithm to apply to other simple blocky fonts for numbers.
Another font I came across is 3x5. 0 and 1 are shown. X is a set pixel, b is an unset or blank pixel. For just 0 and 1, I could check the second row to uniquely identify a 0 or 1. But I would not be able to only check the first top row as they are the same bit pattern. However, I would like 0 through 9. Also for just this 0 and 1, I could test only 1 pixel (second row from top, first column).
bXb bXb
XbX bXb
XbX bXb
XbX bXb
bXb bXb
The output from the algorithm may even give asymetric unique signatures. What I mean is I may know which number I have by testing 1 pixel, then for others I may have to test more pixels. Hypothetically something like this contrived example:
If pixel at 0,0 is set{
//I have a 2
}else{
If pixel at 1,4 is set{
//May have 0,3,6,8,9
If pixel at 0,1 is set{
//May have 3 or 8
If pixel at 3,5 is not set{
//Have 5
}else{
//Have 3
}
}else{
...
Another idea to come up with such an algorithm is to visualize overlaying a WxH grid on top of another WxH grid and seeing which pixels "shine through". Perhaps using boolean math or, and, xor, etc.
There would be 2 algorithms:
1) The development algorithm which is used to find the algorithm to be used in production
2) The production algorithm
The above "Hypothetically contrived example" would be a production algorithm. The production algorithm must be fast.
The development algorithm to come up with the production algorithm can be slow and memory intensive, perhaps using the full grids of each matrix and performing tests on every pixel in the grid.
Thoughts?
Let's say I have a 5x7 grid where zero is made up of
XXX
X X
X X
X X
X X
X X
XXX
eight is:
XXX
X X
X X
XXX
X X
X X
XXX
If I was just dealing with 0 and 8, I could test if a central pixel was set and assume I had an 8. But I want to deal with all digits 0 through 9.
Once I came across a font where all I had to do was check the middle horizontal row of pixels. I would like to have a general algorithm to apply to other simple blocky fonts for numbers.
Another font I came across is 3x5. 0 and 1 are shown. X is a set pixel, b is an unset or blank pixel. For just 0 and 1, I could check the second row to uniquely identify a 0 or 1. But I would not be able to only check the first top row as they are the same bit pattern. However, I would like 0 through 9. Also for just this 0 and 1, I could test only 1 pixel (second row from top, first column).
bXb bXb
XbX bXb
XbX bXb
XbX bXb
bXb bXb
The output from the algorithm may even give asymetric unique signatures. What I mean is I may know which number I have by testing 1 pixel, then for others I may have to test more pixels. Hypothetically something like this contrived example:
If pixel at 0,0 is set{
//I have a 2
}else{
If pixel at 1,4 is set{
//May have 0,3,6,8,9
If pixel at 0,1 is set{
//May have 3 or 8
If pixel at 3,5 is not set{
//Have 5
}else{
//Have 3
}
}else{
...
Another idea to come up with such an algorithm is to visualize overlaying a WxH grid on top of another WxH grid and seeing which pixels "shine through". Perhaps using boolean math or, and, xor, etc.
There would be 2 algorithms:
1) The development algorithm which is used to find the algorithm to be used in production
2) The production algorithm
The above "Hypothetically contrived example" would be a production algorithm. The production algorithm must be fast.
The development algorithm to come up with the production algorithm can be slow and memory intensive, perhaps using the full grids of each matrix and performing tests on every pixel in the grid.
Thoughts?