-
Jun 24th, 2022, 04:07 PM
#1
[RESOLVED] Optical Character Recognition
I am looking for both commercial and open source options for optical character recognition (OCR).
My use case: I have two Cognex cameras. One takes pictures of barcodes and Data Matrix codes (2D codes, similar to QR codes), whereas the other takes pictures of the printed information that those barcodes were generated from. The idea is to have a file system watcher watch two directories for incoming images: one directory for the barcode images and the other for the images of the source text. After the images are read and parsed, the application will compare the two results to verify that each barcode was generated accurately from its text.
I already implemented a barcode/Data Matrix scanner using Cognex's tooling to read and parse the incoming images. Now I need a way to take the OCR image, pass it through an OCR library, and get the results.
So far I have looked at Aspose, but their commercial license is $799 for 1 developer and 1 location (too much for too little). I really don't know of any other libraries because this is outside my wheelhouse, which is why I figured I would ask here.
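For anyone picturing the watch-and-compare flow described above, here is a minimal sketch, not the actual implementation. The folder paths, the module name, and the TextsMatch/StripWhitespace helpers are all hypothetical, and the comparison rule (ignore whitespace and letter case) is only an assumption about what "accurate" would mean here.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module WatcherSketch
    ' Hypothetical folders; the real paths are not given in this thread.
    Private Const BarcodeDir As String = "C:\images\barcodes"
    Private Const OcrDir As String = "C:\images\ocr"

    ' Assumed comparison rule: equal once whitespace and letter case are ignored.
    Public Function TextsMatch(barcodeText As String, ocrText As String) As Boolean
        Return String.Equals(StripWhitespace(barcodeText),
                             StripWhitespace(ocrText),
                             StringComparison.OrdinalIgnoreCase)
    End Function

    ' Removes every whitespace character; Nothing is treated as an empty string.
    Private Function StripWhitespace(value As String) As String
        If value Is Nothing Then Return String.Empty
        Return String.Concat(value.Where(Function(c) Not Char.IsWhiteSpace(c)))
    End Function

    Sub Main()
        ' FileSystemWatcher throws if a watched directory does not exist.
        Using barcodeWatcher As New FileSystemWatcher(BarcodeDir),
              ocrWatcher As New FileSystemWatcher(OcrDir)
            ' The handlers only log the new file; decoding and OCR would be called from here.
            AddHandler barcodeWatcher.Created, Sub(s, e) Console.WriteLine("Barcode image: {0}", e.FullPath)
            AddHandler ocrWatcher.Created, Sub(s, e) Console.WriteLine("OCR image: {0}", e.FullPath)
            barcodeWatcher.EnableRaisingEvents = True
            ocrWatcher.EnableRaisingEvents = True
            Console.ReadLine() ' keep the process alive while watching
        End Using
    End Sub
End Module
```

One thing to keep in mind: the Created event can fire before the camera software has finished writing the file, so a real handler would likely need to wait or retry before opening the image.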
-
Jun 24th, 2022, 06:14 PM
#2
Re: Optical Character Recognition
This might be too specific a question to get a good answer on such a small forum. You'd have to hope that someone among the limited membership here has experience with this.
In case it doesn't pan out, I'd ask this on StackOverflow, and I would not limit it to just VB.Net. I would try to reach as wide an audience as possible. If someone gives you a good answer for, say, C++, I'd then look into how I could integrate that into my .Net solution, whether by looking for .Net bindings of whatever gets suggested or by making the bindings myself.
-
Jun 27th, 2022, 09:19 PM
#3
Re: Optical Character Recognition
I found a .NET library named Tesseract .NET Wrapper (GitHub repo), which uses the eng.traineddata language file from Tesseract OCR (GitHub repo).
Unfortunately, I had to downgrade to .NET 4.0, but that wasn't too difficult. I whipped up a sample console app using the following:
Code:
Imports Tesseract

Module Module1
    Sub Main()
        Dim testImage = IO.Path.Combine("C:\-file location truncated-\sample1.jpg")
        Try
            Using engine = New TesseractEngine(Environment.CurrentDirectory, "eng", EngineMode.Default)
                Using img = Pix.LoadFromFile(testImage)
                    Using Page = engine.Process(img)
                        Dim text = Page.GetText()
                        Console.WriteLine("Confidence: {0}", Page.GetMeanConfidence())
                        Console.WriteLine("Text: {0}", text)
                    End Using
                End Using
            End Using
        Catch ex As Exception
            Trace.TraceError(ex.ToString())
            Console.WriteLine("Unexpected Error: {0}", ex.Message)
            Console.WriteLine("Details: ")
            Console.WriteLine(ex.ToString())
        End Try
        Console.ReadLine()
    End Sub
End Module
On the first pass, it worked as expected. However, I noticed that the image file was rotated, so instead of reading left-to-right the text read bottom-to-top. This gave me a confidence score of about 29 and some random characters it tried to parse. So I edited the file in the Windows Photos app and rotated the image so that the text was right-side-up. The issue is that it now always returns a confidence level of 0 and no text, even if I use the Photos app to rotate the image back to what it was.
What is odd is that after I deleted the image and redownloaded it, the application worked again. So I'm not sure what's going on behind the scenes that makes the image unparsable after editing.
Edit - I did not need to downgrade. That was from a separate library I tried (and failed) to use. The issue with images unable to be parsed after editing persisted, so what I did was the following:
- Loop four times (iterations 0 to 3)
- I would load the image in memory
- I would rotate the image based on the current iteration; e.g. iteration 0 = 0 degrees, iteration 1 = 90 degrees, etc.
- I would use an ImageConverter to convert the image to a byte array
- I then used Pix.LoadFromMemory to load the rotated image
And this seemed to work for me. Here is an example:
Code:
Public Sub Parse(path As String, languageDirectory As String, languageName As String)
    If (Not IO.File.Exists(path)) Then
        ' The argument is not null in this case; the file itself is missing.
        Throw New IO.FileNotFoundException("Image file not found.", path)
    End If

    ' OcrImage is a project-specific type that is not shown in this post.
    Dim ocr = New OcrImage() With {.Filename = path}

    Using btmImg = Drawing.Bitmap.FromFile(path)
        Dim converter As New Drawing.ImageConverter
        ' Create the engine once and reuse it for all four orientations.
        Using engine = New TesseractEngine(languageDirectory, languageName, EngineMode.Default)
            For angle As Integer = 0 To 3
                ' RotateFlip is cumulative, so turn a further 90 degrees on each pass
                ' after the first. This keeps the labels below in step with the image.
                If (angle > 0) Then
                    btmImg.RotateFlip(Drawing.RotateFlipType.Rotate90FlipNone)
                End If
                Dim bytes = DirectCast(converter.ConvertTo(btmImg, GetType(Byte())), Byte())
                Using img = Pix.LoadFromMemory(bytes)
                    Using Page = engine.Process(img)
                        Dim text = Page.GetText()
                        Dim angleText = "{0} degrees"
                        If (angle = 0) Then
                            angleText = String.Format(angleText, "0")
                        ElseIf (angle = 1) Then
                            angleText = String.Format(angleText, "90")
                        ElseIf (angle = 2) Then
                            angleText = String.Format(angleText, "180")
                        Else
                            angleText = String.Format(angleText, "270")
                        End If
                        Console.WriteLine("{0} Confidence: {1}", angleText, Page.GetMeanConfidence())
                        Console.WriteLine("{0} Text: {1}", angleText, text)
                    End Using
                End Using
            Next
        End Using
    End Using
End Sub
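The loop above prints a result for each orientation but does not choose between them. One possible next step, sketched here purely as an assumption, is to collect the four results and keep the one with the highest mean confidence. OcrAttempt and PickBest are hypothetical names (not part of the Tesseract wrapper), and the values in Main are made up for illustration, not results from my images.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module BestOrientationSketch
    ' Hypothetical container for one OCR pass; not a type from the Tesseract wrapper.
    Public Class OcrAttempt
        Public Property Degrees As Integer
        Public Property Confidence As Single
        Public Property Text As String
    End Class

    ' Returns the attempt with the highest confidence, or Nothing for an empty sequence.
    Public Function PickBest(attempts As IEnumerable(Of OcrAttempt)) As OcrAttempt
        Return attempts.OrderByDescending(Function(a) a.Confidence).FirstOrDefault()
    End Function

    Sub Main()
        ' Illustrative values only.
        Dim attempts = New List(Of OcrAttempt) From {
            New OcrAttempt With {.Degrees = 0, .Confidence = 0.1F, .Text = "garbled"},
            New OcrAttempt With {.Degrees = 90, .Confidence = 0.9F, .Text = "readable"},
            New OcrAttempt With {.Degrees = 180, .Confidence = 0.2F, .Text = "garbled"},
            New OcrAttempt With {.Degrees = 270, .Confidence = 0.0F, .Text = ""}
        }
        Dim best = PickBest(attempts)
        Console.WriteLine("Best orientation: {0} degrees", best.Degrees)
    End Sub
End Module
```

In the Parse method, each pass would add an OcrAttempt built from Page.GetMeanConfidence() and Page.GetText() instead of writing to the console.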
Last edited by dday9; Jun 27th, 2022 at 09:57 PM.
-
Jun 28th, 2022, 04:23 AM
#4
Re: [RESOLVED] Optical Character Recognition
The library you used (https://github.com/charlesw/tesseract/) supports .NET Framework 4.0 through .NET 6, according to its NuGet package page:
https://www.nuget.org/packages/Tesse...works-body-tab
This allows it to support everything from older Windows versions (Windows XP supports .NET Framework 4.0) to the latest ones, plus cross-platform use with .NET 6 (the current version).
-
Jun 28th, 2022, 06:41 AM
#5
Re: [RESOLVED] Optical Character Recognition
You can "optimize" this part of your code:
VB.NET Code:
Dim angleText = "{0} degrees"
If (angle = 0) Then
    angleText = String.Format(angleText, "0")
ElseIf (angle = 1) Then
    angleText = String.Format(angleText, "90")
ElseIf (angle = 2) Then
    angleText = String.Format(angleText, "180")
Else
    angleText = String.Format(angleText, "270")
End If
to
VB.NET Code:
Dim angleText = $"{angle * 90} degrees"
It's not a big win, but it makes the code shorter and more readable.
-
Jun 28th, 2022, 06:55 AM
#6
Re: [RESOLVED] Optical Character Recognition
Now a more substantial optimization that may avoid some future problems:
Loading an image from a file (Bitmap.FromFile()) actually locks the file, which may become an issue if you run multiple threads or processes and more than one of them tries to load the image from the same file. And because OCR is a slow operation, the lock persists for the whole processing time.
The solution is quite simple: load the file into a MemoryStream, close the file, then load the image from that memory stream instead of from the file. This leaves the file accessible to other processes.
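A minimal sketch of that idea, assuming System.Drawing is available (on .NET 6 that means the Windows-only System.Drawing.Common package). LoadBitmapUnlocked and the module name are hypothetical. The copy into a new Bitmap is there because GDI+ expects the source stream to stay open for as long as the image created from it is in use.

```vbnet
Imports System
Imports System.Drawing
Imports System.IO

Module UnlockedLoadSketch
    ' Loads a bitmap without keeping a handle on the source file.
    Public Function LoadBitmapUnlocked(path As String) As Bitmap
        ' ReadAllBytes opens the file, reads it, and closes it again.
        Dim bytes = File.ReadAllBytes(path)
        Using ms As New MemoryStream(bytes)
            Using temp = Image.FromStream(ms)
                ' Copy the pixels so the returned bitmap does not depend on the stream.
                Return New Bitmap(temp)
            End Using
        End Using
    End Function
End Module
```

If no System.Drawing rotation is needed, the byte array from File.ReadAllBytes could presumably be passed straight to Pix.LoadFromMemory, as in the code earlier in this thread, which skips the Bitmap entirely.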
-
Jun 29th, 2022, 10:04 AM
#7
Re: [RESOLVED] Optical Character Recognition
Originally Posted by peterst
Yeah, in my edit I indicated that I didn't actually need to downgrade my .NET framework version.
Originally Posted by peterst
You can "optimize" this part of your code:
VB.NET Code:
Dim angleText = "{0} degrees"
If (angle = 0) Then
    angleText = String.Format(angleText, "0")
ElseIf (angle = 1) Then
    angleText = String.Format(angleText, "90")
ElseIf (angle = 2) Then
    angleText = String.Format(angleText, "180")
Else
    angleText = String.Format(angleText, "270")
End If
to
VB.NET Code:
Dim angleText = $"{angle * 90} degrees"
It's not a big win, but it makes the code shorter and more readable.
Agreed. I'm not actually doing that specific part in my production code, but for the sake of providing an example on the forum I whipped that up.
Originally Posted by peterst
Now a more substantial optimization that may avoid some future problems:
Loading an image from a file (Bitmap.FromFile()) actually locks the file, which may become an issue if you run multiple threads or processes and more than one of them tries to load the image from the same file. And because OCR is a slow operation, the lock persists for the whole processing time.
The solution is quite simple: load the file into a MemoryStream, close the file, then load the image from that memory stream instead of from the file. This leaves the file accessible to other processes.
I actually want the file to be locked in this situation. I'm not going to elaborate, but take my word for it.