
Thread: [RESOLVED] Optical Character Recognition

  1. #1

    Thread Starter
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,715

    [RESOLVED] Optical Character Recognition

    I am looking for both commercial and open source options regarding optical character recognition (aka OCR).

    The use case is that I have two Cognex cameras: one takes pictures of barcodes and Data Matrix codes (2D codes similar to QR codes), whereas the other takes pictures of the printed information those barcodes were generated from. The idea is to have a file system watcher monitor two directories for incoming images: one for the barcode images and one for the images of the source text. After the images are read and parsed, the application will check whether each barcode accurately encodes the text read by OCR.
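
    A minimal sketch of the watching and comparing steps described above, not a definitive implementation. The names WatcherSketch, CreateWatcher and TextsMatch are made up for illustration, and the comparison rule (ignore case and extra whitespace) is an assumption about what "accurate" should mean here:

```vbnet
Imports System
Imports System.IO

Module WatcherSketch

    ' Hypothetical helper: collapse runs of whitespace and ignore case, on the
    ' assumption that OCR output may differ from the encoded value in spacing only.
    Function TextsMatch(barcodeText As String, ocrText As String) As Boolean
        Dim noSeparators As Char() = {}   ' an empty separator list splits on whitespace
        Dim a = String.Join(" ", barcodeText.Split(noSeparators, StringSplitOptions.RemoveEmptyEntries))
        Dim b = String.Join(" ", ocrText.Split(noSeparators, StringSplitOptions.RemoveEmptyEntries))
        Return String.Equals(a, b, StringComparison.OrdinalIgnoreCase)
    End Function

    ' Hypothetical helper: watch one folder and hand each new file's path to onImage.
    ' Created can fire before the camera has finished writing the file, so the
    ' callback may need to retry opening it.
    Function CreateWatcher(folder As String, onImage As Action(Of String)) As FileSystemWatcher
        Dim watcher As New FileSystemWatcher(folder) With {
            .NotifyFilter = NotifyFilters.FileName
        }
        AddHandler watcher.Created, Sub(sender As Object, e As FileSystemEventArgs) onImage(e.FullPath)
        watcher.EnableRaisingEvents = True
        Return watcher
    End Function

End Module
```

    One watcher would be created per directory, each with its own callback (barcode reader in one, OCR in the other), and both watchers need to stay referenced for as long as the application runs. How a barcode image gets paired with its OCR image is left out, since that depends on how the cameras name their files.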

    I already implemented a barcode/Data Matrix scanner using Cognex's library to read and parse the incoming images. Now I need a way to take the OCR image, pass it through an OCR library, and get the results.

    So far I have looked at Aspose, but their commercial license is $799 for 1 developer and 1 location (too much for too little). I really don't know of any other libraries because this is sort of outside my wheelhouse, which is why I figured I would ask here.
    "Code is like humor. When you have to explain it, it is bad." - Cory House
    VbLessons | Code Tags | Sword of Fury - Jameram

  2. #2
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: Optical Character Recognition

    This might be too specific a question to get a good answer on such a small forum. You'd have to hope that someone among the limited membership here has experience with this.

    In case it doesn't pan out, I'd ask this on StackOverflow, and I would not limit it to just VB.Net. I would try to reach as wide an audience as possible. Let's say someone gives you a good answer for C++: I'd then look into how I could integrate that into my .Net solution, whether by finding .Net bindings for whatever gets suggested or by making the bindings myself.

  3. #3

    Thread Starter
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,715

    Re: Optical Character Recognition

    I found a .NET library named Tesseract .NET Wrapper (GitHub repo), which I used with the eng.traineddata language file from Tesseract OCR (GitHub repo).

    Unfortunately, I had to downgrade to .NET 4.0, but that wasn't too difficult. I whipped up a sample console app using the following:
    Code:
    Imports Tesseract
    
    Module Module1
    
        Sub Main()
            Dim testImage = IO.Path.Combine("C:\-file location truncated-\sample1.jpg")
            Try
                Using engine = New TesseractEngine(Environment.CurrentDirectory, "eng", EngineMode.Default)
                    Using img = Pix.LoadFromFile(testImage)
                        Using Page = engine.Process(img)
                            Dim text = Page.GetText()
                            Console.WriteLine("Confidence: {0}", Page.GetMeanConfidence())
                            Console.WriteLine("Text: {0}", text)
                        End Using
                    End Using
                End Using
            Catch ex As Exception
                Trace.TraceError(ex.ToString())
                Console.WriteLine("Unexpected Error: {0}", ex.Message)
                Console.WriteLine("Details: ")
                Console.WriteLine(ex.ToString())
            End Try
            Console.ReadLine()
        End Sub
    
    End Module
    On the first pass, it worked as expected. However, I noticed that the image file was rotated so that the text read bottom-to-top instead of left-to-right. This gave me a confidence score of around 29 and some random characters it tried to parse. So I edited the file using the Windows Photos app and rotated the image so that the text was right-side-up. The issue is that now it always returns a confidence level of 0 and no text, even if I use the Photos app to rotate it back to what it was.

    What is odd is that after I deleted the image and redownloaded, the application worked again. So I'm not sure what's going on behind the scenes that causes it to be unparsable after editing.

    Edit - I did not need to downgrade. That was from a separate library I tried (and failed) to use. The issue with images unable to be parsed after editing persisted, so what I did was the following:
    1. Loop from 0 to 3
    2. Load the image in memory
    3. Rotate the image based on the current iteration; e.g. iteration 0 = 0 degrees, iteration 1 = 90 degrees, etc.
    4. Use an ImageConverter to convert the image to a byte array
    5. Use Pix.LoadFromMemory to load the rotated image


    And this seemed to work for me. Here is an example:
    Code:
    Public Sub Parse(path As String, languageDirectory As String, languageName As String)
        If (Not IO.File.Exists(path)) Then
            ' The argument itself is not null here; the file it points to is missing.
            Throw New IO.FileNotFoundException("Image file not found.", path)
        End If
    
        Dim rotations = {
            Drawing.RotateFlipType.RotateNoneFlipNone,
            Drawing.RotateFlipType.Rotate90FlipNone,
            Drawing.RotateFlipType.Rotate180FlipNone,
            Drawing.RotateFlipType.Rotate270FlipNone
        }
        Using btmImg = Drawing.Bitmap.FromFile(path)
            Dim converter As New Drawing.ImageConverter
    
            ' Create the engine once and reuse it for all four passes.
            Using engine = New TesseractEngine(languageDirectory, languageName, EngineMode.Default)
                For angle As Integer = 0 To 3
                    ' RotateFlip changes the image in place, so rotate a fresh copy each time.
                    ' Rotating btmImg itself would accumulate to 0, 90, 270 and 180 degrees.
                    Using rotated = DirectCast(btmImg.Clone(), Drawing.Image)
                        rotated.RotateFlip(rotations(angle))
                        Dim bytes = DirectCast(converter.ConvertTo(rotated, GetType(Byte())), Byte())
                        Using img = Pix.LoadFromMemory(bytes)
                            Using Page = engine.Process(img)
                                Dim text = Page.GetText()
                                Dim angleText = "{0} degrees"
    
                                If (angle = 0) Then
                                    angleText = String.Format(angleText, "0")
                                ElseIf (angle = 1) Then
                                    angleText = String.Format(angleText, "90")
                                ElseIf (angle = 2) Then
                                    angleText = String.Format(angleText, "180")
                                Else
                                    angleText = String.Format(angleText, "270")
                                End If
    
                                Console.WriteLine("{0} Confidence: {1}", angleText, Page.GetMeanConfidence())
                                Console.WriteLine("{0} Text: {1}", angleText, text)
                            End Using
                        End Using
                    End Using
                Next
            End Using
        End Using
    End Sub
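
    The loop above only prints each result. If the goal is to keep whichever orientation reads best, one option is to collect the four results and pick the one with the highest mean confidence. This is a sketch under that assumption; OcrAttempt and PickBest are hypothetical names and not part of the Tesseract wrapper:

```vbnet
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical container for the result of one pass of the rotation loop.
Public Class OcrAttempt
    Public Property Degrees As Integer
    Public Property Confidence As Single
    Public Property Text As String
End Class

Public Module OcrAttempts

    ' Returns the attempt with the highest confidence, or Nothing for an empty list.
    Public Function PickBest(attempts As IEnumerable(Of OcrAttempt)) As OcrAttempt
        Return attempts.OrderByDescending(Function(a) a.Confidence).FirstOrDefault()
    End Function

End Module
```

    Inside the loop, each pass would add something like New OcrAttempt With {.Degrees = angle * 90, .Confidence = Page.GetMeanConfidence(), .Text = Page.GetText()} to a list, and PickBest would be called once the loop finishes.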
    Last edited by dday9; Jun 27th, 2022 at 09:57 PM.

  4. #4
    Fanatic Member
    Join Date
    Jun 2019
    Posts
    557

    Re: [RESOLVED] Optical Character Recognition

    The library you used (https://github.com/charlesw/tesseract/) supports everything from .NET Framework 4.0 to .NET 6, according to the NuGet package page:
    https://www.nuget.org/packages/Tesse...works-body-tab

    This allows it to support anything from older Windows versions (XP supports .NET Framework 4.0) to the latest ones, plus cross-platform use with .NET 6 (the current version).

  5. #5
    Fanatic Member
    Join Date
    Jun 2019
    Posts
    557

    Re: [RESOLVED] Optical Character Recognition

    You can "optimize" this part of your code:

    VB.NET Code:
    Dim angleText = "{0} degrees"
    If (angle = 0) Then
        angleText = String.Format(angleText, "0")
    ElseIf (angle = 1) Then
        angleText = String.Format(angleText, "90")
    ElseIf (angle = 2) Then
        angleText = String.Format(angleText, "180")
    Else
        angleText = String.Format(angleText, "270")
    End If
    to
    VB.NET Code:
    Dim angleText = $"{angle * 90} degrees"
    Not a big win, but it makes the code shorter and more readable.

  6. #6
    Fanatic Member
    Join Date
    Jun 2019
    Posts
    557

    Re: [RESOLVED] Optical Character Recognition

    Now a more substantial optimization that may avoid some future problems:

    Loading an image from a file (Bitmap.FromFile()) actually locks the file, which may become an issue if you run multiple threads or processes and more than one of them tries to load the image from the same file. And because OCR is a slow operation, the lock persists for the whole processing time.

    The solution is quite simple: load the file into a MemoryStream, close the file, then load the image from that memory stream instead of from the file. This leaves the file accessible to other processes.
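
    A minimal sketch of that approach, assuming System.Drawing is available. ImageLoading and LoadWithoutLock are hypothetical names used only for illustration:

```vbnet
Imports System.Drawing
Imports System.IO

Public Module ImageLoading

    ' Reads the whole file into memory and builds the bitmap from that copy,
    ' so no handle to the file stays open once this function returns.
    Public Function LoadWithoutLock(imagePath As String) As Bitmap
        Dim bytes = File.ReadAllBytes(imagePath)   ' the file is opened, read and closed here
        Using stream As New MemoryStream(bytes)
            Using temp = Image.FromStream(stream)
                ' An image created by FromStream needs its stream kept open for the
                ' image's lifetime. Copying it into a new Bitmap removes that
                ' dependency, so the stream can be disposed safely.
                Return New Bitmap(temp)
            End Using
        End Using
    End Function

End Module
```

    If no rotation or other editing is needed, the bytes from File.ReadAllBytes could probably be passed straight to Pix.LoadFromMemory, skipping System.Drawing entirely.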

  7. #7

    Thread Starter
    Super Moderator dday9's Avatar
    Join Date
    Mar 2011
    Location
    South Louisiana
    Posts
    11,715

    Re: [RESOLVED] Optical Character Recognition

    Originally Posted by peterst
    The library you used (https://github.com/charlesw/tesseract/) supports everything from .NET Framework 4.0 to .NET 6, according to the NuGet package page:
    https://www.nuget.org/packages/Tesse...works-body-tab

    This allows it to support anything from older Windows versions (XP supports .NET Framework 4.0) to the latest ones, plus cross-platform use with .NET 6 (the current version).
    Yeah, in my edit I indicated that I didn't actually need to downgrade my .NET framework version.

    Originally Posted by peterst
    You can "optimize" this part of your code:

    VB.NET Code:
    Dim angleText = "{0} degrees"
    If (angle = 0) Then
        angleText = String.Format(angleText, "0")
    ElseIf (angle = 1) Then
        angleText = String.Format(angleText, "90")
    ElseIf (angle = 2) Then
        angleText = String.Format(angleText, "180")
    Else
        angleText = String.Format(angleText, "270")
    End If
    to
    VB.NET Code:
    Dim angleText = $"{angle * 90} degrees"
    Not a big win, but it makes the code shorter and more readable.
    Agreed. I'm not actually doing that specific part in my production code, but for the sake of providing an example on the forum I whipped that up.

    Originally Posted by peterst
    Now a more substantial optimization that may avoid some future problems:

    Loading an image from a file (Bitmap.FromFile()) actually locks the file, which may become an issue if you run multiple threads or processes and more than one of them tries to load the image from the same file. And because OCR is a slow operation, the lock persists for the whole processing time.

    The solution is quite simple: load the file into a MemoryStream, close the file, then load the image from that memory stream instead of from the file. This leaves the file accessible to other processes.
    I actually want the file to be locked in this situation. I'm not going to elaborate, but just take my word that I want the file to be locked in this situation.
