-
Oct 29th, 2024, 07:07 PM
#1
VB6 - Simple OCR Text Recognition from a PictureBox Image
This project is a simple example of using the "WinRT OCR Engine" to recognize the text from a PictureBox image pulled from the Clipboard. Just copy any image containing some text in the Clipboard and click on the PictureBox to display it. The recognized text will appear in the TextBox below.
The code is fairly simple. The biggest hurdle to overcome was that the OCR Engine doesn't work with regular GDI bitmaps but uses "SoftwareBitmap" objects instead. So all we had to do was create such an object and then go through a bunch of interfaces just to expose the underlying byte buffer. Once we got that, it all came down to calling "GetDIBits" on the PictureBox picture handle to pull the bitmap bytes into the "SoftwareBitmap" object and let the OCR Engine do its magic. That meant calling the "RecognizeAsync" method and raising an event when the asynchronous operation completed (which is pretty much instant).
frmOCR.frm
Code:
Option Explicit

Private Type BITMAP
    bmType As Long
    bmWidth As Long
    bmHeight As Long
    bmWidthBytes As Long
    bmPlanes As Integer
    bmBitsPixel As Integer
    bmBits As Long
End Type

Private Type BITMAPINFOHEADER
    biSize As Long
    biWidth As Long
    biHeight As Long
    biPlanes As Integer
    biBitCount As Integer
    biCompression As Long
    biSizeImage As Long
    biXPelsPerMeter As Long
    biYPelsPerMeter As Long
    biClrUsed As Long
    biClrImportant As Long
End Type

Private Type RGBQUAD
    rgbBlue As Byte
    rgbGreen As Byte
    rgbRed As Byte
    rgbReserved As Byte
End Type

Private Type BITMAPINFO
    bmiHeader As BITMAPINFOHEADER
    bmiColors As RGBQUAD
End Type

Private Const DIB_RGB_COLORS As Long = 0

Private Declare Function GetDIBits Lib "gdi32" (ByVal hDC As Long, ByVal hBitmap As Long, ByVal lStart As Long, ByVal cLines As Long, ByVal lpvBits As Long, ByVal lpBMI As Long, ByVal lUsage As Long) As Long
Private Declare Function GetObjectW Lib "gdi32" (ByVal hGDIObj As Long, ByVal cbBuffer As Long, ByVal lpvObject As Long) As Long

Private WithEvents OcrEngine As cOCR, SoftwareBitmap As cSoftwareBitmap, bmiBitmapInfo As BITMAPINFO

Private Sub Form_Load()
    ' Request 32-bpp pixels from GetDIBits; width and height are filled in per image
    With bmiBitmapInfo.bmiHeader
        .biSize = LenB(bmiBitmapInfo.bmiHeader)
        .biPlanes = 1
        .biBitCount = 32
    End With
    Set SoftwareBitmap = New cSoftwareBitmap: Set OcrEngine = New cOCR
End Sub

' Raised when recognition completes: the whole recognized text in one chunk
Private Sub OcrEngine_GetText(sText As String)
    txtOCR = sText
End Sub

' Raised when recognition completes: per-line and per-word results
Private Sub OcrEngine_GetTextLines(colOcrLines As Collection)
    Dim OcrLine As cOcrLine, OcrWord As cOcrWord
    If Not colOcrLines Is Nothing Then
        #If bInIDE Then
            Debug.Print "Rotation Angle:", OcrEngine.TextAngle(False)
        #End If
        For Each OcrLine In colOcrLines
            For Each OcrWord In OcrLine.WordsCollection
                #If bInIDE Then
                    Debug.Print OcrWord.WordIndex, OcrWord.WordText
                #End If
                With OcrWord.RotatedBoundingRect
                    ' Convert the four corners from pixels to the form's scale mode
                    .X11 = ScaleX(.X11, vbPixels, ScaleMode): .Y11 = ScaleY(.Y11, vbPixels, ScaleMode)
                    .X12 = ScaleX(.X12, vbPixels, ScaleMode): .Y12 = ScaleY(.Y12, vbPixels, ScaleMode)
                    .X22 = ScaleX(.X22, vbPixels, ScaleMode): .Y22 = ScaleY(.Y22, vbPixels, ScaleMode)
                    .X21 = ScaleX(.X21, vbPixels, ScaleMode): .Y21 = ScaleY(.Y21, vbPixels, ScaleMode)
                    ' Outline the word by connecting the corners in order
                    picOCR.Line (.X11, .Y11)-(.X12, .Y12), vbRed
                    picOCR.Line (.X12, .Y12)-(.X22, .Y22), vbRed
                    picOCR.Line (.X22, .Y22)-(.X21, .Y21), vbRed
                    picOCR.Line (.X21, .Y21)-(.X11, .Y11), vbRed
                End With
            Next OcrWord
        Next OcrLine
    End If
End Sub

Private Sub picOCR_Click()
    Dim bmBitmap As BITMAP, hDC As Long, hOldBitmap As Long
    If Clipboard.GetFormat(vbCFBitmap) Then
        With picOCR
            Set .Picture = Clipboard.GetData
            GetObjectW .Picture.Handle, LenB(bmBitmap), VarPtr(bmBitmap)
            ' Resize the PictureBox to the image, then fit the TextBox and the form around it
            .Width = .ScaleX(bmBitmap.bmWidth, vbPixels, .ScaleMode): .Height = .ScaleY(bmBitmap.bmHeight, vbPixels, .ScaleMode)
            txtOCR.Move .Left, .Top + .Height, .Width
            Width = Width - ScaleWidth + .Left * 2 + .Width
            Height = Height - ScaleHeight + .Height + txtOCR.Height + .Top * 2
            ' A negative height asks GetDIBits for top-down rows
            With bmiBitmapInfo.bmiHeader: .biWidth = bmBitmap.bmWidth: .biHeight = -bmBitmap.bmHeight: End With
            If SoftwareBitmap.CreateSoftwareBitmap(bmBitmap.bmWidth, bmBitmap.bmHeight) Then
                ' Copy the pixels straight into the SoftwareBitmap buffer, then start recognition
                If GetDIBits(.hDC, .Picture.Handle, 0, bmBitmap.bmHeight, SoftwareBitmap.GetBitmapBuffer, VarPtr(bmiBitmapInfo), DIB_RGB_COLORS) Then
                    SoftwareBitmap.UnlockBitmapBuffer: OcrEngine.RecognizeAsync SoftwareBitmap
                End If
            End If
        End With
    Else
        MsgBox "Clipboard does not contain a picture!", vbExclamation, App.Title: Exit Sub
    End If
End Sub
It seems the more you crank up the picture contrast the better results you get on the text recognition. Best results would be from an image containing black text over a white background.
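One way to act on the contrast remark is to reduce the image to pure black and white before handing it to the OCR Engine. The following is only a minimal sketch under stated assumptions: `BinarizeBGRA` is a hypothetical helper that is not part of the demo project, it expects the pixels as a 32-bpp byte array in blue, green, red, alpha order (the layout a 32-bit GetDIBits call produces), and the default threshold of 128 is an arbitrary starting point rather than a tuned value.

```vb
Option Explicit

' Hypothetical helper (not part of the demo project): converts a 32-bpp
' BGRA pixel buffer to pure black and white in place.
Public Sub BinarizeBGRA(baPixels() As Byte, Optional ByVal lThreshold As Long = 128)
    Dim i As Long, lLuma As Long, bValue As Byte
    For i = LBound(baPixels) To UBound(baPixels) - 3 Step 4
        ' Integer approximation of luma: 0.299 R + 0.587 G + 0.114 B
        lLuma = (baPixels(i + 2) * 299& + baPixels(i + 1) * 587& + baPixels(i) * 114&) \ 1000
        If lLuma >= lThreshold Then bValue = 255 Else bValue = 0
        ' Write the same value to blue, green and red; alpha is left untouched
        baPixels(i) = bValue: baPixels(i + 1) = bValue: baPixels(i + 2) = bValue
    Next i
End Sub
```

In the demo the pixels land directly in the SoftwareBitmap buffer, so using a helper like this would mean routing them through a byte array first and copying the result into the buffer afterwards. Whether thresholding actually helps on a given image is something to test; it can also destroy thin or low-contrast text.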
Here is the demo project: OCR.ZIP (Updated)
Last edited by VanGoghGaming; Nov 2nd, 2024 at 08:34 PM.
Reason: Added new features and bug fixes
-
Oct 29th, 2024, 10:14 PM
#2
Fanatic Member
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I used PaddleOCR-json_v1.4.0_windows_x86-64 from GitHub?
-
Oct 29th, 2024, 11:23 PM
#3
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Originally Posted by xxdoc123
I used PaddleOCR-json_v1.4.0_windows_x86-64 from GitHub?
Not sure if anyone besides you can answer whether you did or did not.
But yes there's lots of 3rd party solutions; I like VanGoghGaming's method here though since it's built in to Windows; though unfortunately only in the VB-unfriendly WinRT API.
-
Oct 30th, 2024, 12:46 AM
#4
Fanatic Member
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Only BMP can be recognized
-
Oct 30th, 2024, 05:18 AM
#5
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Very nice way to do OCR.
Better than my way (using Poppler).
I tried to convert it to an ActiveX DLL, so it could be generic (by passing a file or a picture as a parameter), but ran into some problems.
I don't have much time to investigate right now, but it could be interesting to make it a DLL.
-
Oct 30th, 2024, 09:46 AM
#6
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I tried to make the code as short as possible and present it as a proof of concept. Feel free to expand on it any way you want. There are many ways to accomplish what you want; here are some pointers:
1. The Long Way
- Create an IStream from a file using SHCreateStreamOnFileW
- Create a RandomAccessStream from the IStream using CreateRandomAccessStreamOverStream
- Enter WinRT land and create a BitmapDecoder object from the RandomAccessStream: BitmapDecoder.CreateAsync(RandomAccessStream)
- Get a SoftwareBitmap from the BitmapDecoder by calling its GetSoftwareBitmapAsync method
- Now you have a SoftwareBitmap object which you can use for text recognition with the code above
2. The Somewhat Shorter Way
- Use WIA to load a file into an IPicture object: With New WIA.ImageFile: .LoadFile sFileName: Set objIPicture = .FileData.Picture: End With
- Use SelectObject to select the objIPicture.Handle into a memory created hDC (CreateCompatibleDC(0))
- Now you can use GetDIBits as shown in the code above
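The shorter route above could look roughly like the sketch below. Treat it as an illustration, not as the code from the demo project: `LoadPixelsFromFile` is a hypothetical name, WIA is late-bound through CreateObject so no project reference is needed, and the pixels go into a plain byte array instead of the SoftwareBitmap buffer. It also leaves out the SelectObject step, because the GetDIBits documentation says the bitmap should not be selected into a device context at the time of the call; the memory DC is only passed as the reference DC.

```vb
Option Explicit

Private Type BITMAP
    bmType As Long
    bmWidth As Long
    bmHeight As Long
    bmWidthBytes As Long
    bmPlanes As Integer
    bmBitsPixel As Integer
    bmBits As Long
End Type

Private Type BITMAPINFOHEADER
    biSize As Long
    biWidth As Long
    biHeight As Long
    biPlanes As Integer
    biBitCount As Integer
    biCompression As Long
    biSizeImage As Long
    biXPelsPerMeter As Long
    biYPelsPerMeter As Long
    biClrUsed As Long
    biClrImportant As Long
End Type

Private Const DIB_RGB_COLORS As Long = 0

Private Declare Function CreateCompatibleDC Lib "gdi32" (ByVal hDC As Long) As Long
Private Declare Function DeleteDC Lib "gdi32" (ByVal hDC As Long) As Long
Private Declare Function GetObjectW Lib "gdi32" (ByVal hGDIObj As Long, ByVal cbBuffer As Long, ByVal lpvObject As Long) As Long
Private Declare Function GetDIBits Lib "gdi32" (ByVal hDC As Long, ByVal hBitmap As Long, ByVal lStart As Long, ByVal cLines As Long, ByVal lpvBits As Long, ByVal lpBMI As Long, ByVal lUsage As Long) As Long

' Hypothetical helper: loads an image file through WIA and returns its
' pixels as top-down 32-bpp BGRA bytes, plus the image size in pixels.
Public Function LoadPixelsFromFile(ByVal sFileName As String, baPixels() As Byte, lWidth As Long, lHeight As Long) As Boolean
    Dim objImageFile As Object, objPicture As StdPicture
    Dim bmBitmap As BITMAP, bmiHeader As BITMAPINFOHEADER, hMemDC As Long

    Set objImageFile = CreateObject("WIA.ImageFile") ' late-bound WIA object
    objImageFile.LoadFile sFileName
    Set objPicture = objImageFile.FileData.Picture

    ' Query the bitmap dimensions from the picture handle
    If GetObjectW(objPicture.Handle, LenB(bmBitmap), VarPtr(bmBitmap)) = 0 Then Exit Function
    lWidth = bmBitmap.bmWidth: lHeight = bmBitmap.bmHeight

    With bmiHeader
        .biSize = LenB(bmiHeader): .biPlanes = 1: .biBitCount = 32
        .biWidth = lWidth: .biHeight = -lHeight ' negative height = top-down rows
    End With

    ReDim baPixels(0 To lWidth * lHeight * 4 - 1)
    hMemDC = CreateCompatibleDC(0)
    LoadPixelsFromFile = GetDIBits(hMemDC, objPicture.Handle, 0, lHeight, VarPtr(baPixels(0)), VarPtr(bmiHeader), DIB_RGB_COLORS) <> 0
    DeleteDC hMemDC
End Function
```

To feed the OCR Engine, the destination pointer passed to GetDIBits would be the SoftwareBitmap buffer, as in the first post, rather than a byte array.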
Hope this helps.
Last edited by VanGoghGaming; Oct 30th, 2024 at 09:52 AM.
-
Oct 30th, 2024, 11:32 AM
#7
Fanatic Member
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
-
Oct 31st, 2024, 02:56 PM
#8
Lively Member
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
This is great. I've been wanting to do this since I first learnt about the WinRT APIs and saw people in the AutoHotkey community using them for this sort of OCR.
-
Nov 2nd, 2024, 01:16 AM
#9
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I've been playing around with this some more. In addition to providing the recognized text in a big chunk, the OCR engine also exposes a collection of text Lines which in turn expose a collection of text Words (all via an abstract interface called IVectorView).
Code:
Private Function GetLines(lpOcrResult As Long) As Collection
    Dim lpIVectorViewLines As Long, lpIVectorViewWords As Long, lCountLines As Long, lCountWords As Long, i As Long, j As Long, hString As Long, lpOcrLine As Long, lpOcrWord As Long, _
        OcrLine As cOcrLine, OcrWord As cOcrWord, colWords As New Collection, rcBoundingRect As RECTF
    If InvokePtr(lpOcrResult, IOcrResult_GetLines, VarPtr(lpIVectorViewLines)) = S_OK Then
        If InvokePtr(lpIVectorViewLines, IVectorView_GetSize, VarPtr(lCountLines)) = S_OK Then
            For i = 0 To lCountLines - 1
                If GetLines Is Nothing Then Set GetLines = New Collection
                If InvokePtr(lpIVectorViewLines, IVectorView_GetAt, i, VarPtr(lpOcrLine)) = S_OK Then
                    Set OcrLine = New cOcrLine: OcrLine.LineIndex = i
                    If InvokePtr(lpOcrLine, IOcrLine_GetText, VarPtr(hString)) = S_OK Then OcrLine.LineText = WindowsGetString(hString): hString = WindowsDeleteString(hString)
                    If InvokePtr(lpOcrLine, IOcrLine_GetWords, VarPtr(lpIVectorViewWords)) = S_OK Then
                        If InvokePtr(lpIVectorViewWords, IVectorView_GetSize, VarPtr(lCountWords)) = S_OK Then
                            For j = 0 To lCountWords - 1
                                If InvokePtr(lpIVectorViewWords, IVectorView_GetAt, j, VarPtr(lpOcrWord)) = S_OK Then
                                    Set OcrWord = New cOcrWord: OcrWord.WordIndex = j
                                    If InvokePtr(lpOcrWord, IOcrWord_GetText, VarPtr(hString)) = S_OK Then OcrWord.WordText = WindowsGetString(hString): hString = WindowsDeleteString(hString)
                                    If InvokePtr(lpOcrWord, IOcrWord_GetBoundingRect, VarPtr(rcBoundingRect)) = S_OK Then OcrWord.BoundingRect = rcBoundingRect
                                    colWords.Add OcrWord: ReleasePtr lpOcrWord
                                End If
                            Next j
                        End If
                        ReleasePtr lpIVectorViewWords
                    End If
                    Set OcrLine.WordsCollection = colWords: Set colWords = Nothing: GetLines.Add OcrLine: ReleasePtr lpOcrLine
                End If
            Next i
        End If
        ReleasePtr lpIVectorViewLines
    End If
End Function
Each Word has a bounding rectangle which we can draw in the PictureBox using the Line method:
Code:
Private Sub OcrEngine_GetTextLines(colOcrLines As Collection)
    Dim OcrLine As cOcrLine, OcrWord As cOcrWord
    If Not colOcrLines Is Nothing Then
        For Each OcrLine In colOcrLines
            For Each OcrWord In OcrLine.WordsCollection
                With OcrWord.BoundingRect
                    .X = ScaleX(.X, vbPixels, ScaleMode): .Y = ScaleY(.Y, vbPixels, ScaleMode)
                    .Width = ScaleX(.Width, vbPixels, ScaleMode): .Height = ScaleY(.Height, vbPixels, ScaleMode)
                    picOCR.Line (.X, .Y)-(.X + .Width, .Y + .Height), vbRed, B
                End With
            Next OcrWord
        Next OcrLine
    End If
End Sub
The OCR Engine even recognizes text drawn at an angle and it also reports this angle of rotation. For example in this picture the rotation is -20.2 degrees around the center of the image:
Even if this looks like a simple problem of rotation/translation I can't figure out how to draw the new angled bounding rectangles in the PictureBox, all this trigonometry is doing my head in!
Maybe someone could figure out a formula for the new coordinates.
-
Nov 2nd, 2024, 02:02 AM
#10
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
The extraction with the position could be interesting.
I already do the same using MODI: it extracts data based on templates I create. But this OCR is more reliable than the MODI one, so a combination of both could do a good job.
NB: The definitions for cOcrLine, cOcrWor... are missing
-
Nov 2nd, 2024, 03:03 AM
#11
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
In post #5 above you were talking about something called "Poppler", now you're saying "Modi", just how many OCR engines are you using?
Anyway, I haven't gotten around to updating the ZIP archive in the original post. The cOcrLine and cOcrWord are mostly empty classes just holding the properties you see being updated in the code above. In fact, the only reason they are classes is so that they can be added to their respective collections.
Right now I'm having trouble drawing the rotated rectangles around their corresponding words and the AI isn't helping much with the trigonometry, haha! I think it's because the rectangle coordinates are relative to the top-left corner while the rotation angle is relative to the center of the image. I may have to go old school on this (pen and paper) because I can't visualize it in my head for the life of me...
-
Nov 2nd, 2024, 03:23 AM
#12
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I am using Poppler, MODI, and now WinRT.
I combine everything to make it nearly perfect.
Poppler to convert PDF to JPG (and previously also to text).
WinRT (now) to extract the text.
Both combined to detect blank pages, and Poppler again to merge the pages into a new PDF without the blank ones.
Then MODI to display the JPG files and, based on templates I create with MODI, to detect the areas containing the data to be extracted.
I use the extracted text to determine which template should be used (based on VAT, account number, etc.).
It is not bad at all; I can do OCR on 90% of documents.
With the coordinates from the extraction, I could probably be more accurate.
I also have another OCR, which uses an AI engine, and there I can reach 95% accuracy.
But it costs around 0.20 € per page
-
Nov 2nd, 2024, 04:13 AM
#13
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Originally Posted by Thierry69
But it costs around 0.20 € per page
Alright, updated the code in the original post, now you can direct that to my PayPal account!
-
Nov 2nd, 2024, 04:31 AM
#14
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Try GDI world transformations for cheap rotation (or any affine transformation) on an hDC with no math.
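As a rough illustration of the world-transform idea, the sketch below rotates the drawing surface about a pivot and then draws an ordinary axis-aligned box, so GDI does the trigonometry for the corners. `DrawRotatedBox` and its parameters are hypothetical names, the box is drawn with whatever pen is currently selected in the DC, and all coordinates are assumed to be in pixels. Only documented GDI calls are used (SetGraphicsMode, SetWorldTransform, ModifyWorldTransform, MoveToEx, LineTo).

```vb
Option Explicit

Private Type XFORM
    eM11 As Single
    eM12 As Single
    eM21 As Single
    eM22 As Single
    eDx As Single
    eDy As Single
End Type

Private Const GM_ADVANCED As Long = 2
Private Const MWT_IDENTITY As Long = 1

Private Declare Function SetGraphicsMode Lib "gdi32" (ByVal hDC As Long, ByVal iMode As Long) As Long
Private Declare Function SetWorldTransform Lib "gdi32" (ByVal hDC As Long, lpXform As XFORM) As Long
Private Declare Function ModifyWorldTransform Lib "gdi32" (ByVal hDC As Long, lpXform As Any, ByVal iMode As Long) As Long
Private Declare Function MoveToEx Lib "gdi32" (ByVal hDC As Long, ByVal X As Long, ByVal Y As Long, ByVal lpPoint As Long) As Long
Private Declare Function LineTo Lib "gdi32" (ByVal hDC As Long, ByVal X As Long, ByVal Y As Long) As Long

' Hypothetical helper: outlines the box (lLeft, lTop, lWidth, lHeight)
' rotated by dblAngleDeg degrees about the pivot (sngPivotX, sngPivotY).
Public Sub DrawRotatedBox(ByVal hDC As Long, ByVal lLeft As Long, ByVal lTop As Long, ByVal lWidth As Long, ByVal lHeight As Long, _
                          ByVal sngPivotX As Single, ByVal sngPivotY As Single, ByVal dblAngleDeg As Double)
    Dim xfRotate As XFORM, dblRad As Double, lOldMode As Long

    dblRad = dblAngleDeg * Atn(1) / 45 ' degrees to radians (Atn(1) = pi / 4)
    With xfRotate
        .eM11 = Cos(dblRad): .eM12 = Sin(dblRad)
        .eM21 = -Sin(dblRad): .eM22 = Cos(dblRad)
        ' Translation terms chosen so that the pivot maps onto itself
        .eDx = sngPivotX - sngPivotX * .eM11 + sngPivotY * .eM12
        .eDy = sngPivotY - sngPivotX * .eM12 - sngPivotY * .eM11
    End With

    lOldMode = SetGraphicsMode(hDC, GM_ADVANCED) ' world transforms need advanced mode
    If lOldMode = 0 Then Exit Sub
    If SetWorldTransform(hDC, xfRotate) Then
        MoveToEx hDC, lLeft, lTop, 0
        LineTo hDC, lLeft + lWidth, lTop
        LineTo hDC, lLeft + lWidth, lTop + lHeight
        LineTo hDC, lLeft, lTop + lHeight
        LineTo hDC, lLeft, lTop
        ModifyWorldTransform hDC, ByVal 0&, MWT_IDENTITY ' reset before restoring the mode
    End If
    SetGraphicsMode hDC, lOldMode
End Sub
```

A call might look like `DrawRotatedBox picOCR.hDC, 10, 20, 80, 16, lImageWidth / 2, lImageHeight / 2, dblAngle`, where the last three arguments are placeholders for the image center and the reported angle. With y growing downward, a positive angle turns the box clockwise on screen; whether the reported text angle has to be negated is something to check against a test image.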
-
Nov 2nd, 2024, 05:23 AM
#15
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I've never done that before so I wouldn't know where to begin. Also the drawing is just for visual candy. What would be really helpful would be updated coordinates for the rotated/translated bounding rectangles to reflect the real position of the words. I don't know why the OCR Engine doesn't do these calculations automatically and instead it just provides the rotation angle.
Even the MSDN page doesn't mention how to calculate the coordinates of the bounding rectangles but instead talks about rotating the image:
Use the TextAngle property to overlay recognition results correctly on the original image. If the value of the TextAngle property is not null or 0 (zero), then to overlay the recognized text correctly on the original image, you either have to rotate the original image by the detected angle in a counter-clockwise direction, or rotate the recognized text by the detected angle in a clockwise direction.
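The remark quoted above amounts to rotating each corner of a word's bounding rectangle about a pivot. A minimal sketch of that calculation follows, with the caveat that `RotatePoint` is a hypothetical helper, that the pivot is assumed to be the center of the image (as observed earlier in the thread), and that the sign of the angle may need flipping depending on how the engine reports it.

```vb
Option Explicit

' Hypothetical helper: rotates the point (X, Y) in place about
' (CenterX, CenterY) by dblAngleDeg degrees. In pixel coordinates,
' where Y grows downward, a positive angle appears clockwise on screen.
Public Sub RotatePoint(X As Single, Y As Single, ByVal CenterX As Single, ByVal CenterY As Single, ByVal dblAngleDeg As Double)
    Dim dblRad As Double, dX As Double, dY As Double
    dblRad = dblAngleDeg * Atn(1) / 45 ' degrees to radians (Atn(1) = pi / 4)
    dX = X - CenterX: dY = Y - CenterY ' move the pivot to the origin
    X = CenterX + dX * Cos(dblRad) - dY * Sin(dblRad)
    Y = CenterY + dX * Sin(dblRad) + dY * Cos(dblRad)
End Sub
```

Applying it to the four corners (X, Y), (X + Width, Y), (X + Width, Y + Height) and (X, Y + Height) of a bounding rectangle, with the pivot at (image width / 2, image height / 2), gives the corners of the rotated box, which can then be connected with four Line calls. As a sanity check, rotating (1, 0) about the origin by 90 degrees lands at roughly (0, 1), that is, from "right" to "down", which is clockwise on screen.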
Last edited by VanGoghGaming; Nov 2nd, 2024 at 06:38 AM.
-
Nov 2nd, 2024, 08:37 PM
#16
-
Nov 3rd, 2024, 01:26 AM
#17
Fanatic Member
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Very good! Why can't I use Set .Picture = Clipboard.GetData on my system?
-
Nov 3rd, 2024, 04:00 AM
#18
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I was thinking this morning, and I think that with this code, it will be easy to recreate the old MODI, and enhance it by creating an OCX with PDF viewer and extraction of text based on templates.
I already did it with MODI.
NB: VanGoghGaming, I sent you a PM
-
Nov 12th, 2024, 09:53 AM
#19
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Originally Posted by Thierry69
I was thinking this morning, and I think that with this code, it will be easy to recreate the old MODI, and enhance it by creating an OCX with PDF viewer and extraction of text based on templates.
This should not be a problem. If you use the WinRT namespace Windows.Data.Pdf, you can extract the pages of a PDF into a WinRT Windows.Storage.Streams.RandomAccessStream and create a WinRT Windows.Graphics.Imaging.SoftwareBitmap from it. You can also create an IStream from the RandomAccessStream and use GDI+ to create an image to display the pages of the PDF. You can then pass the SoftwareBitmap to the WinRT OCR.
-
Nov 12th, 2024, 09:59 AM
#20
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
> You can also create an IStream from the RandomAccessStream and use GDI+ to create an image to display the pages of the PDF
This is worth a sample on its own because we have a pdfium based viewer w/ RC6 only.
cheers,
</wqw>
-
Nov 12th, 2024, 10:26 AM
#21
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Originally Posted by wqweto
> You can also create an IStream from the RandomAccessStream and use GDI+ to create an image to display the pages of the PDF
This is worth a sample on its own because we have a pdfium based viewer w/ RC6 only.
cheers,
</wqw>
You can find a WinRT example on ActiveVB in the Up/Download area -> VBC_PdfViewer.zip The disadvantage is that the pages of a PDF are rendered as images. This means that links cannot be clicked, fields cannot be filled in, etc.
-
Nov 12th, 2024, 01:08 PM
#22
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Yeah, I was looking at the "Windows.Data.Pdf Namespace" myself but was put off by the fact that it only renders PDFs as SoftwareBitmaps. The OCR Engine can be used to recognize the text from the images but as Franky mentioned above, you lose any interactivity with the PDF document...
-
Nov 12th, 2024, 01:19 PM
#23
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Originally Posted by VanGoghGaming
Yeah, I was looking at the "Windows.Data.Pdf Namespace" myself but was put off by the fact that it only renders PDFs as SoftwareBitmaps. The OCR Engine can be used to recognize the text from the images but as Franky mentioned above, you lose any interactivity with the PDF document...
Which other PDF samples/libraries do you both envision that one could use in VB6 to view PDFs with text selection and link clicking?
Is there anything remotely useful that goes beyond the source code of plain StdPicture viewers?
Edit: I'm looking at pdfium API -- it has links but it would be quite some work to make it usable in VB6. The other one is mupdf and this one is humongous -- ~40MB DLL.
cheers,
</wqw>
-
Nov 12th, 2024, 04:51 PM
#24
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
Well I guess you could simulate text selection given the fact that the OCR Engine provides rectangles for all the recognized words but not sure it's worth the trouble. I'm not familiar with the pdfium one, I've seen fafalone put together an example with it merging multiple PDFs into one.
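Simulating selection would start with a hit test: given a mouse position, find the word whose rectangle contains it. The sketch below is only an illustration under stated assumptions. `WORDBOX` and `HitTestWord` are hypothetical names (the demo project keeps its words in cOcrWord objects instead), the code is meant for a standard module, and the rectangles are assumed to be unrotated and expressed in the same units as the mouse coordinates.

```vb
Option Explicit

' Hypothetical stand-in for a recognized word and its bounding rectangle
Public Type WORDBOX
    X As Single
    Y As Single
    Width As Single
    Height As Single
    Text As String
End Type

' Returns the index of the first box containing the point (sngX, sngY),
' or -1 if no box contains it. The array must already be dimensioned.
Public Function HitTestWord(uaBoxes() As WORDBOX, ByVal sngX As Single, ByVal sngY As Single) As Long
    Dim i As Long
    HitTestWord = -1
    For i = LBound(uaBoxes) To UBound(uaBoxes)
        With uaBoxes(i)
            If sngX >= .X And sngX < .X + .Width And sngY >= .Y And sngY < .Y + .Height Then
                HitTestWord = i: Exit Function
            End If
        End With
    Next i
End Function
```

Calling this from MouseDown and MouseMove would give the first and last word of a selection; highlighting the range and copying the words' text would be built on top of that.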
-
Nov 13th, 2024, 02:57 AM
#25
Re: VB6 - Simple OCR Text Recognition from a PictureBox Image
I would only use the Windows.Data.Pdf namespace for quickly displaying PDFs or extracting individual pages as images where editing the PDF is not important. Alternatively, convert the pages of a PDF into a bitmap, edit the bitmap and output it again via a PDF printer. You could use this to program your own preview handler for PDFs, for example. An additional OCR to extract text would certainly be an advantage. The Snipping Tool can already extract text from images. Alternatively, I could also imagine having the recognized text read out using WinRT TTS.