-
Aug 24th, 2019, 09:33 AM
#1
VB6 PDFium-Binding (Viewing + Text- and ImageExports)
On request, here is a little Demo which shows the (license-wise very generous) pdfium.dll in action.
PDFium (licensed under BSD by Google/Foxit) is the only PDF-reader/viewer-tool I know of,
which can be used (and deployed) also in commercial Apps ...
(the size of pdfium.dll is about 4 to 5MB in its windows-version, depending on optional compile-includes - that's normal)
The VB6-Demo here comes in form of a little Viewer-App, which also allows Exports of:
- Document-PlainText (into a Zip-Archive, which later will contain page_1.txt to page_n.txt files)
- Document-Images (also into a Zip, either as *.png or *.jpg images, within page_x subfolders)
Those exports are done in two in-memory stages, followed by a final write to disk:
Stage 1 is an export into appropriate InMemoryDB-tables (Tables Text and Images, from where the contents can be selected into a Rs - containing two Fields -> Name As Text and Data As Blob aka. ByteArray, so the FileSystem is not touched by stage 1).
Stage 2 (as used in the Zip-File-Export, triggered by appropriate Buttons in the Viewer-App),
will convert the contents of either the Text- or Images MemDB-Tables into a Zip-ByteArray (still not touching the FileSystem)
Stage 3 (one line of code) will then simply write-out the appropriate Zip-ByteArray to disk.
Also the Read-Direction (of a PDF) can be triggered "FileSystem-less", when the PDF-content exists in a ByteArray somewhere
(e.g. when retrieved via a Download, or retrieved via a RemoteDB-call, sitting in an Rs-BlobField).
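A minimal sketch of that FileSystem-less read direction, assuming the public PDFium export FPDF_LoadMemDocument and the same stdcall-decorated Alias pattern the demo's Declares use. The @12 and @4 suffixes and the helper names OpenFromBytes and CloseDoc are assumptions for illustration, not taken from cPDFium, and FPDF_InitLibrary is assumed to have been called already. PDFium expects the buffer to stay valid for as long as the document is open, hence the module-level copy:

```vb
Option Explicit

'assumed Declares (please verify the decorated names with Depends.exe)
Private Declare Function FPDF_LoadMemDocument Lib "PDFium" Alias "_FPDF_LoadMemDocument@12" _
   (ByVal pData As Long, ByVal Size As Long, ByVal Password As String) As Long
Private Declare Sub FPDF_CloseDocument Lib "PDFium" Alias "_FPDF_CloseDocument@4" (ByVal hDoc As Long)

Private mBytes() As Byte 'the buffer has to outlive the document-handle
Private mDoc As Long

'hypothetical helper: opens a PDF from a non-empty, zero-based ByteArray
Public Function OpenFromBytes(Bytes() As Byte) As Boolean
  CloseDoc
  mBytes = Bytes 'keep our own copy alive
  mDoc = FPDF_LoadMemDocument(VarPtr(mBytes(0)), UBound(mBytes) + 1, vbNullString)
  OpenFromBytes = (mDoc <> 0)
End Function

Public Sub CloseDoc()
  If mDoc <> 0 Then FPDF_CloseDocument mDoc: mDoc = 0
  Erase mBytes 'only now is it safe to release the buffer
End Sub
```

The ByteArray could come from a download or from an Rs-BlobField, as described above.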
Ok, the 3 RC5-Base-libs (+ vbWidgets.dll in addition) are a requirement for this demo -
please download and (re-)register them in a recent version (Zip-Archive support is only available in recent ones).
As for a recent build of the pdfium.dll, there's a build-service provided by Pieter van Ginkel on GitHub here:
https://github.com/pvginkel/PdfiumBu.../master/Builds
There are several different versions of it available behind the above URL,
the one I'm using is under: 2018-04-08/Pdfium-x86-no_v8-no_xfa
So, to make the Demo work in the IDE, you will have to install (or re-register) the RC5-libs
in a folder of your choice on your dev-machine...
(this should not be the Project-Folder of this Demo, but a separate RC5-Folder - as e.g. C:\RC5\
...please make sure that both vbRichClient5.dll and vbWidgets.dll are registered beforehand)
Another pre-requisite before you run the Demo is to place the pdfium.dll in the Demo-Project's \Bin\ folder.
(it is the only Dll which needs to be placed there, when you run the Demo in the IDE,
...if you compile - and run the Executable, then also the 4 RC5-BaseDlls need to be placed in \Bin\)
The just mentioned \Bin\ folder of the Demo-Project currently contains only a ReadMe.txt, which describes this as well.
The pdfium-wrapping is done in 3 Classes:
- cPDFium (the main-class)
- cPDFiumPageText (a cPDFium-derivable ChildClass, which allows interaction with Page-Texts)
- cPDFiumPageObject (a cPDFium-derivable ChildClass, which allows interaction with Page-Objects)
The latter class above, can represent different (enumerable) PageObject-types as shown in the following enum:
Code:
Public Enum ePdfObjectType
  FPDF_PAGEOBJ_UNKNOWN = 0
  FPDF_PAGEOBJ_TEXT = 1
  FPDF_PAGEOBJ_PATH = 2
  FPDF_PAGEOBJ_IMAGE = 3
  FPDF_PAGEOBJ_SHADING = 4
  FPDF_PAGEOBJ_FORM = 5
End Enum
The FPDF_PAGEOBJ_IMAGE entry above (highlighted in magenta in the original post) is the one we will filter for when we do Image-Exports (per Page).
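The filtering by object type could look roughly like the following sketch. It is not the demo's actual code: the Declares follow the Alias pattern used in cPDFium, the @N suffixes are assumptions to be checked with Depends.exe, and older pdfium builds may export the count function as FPDFPage_CountObject (singular). GetImageObjects is a hypothetical helper name.

```vb
Option Explicit

'assumed Declares for three public PDFium exports
Private Declare Function FPDFPage_CountObjects Lib "PDFium" Alias "_FPDFPage_CountObjects@4" _
   (ByVal hPage As Long) As Long
Private Declare Function FPDFPage_GetObject Lib "PDFium" Alias "_FPDFPage_GetObject@8" _
   (ByVal hPage As Long, ByVal Index As Long) As Long
Private Declare Function FPDFPageObj_GetType Lib "PDFium" Alias "_FPDFPageObj_GetType@4" _
   (ByVal hObj As Long) As Long

Private Const IMAGE_OBJ As Long = 3 'same value as FPDF_PAGEOBJ_IMAGE in the enum

'hypothetical helper: returns the handles of all image-objects on an already loaded page
Public Function GetImageObjects(ByVal hPage As Long) As Collection
  Dim i As Long, hObj As Long
  Dim Result As Collection
  Set Result = New Collection
  For i = 0 To FPDFPage_CountObjects(hPage) - 1
    hObj = FPDFPage_GetObject(hPage, i)
    If hObj <> 0 Then
      'skip everything which is not an image (texts, paths, shadings, forms)
      If FPDFPageObj_GetType(hObj) = IMAGE_OBJ Then Result.Add hObj
    End If
  Next
  Set GetImageObjects = Result
End Function
```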
Note, that this Demo (due to using the RC5-libs), is (when compiled):
- directly regfree deployable as a true portable App (without any Setup)
- fully DPI-aware also on HighRes-Displays (without any problems with e.g. 200% DPI-scaling)
- not using any Win32-API-calls (and thus fully "linux-aware", when compiled by a future platform-independent compiler)
- not really requiring any manifesting (since the UI is not using any Windows-CommonControls, and DPI-awareness is ensured by Cairo)
Here a ScreenShot:
And here the Zipped Demo-Source (without any Dlls in the Bin-Folder, please ensure its population yourself)
PDFiumViewer.zip
Have fun!
Olaf
Last edited by Schmidt; Aug 24th, 2019 at 09:41 AM.
-
Aug 24th, 2019, 04:21 PM
#2
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Nice job, and very useful
Thank you
-
Aug 26th, 2019, 01:38 AM
#3
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
I'll try to test it, but I need to install RC5.
My current homemade PDF viewer is very powerful, and based on MODI.
I export everything to JPG, then open and manage with MODI, including OCR, extracting, etc...
And 100% working on all Windows, and TS also
but of course quite a lot of code behind
-
Aug 26th, 2019, 04:16 AM
#4
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
My current homemade PDF viewer is very powerful, and based on MODI.
But one needs to have Microsoft Office, right?
-
Aug 26th, 2019, 04:30 AM
#5
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
No need to have Microsoft Office.
Just install MODI from the officepack (which is free)
-
Aug 26th, 2019, 06:20 AM
#6
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by Thierry69
No need to have Microsoft Office.
Just install MODI from the officepack (which is free)
Well, I "googled" around and couldn't get an installer to download. What is that "officepack"?
All I could find was some tricky solutions from microsoft itself. Is there a simple solution to deploy it?
-
Aug 26th, 2019, 06:37 AM
#7
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Search Sharepointdesigner.exe
-
Aug 26th, 2019, 07:16 AM
#8
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by Thierry69
Search Sharepointdesigner.exe
SharePoint Designer for Office 2007 is no longer available. I was able to install MDI to TIFF though, and got mdi2tif.exe, mstfcore.dll, mstfink.dll, msptls.dll, and richedit20.dll inside Program Files (x86)\modiconv. Is this all one needs?
-
Aug 26th, 2019, 07:49 AM
#9
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
You can still download it, not hard to find it, and no problem to install it
You'll have an OCX installed, and it is powerful
-
Aug 26th, 2019, 08:43 AM
#10
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by Thierry69
You can still download it, not hard to find it, and no problem to install it
You'll have an OCX installed, and it is powerful
I think I tried from everywhere. All I could get is for Office 2013.
I guess it's a dead end now.
Thanks anyway.
-
Aug 26th, 2019, 11:08 AM
#11
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Yes, there is no newer version.
When installing it, just install the Imaging part, there is no need for the rest
-
Aug 26th, 2019, 11:42 AM
#12
Hyperactive Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
What I mean is that the OLDEST version available is for Office 2013, and it doesn't include the Imaging Tools.
The last one with MODI is for Office 2007 and MS link simply "disappeared".
I was hoping that, being freeware, the installer could be found somewhere else, but no, there must be "something" in the license that prevents it, as usual with MS pseudo-free stuff.
-
Aug 26th, 2019, 11:44 AM
#13
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
-
Aug 30th, 2019, 11:18 PM
#14
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Schmidt,
Will it work (developing) in XP ?
Rob
-
Aug 31st, 2019, 07:24 AM
#15
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by Bobbles
Schmidt,
Will it work (developing) in XP ?
The pdfium.dll on the link I gave in the opener post, was built with a newer VC-environment,
and that (depending on the VC-project-settings) can cause problems on "systems, which ran out of support".
Out of interest, I've just tested "pdfium.dll" with "depends.exe" on an XP-VM,
and it says that "ieshims.dll" and also "wer.dll" were expected - but missing.
(those were introduced with IE8 for Vista, though not with IE8 for XP).
So, whilst the RC5-package would work on XP (compiled with a 7-year-old VC-environment and statically linked msvcrt-deps),
the pdfium.dll will not load on XP "out of the box" due to the above missing dependencies.
There might be a chance, that those are contained in some "MS-VC-redist"-setup (or some other older MS-upgrade-package for XP),
but I don't have the time to find out, which one you might have to install to fix those dependencies.
5 years ago, I might have tried to solve that "for VB6-devs who still had to deploy to XP",
but in the meantime there's not much of an XP-userbase left "in the wild", to bother with it.
IMO (regarding "time to invest" on your end), you'd be better off,
to just move your VB6-stuff over to at least a Win7-installation (or -VM).
Olaf
-
Feb 17th, 2021, 12:04 AM
#16
Fanatic Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
I have 2 problems with this sample.
1 - When I open PDF files with Persian or Arabic content and then save as text, all the text is empty. Can this source support Arabic or Farsi fonts and UTF-8 or ...?
2 - I registered RC5 or RC6 completely. When I run the project it works, but when I make the exe (compiled) and then run the exe, I see this error:
file not found DIRECTCOM
I have no problem with other vbRichClient samples, and I can make their exes and run them without problems, but when I run this sample (as a compiled exe) it shows me that error.
-
Feb 17th, 2021, 02:56 AM
#17
Lively Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
@OLAF
Instead of ZIP files, is it possible to export both images and text to single files?
How to?
-
Feb 22nd, 2021, 03:01 AM
#18
Lively Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Instead of ZIP files, is it possible to export images and text to single files?
How can it be done?
Did I ask too difficult a question? :confuso:
-
Feb 22nd, 2021, 07:46 AM
#19
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by LeoFar
Did I ask too difficult a question? :confuso:
No, I'm just not reading the CodeBank-threads as regularly as the normal VB6-forum...
As for your question... there should be a loop "in the export-button-click-EventHandlers",
which currently ensures the "packing into a Zip".
It should be quite easy, to write the byte- or text-contents out into "single files" instead.
Where exactly do you have problems, adapting the code?
Olaf
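A minimal sketch of such a single-file write-out, using only native VB6 file I/O. WriteEntry and EnsureFolder are hypothetical helper names, not part of the demo. It assumes that TargetDir is a local path without a trailing backslash (UNC paths are not handled), and that entry names may contain a subfolder part such as page_1/ as in the Zip layout.

```vb
Option Explicit

'hypothetical helper: writes one exported entry (e.g. "page_1/img_1.png") below TargetDir
Public Sub WriteEntry(ByVal TargetDir As String, ByVal EntryName As String, Data() As Byte)
  Dim FNr As Integer, FileName As String
  FileName = TargetDir & "\" & Replace(EntryName, "/", "\")
  EnsureFolder Left$(FileName, InStrRev(FileName, "\") - 1)
  If Len(Dir$(FileName)) Then Kill FileName 'avoid leftovers of a longer, older file
  FNr = FreeFile
  Open FileName For Binary Access Write As #FNr
    Put #FNr, , Data 'in Binary mode the raw bytes are written, without an array-descriptor
  Close #FNr
End Sub

'creates all missing folders along Path, one level at a time
Private Sub EnsureFolder(ByVal Path As String)
  Dim Pos As Long
  Pos = InStr(4, Path, "\") 'start behind a drive-root like "C:\"
  Do While Pos > 0
    If Len(Dir$(Left$(Path, Pos - 1), vbDirectory)) = 0 Then MkDir Left$(Path, Pos - 1)
    Pos = InStr(Pos + 1, Path, "\")
  Loop
  If Len(Dir$(Path, vbDirectory)) = 0 Then MkDir Path
End Sub
```

Inside the export loop, one would copy the Data field of the current record into a local Byte array and pass it, together with the Name field, to WriteEntry instead of adding it to the Zip.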
-
Apr 19th, 2022, 01:51 AM
#20
New Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Dear Olaf,
cool stuff! Thanks a lot for that!
another question: have you ever implemented FPDF_SaveAsCopy or FPDF_SaveWithVersion in VB?
Thanks in advance
Wolfgang
-
Apr 20th, 2022, 03:18 AM
#21
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by WFuertbauer
another question: have you ever implemented FPDF_SaveAsCopy or FPDF_SaveWithVersion in VB?
No, but the cPDFium-Class (which contains the API-Declares) can be enhanced quite easily...
E.g. currently you will find:
...
Private Declare Function FPDFText_LoadPage Lib "PDFium" Alias "_FPDFText_LoadPage@4" (ByVal hPage As Long) As Long
...
Now, if you start Depends.exe (loading the pdfium-dll), you will see an export like: _FPDF_SaveAsCopy@12
It shouldn't be that much of a problem, to make your own Declare-line out of it (following the pattern).
As for the Parameters (probably 3, due to the @12), just google for the definition of this function.
HTH
Olaf
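The "probably 3, due to the @12" reasoning can be written down as a tiny helper. This is only an illustration (StackBytesFromExport is a hypothetical name), and the division by 4 assumes that every parameter occupies 4 bytes on the stack, which holds for Longs and pointers on x86 but not for Doubles.

```vb
Option Explicit

'returns the stack-byte-count of a stdcall-decorated export like "_FPDF_SaveAsCopy@12"
Public Function StackBytesFromExport(ByVal ExportName As String) As Long
  Dim Pos As Long
  Pos = InStrRev(ExportName, "@")
  If Pos > 0 Then StackBytesFromExport = Val(Mid$(ExportName, Pos + 1))
End Function

Public Sub Demo()
  '12 bytes \ 4 bytes per Long-sized parameter -> prints 3
  Debug.Print StackBytesFromExport("_FPDF_SaveAsCopy@12") \ 4
End Sub
```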
-
Apr 20th, 2022, 09:17 AM
#22
New Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Looking at the foxit pdfium sdk, SaveAsCopy needs a FPDF_FILEWRITE structure. Along with creating a new doc/pages and adding text/image objects, this would be super useful to be implemented in VB:
Code:
FPDF_BOOL STDCALL FPDF_SaveAsCopy(FPDF_DOCUMENT   document,
                                  FPDF_FILEWRITE* pFileWrite,
                                  FPDF_DWORD      flags)

Save the document to a copy in custom way.

Parameters
  [in] document    Handle to a FPDF_DOCUMENT object that specifies a valid PDF document.
  [in] pFileWrite  Pointer to a FPDF_FILEWRITE interface structure that specifies a custom file writing structure.
  [in] flags       Saving flag. It should be one of the following macro definitions:
                     FPDF_INCREMENTAL
                     FPDF_NO_INCREMENTAL
                     FPDF_REMOVE_SECURITY
Code:
FPDF_FILEWRITE Struct Reference (PDFium)
Interface structure for customized file writing access.
#include <fpdf_save.h>

Public Attributes
  int version
      Version number of the interface. Currently it must be 1.
  int (*WriteBlock)(struct FPDF_FILEWRITE_* pThis, const void* pData, unsigned long size)
      (Required) Callback function to output a block of data in your custom way.
https://developers.foxit.com/resourc...68f67a5c562f86
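Following the Declare-pattern from the previous post, the VB6 side of the above definitions might look like this sketch. The Alias is the export name mentioned earlier in this thread; the parameter names, the Type layout (two 4-byte members on x86) and the flag values (taken from fpdf_save.h as far as I know) should be double-checked against the header of the build in use.

```vb
Option Explicit

'flag values as defined in fpdf_save.h (please verify against your header)
Private Const FPDF_INCREMENTAL As Long = 1
Private Const FPDF_NO_INCREMENTAL As Long = 2
Private Const FPDF_REMOVE_SECURITY As Long = 3

'VB6 mirror of the C struct
Private Type FPDF_FILEWRITE
  Version As Long     'currently has to be 1
  WriteBlock As Long  'function-pointer to the write-callback
End Type

Private Declare Function FPDF_SaveAsCopy Lib "PDFium" Alias "_FPDF_SaveAsCopy@12" _
   (ByVal hDoc As Long, ByRef FileWrite As FPDF_FILEWRITE, ByVal Flags As Long) As Long
```

The callback itself is deliberately left out here. In the header, WriteBlock carries no explicit calling convention (so it is probably cdecl), whereas procedures passed via AddressOf are stdcall in VB6, so that part needs extra care before it can be used safely.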
-
Aug 12th, 2024, 04:34 AM
#23
Lively Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Hi Olaf,
in your example, texts and images are extracted from the PDF. Would the extraction of further data such as XML (XInvoice or ZUGFeRD) also work? Are there other functions in the DLL that are not encapsulated by your example?
-
Aug 12th, 2024, 05:15 AM
#24
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
Originally Posted by wwolf
in your example, texts and images are extracted from the PDF.
Would the extraction of further data such as XML (XInvoice or ZUGFeRD) also work?
I'm quite sure, it would -
(one would just have to watch for the right "PDF-NodeType" in the enumeration...)
Originally Posted by wwolf
Are there other functions in the DLL that are not encapsulated by your example?
Of course - the example covers only about 10% of the available exports -
E.g. there's a whole bunch of them, dedicated to the "write-direction", which my example did not cover -
fafalone did an example recently, which demonstrated "merging" of images and other PDFs into a given document IIRC.
Olaf
-
Aug 12th, 2024, 05:46 AM
#25
Lively Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
I have loaded a simple ZUGFeRD invoice into your example and used FPDFPage_GetObject and FPDFPageObj_GetType to output the object types. I only found FPDF_PAGEOBJ_TEXT and FPDF_PAGEOBJ_PATH in the PDF. I now assume that one of these objects represents the contained XML. In your example, only image objects are evaluated, so I can't get any further for the time being. I am currently trying to find helpful documentation on PDFium. Do you have any links or suggestions that might help me?
-
Aug 12th, 2024, 08:49 AM
#26
Member
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
The only way for me to create ZUGFeRD / FacturX (PDF+XML) is to use a dirty call to a Python program with a library dedicated to this task.
Would it be possible to make this kind of document with VB6+pdfium.dll only ?
-
Aug 12th, 2024, 12:50 PM
#27
Re: VB6 PDFium-Binding (Viewing + Text- and ImageExports)
My examples were merging pdf pages from multiple pdfs into a single pdf, and text search; didn't delve into the objects on a page. The latter might be helpful as it delves a little farther into accessing native pdfium string types.
The headers are the best source for documentation I've seen.
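Regarding the ZUGFeRD question further up: as far as I know, the invoice XML in ZUGFeRD / Factur-X files is stored as an embedded file attachment at document level, not as a page object, which would explain why only TEXT and PATH objects showed up. The fpdf_attachment.h header covers that area. Below is a minimal sketch which only lists the attachment names. The Declares follow the Alias pattern from this thread, the @N suffixes are assumptions, and whether the 2018-04-08 build already exports these functions is unverified (check with Depends.exe). GetAttachmentNames is a hypothetical helper name.

```vb
Option Explicit

'assumed Declares for exports described in fpdf_attachment.h
Private Declare Function FPDFDoc_GetAttachmentCount Lib "PDFium" Alias "_FPDFDoc_GetAttachmentCount@4" _
   (ByVal hDoc As Long) As Long
Private Declare Function FPDFDoc_GetAttachment Lib "PDFium" Alias "_FPDFDoc_GetAttachment@8" _
   (ByVal hDoc As Long, ByVal Index As Long) As Long
Private Declare Function FPDFAttachment_GetName Lib "PDFium" Alias "_FPDFAttachment_GetName@12" _
   (ByVal hAtt As Long, ByVal pBuf As Long, ByVal BufLen As Long) As Long

'hypothetical helper: returns the names of all embedded files of an open document
Public Function GetAttachmentNames(ByVal hDoc As Long) As Collection
  Dim i As Long, hAtt As Long, ByteLen As Long
  Dim B() As Byte, S As String
  Dim Result As Collection
  Set Result = New Collection
  For i = 0 To FPDFDoc_GetAttachmentCount(hDoc) - 1
    hAtt = FPDFDoc_GetAttachment(hDoc, i)
    If hAtt <> 0 Then
      ByteLen = FPDFAttachment_GetName(hAtt, 0, 0) 'first call: required size in bytes
      If ByteLen > 2 Then 'more than just the UTF-16 null-terminator
        ReDim B(0 To ByteLen - 1)
        FPDFAttachment_GetName hAtt, VarPtr(B(0)), ByteLen
        S = B 'the bytes are UTF-16LE already, so a direct assignment is sufficient
        Result.Add Left$(S, ByteLen \ 2 - 1) 'strip the terminating null-char
      End If
    End If
  Next
  Set GetAttachmentNames = Result
End Function
```

Reading the actual file content would need one more export from the same header, whose signature changed between pdfium versions, so it is better taken directly from the header that matches the Dll.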