Thread: VB6 PDFium-Binding (Viewing + Text- and ImageExports)

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,356

    VB6 PDFium-Binding (Viewing + Text- and ImageExports)

    On request, here is a little Demo which shows the (license-wise very generous) pdfium.dll in action.

    PDFium (licensed under BSD by Google/Foxit) is the only PDF-reader/viewer-tool I know of
    which can also be used (and deployed) in commercial Apps...
    (the size of pdfium.dll is about 4 to 5MB in its Windows-version, depending on optional compile-includes; that's normal)

    The VB6-Demo here comes in form of a little Viewer-App, which also allows Exports of:
    - Document-PlainText (into a Zip-Archive, which later will contain page_1.txt to page_n.txt files)
    - Document-Images (also into a Zip, either as *.png or *.jpg images, within page_x subfolders)
    Those exports are done in three stages:
    Stage 1 is an export into appropriate InMemoryDB-tables (the tables Text and Images, from where the contents can be selected into an Rs containing two Fields -> Name As Text and Data As Blob, aka ByteArray, so the FileSystem is not touched by stage 1).

    Stage 2 (as used in the Zip-File-Export, triggered by the appropriate Buttons in the Viewer-App)
    will convert the contents of either the Text- or the Images-MemDB-Table into a Zip-ByteArray (still not touching the FileSystem).

    Stage 3 (one line of code) will then simply write out the resulting Zip-ByteArray to disk.

    Also the Read-Direction (of a PDF) can be triggered "FileSystem-less", when the PDF-content exists in a ByteArray somewhere
    (e.g. when retrieved via a Download, or retrieved via a RemoteDB-call, sitting in an Rs-BlobField).
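
    For the FileSystem-less read direction, PDFium itself exports FPDF_LoadMemDocument, which takes a pointer to the PDF bytes, their length and an optional password. The following is only a minimal sketch of a matching Declare. The alias assumes the same stdcall decoration as the other exports of this build (three 4-byte parameters, hence @12) and should be checked with Depends.exe. LoadPdfFromBytes is a hypothetical helper name, and whether cPDFium wraps the call in this way is an assumption.

    ```vb
    Option Explicit

    ' Assumed alias, following the "_Name@ByteCount" pattern of the other exports
    Private Declare Function FPDF_LoadMemDocument Lib "PDFium" Alias "_FPDF_LoadMemDocument@12" _
      (ByVal pData As Long, ByVal ByteLen As Long, ByVal Password As String) As Long

    ' Hypothetical helper: returns a document handle (0 on failure).
    ' PdfBytes must stay allocated and unchanged until the document is closed again,
    ' because PDFium reads from the buffer directly instead of copying it.
    ' The library is expected to be initialized (FPDF_InitLibrary) before this call.
    Public Function LoadPdfFromBytes(PdfBytes() As Byte) As Long
      Dim ByteLen As Long
      ByteLen = UBound(PdfBytes) - LBound(PdfBytes) + 1
      ' vbNullString passes a NULL pointer, meaning "no password"
      LoadPdfFromBytes = FPDF_LoadMemDocument(VarPtr(PdfBytes(LBound(PdfBytes))), ByteLen, vbNullString)
    End Function
    ```

    The ByteArray passed in could come from a Download or from an Rs-BlobField; the caller has to keep it alive (e.g. in a module- or class-level variable) for as long as the returned handle is in use.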


    Ok, the 3 RC5-Base-libs (+ vbWidgets.dll in addition) are a requirement for this demo -
    please download and (re-)register them in a recent version (Zip-Archive support is only available in recent ones).

    As for a recent build of the pdfium.dll, there's a build-service provided by Pieter van Ginkel on GitHub here:
    https://github.com/pvginkel/PdfiumBu.../master/Builds

    There are several different versions of it available behind the above URL;
    the one I'm using is under: 2018-04-08/Pdfium-x86-no_v8-no_xfa

    So, to make the Demo work in the IDE, you will have to install (or re-register) the RC5-libs
    in a folder of your choice on your dev-machine...
    (this should not be the Project-Folder of this Demo, but a separate RC5-Folder, e.g. C:\RC5\
    ...please make sure that both vbRichClient5.dll and vbWidgets.dll are registered beforehand)

    Another prerequisite before you run the Demo is to place the pdfium.dll in the Demo-Project's \Bin\-Folder.
    (it is the only Dll which needs to be placed there when you run the Demo in the IDE,
    ...if you compile and run the Executable, then the 4 RC5-BaseDlls need to be placed in \Bin\ as well)
    The just mentioned \Bin\-Folder of the Demo-Project currently contains only a ReadMe.txt (which describes this as well).

    The pdfium-wrapping is done in 3 Classes:
    - cPDFium (the main-class)
    - cPDFiumPageText (a cPDFium-derivable ChildClass, which allows interaction with Page-Texts)
    - cPDFiumPageObject (a cPDFium-derivable ChildClass, which allows interaction with Page-Objects)

    The latter class above can represent different (enumerable) PageObject-types, as shown in the following enum:
    Code:
    Public Enum ePdfObjectType
      FPDF_PAGEOBJ_UNKNOWN = 0
      FPDF_PAGEOBJ_TEXT = 1
      FPDF_PAGEOBJ_PATH = 2
      FPDF_PAGEOBJ_IMAGE = 3
      FPDF_PAGEOBJ_SHADING = 4
      FPDF_PAGEOBJ_FORM = 5
    End Enum
    The FPDF_PAGEOBJ_IMAGE entry above (magenta-colored in the original post) is the one we will filter for when we do Image-Exports (per Page).
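
    Filtering a page for that object type could look roughly like the sketch below. FPDFPage_GetObject and FPDFPageObj_GetType are the PDFium exports for enumerating page objects; FPDFPage_CountObjects is assumed to be exported by the build in use (older builds may only offer the singular FPDFPage_CountObject). The aliases are assumptions that follow the stdcall decoration pattern and should be verified with Depends.exe. GetImageObjects and PAGEOBJ_IMAGE are hypothetical names, not taken from the demo classes.

    ```vb
    Option Explicit

    Private Const PAGEOBJ_IMAGE As Long = 3 ' same value as FPDF_PAGEOBJ_IMAGE in ePdfObjectType

    ' Assumed aliases (stdcall decoration, 4 bytes per parameter)
    Private Declare Function FPDFPage_CountObjects Lib "PDFium" Alias "_FPDFPage_CountObjects@4" _
      (ByVal hPage As Long) As Long
    Private Declare Function FPDFPage_GetObject Lib "PDFium" Alias "_FPDFPage_GetObject@8" _
      (ByVal hPage As Long, ByVal Index As Long) As Long
    Private Declare Function FPDFPageObj_GetType Lib "PDFium" Alias "_FPDFPageObj_GetType@4" _
      (ByVal hObj As Long) As Long

    ' Hypothetical helper: collects the handles of all image objects on an already loaded page
    Public Function GetImageObjects(ByVal hPage As Long) As Collection
      Dim i As Long, hObj As Long
      Set GetImageObjects = New Collection
      For i = 0 To FPDFPage_CountObjects(hPage) - 1 ' object indexes are zero-based
        hObj = FPDFPage_GetObject(hPage, i)
        If hObj <> 0 Then
          If FPDFPageObj_GetType(hObj) = PAGEOBJ_IMAGE Then GetImageObjects.Add hObj
        End If
      Next
    End Function
    ```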

    Note, that this Demo (due to using the RC5-libs), is (when compiled):
    - directly regfree deployable as a true portable App (without any Setup)
    - fully DPI-aware also on HighRes-Displays (without any problems with e.g. 200% DPI-scaling)
    - not using any Win32-API-calls (and thus fully "linux-aware", when compiled by a future platform-independent compiler)
    - not really requiring any manifesting (since the UI is not using any Windows-CommonControls, and DPI-awareness is ensured by Cairo)

    Here is a ScreenShot: (image not included)

    And here the Zipped Demo-Source (without any Dlls in the Bin-Folder, please ensure its population yourself)
    PDFiumViewer.zip

    Have fun!

    Olaf
    Last edited by Schmidt; Aug 24th, 2019 at 09:41 AM.

  2. #2
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    Nice job, and very useful
    Thank you
    Carlos

  3. #3
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    I'll try to test it, but I need to install RC5.

    My current homemade PDF viewer is very powerful, and based on MODI.
    I export everything to JPG, then open and manage it with MODI, including OCR, extraction, etc...
    And it works 100% on all Windows versions, and on TS as well

    but of course quite a lot of code behind

  4. #4
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    My current homemade PDF viewer is very powerful, and based on MODI.
    But one needs to have Microsoft Office, right?
    Carlos

  5. #5
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    No need to have Microsoft Office.
    Just install MODI from the officepack (which is free)

  6. #6
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    Quote Originally Posted by Thierry69
    No need to have Microsoft Office.
    Just install MODI from the officepack (which is free)
    Well, I "googled" around and couldn't find an installer to download. What is that "officepack"?
    All I could find were some tricky solutions from Microsoft itself. Is there a simple solution to deploy it?
    Carlos

  7. #7
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    Search Sharepointdesigner.exe

  8. #8
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    Quote Originally Posted by Thierry69
    Search Sharepointdesigner.exe
    SharePoint Designer for Office 2007 is no longer available. I was able to install MDI to TIFF though, and got mdi2tif.exe, mstfcore.dll, mstfink.dll, msptls.dll, and richedit20.dll inside Program Files (x86)\modiconv. Is this all one needs?
    Carlos

  9. #9
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    You can still download it; it is not hard to find, and no problem to install.
    You'll have an OCX installed, and it is powerful

  10. #10
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    Quote Originally Posted by Thierry69
    You can still download it; it is not hard to find, and no problem to install.
    You'll have an OCX installed, and it is powerful
    I think I tried everywhere. All I could get is the one for Office 2013.
    I guess it's a dead end now.
    Thanks anyway.
    Carlos

  11. #11
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    Yes, there is no newer version.
    When installing it, just install the Imaging part; there is no need for the rest

  12. #12
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    What I mean is that the OLDEST version available is for Office 2013, and it doesn't include the Imaging Tools.
    The last one with MODI is for Office 2007, and the MS link simply "disappeared".
    I was hoping that, being freeware, the installer could be found somewhere else, but no; there must be "something" in the license that prevents it, as usual with MS pseudo-free stuff.
    Carlos

  13. #13
    Fanatic Member
    Join Date
    Jan 2015
    Posts
    610

    PM sent

  14. #14
    Frenzied Member
    Join Date
    Dec 2008
    Location
    Melbourne Australia
    Posts
    1,487

    Schmidt,
    Will it work (for development) on XP?
    Rob

  15. #15

    Thread Starter
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,356

    Quote Originally Posted by Bobbles
    Schmidt,
    Will it work (for development) on XP?
    The pdfium.dll on the link I gave in the opener post, was built with a newer VC-environment,
    and that (depending on the VC-project-settings) can cause problems on "systems, which ran out of support".

    Out of interest, I've just tested "pdfium.dll" with "depends.exe" on an XP-VM,
    and it says that "ieshims.dll" and also "wer.dll" were expected - but missing.
    (those were introduced with IE8 for Vista, though not with IE8 for XP).

    So, whilst the RC5-package would work on XP (compiled with a 7-year-old VC-environment and statically linked msvcrt-deps),
    the pdfium.dll will not load on XP "out of the box" due to the above missing dependencies.

    There might be a chance, that those are contained in some "MS-VC-redist"-setup (or some other older MS-upgrade-package for XP),
    but I don't have the time to find out, which one you might have to install to fix those dependencies.

    5 years ago, I might have tried to solve that "for VB6-devs who still had to deploy to XP",
    but in the meantime there's not much of an XP-userbase left "in the wild", to bother with it.

    IMO (regarding "time to invest" on your end), you'd be better off,
    to just move your VB6-stuff over to at least a Win7-installation (or -VM).

    Olaf

  16. #16
    Fanatic Member
    Join Date
    Sep 2007
    Location
    any where
    Posts
    579

    I have 2 problems with this sample.
    1- When I open PDF files with Persian or Arabic content and then save as text, all the text is empty. Can this source support Arabic or Farsi fonts and UTF-8 or ...?

    2- I registered RC5 (or RC6) completely. When I run the project it works, but when I make the exe (compiled) and then run it, I see this error:
    file not found DIRECTCOM
    I have no problem with other vbRichClient samples; I can make their exe and run them without problems, but when I run this sample as a compiled exe it shows me that error.

  17. #17
    Lively Member
    Join Date
    Nov 2020
    Posts
    67

    @OLAF

    Instead of ZIP files, is it possible to export both images and text to single files?
    How to?

  18. #18
    Lively Member
    Join Date
    Nov 2020
    Posts
    67


    Instead of ZIP files, is it possible to export both images and text to single files?
    How to?
    Did I ask too difficult a question? :confused:

  19. #19

    Thread Starter
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,356

    Quote Originally Posted by LeoFar
    Did I ask too difficult a question? :confused:
    No, I'm just not reading the CodeBank-threads as regularly as the normal VB6-forum...

    As for your question... there should be a loop "in the export-button-click-EventHandlers",
    which currently ensures the "packing into a Zip".

    It should be quite easy, to write the byte- or text-contents out into "single files" instead.
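
    As a rough sketch of such a "single files" variant (not the code from the demo): WriteBytesToFile uses only plain VB6 file I/O, while ExportToSingleFiles is a hypothetical loop which assumes that the Rs selected from the Text- or Images-table offers ADO-like members (EOF, MoveNext, Fields) and the two Fields Name and Data described in the opener post.

    ```vb
    Option Explicit

    ' Writes a ByteArray to disk, replacing an existing file of the same name
    Public Sub WriteBytesToFile(ByVal FileName As String, Bytes() As Byte)
      Dim FNr As Integer
      If Len(Dir$(FileName)) > 0 Then Kill FileName ' Binary mode would not truncate an existing file
      FNr = FreeFile
      Open FileName For Binary Access Write As #FNr
      Put #FNr, , Bytes
      Close #FNr
    End Sub

    ' Hypothetical loop: Rs is assumed to be positioned on its first record,
    ' and every Data field is assumed to contain a non-empty ByteArray
    Public Sub ExportToSingleFiles(Rs As Object, ByVal TargetFolder As String)
      Dim Bytes() As Byte, FileName As String
      If Right$(TargetFolder, 1) <> "\" Then TargetFolder = TargetFolder & "\"
      Do Until Rs.EOF
        ' flatten names like page_1/image.png into page_1_image.png, to avoid creating subfolders
        FileName = Replace(Replace(Rs.Fields("Name").Value, "/", "_"), "\", "_")
        Bytes = Rs.Fields("Data").Value
        WriteBytesToFile TargetFolder & FileName, Bytes
        Rs.MoveNext
      Loop
    End Sub
    ```

    The target folder has to exist already; the sketch deliberately flattens the page_x subfolders of the Image-Export into the file names instead of creating directories.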

    Where exactly do you have problems, adapting the code?

    Olaf

  20. #20
    New Member
    Join Date
    Apr 2022
    Posts
    1

    Dear Olaf,
    cool stuff! Thanks a lot for that!
    another question: have you ever implemented FPDF_SaveAsCopy or FPDF_SaveWithVersion in VB?
    Thanks in advance
    Wolfgang

  21. #21

    Thread Starter
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,356

    Quote Originally Posted by WFuertbauer
    another question: have you ever implemented FPDF_SaveAsCopy or FPDF_SaveWithVersion in VB?
    No, but the cPDFium-Class (which contains the API-Declares) can be enhanced quite easily...

    E.g. currently you will find:
    ...
    Private Declare Function FPDFText_LoadPage Lib "PDFium" Alias "_FPDFText_LoadPage@4" (ByVal hPage As Long) As Long
    ...


    Now, if you start Depends.exe (loading the pdfium-dll), you will see an export like: _FPDF_SaveAsCopy@12

    It shouldn't be that much of a problem, to make your own Declare-line out of it (following the pattern).
    As for the Parameters (probably 3, due to the @12), just google for the definition of this function.
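
    Following that pattern, a Declare-line for the export _FPDF_SaveAsCopy@12 might look like the sketch below. The three parameters match the C definition FPDF_SaveAsCopy(document, pFileWrite, flags); the VB6 parameter names are my own choice, and the structure pointer is passed as a plain Long here (e.g. obtained via VarPtr), which is one possible option and not necessarily how cPDFium would do it.

    ```vb
    ' Derived from the export name _FPDF_SaveAsCopy@12 (3 parameters, 4 bytes each).
    ' Returns a non-zero value on success (FPDF_BOOL in the C header).
    Private Declare Function FPDF_SaveAsCopy Lib "PDFium" Alias "_FPDF_SaveAsCopy@12" _
      (ByVal hDoc As Long, ByVal pFileWrite As Long, ByVal Flags As Long) As Long
    ```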

    HTH

    Olaf

  22. #22
    New Member
    Join Date
    Jul 2019
    Posts
    9

    Looking at the foxit pdfium sdk, SaveAsCopy needs a FPDF_FILEWRITE structure. Along with creating a new doc/pages and adding text/image objects, this would be super useful to be implemented in VB:

    Code:
    FPDF_BOOL STDCALL FPDF_SaveAsCopy(FPDF_DOCUMENT   document,
                                      FPDF_FILEWRITE* pFileWrite,
                                      FPDF_DWORD      flags)

    Save the document to a copy in custom way.

    Parameters
        [in] document    Handle to a FPDF_DOCUMENT object that specifies a valid PDF document.
        [in] pFileWrite  Pointer to a FPDF_FILEWRITE interface structure that specifies a custom file writing structure.
        [in] flags       Saving flag. It should be one of the following macro definitions:

            FPDF_INCREMENTAL
            FPDF_NO_INCREMENTAL
            FPDF_REMOVE_SECURITY
    Code:
    FPDF_FILEWRITE Struct Reference (PDFium)

    Interface structure for customized file writing access.

    #include <fpdf_save.h>

    Public Attributes
        int version
            Version number of the interface. Currently it must be 1.

        int (*WriteBlock)(struct FPDF_FILEWRITE_ *pThis, const void *pData, unsigned long size)
            (Required) Callback function to output a block of data in your custom way.
    https://developers.foxit.com/resourc...68f67a5c562f86
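
    Translated to VB6, the structure and the flags above could be declared roughly as follows (a sketch for a standard module, since a Public Type is not allowed in a normal class module). The flag values 1, 2 and 3 are the ones from fpdf_save.h; ePdfSaveFlags is a hypothetical enum name.

    ```vb
    Option Explicit

    ' Saving flags as defined in fpdf_save.h
    Public Enum ePdfSaveFlags
      FPDF_INCREMENTAL = 1
      FPDF_NO_INCREMENTAL = 2
      FPDF_REMOVE_SECURITY = 3
    End Enum

    ' VB6 counterpart of the C struct: two 4-byte members in a 32-bit process
    Public Type FPDF_FILEWRITE
      version As Long     ' must be 1
      WriteBlock As Long  ' address of the callback: int WriteBlock(pThis, pData, size)
    End Type
    ```

    The callback itself is left out of this sketch on purpose: the header declares WriteBlock without an explicit calling convention, so a C compiler will normally treat it as cdecl, whereas a VB6 function passed via AddressOf is stdcall. Whether a given build tolerates that mismatch would have to be verified before relying on it.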

  23. #23
    Lively Member
    Join Date
    Sep 2016
    Location
    Germany, Bavaria
    Posts
    77

    Hi Olaf,
    in your example, texts and images are extracted from the PDF. Would the extraction of further data such as XML (XInvoice or ZUGFeRD) also work? Are there other functions in the DLL that are not encapsulated by your example?

  24. #24

    Thread Starter
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,356

    Quote Originally Posted by wwolf
    in your example, texts and images are extracted from the PDF.
    Would the extraction of further data such as XML (XInvoice or ZUGFeRD) also work?
    I'm quite sure, it would -
    (one would just have to watch for the right "PDF-NodeType" in the enumeration...)

    Quote Originally Posted by wwolf
    Are there other functions in the DLL that are not encapsulated by your example?
    Of course - the example covers only about 10% of the available exports -
    E.g. there's a whole bunch of them, dedicated to the "write-direction", which my example did not cover -
    fafalone did an example recently, which demonstrated "merging" of images and other PDFs into a given document IIRC.

    Olaf

  25. #25
    Lively Member
    Join Date
    Sep 2016
    Location
    Germany, Bavaria
    Posts
    77

    I have loaded a simple ZUGFeRD invoice into your example and used FPDFPage_GetObject and FPDFPageObj_GetType to output the object types. I only found FPDF_PAGEOBJ_TEXT and FPDF_PAGEOBJ_PATH in the PDF. I now assume that one of these objects represents the contained XML. In your example, only Image objects are evaluated, so I can't get any further for the time being. I am currently trying to find helpful documentation on PDFium. Do you have any links or suggestions that might help me?

  26. #26
    Member
    Join Date
    Mar 2020
    Posts
    42

    So far, the only way for me to create ZUGFeRD / FacturX (PDF+XML) is a dirty call to a Python program with a library dedicated to this task.
    Would it be possible to make this kind of document with VB6 + pdfium.dll only?

  27. #27
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    6,179

    My examples were merging pdf pages from multiple pdfs into a single pdf, and text search; didn't delve into the objects on a page. The latter might be helpful as it delves a little farther into accessing native pdfium string types.

    The headers are the best source for documentation I've seen.
