Search inside a pdf document for a specific text without opening the document?-VBForums

Thread: Search inside a pdf document for a specific text without opening the document?

  1. #1

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665

    Question Search inside a pdf document for a specific text without opening the document?

    How can I search inside a PDF document for specific text without opening the document? If the text is found in the PDF, a MsgBox should pop up. I really need help with this.

  2. #2

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665
    I wonder whether I can use ShellExecute to search inside a PDF document for specific text. If it's possible, how?

  3. #3
    pathfinder NotLKH's Avatar
    Join Date
    Apr 2001
    Posts
    2,397
    Without using the Adobe Acrobat SDK or some sort of plugin, I'm afraid it's extremely hard.
    Near impossible if the PDF is using subset fonts.

    Impossible if the fonts have been converted to outlines.

    IMHO,
    -Lou

    Unless you're looking for Document Structuring elements, which are normally text strings. Then it's pretty simple. Of course, I don't think you meant you were looking for things such as startxref.


  4. #4

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665
    Do you or someone else have an example of how to do this with the Adobe Acrobat SDK or some sort of plugin?

  5. #5
    pathfinder NotLKH's Avatar
    Join Date
    Apr 2001
    Posts
    2,397
    Nope. I'd have to develop it, since I haven't used the SDK for about 3 years.

    If you have the Adobe Acrobat application {not the Reader, but the full app}, then I'd suggest downloading the Adobe Acrobat SDK 5 development kit from Adobe's site {5.0 is still publicly available, but 6.0 isn't. You'd have to spend 1000 bucks to become registered as a developer before you could get 6.0}.
    Once you get the 5.0 SDK, with the manuals, you'll find documentation for developing in VB. They have methods for extracting text, and even methods for saving PDFs as text via VB.

    for example:
    VB Code:
    GetText: BSTR GetText(long nTextIndex);
    Description
        Gets the text from the specified element of a text selection.
        To obtain all text in a text selection, use PDTextSelect.GetNumText
        to determine the number of elements in the text selection,
        then use this method in a loop to obtain each of the elements.
    Parameters
        nTextIndex The element of the text selection to get.
    Return Value
        The text, or an empty string if nTextIndex is greater than the
        number of elements in the text selection.
    Related Methods
        AVDoc.ClearSelection
        AVDoc.SetTextSelection
        AVDoc.ShowTextSelect
        PDPage.CreatePageHilite
        PDDoc.CreateTextSelect
        PDPage.CreateWordHilite
        PDTextSelect.Destroy
        PDTextSelect.GetBoundingRect
        PDTextSelect.GetNumText
        PDTextSelect.GetPage
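
    The GetNumText / GetText loop described in that excerpt could be sketched roughly as below. This is untested and only an illustration: PageContainsText is a hypothetical helper name, the text selection is assumed to cover the whole page rectangle, the element index is assumed to start at 0, and the elements are assumed to come back in reading order. It also needs the full Acrobat application, since it drives Acrobat through OLE automation.

    ```vb
    ' Hypothetical helper (not from the SDK): True if sFind occurs in the
    ' text of page nPage (0-based) of an already opened AcroExch.PDDoc.
    Private Function PageContainsText(ByVal pdDoc As Object, _
                                      ByVal nPage As Long, _
                                      ByVal sFind As String) As Boolean
        Dim pdPage As Object, ptSize As Object
        Dim rc As Object, sel As Object
        Dim i As Long, sText As String

        ' Build a rectangle covering the whole page (assumption: the
        ' page origin is at the lower-left corner).
        Set pdPage = pdDoc.AcquirePage(nPage)
        Set ptSize = pdPage.GetSize            ' AcroExch.Point: x = width, y = height
        Set rc = CreateObject("AcroExch.Rect")
        rc.Left = 0
        rc.bottom = 0
        rc.Right = ptSize.x
        rc.Top = ptSize.y

        ' Select all text in that rectangle, then collect each element.
        Set sel = pdDoc.CreateTextSelect(nPage, rc)
        If Not sel Is Nothing Then
            For i = 0 To sel.GetNumText - 1
                sText = sText & sel.GetText(i)
            Next i
            sel.Destroy
        End If

        ' Case-insensitive substring test on the collected text.
        PageContainsText = (InStr(1, sText, sFind, vbTextCompare) > 0)
    End Function
    ```

    Whether a multi-word phrase is matched depends on how Acrobat splits the selection into elements, so treat the substring test as a starting point only.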

    -Lou

  6. #6

  7. #7

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665
    Thanks NotLKH!

    I have tried, but without any success. I have also tried the sample app you uploaded here, also without success. Can you show me a simple form with working code which does this?
    I would be very grateful if you could do this.

  8. #8
    pathfinder NotLKH's Avatar
    Join Date
    Apr 2001
    Posts
    2,397
    searchpdf won't work if you haven't installed the SDK on your system.

    It also {I think} won't work if you don't have the complete Adobe Acrobat app. And I think it has to be 3.0 or higher.

    So, try getting the Acrobat SDK 5.0 from Adobe's site {find the Downloads area... ummm, just looking at their site. Not so good. Perhaps I'll refresh...}
    lol!!!


    Here's a link:

    Complete SDK 5.0 for Windows {22 meg}

    Note:

    To access this content:

    You must have an ASN Web Account. If you already have an ASN Web Account, please login now.
    The ASN Web Account is Free, so go ahead and register.

    -Lou

  9. #9

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665
    I already have Acrobat SDK 5.0 installed on my computer, but it's still not working. Can you show me a simple working form with two textboxes and one command button, where I put the search word in the first textbox and the path to the PDF file (C:\pdfFile.pdf) in the second textbox? I search with the command button. If it finds the search word, a MsgBox should pop up.

  10. #10
    pathfinder NotLKH's Avatar
    Join Date
    Apr 2001
    Posts
    2,397
    Hmm. It seemed to work for me, although it looks too complex at this time to understand exactly what it does.

    Now, inside your sample VB Apps, I think you'll find the project VBjsoFindWord helpful.

    It lets you select a PDF to search, and lets you input text to search for.

    In the FindWordJSO sub, I tweaked the code a bit to the following:

    VB Code:
    If MsgBox("The word is found: Count " & nCount & " And it is on Page " & (1 + i) & ".  Continue?", vbYesNo) = vbNo Then
        GoTo TheEnd
    End If

    Then, running the project, when I select the file:

    "C:\Program Files\Adobe\Acrobat 5.0 SDK\Documentation\RELEASENOTES.PDF"

    and search for "new", it says it finds 3 instances, on pages 1, 3, and 4.
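
    For the two-textbox form asked for above, a minimal sketch along the same lines might look like this. It is not the SDK sample itself: CountWordInPdf is a hypothetical helper, Text1, Text2 and Command1 are assumed to be the default VB control names, and it assumes GetJSObject works on a PDDoc opened without a viewer window (the SDK sample may go through an AVDoc instead). It is untested and still needs the full Acrobat application.

    ```vb
    ' Hypothetical helper: counts whole-word matches of sWord in the PDF
    ' at sPath. Returns -1 if the file could not be opened.
    Private Function CountWordInPdf(ByVal sPath As String, _
                                    ByVal sWord As String) As Long
        Dim pdDoc As Object, jso As Object
        Dim nPage As Long, nWord As Long
        Dim nWords As Long, nHits As Long

        ' Requires the full Acrobat application, not the Reader.
        Set pdDoc = CreateObject("AcroExch.PDDoc")
        If Not pdDoc.Open(sPath) Then
            CountWordInPdf = -1
            Exit Function
        End If

        ' Walk every word on every page through the JavaScript bridge.
        Set jso = pdDoc.GetJSObject
        For nPage = 0 To pdDoc.GetNumPages - 1
            nWords = jso.getPageNumWords(nPage)
            For nWord = 0 To nWords - 1
                If StrComp(jso.getPageNthWord(nPage, nWord), sWord, _
                           vbTextCompare) = 0 Then
                    nHits = nHits + 1
                End If
            Next nWord
        Next nPage

        pdDoc.Close
        CountWordInPdf = nHits
    End Function

    ' Text1 holds the search word, Text2 the path to the PDF file.
    Private Sub Command1_Click()
        Dim nHits As Long
        nHits = CountWordInPdf(Text2.Text, Text1.Text)
        If nHits > 0 Then
            MsgBox "Found """ & Text1.Text & """ " & nHits & " time(s)."
        ElseIf nHits = 0 Then
            MsgBox "Not found."
        Else
            MsgBox "Could not open " & Text2.Text
        End If
    End Sub
    ```

    The comparison is case-insensitive and matches single words only, so a phrase would need extra handling.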


    Does this work for you?

    -Lou

  11. #11

    Thread Starter
    Fanatic Member
    Join Date
    Jul 2002
    Posts
    665
    Even though I have SDK 5.0 installed on my computer and the reference to the Adobe Acrobat 5 Type Library is checked, I get an error:
    "ActiveX component can't create object"

    It stops on the line:
    Set gApp = CreateObject("AcroExch.App")

    Why?
    Last edited by Pirre001; Oct 11th, 2003 at 05:40 PM.

  12. #12
    pathfinder NotLKH's Avatar
    Join Date
    Apr 2001
    Posts
    2,397
    posted earlier
    If you have the Adobe Acrobat Application, {Not the reader, but the full app},

    again, posted earlier
    It also {I think} won't work if you don't have the complete Adobe Acrobat app. And I think it has to be 3.0 or higher

    Do you have the complete Adobe Acrobat Application?
    Not the Reader, but Acrobat 3.x, 4.x, 5.x, or 6.x?
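
    One way to check this from VB before running any search code is to trap the failure yourself. This is only a sketch: AcrobatIsAvailable is a hypothetical helper name. "ActiveX component can't create object" is run-time error 429, which is what CreateObject raises when the AcroExch.App class isn't registered on the machine.

    ```vb
    ' Hypothetical helper: True if the Acrobat automation object can be
    ' created, False if CreateObject fails (for example with error 429).
    Private Function AcrobatIsAvailable() As Boolean
        Dim app As Object

        On Error Resume Next
        Set app = CreateObject("AcroExch.App")
        AcrobatIsAvailable = (Err.Number = 0) And Not (app Is Nothing)
        Err.Clear
        On Error GoTo 0

        Set app = Nothing
    End Function
    ```

    If this returns False, the search code has nothing to talk to, whether or not the SDK and the type library reference are in place.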

    -Lou
