
Thread: extract the pdf information

  1. #1

    Thread Starter
    Junior Member
    Join Date
    Sep 2003
    Posts
    27

    extract the pdf information

    Dear sir,
    I have a set of 300 PDF files. I want to write a search program that lets the user find PDF files by criteria such as text, layout, or images. For example: find all PDF files that contain the string "multimedia"; find all PDF files that contain the string "portal" and are longer than 100 pages; or find all PDF files that contain images with a certain histogram.
    The main problem is that I do not know how to extract the text, images, and layout from a PDF file. Could you please advise me on how to build such a program?
    Please email me at: [email protected]
    Thanks a million.
    hoang.

  2. #2
    Fanatic Member twanvl's Avatar
    Join Date
    Dec 2001
    Posts
    771
    You can find the specification for the PDF format at http://www.wotsit.org. I had a look, and it is a very large (500-page) document. Happy reading.
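
If reading the full specification is more than you need, the text and page-count searches from the first post can be sketched on top of an existing PDF library. The sketch below is only one possible approach and rests on assumptions not made in this thread: it uses the third-party pypdf package (its `PdfReader` class and `page.extract_text()` method), and the names `extract_page_texts`, `matches`, and `search_folder` are made up for illustration. It does not cover images or layout, and scanned PDFs without a text layer will yield no text.

```python
from pathlib import Path
from typing import Callable, List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the text of each page, one string per page.

    Assumption: the third-party pypdf package is installed. It is imported
    lazily so the search logic below can be used with any other extractor.
    """
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages, so fall back to "".
    return [page.extract_text() or "" for page in reader.pages]


def matches(page_texts: List[str], term: str, more_than_pages: int = 0) -> bool:
    """True if the document has more than `more_than_pages` pages and
    `term` occurs on at least one page (case-insensitive)."""
    if len(page_texts) <= more_than_pages:
        return False
    needle = term.lower()
    return any(needle in text.lower() for text in page_texts)


def search_folder(
    folder: str,
    term: str,
    more_than_pages: int = 0,
    extract: Callable[[str], List[str]] = extract_page_texts,
) -> List[str]:
    """Return the names of the PDF files in `folder` that match the criteria."""
    hits = []
    for path in sorted(Path(folder).glob("*.pdf")):
        try:
            page_texts = extract(str(path))
        except Exception:
            # Skip files the extractor cannot read (damaged, encrypted, ...).
            continue
        if matches(page_texts, term, more_than_pages):
            hits.append(path.name)
    return hits
```

With this, `search_folder("docs", "multimedia")` would list the files mentioning "multimedia", and `search_folder("docs", "portal", more_than_pages=100)` would add the page-count condition (the folder name "docs" is a placeholder). For 300 files it may be worth extracting the text once and caching it rather than re-reading every PDF on each search. The image-histogram search would need a separate image-extraction step that this sketch does not attempt.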
