Solved

Determine number of pages in a PDF


jdh
Contributor

Is there a way to determine the number of pages in a PDF other than reading every page as a pdf_page_metadata feature and sampling the last one? Some of my PDFs are over 5000 pages, so this seems inefficient, especially as the page count already appears in the Document Properties in Acrobat.

Setting the Pages to Read parameter to -1 produces an error.


4 replies

debbiatsafe
Safer

Hi @jdh

Unfortunately, the PDF reader does not currently read the total number of pages in a file as an attribute. FMEENGINE-60427 has been created to track this enhancement request. I will update this post once this feature has been added.

I would suggest continuing to use the method you have described (read all pages as features using the pdf_page_metadata feature type and sampling the last) to find the total number of pages.

Please note the Pages to Read parameter supports integers and page ranges (e.g. "1, 3-" would read everything but the second page). Negative indexes (e.g. -1) are not supported within this parameter.
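To make the range syntax concrete, here is a rough Python sketch of how a specification like "1, 3-" could be expanded into page numbers. It is an illustration only, not the reader's actual parser: the function name expand_pages, the 1-based numbering, and the reading of an open-ended "3-" as "page 3 through the last page" are assumptions.

```python
def expand_pages(spec, total_pages):
    """Expand a spec such as "1, 3-" into a sorted list of 1-based page numbers.

    Hypothetical helper: it mirrors the syntax described above, not FME's code.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if part.startswith("-"):
            # Negative indexes such as -1 are rejected, as in the parameter.
            raise ValueError("negative index not supported: %r" % part)
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range ("3-") is assumed to run to the last page.
            end = int(end_text) if end_text.strip() else total_pages
        else:
            start = end = int(part)
        pages.update(range(start, min(end, total_pages) + 1))
    return sorted(pages)


print(expand_pages("1, 3-", 5))
```

Note that expanding an open-ended range still requires the total page count, which is the value the original question is trying to obtain.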


ebygomm
Influencer
  • June 19, 2019

If you're able to use ArcPy, PDFDocument has a pageCount property
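A minimal sketch of that approach, assuming the arcpy.mp module that ships with ArcGIS Pro (ArcMap exposes the same class through arcpy.mapping instead). The helper name pdf_page_count is made up for illustration, and the code only runs where ArcPy is installed.

```python
def pdf_page_count(pdf_path):
    """Return the page count of an existing PDF via ArcPy's PDFDocument."""
    # Imported inside the function so the module loads even without ArcPy.
    import arcpy

    # PDFDocumentOpen returns a PDFDocument; pageCount is a read-only property.
    document = arcpy.mp.PDFDocumentOpen(pdf_path)
    return document.pageCount
```

Calling pdf_page_count with the path to a PDF would return the count without iterating over individual pages.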


jdh
Contributor
  • Author
  • June 19, 2019
ebygomm wrote:

If you're able to use ArcPy, PDFDocument has a pageCount property

Unfortunately, the server environment this is going to run on cannot have any external Python dependencies :(.

I had considered using a variant of pypdf in a scripted parameter, but see the first point :(
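For the no-external-dependencies case, one rough option is a standard-library heuristic that counts page objects directly in the file bytes. This is only a sketch under stated assumptions: the function names are hypothetical, and the count is wrong for PDFs that store page objects inside compressed object streams (common from PDF 1.5 onward) or that carry stale objects from incremental saves.

```python
import re

# Matches "/Type /Page" but not "/Type /Pages" (the page-tree node).
PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def count_page_objects(data):
    """Count uncompressed page objects in raw PDF bytes (heuristic)."""
    return sum(1 for _ in PAGE_OBJECT.finditer(data))


def count_pages_heuristic(path):
    """Read the whole file into memory and count its page objects."""
    with open(path, "rb") as handle:
        return count_page_objects(handle.read())
```

Because the whole file is read into memory, very large PDFs may need a chunked or memory-mapped variant, and the result is worth checking against a known page count before relying on it.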

debbiatsafe
Safer
  • Best Answer
  • July 12, 2019

Hi @jdh

I'm pleased to say that, as of FME 2019.1 (builds 19599 and higher), the PDF reader outputs the total number of pages contained within a PDF document. The page count is stored in an attribute called num_pages on the feature emitted from the pdf_document_info_metadata feature type.

You can download the most recent FME installers at www.safe.com/downloads

