Solved

Determine number of pages in a PDF

  • 17 June 2019


Is there a way to determine the number of pages in a PDF other than reading them all as pdf_page_metadata and sampling the last feature? Some of my PDFs are over 5000 pages, and this seems inefficient, especially as this information appears in the Document Properties in Acrobat.

Setting the Pages to Read parameter to -1 produces an error.

Best answer by debbiatsafe 12 July 2019, 22:11


4 replies


Hi @jdh

Unfortunately, the PDF reader does not currently read the total number of pages in a file as an attribute. FMEENGINE-60427 has been created to track this enhancement request. I will update this post once this feature has been added.

I would suggest continuing to use the method you have described (reading all pages as features using the pdf_page_metadata feature type and sampling the last one) to find the total number of pages.

Please note that the Pages to Read parameter supports integers and page ranges (e.g. "1, 3-" would read everything but the second page). Negative indices (e.g. -1) are not supported in this parameter.
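
To illustrate the syntax described above, here is a rough Python sketch of how a Pages to Read value could be expanded into page numbers. It is not FME's implementation: the function name pages_to_read and the total_pages argument are made up for this example, and the handling of anything beyond plain integers, closed ranges, and open-ended ranges such as "3-" is an assumption.

```python
from typing import List


def pages_to_read(spec: str, total_pages: int) -> List[int]:
    """Expand a spec such as "1, 3-" into sorted 1-based page numbers.

    Hypothetical helper for illustration only; it mimics the syntax
    described in this thread, not FME's actual parser.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        # A leading "-" would be a negative index, which is not supported.
        if part.startswith("-"):
            raise ValueError(f"negative index not supported: {part!r}")
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range ("3-") runs to the last page.
            end = int(end_text) if end_text.strip() else total_pages
        else:
            start = end = int(part)
        if start < 1:
            raise ValueError(f"page numbers start at 1: {part!r}")
        # Clamp the upper bound to the document length.
        selected.update(range(start, min(end, total_pages) + 1))
    return sorted(selected)


print(pages_to_read("1, 3-", 5))  # [1, 3, 4, 5]
```

Note that expanding an open-ended range still requires knowing the total page count, which is exactly the value the question is trying to obtain.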

Userlevel 1
Badge +21

If you're able to use ArcPy, PDFDocument has a pageCount property

Badge +22

If you're able to use ArcPy, PDFDocument has a pageCount property

Unfortunately, on the server environment this is going to run on, there can be no external Python dependencies :(

I had considered using a variant of pypdf in a scripted parameter, but see the first point :(
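
For reference, a standard-library-only fallback is possible in some cases. The sketch below is a heuristic, not a real PDF parser: it scans the raw bytes for page tree nodes (/Type /Pages) and takes the largest /Count value, which for a simple file is the root node's total. The function name count_pdf_pages is made up for this example. It assumes the page tree is stored uncompressed; it will fail on files that use object streams (common from PDF 1.5 onward) or encryption, and stale objects left by incremental updates could give a wrong answer.

```python
import re

# Page tree nodes are typed /Pages; individual pages are typed /Page.
_PAGES_NODE = re.compile(rb"/Type\s*/Pages\b")
_COUNT = re.compile(rb"/Count\s+(\d+)")
# Body of each indirect object, between "obj" and "endobj".
_OBJECT = re.compile(rb"\bobj\b(.*?)\bendobj\b", re.DOTALL)


def count_pdf_pages(data: bytes) -> int:
    """Best-effort page count from raw PDF bytes (heuristic sketch)."""
    counts = []
    for body in _OBJECT.findall(data):
        if _PAGES_NODE.search(body):
            match = _COUNT.search(body)
            if match:
                counts.append(int(match.group(1)))
    if not counts:
        raise ValueError("no uncompressed /Pages node found")
    # The root node's count covers the whole document, so it is the largest.
    return max(counts)


# Usage (path is a placeholder):
# with open("document.pdf", "rb") as handle:
#     print(count_pdf_pages(handle.read()))
```

This reads the whole file into memory, so for very large documents it may be no cheaper than the reader-based approach.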

Hi @jdh

I'm pleased to say that, as of FME 2019.1 (builds 19599 and higher), the PDF reader outputs the total number of pages contained within a PDF document. The number of pages is stored in an attribute called num_pages on the feature emitted from the pdf_document_info_metadata feature type.

You can download the most recent FME installers at www.safe.com/downloads
