Solved

How to retrieve width and height of JPG and PDFs?

  • September 20, 2018
  • 3 replies
  • 92 views

0xbox0
Contributor

How to retrieve width and height of JPG and PDFs?

Best answer by mac_sp

For the JPG, you can extract the width and height from the IFMERaster information.

Run your JPG features through a GeometryExtractor with Well Known Text as the output encoding. It returns a polygon with the four corners of your image, for example:

POLYGON ((0 -960,0 0,528 0,528 -960,0 -960))

Your image width is the non-zero x coordinate (528 here) and your height is the absolute value of the non-zero y coordinate (960 here, since y is negative).
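If you would rather compute the size than read it off by eye, here is a minimal Python sketch of that step (for instance, inside a PythonCaller). The function name `raster_size_from_wkt` is hypothetical, and it assumes the WKT is a simple axis-aligned rectangle like the one above; it takes the extent in each direction, which also handles the negative y values.

```python
import re


def raster_size_from_wkt(wkt):
    """Return (width, height) of the bounding box of a WKT polygon.

    Hypothetical helper: assumes plain decimal coordinates and an
    axis-aligned rectangle, as produced for a raster's extents.
    """
    # Pull every "x y" coordinate pair out of the WKT string.
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)", wkt)
    if not pairs:
        raise ValueError("no coordinates found in WKT")
    xs = [float(x) for x, _ in pairs]
    ys = [float(y) for _, y in pairs]
    # Extent in each direction, so the sign of y does not matter.
    return max(xs) - min(xs), max(ys) - min(ys)


width, height = raster_size_from_wkt(
    "POLYGON ((0 -960,0 0,528 0,528 -960,0 -960))"
)
print(width, height)  # 528.0 960.0
```

Using max minus min rather than "the non-zero coordinate" also keeps the result correct if the raster's origin is not at 0,0.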

For the PDF, you need to enable reading of rasterized pages in the reader parameters.

This provides the raster extents for each page (the 'pdf_page_number' attribute tracks the page number), and you can then repeat the GeometryExtractor steps above to derive the height and width.

EDIT: A CoordinateExtractor could be used to produce a similar result; which one to use is a matter of preference.


3 replies

david_r
Celebrity
  • September 20, 2018

Have a look at the RasterPropertyExtractor.


  • Best Answer
  • September 20, 2018



jakemolnar
  • September 20, 2018

@mac_sp, @0xbox0, I'll just add that for PDF you can also use "Non-Spatial > Metadata Objects To Read > Pages" to get a feature describing each page, which includes the size of the page in points (if that's what you're interested in).

Additionally, you may be interested in the native size of each image contained within the PDF. In that case, you should use the "Spatial > Read Images" option to read each image object, and then look at the raster properties as usual.
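For the page size in points mentioned above, a PDF point is 1/72 of an inch, so converting to physical units is simple arithmetic. A minimal Python sketch follows; the function names are hypothetical, and the 612 x 792 values are just an illustrative US Letter page, not output from the reader.

```python
POINTS_PER_INCH = 72.0  # a PDF point is 1/72 inch
MM_PER_INCH = 25.4


def points_to_inches(points):
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def points_to_mm(points):
    """Convert a length in PDF points to millimetres."""
    return points / POINTS_PER_INCH * MM_PER_INCH


# Illustrative page size in points (US Letter), assumed for the example.
page_width_pt, page_height_pt = 612, 792
print(points_to_inches(page_width_pt), points_to_inches(page_height_pt))
print(round(points_to_mm(page_width_pt), 1), round(points_to_mm(page_height_pt), 1))
```

Note that this is the page's physical size. Pixel dimensions of a rasterized page depend on the resolution used when rasterizing.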

 

