I’ve imported a topographic map that is in GeoPDF format, and I want to convert it to GeoTIFF. I’ve seen advice to put an ImageRasterizer between the reader and writer. My question is how to set up the ImageRasterizer “rows and columns” data. Where do I get these values? Thanks
Bill
The rows and columns parameters in the ImageRasterizer determine how big, in pixels, the resulting raster is going to be. So it’s really up to you (obviously the larger you create the raster, the longer it’ll take).
What I generally do is use a BoundingBoxAccumulator / BoundsExtractor to figure out the extents of the area, then pick a useful “meters per pixel” resolution and divide the extents by it to get the rows/columns. That way it’s a dynamic process.
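To make the arithmetic concrete, here is a minimal sketch in Python of turning extents plus a target resolution into rows and columns. The function name `raster_size` and the sample extents are made up for illustration, and it assumes the bounds are in a projected coordinate system measured in meters (with lat/long bounds you would need a degrees-per-pixel value instead).

```python
import math


def raster_size(xmin, ymin, xmax, ymax, meters_per_pixel):
    """Return (rows, columns) for a raster covering the given extents.

    Assumes the extents are in meters. Rounds up so the raster always
    covers the full area, and never returns fewer than one row or column.
    """
    if meters_per_pixel <= 0:
        raise ValueError("meters_per_pixel must be positive")
    width = xmax - xmin    # ground distance east-west
    height = ymax - ymin   # ground distance north-south
    columns = max(1, math.ceil(width / meters_per_pixel))
    rows = max(1, math.ceil(height / meters_per_pixel))
    return rows, columns


# Hypothetical extents: a 10 km x 5 km area at 2.5 m per pixel.
rows, columns = raster_size(0, 0, 10000, 5000, 2.5)
print(rows, columns)
```

In a workspace the same calculation could be done with the extents attributes from the BoundsExtractor, and the results fed to the ImageRasterizer parameters.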
Thank you for the guidance. Is there a transformer/procedure in FME that lets me query the original GeoPDF file to find out the maximum size of the embedded image in x, y? These maps were originally scanned, then converted to GeoPDF, and if I knew the resolution they were scanned at, that would help me set the rasterizer inputs... or am I off here?
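For what it’s worth, if the scan resolution were known, the pixel size would follow from the page size. A rough sketch in Python, assuming the PDF page size is given in points (the default PDF unit, 1/72 inch) and that the scan fills the whole page; the function name and the sample page size and DPI are only illustrative, not values read from this map:

```python
POINTS_PER_INCH = 72  # default PDF user unit


def pixels_from_page(width_pt, height_pt, dpi):
    """Return (rows, columns) for a page of the given size at the given DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    columns = round(width_pt / POINTS_PER_INCH * dpi)
    rows = round(height_pt / POINTS_PER_INCH * dpi)
    return rows, columns


# Illustrative only: a US Letter page (612 x 792 pt) scanned at 300 DPI.
print(pixels_from_page(612, 792, 300))
```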
I’m still not able to convert a GeoPDF into a GeoTIFF and preserve the lat/long info. When I use “spatial”, I get a whole folder full of individual clips from the original. “Non-spatial” gives me the GeoTIFF, but in page coordinates instead of UTM. A year ago I had figured out how to reassemble these clips into a GeoTIFF that preserved the registration with the ground location. I don’t have that workspace any longer, and have lost a few brain cells along the way.
I can do this using gdal_translate, and it works fine. I know it can be done in FME. I’ve included a sample GeoPDF file. Can someone hold my hand on this one transformation so I can continue on to converting them to CADRG?
Thanks
Bill
So this was a bit of a challenge.
But I like challenges, especially if they concern old maps and unknown projections and stuff.
It turns out the PDF contains a lot (221) of little images that together make up the map and its collar. About 1/3rd of them don’t actually have a coordinate system set and I bet that that is what’s been causing you the trouble.
I’m using the PDF2D reader with the default settings, manually exposing the fme_basename attribute, and then using a CoordinateSystemExtractor to extract the coordinate system info to an attribute. The tiles that don’t have a coordinate system set (i.e. are not georeferenced) are filtered out using a TestFilter. The ones that do are then mosaicked with the RasterMosaicker.
When viewing the PDF in other geo-aware software, I did get a warning about a shearing factor on the coordinate system (potentially because the original map was not aligned exactly when it was scanned). That may cause issues, but I’m choosing to ignore it. I also saw that the coordinate system is Indian 1960 / UTM48N, so I’m using the CoordinateSystemSetter to set that before writing to GeoTIFF, with the fme_basename attribute as the filename (so the resulting .tiff gets the same name as the original .pdf).
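To illustrate the logic of that filter-then-mosaic step outside FME, here is a small Python sketch. It is not what the transformers do internally; the `Tile` class, the function names and the sample tiles are all made up for illustration. It only shows the idea: drop the tiles with no coordinate system, then work out the combined footprint of the ones that are left.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Tile:
    name: str
    coordinate_system: Optional[str]           # None or "" means not georeferenced
    bounds: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def split_by_georeference(tiles):
    """Separate tiles that carry a coordinate system from those that don't."""
    referenced = [t for t in tiles if t.coordinate_system]
    unreferenced = [t for t in tiles if not t.coordinate_system]
    return referenced, unreferenced


def mosaic_extent(tiles):
    """Return the combined (xmin, ymin, xmax, ymax) of the given tiles."""
    if not tiles:
        raise ValueError("no georeferenced tiles to mosaic")
    return (
        min(t.bounds[0] for t in tiles),
        min(t.bounds[1] for t in tiles),
        max(t.bounds[2] for t in tiles),
        max(t.bounds[3] for t in tiles),
    )


# Hypothetical tiles; the coordinate system string is just a label here.
sample = [
    Tile("tile_001", "Indian 1960 / UTM48N", (0, 0, 100, 100)),
    Tile("tile_002", None, (0, 0, 50, 50)),  # page coordinates only, gets dropped
    Tile("tile_003", "Indian 1960 / UTM48N", (100, 0, 200, 100)),
]
kept, dropped = split_by_georeference(sample)
print(len(kept), len(dropped), mosaic_extent(kept))
```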
Workspace (2023.2) is attached.
Thank you very much!!! and a Hand Salute!
I’m glad it wasn’t something simple I was overlooking. I’m going to work with your workspace and convert a few more GeoPDFs to gain some familiarity with the process.
I wonder what gdal_translate was doing “under the covers” to convert it to a GeoTIFF.
Were the non-referenced pieces of that image from the borders and collar info?