Solved

How to convert PDF to TIFF format?

  • 25 July 2019
  • 9 replies
  • 41 views

Hi Everyone,

 

I am fairly new to FME, so please be patient with me! I have some As-Builts in PDF format that I need to convert to TIFF format as part of a monthly update to an external stakeholder (individual PDFs to individual TIFFs). I have attempted to use the Adobe Geospatial PDF reader and connect it to the TIFF/GeoTIFF writer, but to no avail. I keep getting the error message "Failed to obtain raster from feature. Only features with raster geometry are expected", even though 'Read Rasterized Pages' is set to 'Yes'. I would appreciate any help and suggestions that can be offered!

 

Thanks in advance!

Best answer by mark2atsafe 29 July 2019, 19:48

9 replies

You could try an ImageRasterizer to turn the vector information into raster and then write that to a TIFF.

Thanks - I already tried that, but it did not work.

OK, can you share one of the PDFs here so we can have a look?

So when you're trying to find the problem with an FME workspace, the key part is finding *where* the issue occurs. Here we want to know if the reader is failing to produce valid data, or the reader produces valid data but the writer fails to write it.

So, can you open the FME Data Inspector tool and in there open the PDF? Use the same parameter settings as in Workbench. That way you can see what is being produced. In this case it should be fairly straightforward: either it is producing a raster feature or it is not.

If it does produce rasters, then go back to Workbench and turn on feature caching using Run > Enable Feature Caching. Then click on the PDF object on the canvas and, from the buttons that pop up, choose Run Just This (a tooltip will show which option that is).

You should get a green icon on the object once the workspace has run. Click on that to inspect what the reader is producing. If it is not producing a raster as it did in the Data Inspector, then the parameters probably differ between the two. I would suggest re-adding the reader, this time making sure to set the non-spatial options as the reader is added (rather than just changing them later). Then try running the workspace again.

If it is producing raster, then the GeoTIFF writer would seem to be at fault, although I really can't imagine what the issue would be.

Anyway, I've managed to get this working, so I think it is possible. So keep trying and - as RedGeographics suggested - if you can share one of the PDF files, it would really help.

One other thought: when you have non-spatial (rasterized pages) turned on, be sure to turn off the spatial settings. There might be non-raster features sneaking through in there. Either that or turn off/delete any layer labelled pdf_no_layer.
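
To illustrate the point about non-raster features sneaking through, here is a rough sketch in plain Python. It does not use the FME API: the records, the `page_1` layer name and the `geometry` field are hypothetical stand-ins for what a reader might emit. Inside Workbench you would do the equivalent check with the feature cache or a transformer such as GeometryFilter rather than with code.

```python
from collections import Counter

# Hypothetical stand-ins for reader output: each record carries the layer
# (feature type) name and a geometry kind. These are not real FME objects.
features = [
    {"layer": "page_1", "geometry": "raster"},
    {"layer": "pdf_no_layer", "geometry": "line"},
    {"layer": "pdf_no_layer", "geometry": "text"},
]


def summarize(features):
    """Count features per (layer, geometry) pair to see what the reader emits."""
    return Counter((f["layer"], f["geometry"]) for f in features)


def keep_rasters(features):
    """Return only the features a raster-only writer could accept."""
    return [f for f in features if f["geometry"] == "raster"]


for (layer, geometry), count in sorted(summarize(features).items()):
    print(f"{layer}: {count} x {geometry}")

rasters = keep_rasters(features)
print(f"{len(rasters)} of {len(features)} features are raster")
```

If anything other than raster shows up in a summary like this, that would be consistent with the "Only features with raster geometry are expected" message from the writer.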

See this tweet for a short video demo of reading PDF pages as rasters: https://twitter.com/FMEEvangelist/status/1155930516991774720

I have provided a PDF - thanks!

 

PR3125-As-built-W-3125-A - WE 2.pdf

Thank you very much Mark, I tried it again following your suggestions and I was able to get it working! I also added the Attribute Exposer and Part Extractor transformers to be able to retain the same file name when converting several PDFs at the same time. Greatly appreciate your time and effort to provide assistance!
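
For anyone wanting to see the file-naming idea outside of Workbench, here is a minimal sketch in plain Python. It only shows the mapping from each PDF's base name to a matching TIFF name and is not the workspace used above. The folder names `as_builts` and `tiff_out`, the `.tif` extension and both function names are assumptions made for the example.

```python
from pathlib import Path


def tiff_path_for(pdf_path, output_dir):
    """Build the output path: same base name as the PDF, .tif extension."""
    return Path(output_dir) / (Path(pdf_path).stem + ".tif")


def plan_outputs(pdf_paths, output_dir):
    """Map each PDF to its TIFF path, refusing duplicate output names."""
    plan = {}
    seen = set()
    for pdf in pdf_paths:
        target = tiff_path_for(pdf, output_dir)
        if target in seen:
            raise ValueError(f"two inputs would both write {target}")
        seen.add(target)
        plan[str(pdf)] = target
    return plan


# Hypothetical input and output folders; the file name is the one shared above.
plan = plan_outputs(["as_builts/PR3125-As-built-W-3125-A - WE 2.pdf"], "tiff_out")
for source, target in plan.items():
    print(f"{source} -> {target}")
```

The duplicate check is there because two PDFs with the same base name in different folders would otherwise overwrite each other's output. A multi-page PDF would also need a page number added to the name, which this sketch does not handle.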

Hello @watts, are you able to share the workspace that you used to accomplish this? I am trying to do something similar but just can't get the TIFF output working as expected.
