Released: FME 2018.x

PDF Reader

Related products: FME Form

fmelizard
Safer
Complementing the PDF writer (which is being unified from having separate 2D/3D variants), this one would read vector/raster features out of geospatial PDFs.
This post is closed to further activity.

21 replies

gschleusner
  • Contributor
  • January 27, 2015
I would also use it to compare non-geospatial vector PDFs. I would also like to use one of the read PDF files as a template over which additional content might be added.

redgeographics
Celebrity

This would make some of my projects a lot easier. For a certain client I have to design a map in vector and then deliver individual layers as PNG files (sometimes in excess of 20000 pixels per side). My current workflow is to write out individual PDFs from Illustrator and rasterize them in Photoshop. This often takes a lot of time and requires some manual steps on my part (open file, set size, save file). If I could just run it through FME, that would at the very least save me the manual work.


  • January 8, 2016

Would be very useful. We get lots of site plans and data in PDFs; FME could save a lot of time in digitizing sites.


ciarab
  • Contributor
  • February 25, 2016

This would be very useful at the moment. We have upcoming requests to read PDFs, so we would be interested in having a reader to analyse geospatial PDFs.


fmelizard
  • Author
  • Safer
  • February 25, 2016

I can confirm that we've been laying the groundwork for this. It won't be in 2016.1, but I'd be surprised if we didn't have a form of PDF reading by the end of calendar 2016. @ciarab can I ask you to send a couple of sample PDFs to support@safe.com so we can be sure your scenario is targeted?


geospatiallover
Participant

I never realized I would have to do this, but true enough, I have projects that would require it. Thanks for the update @daleatsafe. If you need some more PDFs to try, let me know.


sigtill
  • Supporter
  • August 24, 2016

  • March 10, 2017

When can we expect this (PDF reader), if at all?

Thanks



We have been using A-PDF Data Extractor to extract data from PDFs. We use a SystemCaller to call the app. We hope to see a similar feature directly in FME, without the need for a third-party app.


erik_jan
  • Contributor
  • April 17, 2017

At this moment I have no need for a PDF reader.

But I will vote for it as it might speed up the improvements for the PDF writer that I do need:

https://knowledge.safe.com/idea/38680/better-pdf-writer-support.html


dannymatranga
Contributor

Any notable progress on the PDF reader?


helmoet
  • July 16, 2017

I tried reading text from a PDF file using a PythonCaller and the pdfminer library, and it went pretty well. As a start, something like this:

    import fme
    import fmeobjects
    import sys
    import chardet
    from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
    from pdfminer.pdfpage import PDFPage
    from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
    from pdfminer.layout import LAParams
    from cStringIO import StringIO

    # Template Function interface:
    # When using this function, make sure its name is set as the value of
    # the 'Class or Function to Process Features' transformer parameter
    def processFeature(feature):
        data = FME_MacroValues['SourcePdfFile']
        fp = file(data, 'rb')
        rsrcmgr = PDFResourceManager()
        retstr = StringIO()
        codec = 'utf-8'
        laparams = LAParams()
        device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
        # Create a PDF interpreter object.
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        # Process each page contained in the document.
        for page in PDFPage.get_pages(fp):
            interpreter.process_page(page)
        data = retstr.getvalue()
        e = chardet.detect(data)
        u = None
        try:
            if e['confidence'] > 0.3:
                u = unicode(data, e['encoding'])
        except:
            pass
        if u:
            feature.setAttribute('pdfcontent', u)
        else:
            feature.setAttribute('pdfcontent', data)
        pass
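
The snippet above is written for Python 2 (`cStringIO`, `file()`, `unicode`). For anyone on a Python 3 interpreter, a minimal sketch of the same idea might look like the following. It assumes the pdfminer.six package and its `extract_text` helper are installed where FME can import them, that `fme.macroValues` holds the `SourcePdfFile` parameter used in the post, and that the extracted text ends each page with a form feed character. The `read_pdf_text` and `split_pages` names and the `pdfpagecount` attribute are illustrative only, not part of any FME or pdfminer API.

```python
def split_pages(text):
    """Split extracted text into pages on form feed characters (U+000C).

    Assumption: the extractor ends each page with a form feed.
    Pages that are empty after stripping whitespace are dropped.
    """
    return [page.strip() for page in text.split("\x0c") if page.strip()]


def read_pdf_text(pdf_path, extractor=None):
    """Return the text content of a PDF as a single string.

    `extractor` is any callable that takes a path and returns str. When it
    is omitted, pdfminer.six's extract_text is imported lazily, so this
    module still loads on machines without pdfminer.six.
    """
    if extractor is None:
        from pdfminer.high_level import extract_text  # needs pdfminer.six
        extractor = extract_text
    return extractor(pdf_path)


def processFeature(feature):
    """PythonCaller function interface, mirroring the original post."""
    import fme  # only importable inside an FME Python environment

    # 'SourcePdfFile' is the published parameter name used in the post.
    text = read_pdf_text(fme.macroValues["SourcePdfFile"])
    feature.setAttribute("pdfcontent", text)
    # Hypothetical extra attribute: number of non-empty pages.
    feature.setAttribute("pdfpagecount", len(split_pages(text)))
```

Because Python 3 strings are already Unicode, the `chardet` guessing step from the original is not needed in this version. Passing a custom `extractor` makes it possible to try the helper outside FME without a real PDF.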

stalknecht
  • Contributor
  • July 17, 2017

There is also a custom reader at the hub:

PDF2TextReader



13gamat
  • Participant
  • August 11, 2017

Is there a PDF to Excel reader in FME?


13gamat
  • Participant
  • August 11, 2017

Please build something for PDF conversion!


paalped
  • Contributor
  • December 7, 2017

I use Poppler to read PDFs as raster. Basically, it just converts the PDF files to JPEGs, and then you read the JPEGs.

https://poppler.freedesktop.org/
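
The post does not say which Poppler tool is used. As one possibility, Poppler ships a `pdftoppm` command-line utility that renders PDF pages to image files, and the conversion step could be scripted roughly as below. This is a sketch under the assumption that `pdftoppm` is on the PATH; the function names, the default resolution, and the file paths in the usage note are made up for illustration.

```python
import subprocess


def build_pdftoppm_command(pdf_path, output_prefix, dpi=150):
    """Build the argument list that renders every page of a PDF to JPEG.

    -jpeg selects JPEG output and -r sets the resolution in DPI.
    pdftoppm names each output file from the prefix plus a page number.
    """
    return ["pdftoppm", "-jpeg", "-r", str(dpi), pdf_path, output_prefix]


def rasterize_pdf(pdf_path, output_prefix, dpi=150):
    """Run pdftoppm; raises CalledProcessError if the tool reports failure."""
    subprocess.run(build_pdftoppm_command(pdf_path, output_prefix, dpi), check=True)
```

Calling `rasterize_pdf("site_plan.pdf", "out/site_plan", dpi=300)` would then leave one JPEG per page under `out/`, ready to be picked up by a JPEG reader in the workspace.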


fmelizard
  • Author
  • Safer
  • December 10, 2017
We might have had something to do with that long ago....

 

 


dustin
  • Influencer
  • December 15, 2017

I'm late to the party, but I vote for this. My primary use would be change detection between two GeoPDFs.


fmelizard
  • Author
  • Safer
  • January 4, 2018

Hi all -- what better way to start the year than to try out the new PDF reader in the FME 2018 betas? Builds 18236 and later have it. Get it from http://www.safe.com/download and let us know what you think. @ciarab @marko @redgeographics @geospatiallover @gschleusner @sigtill @cartoscro @dannymatranga @zubairsm FYI


ciarab
  • Contributor
  • January 4, 2018

 

@croningarrett our long-awaited PDF reader ;)

ottadini
  • Supporter
  • May 16, 2018
Me too, but on MS Windows the latest binary I could find was v0.51, quite a way behind the latest release. Not that it seems to matter for simple image extraction.

 

 

