Question

PDF Reader split by layer


Hello FME gurus! I'm looking for a way to separate a single-page PDF with multiple layers so that I can feed a reduced raster dataset into Potrace and improve the vectorising results. Can anyone point me in the right direction?

PDF example here:


5 replies

Userlevel 4
Badge +30

Hi @chris1

Do you want to separate by layers?

If so, see the attached template file.

pdf.fmwt

Thanks,

Danilo

Userlevel 3
Badge +17

Hi @chris1

Thanks for your question! You may have exposed a possible bug where features from only one layer are read when using a dynamic PDF reader (Single Merged Feature Type). I have raised a PR with our development team and will post updates here.

PDF layers are represented as feature types, so Danilo's workspace template is good if you are looking to read features from specific layers.

However, FME reads the features in the linked PDF as vectors, so it may not be necessary to use Potrace to vectorise them. Using the GeometryFilter, you can see that the features read from the reader are either lines or text.
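To make the point about layers and geometry types concrete, here is a minimal Python sketch of the kind of tally the GeometryFilter lets you inspect. It does not use the FME API: the dictionaries, the layer names, and the function `tally_by_layer_and_geometry` are hypothetical stand-ins for features coming out of the PDF reader.

```python
from collections import Counter

# Hypothetical stand-in records; in a real workspace these would be
# features read from the PDF, one feature type per PDF layer.
features = [
    {"layer": "Contours", "geometry": "line"},
    {"layer": "Contours", "geometry": "line"},
    {"layer": "Labels", "geometry": "text"},
    {"layer": "Roads", "geometry": "line"},
]


def tally_by_layer_and_geometry(records):
    """Count records per (layer, geometry type) pair."""
    return Counter((r["layer"], r["geometry"]) for r in records)


counts = tally_by_layer_and_geometry(features)
for (layer, geometry), n in sorted(counts.items()):
    print(f"{layer}: {n} {geometry} feature(s)")
```

If every pair in the tally is a line or text type, as in this made-up sample, there is nothing raster left to trace.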

Thanks for the explanation, Debbi. You are exactly right that I'm using Single Merged Feature Type. This is because I want the input PDF types to be as flexible as possible: raster PDFs, vector PDFs, and any combination of the two. By separating the raster layers I can reduce the amount of overlapping (raster) geometry going into Potrace. All vector data loaded from the PDF will bypass Potrace and be cleaned and validated.
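The routing described above could be sketched roughly as follows. This is only an illustration in plain Python, not FME code: the feature dictionaries, layer names, file names, and the helpers `route_features` and `potrace_command` are all hypothetical. The only real interface assumed is the Potrace command line, where `-s` selects SVG output and `-o` names the output file.

```python
def route_features(records):
    """Split records into per-layer raster groups and a vector bypass list."""
    raster_by_layer = {}
    vector_bypass = []
    for record in records:
        if record["geometry"] == "raster":
            # One group per layer keeps overlapping rasters apart.
            raster_by_layer.setdefault(record["layer"], []).append(record)
        else:
            # Vector data skips Potrace and goes on to cleaning/validation.
            vector_bypass.append(record)
    return raster_by_layer, vector_bypass


def potrace_command(bitmap_path, svg_path):
    """Build (but do not run) a Potrace call that traces a bitmap to SVG."""
    return ["potrace", "-s", "-o", svg_path, bitmap_path]


# Hypothetical sample: two raster layers and two vector features.
sample = [
    {"layer": "Basemap", "geometry": "raster"},
    {"layer": "Scan", "geometry": "raster"},
    {"layer": "Contours", "geometry": "line"},
    {"layer": "Labels", "geometry": "text"},
]

raster_groups, vectors = route_features(sample)
commands = [
    potrace_command(f"{layer}.pbm", f"{layer}.svg")
    for layer in sorted(raster_groups)
]
```

Each raster layer would still need to be written out as a bitmap Potrace accepts (for example PBM) before the command is run; that step is left out here.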

 

 

Thank you for the template, Danilo. It works perfectly and is a good workaround for a "static" run. However, the problem arises when reading the PDFs dynamically.

 

 

Userlevel 3
Badge +17

Hi @chris1

I'm pleased to report that the PDF reader in Single Merged Feature Type mode now reads features from all layers in the latest FME 2018.1 and 2019.0 Release Candidate.

You can access the installers at: safe.com/downloads

Thank you for your patience as we worked to resolve this issue.
