In Development

Better PDF writer support

Related products: Integrations

8 years ago
January 18, 2017
26 replies
811 views

erik_jan
Contributor
2181 replies

I need to create a PDF report containing:

A logo in the header
A repeating title per page
A random number of images (from JPEG files) with a fixed number per page
A table of attributes (from an Excel spreadsheet)

Creating this dynamic output is very hard (if possible at all) using FME.

I tried using the PDFPageFormatter, and some of the items can be created (but not in a straightforward way).

I would like to have better support for creating PDF reports as an output format.

+18

fmelizard
Safer
3725 replies
8 years ago
February 9, 2017

Safe PR#26207

mark_f
325 replies
8 years ago
February 9, 2017

+1 from me, based on a user's request to drive PDF production from Excel, creating a single page for each Excel row with a map plus a table of information. The table would also contain a free-form text field, so variable-length text that wraps line after line into a paragraph.

Maps and static text are easy, but tables and variable-length text aren't.

jeffshobbs
5 replies
8 years ago
February 13, 2017

+1 from me as well. I was able to create a simple PDF report using the HTMLReportGenerator and then using the custom transformer that takes HTML and goes to PDF. It worked alright for a very simple report, but it wouldn't scale for more advanced PDF reporting needs.

setld_solutions
Contributor
27 replies
8 years ago
March 20, 2017

As a "right now" solution: multiple one-page .PDF outputs can be combined via PDF Toolkit (via .BAT). Not a great long-term fix.
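For anyone scripting this outside a .BAT file, here is a minimal Python sketch of the same merge, assuming the `pdftk` executable is on the PATH. The function names and file names are hypothetical; only the `cat output` operation comes from PDF Toolkit itself.

```python
import subprocess


def build_pdftk_merge_command(page_files, output_file, pdftk="pdftk"):
    """Build the argument list for: pdftk p1.pdf p2.pdf ... cat output merged.pdf"""
    if not page_files:
        raise ValueError("need at least one input PDF")
    return [pdftk, *[str(p) for p in page_files], "cat", "output", str(output_file)]


def merge_pages(page_files, output_file):
    """Run pdftk; raises CalledProcessError if the merge fails."""
    subprocess.run(build_pdftk_merge_command(page_files, output_file), check=True)


# Hypothetical one-page outputs written by the workspace, listed in page order.
print(build_pdftk_merge_command(["page_1.pdf", "page_2.pdf"], "report.pdf"))
```

Passing the pages as an explicit, ordered list avoids surprises from alphabetical sorting (page_10 before page_2).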

lorenrouth
Contributor
46 replies
3 years ago
December 12, 2021

I work in design and construction. We use PDFs all the time. They are:

Very large. 100-500+MB is common.
A variety of data formats on one page, e.g. raster, vector, text, image...
A variety of page sizes, e.g. the same PDF may contain 8.5x11, 11x17 and 24x36 pages
Have bookmarks
Often authored in BlueBeam.

These PDFs need to be manipulated for different applications and workflows. Actions such as:

Fan out by Bookmark or other criteria
Extract text/metadata and write to CSV/Excel/?

The PDF Reader can read more than the Writer can write.

What if you just want to break up a large PDF by text or bookmark criteria, leaving the format intact?

I am trying to do that right now and it has been a real struggle. Once the PDF is read into FME, there doesn't seem to be a way to put it back together in the same way.

If you are going to have a PDF Writer, it needs to handle all the types of data found in one. There is a real need for this functionality because PDFs are the file format of choice in the A/E/C sector.

+11

jovitaatsafe
Safer
635 replies
9 months ago
August 16, 2024

Hello!

The Safe Software Team is actively investigating improvements to PDF writing, potentially as a new additional writer with a focus on text-based documents.

We're in the early stages and we'd love to hear from you on your needs for PDF writing if you're able to share any insight on the questions below:

What does your organization need to do with PDF writing?
What do your PDFs need to contain (for example: text, hyperlinks, images, charts...)?

Any use cases you're able to share will help inform our product development team as this project shapes.

If you're interested in chatting further, or in providing a PDF document sample, please Submit a Ticket and we will follow up with you.

Thank you!

esietinga
Contributor
6 replies
9 months ago
August 21, 2024

Output multipage reports with multiple maps, tables, images, graphs and accompanying text

+15

jasperwis
Enthusiast
53 replies
9 months ago
August 21, 2024

Hi @jovitaatsafe

Recently, we needed to translate our HTML validation reports to PDFs. We initially chose HTML because the current PDFPageFormatter would have been too challenging to use given our dynamic output.

Our biggest challenge going from HTML to PDF was getting the tables to split properly around page breaks.

So I would like to add these to the list:

proper splitting of tables around page breaks
the option to write more complex tables (rowspan/colspan)
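To illustrate the first request, a minimal sketch of splitting a table across pages with the header repeated on each page. It assumes every row has the same height, so a page holds a fixed number of rows; `paginate_rows` is a hypothetical helper, not an FME function.

```python
def paginate_rows(header, rows, rows_per_page):
    """Split rows into pages, repeating the header row at the top of each page."""
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    pages = []
    for start in range(0, len(rows), rows_per_page):
        # Each page is the header followed by the next slice of data rows.
        pages.append([header] + rows[start:start + rows_per_page])
    return pages


header = ["id", "name"]
rows = [[1, "a"], [2, "b"], [3, "c"]]
for page_number, page in enumerate(paginate_rows(header, rows, 2), start=1):
    print(page_number, page)
```

Rows with wrapped text or rowspan/colspan cells would need a height-based split instead of a row count.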

+10

spatialexjames
Contributor
27 replies
9 months ago
August 21, 2024

I use FME to create extensive multi-page PDF reports, combining map info, text, photos, logos, contents pages, etc… There is definitely room for improvement though as I have had to develop a lot of workarounds.

Multi-page support is currently fine, but initially hard to figure out - you need to embed the PDFPageFormatters within custom transformers and use a group-by on them while specifying the page number you want it to go to (so you have to manually figure out page numbers first and re-assign them after this transformer).

Better support for text (paragraphs) would be nice. At the moment you have to guess where line breaks should be by counting the number of characters for a specific font size and page size. Solving how/when to insert line breaks when adding paragraphs would be helpful. It uses XHTML too, which is okay but limited - custom hyperlinks don’t work unless specifying the whole URL (which is often too long and will spill off a page).
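The character-counting workaround can be sketched in Python (for example in a PythonCaller) with the standard `textwrap` module. The 0.5 average character width factor is an assumption that depends on the font; `wrap_paragraph` is a hypothetical helper.

```python
import textwrap


def wrap_paragraph(text, box_width_pt, font_size_pt, avg_char_width_factor=0.5):
    """Estimate characters per line from box width and font size, then wrap.

    avg_char_width_factor is a rough guess: average glyph width as a
    fraction of the font size. Tune it per font.
    """
    chars_per_line = max(1, int(box_width_pt / (font_size_pt * avg_char_width_factor)))
    return textwrap.wrap(text, width=chars_per_line)


for line in wrap_paragraph("Variable length text that needs wrapping into a box.", 120, 10):
    print(line)
```

This is still an estimate, since proportional fonts vary in width per character, which is exactly why native paragraph support would be better.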

Figuring out the actual size that text is going to be once written to the PDF is imperfect, as you need to use a TextStroker to estimate the size in FME to figure out positioning, but then write to the PDF as a point with XHTML specifying font and size (and these do not match). Technically that could be a niche improvement for the TextStroker to interpret the XHTML in the same way as the PDF...

Writing page navigation doesn’t seem possible. I’d like to be able to link users to specific pages within the PDF when they click on a text link in the contents page.

The PDF tooltip is currently buggy, and will often not work when there is more than 1 tooltip provided as it can link to the wrong info (which is a shame as that could have solved the hyperlinking issue). Use-case for me was producing a floor plan map, with hyperlink boxes per room/asset to external documents (but they’d end up linking to the same info).

Better chart support would be nice so these can be inserted into reports, such as being able to use HTML straight out of a report generator. It’s very hard to make charts look professional as it stands.

The same goes for table support - it’s very tricky to get a table to look good in a PDF report straight out of FME (current workaround is to write data using an Excel template, then use an Excel to PDF print tool (not supported on FME Flow Hosted), then to read in the PDF and re-write it using FME...).

Providing some basic out-of-the-box symbols to use for PDF maps would be a bonus.

That’s what comes to mind for now, hope it helps!

+10

felipeverdu
Enthusiast
57 replies
9 months ago
August 21, 2024

You should be able to create a PDF the same way as with the HTML report generator:

header, logos, tables, figures, text and so on.

I usually go to the HTML report generator and convert the results to PDF, but mostly it would fail.

Felipe

+11

gabriel_hirsch
Contributor
86 replies
9 months ago
August 21, 2024

We still need the opportunity to write PDF/A as this idea explains:

PDF Writer: Support PDF/A and PDF/E Open
21 Votes

braggken
Supporter
25 replies
9 months ago
August 22, 2024

We use this writer to generate daily automatic security Site Reports (Maps and Tables) to ensure the safety of our teams.

This is a very important writer for us, but as it stands we are on the verge of switching over to the Esri REST API for PDF maps/reports because making nice-looking PDF maps with FME is just too painful. In a way this is understandable because FME is not really a cartographic tool - but if you are going to invest time in this writer - cartographic support is where I would value it most.

IMHO focus on the current writer or the transformers which support it rather than a new writer.
Not sure I would ever use FME to write text based PDFs. The current writer could be super useful if it was improved

Tables and text boxes! There is that "TableAdder" custom transformer which does this but it really does not work very well, I don't think anyone has really ever updated it.
Point symbols are also really painful - people suggest using MapnikRasterizer which seems like a very hacky work-around. Add icon support to the PDFStyler
Page layout. The current PDFPageFormatter is just kind of weird, not sure how to fix it

+49

redgeographics
Celebrity
3640 replies
9 months ago
August 22, 2024

Some of the issues we run into:

Text in a table or box, we need to do word-wrap ourselves and it’s really annoying
Aligning various parts of a layout, especially if they’re in different boxes in the PDFPageFormatter
Creating multiple pages with map content is a pain, we need to offset every page’s content to the same origin (technically the PDFPageFormatter needs a hard-coded page number, but by changing the pdf_page_number attribute later we can actually have it create multiple pages)
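The offset-and-renumber workaround above, sketched in plain Python rather than FME transformers. The tile dictionaries and the 1-based page numbering are assumptions for illustration; only the pdf_page_number attribute name comes from the post.

```python
def layout_pages(tiles):
    """Shift each tile's points to a shared (0, 0) origin and number the pages.

    tiles: list of dicts with 'min_x', 'min_y' (tile lower-left corner)
    and 'points' (list of (x, y) tuples in map coordinates).
    """
    pages = []
    for page_number, tile in enumerate(tiles, start=1):  # 1-based is an assumption
        shifted = [(x - tile["min_x"], y - tile["min_y"]) for x, y in tile["points"]]
        pages.append({"pdf_page_number": page_number, "points": shifted})
    return pages


tiles = [
    {"min_x": 100, "min_y": 200, "points": [(110, 220)]},
    {"min_x": 300, "min_y": 200, "points": [(305, 250)]},
]
print(layout_pages(tiles))
```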

david_r
8337 replies
9 months ago
August 22, 2024

lorenrouth wrote:

The PDF Reader can read more than the Writer can write.

Adding to this, the PDF reader has issues with not correctly grouping individual characters into words, especially for narrow fonts. This was a major hassle trying to parse text from building plans, as FME would only return individual characters and not words, and we finally had to resort to using https://pdfminersix.readthedocs.io/en/latest/ where it’s possible to tweak these settings. I would love for FME to have similar functionality!
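For context, a minimal sketch of the kind of tweak pdfminer.six allows, assuming the package is installed. `LAParams` and `extract_text` are part of pdfminer.six; the helper names and the starting values are hypothetical and would need tuning per document.

```python
def layout_kwargs(word_margin=0.2, char_margin=2.0):
    """Collect the LAParams settings that control character grouping.

    word_margin: gap (relative to character width) above which a space is
    inserted between two characters on the same line.
    char_margin: gap below which two characters count as the same line.
    """
    if word_margin < 0 or char_margin < 0:
        raise ValueError("margins must be non-negative")
    return {"word_margin": word_margin, "char_margin": char_margin}


def extract_text_tuned(pdf_path, **margins):
    # Imported here so the sketch loads even where pdfminer.six is missing.
    from pdfminer.high_level import extract_text
    from pdfminer.layout import LAParams

    return extract_text(pdf_path, laparams=LAParams(**layout_kwargs(**margins)))


print(layout_kwargs(word_margin=0.3))
```

Raising `word_margin` makes pdfminer.six less eager to insert spaces between characters, which is the direction to try when narrow fonts come out letter by letter.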

For the PDF writer, a big challenge is to ensure page flow, meaning objects that are positioned relative to other objects that may be dynamic in size, over multiple pages. That, together with support for tables and more intelligent paragraph handling (including line breaks) would be fantastic.

+17

gisbradokla
Enthusiast
132 replies
9 months ago
August 26, 2024

We use Publish from Autodesk products, as it reproduces our DWG layout exactly in PDF.

We could also benefit from enhanced PDF writing in that we need to add authorization stamps to PDFs and have the output exactly like the input with the added stamp. This is difficult to do without spending LOTS of time formatting and styling the PDF again.

@gisbradokla

+11

jovitaatsafe
Safer
635 replies
9 months ago
August 26, 2024

Thank you everyone for your helpful feedback, feel free to continue to join the discussion, I’ll add some responses below.

So far, I’m hearing a lot of support for potentially making use of HTML such as the HTMLReportGenerator, improving support around tables, text, point symbology, and making it easier to set page breaks and better general layouting.

Forgive me for not replying to each individual message, I really appreciate you taking the time to share and it’s a relief to hear we’ve identified some of the same things, as well as heard some new ones (like point symbology!).

@spatialexjames I appreciate how hard some of those tasks have been, shoutout for a very comprehensive answer, really helped me fill out my chart! I expect the team will have more questions after a first pass through.

@gabriel_hirsch thanks for bringing up PDF/A. I’m not sure this one will be in scope for this writer, but it’d be great if we could learn more about the use cases for future considerations. From the idea, I see AEC customers and some municipalities being represented, can you share more about how many organizations in which industries might benefit from having this supported, and what they currently do to output PDF/A? Feel free to shoot me an email to talk further at jovita.chan AT safe.com

@braggken Hopefully with some better text management we’ll be able to make the writer a bit more accessible to FME users for future reports! Good to know about the difficulty around point symbols, thanks!

@gisbradokla Are the authorization stamps like a watermark or annotation? How does your organization currently go about adding in the authorization stamp?

Thanks folks, great discussion (:

hannupekkaranta
Contributor
11 replies
9 months ago
August 27, 2024

Support for PDF/A writing!

+17

gisbradokla
Enthusiast
132 replies
9 months ago
August 27, 2024

jovitaatsafe wrote:

@gisbradokla Are the authorization stamps like a watermark or annotation? How does your organization currently go about adding in the authorization stamp?

we insert autocad block at specific coordinates/scale onto layout

we use autodesk publish command (export pdf)

then remove block from dwg

access to the stamp is controlled per user

@gisbradokla

+15

LizAtSafe
Safer
1503 replies
9 months ago
August 28, 2024

Open→Gathering Interest

fdw
Contributor
15 replies
8 months ago
September 10, 2024

Would like to be able to set the PDF metadata directly in the writer. Currently use pikePDF to update the metadata after writing.
And yes, tables please!
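For reference, a minimal sketch of that kind of post-processing step, assuming pikepdf is installed. `open_metadata` and the XMP keys are pikepdf/XMP conventions; the helper names, paths and the choice of fields are hypothetical.

```python
def xmp_fields(title, authors):
    """Map report properties to XMP keys. dc:creator must be a list."""
    return {"dc:title": title, "dc:creator": list(authors)}


def set_pdf_metadata(src_path, dst_path, title, authors):
    # Imported here so the sketch loads even where pikepdf is missing.
    import pikepdf

    with pikepdf.open(src_path) as pdf:
        # open_metadata() edits the XMP packet; changes are applied on exit.
        with pdf.open_metadata() as meta:
            for key, value in xmp_fields(title, authors).items():
                meta[key] = value
        pdf.save(dst_path)


print(xmp_fields("Site report", ["GIS team"]))
```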

vadis
Contributor
9 replies
8 months ago
September 21, 2024

Thanks for taking initiative on this front, @jovitaatsafe! PDF is usually a dead-end node for the data but is intrinsically used by public sector customers. In most cases a .pdf takes form through conversion from editable formats like .xlsx, .docx or HTML. PdfTk and LibreOffice converter libraries work magic combined with SystemCaller, however there are hindrances and unnecessary steps with installations that may not work in certain environments. It would be grand to have all packaged in one native transformer.
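As an illustration of the LibreOffice route, a minimal sketch that builds the headless conversion command a SystemCaller (or `subprocess`) would run. It assumes the `soffice` executable is on the PATH; the function name and file names are hypothetical.

```python
import subprocess


def build_soffice_command(input_file, output_dir, soffice="soffice"):
    """Build: soffice --headless --convert-to pdf --outdir <dir> <file>"""
    return [soffice, "--headless", "--convert-to", "pdf",
            "--outdir", str(output_dir), str(input_file)]


def convert_to_pdf(input_file, output_dir):
    """Run the conversion; raises CalledProcessError on failure."""
    subprocess.run(build_soffice_command(input_file, output_dir), check=True)


print(" ".join(build_soffice_command("report.docx", "out")))
```

This still depends on a LibreOffice installation on the machine running the workspace, which is the environment issue raised above.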

And yes, PDF/A would open a door where a long queue is patiently building up.

kevs-2021
Contributor
2 replies
8 months ago
September 23, 2024

Think of PDF as a dog tag and a lightweight viewer combined. Everyone can read it, it states what we know of map data at a point in time in our (project) workflow.

We need greater support for PDF writing to display the map data at the correct scale and symbology, to avoid the bottleneck of producing maps in report tools like Word, or CAD drawing programs. Ideally we want to ship the map data as a database together with a PDF that shows you the content. Here PDF/A support would greatly help us achieve a five-star delivery.

If you want to get a peek at our workflows (AEC industry, 3000+ employees), think of it as a huge Git tree: we are constantly working on different pipelines and alternatives, and this is as true for one single project we deliver to our clients every day as it is for our company as a whole. So a PDF would fit in this workflow as the code viewer in Git: you can always see visually what our delivered data was about at a precise time.

Yes, we will need to use templates, so that we can focus only on updating the symbology and scale together with metadata.

Would improving an HTML writer be enough in place of a better PDF writer? No. Because you will not have the “snapshot” aspect of the PDF, which you can attach to a report or an email, archive and view so easily.

+11

jovitaatsafe
Safer
635 replies
7 months ago
October 24, 2024

Update: The project is in development!

I just wanted to thank everyone for taking the time to share your use cases and needs! Our development team has scoped out a plan taking into consideration the information you’ve all provided, and they are now actively working on the first phase of it.

Keep in mind that the first phase when it comes out to beta will contain some of the things that we really want to add in, while other features may be added later or may be unplanned. As always we’re open to feedback as we go through that process!

Metadata:

Some basic metadata support is being targeted. While I can’t promise that it’ll make it into a future plan, I’d love to hear more about what fields are important to you in PDF metadata when you’ve got post processes like pikePDF @fdw.

+15

LizAtSafe
Safer
1503 replies
5 months ago
December 17, 2024

Gathering Interest→In Development

+15

LizAtSafe
Safer
1503 replies
1 month ago
April 5, 2025

The following idea has been merged into this idea:

html to pdf Archived
1 Votes

All the votes have been transferred into this idea.
