It's been a few years, and the PDF writer may have improved in the meantime, but in a somewhat similar case we read the PDF, extracted the relevant information, and created a JSON object that indicated which pages belonged to which report. We then split the original PDF outside of FME in C#. This was part of a web application; you could presumably do the splitting in a PythonCaller instead.
You're going to be better off splitting it outside FME. You will run into a lot of issues with fonts and alignments in FME.
If you have a copy of Adobe Acrobat, this page will give you an idea of how to split the file with it:
https://helpx.adobe.com/acrobat/how-to/split-pdf-file.html
I solved a similar problem using Python:
import os
import PyPDF2

def split_pdf(input_pdf_path, output_folder):
    # Ensure the output folder exists
    os.makedirs(output_folder, exist_ok=True)
    # Open the input PDF file; the with block closes it when done
    with open(input_pdf_path, 'rb') as pdf_file:
        pdf_reader = PyPDF2.PdfReader(pdf_file)
        # Loop through each page and save it as a separate PDF
        for page_num in range(len(pdf_reader.pages)):
            pdf_writer = PyPDF2.PdfWriter()
            pdf_writer.add_page(pdf_reader.pages[page_num])
            output_pdf_path = os.path.join(output_folder, f'page_{page_num + 1}.pdf')
            with open(output_pdf_path, 'wb') as output_pdf_file:
                pdf_writer.write(output_pdf_file)

if __name__ == '__main__':
    input_pdf_path = 'input.pdf'  # Replace with your input PDF file path
    output_folder = 'output_pages'  # Replace with the output folder path
    split_pdf(input_pdf_path, output_folder)
I have a similar issue, but to further complicate things, the number of pages in each split PDF varies. I need to read the text of the PDF, since John Doe has 2 pages but Mary Smith has 4 pages, and so on. Ideally, if we can read the user name, we want to use it in the split file names: john_doe.pdf, mary_smith.pdf, etc.
Read the PDF with FME, work out whether the name is on each page, and then use the Python script provided to split the pages.
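For the variable-length case, here is a rough sketch of how the grouping could be done in Python rather than in FME. It assumes each report's first page contains a line such as "Name: John Doe"; that label, the NAME_PATTERN regex, and the helper names are all made up for illustration, so adjust them to whatever your PDF actually contains. Pages without a name are treated as a continuation of the previous person's report.

```python
import os
import re

# Hypothetical pattern: assumes a line like "Name: John Doe" marks the start of a report.
NAME_PATTERN = re.compile(r"Name:\s*([A-Za-z]+(?: [A-Za-z]+)*)")


def group_pages_by_name(page_texts, pattern=NAME_PATTERN):
    """Map each name to the list of page indices that belong to that person."""
    groups = {}
    current = None
    for index, text in enumerate(page_texts):
        match = pattern.search(text or "")
        if match:
            current = match.group(1).strip()
        # Pages before the first name is seen are skipped
        if current is not None:
            groups.setdefault(current, []).append(index)
    return groups


def to_file_name(name):
    """Turn 'John Doe' into 'john_doe.pdf'."""
    return re.sub(r"\W+", "_", name.strip().lower()) + ".pdf"


def split_pdf_by_name(input_pdf_path, output_folder):
    # Imported here so the helpers above can be tried without PyPDF2 installed
    import PyPDF2

    os.makedirs(output_folder, exist_ok=True)
    reader = PyPDF2.PdfReader(input_pdf_path)
    page_texts = [page.extract_text() for page in reader.pages]
    for name, indices in group_pages_by_name(page_texts).items():
        writer = PyPDF2.PdfWriter()
        for i in indices:
            writer.add_page(reader.pages[i])
        output_path = os.path.join(output_folder, to_file_name(name))
        with open(output_path, "wb") as output_file:
            writer.write(output_file)
```

Calling split_pdf_by_name('input.pdf', 'output_reports') would then write one file per name found. Text extraction quality depends on how the PDF was produced, so it is worth printing the extracted page text first to check that the names come through as expected.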
Are you having trouble splitting a PDF based on the text on its pages? It helps to understand what splitting a PDF means: dividing one large PDF file into several smaller ones. This can be useful when your device does not have much storage space. The tool I used to split my PDF files is the Pcinfotools PDF Splitter Tool, which may also be useful to you. Besides splitting, it offers functions such as copying, printing, and modifying files. It keeps the PDF intact while processing, which matters if you are concerned about the content of the document, because some apps can delete content or misplace or drop attachments. The software offers a free demo version and is compatible with all Windows versions.