Extract images from a scanned PDF and overlay on

Question

Hello,

I’m a beginner with FME and need help extracting the images from a scanned PDF so that they line up with the OCR text I got from Tesseract. The idea is to create new PDF files from the scanned ones in which the text can be selected and copied, while keeping the original layout of the text and images. I have attached two images to illustrate what I mean: one shows the original scanned PDF, and the other shows my OCR-extracted text and where I want to put the images. My workspace is also attached.

Thanks!

evieatsafe · Answer

Hi @filipforsbergit, looks like your attachments didn’t come through. Could you try attaching them again? Along with your workspace, you may want to attach some sample data so that we can run it, or include the Feature Cache in a template workspace.
