
Hello, 

I’m a beginner with FME and need help extracting the images from a scanned PDF so that they fit with the OCR text I got from Tesseract. The idea is to create new PDF files from the scanned ones in which the text can be selected and copied, while keeping the original layout of the text and images. I have attached two images to illustrate what I mean: one shows the original scanned PDF, and the other shows my OCR-extracted text and where I want to place the images. My workspace is also attached.

 

Thanks!

Hi @filipforsberg, it looks like your attachments didn’t come through. Could you try attaching them again? Along with your workspace, you may want to attach some sample data so that we can run it. Alternatively, you could include the Feature Cache in a template workspace.

