Question

PDF words are misspelled

  • May 22, 2025
  • 2 replies
  • 43 views

mohamedalsobh
Contributor

I was extracting tables from a PDF file by following the steps in https://support.safe.com/hc/en-us/articles/25407564475277-Extracting-Text-and-Tabular-Data-from-PDF#h_01HW3Z9Z37Q33NQQ7R0XBGEVB0. However, I ran into a problem: some words come out misspelled or broken when extracted, even though they appear correctly in the PDF viewer. I suspect this is because the PDF was exported from a graphic design app such as Adobe Illustrator. I noticed that when I copied text from the PDF and pasted it into a Word document, the same issue occurred (correct me if I'm wrong). Is there a transformer that can help fix or reconstruct the text?

 

2 replies

raghavendrans
Enthusiast

@mohamedalsobh If the extracted text is mangled in a predictable way, so that you can build a lookup table of the broken strings and their corrections, you could try the AttributeValueMapper to replace them with the correct text.

https://docs.safe.com/fme/html/FME-Form-Documentation/FME-Transformers/Transformers/attributevaluemapper.htm

For each mangled text string, map it to the correct string in the AttributeValueMapper.
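To make the lookup-table idea concrete, here is a minimal sketch in plain Python (not FME code). It assumes the lookup matches the whole attribute value; the table entries and the `map_value` name are made up for illustration.

```python
# Hypothetical lookup table: mangled extracted value -> corrected value.
# These entries are invented examples, not taken from the original PDF.
CORRECTIONS = {
    "Pro ject": "Project",
    "To tal": "Total",
}


def map_value(value, table=CORRECTIONS):
    """Return the corrected string if the whole value is in the table,
    otherwise return the value unchanged."""
    return table.get(value, value)


if __name__ == "__main__":
    for raw in ["Pro ject", "To tal", "Area"]:
        print(repr(raw), "->", repr(map_value(raw)))
```

Note that a whole-value lookup like this only helps when the entire cell is the mangled string. A broken word inside a longer sentence would not match any table entry.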

Happy FME:-) ing

Cheers

SRG

 

 


mohamedalsobh
Contributor
  • Author
  • May 25, 2025

Thanks for your assistance. I was trying to correct a word within a sentence, so I used the StringReplacer, and it worked well.
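For anyone comparing the two approaches, this is a rough plain-Python sketch (not FME code) of replacing a broken word inside a longer sentence, which is the case described here. The `fix_sentence` name and the sample strings are hypothetical.

```python
def fix_sentence(sentence, fixes):
    """Replace each broken fragment with its correction wherever it
    occurs inside the sentence. `fixes` maps broken text -> correct text."""
    for wrong, right in fixes.items():
        sentence = sentence.replace(wrong, right)
    return sentence


if __name__ == "__main__":
    # Invented example of a word split during extraction.
    broken = "The to tal area is listed in the table"
    print(fix_sentence(broken, {"to tal": "total"}))
```

Because this matches fragments anywhere in the string, a short fragment can also hit text you did not intend to change, so it is worth keeping the search strings as specific as possible.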