Question

PDF reader characters

March 6, 2019
3 replies
40 views

oliver.morris
Contributor

Hi, I am reading text from a PDF using the PDF reader, and in the text I see xEF, xBF, xBE. What are these, and how do I remove them?

Thank you

danilo_fme
Celebrity
March 6, 2019

Hi @olivermorris

Could you share a sample of this data?

Thanks,

Danilo

Partner Solutial Brazil - www.solutial.com.br

david_r
March 7, 2019

It could be either non-printable characters or Unicode characters that can't be encoded in the active encoding. For example, xEF could be an ï (letter i with diaeresis): http://www.fileformat.info/info/unicode/char/ef/index.htm

But without knowing the context it's hard to be certain.

You could try using an AttributeEncoder to see if that gives you the expected result.
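
To make the two readings above concrete, here is a minimal Python sketch (plain Python, not an FME API). It assumes the three values from the question form one consecutive byte sequence, which the thread does not confirm. Read together as UTF-8, the bytes EF BF BE decode to the single code point U+FFFE, a Unicode noncharacter; read one byte at a time as Latin-1, they give three separate printable characters, the first being the ï mentioned above.

```python
import unicodedata

# Assumption: the three values from the question appear back to back
# in the extracted text as the bytes EF BF BE.
raw = bytes([0xEF, 0xBF, 0xBE])

# Reading 1: one UTF-8 sequence, which decodes to a single code point.
as_utf8 = raw.decode("utf-8")
codepoint = f"U+{ord(as_utf8):04X}"
category = unicodedata.category(as_utf8)  # general category of that code point

# Reading 2: three independent Latin-1 bytes, one character each.
as_latin1 = raw.decode("latin-1")
names = [unicodedata.name(ch) for ch in as_latin1]

print(codepoint, category)
print(as_latin1, names)
```

If the first reading applies, the value is not a real letter at all, which would fit the "non-printable characters" explanation.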

oliver.morris
Author
Contributor
March 7, 2019

Thanks for the help, an example below.

I searched around some more, and after adding a StringReplacer and copying the black sections into it as the text to replace, they were removed. As @david_r suggests, I think they are just non-printable characters.

All sorted, thanks
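
For anyone who wants to do the same clean-up in a script rather than by copying the characters into a replacer, a rough Python sketch follows. It is not what was used in this thread: `strip_nonprintable` is a hypothetical helper, the sample string is made up, and treating every Unicode "C" category (control, format, surrogate, private use, unassigned) as removable is an assumption that may be too aggressive for text relying on format characters such as zero-width joiners.

```python
import unicodedata

def strip_nonprintable(text: str) -> str:
    """Hypothetical helper: drop code points whose general category starts with 'C'.

    Newlines and tabs are kept, since they are control characters that
    normally carry layout in extracted text.
    """
    keep = {"\n", "\t"}
    return "".join(
        ch for ch in text
        if ch in keep or not unicodedata.category(ch).startswith("C")
    )

# Made-up sample containing U+FFFE and a NUL byte in otherwise normal text.
sample = "Invoice\ufffe total:\x00 42\n"
cleaned = strip_nonprintable(sample)
print(repr(cleaned))
```

Accented letters such as ï are in a letter category, so they pass through unchanged.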
