Solved

The best way to check the encoding of a TXT file

  • November 29, 2019
  • 2 replies
  • 4510 views

lazarlubomir
Contributor

Hello everyone,

I'm looking for the best way to check the encoding of an ASCII (.txt) file in FME Desktop.

Is that possible at all?

Right now I have a TXT reader in my workspace, and then I use a StringSearcher transformer to check for unsupported letters. I'm not sure if that is the best way...
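
For comparison, the same kind of check can be sketched outside FME in a few lines of Python. This is only an illustration, not what the StringSearcher does internally: the function name `find_non_ascii` is made up for this example, and it works on raw bytes rather than on decoded attribute values.

```python
def find_non_ascii(data: bytes):
    """Return (offset, byte value) pairs for every byte outside 7-bit ASCII."""
    return [(offset, value) for offset, value in enumerate(data) if value > 0x7F]


def file_is_pure_ascii(path):
    """True if the file at `path` contains only 7-bit ASCII bytes."""
    with open(path, "rb") as handle:  # binary mode: no decoding is attempted
        return not find_non_ascii(handle.read())
```

An empty result only says the file is plain ASCII. A non-empty result says some other encoding is in use, but not which one.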

Thank you so much!

Lubo

2 replies

david_r
Celebrity
  • 8394 replies
  • Best Answer
  • November 29, 2019

Unless the text file starts with a BOM (byte-order mark) in its first few bytes (two for UTF-16, three for UTF-8, four for UTF-32) to indicate Unicode, there really is no definitive way of knowing which encoding a text file uses. If you can make a lot of fairly strong assumptions, you can guess that it's probably this or that, but in reality even that is stretching it a bit.
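
To make the "assumptions" part concrete, here is a minimal Python sketch of trial decoding: try a short list of candidate encodings in order and report the first one that decodes without error. The function name `guess_encoding` and the candidate list are assumptions for illustration only (cp1250 is there as a typical Central European code page); pick candidates that match where your files come from.

```python
def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "cp1250", "latin-1")):
    """Return the first candidate that decodes `data` without error, else None.

    This is a guess, not a detection: a successful decode only means the
    bytes are *valid* in that encoding, not that it is the right one.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding, try the next one
        return name
    return None
```

Note that latin-1 accepts every possible byte sequence, so with it in the list the function never returns None. That is exactly why the order of the candidates is an assumption you have to make yourself.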

If the text file does contain a BOM, FME should be able to auto-detect the encoding correctly, although it may depend on your specific use case.
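
If you want to check for a BOM yourself, a rough Python sketch could look like the following. It uses the BOM constants from the standard `codecs` module; the function names `detect_bom` and `detect_bom_in_file` are made up for this example, and this is not a description of how FME does its own detection.

```python
import codecs

# Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def detect_bom(head: bytes):
    """Return a Python codec name if `head` starts with a known BOM, else None."""
    for bom, codec_name in _BOMS:
        if head.startswith(bom):
            return codec_name
    return None


def detect_bom_in_file(path):
    """Read the first four bytes of the file and look for a BOM."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

The returned names are codecs that consume the BOM when decoding. A result of None means "no BOM", not "not Unicode", since UTF-8 files are very often written without one. There is also one ambiguous case: a UTF-16 LE file whose first character is NUL looks like it starts with a UTF-32 LE BOM.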

See also: https://softwareengineering.stackexchange.com/questions/187169/how-to-detect-the-encoding-of-a-file

 


lazarlubomir
Contributor
  • Author
  • 165 replies
  • November 29, 2019

@david_r,

Thank you so much. I suspected a solution would be hard to come by...