Solved

The best way to check the encoding of a TXT file

  • November 29, 2019
  • 2 replies
  • 4160 views

lazarlubomir
Contributor

Hello everyone,

I'm looking for the best way to check the encoding of an ASCII (.txt) file in FME Desktop.

Is it possible at all?

Right now I have a TXT reader in my workspace, and then I use a StringSearcher transformer to check for unsupported letters. I'm not sure whether that is the best way...

Thank you so much!

Lubo

Best answer by david_r

Unless the text file starts with a BOM (byte-order mark) in its first few bytes to indicate a Unicode encoding, there really is no definitive way of knowing which encoding a text file uses. If you are willing to make some fairly strong assumptions, you can guess that it's probably this or that, but in reality even that is stretching it a bit.

If the text file does contain a BOM header, FME should be able to auto-detect it correctly, although it may depend on your specific use case.
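As an illustration of what a BOM check looks like outside FME, here is a minimal Python sketch. The function names `sniff_bom` and `sniff_bom_in_file` are hypothetical, and this is not a description of how FME itself detects encodings. It only compares the first bytes of the data against the standard BOM signatures from Python's `codecs` module.

```python
import codecs

# Longer signatures come first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a Python codec name if data starts with a known BOM, else None."""
    for bom, codec_name in _BOMS:
        if data.startswith(bom):
            return codec_name
    return None


def sniff_bom_in_file(path):
    """Read only the first four bytes of a file and check them for a BOM."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))
```

A result of `None` only means that no BOM is present. It says nothing about which encoding the file actually uses.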

See also: https://softwareengineering.stackexchange.com/questions/187169/how-to-detect-the-encoding-of-a-file
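When there is no BOM, the "assumptions" approach described above usually comes down to trial decoding: try a short list of candidate encodings in strict mode and take the first one that does not fail. The sketch below is a rough Python illustration. The function name `guess_encoding` and the candidate list (`ascii`, `utf-8`, `cp1250`) are assumptions chosen for the example, not something taken from FME or from this thread.

```python
def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "cp1250")):
    """Return the first candidate that decodes data without errors, else None.

    This is a heuristic: a successful decode does not prove the guess is
    right, because single-byte code pages accept almost any byte sequence.
    """
    for codec_name in candidates:
        try:
            data.decode(codec_name)  # strict mode raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return codec_name
    return None
```

The order of the candidates matters: strict encodings such as ASCII and UTF-8 should come first, with a permissive single-byte code page last as the fallback assumption.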

 


2 replies

david_r
Evangelist
  • Best Answer
  • November 29, 2019


lazarlubomir
Contributor
  • Author
  • November 29, 2019
@david_r,

thank you so much. I suspected that a solution would be hard to reach...

