Solved

encoding and language driver ID


alc33
Contributor

Hi!

 

I'm trying to set the encoding for my Shapefile directly in the .dbf rather than in a .cpg file. I choose UTF-8 in the writer parameters, but I always get a .cpg file.

 

I looked at the Shapefile writer help and saw this sentence:

"The output Shapefile .dbf will contain the language driver ID for the selected or detected encoding, if no language driver ID is available for the encoding, a .cpg file may be generated instead."

 

I don't understand what a "language driver ID" is. Can you explain?

 

Thanks!

2 replies

debbiatsafe
Safer
  • Best Answer
  • December 16, 2020

Hi @alc33​ 

I asked our development team for clarification.

 

Language drivers determine how characters in a table are sorted and displayed. In a DBF file, the language driver ID is a single byte stored in the file header at byte offset 29 (counting from 0).
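To check what a given .dbf actually contains, you can read that byte yourself. This is only a minimal sketch: `read_ldid` is a hypothetical helper name (not part of FME), and it assumes a standard dBASE-style header whose fixed part is 32 bytes long.

```python
LDID_OFFSET = 29   # byte offset of the language driver ID in the DBF header
HEADER_SIZE = 32   # assumed size of the fixed part of the header


def read_ldid(dbf_path):
    """Return the language driver ID (0-255) stored in a .dbf header."""
    with open(dbf_path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file is too short to contain a DBF header")
    # Indexing a bytes object yields an int, so no unpacking is needed.
    return header[LDID_OFFSET]
```

A result of 0 means no language driver ID is set, which is the case described below where the encoding is carried by a .cpg file instead.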

 

When FME writes a .dbf file alongside a .cpg, the language driver ID in the .dbf is set to 0. If the encoding does have a language driver ID, FME writes that ID into the header and no .cpg file is created.

 

FME creates a .cpg file for UTF-8 encoded files because there is no language driver ID for UTF-8. The dBASE version that Shapefile uses predates the introduction of UTF-8, so an ID for it was probably never added later on. Modern applications know to check for the .cpg file and work out the encoding from there.
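For illustration, here is roughly how a reader could combine the two sources of information. Treat it as a sketch under assumptions rather than how FME or any particular application does it: `detect_dbf_encoding` and `LDID_TO_CODEC` are hypothetical names, the ID table is a small partial sample, and the order of the checks (header first, then .cpg) simply follows the description above; other readers may prioritise differently.

```python
from pathlib import Path

# Partial, illustrative table of language driver IDs -> Python codec names.
# It is not exhaustive; unknown IDs fall through to the .cpg check.
LDID_TO_CODEC = {
    0x01: "cp437",   # DOS USA
    0x02: "cp850",   # DOS Multilingual
    0x03: "cp1252",  # Windows ANSI
    0xC8: "cp1250",  # Windows Eastern European
    0xC9: "cp1251",  # Windows Cyrillic
}


def detect_dbf_encoding(dbf_path, default="latin-1"):
    """Guess a .dbf encoding from its header byte, then from a sibling .cpg."""
    dbf = Path(dbf_path)
    with dbf.open("rb") as f:
        header = f.read(32)
    ldid = header[29] if len(header) >= 32 else 0

    # 1. A known, non-zero language driver ID in the header.
    if ldid in LDID_TO_CODEC:
        return LDID_TO_CODEC[ldid]

    # 2. Otherwise look for a .cpg with the same base name (e.g. for UTF-8).
    cpg = dbf.with_suffix(".cpg")
    if cpg.is_file():
        name = cpg.read_text(encoding="ascii", errors="ignore").strip()
        if name:
            return name  # returned as written, e.g. "UTF-8"

    # 3. Nothing usable found; the fallback is an arbitrary assumption.
    return default
```

Note that the name read from the .cpg is returned exactly as stored, so it may need normalising before it is passed to a decoder.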

 

I hope this information helps!


alc33
Contributor
  • Author
  • December 16, 2020

Hi!

Thank you very much. It's perfect!

 

