
Hi!

 

I'm trying to set the encoding for my Shapefile directly in the .dbf rather than in a .cpg file. I chose UTF-8 in the writer parameters, but I always get a .cpg file.

 

I looked at the Shapefile writer help and saw this sentence:

"The output Shapefile .dbf will contain the language driver ID for the selected or detected encoding, if no language driver ID is available for the encoding, a .cpg file may be generated instead."

 

I don't understand what a "language driver ID" is. Can you explain?

 

Thanks!

Hi @alc33​ 

I asked our development team for clarification.

 

Language drivers are used to determine how to sort and display characters in tables. In a DBF file, the language driver ID is stored in the file header, at byte offset 29.
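If you want to check what a given .dbf actually contains, here is a minimal Python sketch that reads that byte. It assumes only what is stated above (the ID is the single byte at zero-based offset 29 of the header); the function names are my own and are not part of FME.

```python
def ldid_from_header(header: bytes) -> int:
    """Return the language driver ID from the first 32 bytes of a .dbf header."""
    if len(header) < 32:
        raise ValueError("a .dbf header is at least 32 bytes long")
    # Indexing a bytes object yields an int in the range 0-255.
    return header[29]


def read_ldid(dbf_path: str) -> int:
    """Open a .dbf file and return its language driver ID byte."""
    with open(dbf_path, "rb") as f:
        return ldid_from_header(f.read(32))
```

Calling `read_ldid` on your own .dbf path returns 0 when no language driver ID has been set.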

 

When FME writes a .dbf file with a .cpg, the language driver ID is set to 0. If the encoding does have a driver ID, no .cpg file is created.

 

The reason FME creates a .cpg file for UTF-8 encoded files is that there is no language driver ID for UTF-8. The dBASE version that Shapefile uses predates the introduction of UTF-8, and support for it was probably never added afterwards. Modern applications know to check for the .cpg file and work out the encoding from there.
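As a rough illustration of that lookup order, the sketch below checks the language driver ID first and falls back to the .cpg file. It is not how FME or any particular application implements it: the function name is hypothetical, the ID table is a small partial sample of commonly documented values, and the .cpg is assumed to hold just the encoding name as plain text.

```python
from pathlib import Path
from typing import Optional

# Partial, illustrative table of language driver IDs (an assumption, not FME's list).
LDID_TO_CODEC = {0x01: "cp437", 0x02: "cp850", 0x03: "cp1252", 0x57: "cp1252"}


def guess_dbf_encoding(dbf_path: str) -> Optional[str]:
    """Guess a .dbf encoding from its language driver ID, then from a sibling .cpg."""
    dbf = Path(dbf_path)
    with dbf.open("rb") as f:
        header = f.read(32)
    if len(header) < 32:
        raise ValueError("file is too short to be a .dbf")

    ldid = header[29]  # language driver ID byte
    if ldid in LDID_TO_CODEC:
        return LDID_TO_CODEC[ldid]

    # ID is 0 or not in the sample table: look for a .cpg next to the .dbf.
    cpg = dbf.with_suffix(".cpg")
    if cpg.exists():
        name = cpg.read_text(encoding="ascii", errors="ignore").strip()
        return name or None
    return None
```

For a UTF-8 Shapefile written as described above, the ID byte is 0, so the function returns whatever the .cpg contains.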

 

I hope this information helps!


Hi!

Thank you very much. It's perfect!

 

