Solved

Text encoding/decoding

  • 14 September 2015
  • 4 replies
  • 21 views

Hi all,

If I have a GML file (UTF-8 encoded) containing, for example, LATIN SMALL LETTER C WITH ACUTE (ć), and I want to insert it into a SQL Server table with the SQL_Latin1_General_CP1_CI_AS collation and later export that imported value from the table back to a GML file, what should I do?

I tried text encoders/decoders, but without success.


Best answer by david_r 15 September 2015, 10:28


Hi

If you can post a small but complete GML block here, it might be easier to help you along.

David
Hi David,

The GML features are too big to post here, but here is an excerpt with two features and their parent elements (all elements of interest are at the level of these two; all features are inside the KAT_CadastreFeatureCollectionMembers element):
<?xml version="1.0" encoding="utf-8" ?>
<fgu:KAT_CadastreFeatureCollection>
  <fgu:KAT_CadastreFeatureCollectionMembers>
    <fgu:KAT_Katastarska_Opcina gml:id="LOCAL_ID_1">
      <fgu:oid>1100100146866</fgu:oid>
      <fgu:sifra>71051</fgu:sifra>
      <fgu:naziv>Banovići</fgu:naziv>
    </fgu:KAT_Katastarska_Opcina>
    <fgu:KAT_Toponim gml:id="LOCAL_ID_120970">
      <fgu:oid>1100100144884</fgu:oid>
      <fgu:tip>O0902006</fgu:tip>
      <fgu:naziv>Brezički potok</fgu:naziv>
    </fgu:KAT_Toponim>
  </fgu:KAT_CadastreFeatureCollectionMembers>
</fgu:KAT_CadastreFeatureCollection>
The tag of interest is <fgu:naziv>. The two values contain two different special characters, and both need to go into a table with the SQL_Latin1_General_CP1_CI_AS collation in such a way that I can read them and write them back into a GML file. How can I implement this in FME (import from and export to GML in two separate workflows)?

Thanks!

Hi

Thanks for the example, and welcome to the wonderful world of character encodings ;-) The issue here is that the file is saved as UTF-8 (https://en.wikipedia.org/wiki/Unicode), but the contents are to be written to a table configured for ISO-Latin (https://en.wikipedia.org/wiki/ISO/IEC_8859), which is a much smaller character set. Some of your special characters (such as ć and č) aren't supported by the ISO-Latin1 subset, which, according to the error message, is what the target table is configured for.
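
To see which encodings can actually represent these characters, a quick check outside FME can help. This is only a sketch in plain Python, not an FME workflow. The helper name `can_encode` is made up for illustration, and cp1252 is included on the assumption that the CP1 in the collation name refers to Windows code page 1252, which is very close to ISO-Latin1 and lacks these characters as well.

```python
def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# "Banovici" with LATIN SMALL LETTER C WITH ACUTE (U+0107) as the 7th letter
sample = "Banovi\u0107i"

# Compare Unicode, the two Latin-1 style code pages, and ISO-Latin2
for codec in ("utf-8", "latin-1", "cp1252", "iso8859_2"):
    print(codec, can_encode(sample, codec))
```

If the two Latin-1 style code pages report False while ISO-Latin2 reports True, the value cannot be stored as-is in a column that only understands the Latin-1 repertoire.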

You could try to convert the attributes to ISO-Latin2 (where these characters are supported) and hope that the database can handle them in some way, but it is a bit of a stretch.
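
As a rough illustration of what such a conversion amounts to byte for byte, here is a minimal Python sketch. It is not how FME does it internally; the function names `to_latin1_column` and `from_latin1_column` are hypothetical, and it assumes the column behaves like code page 1252 and returns the stored bytes unchanged.

```python
def to_latin1_column(text):
    """Import side: encode as ISO-Latin2, then reinterpret the bytes as cp1252 text."""
    # Raises UnicodeEncodeError if a character does not exist in ISO-Latin2.
    return text.encode("iso8859_2").decode("cp1252")


def from_latin1_column(stored):
    """Export side: reverse the trick to recover the original characters."""
    return stored.encode("cp1252").decode("iso8859_2")


original = "Brezi\u010dki potok"      # contains U+010D (c with caron)
stored = to_latin1_column(original)   # what would sit in the table
restored = from_latin1_column(stored)

print(repr(stored))
print(restored == original)
```

The catch is that the table holds a different, wrong-looking character in place of the original one, so anything that reads the column without applying the reverse step (queries, sorting, other applications) sees the wrong text. That is why it remains a bit of a stretch.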

If this doesn't work, consider removing the accents entirely (see https://knowledge.safe.com/AnswersQuestionDetail?id=906a0000000ckT7AAI) or using a StringPairReplacer for a more targeted approach, e.g. replacing ć with ch, etc.
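
Both fallbacks can be sketched in plain Python to show the idea. This is an assumption-laden illustration, not the method from the linked article and not the StringPairReplacer itself; `strip_accents` and `replace_pairs` are hypothetical helper names, and the replacement pairs are only examples.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def replace_pairs(text, pairs):
    """Targeted approach: replace each listed character with its substitute."""
    for old, new in pairs.items():
        text = text.replace(old, new)
    return text


# Example pairs only; choose whatever substitutes suit your data
pairs = {"\u0107": "ch", "\u010d": "ch"}

print(strip_accents("Banovi\u0107i"))
print(replace_pairs("Brezi\u010dki potok", pairs))
```

Note that both approaches are lossy: once the accent is gone, or two different letters map to the same substitute, the original spelling cannot be restored when exporting back to GML. Letters without a canonical decomposition, such as đ (U+0111), also pass through `strip_accents` unchanged and would need a targeted pair.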

David

Well done! Thanks a lot for the explanation and the solution.
