UTF-8 in Sqlite3 database writer

  • 8 August 2014
  • 5 replies

Hi,

 

I am translating a CSV file to a SQLite3 database. Some of the entries have accented characters, e.g. "A' Mhaol Mhòr" (a mountain in Scotland), but after processing, the name displays as "A' Mhaol Mh�r" when I query the record in the target database.

I can't see anywhere in the SQLite3 writer where I can specify the character encoding.

Any help or advice much appreciated, thanks.

 

 

Jason

5 replies

Hi Jason,

 

I suspect this is a bug in FME.

If my reading is correct, SQLite 3 doesn't care what text it's given and stores it in UTF-8 by default (it can be configured to use UTF-16 instead). See: https://www.sqlite.org/version3.html

So if it's not going in properly, I'd suggest it's FME. Report it to Safe.

Cheers,

 

Jonathan
Thanks Jonathan, will report. I tried a manual import directly in SQLite3 using the ".import" command, making sure to specify UTF-8 with:
 PRAGMA encoding = "UTF-8";
Also, viewing the results in the command prompt confused me further, as the default font couldn't display accented characters anyway. Viewing the results in a GUI (SQLiteStudio) confirmed that the manual import stored these characters correctly.
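
For anyone who wants to repeat this check without FME or the sqlite3 shell, here is a rough Python sketch of the same idea: read the CSV with an explicit encoding, store it in SQLite, and read it back. The table name `hills`, the file name `hills.csv` and the helper `import_csv` are made up for illustration, and the CSV is assumed to be a single column saved as UTF-8.

```python
import csv
import os
import sqlite3
import tempfile


def import_csv(csv_path, db_path, encoding="utf-8"):
    """Load a one-column CSV of names into a table called 'hills' (hypothetical name)."""
    conn = sqlite3.connect(db_path)
    # PRAGMA encoding only has an effect before the database is first written to.
    conn.execute('PRAGMA encoding = "UTF-8"')
    conn.execute("CREATE TABLE IF NOT EXISTS hills (name TEXT)")
    # The explicit encoding here is the important part: it tells Python how
    # to turn the bytes in the file into text.
    with open(csv_path, newline="", encoding=encoding) as f:
        rows = [(row[0],) for row in csv.reader(f) if row]
    conn.executemany("INSERT INTO hills (name) VALUES (?)", rows)
    conn.commit()
    return conn


# Build a tiny UTF-8 test file (assumed layout: one name per line).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "hills.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    f.write("A' Mhaol Mhòr\n")

conn = import_csv(csv_path, ":memory:")
stored = conn.execute("SELECT name FROM hills").fetchone()[0]
db_encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# ascii() keeps the output readable on consoles that cannot display accents.
print(ascii(stored), db_encoding)
```

If the value read back matches the source text, the storage side is fine and any odd characters come from the reading or display step.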
Hi, we had the same issue last week.

It appeared for me when migrating to FME 2014.

After some time we found out that it is the CSV reader. I don't think it is a bug.

In FME 2014 you can set the character encoding that the CSV reader uses to interpret the file (FME 2013 did not have this, AFAIK).

My system encoding on Windows 7 is Windows-1252 (Latin).

The CSV reader did not recognize the input as UTF-8.

Converting afterwards did not help, as it would only convert the already misinterpreted characters. (I use the Tcl encoding convert function in an AttributeCreator for that, by the way.)

By default the encoding is not set (or is set to automatic).

You should set your CSV reader to UTF-8.
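
To illustrate why converting afterwards does not help, here is a small Python sketch (not FME or Tcl) of what presumably happens when UTF-8 bytes are read as Windows-1252. The sample word and variable names are only for illustration.

```python
original = "Mhòr"

# The bytes as they would sit in a UTF-8 encoded CSV file.
raw = original.encode("utf-8")

# A reader that assumes Windows-1252 turns the two bytes of "ò"
# into two separate, wrong characters.
misread = raw.decode("cp1252")

# Converting the misread text to UTF-8 afterwards just encodes the
# wrong characters faithfully; it does not bring back the original bytes.
converted = misread.encode("utf-8")

# Reading the bytes with the right encoding in the first place is what works.
correct = raw.decode("utf-8")

print(ascii(misread), converted != raw, correct == original)
```

The damage happens at the moment the bytes are interpreted, which is why the fix belongs in the reader's encoding setting rather than in a later conversion step.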

 

 

I used this in a case where I had GUIDs in mixed encodings, which meant that the FeatureMerger would not match them correctly (only values with the same encoding were matched).

 @Evaluate([encoding convertfrom {@Value(GUID)}])

This uses the standard setting, UTF-8 (look up the reference for all settings).

You can use this in an AttributeCreator, for instance.
Thanks Gio,

 

I set the CSV reader character encoding to "Windows-1252" (Latin 1) and this seems to have resolved the issue. Viewing the target data in SQLiteStudio, I can now see the correct characters displayed.
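
One plausible explanation for why this worked, assuming the CSV was actually saved as Windows-1252 rather than UTF-8: the single byte for "ò" is not valid UTF-8, so a UTF-8 reader has to substitute the replacement character. A short Python sketch of that assumption (variable names are illustrative):

```python
original = "A' Mhaol Mhòr"

# Assumption: the CSV file on disk is Windows-1252, where "ò" is one byte.
raw_cp1252 = original.encode("cp1252")

# A strict UTF-8 decode rejects that byte outright.
try:
    raw_cp1252.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient UTF-8 decode swaps in U+FFFD, the replacement character
# that shows up as a question-mark glyph.
misread = raw_cp1252.decode("utf-8", errors="replace")

# Decoding with the encoding the file really uses restores the text.
correct = raw_cp1252.decode("cp1252")

print(strict_ok, ascii(misread), correct == original)
```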

 

 

Jason
