Question

UTF-8 in Sqlite3 database writer


Hi,

 

I am translating a CSV file to a SQLite3 database. Some of the entries have accented characters, e.g. "Mhaol Mhòr" (a mountain in Scotland), but after processing, the name displays as "A' Mhaol Mh�r" when I query the record in the target database.

 

 

I can't see anywhere in the writer where I can specify the character encoding.

 

 

Any help or advice much appreciated, thanks.

 

 

Jason

5 replies

fmelizard
Contributor
  • Contributor
  • August 11, 2014
Hi Jason,

 

I suspect this is a bug in FME.

 

If my reading is correct, SQLite 3 doesn't care what text it's given and stores it in UTF-8 by default (configurable to UTF-16 with PRAGMA encoding, before the database is created). See: https://www.sqlite.org/version3.html

 

 

So if it's not going in properly, I'd suggest the problem is in FME. Report it to Safe.

 

Cheers,

 

Jonathan
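
To illustrate the point about SQLite's default text encoding, here is a minimal sketch using Python's built-in sqlite3 module rather than FME. The table name `hills` is made up for the example. It assumes an in-memory database, sets the encoding pragma before any table exists (the pragma has no effect once the database has been created), and checks that the accented name survives a round trip.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The encoding can only be chosen before the database is created,
# i.e. before the first table is written.
conn.execute('PRAGMA encoding = "UTF-8"')

# "hills" is a hypothetical table name used only for this sketch.
conn.execute("CREATE TABLE hills (name TEXT)")
conn.execute("INSERT INTO hills (name) VALUES (?)", ("Mhaol Mhòr",))

# Read back the encoding SQLite reports and the stored value.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]
stored = conn.execute("SELECT name FROM hills").fetchone()[0]
print(encoding, stored)
```

If the value read back matches what was inserted, SQLite itself is storing the text correctly, which points at whatever decoded the CSV before the insert.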

  • Author
  • August 11, 2014
Thanks Jonathan, will report. I tried a manual import directly in SQLite3 using the ".import" command, making sure to specify UTF-8 with:
 PRAGMA encoding = "UTF-8";
 Viewing the results in the command prompt confused me further, as the default font couldn't display accented characters anyway. Viewing the results in a GUI (SQLiteStudio) confirmed that the manual import stored these characters correctly.
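
When the console font cannot be trusted, one way to check what was actually stored is to look at the raw bytes instead of the rendered text. The sketch below is an assumption-laden illustration (Python's sqlite3 module, an in-memory database, and a made-up table called `hills`), using SQLite's built-in hex() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # default text encoding is UTF-8

# "hills" is a hypothetical table name used only for this sketch.
conn.execute("CREATE TABLE hills (name TEXT)")
conn.execute("INSERT INTO hills (name) VALUES (?)", ("Mhaol Mhòr",))

# hex() returns the stored bytes as uppercase hexadecimal, so the result
# does not depend on the font or code page of the terminal.
hex_bytes = conn.execute("SELECT hex(name) FROM hills").fetchone()[0]
print(hex_bytes)

# In UTF-8 the letter "ò" (U+00F2) is the two bytes C3 B2.
has_utf8_o_grave = "C3B2" in hex_bytes
print(has_utf8_o_grave)
```

A correctly stored UTF-8 "ò" shows up as C3B2. A lone F2 would suggest Windows-1252 bytes were stored unconverted, and EFBFBD would be an already-substituted replacement character.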

gio
Contributor
  • Contributor
  • August 11, 2014
Hi, we had the same issue last week.

It appeared for me when migrating to FME 2014.

After some time we found out it is the CSV reader. I don't think it is a bug.

In FME 2014 you can set the character encoding the CSV reader uses to interpret the file (FME 2013 did not have this, as far as I know).

My system encoding on Windows 7 is Windows-1252 (Latin).

The CSV reader was not capable of recognizing the input as UTF-8.

Converting afterwards (I use the Tcl convert function in an attribute creator for that, by the way) did not help, as it would convert the already misinterpreted characters.

By default the encoding is not set (or is set to automatic).

You should set your CSV reader to UTF-8.
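
The effect of a reader guessing the wrong encoding can be reproduced outside FME. This is a small Python sketch, not FME's actual behaviour: it encodes the mountain name both ways and then decodes each byte string with the wrong codec.

```python
name = "Mhaol Mhòr"

# The same text as it would sit on disk in each encoding.
cp1252_bytes = name.encode("cp1252")   # "ò" is the single byte F2
utf8_bytes = name.encode("utf-8")      # "ò" is the two bytes C3 B2

# Windows-1252 bytes read as UTF-8: F2 is not valid here, so the decoder
# substitutes U+FFFD, the replacement character.
read_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as Windows-1252: each byte becomes its own character.
read_as_cp1252 = utf8_bytes.decode("cp1252")

print(repr(read_as_utf8))
print(repr(read_as_cp1252))
```

The first case matches the symptom in the original question. Once the replacement character has been substituted, the original byte is gone, which is why converting afterwards cannot repair the value: the encoding has to be right at read time.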

 

 


gio
Contributor
  • Contributor
  • August 11, 2014
I used this in an issue where I had GUIDs in mixed encodings, which meant the FeatureMerger would not match correctly (only values with the same encoding were matched).

 @Evaluate([encoding convertfrom {@Value(GUID)}])

This is the standard setting, to UTF-8 (look up the reference for all settings).

You can use this in a creator, for instance.
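
The matching problem can be sketched in Python as a rough analogue of what Tcl's `encoding convertfrom` does: decode every key from its own encoding into one common text form before comparing. The GUID value and the `to_text` helper below are hypothetical, and the choice of UTF-8 versus UTF-16 as the two source encodings is an assumption for illustration only.

```python
def to_text(raw: bytes, encoding: str) -> str:
    """Decode raw bytes from a known encoding into a Unicode string."""
    return raw.decode(encoding)

# Hypothetical GUID, stored by two sources in different encodings.
guid = "3f2504e0-4f89-11d3-9a0c-0305e82c3301"
key_a = guid.encode("utf-8")
key_b = guid.encode("utf-16-le")

# Comparing the raw bytes fails even though the text is identical.
raw_match = key_a == key_b

# Decoding both sides first makes the keys comparable.
normalized_match = to_text(key_a, "utf-8") == to_text(key_b, "utf-16-le")

print(raw_match, normalized_match)
```

This only works when the source encoding of each value is known. Guessing it is the same trap as the CSV reader problem earlier in the thread.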

  • Author
  • August 18, 2014
Thanks Gio,

 

I set the CSV reader character encoding to "Windows-1252" (Latin 1) and this seems to have resolved the issue. Viewing the target data in SQLiteStudio, I can now see the correct characters being displayed.

 

 

Jason
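
For anyone doing the same translation without FME, the fix amounts to naming the source encoding explicitly when the CSV is opened. Below is a minimal Python sketch under several assumptions: the file is Windows-1252, it has a single column called `name`, and the `load_csv` helper and `hills` table are invented for the example. The demo writes its own temporary CSV so it can run on its own.

```python
import csv
import os
import sqlite3
import tempfile

def load_csv(csv_path, encoding, conn):
    """Read a CSV with an explicit encoding and insert its rows into SQLite."""
    conn.execute("CREATE TABLE IF NOT EXISTS hills (name TEXT)")
    with open(csv_path, newline="", encoding=encoding) as f:
        reader = csv.DictReader(f)
        conn.executemany(
            "INSERT INTO hills (name) VALUES (?)",
            ((row["name"],) for row in reader),
        )
    conn.commit()

with tempfile.TemporaryDirectory() as tmp:
    # Write a small Windows-1252 file to stand in for the real source CSV.
    path = os.path.join(tmp, "hills.csv")
    with open(path, "w", newline="", encoding="cp1252") as f:
        f.write("name\nMhaol Mhòr\n")

    conn = sqlite3.connect(":memory:")
    load_csv(path, "cp1252", conn)  # the encoding is stated, not guessed
    rows = [r[0] for r in conn.execute("SELECT name FROM hills")]

print(rows)
```

Python decodes the file to Unicode on read and sqlite3 stores it as UTF-8, so the only setting that matters is the one describing the source file.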
