Question

Serbian Cyrillic to Latin

  • February 26, 2016
  • 2 replies
  • 29 views

I have OSM data with both Cyrillic and Latin Serbian characters. I want to convert it all to Latin. So, ? ? ? would become ? ? dž. I tried TextEncoder, AttributeEncoder, and PythonCaller, but without success.

Can somebody help me with this?


2 replies

geosander
  • February 26, 2016

What is the encoding of the incoming attribute(s) that hold the Cyrillic characters? UTF-8?

I don't know what you did with the PythonCaller btw, but you could consider changing the Python Interpreter to 3.4+ if you didn't do that already. The latest Python 3 versions are less problematic with character encoding/decoding.


  • February 26, 2016

Hi @aleksandar,

Your data must be in UTF to preserve both Cyrillic and Latin Serbian characters. And I guess you would like it to be saved in Win-1250.

I would suggest replacing ? ? ? with ? ? dž using StringPairReplacer first. After this it should be possible to save data in Win-1250 encoding without any extra steps (i.e. TextEncoder or AttributeEncoder shouldn't be needed anymore as the Writer will deal with the encoding).
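
If the pair replacement is easier to do in Python (for example inside a PythonCaller), the same idea can be sketched as a character mapping. This is only a minimal sketch: the names `CYRILLIC_TO_LATIN` and `serbian_cyrillic_to_latin` are made up for illustration, the mapping is assumed to be the standard Serbian Cyrillic to Latin correspondence, and wiring it to feature attributes is not shown. It assumes Python 3 and text that is already decoded to `str`.

```python
# Assumed standard Serbian Cyrillic -> Latin correspondence (lowercase letters).
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}

# Translation table keyed by code point, as str.translate expects.
_TABLE = {ord(cyr): lat for cyr, lat in CYRILLIC_TO_LATIN.items()}
# Uppercase Cyrillic letters map to a capitalized Latin form (Љ -> Lj, Џ -> Dž).
_TABLE.update(
    {ord(cyr.upper()): lat.capitalize() for cyr, lat in CYRILLIC_TO_LATIN.items()}
)


def serbian_cyrillic_to_latin(text):
    """Replace Serbian Cyrillic letters with Latin ones; leave everything else as is."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    mixed = "Нови Сад, Beograd, џеп"
    latin = serbian_cyrillic_to_latin(mixed)
    print(latin)
    # All resulting letters exist in Windows-1250, so this encode should not fail.
    latin.encode("cp1250")
```

One known limitation of this sketch: in all-caps words the digraphs come out title-cased (Љ becomes Lj, not LJ), so such text may need an extra pass.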