String Replacer doesn't catch Unicode Character

  • March 29, 2018

denizturan1985
Participant

Hello all,

I have a problem matching Turkish Unicode characters in the StringReplacer and StringSearcher transformers. The "\w" metacharacter doesn't match the letters "Ç ç Ğ ğ İ ı Ö ö Ş ş Ü ü". It works when I type these letters literally into the search field. I also tested the same letters on www.regexpal.com, and "\w" didn't match them there either.

Is there any way to match these letters with "\w"? Otherwise, I need to swap these letters for other special characters that don't exist in the Turkish alphabet at the beginning of the Workspace and swap them back at the end.

Thanks

Deniz


3 replies

bruceharold
Supporter
  • 349 replies
  • March 29, 2018

Try \X (from a quick read of Perl's Unicode support).

If you can resort to Python, it gets a bunch easier with the re.U (re.UNICODE) flag.
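
A minimal sketch of the Python route, assuming Python 3, where "\w" is already Unicode-aware for str patterns (so re.U is effectively the default). The sample string is made up for illustration, and re.ASCII is used only to show the ASCII-only behaviour described in the question:

```python
import re

# Hypothetical sample text containing Turkish letters.
text = "Şükrü çay içti"

# In Python 3, \w matches Unicode letters by default for str patterns.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so the Turkish letters split the words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)  # ['Şükrü', 'çay', 'içti']
print(ascii_words)    # ['kr', 'ay', 'i', 'ti']
```

This only shows how Python's own re module behaves. It says nothing about the regex engine inside StringReplacer.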


denizturan1985
Participant
  • Author
  • 11 replies
  • March 30, 2018

Hello @bruceharold

Thank you for your answer. \X selects everything, including whitespace. My strings include letters, whitespace, dots, and numbers. I am going to select everything and exclude numbers, whitespace, and dots. Also, thank you for the Python advice. All doors lead to Python; I should start learning it.
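
The "select everything, then exclude" idea can be sketched with a negated character class. The example below uses Python's re module and a made-up sample string. Whether the same pattern behaves identically in StringSearcher is an assumption that would need checking:

```python
import re

# Hypothetical sample: letters, whitespace, a dot, and numbers.
text = "Ağaç 12. Şube"

# [^\d\s.] matches any single character that is NOT a digit, whitespace, or a dot.
# Inside a character class the dot is literal, so it needs no escaping.
letters_only = re.findall(r"[^\d\s.]+", text)

print(letters_only)  # ['Ağaç', 'Şube']
```

A negated class also lets through any other punctuation (commas, hyphens, and so on), so those would have to be added to the excluded set if they occur in the data.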

Thank you

Deniz


david_r
Celebrity
  • 8394 replies
  • April 3, 2018

The "\w" metacharacter (usually) only matches the set [a-zA-Z0-9_]; that's why your special characters aren't included.

Your best bet is to specify your own set, e.g.

[\wçğü...etc...]
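
As an illustration of the custom-set approach, here is a small Python sketch. The re.ASCII flag forces "\w" to behave like [a-zA-Z0-9_], standing in for an engine whose "\w" is not Unicode-aware. The name TURKISH_EXTRA and the sample string are hypothetical, and the letter list is my own spelling-out of the Turkish additions rather than something given in the reply:

```python
import re

# The letters the Turkish alphabet adds beyond ASCII, in both cases (assumed list).
TURKISH_EXTRA = "çÇğĞıİöÖşŞüÜ"

# re.ASCII limits \w to [a-zA-Z0-9_]; the extra letters are listed explicitly.
word_pattern = re.compile(rf"[\w{TURKISH_EXTRA}]+", flags=re.ASCII)

matches = word_pattern.findall("Iğdır, Şanlıurfa 63")

print(matches)  # ['Iğdır', 'Şanlıurfa', '63']
```

The same bracket expression, with the letters typed literally, is what one would try in the transformer's regular expression field.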