
String Replacer doesn't catch Unicode Character



Hello all,

I have a problem matching Turkish Unicode characters in the StringReplacer and StringSearcher transformers. The "\w" metacharacter doesn't match the letters "Ç ç Ğ ğ İ ı Ö ö Ş ş Ü ü". It works when I enter these letters themselves in the search field. I also tested the same letters on www.regexpal.com, and it didn't work there either.

Is there any way to match these letters with "\w"? Otherwise, I will need to replace these letters with other special characters that don't exist in the Turkish alphabet at the beginning of the workspace and change them back at the end.

Thanks

Deniz

bruceharold
Contributor

Try \X

From a quick read of the Perl Unicode support.

If you can resort to Python, it gets a bunch easier with the re.U (re.UNICODE) flag.
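The Python route mentioned above can be sketched as follows. This is a minimal example, assuming Python 3, where "\w" is already Unicode-aware for str patterns (Python 2 needed the re.UNICODE flag for that). The sample string is made up for illustration.

```python
import re

# Hypothetical sample containing Turkish letters, digits, a dot and spaces.
sample = "Çağrı Öztürk 42. İstanbul"

# In Python 3, \w on str patterns matches Unicode word characters by default.
unicode_words = re.findall(r"\w+", sample)

# re.ASCII narrows \w back to [a-zA-Z0-9_], which reproduces the problem
# described in the question: words get split at every Turkish letter.
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

The second call is only there to show the ASCII-only behaviour side by side; in practice the default is what you want.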



Hello @bruceharold

 

Thank you for your answer. \X selects everything, including whitespace. My strings include letters, whitespace, dots, and numbers. I am going to select everything and exclude numbers, whitespace, and dots. Also, thank you for the Python advice. All doors open to Python; I should start learning it.

Thank you

Deniz
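The exclusion idea described above (take everything, then leave out numbers, whitespace, and dots) can be written as a negated character class. The sketch below is in Python 3 with a made-up sample string; whether the transformer's regex engine treats the same class identically is not confirmed here.

```python
import re

# Hypothetical sample: letters, whitespace, dots and numbers.
sample = "Şişli 34. Çankaya 06"

# Negated class: any run of characters that are not digits, whitespace, or dots.
letter_runs = re.findall(r"[^\d\s.]+", sample)

# Stricter variant: Unicode word characters minus digits and underscore,
# which leaves letters only (other punctuation is excluded as well).
strict_letter_runs = re.findall(r"[^\W\d_]+", sample)

print(letter_runs)
print(strict_letter_runs)
```

The first pattern still lets through any other punctuation, such as commas, so the stricter variant may be the safer choice if the data is not fully known.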


david_r
Evangelist

The "\w" metacharacter (usually) only matches the set [a-zA-Z0-9_], which is why your special characters aren't included.

Your best bet is to specify your own set, e.g.

[\wç?ü...etc...]
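As a rough illustration of the custom-set approach, here is a Python 3 sketch. The letter list is an assumption (the usual non-ASCII Turkish letters), since the set above is abbreviated, and re.ASCII is used only to imitate an engine whose "\w" is limited to [a-zA-Z0-9_].

```python
import re

# Assumed list of Turkish letters that fall outside ASCII (not taken from the post).
TURKISH_EXTRA = "çÇğĞıİöÖşŞüÜ"

# re.ASCII keeps \w narrow, so the extra letters must be listed explicitly.
WORD = re.compile(rf"[\w{TURKISH_EXTRA}]+", re.ASCII)

# Hypothetical sample text.
tokens = WORD.findall("Güneş doğdu, ışık_1 İzmir")
print(tokens)
```

Listing both upper and lower case avoids relying on case-insensitive matching, which behaves unusually for the dotted and dotless Turkish I.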

