Question

string case change and search

  • 22 November 2016
  • 10 replies
  • 11 views

Hi All,

I am working on name data in which I change the case of each word to title case using the StringCaseChanger. But there are a few articles such as le, la and d' (some names also contain accents) that I don't want to change to title case when they occur in the middle of a name. E.g. in 'le rocher hôtel - le restaurant' I want to change everything to title case except the "le" after the hyphen; the first "le" should be in title case. The correct name should be: Le Rocher Hôtel - le Restaurant. I am using the StringSearcher to identify these articles, but it is not giving me the correct result.

 



Userlevel 4

Sounds like a job for Python, this one.

Userlevel 2
Badge +17

Hi @dandekarpriya, if the rule is "any character that follows a hyphen + a space should not be changed to upper case", this might work.

  1. StringReplacer 1: Replace every "-<space>" with "-X".
  2. StringCaseChanger: Change case with "Full Title Case".
  3. StringReplacer 2: Replace every "-X" with "-<space>".

 

But sometimes I have cases without a hyphen.

 

E.g. Canal De Mozambique. Here I don't want "de" to be changed to "De".
Userlevel 2
Badge +17

 

Should "de" always be lowercase? Or is there a rule that determines whether lowercase or uppercase should be used?

 

Badge +16

Identifying all the words and changing them back after the case change is possibly an option you should consider.

Naturally, that depends on the number of words that need to be changed back.

If it is at the beginning of the name, then it needs to be changed to title case. E.g. "de canal mozambique" should become "De Canal Mozambique". But where it occurs in the middle, as in "Canal De Mozambique", I want to change it to "Canal de Mozambique".

 

 

Userlevel 2
Badge +17

 

 

Maybe you can add another pair of StringReplacers to resolve that, e.g.

  1. Replace "<space>de<space>" with "<space>Xde<space>" before case changing.
  2. Replace "<space>Xde<space>" with "<space>de<space>" after case changing.

However, if there are more exceptional cases other than "-<space><any>" and "<space>de<space>", scripting could also be an option, as @david_r mentioned at first.
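
For reference, the scripting option could look roughly like the following Python sketch. It is only an illustration of the rule described in this thread (articles stay lowercase unless they are the first word); the function name, the LOWERCASE_PARTICLES set and the handling of the text after d' are assumptions, not something confirmed by the original poster.

```python
# Hypothetical set of articles to keep lowercase when they are not the first word.
LOWERCASE_PARTICLES = {"le", "la", "de"}


def title_case_name(name: str) -> str:
    """Title-case each space-separated word, keeping listed articles
    lowercase unless they are the first word of the name."""
    words = name.split(" ")
    result = []
    for index, word in enumerate(words):
        lower = word.lower()
        if index > 0 and lower in LOWERCASE_PARTICLES:
            # Article in the middle of the name: leave it lowercase.
            result.append(lower)
        elif index > 0 and lower.startswith("d'"):
            # Assumption: keep d' lowercase and capitalise what follows it.
            result.append("d'" + lower[2:].capitalize())
        else:
            # First word, or any ordinary word: capitalise the first letter.
            result.append(lower.capitalize())
    return " ".join(result)


print(title_case_name("le rocher hôtel - le restaurant"))
print(title_case_name("Canal De Mozambique"))
print(title_case_name("de canal mozambique"))
```

In FME this kind of function would typically be called from a PythonCaller on the name attribute; how it is wired into the workspace is left open here.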

 

Userlevel 2
Badge +17

 

 

This may be a more flexible solution.

 

1. StringReplacer before StringCaseChanger

  • Text to Match: \s(le\s|la\s|de\s|d')
  • Replacement Text: <space>X\1
  • Use Regular Expressions: yes

2. StringReplacer after StringCaseChanger

  • Text to Match: \sX(le\s|la\s|de\s|d')
  • Replacement Text: <space>\1
  • Use Regular Expressions: yes
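
To try the two regular expressions outside FME, here is a rough Python equivalent of the three steps. It assumes that Python's str.title() behaves closely enough to "Full Title Case" for these examples, and it lowercases the input first so that an existing "De" is also caught; both points are assumptions, and the function name is made up for the sketch.

```python
import re

# Same alternatives as in the StringReplacer settings above.
PARTICLES = r"(le\s|la\s|de\s|d')"


def protect_then_title(name: str) -> str:
    # Assumption: normalise to lowercase so "De" is treated like "de".
    text = name.lower()
    # Step 1: put a marker "X" in front of each article that follows whitespace.
    text = re.sub(r"\s" + PARTICLES, r" X\1", text)
    # Step 2: title-case; "Xle" keeps its "le" lowercase because "X" is
    # now the first letter of the word.
    text = text.title()
    # Step 3: remove the marker again.
    return re.sub(r"\sX" + PARTICLES, r" \1", text)


print(protect_then_title("le rocher hôtel - le restaurant"))
print(protect_then_title("Canal De Mozambique"))
```

Note that each match consumes the whitespace after the article, so in two consecutive articles (e.g. "de la") only the first one is marked. A real word that starts with "X" followed by one of the articles would also lose its "X" in step 3.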
Badge +3

lol, just link the French dictionary to your workspace using a FeatureMerger in French locale...

( Un Mergeur de Feature)

Badge

Hi @dandekarpriya,

I agree with @takashi and @itay (and really love @gio's suggestion :)) I would try using StringCaseChanger and then replace

  • all <space>Le<space> with <space>le<space>
  • all <space>La<space> with <space>la<space>
  • all <space>D' with <space>d'

As there are not that many options, they can be easily dealt with one by one.
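
As a quick way to check this "change case first, then replace back" order, the same idea can be sketched in Python. Here str.title() stands in for the StringCaseChanger, and the REVERT list and function name are illustrative assumptions.

```python
# Pairs of (title-cased form, wanted form); each one includes the leading
# space so that an article at the very start of the name is left alone.
REVERT = [(" Le ", " le "), (" La ", " la "), (" D'", " d'")]


def title_then_revert(name: str) -> str:
    result = name.title()
    for titled, wanted in REVERT:
        result = result.replace(titled, wanted)
    return result


print(title_then_revert("le rocher hôtel - le restaurant"))
```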

De Canal Mozambique is a more complicated case though... Unless you have a 'dictionary of names' I don't think you will be able to process all the names correctly without doing some manual validation.
