Solved

Regex: replace multiple consecutive occurrences of a string with a single instance of the same string



Hi,

(I am using FME Desktop 2015.1)

Can anyone help with the regex needed to replace multiple consecutive occurrences of a flag, which I have created, with a single instance? The flag may appear as a run of repeats at many positions throughout the text, and each time I need to replace the run with a single instance. Please see the examples below.

Please note that, as with the first flag grouping in example 1, there may or may not be spaces on either side of the grouping. At present the 'flag' is (TMP_flag), but if this syntax (namely the parentheses) causes an issue I can change it.

Example 1

I am text before(TMP_flag)(TMP_flag)I am text in the middle (TMP_flag)(TMP_flag)(TMP_flag) I am text at the end(TMP_flag)

Becomes

I am text before(TMP_flag)I am text in the middle (TMP_flag) I am text at the end(TMP_flag)

Example 2

(TMP_flag)

Becomes

(TMP_flag)

Example 3

(TMP_flag)(TMP_flag)(TMP_flag)

Becomes

(TMP_flag)

Example 4

Some more text(TMP_flag)(TMP_flag)

Becomes

Some more text(TMP_flag)

 

Thanks in advance,

Rob


7 replies

takashi
Influencer
  • Best Answer
  • April 4, 2017

Hi @rob14, the StringReplacer with this setting might help you.

  • Text to Match: (\(TMP_flag\))+
  • Replacement Text: \1
  • Use Regular Expressions: yes
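For anyone who wants to check the pattern outside FME, here is a minimal sketch in Python using the same expression as the StringReplacer setting above. The function name collapse_flags is made up for illustration, and Python's re module is assumed to behave like FME's regex engine for this simple pattern.

```python
import re

# Same pattern as the StringReplacer setting: one or more consecutive flags.
FLAG_RUN = re.compile(r"(\(TMP_flag\))+")


def collapse_flags(text: str) -> str:
    """Replace each run of consecutive (TMP_flag) with a single flag.

    collapse_flags is a hypothetical helper name, not part of FME.
    """
    # Group 1 holds the last flag matched in the run, so \1 is one flag.
    return FLAG_RUN.sub(r"\1", text)


if __name__ == "__main__":
    # The four examples from the question.
    examples = [
        "I am text before(TMP_flag)(TMP_flag)I am text in the middle "
        "(TMP_flag)(TMP_flag)(TMP_flag) I am text at the end(TMP_flag)",
        "(TMP_flag)",
        "(TMP_flag)(TMP_flag)(TMP_flag)",
        "Some more text(TMP_flag)(TMP_flag)",
    ]
    for example in examples:
        print(collapse_flags(example))
```

Surrounding spaces are left untouched because the pattern only ever matches the flag itself.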

  • Author
  • April 4, 2017

Hi @takashi,

Thanks very much, that has worked a treat.

(As an aside, I had already used one of your previously posted answers to remove multiple consecutive spaces, \s{2,}. I had tried to tweak this to remove the flags but had not managed it.)

Thanks again,

Rob
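The \s{2,} pattern mentioned in the aside above works on the same principle. A small Python sketch follows, assuming the replacement text is a single space (the post does not say what the replacement was); collapse_whitespace is a made-up name.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of two or more whitespace characters with one space.

    Assumption: the replacement is a single space. Note that \\s also
    matches tabs and newlines, not only the space character.
    """
    return re.sub(r"\s{2,}", " ", text)


if __name__ == "__main__":
    print(collapse_whitespace("I am  text   with    gaps"))
```

The flag case needs a capture group because the thing being repeated is a multi-character string rather than a single character class.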


gio
Contributor
  • Contributor
  • April 4, 2017

Hi,

You can do that using regsub, for instance in an AttributeCreator using the Arithmetic Editor.

[regsub -all -line {(\(TMP_flag\))\1*} {@Value(test)} {\1}]

test is the attribute I used to hold your examples.

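  1. I capture (TMP_flag) with the regex (\(TMP_flag\)): the literal parentheses are escaped as \( and \), and the outer parentheses define the capture group. The \1* that follows matches any further consecutive repeats of the captured flag.
  2. The curly braces around the input string are needed because your string contains spaces, parentheses and underscores (double quotes will yield an error).
  3. The \1 in the replacement denotes the (first) capture group. It is in curly braces so that Tcl passes the backslash through to regsub unchanged.
  4. The switch "-all" replaces every match instead of only the first; "-line" enables newline-sensitive matching, so ^ and $ also match at line boundaries and . does not match a newline.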
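The points above can be tried out in Python as a rough analogue of the Tcl command. This is only a sketch: re.sub replaces all matches by default (like -all), and count=1 is used here to imitate regsub without -all. The sample text is invented for illustration.

```python
import re

# Capture one flag, then allow zero or more repeats of that same capture.
PATTERN = r"(\(TMP_flag\))\1*"

# Two lines, each with its own run of flags (illustrative input only).
text = "a(TMP_flag)(TMP_flag)b\n(TMP_flag)(TMP_flag)(TMP_flag)c"

# Comparable to regsub -all: every run is collapsed.
all_runs = re.sub(PATTERN, r"\1", text)

# Comparable to regsub without -all: only the first run is collapsed.
first_only = re.sub(PATTERN, r"\1", text, count=1)

if __name__ == "__main__":
    print(all_runs)
    print(first_only)
```

Because the pattern contains no ^, $ or ., newline-sensitive matching makes no difference to the result for this particular expression.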

The StringReplacer transformer can also be used:

Text to match: (\(TMP_flag\))\1*

Replacement text: \1

Use regular expressions: Yes
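If the flag text ever changes, the escaping does not have to be done by hand. The sketch below builds the same kind of pattern for an arbitrary flag in Python with re.escape; collapse_repeats is a hypothetical helper, and the "[FLAG]" marker is just an example value.

```python
import re


def collapse_repeats(text: str, flag: str) -> str:
    """Collapse every run of consecutive `flag` strings into one `flag`.

    collapse_repeats is a made-up name for illustration.
    """
    if not flag:
        raise ValueError("flag must be a non-empty string")
    # re.escape turns characters such as ( ) [ ] into literal matches.
    pattern = "(" + re.escape(flag) + ")+"
    # A function replacement avoids any escaping issues in the flag text.
    return re.sub(pattern, lambda match: match.group(1), text)


if __name__ == "__main__":
    print(collapse_repeats("Some more text(TMP_flag)(TMP_flag)", "(TMP_flag)"))
    print(collapse_repeats("a[FLAG][FLAG][FLAG]b", "[FLAG]"))
```

So the parentheses in (TMP_flag) are not a problem as long as they are escaped in the pattern.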

For explanation see

https://www.tcl.tk/man/tcl8.4/TclCmd/regsub.htm

A must-have for your favorites list.


gio
Contributor
  • Contributor
  • April 4, 2017

lol...I took too much time typing.


  • Author
  • April 4, 2017

Hi @gio

Thanks for the answers and the explanation of the regex processing.

I need to spend more time on understanding how it works as it is very powerful, when you know what you are doing. :-)

Rob

 

Hi @gio

Thanks for the answers and the accompanying explanation of the regex processing.

I need to spend more time learning and working with this to develop my skills; it is a really powerful tool when you know what you are doing. :-) (one day hopefully)

Rob


  • Author
  • April 4, 2017

Hi @gio

 

I had problems with double posting now!

gio
Contributor
  • Contributor
  • April 4, 2017

..I am honored ;)

