Question

Identify a string within text and remove everything after it

  • 30 January 2017
  • 5 replies
  • 63 views

Hi,

This question is similar to the one posted here, but it is slightly different.

I am trying to replace part of a text_line_data attribute. The section of text I want to remove begins with a certain phrase, and I want to remove that phrase plus everything after it. For example:

text_line_data:

"I like to use FME 2017"

I would like to find the text that begins with "use" and delete it, along with everything after it. So the result would be:

Remain: "I like to"

Removed: "use FME 2017"

The regular expression anchor ^ only matches at the start of the string, so it would only recognize "I" as the beginning of the string.

Is there some way of setting the transformer to find a text string anywhere within an attribute and delete everything after it?


5 replies

Hi @johnwk

I think you are looking for a so-called 'positive lookahead', which matches everything before a specific pattern without including the pattern itself:

https://regex101.com/r/I6nlPY/1

.*(?=use)

You can use this regex in the StringSearcher.
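To see what the lookahead does outside of FME, here is a minimal Python sketch. The helper name keep_before is made up for illustration, and Python's re module is only assumed to behave like the regex engine in the StringSearcher for this simple pattern. Note that .* is greedy, so the match runs up to the last occurrence of the phrase, and the space before "use" is kept.

```python
import re


def keep_before(text, phrase):
    """Return the part of text matched by .*(?=phrase), or text if no match.

    keep_before is a hypothetical helper, not an FME function.
    """
    # re.escape guards against regex metacharacters in the phrase.
    match = re.search(r".*(?=" + re.escape(phrase) + r")", text)
    return match.group(0) if match else text


kept = keep_before("I like to use FME 2017", "use")
print(repr(kept))  # the trailing space before "use" is part of the match
```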

Or you can use the StringSearcher (search for use) to find the location in the string.

Then use the SubStringExtractor (using the index from the StringSearcher) to retain only the part you need.
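The same two-step idea (find the position, then keep the substring before it) can be sketched in plain Python. This is only an analogy for the transformer workflow, and truncate_at is a hypothetical name.

```python
def truncate_at(text, phrase):
    """Keep only the part of text that comes before the first phrase.

    Hypothetical helper mirroring "search for the index, then extract
    the substring"; it is not part of FME.
    """
    index = text.find(phrase)  # -1 when the phrase is absent
    if index == -1:
        return text  # nothing to cut, keep the whole string
    return text[:index]


print(repr(truncate_at("I like to use FME 2017", "use")))
```

Unlike the greedy lookahead, this cuts at the first occurrence of the phrase.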

You could also use a regular expression in a StringReplacer to replace "use" and everything after it with nothing, which makes the regex a bit simpler:

use.+
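As a rough illustration of the replace approach, here is a small Python sketch using re.sub. The function name remove_from_phrase and the trailing-whitespace cleanup are assumptions added for the example, not something the StringReplacer does by itself.

```python
import re


def remove_from_phrase(text, phrase):
    """Delete phrase and everything after it, then trim trailing spaces.

    Hypothetical helper; the rstrip() is an extra step so the result
    is "I like to" rather than "I like to ".
    """
    pattern = re.escape(phrase) + r".+"
    return re.sub(pattern, "", text).rstrip()


print(repr(remove_from_phrase("I like to use FME 2017", "use")))
```

One caveat: a plain "use" also matches inside longer words such as "because" or "reuse". If that matters for your data, a word boundary (\buse\b) is one way to restrict the match.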

Great idea!!

Just a small idea: isn't it better to use 'use.*' (* instead of +), because 'use' might also be the last word?
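The difference between + and * only shows up when "use" is the very last thing in the string. A quick Python check, assuming the same regex semantics as in the StringReplacer (strip_plus and strip_star are illustrative names):

```python
import re


def strip_plus(text):
    # .+ needs at least one character after "use"
    return re.sub(r"use.+", "", text)


def strip_star(text):
    # .* also matches when nothing follows "use"
    return re.sub(r"use.*", "", text)


for sample in ["I like to use FME 2017", "I like to use"]:
    print(repr(strip_plus(sample)), repr(strip_star(sample)))
```

With 'use.+' the second sample is left unchanged, while 'use.*' removes the trailing "use" as well.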
Much appreciated, everyone! As I'm a regex novice, I tried a combination of @egomm's and @jeroenstiers' suggestions: 'use.*'. This worked perfectly! Thanks again.
