Solved

Regular expression in StringReplacer

  • February 10, 2020
  • 9 replies
  • 485 views

umapper1
Contributor

Hi there!

I'm trying to get a regex search and replace going, but my poor regex skills are letting me down. I'm trying to extract the first 3 (or 4) digits from an existing attribute string into a new attribute using the StringReplacer, but can't get the right formula. I only want the first set of numeric values.

PS. I'm sure this can be done in a single AttributeCreator transformer using a string function too?

 

Any help would be appreciated!



erik_jan
Contributor
  • February 10, 2020

How about another approach:

Use AttributeSplitter (split on whitespace) to split into a list.

Then use the StringSearcher on _list{0} with the regex \d to find the number.

Hope this helps.
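As a rough illustration outside FME, the split-then-search idea can be sketched in Python. The sample value is borrowed from later in this thread, Python's `re` module only stands in for the StringSearcher, and the pattern `\d+` (rather than a bare `\d`, which would match a single digit) is an assumption about the intended regex.

```python
import re

# Sample attribute value (borrowed from later in the thread).
line = "CLH800 040PVC TEL TOP T101"

# AttributeSplitter step: split on whitespace into a list.
parts = line.split()

# StringSearcher step on the first list element (_list{0} in FME).
# Assumption: \d+ is used so the whole run of digits is captured.
match = re.search(r"\d+", parts[0])
first_match = match.group(0) if match else None

print(first_match)
```

If the first element contains no digits, `first_match` is `None` here, which roughly mirrors a feature that fails the search in FME.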


erik_jan
Contributor
  • Best Answer
  • February 10, 2020

And this will work too, getting the first number in _first_match (in my example 600):


markatsafe
  • February 10, 2020

@umapper1 Unfortunately StringReplacer doesn't honour substrings, so your () don't help. Try:

\d{3,4}\s

But then in your replacement string, add <space> after the replacement value.


umapper1
Contributor
  • Author
  • Contributor
  • February 10, 2020
erik_jan wrote:

And this will work too, getting the first number in _first_match (in my example 600):

Perfect thanks!


takashi
Influencer
  • February 11, 2020

Yes you can do that with a string expression too.

For example, this expression returns the first 3 or 4 consecutive digits in a specified string, assuming an attribute called "_line" stores a text line such as "CLH800 040PVC TEL TOP T101".

@ReplaceRegEx(@Value(_line),"^\D*(\d{3,4}).*",\1)
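To see how the pattern works piece by piece, here is a minimal Python sketch of the same substitution. Python's `re.sub` is only a stand-in for FME's @ReplaceRegEx, so details of the regex engine may differ; the pattern and sample string are the ones from the post above.

```python
import re

line = "CLH800 040PVC TEL TOP T101"

# ^\D*        skip any leading non-digit characters ("CLH")
# (\d{3,4})   capture the first run of 3 or 4 digits ("800")
# .*          consume the rest of the line
# Replacing the whole match with group 1 leaves only the digits.
result = re.sub(r"^\D*(\d{3,4}).*", r"\1", line)

print(result)
```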

takashi
Influencer
  • February 11, 2020

Just be aware the expression returns the same string as the input string if it doesn't contain a matched part. If there could be a case where the input string wouldn't match the regex, it would be better to use the StringSearcher transformer.
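The caveat can be reproduced with a small Python sketch. The helper names `replace_style` and `search_style` are hypothetical, and Python's `re` is only an approximation of what @ReplaceRegEx and the StringSearcher do in FME.

```python
import re
from typing import Optional


def replace_style(line: str) -> str:
    """Replace-based extraction: returns the input unchanged when nothing matches."""
    return re.sub(r"^\D*(\d{3,4}).*", r"\1", line)


def search_style(line: str) -> Optional[str]:
    """Search-based extraction: returns None when nothing matches."""
    m = re.match(r"\D*(\d{3,4})", line)
    return m.group(1) if m else None


for sample in ("CLH800 040PVC TEL TOP T101", "TEL TOP"):
    print(repr(replace_style(sample)), repr(search_style(sample)))
```

With the second sample, the replace-based version hands back "TEL TOP" as if it were a result, while the search-based version makes the missing match explicit.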


umapper1
Contributor
  • Author
  • Contributor
  • February 11, 2020

Thanks Takashi! This is actually what I was chasing: the expression.


jelle
Contributor
  • February 11, 2020
markatsafe wrote:

@umapper1 Unfortunately StringReplacer doesn't honour substrings, so your () don't help. Try:

\d{3,4}\s

But then in your replacement string, add <space> after the replacement value.

Hi Mark,

Just out of curiosity, what do you mean by 'StringReplacer doesn't honour substrings'? In the example below, I swap 'street 24' to '24 street' by referring to '\2' as the second substring and '\1' as the first.
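The swap described here can be sketched in Python as follows. The pattern `(\w+) (\d+)` is an assumption, since the actual regex used in the StringReplacer is not shown in the text, and `re.sub` only approximates the transformer.

```python
import re

address = "street 24"

# Group 1 captures the word, group 2 the number.
# The replacement refers to them in reverse order: \2 first, then \1.
swapped = re.sub(r"(\w+) (\d+)", r"\2 \1", address)

print(swapped)
```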

 


markatsafe
  • February 11, 2020

@jelle - you're correct. I completely forgot about that hidden gem!

From our documentation:

If replacement text contains \#, where # is a digit between 1 and 9, then it is replaced in the substitution with the portion of string that matched the n-th parenthesized subexpression of the regular expression.

So in the StringReplacer the regex could be: \d{3,4}(\s)

and the replacement text: @Value(_replacement)\1

where \1 preserves the <space>
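A minimal Python sketch of the same idea, for readers who want to test the pattern outside FME. The value "999" is a hypothetical stand-in for @Value(_replacement), the sample line comes from earlier in the thread, and `re.sub` is only an approximation of the StringReplacer.

```python
import re

line = "CLH800 040PVC TEL TOP T101"
replacement = "999"  # hypothetical stand-in for @Value(_replacement)

# \d{3,4} matches the digits to replace; (\s) captures the whitespace
# that follows so it can be written back after the new value.
result = re.sub(r"\d{3,4}(\s)", lambda m: replacement + m.group(1), line)

print(result)
```

Only "800 " matches in this sample: "040" is followed by a letter and "101" sits at the end of the line with no trailing whitespace, so both are left alone.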

