
Hi there!

I'm trying to get a regex search and replace going, but my poor regex skills are letting me down. I want to extract the first 3 (or 4) digits from an existing attribute string into a new attribute using the StringReplacer, but I can't get the right formula. I only want the first set of numeric values.

PS. I'm sure this could also be done in a single AttributeCreator transformer using a string function?

Any help would be appreciated!

How about another approach:

Use AttributeSplitter (split on whitespace) to split into a list.

Then use the StringSearcher on _list{0} with the regex \d to find the number.

Hope this helps.
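For anyone who wants to check the split-then-search logic outside FME, here is a rough Python sketch of the same two steps. It is only an analogue, not FME code: the function name is made up, the sample line is the one quoted elsewhere in this thread, and `\d+` (a whole run of digits rather than a single `\d`) is an assumption on my part.

```python
import re


def first_number_from_first_token(line):
    """Split on whitespace, then search the first token for a run of digits."""
    tokens = line.split()          # rough analogue of AttributeSplitter on whitespace
    if not tokens:
        return None
    # Rough analogue of StringSearcher on _list{0}; \d+ is an assumption here.
    match = re.search(r"\d+", tokens[0])
    return match.group(0) if match else None


print(first_number_from_first_token("CLH800 040PVC TEL TOP T101"))  # 800
```

Note that this only looks at the first token, so a line whose first word has no digits gives no result.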


And this will work too, getting the first number in _first_match (in my example 600):


@umapper1 Unfortunately StringReplacer doesn't honour substrings, so your () don't help. Try:

\d{3,4}\s

But then in your replacement string, add <space> after the replacement value


Perfect thanks!


Yes you can do that with a string expression too.

For example, this expression returns the first 3 or 4 consecutive digits in a specified string. Assuming an attribute called "_line" stores a text line such as "CLH800 040PVC TEL TOP T101".

@ReplaceRegEx(@Value(_line),"^\D*(\d{3,4}).*",\1)
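To see what that pattern does, here is a small Python sketch using the same regex with `re.sub`. Python's regex engine is not the one FME uses, so treat it as an approximation of the behaviour; the function name is hypothetical.

```python
import re

# Same pattern as in the expression above: skip leading non-digits,
# capture the first 3 or 4 digits, then swallow the rest of the line.
PATTERN = r"^\D*(\d{3,4}).*"


def first_digits(line):
    """Replace the whole line with capture group 1 (the first 3-4 digits)."""
    return re.sub(PATTERN, r"\1", line)


print(first_digits("CLH800 040PVC TEL TOP T101"))  # 800
```

Because the pattern consumes the entire line, replacing the match with `\1` leaves only the captured digits.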

Just be aware the expression returns the same string as the input string if it doesn't contain a matched part. If there could be a case where the input string wouldn't match the regex, it would be better to use the StringSearcher transformer.
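The no-match behaviour is easy to reproduce in Python as a sanity check. This is a sketch only: `first_digits_or_none` is a hypothetical helper standing in for a search-style approach that reports a failed match instead of passing the input through.

```python
import re

PATTERN = r"^\D*(\d{3,4}).*"


def first_digits_or_none(line):
    """Return the first 3-4 digits, or None when the pattern does not match."""
    match = re.match(PATTERN, line)
    return match.group(1) if match else None


line = "NO DIGITS HERE"
# A replace leaves a non-matching string untouched...
print(re.sub(PATTERN, r"\1", line))   # NO DIGITS HERE
# ...whereas an explicit match test makes the failure visible.
print(first_digits_or_none(line))     # None
```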


Thanks Takashi! This is actually what I was chasing: the expression.


Hi Mark,

Just out of curiosity, what do you mean by 'StringReplacer doesn't honour substrings'? In the example below, I swap 'street 24' to '24 street' by referring to '\2' as the second substring and '\1' as the first.
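The original example is not reproduced in the text of this thread. As a stand-in, here is a rough Python analogue of that kind of swap; the regex `(\w+) (\d+)` and the function name are assumptions for illustration, not the actual StringReplacer settings.

```python
import re


def swap_street_and_number(text):
    """Swap a word and the number after it using numbered back-references."""
    # Group 1 = the word, group 2 = the number; the replacement reverses them.
    return re.sub(r"(\w+) (\d+)", r"\2 \1", text)


print(swap_street_and_number("street 24"))  # 24 street
```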

 


@jelle - you're correct. I completely forgot about that hidden gem!

From our documentation:

If replacement text contains \#, where # is a digit between 1 and 9, then it is replaced in the substitution with the portion of string that matched the n-th parenthesized subexpression of the regular expression.

So in the StringReplacer the regex could be: \d{3,4}(\s)

and the replacement text: @Value(_replacement)\1

where \1 preserves the <space>
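A quick way to convince yourself that the captured whitespace survives is this Python sketch. It mirrors the regex above but is not FME code: the function name and the replacement value "999" are hypothetical, and a callback is used in place of the `\1` replacement text so that a purely numeric replacement cannot be misread as part of the back-reference.

```python
import re


def replace_leading_number(line, replacement):
    """Replace the first 3-4 digit run that is followed by whitespace,
    keeping that whitespace (the role \\1 plays in the replacement text)."""
    return re.sub(
        r"\d{3,4}(\s)",
        lambda m: replacement + m.group(1),  # group 1 is the captured whitespace
        line,
        count=1,
    )


print(replace_leading_number("CLH800 040PVC TEL TOP T101", "999"))
# CLH999 040PVC TEL TOP T101
```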

