Solved

Dynamic Multi string replace

  • 9 January 2020
  • 5 replies
  • 4 views

Hi community.

 

I am looking to improve some workbenches and have them slightly more automated.

The one I am currently working on revolves around some height of building data and extracts the class from a string (example below). Currently I am using the AttributeValueMapper to convert this manually, but it would be fantastic if there was a way to do this without setting the values by hand (so that if new data/classes are added we don't need to come back in and edit the values to map).

This is a screenshot of what is coming out of the AttributeValueMapper. The Condition attribute is the Source Value and the lay_class attribute is the destination value.

The class I am extracting is the capital letters (one with number, but there will be more in the future) towards the end of the string. I know there is a regex expression to extract uppercase letters which is the closest to an automated solution that I have found (the condition includes more than one uppercase character) so it may be a combo of that with another expression.

Any ideas would be fantastic.

Cheers.

Best answer by arnold_bijlsma 9 January 2020, 17:54

5 replies

Badge +3

This is essentially what StringSearcher is for, and yes, RegEx is the way to go for generalising a lookup.

 

To generalise the RegEx, you probably should define what "Rules" the lay_class letter obeys. Such as:

  1. It is always bounded by word boundary characters like " " or "("
  2. It is always capital letters, with or without a number suffix that may be 0 or more digits long
  3. etc.

Once you have that, it is a matter of coding those rules into the Regular Expression. For example, this RegEx using the above rules works for your samples:

\b[A-Z]+\d*\b

This states it must:

  • Be one or more consecutive capital letters -> [A-Z]+
  • Be bounded by word boundary characters -> \b{SearchString}\b
  • Optionally have a digit after the letters -> {SearchString}\d
  • The digit suffix can be 0 or more digit characters long -> {SearchString}\d*

The different permutations in your source data can be trialled in any number of online RegEx testers like https://regex101.com/
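
For anyone who wants to trial the same expression outside FME, here is a minimal Python sketch. The sample strings are made up for illustration (the original screenshot is not reproduced in this thread), and `extract_class` is a hypothetical helper name, not an FME function. Python's `re` module is case sensitive by default.

```python
import re

# Word boundary, one or more capital letters, optional digit suffix, word boundary.
CLASS_PATTERN = re.compile(r"\b[A-Z]+\d*\b")

def extract_class(condition):
    """Return the first all-capitals token (with optional digits), or None."""
    match = CLASS_PATTERN.search(condition)
    return match.group(0) if match else None

# Hypothetical sample values, not taken from the original data.
samples = [
    "height of building 9.5 m (J)",
    "height of building 12 m (T1)",
    "height of building 8.5 m (RL)",
]
for text in samples:
    print(text, "->", extract_class(text))
```

Because of the trailing \b, a capitalised word such as "Height" is not picked up: the capital letter is followed by more word characters, so there is no boundary after it.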

Badge +3

Assuming you're looking for:

  1. A single capital letter (e.g. "J" or "O") OR
  2. A single capital letter followed by a single number (e.g. "T1") OR
  3. Two capital letters (e.g. "RL")

then the following RegEx should work in your StringSearcher:

[A-Z][A-Z0-9]?

See also https://regex101.com/r/4S87sg/3

heightofbuildingclass.fmw
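
As a rough way to check this shorter pattern outside FME, the sketch below runs it in Python. The sample strings and the helper name `first_class` are assumptions for illustration only. Note that the pattern has no word boundaries, so it relies on the class code being the first capital letter in the string.

```python
import re

# One capital letter, optionally followed by one more capital letter or a digit.
SHORT_CLASS = re.compile(r"[A-Z][A-Z0-9]?")

def first_class(condition):
    """Return the first match of the short class pattern, or None."""
    match = SHORT_CLASS.search(condition)
    return match.group(0) if match else None

# Hypothetical sample values, not taken from the original data.
print(first_class("height of building 9.5 m (J)"))   # single letter
print(first_class("height of building 12 m (T1)"))   # letter plus digit
print(first_class("height of building 8.5 m (RL)"))  # two letters
# A capitalised word earlier in the string would be matched first:
print(first_class("Height of building 12 m (T1)"))
```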

Badge +1

Thanks for this Arnold.

Worked like a charm. I had a moment on the first run where all I was getting was the word 'of' as the extracted string, but when I went back and made sure Case Sensitive was set to Yes everything went perfectly.

 

Thanks very much mate.
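
The case sensitivity point can be reproduced outside FME with a small Python sketch. Here `re.IGNORECASE` stands in for a case-insensitive search; the sample string and the helper name `first_match` are assumptions, and the exact word that gets picked up depends on the input string.

```python
import re

PATTERN = r"[A-Z][A-Z0-9]?"
text = "height of building 12 m (T1)"  # hypothetical sample value

def first_match(pattern, value, flags=0):
    """Return the first match of pattern in value, or None."""
    match = re.search(pattern, value, flags)
    return match.group(0) if match else None

# Case sensitive (Python's default): only capital letters can start a match.
case_sensitive = first_match(PATTERN, text)

# Case insensitive: [A-Z] also matches lower-case letters, so the match
# is taken from the first ordinary word instead of the class code.
case_insensitive = first_match(PATTERN, text, re.IGNORECASE)

print(case_sensitive)
print(case_insensitive)
```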

Badge +1

Hi bwn.

 

Thanks for this, it was just about perfect.

We just needed to add the possibility of the second letter with a second [A-Z].

Thanks very much for the reply mate.

Badge +3

@michaelbreen Oops, I missed the last example of "RL". In this case, all that is needed is to add a "+" to [A-Z] to indicate it can be 1 or more consecutive capital letters at the start. The "\d*" bit then says the number suffix is optional and can be zero or more digits long.

I've modified the Answer above. As you say too, you will need StringSearcher to be set to respect capitalisation.
