
I have an address: "198 whitten farm Apt 3". I would like to count the number of spaces in the address. If it has more than 3 spaces, I want to remove everything after the first 3 words.

A StringSearcher transformer with these settings will (in this case) return "198 whitten farm" in the _first_match attribute (you can pick another name, or even the address attribute).

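The regular expression ^[^\s]*\s[^\s]*\s[^\s]* reads as:

Start of the string, then any number of non-whitespace characters, followed by a whitespace character, followed by any number of non-whitespace characters, followed by another whitespace character, ending in any number of non-whitespace characters. Because it requires two whitespace characters, it only matches when the string contains at least two of them.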
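To check the pattern outside FME, here is a minimal sketch in Python. It assumes Python's re module treats this expression the same way the StringSearcher does, which holds for these basic character classes. The function name first_three_words is only illustrative.

```python
import re

# The pattern from the answer, written with single backslashes.
FIRST_THREE = re.compile(r"^[^\s]*\s[^\s]*\s[^\s]*")

def first_three_words(address):
    """Return what the pattern matches at the start of the string, or None."""
    match = FIRST_THREE.search(address)
    return match.group(0) if match else None

print(first_three_words("198 whitten farm Apt 3"))  # 198 whitten farm
print(first_three_words("198 whitten"))             # None (only one space)
```

Note that a string with fewer than two spaces gives no match at all, so the result attribute would stay empty for such addresses.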

Hope this makes sense.



An alternative to @erik_jan's very good regex is:

(\S+\s+){0,2}\S+

This is a little more general-purpose because it:

  • Trims off any leading spaces that may exist in the string
  • Handles strings that have only 1 or 2 words rather than 3 or more. It looks for between 0 and 2 repetitions of one or more non-whitespace characters followed by one or more whitespace characters, and then one or more non-whitespace characters. So it can also deal with, say, "198 whitten" or "whitten" as variants of the address string
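The behaviour described in the two points above can be tried out with a short Python sketch. It assumes Python's re module behaves like the StringSearcher for this expression. The function name first_words is only illustrative.

```python
import re

# Up to two "word + whitespace" groups, then one final word.
UP_TO_THREE = re.compile(r"(\S+\s+){0,2}\S+")

def first_words(address):
    """Return the first match (at most three words), or None if there is no word."""
    match = UP_TO_THREE.search(address)
    return match.group(0) if match else None

print(first_words("198 whitten farm Apt 3"))    # 198 whitten farm
print(first_words("   198 whitten farm Apt"))   # 198 whitten farm (leading spaces skipped)
print(first_words("198 whitten"))               # 198 whitten
print(first_words("whitten"))                   # whitten
```

Since the expression is not anchored with ^, the search simply starts at the first non-whitespace character, which is what drops the leading spaces. Any run of several spaces between words is kept as it is in the result.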

 

 

 


Thank you. That worked.

 

