I have an address: "198 whitten farm Apt 3". I would like to count the number of blank spaces in the address. If it has more than 3 blank spaces, I want to remove everything after the third blank space.
A StringSearcher transformer with these settings will (in this case) return "198 whitten farm" in the _first_match attribute (you can pick another attribute name, or even the address itself).
The regular expression ^[^\s]*\s[^\s]*\s[^\s]* reads as:
start of the string, then any number of non-space characters, followed by a space, followed by any number of non-space characters, followed by a space, and finally any number of non-space characters.
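If you want to try the pattern outside FME, here is a minimal sketch in Python. It assumes Python's re module treats \s and character classes the same way the StringSearcher does for this pattern; the function name first_three_words and the fallback of returning the input unchanged when nothing matches are my own choices for illustration, not part of the transformer.

```python
import re

# Same pattern as above: start, word, space, word, space, word.
PATTERN = re.compile(r"^[^\s]*\s[^\s]*\s[^\s]*")

def first_three_words(address: str) -> str:
    """Return the part of the address matched by the pattern.

    Assumption: if the pattern does not match (fewer than two spaces),
    the address is returned unchanged.
    """
    match = PATTERN.search(address)
    return match.group(0) if match else address

print(first_three_words("198 whitten farm Apt 3"))  # 198 whitten farm
```

Note that the pattern needs at least two spaces to match at all, so an address such as "198 whitten" produces no match.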
Hope this makes sense.
An alternative to @erik_jan's very good regex is:
(\S+\s+){0,2}\S+
This is a little more general purpose because it:
- Skips any leading spaces that may exist in the string
- Handles strings that have only 1 or 2 words as well as 3 or more. It looks for between 0 and 2 repetitions of one or more non-whitespace characters followed by one or more whitespace characters, and then a final run of one or more non-whitespace characters. So it can also deal with, say, "198 whitten" or "whitten" as the address string
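To see how the variant behaves on the shorter inputs, here is a small Python sketch. It assumes Python's re module handles \S and \s like the FME regex engine does for this expression; the function name up_to_three_words and the empty-string result when nothing matches are hypothetical choices for the example.

```python
import re

# Up to two "word + whitespace" groups, then one final word.
VARIANT = re.compile(r"(\S+\s+){0,2}\S+")

def up_to_three_words(address: str) -> str:
    """Return the first match of the variant pattern.

    Assumption: an empty string is returned when the input contains
    no non-whitespace characters at all.
    """
    match = VARIANT.search(address)
    return match.group(0) if match else ""

for sample in ["  198 whitten farm Apt 3", "198 whitten", "whitten"]:
    print(repr(up_to_three_words(sample)))
```

Because the pattern is not anchored with ^, the search simply starts at the first non-whitespace character, which is what drops the leading spaces.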
Thank you. That worked.