
Is there a way using a regular expression to split a string of text such as an address by a number of words/spaces? Or even more functional, split a line at the first space after a certain number of characters?

If you want to split the attribute at every space, you can use the AttributeSplitter, with a space as the delimiter.

If you want to be more specific, you can use a StringSearcher with a regular expression of something like this, but replace the _ with spaces:

^[^_]{3,}_

This will start at the beginning of the string and find the first run of 3 or more characters other than a space, followed by a single space. You can replace the 3 with any number. You can then use the string functions within an AttributeManager to split the string at that position.
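
As a rough illustration outside FME, here is a minimal Python sketch of that pattern with the underscores replaced by real spaces. Python's `re` module stands in for the StringSearcher, and the sample text and variable names are assumptions made up for the demonstration.

```python
import re

# Sample text, assumed for illustration only.
text = "Ideal Place Construction 1234 North Fulton Street"

# Same pattern as above, with the underscores replaced by spaces.
match = re.search(r"^[^ ]{3,} ", text)

if match:
    head = text[:match.end() - 1]  # matched text without its trailing space
    tail = text[match.end():]      # everything after that space
    print(head)
    print(tail)
```

Because `[^ ]` cannot match a space, the match always ends at the first space in the string. With a larger count such as 19, the pattern only matches when the first word is itself at least that long.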


Basically I need to split a text string at the first space after 19 characters.

I was thinking of using a StringLengthCalculator to find the length of the strings and only send the ones longer than 20 characters through the necessary transformers. Those go through an AttributeSplitter that splits the line at the first space after the 19th character. Then I calculate the length of the string in _list(1), and if it is longer than 20 characters I send it through another series of identical transformers to split that line at the first space after the 19th character. I continue doing this until I have however many lines of text to manipulate.

For example.

Address = Ideal Place Construction 1234 North Fulton Street Pickerington Ohio 12345 USA

- would be split into -

_list(0) = Ideal Place Construction

_list(1) = 1234 North Fulton Street

_list(2) = Pickerington Ohio 12345

_list(3) = USA

I could then rebuild the string using a StringConcatenator to produce:

Ideal Place Construction

1234 North Fulton Street

Pickerington Ohio 12345

USA


Hi @gmbutler2, is this question related to "How do I find the first space?"

Hi @gmbutler2, the StringReplacer with this setting transforms a single line address string into your desired multi-line string at once.

  • Mode: Replace Regular Expression
  • Text To Replace: (.{19,}?)\s
  • Replacement Text: \1<newline character>

Alternatively, this string expression does the trick too, assuming that an attribute called "address" stores the original single-line address string.

@ReplaceRegEx(@Value(address),(.{19,}?)\s,\1
)
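
For readers who want to try the pattern outside FME, here is a minimal Python sketch of the same replacement. It assumes Python's `re.sub` as a stand-in for the StringReplacer; the function name `wrap_after` and its `min_chars` parameter are hypothetical, and the sample address is the one from the question.

```python
import re


def wrap_after(text, min_chars=19):
    """Replace the first whitespace after at least min_chars characters
    with a newline, repeating along the string (hypothetical helper)."""
    # Lazily capture min_chars or more characters, then the next whitespace.
    pattern = r"(.{" + str(min_chars) + r",}?)\s"
    # Keep the captured text and put a newline where the whitespace was.
    return re.sub(pattern, r"\1\n", text)


address = ("Ideal Place Construction 1234 North Fulton Street "
           "Pickerington Ohio 12345 USA")
print(wrap_after(address))
```

The lazy quantifier `{19,}?` is what makes each match stop at the first whitespace past the 19th character instead of running to the last one in the string. The trailing "USA" is left alone because fewer than 19 characters remain.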

WOW @takashi. That worked like a charm. Thanks a bunch. That fixes a really big headache for me.

 


This worked for me too. I was looking for a way to split a large piece of XML, in order to be able to insert it with a PLSQL

