Question

Splitting a long text_line_data attribute every 100 characters for output

  • 14 March 2024
  • 7 replies

Badge +1

Hi

I have a long line of text_line_data that contains payment details.

I want to save the data to a data or text file, but with the body of the data split every 100 characters, so that each payment detail (and therefore each person's output) is on a new line until there are no more details.

I have used string padders to get each attribute and testers to create an attribute containing payment details from 1-100, 100-200, 200-300 etc. This works but is not dynamic. Is there an easier way to split text every 100 characters onto a new line? Any help or hints would be appreciated.

Basically, I’d like to output the following long attribute text_line_data...

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH      1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     

to separate lines 100 characters long, like this:

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH      

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     

1234567890123456789012345678901    00000001234TEXTT EXT TEXTTEX 01234            MR & MRS SMITH     


7 replies

Userlevel 4
Badge +35

Split the string using a Regular Expression.

  1. Use a StringReplacer in RegEx mode to put a delimiter (I used |) after every 100th character
  2. Use an AttributeSplitter with the delimiter to create a list
  3. Use a ListExploder to split the list into individual features
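For readers who want to see the logic of the three steps above outside of FME, here is a minimal Python sketch. It only mirrors the idea (insert a delimiter after every 100 characters, split on it, treat each piece as its own record); it does not reproduce the transformers themselves. The function name split_with_delimiter and the choice to drop an empty trailing piece are assumptions made for illustration.

```python
import re


def split_with_delimiter(text: str, size: int = 100, delimiter: str = "|") -> list[str]:
    """Hypothetical helper: mark every `size` characters, then split on the mark."""
    # Step 1: append the delimiter after each full run of `size` characters.
    # A function is used as the replacement so the delimiter is taken literally.
    marked = re.sub(
        r"(.{%d})" % size,
        lambda m: m.group(1) + delimiter,
        text,
        flags=re.DOTALL,
    )
    # Step 2: split on the delimiter to get a list of pieces.
    pieces = marked.split(delimiter)
    # Step 3: each non-empty piece is one output record. When the text length
    # is an exact multiple of `size`, the final delimiter leaves an empty
    # piece, which is dropped here.
    return [piece for piece in pieces if piece != ""]


if __name__ == "__main__":
    sample = "A" * 100 + "B" * 100 + "C" * 40
    for piece in split_with_delimiter(sample):
        print(len(piece), piece[:10])
```

Note that this approach relies on the delimiter never occurring in the data itself; if | can appear in the payment details, a different delimiter would be needed.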

 

Userlevel 3
Badge +12

Hi @scottmacdonald 

One possible approach is to use a PythonCaller to split the string in groups of 100.

Attached is a sample workspace.

 

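import fme
import fmeobjects


class FeatureProcessor(object):

    def input(self, feature: fmeobjects.FMEFeature):
        # Read the long string from the incoming feature.
        text = feature.getAttribute('text_line_data')
        group_size = 100

        # Slice the string into consecutive chunks of group_size characters.
        # The final chunk is kept even when it is shorter than group_size.
        output_list = [text[i:i + group_size] for i in range(0, len(text), group_size)]

        # Store the chunks as a list attribute and emit the feature.
        feature.setAttribute('_output_list{}', output_list)
        self.pyoutput(feature)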
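The slicing used in the PythonCaller can be tried in plain Python, without FME, to check how it treats a final chunk shorter than 100 characters. The sketch below is only an illustration: the names chunk_text and to_output_lines and the sample data are made up, and joining the chunks with newlines is just one way to get one record per line in a text file.

```python
def chunk_text(text: str, size: int = 100) -> list[str]:
    """Return consecutive slices of `text`, each at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def to_output_lines(text: str, size: int = 100) -> str:
    """Join the chunks with newlines so each record sits on its own line."""
    return "\n".join(chunk_text(text, size))


if __name__ == "__main__":
    record = "1234567890" * 10             # one 100-character record
    sample = record * 2 + "TRAILING"       # two full records plus a short tail
    lines = to_output_lines(sample).split("\n")
    print(len(lines))                      # 3
    print([len(line) for line in lines])   # [100, 100, 8]
```

The short tail is kept as its own line, which is the behaviour the list comprehension in the PythonCaller relies on.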

 

Userlevel 3
Badge +12

I believe you can also use a StringSearcher and a ListExploder. In the StringSearcher you can put “.{100}” and then under Advanced create a list for All Matches. The ListExploder can then explode that newly created list.

Edit to add screenshot example:

 

Userlevel 4
Badge +35

Unfortunately, with the “.{100}” pattern, when the last part is less than 100 characters in length it is not added to the list.

Userlevel 3
Badge +12

That is true. Where the final string segment may be shorter than 100 characters, your solution should work better, because it still provides that final segment as an output.
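The point about the last segment can be illustrated with Python's re module. This is a sketch of the regular-expression behaviour only: the pattern .{100} matches full 100-character runs, while a bounded pattern such as .{1,100} also picks up a shorter remainder. Whether the StringSearcher behaves identically with the second pattern is an assumption that should be tested in FME itself. The function names are hypothetical.

```python
import re


def full_matches_only(text: str, size: int = 100) -> list[str]:
    # Matches exactly `size` characters; a shorter remainder is not matched.
    return re.findall(r".{%d}" % size, text, flags=re.DOTALL)


def matches_with_remainder(text: str, size: int = 100) -> list[str]:
    # Matches between 1 and `size` characters, greedily, so the final
    # shorter segment is returned as well.
    return re.findall(r".{1,%d}" % size, text, flags=re.DOTALL)


if __name__ == "__main__":
    sample = "X" * 250
    print([len(m) for m in full_matches_only(sample)])       # [100, 100]
    print([len(m) for m in matches_with_remainder(sample)])  # [100, 100, 50]
```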

Badge +1

Thanks a lot for the answers, they all worked.

I ended up using the first method suggested (StringReplacer > AttributeSplitter > ListExploder), which worked without needing PythonCallers etc.

Userlevel 4
Badge +35

Thanks for the feedback, it's nice to know my solution worked for you.
