
Hello, I have a question:

I'm looking for a transformer to cut the value of an attribute into predefined pieces and save these pieces into new attributes.

In Pentaho there is a step called StringCutter (stringcutter.jpg).

Of course I can also use the AttributeCreator with different attribute values (e.g. @Substring(@Value(attribute),0,2)), but I find the StringCutter's presentation very clear.

I have tried an AttributeRangeFilter and several others, but I only get unfiltered output. Is there a transformer in FME similar to Pentaho's StringCutter?

The AttributeSplitter is probably what you're looking for. It can take either a delimiter or a format string; a format string is what you'll need here.


Just to expand on the above, as last time I looked it did not appear to be clearly documented.

If you have a 10-character string and want to cut it into pieces of 2, 4, and 4 characters, type 2s4s4s in the format string box.

The first 2 characters get stored in _list{0}, the first group of 4 in _list{1}, and the second group of 4 in _list{2}.
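To make the width-based behaviour concrete, here is a rough Python sketch of the same idea. It is only an emulation for illustration, not how FME implements the AttributeSplitter; the function name `split_fixed` and the sample value are made up, and options such as trimming are not modelled.

```python
import re

def split_fixed(value: str, format_string: str) -> list[str]:
    """Cut `value` into consecutive pieces; each number in a format
    string such as '2s4s4s' is taken as the LENGTH of one piece."""
    widths = [int(n) for n in re.findall(r"(\d+)s", format_string)]
    parts, pos = [], 0
    for width in widths:
        parts.append(value[pos:pos + width])  # next `width` characters
        pos += width                          # continue where this piece ended
    return parts

# Hypothetical 10-character value, cut into 2 + 4 + 4 characters.
parts = split_fixed("AB1234WXYZ", "2s4s4s")
for i, part in enumerate(parts):
    print(f"_list{{{i}}} = {part!r}")
```

Each piece starts where the previous one ended, so the numbers are lengths, not column positions.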


If you want to do this as the data is being read, I would suggest the CSV format reader (if there is a delimiter character between fields) or the CAT format reader (if the data is aligned to particular columns).


Have a look at the SubstringExtractor. It does exactly what you are describing.


Unfortunately there is no delimiter, so I have to split these items by their fixed positions in the attribute.

The content of an attribute:

000413FMEworldstree <89 spaces> 201504200000000000BooLax main branch 06

I have to use 13 SubstringExtractors to split the information correctly:

00 (SE1)
0413 (SE2)
FMEworldstree (SE3)
<spaces> (SE4 to SE9)*
20150420 (SE10)
0000000000 (SE11)
BooLax main branch (SE12)
06 (SE13)

 

I have tried the AttributeSplitter with a format string (2s4s44s84s104s108s116s118s126s127s147s149s254s).

Because of the <spaces>*, the output of the format string no longer matches the data. Varying the other parameters does not help.

The list is represented as:

list0 00
list1 0413
list2 FMEworldstr
list4 201504200000000000BooLa
list5 x main branch 06
list6to13 remain empty

So my best option for now seems to be the SubstringExtractor.

Thanks all for your input.


Hi @perry, I believe the AttributeSplitter with the Format String setting is the best way to split a string into multiple fixed-length parts. The numbers in a format string represent the NUMBER OF CHARACTERS in each part. For example, to split a string of 10 characters into 3 parts of 3, 5, and 2 characters, the format string should be 3s5s2s. I would recommend checking again whether your format string matches the actual format specification.
Yes, agreed. A single AttributeSplitter should do it with the correct format string.
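One way a format string can go wrong is by entering column positions where widths are expected; the thread does not say whether that was the actual mistake here. As a hedged sketch, the Python below converts cumulative end columns into widths and builds the format string from them. The helper names are hypothetical, and the end columns 2, 6, and 19 are derived only from the first three fields of the sample (00, 0413, FMEworldstree); the widths of the remaining fields are not given in the thread.

```python
def widths_from_end_positions(ends):
    """Turn cumulative 1-based end columns (e.g. 2, 6, 19)
    into per-field widths (2, 4, 13)."""
    widths, prev = [], 0
    for end in ends:
        if end <= prev:
            raise ValueError("end positions must be strictly increasing")
        widths.append(end - prev)  # width = distance from previous end column
        prev = end
    return widths

def to_format_string(widths):
    """Build a width-based format string such as '2s4s13s'."""
    return "".join(f"{w}s" for w in widths)

# End columns of the first three sample fields: 00 | 0413 | FMEworldstree
ends = [2, 6, 19]
widths = widths_from_end_positions(ends)
print(widths)                    # [2, 4, 13]
print(to_format_string(widths))  # 2s4s13s
```

Summing the widths should give the total length of the attribute value, which is a quick sanity check before pasting the string into the transformer.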

 

 


Yes, I made a mistake in the format string. After correcting the AttributeSplitter parameter, it works. Thanks all.

