Question

How to expose the characters held within parentheses

  • 17 November 2017
  • 9 replies
  • 31 views

Badge

I'm working with data held in an attribute that contains strings of varying length. Within each string, the data is grouped in pairs of parentheses. An example of the data would be (66) (12.36) (534) (1.6667)

The number of numeric characters within the parentheses differs across the data, hence this post. (I had thought I'd resolved my issue by using a series of stringextractor transformers.)

I would normally take the data into Excel, replace the ( with a blank, and then perform a text to columns on the ( character, but I'm trying to learn more about FME.

I require 4 new attributes to contain only:

Attribute 1 = 66

Attribute 2 = 12.36

Attribute 3 = 534

Attribute 4 = 1.6667

I'm certain that this will be possible and have read some threads on regex. Unfortunately, I'm not familiar with this language and have so far failed to use the StringSearcher or StringReplacer to resolve my query.

I'm sure other people have come across similar data cleansing issues. I'm using FME Desktop 2017.


9 replies

Userlevel 5

You can use a StringSearcher with the following regex:

(\d+(\.\d+)?)

0684Q00000ArLukQAF.png

Note how in the StringSearcher you'll need to specify the list name for "All matches list name" under Advanced, e.g. "_all". The list will then contain one element for each number (float or integer) found in your data.

You can then use the ListElementExtractor (from the FME Hub) to rename the list elements, e.g.:

0684Q00000ArLuxQAF.png

Result:

`Attribute1' has value `66'
`Attribute2' has value `12.36'
`Attribute3' has value `534'
`Attribute4' has value `1.6667'
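
For readers who want to try the pattern outside FME, here is a minimal Python sketch of the same idea. It is only an illustration, not what the StringSearcher does internally: it applies the regex above to the sample string from the question, collects every match, and names the results Attribute1 to Attribute4. The helper names extract_numbers and to_attributes are hypothetical and chosen just for this example.

```python
import re

# Same pattern as above: one or more digits, optionally followed by
# a decimal point and more digits.
NUMBER = re.compile(r"(\d+(\.\d+)?)")


def extract_numbers(text):
    """Return every number found in text, in order, as strings."""
    # group(1) is the whole number; group(2) is only the optional ".dd" part.
    return [match.group(1) for match in NUMBER.finditer(text)]


def to_attributes(values, prefix="Attribute"):
    """Name the values Attribute1, Attribute2, ... (hypothetical helper)."""
    return {f"{prefix}{i}": value for i, value in enumerate(values, start=1)}


sample = "(66) (12.36) (534) (1.6667)"
attributes = to_attributes(extract_numbers(sample))
for name, value in attributes.items():
    print(f"{name} = {value}")
```

Because the pattern only looks for digits and an optional decimal part, the parentheses and spaces never need to be removed explicitly.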
Badge

Dave, thanks for your help (again). I think I need a few more steps, though. I've placed the StringSearcher and the ListElementExtractor as you suggest. When I place an inspector on the Output of the ListElementExtractor and run the workbench, it fails. The log states "Undefined macro 'NEW_ATTRIBUTES' dereferenced in file"

Do I need to place 4 new AttributeCreator transformers and name them Attribute1, Attribute2 etc.? If so, do I place these before the ListElementExtractor or after it? How do I link them?

I really appreciate your support with this.

(By the way, I always place an inspector when building workbenches so that I can check the results.)

Userlevel 5
Can you post a screenshot of your ListElementExtractor settings?
Badge

Here is my copy

Userlevel 5

 

I can't see anything.
Badge

listelementbuilder.jpg

 

 

Userlevel 5

See the attached template for an example: listelementextractor.fmwt

Tested with FME 2017.1.1.0

Userlevel 3
Badge +17
Unfortunately, it's a bug in FME 2017.0. If you can, upgrade FME to the latest version.

Alternatively, you can use the AttributeRenamer instead of the ListElementExtractor to avoid the error.

Badge

Hi @jez,

You can also use the AttributeSplitter with Delimiter or Format String set to )(. This will generate a list attribute, and you can then create attributes from the list items if necessary. The last step is to get rid of the ( in the very first attribute and the ) in the last attribute. The StringReplacer will help with this task.
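
As a rough illustration of this split-then-clean approach, here is a small Python sketch. It is not FME code, and the function name split_groups is hypothetical. One assumption to flag: the sample string in the question has a space between the groups, so the sketch splits on ") (" by default rather than ")(". Pass the delimiter that matches your own data.

```python
def split_groups(text, delimiter=") ("):
    """Split a string like "(66) (12.36)" into its bare values."""
    # Step 1: split between the groups. The default delimiter assumes
    # a single space between ")" and "(" as in the question's sample.
    parts = text.strip().split(delimiter)
    # Step 2: only the first and last parts still carry a parenthesis,
    # but stripping every part is harmless and keeps the code short.
    return [part.strip("() ") for part in parts]


print(split_groups("(66) (12.36) (534) (1.6667)"))
print(split_groups("(66)(12.36)", delimiter=")("))
```

Compared with the regex approach, this keeps whatever sits between the parentheses, including non-numeric text, which may or may not be what you want.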
