Skip to main content


I have spent hours
re-engineering my data but couldn’t find a best resolve to automate data to
provide statistics. Thinking, I could
have saved myself a lot of trouble with a simple transformer with set logic
using FME.

 


 


I have a range of attributes containing values
such as "Class1, Class1a, Class3, Class4a". values are picked up from
a drop down list by the surveyors, that sits in an attributed separated by
comma (,).


 


I need to count the number of items in each
attribute, assuming that they were separated by commas. So, the transformer
would need to return the value 4 for my example above and for various other
attributes using the same logic.


I need to use the logic
"If the cell is blank, value= 0, if the cell is not blank, but does not
contain a comma, value=1, otherwise, value= (number of commas +1) i.e. read the
string, if there are two commas in the string, then there must be three values,
and so on.

 


 


Logic is clear of what I'm trying to achieve but, I'm struggling on the bit that
counts the number of commas found in the string.


Anyone got an idea how and which FME transformers can
be applied ?

 


All help will be gratefully appreciated.

Try a StringSearcher with the following regex which is based on searching for all the substrings that does NOT contain a comma:

(o^,]+)

Also define a list name for the matches, e.g.

0684Q00000ArLKmQAN.png

You can then use a ListElementCounter to find the number of comma-separated values in your input string.

If you need this functionality in several places, it's fairly easy to pack it into a custom transformer.


Could you use an attributeSplitter (separator: comma) to create a list and then a listElementCounter to get the number of items in the list?


Thanks David. Magic worked well.

So much valuable. I appreciate


Yes, that's the way to go. Search the strings and store the matched items. Then use the stored items as input to ListElementCounter.

This made my life easier.


I agree that it's a good approach to split the string into individual items then count the number of them. 

Alternatively, you can also replace each item (including trailing delimiter) with a single character (e.g. 'x') then get the length (i.e. number of characters) of the resulting string ('xxx...'), which you can consider as the number of items. For example, this expression returns the number of comma-separated items in the value of 'Building Classification' attribute.

@StringLength(@ReplaceRegEx(@Value(Building Classification),".+?(,|$)",x))

Try a StringSearcher with the following regex which is based on searching for all the substrings that does NOT contain a comma:

(o^,]+)

Also define a list name for the matches, e.g.

0684Q00000ArLKmQAN.png

You can then use a ListElementCounter to find the number of comma-separated values in your input string.

If you need this functionality in several places, it's fairly easy to pack it into a custom transformer.

This is brilliant.  Exactly what I was looking for!

I have Postcode data with an attribute that has either YES or NO in it.  I'm aggregating data by Postcode Sector, then I want to report how many YESs and how many NOs there are in that Sector.  I spent hours and got nowhere putting the values into a List.  But concatenating the values in the Aggregator then using this answer did the job in 5 minutes!


This is brilliant. Exactly what I was looking for!

I have Postcode data with an attribute that has either YES or NO in it. I'm aggregating data by Postcode Sector, then I want to report how many YESs and how many NOs there are in that Sector. I spent hours and got nowhere putting the values into a List. But concatenating the values in the Aggregator then using this answer did the job in 5 minutes!

If you want to count occurrences of values in a list, the ListHistogrammer is what you want


If you want to count occurrences of values in a list, the ListHistogrammer is what you want

I tried that, having found the suggestion in a different question. But I couldn't get it to work. I just got blank values. I could have persevered but this worked instantly:

ListHistogrammer may do the counting better, but you've still got to know which element of the _histogram List NO and YES are in to be able to extract the count. And if you get other values in the data, these could upset your _histogram List, for example:

_histogram{0}.value = 'FISH'

_histogram{0}.count = 3

_histogram{1}.value = 'NO'

_histogram{1}.count = 2

_histogram{2}.value = 'YES'

_histogram{2}.count = 1

If it was only YES and NO in the data, I could probably sort the List, then assume element 0 was NO and element 1 was YES, but FISH puts a spanner in the works. If I have to test the value of each _histogram element to see if it's NO or YES, I will have to work out how many elements there are and loop through the List, testing each one.

The benefit of david_r's solution is that you only count the things you want to count, and don't have to worry about rogue values. That's also the downside though. The sum of the counts may not equal the total number of aggregated features. For example, if I aggregate 10 features but one of them has FISH instead of YES or NO, I will end up with say 4 YESs and 5 NOs, which is one short of the total. But the data could be validated upstream of this part of the process.


Reply