Solved

Most efficient way to count multiple distinct values

  • October 13, 2016
  • 13 replies
  • 396 views

ebygomm
Influencer

I have a workspace producing various outputs. I'd like to produce a further output that reports the number of distinct customers and the number of distinct products for each output, but this appears to be either convoluted (multiple steps and several StatisticsCalculators) or slow (a count distinct in the InlineQuerier). Am I missing a trick?

Best answer by larry

This custom transformer should do the trick.


13 replies

gio
Contributor
  • Contributor
  • 2252 replies
  • October 13, 2016

You can use a ListBuilder followed by a ListHistogrammer.

Group by the attribute whose distinct values you want counted.


ebygomm
Influencer
  • Author
  • Influencer
  • 3430 replies
  • October 13, 2016

I don't think that will work; the two things are independent.

E.g. in the following, I would want the customer value to be 2 and the product value to be 3.
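
The requirement described here (each attribute counted independently within each output) can be sketched in plain Python, roughly the kind of logic one might place in a PythonCaller. This is only a sketch under assumptions: the `output`, `customer` and `product` attribute names and the `distinct_counts` helper are hypothetical, and the sample records are made up to match the stated result of 2 customers and 3 products, since the original example is not available.

```python
from collections import defaultdict


def distinct_counts(records, group_key, attributes):
    """Count distinct values of each attribute independently, per group."""
    # One set of seen values per attribute, per group.
    seen = defaultdict(lambda: {attr: set() for attr in attributes})
    for record in records:
        group = record[group_key]
        for attr in attributes:
            value = record.get(attr)
            if value is not None:  # skip missing values
                seen[group][attr].add(value)
    return {
        group: {attr: len(values) for attr, values in per_attr.items()}
        for group, per_attr in seen.items()
    }


# Hypothetical sample data: 2 distinct customers, 3 distinct products.
records = [
    {"output": "A", "customer": "C1", "product": "P1"},
    {"output": "A", "customer": "C1", "product": "P2"},
    {"output": "A", "customer": "C2", "product": "P3"},
    {"output": "A", "customer": "C2", "product": "P1"},
]

result = distinct_counts(records, "output", ["customer", "product"])
print(result)
```

Each record is read once and only the distinct values are held in memory, which is why a set-based pass tends to be cheaper than chaining several grouping steps.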


itay
Supporter
  • Supporter
  • 1442 replies
  • October 13, 2016

Python to the rescue? Or is it R time?


  • 178 replies
  • October 13, 2016

Hi @egomm,

When I am in this situation I use a PythonCaller to create the statistics. If you need some help with that, just let me know!


  • 173 replies
  • Best Answer
  • October 13, 2016

This custom transformer should do the trick.


itay
Supporter
  • Supporter
  • 1442 replies
  • October 13, 2016

Nice, so it was Python to the rescue after all.

  • 173 replies
  • October 13, 2016

Just uploaded it to the Hub: AttributeValueCounter

takashi
Celebrity
  • 7843 replies
  • October 14, 2016

I also think that Python scripting would be an efficient solution, but if the destination dataset is a database, SQL querying might also be worth trying.
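
For the database route, the query would typically use `COUNT(DISTINCT ...)` with a `GROUP BY`. The sketch below runs it against an in-memory SQLite database from Python purely for illustration; the `sales` table, its `output`, `customer` and `product` columns, and the rows are all hypothetical, and the real destination database may use different names or a different SQL dialect.

```python
import sqlite3

# In-memory database with a hypothetical table standing in for the destination dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (output TEXT, customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "C1", "P1"),
        ("A", "C1", "P2"),
        ("A", "C2", "P3"),
        ("B", "C3", "P1"),
        ("B", "C3", "P1"),
    ],
)

# Each COUNT(DISTINCT ...) is evaluated independently within each group.
rows = conn.execute(
    "SELECT output, COUNT(DISTINCT customer), COUNT(DISTINCT product) "
    "FROM sales GROUP BY output ORDER BY output"
).fetchall()
conn.close()

print(rows)
```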


ebygomm
Influencer
  • Author
  • Influencer
  • 3430 replies
  • October 14, 2016

Perfect, thanks.

gio
Contributor
  • Contributor
  • 2252 replies
  • October 14, 2016

@egomm

Your example can easily be solved with a Matcher followed by a StatisticsCalculator grouped by fme_feature_type.

Put it in a custom transformer for ease of use. When using a custom transformer, explode the features so you can use attr_name as input.

No need for Python or anything else.

And if it's in a database, then of course you would use SQL.


  • 17 replies
  • May 2, 2018

I am trying to use this transformer to count all the unique values in a column, but it doesn't seem to work. I need some help. My data is below.

A
1
1
1
2
3
0
0

Output should look like:

0 -> 2
1 -> 3
2 -> 1
3 -> 1


takashi
Celebrity
  • 7843 replies
  • May 2, 2018

Hi @sobanmughal, the AttributeValueCounter counts the number of distinct values for each attribute, so I don't think it suits your requirement. You can just use the StatisticsCalculator here. Set these parameters and check the features output from the Summary port.

 

  • Group By: A
  • Attributes to Analyze: A
  • Total Count Attribute: <an attribute name which will store desired count value>
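
The difference between the two kinds of count can be illustrated in plain Python using the column A data from the question. This is only a sketch of the logic, not of any FME API: the per-value occurrence count is what grouping by A produces, while the number of distinct values is a single figure for the whole column.

```python
from collections import Counter

# Column A from the question.
values = [1, 1, 1, 2, 3, 0, 0]

# Occurrences of each value: the result of counting within groups of A.
occurrences = Counter(values)
for value in sorted(occurrences):
    print(f"{value} -> {occurrences[value]}")

# Number of distinct values in the column: one number, not a breakdown.
distinct_count = len(occurrences)
print(distinct_count)
```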

  • 17 replies
  • May 2, 2018
Thanks takashi, I have found another custom transformer that can do the same thing.

UniqueValueLogger
https://hub.safe.com/transformers/uniquevaluelogger