Skip to main content
Solved

Group By and pick up maximum value rows


Forum|alt.badge.img

Hi guys,

I have data which look like this:

|ID|Number|

|1|1|

|1|3|

|2|1|

|3|1|

|3|2|

|3|3|

I need to group them by ID and pick up only rows with highest Number value. To get such data:

|ID|Number|

|1|3|

|2|1|

|3|3|

Any idea how to accomplish this?

Thanks

Best answer by ebygomm

Sort by number in descending order and then a duplicate filter on the ID would also work. ID with highest number value would be output via Unique port

View original
Did this help you find an answer to your question?

8 replies

david_r
Celebrity
  • August 8, 2017

Here's one possible solution:

  • Aggregator with Group By on ID, generate list "Numbers"
  • ListSorter descending on list Numbers{}
  • ListIndexer to retrieve item 0 (highest value) from list Numbers{}

Am sure there are others.


Forum|alt.badge.img
  • Author
  • August 8, 2017
david_r wrote:

Here's one possible solution:

  • Aggregator with Group By on ID, generate list "Numbers"
  • ListSorter descending on list Numbers{}
  • ListIndexer to retrieve item 0 (highest value) from list Numbers{}

Am sure there are others.

 

It seems to work fine after fast check. Thanks

 


ebygomm
Influencer
Forum|alt.badge.img+32
  • Influencer
  • Best Answer
  • August 8, 2017

Sort by number in descending order and then a duplicate filter on the ID would also work. ID with highest number value would be output via Unique port


Forum|alt.badge.img
  • Author
  • August 8, 2017
ebygomm wrote:

Sort by number in descending order and then a duplicate filter on the ID would also work. ID with highest number value would be output via Unique port

 

Seems simpler, didn't know about this highest value in duplicated filter funcianality.

david_r
Celebrity
  • August 8, 2017
submi wrote:

 

Seems simpler, didn't know about this highest value in duplicated filter funcianality.
It's because the DuplicateFilter preserves the input order, which in this case has been established by the Sorter. Pretty elegant solution, I agree.

ebygomm
Influencer
Forum|alt.badge.img+32
  • Influencer
  • August 8, 2017

You could also use a sorter to sort in descending order followed by the sampler with a group by on the ID field and a sampling rate of 1 with a sampling type of First N Features


takashi
Influencer
  • August 8, 2017

I agree that the combination of a Sorter and a DuplcateFilter (or a Sampler) is a very elegant way, but there should be several ways as @david_r mentioned. These four ways flashed into my mind so far.

  • Basic statistics: StatisticsCalculator (Group By: ID, Attributes to Analyze: Number, Maximum Attribute: Number)
  • SQL application: InlineQuerier
  • XQuery & JSON application: Sampler(or DuplicateFilter) + JSONTemplater + JSONFlattener
  • Tcl application: Aggregator + TclCaller

FYI.


mark2atsafe
Safer
Forum|alt.badge.img+44
  • Safer
  • August 8, 2017
takashi wrote:

I agree that the combination of a Sorter and a DuplcateFilter (or a Sampler) is a very elegant way, but there should be several ways as @david_r mentioned. These four ways flashed into my mind so far.

  • Basic statistics: StatisticsCalculator (Group By: ID, Attributes to Analyze: Number, Maximum Attribute: Number)
  • SQL application: InlineQuerier
  • XQuery & JSON application: Sampler(or DuplicateFilter) + JSONTemplater + JSONFlattener
  • Tcl application: Aggregator + TclCaller

FYI.

The StatisticsCalculator was what occurred to me first too. I like that way. But like you say there are so many methods.

 

 


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings