
Hi guys,

I have data which look like this:

|ID|Number|
|1|1|
|1|3|
|2|1|
|3|1|
|3|2|
|3|3|

I need to group the rows by ID and keep only the row with the highest Number value in each group, to get this result:

|ID|Number|
|1|3|
|2|1|
|3|3|

Any idea how to accomplish this?

Thanks

Here's one possible solution:

  • Aggregator with Group By on ID, generate list "Numbers"
  • ListSorter descending on list Numbers{}
  • ListIndexer to retrieve item 0 (highest value) from list Numbers{}

I'm sure there are others.
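For anyone who wants to see the logic of those three steps outside the workbench, here is a minimal Python sketch. It only mirrors the idea (collect a list per ID, sort it descending, take item 0); it is not how the transformers are implemented, and the function name `max_via_lists` and the tuple layout are my own assumptions.

```python
from collections import defaultdict

# Sample data from the question, as (ID, Number) tuples.
rows = [(1, 1), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)]

def max_via_lists(rows):
    # "Aggregator" step: collect every Number into a list per ID.
    numbers_by_id = defaultdict(list)
    for row_id, number in rows:
        numbers_by_id[row_id].append(number)

    result = []
    for row_id, numbers in numbers_by_id.items():
        # "ListSorter" step: sort the list in descending order.
        numbers.sort(reverse=True)
        # "ListIndexer" step: item 0 is now the highest value.
        result.append((row_id, numbers[0]))
    return result

print(max_via_lists(rows))
```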



 

It seems to work fine after a quick check. Thanks.

 


Sorting by Number in descending order with a Sorter and then applying a DuplicateFilter on ID would also work. For each ID, the feature with the highest Number value is output via the Unique port.



 

Seems simpler. I didn't know the DuplicateFilter could be used to pick the highest value like this.

 

It's because the DuplicateFilter preserves the input order, which in this case has been established by the Sorter. Pretty elegant solution, I agree.
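The same reasoning can be sketched in Python: sort descending, then keep only the first feature seen for each ID. This is just an illustration of the order-preserving idea, assuming (ID, Number) tuples; `max_via_sort_and_dedupe` is a made-up name, not anything from FME.

```python
# Sample data from the question, as (ID, Number) tuples.
rows = [(1, 1), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)]

def max_via_sort_and_dedupe(rows):
    # "Sorter" step: highest Number first.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)

    # "DuplicateFilter" step: the first feature seen for an ID is kept
    # (Unique port); later ones with the same ID are dropped (Duplicate port).
    seen_ids = set()
    unique = []
    for row_id, number in ordered:
        if row_id not in seen_ids:
            seen_ids.add(row_id)
            unique.append((row_id, number))
    return unique

print(sorted(max_via_sort_and_dedupe(rows)))
```

Because the first feature per ID is the one that survives, the sort direction decides whether you get the maximum or the minimum.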

You could also use a Sorter to sort in descending order, followed by a Sampler with Group By set to the ID field, a Sampling Rate of 1, and a Sampling Type of First N Features.
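A rough Python equivalent of that idea, in case it helps: after sorting, keep at most N features per ID. With N = 1 it gives the same result as the DuplicateFilter approach, and a larger N would give the top N per group. The function name `first_n_per_group` and its parameter `n` are assumptions for this sketch only.

```python
from collections import Counter

# Sample data from the question, as (ID, Number) tuples.
rows = [(1, 1), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)]

def first_n_per_group(rows, n=1):
    # "Sorter" step: highest Number first.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)

    # "Sampler" step: pass through the first n features of each ID group.
    taken = Counter()
    sampled = []
    for row_id, number in ordered:
        if taken[row_id] < n:
            taken[row_id] += 1
            sampled.append((row_id, number))
    return sampled

print(sorted(first_n_per_group(rows, n=1)))
```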


I agree that the combination of a Sorter and a DuplicateFilter (or a Sampler) is a very elegant way, but there are several other ways, as @david_r mentioned. These four have come to mind so far.

  • Basic statistics: StatisticsCalculator (Group By: ID, Attributes to Analyze: Number, Maximum Attribute: Number)

  • SQL application: InlineQuerier

  • XQuery & JSON application: Sampler (or DuplicateFilter) + JSONTemplater + JSONFlattener

  • Tcl application: Aggregator + TclCaller

FYI.
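For the SQL option, the query is essentially a MAX with GROUP BY. Here is a small sketch that runs such a query through Python's built-in sqlite3 module so it can be tried outside FME. The table name `data` is hypothetical, and the exact statement you would type into an InlineQuerier depends on how its input tables are named.

```python
import sqlite3

# Sample data from the question, as (ID, Number) tuples.
rows = [(1, 1), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)]

def max_via_sql(rows):
    # In-memory database; "data" is a hypothetical table name.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (ID INTEGER, Number INTEGER)")
        conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
        # One output row per ID, carrying the highest Number in that group.
        cursor = conn.execute(
            "SELECT ID, MAX(Number) AS Number FROM data GROUP BY ID ORDER BY ID"
        )
        return cursor.fetchall()
    finally:
        conn.close()

print(max_via_sql(rows))
```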



The StatisticsCalculator was what occurred to me first too. I like that way. But, as you say, there are many methods.
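Conceptually, the statistics route is a single pass that tracks the maximum per group, with no sorting needed. A minimal Python sketch of that idea follows (the name `max_per_group` is my own). Note that it yields only the maximum value per ID, not the whole original row, which is fine here because the rows carry nothing else.

```python
# Sample data from the question, as (ID, Number) tuples.
rows = [(1, 1), (1, 3), (2, 1), (3, 1), (3, 2), (3, 3)]

def max_per_group(rows):
    # Track the running maximum Number for each ID in one pass.
    maxima = {}
    for row_id, number in rows:
        if row_id not in maxima or number > maxima[row_id]:
            maxima[row_id] = number
    return maxima

print(max_per_group(rows))
```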

 

 

