Skip to main content

Hi I have the following table after doing a database query

ID overlaps Count Email

0 2 1 PublicOps@sts.com

0 2 1 PrivateOps@sts.com

1 7 3 PublicOps@sts.com

1 7 3 PrivateOps@sts.com

1 7 5 PublicOps@sts.com

1 7 5 PrivateOps@sts.com

2 5 3 PublicOps@sts.com

2 5 5 PrivateOps@sts.com

For each ID, I need to retain the records with maximum count i.e.

for example above, I need to output to be

 

ID overlaps Count Email

0 2 1 PublicOps@sts.com

0 2 1 PrivateOps@sts.com

1 7 5 PublicOps@sts.com

1 7 5 PrivateOps@sts.com

2 5 5 PrivateOps@sts.com

 

Following this, I need to join output with table 2 to determine if the count is higher than the level value

for each ID. If it does retain in output, otherwise discard record from table 1

 

Table 2

--------

ID Level

0 1

1 3

2 3

 

 

Hence output expected is:

 

ID overlaps Count Email

1 7 5 PublicOps@sts.com

1 7 5 PrivateOps@sts.com

2 5 5 PrivateOps@sts.com

 

I have tried using a sorted a Sorter group by ID/Count/Address , followed by a Sampler

but I am not getting the desired output

Any suggestions?

 

A possible ways is: calculate the maximum of Count value for each ID with the StatisticsCalculator, merge Level to the features with the FeatureJointer, then filter them with the Tester according the criteria - Count = max count AND Level < Count. The FeatureMerger or the DatabaseJoiner can also be used instead of the FeatureJointer.

0684Q00000ArJvLQAV.png

Alternatively, if the two tables belong to an identical database, you could get the result by executing a single SQL statement. e.g.

select a.* from table_1 as a
inner join (select ID, max(Count) as max_count from table_1 group by ID) as b
on a.ID = b.ID and a.Count = b.max_count
inner join table_2 as c
on a.ID = c.ID and c.Level < a.Count

 


Reply