
I have three attributes, A, B, and C, that could potentially contain duplicates. After finding the duplicates, I have to decide which records to keep based on attribute D. The scenario is as follows: if A, B, and C are the same, take the record where D is not null. If D is null for both records, take either one. If both records have a value in D, take both, unless the values are the same, in which case take just one. Help!

If the features do not have a geometry, this could work:

Use an Aggregator, group by A, B, C, creating a list for all other attributes.

Use the ListSorter on D (to move the object with a D value to the top of the list).

Then use the ListIndexer (index 0) and ListRemover to keep the first object in the list.

If you do have geometries, I would first use the GeometryExtractor to save the geometry in an attribute, and reverse that at the end using the GeometryReplacer.
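For readers who want to check the logic outside FME, here is a minimal Python sketch of the same idea: group by A, B, C, move records with a D value to the front, and keep the first one. The function name `keep_one_per_group` and the dict-based records are assumptions made for illustration; this is not what the transformers do internally. Note that keeping only the first element leaves one record per group, so two different D values in the same group would collapse to one.

```python
def keep_one_per_group(records):
    """Keep one record per (A, B, C) group, preferring a non-null D.

    `records` is assumed to be a list of dicts with keys A, B, C and D,
    where a null D is represented as None.
    """
    # Group records by (A, B, C), roughly what the Aggregator list does.
    groups = {}
    for rec in records:
        groups.setdefault((rec["A"], rec["B"], rec["C"]), []).append(rec)

    kept = []
    for members in groups.values():
        # Stable sort: records with a D value come first (False < True).
        members.sort(key=lambda r: r["D"] is None)
        # Keep only the first list element (index 0).
        kept.append(members[0])
    return kept
```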


Thanks @erik_jan! That worked beautifully. I was messing with the DuplicateFilter and Matcher, all without success. Here's my final workspace in case anyone happens upon this question.


Hi @tnarladni, if I understood the requirement correctly, the DuplicateFilter can be used like this.

Note: This assumes D always stores either a non-empty value or null. If D could store an empty string or could be missing, a minor change would be necessary, depending on how empty and missing values should be treated.
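The workspace screenshot is not reproduced in the text, so the following is not the DuplicateFilter setup itself. It is only a rough Python sketch of the full rule from the question, including the "keep both if D differs" case. The function name `dedupe_by_rule` is hypothetical, and the sketch assumes D is either None or a hashable, non-empty value, in line with the note above.

```python
def dedupe_by_rule(records):
    """Apply the rule from the question to a list of dict records.

    Within each (A, B, C) group:
    - if no record has a D value, keep any one record;
    - otherwise keep one record per distinct non-null D value.
    Assumes a null D is None and non-null D values are hashable.
    """
    groups = {}
    for rec in records:
        groups.setdefault((rec["A"], rec["B"], rec["C"]), []).append(rec)

    kept = []
    for members in groups.values():
        with_d = [r for r in members if r["D"] is not None]
        if not with_d:
            # D is null everywhere in this group: take any one record.
            kept.append(members[0])
            continue
        seen = set()
        for rec in with_d:
            # Keep the first record seen for each distinct D value.
            if rec["D"] not in seen:
                seen.add(rec["D"])
                kept.append(rec)
    return kept
```

If D could also be an empty string or missing, the `is not None` test is the place where that decision would have to be made.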



Edited the workflow (screenshot).

Ah...I forgot to sort D first, that was my problem.

I have the same problem, except that the deciding value (D above) is a timestamp, and I want to keep the record with the latest one.
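As a starting point, here is a minimal Python sketch of the "keep the latest" variant. It assumes the timestamp lives in D, is never null, and is stored as a comparable value such as a `datetime`; the function name `keep_latest` and the `ts_field` parameter are hypothetical. In FME terms this would presumably correspond to sorting on the timestamp in descending order before a DuplicateFilter keyed on A, B, C, but that mapping is an assumption, not a tested workspace.

```python
from datetime import datetime


def keep_latest(records, ts_field="D"):
    """Keep, per (A, B, C) group, the record with the latest timestamp.

    Assumes every record has a non-null, mutually comparable value
    in `ts_field` (for example datetime objects).
    """
    latest = {}
    for rec in records:
        key = (rec["A"], rec["B"], rec["C"])
        current = latest.get(key)
        # Replace the stored record only if this one is strictly newer.
        if current is None or rec[ts_field] > current[ts_field]:
            latest[key] = rec
    return list(latest.values())


# Illustrative data only.
sample = [
    {"A": 1, "B": 1, "C": 1, "D": datetime(2020, 1, 1)},
    {"A": 1, "B": 1, "C": 1, "D": datetime(2021, 6, 1)},
    {"A": 2, "B": 2, "C": 2, "D": datetime(2019, 3, 5)},
]
result = keep_latest(sample)
```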

 

 

