I've written a workspace in the 2015 version of FME, with five readers, which all have one column headed 'REFERENCE'. The transformer 'DuplicateRemover' is producing two outputs: 'Unique' and 'Duplicate', but the inspectors reveal that both outputs contain unique values. In 'Duplicate', there are both duplicate values and unique values. Why is this, and how can I resolve this to produce only duplicate values in the 'Duplicate' output?
As far as I know, the basic behavior hasn't changed in recent versions, so here is what the documentation says:
This transformer outputs to the Duplicate port any feature whose key attribute value is the same as one previously encountered. Any feature whose key value has not yet appeared is output via the Unique port.
The first feature with a unique key value is output via the Unique port – and subsequent features that have the same key value are output via the Duplicate port.
It works as a filter, and in FME 2016 it was renamed to DuplicateFilter (a feature is either Unique = first appearance, or Duplicate = all other appearances).
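To make that concrete, here is a minimal sketch of the first-seen logic described in the documentation quote above. It is plain Python, not FME code; the function name `split_first_seen` and the sample values are made up for illustration.

```python
def split_first_seen(values):
    """First occurrence of a key goes to 'unique'; every repeat goes to 'duplicate'."""
    seen = set()
    unique, duplicate = [], []
    for value in values:
        if value in seen:
            # Key was already encountered, so this is a repeat.
            duplicate.append(value)
        else:
            # First time this key appears.
            seen.add(value)
            unique.append(value)
    return unique, duplicate


# Hypothetical REFERENCE values, in the order the features arrive.
references = ["A1", "B2", "A1", "C3", "B2", "A1"]
unique, duplicate = split_first_seen(references)
print(unique)     # ['A1', 'B2', 'C3']
print(duplicate)  # ['A1', 'B2', 'A1']
```

Note that 'B2' shows up only once in the duplicate list even though it occurs twice in the input, because its first occurrence went to the unique list. That is why the Duplicate output can look as if it contains unique values.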
See this link for more info: DuplicateFilter
If you just want all duplicate values, using a matcher is a better option.
Thank you all for your help. It makes sense now why my Duplicate output is producing apparently unique values. I am currently working on using the Matcher transformer, and may need to get in touch again!
Unfortunately I'm still struggling with this issue. In essence, I want to produce a single list which features every instance of each duplicate, regardless of geometry.
Following the above discussion I rewrote my workspace, using the Matcher transformer, but found that it did not account for cases where my 'REFERENCE' attribute values were duplicated, but the geometry was unique.
Is there a non-geometry based version of the Matcher, or a way to use the Matcher but discount geometry?
If the answer to the above question is no, I feel I need a second transformer which connects the 'Matched' output to the 'NotMatched' output, and then compares the two to produce two outputs: one which contains all duplicate instances, and one which contains all cases which have no duplicates.
Thanks in advance.
With the Matcher you can choose no geometry matching.
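As a rough illustration of what matching on the attribute alone gives you, here is a plain Python sketch (not FME code, and not how the Matcher is implemented). The function name `split_by_key_count`, the dictionary layout, and the sample features are all assumptions for the example. Geometry is carried along but never compared.

```python
from collections import Counter


def split_by_key_count(features, key="REFERENCE"):
    """Return (matched, not_matched): every instance of a repeated key, and the singletons."""
    counts = Counter(feature[key] for feature in features)
    matched = [f for f in features if counts[f[key]] > 1]
    not_matched = [f for f in features if counts[f[key]] == 1]
    return matched, not_matched


# Hypothetical features: same REFERENCE can have different geometry.
features = [
    {"REFERENCE": "A1", "geometry": (0, 0)},
    {"REFERENCE": "B2", "geometry": (1, 1)},
    {"REFERENCE": "A1", "geometry": (5, 5)},
    {"REFERENCE": "C3", "geometry": (2, 2)},
]
matched, not_matched = split_by_key_count(features)
print([f["REFERENCE"] for f in matched])      # ['A1', 'A1']
print([f["REFERENCE"] for f in not_matched])  # ['B2', 'C3']
```

Both 'A1' features end up in the matched list even though their geometry differs, which is the single list of every duplicate instance asked for above.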