Question

Remove duplicates


aashnaparikh
Contributor

In the reader file, the full name “John Doe” appears multiple times.

 

I want Writer 1 / Output 1 to receive only the first John Doe entry, and the remaining John Doe entries to go to Writer 2 / Output 2.

 

Ex.

Name Address
John Doe 31 xxx st
John Doe 41 yyy st
John Doe 51 zzz st

 

The John Doe at 31 xxx st should go to Output 1, and the other two John Doe entries should go to Output 2.

2 replies

liamfez
Influencer
  • April 2, 2025

Sounds like the DuplicateFilter might do what you want.

You could also use a Sorter followed by a Sampler with Sampling Type set to First N Features and a Sampling Rate of 1, with Group Processing grouping on the “Name” attribute.

You could also use a Matcher; there are lots of ways to do this one. You just need to send the two different output ports to your two different writers and you should be good to go.


DanAtSafe
Safer
  • April 2, 2025

Hi @aashnaparikh, use a Sampler grouped by Name, with a Sampling Rate of 1 and Sampling Type ‘First N Features’.
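
For anyone who wants to see the logic spelled out, here is a minimal sketch in plain Python of what "first feature per Name goes one way, everything else goes the other" amounts to. It is not FME code and does not use any FME API; the function name `split_first_per_name` and the dictionary-based rows are assumptions made purely for illustration, using the sample data from the question.

```python
def split_first_per_name(rows, key="Name"):
    """Split rows into (first occurrence per key, all later occurrences).

    Hypothetical helper for illustration only; input order is preserved,
    so "first" means first in the order the rows arrive.
    """
    seen = set()
    first, rest = [], []
    for row in rows:
        value = row[key]
        if value in seen:
            rest.append(row)      # later occurrence: would go to Output 2
        else:
            seen.add(value)
            first.append(row)     # first occurrence: would go to Output 1
    return first, rest


# Sample data taken from the question
rows = [
    {"Name": "John Doe", "Address": "31 xxx st"},
    {"Name": "John Doe", "Address": "41 yyy st"},
    {"Name": "John Doe", "Address": "51 zzz st"},
]

output_1, output_2 = split_first_per_name(rows)
```

Because the split depends on arrival order, sort the features first if "first" should mean something other than the order in the reader file.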

