Question

Fuzzy search for duplicates in one column

  • July 3, 2019

jayqueue

Hello,

I have a dataset in which I want to find near-duplicate records. Example:

Name      Firstname  Address        ConcattedString
Michaels  John       Firststreet 5  Michaels_John_Firststreet 5
Michaels  Jon        Firststreet 5  Michaels_Jon_Firststreet 5
Mychael   John       Firtstreet 5   Mychael_John_Firtstreet 5

"ConcattedString" is a field I generated with AttributeCreator because I think it makes duplicates easier to find.

I don't want to remove the duplicates; I just want to show possible candidates as a "group".
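
To make clear what I mean by a "group", here is a rough sketch in plain Python (not FME) of the kind of result I'm after. It uses difflib.SequenceMatcher from the standard library. The function name group_similar, the 0.85 threshold, and the extra "Smith" row are assumptions made up for illustration only; they are not part of my real data or workspace.

```python
from difflib import SequenceMatcher


def group_similar(strings, threshold=0.85):
    """Greedy grouping: each string joins the first group whose first
    member is at least `threshold` similar; otherwise it starts a new group.

    Hypothetical helper; the threshold is an arbitrary assumption.
    """
    groups = []
    for s in strings:
        for group in groups:
            ratio = SequenceMatcher(None, s.lower(), group[0].lower()).ratio()
            if ratio >= threshold:
                group.append(s)
                break
        else:
            # No existing group was similar enough.
            groups.append([s])
    return groups


rows = [
    "Michaels_John_Firststreet 5",
    "Michaels_Jon_Firststreet 5",
    "Mychael_John_Firtstreet 5",
    "Smith_Anna_Mainroad 12",  # made-up non-duplicate for contrast
]

# Print each value with the id of the candidate group it landed in.
for group_id, members in enumerate(group_similar(rows)):
    for member in members:
        print(group_id, member)
```

With this threshold, the three "Michaels" variants should land in one group and the made-up row in its own. Comparing only against the first member of each group is a simplification, so the outcome can depend on row order.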

I experimented with FuzzyDuplicateRemover, Matcher, ... but had no luck.

I also read the article at https://knowledge.safe.com/articles/53183/data-qa-identifying-duplicate-attribute-values.html, but I still can't figure it out.

Is it even possible? If yes, can someone give me a little push in the right direction? :-)

I'm using FME 2019 (build 19253).

TIA

 

-Jonathan


1 reply

jayqueue
  • Author
  • July 3, 2019
