Question

Find similar records and merge them to one

  • March 3, 2022
  • 3 replies
  • 99 views

messagemauri
Contributor

I have a table called PROJECTS in which some records share similar names. The objective is to find these records and, if possible, merge them into one.

 

Sample

PROJECTS table with Field1, Field2, Field3

 

[Screenshot: FindSimilar]

In the table above, I want to look at Field3 and find similar names, such as "Paper Company", "Paper Company." and "Paper Co", and merge them into a single record. They are not exact duplicates, but they share a similar name. I can't hard-code "Paper Company" because many other records have the same kind of near-duplicate names.

 

Which transformers can I use to complete such a task?

 

Thank you.

 

3 replies

danilo_fme
Celebrity
  • Celebrity
  • 2077 replies
  • March 4, 2022

Hi @messagemauri​ 

 

I simulated your case here and got great results using the custom transformer FuzzyDuplicateRemover:

[Screenshot: Workspace_Duplicate]

As you can see, I set the similarity ratio to 70%.
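
For anyone who wants to see what a 70% similarity threshold means in practice, here is a rough sketch in Python (for example, as a starting point for a PythonCaller). It is not how FuzzyDuplicateRemover works internally; it uses difflib.SequenceMatcher from the standard library as a stand-in similarity measure. The function names and the "Steel Works" sample value are made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_similar(names, threshold=0.70):
    """Greedily put each name into the first group whose first member
    (the representative) is at least `threshold` similar to it."""
    groups = []  # each group is a list; group[0] is the representative
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([name])
    return groups


# Field3 values from the question, plus one hypothetical unrelated name.
names = ["Paper Company", "Paper Company.", "Paper Co", "Steel Works"]
groups = group_similar(names)

for group in groups:
    # Keep the first name as the merged record; the rest are its near-duplicates.
    print(group[0], "<-", group[1:])
```

With this sample, the three "Paper" variants end up in one group and the unrelated name stays on its own. The grouping is greedy and depends on input order, so the threshold may need tuning on real data.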

Results - Unique:

[Screenshot: Unique]

Results - Duplicate:

[Screenshot: Duplicate]

 

Thanks in advance,

Danilo


messagemauri
Contributor
  • Author
  • Contributor
  • 37 replies
  • March 4, 2022

Thank you @danilo_fme​ . This will give me a starting point. 👍


danilo_fme
Celebrity
  • Celebrity
  • 2077 replies
  • March 4, 2022

Awesome! Thanks for your feedback, @messagemauri​