
I have a table called PROJECTS in which some records share similar names. The objective is to find these records and, if possible, merge them into a single record.

 

Sample

PROJECTS table with Field1, Field2, Field3

 

[Image: FindSimilar]

In the table above, I want to look at Field3, find similar names such as "Paper Company", "Paper Company." and "Paper Co", and merge them into a single record. They are not exact duplicates, but they share a similar name. I can't hard-code "Paper Company" because other records in the table have the same problem with different names.

 

Which transformers can I use to complete such a task?

 

Thank you.

 

Hi @messagemauri​ 

 

I simulated your case here and got great results using the custom transformer FuzzyDuplicateRemover:

[Image: Workspace_Duplicate]

As you can see, I configured the similarity ratio to 70%.

Results - Unique:

[Image: Unique]

Results - Duplicate:

[Image: Duplicate]
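
For anyone who wants to see the idea behind a similarity threshold outside of FME, here is a rough Python sketch. It is not how FuzzyDuplicateRemover is implemented; it only assumes that "70% similarity" can be approximated with the ratio from Python's standard `difflib.SequenceMatcher`. The helper names (`normalize`, `similarity`, `split_unique_duplicate`) and the "Steel Works" sample value are made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop punctuation so 'Paper Company.' compares like 'Paper Company'."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 (an assumed stand-in for the transformer's ratio)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def split_unique_duplicate(names, threshold=0.70):
    """Keep the first name of each similar group as unique; send the rest to duplicate."""
    unique, duplicate = [], []
    for name in names:
        if any(similarity(name, kept) >= threshold for kept in unique):
            duplicate.append(name)
        else:
            unique.append(name)
    return unique, duplicate


# Field3 values from the question, plus one hypothetical unrelated name.
names = ["Paper Company", "Paper Company.", "Paper Co", "Steel Works"]
unique, duplicate = split_unique_duplicate(names)
print("Unique:", unique)
print("Duplicate:", duplicate)
```

With these four values, "Paper Company" and "Steel Works" end up in the unique list, while "Paper Company." and "Paper Co" are flagged as duplicates. The result depends on input order, since the first name seen in each group is the one kept, so it is worth reviewing the duplicate list and tuning the threshold before merging anything.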

 

Thanks in Advance,

Danilo


Thank you @danilo_fme​ . This will give me a starting point. 👍


Awesome! Thanks for your feedback @messagemauri​ 

