Question

Search for partial match of names in two different datasets

  • November 13, 2024
  • 2 replies
  • 86 views

emimarquez4
Participant

Hello,

I would like to use FME to search for partial name matches (essentially a fuzzy string comparison) between two columns in different datasets.

The challenge I'm facing is that it’s not a straightforward ID-based comparison. I want to pick a name from a row in one dataset and search for the closest match in a column of names in another dataset.

Does anyone know a way to achieve this in FME, perhaps without relying on IDs?

Thank you!

2 replies

david_r
Celebrity
  • November 13, 2024

If you have a lot of data I’d try to do this in the database, if possible. It’s going to be a lot faster than loading everything into FME and doing all the matching there.

E.g. using Postgres: https://www.postgresql.org/docs/current/fuzzystrmatch.html

If you have to stick with FME, maybe this custom transformer can help: https://hub.safe.com/publishers/safe-lab/transformers/fuzzystringcomparer
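To give a feel for what an edit-distance comparison does, here is a minimal Python sketch of Levenshtein distance, the kind of metric the fuzzystrmatch module exposes through its `levenshtein` function. This is only an illustration, not how Postgres or the FME transformer implement it, and the function name and sample strings are my own.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

A lower number means a closer match, so the best candidate for a name is the one with the smallest distance.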


nielsgerrits
VIP

How much data are we talking about? What I have done is merge each search string onto all target records (FeatureMerger with 1=1 as the join) and then use the FuzzyStringComparer from the FME Hub to find the best match. But if you have a lot of search records and a lot of target records, this can take a while. I've attached a sample that matches 100 search values against 100 target values.
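The same compare-everything-with-everything idea can be sketched outside FME in plain Python, which may help to see why it gets slow. This is a rough sketch under my own assumptions: it uses `difflib.SequenceMatcher` from the standard library as the similarity measure (not necessarily what the FuzzyStringComparer uses), and the function names and sample names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score between 0.0 and 1.0, where 1.0 means identical
    (ignoring case)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_matches(search_names, target_names):
    """For every search name, compare against every target name and
    keep the highest-scoring one. Assumes target_names is not empty."""
    results = {}
    for name in search_names:
        # Every search name is scored against every target name, so the
        # work grows as len(search_names) * len(target_names).
        best = max(target_names, key=lambda target: similarity(name, target))
        results[name] = (best, similarity(name, best))
    return results


if __name__ == "__main__":
    search = ["Jon Smith", "Katherine Jonson"]
    targets = ["John Smith", "Jane Doe", "Catherine Johnson"]
    for name, (match, score) in best_matches(search, targets).items():
        print(f"{name} -> {match} ({score:.2f})")
```

With 100 search values and 100 target values that is 10,000 comparisons; with 100,000 of each it is 10 billion, which is where doing the matching in the database starts to look attractive.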

