Question

Compare Two Lists with FuzzyStringComparer and Grouping

8 years ago
August 26, 2016
2 replies
51 views

rileym
28 replies

Hello kind people

I have a workspace that brings in two datasets, each holding a property reference number and a name. I need to compare the similarity of the names held in each dataset for each property. Both datasets have multiple (but differing numbers of) names per property, e.g.

DATASET 1:
1, BOB
1, ANNA
2, TED

DATASET 2:
1, BOBBY
1, TINA
1, IAN
2, TIM
2, BOB

What I want to do is use FuzzyStringComparer to evaluate each record in dataset 1 against every record in dataset 2 that shares the same property reference number.

So in the example above, the highest value FuzzyStringComparer would return would be for property 1 (BOB v BOBBY).

Can anyone offer any suggestions on how best to approach this?

Thanks in advance,

Riley

takashi
7707 replies
8 years ago
August 27, 2016

Hi @rileym, you can use the FeatureMerger and ListExploder to create every combination of names with the same reference number.

FeatureMerger: Send dataset 1 features to the Requestor port and dataset 2 features to the Supplier port. Set the 'Join On' parameter to the reference number attribute, check 'Generate List', and enter a list name in the 'List Name' parameter.
ListExploder: Connect a ListExploder to the 'Merged' port and set its 'List Attribute' parameter to that list name.

Each output feature from the ListExploder will have a combination of names from the two datasets. You can then compare them with the FuzzyStringComparer.
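Outside FME, the same pairing logic can be sketched in Python to show what the FeatureMerger + ListExploder combination produces. This is only an illustration: the function name pair_and_score is made up, and difflib.SequenceMatcher is a stand-in similarity measure, not the algorithm FuzzyStringComparer uses, so the scores will differ from FME's.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Sample records from the question: (property reference number, name)
dataset1 = [(1, "BOB"), (1, "ANNA"), (2, "TED")]
dataset2 = [(1, "BOBBY"), (1, "TINA"), (1, "IAN"), (2, "TIM"), (2, "BOB")]


def pair_and_score(requestors, suppliers):
    """Pair every requestor name with every supplier name that has the
    same reference number, then score each pair (hypothetical helper)."""
    # Group supplier names by reference number; this plays the role of
    # the list that FeatureMerger builds on each merged feature.
    by_ref = defaultdict(list)
    for ref, name in suppliers:
        by_ref[ref].append(name)

    results = []
    for ref, name in requestors:
        # Emitting one row per list element mirrors the ListExploder.
        for other in by_ref.get(ref, []):
            # Stand-in similarity in [0, 1]; NOT FuzzyStringComparer's metric.
            score = SequenceMatcher(None, name, other).ratio()
            results.append((ref, name, other, score))
    return results


pairs = pair_and_score(dataset1, dataset2)
best = max(pairs, key=lambda row: row[3])
print(best)  # the highest-scoring pair is BOB v BOBBY on property 1
```

With the sample data this yields eight pairs (six for property 1, two for property 2), and the top-scoring one is BOB v BOBBY, matching the expectation in the question.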

Alternatively, the InlineQuerier can be used.
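The InlineQuerier takes a SQL query, and the join it would need looks roughly like the one below. It is sketched here with Python's sqlite3 module so it can be run on its own; the table names d1 and d2 and the column names ref and name are assumptions for illustration, not names from the original workspace.

```python
import sqlite3

# In-memory database standing in for the two input datasets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d1 (ref INTEGER, name TEXT)")  # hypothetical table
conn.execute("CREATE TABLE d2 (ref INTEGER, name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO d1 VALUES (?, ?)",
    [(1, "BOB"), (1, "ANNA"), (2, "TED")],
)
conn.executemany(
    "INSERT INTO d2 VALUES (?, ?)",
    [(1, "BOBBY"), (1, "TINA"), (1, "IAN"), (2, "TIM"), (2, "BOB")],
)

# Inner join on the reference number: one row per name combination
# within each property.
rows = conn.execute(
    """
    SELECT d1.ref, d1.name AS name1, d2.name AS name2
    FROM d1
    JOIN d2 ON d1.ref = d2.ref
    ORDER BY d1.ref, d1.name, d2.name
    """
).fetchall()

for row in rows:
    print(row)
conn.close()
```

Each returned row holds one name from each dataset for the same property, ready to be passed to a string comparison step.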

rileym
Author
28 replies
8 years ago
August 31, 2016

Thanks @takashi, and sorry for the slow reply. For some reason I wasn't notified of your reply, but I got to the same solution eventually.
