Question

Compare every single word from one attribute with the content of second attribute.

  • October 23, 2017
  • 2 replies
  • 57 views


Hello kind people,

I am a beginner with FME. I am working with data that were combined based on the similarity of the records. However, this matching is not perfect, and a manual review has to be done. I want to automate the review, and I have a question about comparing two records.

I would like to ask whether there is a function that can compare two attributes of the same record, e.g.

NAME 1: KROGER STORE 456

NAME 2: KROGER FUEL CENTER

This record would pass the test because one word from NAME 1 (KROGER) is also found in NAME 2. The same applies if a word from NAME 2 is found in NAME 1.

The test should extract every single word/number from NAME 1 and compare it with every single word/number from NAME 2, and vice versa.

I will be grateful for help.


2 replies

danilo_fme
  • October 23, 2017

Hi @vid,

I suggest using the custom transformer FuzzyStringComparer.

This transformer compares two attributes by similarity.

I simulated two attributes on my machine and then compared them using this transformer. The similarity result is written to an output attribute: _result

Thanks,

Danilo
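
For readers who want a feel for what a whole-string similarity score looks like, here is a minimal Python sketch (for example, something that could be adapted for a PythonCaller). It uses the standard-library difflib module and is not the algorithm used by FuzzyStringComparer; the function name similarity and the case-insensitive comparison are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 (hypothetical helper).

    Assumption: the comparison should ignore case, so both strings are
    lower-cased before being compared.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    name1 = "KROGER STORE 456"
    name2 = "KROGER FUEL CENTER"
    # A score near 1.0 means the two strings are almost identical.
    print(round(similarity(name1, name2), 2))
```

Note that a score like this measures how alike the two strings are as a whole, so a threshold would still have to be chosen; it does not by itself tell you whether a particular word is shared.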


gio
  • October 23, 2017

Use a StringSearcher on both attributes to get the "words".

Regexp: \s*(\w+)\s*

Set all matches and submatches. Explode the lists to separate outputs.

Merge both outputs unconditionally (1=1).

Now you can use a Tester to test word1 = word2.

(Alternatively, you can use a ListConcatenator on one list and then use an "IN" in the Tester.)
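
The steps above can be sketched in plain Python, which may help when checking the logic outside a workspace or when adapting it for a PythonCaller. This is only an illustration under stated assumptions: the function names words, shared_words and passes are hypothetical, and upper-casing the words so that the comparison ignores case is an assumption that the thread does not specify.

```python
import re

# Same pattern as suggested above: each match captures one "word"
# (letters, digits or underscore) in the first submatch.
WORD_RE = re.compile(r"\s*(\w+)\s*")


def words(value: str) -> list:
    """Split an attribute value into words (hypothetical helper).

    Assumption: matching should ignore case, so words are upper-cased.
    """
    return [w.upper() for w in WORD_RE.findall(value)]


def shared_words(name1: str, name2: str) -> list:
    """Return the words of name1 that also occur in name2, without repeats."""
    words2 = set(words(name2))  # plays the role of the "IN" test
    found = []
    for w in words(name1):
        if w in words2 and w not in found:
            found.append(w)
    return found


def passes(name1: str, name2: str) -> bool:
    """True if at least one word is common to both names."""
    return bool(shared_words(name1, name2))


if __name__ == "__main__":
    print(shared_words("KROGER STORE 456", "KROGER FUEL CENTER"))  # ['KROGER']
    print(passes("KROGER STORE 456", "KROGER FUEL CENTER"))        # True
```

Because the check is symmetric, it does not matter which attribute is treated as NAME 1. Very common words or numbers could produce false positives, so a list of words to ignore may be worth adding.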