Solved

How to append a field based on a comparison of string values when naming conventions differ

  • January 5, 2021

bcoveney

Hi there,

 

I am trying to find a simple way to append an asset code to a list of facilities. The trouble is that facility naming is not standardized, so the same facility shows up under slightly different names (short forms, etc.). I have a master list that includes the asset codes, and I would like to compare incoming tabular data against it and append the code when that data 1) does not include the code and/or 2) does not follow the master list's naming convention.

 

I can't wrap my head around which transformer to use and would appreciate the help!

Best answer by markatsafe

@bcoveney Looks like a fuzzy matching problem. Try the FuzzyStringCompareFrom2Datasets transformer, which is available on the FME Hub.


5 replies

Forum|alt.badge.img+2
  • January 5, 2021

@bcoveney Do you have an example that would help explain the problem?


bcoveney
  • Author
  • January 5, 2021


Here are snippets of the data tables I am working with. First is the master list I referenced, followed by an example of the data I receive. As you can see, the naming convention for facilities is different. Not shown are instances where a shortened name is expanded to its full name, or vice versa. Essentially, I want to compare the two tables and attach the four-digit code to the dataset that is missing it.

[Screenshots: the master list, followed by a sample of the received data]

I appreciate your response!


markatsafe
  • Best Answer
  • January 5, 2021

@bcoveney Looks like a fuzzy matching problem. Try the FuzzyStringCompareFrom2Datasets transformer, which is available on the FME Hub.
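For anyone who wants to see the general idea outside of FME, here is a minimal Python sketch of fuzzy name matching against a master list. It is not how FuzzyStringCompareFrom2Datasets is implemented; it simply uses `difflib.SequenceMatcher` from the standard library to score string similarity. The facility names, asset codes, the `best_match` helper, and the 0.6 threshold are all made up for illustration and would need tuning against real data.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, master, threshold=0.6):
    """Return (asset_code, master_name, score) for the closest master entry.

    Returns None when no entry scores at or above the threshold.
    The threshold is an assumed starting point, not a recommended value.
    """
    best = None
    for master_name, code in master.items():
        score = similarity(name, master_name)
        if best is None or score > best[2]:
            best = (code, master_name, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


# Hypothetical master list: facility name -> four-digit asset code.
master = {
    "Maple Grove Public School": "1001",
    "Riverside Secondary School": "1002",
}

# Hypothetical incoming names that do not follow the master convention.
incoming = ["Maple Grove PS", "Riverside Secondary", "Unknown Depot"]

for name in incoming:
    print(name, "->", best_match(name, master))
```

Names that fall below the threshold come back as `None`, which gives you a list of records to review by hand rather than a silently wrong code.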


Forum|alt.badge.img+2

Fuzzy matching should do the trick, as @Mark Stoakes mentions. For future reference, you will want to clean up the data a bit before you do that. The more closely the two sets of names agree going in, the better your chances of getting a correct match.

Examples of manipulating the data:

StringReplacer - replace Public School with PS

StringReplacer - strip off (FDK)

StringReplacer - replace '-' with ' - ' to split the dashed values

You don't have to go overboard with the initial cleanup, but it's always a good idea to do some to save yourself a headache.
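If it helps to see the cleanup steps written out, here is a rough Python equivalent of the three StringReplacer steps listed above. It is only a sketch: the `normalize` function name and the sample facility names are hypothetical, and the extra whitespace collapse at the end is my own addition rather than one of the listed steps.

```python
import re


def normalize(name):
    """Apply the three cleanup steps, then tidy up leftover whitespace."""
    # Step 1: replace "Public School" with "PS".
    name = re.sub(r"\bPublic School\b", "PS", name, flags=re.IGNORECASE)
    # Step 2: strip off "(FDK)".
    name = re.sub(r"\(FDK\)", "", name, flags=re.IGNORECASE)
    # Step 3: replace '-' with ' - ' so dashed values are split consistently.
    name = re.sub(r"\s*-\s*", " - ", name)
    # Extra step (assumption): collapse repeated spaces and trim the ends.
    return re.sub(r"\s+", " ", name).strip()


# Hypothetical facility names, for illustration only.
print(normalize("Maple Grove Public School (FDK)"))
print(normalize("Oak Ridge-North Public School"))
```

Running the same cleanup on both the master list and the incoming data, before any fuzzy comparison, keeps the two sides consistent.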


bcoveney
  • Author
  • January 6, 2021


I plan on working from what @Mark Stoakes suggested. I had already thought ahead to clean up the data to make the matching easier, so I'm glad I was on the right track at least!