Solved

How to append a field based on a comparison of string values when naming conventions differ


bcoveney

Hi there,

 

I am trying to find a simple way to append an asset code to a list of facilities. The trouble is that facility naming is not standardized, so the same facility shows up under slightly different names (short forms, etc.). I have a master list that I would like to compare against, so that I can append the asset code when I am given tabular data that 1) does not include the code and/or 2) does not match the master list's naming convention.

 

I can't wrap my head around which transformer to use and would appreciate the help!

Best answer by markatsafe

@bcoveney Looks like a fuzzy matching problem. Try the FuzzyStringCompareFrom2Datasets that is available on the FME Hub.


5 replies

markatsafe
  • January 5, 2021

@bcoveney​ Do you have an example that would help explain the problem?


bcoveney
  • Author
  • January 5, 2021
markatsafe wrote:

@bcoveney​ Do you have an example that would help explain the problem?

Here are snippets of the data tables I am working with. First is the master list I referenced, followed by an example of the data I receive. As you can see, the naming convention for facilities is different. Not shown are instances where a shortened name is expanded to its full name or vice versa. Essentially, I want to compare the two tables and attach the 4-digit code to the dataset that is missing it.

[Two screenshots: the master list, followed by an example of the received data]

I appreciate your response!


markatsafe
  • Best Answer
  • January 5, 2021

@bcoveney Looks like a fuzzy matching problem. Try the FuzzyStringCompareFrom2Datasets that is available on the FME Hub.
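
For anyone who wants to see the idea outside of FME, here is a minimal Python sketch of fuzzy matching against a master list. It is not how FuzzyStringCompareFrom2Datasets works internally; it simply uses the standard library's `difflib` to pick the closest master name and return its code. The facility names, the codes, the `append_asset_code` helper, and the 0.6 similarity cutoff are all made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical master list: facility name -> 4-digit asset code.
MASTER = {
    "Maple Grove PS": "1001",
    "Riverside PS": "1002",
    "Oakwood Secondary School": "1003",
}

def append_asset_code(name, master=MASTER, cutoff=0.6):
    """Return (matched_master_name, asset_code), or (None, None) if nothing is close enough."""
    # Compare case-insensitively by mapping lowercased names back to the originals.
    lowered = {key.lower(): key for key in master}
    hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    if not hits:
        return None, None
    best = lowered[hits[0]]
    return best, master[best]

# Hypothetical incoming names that do not follow the master convention.
incoming = ["Maple Grove Public School", "Riverside P.S.", "Unknown Depot"]
for facility in incoming:
    print(facility, "->", append_asset_code(facility))
```

Names that fall below the cutoff come back unmatched, which is usually what you want: those rows can be reviewed by hand rather than given a wrong code.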


jlbaker2779

Fuzzy matching should do the trick, as @Mark Stoakes mentions. For future reference, you will want to clean up the data a bit before you do that. The closer the two sets of names are to begin with, the better your matches will be.

 

Example of manipulating the data:

StringReplacer - replace Public School with PS

StringReplacer - strip off (FDK)

StringReplacer - replace '-' with ' - ' to split the dashed values

 

You don't have to go overboard with the initial cleanup, but it's always a good idea to do some to save you a headache.
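
As a rough illustration of that kind of cleanup outside of FME, the three replacements could be sketched in Python like this. The `normalize_facility_name` helper and the sample name are hypothetical, and the final whitespace collapse is an extra step added here so that the removals do not leave double spaces behind.

```python
import re

def normalize_facility_name(name):
    """Apply three cleanup replacements to one facility name."""
    # Step 1: abbreviate "Public School" to "PS".
    name = re.sub(r"\bPublic School\b", "PS", name)
    # Step 2: strip off the "(FDK)" marker.
    name = name.replace("(FDK)", "")
    # Step 3: put a space on each side of a dash so dashed values split cleanly.
    name = re.sub(r"\s*-\s*", " - ", name)
    # Extra step (assumption): collapse leftover runs of whitespace.
    return re.sub(r"\s+", " ", name).strip()

# Hypothetical sample value.
print(normalize_facility_name("Maple Grove Public School (FDK)-North"))
```

Running the same normalization over both the master list and the incoming data before the fuzzy comparison keeps the two sides consistent.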


bcoveney
  • Author
  • January 6, 2021
jlbaker2779 wrote:

Fuzzy matching should do the trick, as @Mark Stoakes mentions. For future reference, you will want to clean up the data a bit before you do that. The closer the two sets of names are to begin with, the better your matches will be.

 

Example of manipulating the data:

StringReplacer - replace Public School with PS

StringReplacer - strip off (FDK)

StringReplacer - replace '-' with ' - ' to split the dashed values

 

You don't have to go overboard with the initial cleanup, but it's always a good idea to do some to save you a headache.

I plan on working off of what @Mark Stoakes​ suggested. I did think ahead to clean up the data to ensure an easier match - glad I was on the right track at least!

