
Hi there,

I am trying to find a simple way to append an asset code to a list of facilities. The trouble is that facility naming is not standardized, so the same facility shows up under slightly different names (short forms, etc.). I have a master list that I would like to compare against, so that I can append the asset code to incoming tabular data that 1) does not include the code and/or 2) does not match the master list naming convention.

I can't wrap my head around which transformer to use and would appreciate the help!

@bcoveney​ Do you have an example that would help explain the problem?


Here are snippets of the data tables I am working with. First is the master list I referenced, followed by an example of the data I receive. As you can see, the naming convention for facilities is different. Not shown are instances where a shortened name is expanded to its full name or vice versa. Essentially, I want to compare the two tables and attach the 4-digit code to the dataset that is missing it.

[Images: Capture, msCapture]

I appreciate your response!


@bcoveney​ This looks like a fuzzy matching problem. Try the FuzzyStringCompareFrom2Datasets transformer, which is available on the FME Hub.
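For anyone who wants to see the idea outside of FME, here is a rough Python sketch of fuzzy matching incoming names against a master list using the standard library's difflib. This is not how the FME Hub transformer works internally; the facility names, codes, the `find_asset_code` helper, and the 0.6 cutoff are all made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical master list: facility name -> 4-digit asset code (made-up values).
MASTER = {
    "Maple Grove PS": "1001",
    "Riverside PS": "1002",
    "Oak Hill Secondary School": "1003",
}


def find_asset_code(name, master=MASTER, cutoff=0.6):
    """Return (matched_name, code) for the closest master name, or (None, None)."""
    # Compare in lowercase so that casing differences do not lower the score.
    lowered = {m.lower(): m for m in master}
    hits = get_close_matches(name.lower(), list(lowered), n=1, cutoff=cutoff)
    if not hits:
        return None, None  # nothing similar enough; leave for manual review
    matched = lowered[hits[0]]
    return matched, master[matched]


# An incoming record that spells out "Public School" instead of "PS".
print(find_asset_code("Maple Grove Public School"))
```

The cutoff is a trade-off: a higher value avoids wrong matches but leaves more records unmatched, so it is worth checking a sample of results by hand either way.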


Fuzzy matching should do the trick, as @Mark Stoakes​ mentions. For future reference, you will want to clean up the data a bit before you do that. The closer the two sets of names are to begin with, the better the match results will be.

Examples of manipulating the data:

StringReplacer - replace Public School with PS

StringReplacer - strip off (FDK)

StringReplacer - replace '-' with ' - ' to split the dashed values

You don't have to go overboard with the initial cleanup, but it's always a good idea to do some to save yourself a headache.
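If it helps to see those cleanup steps written out, here is a minimal Python sketch of the same three replacements. The `normalize` function and the sample names are hypothetical, and the dash step also trims any spaces already around the dash so you don't end up with doubled spaces. In FME you would do this with the StringReplacer transformers listed above.

```python
import re


def normalize(name):
    """Apply the three cleanup steps to one facility name."""
    name = name.replace("Public School", "PS")   # abbreviate to the short form
    name = name.replace("(FDK)", "")             # strip off the (FDK) tag
    name = re.sub(r"\s*-\s*", " - ", name)       # put one space on each side of a dash
    return re.sub(r"\s+", " ", name).strip()     # collapse leftover whitespace


# Made-up sample names, for illustration only.
print(normalize("Maple Grove Public School (FDK)"))
print(normalize("North-South Public School"))
```

Running both the master list and the incoming data through the same cleanup is what brings the names closer together before the fuzzy comparison.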


I plan on working off of what @Mark Stoakes​ suggested. I did think ahead to clean up the data to ensure an easier match - glad I was on the right track at least!

