How fuzzy do you need it to be? Is it just a question of case (i.e. HIGH STREET = High Street)? Or does it need to be more fuzzy than that (e.g. High Street = HIGH ST)?
A quick Google has come up with the FuzzyStringComparer in the FME Hub.
https://hub.safe.com/transformers/fuzzystringcomparer
If you find limitations with it (e.g. it only works on one dataset), this might be useful:
https://knowledge.safe.com/questions/3776/fuzzy-string-matching-from-two-datasets.html
Another way of merging 2 datasets into one without losing the knowledge of which features belong to which dataset is to expose fme_basename on the Reader(s). You can then connect both datasets to the same input port on a transformer (or to a Junction). Any time you need to split the data back out (e.g. for the Requestor and Supplier inputs of FeatureMerger), you just use a Tester or AttributeFilter on fme_basename.
You could try a fuzzy string comparison.
It's available in Python, Tcl, etc.
Someone has put it in a custom transformer, so you can download it (just type "fuzzy" on your canvas).
You will have to do a Cartesian set comparison (by doing an unconditional FeatureMerger, 1=1) or iterate one set by the elements of the other (using a custom transformer).
If you don't have huge sets, I'd go for the unconditional merger. Remember to take care of attribute name conflicts when using the merger.
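For anyone who wants to see what the Cartesian comparison amounts to, here is a rough Python sketch. It uses the standard library's difflib.SequenceMatcher, which is not necessarily the algorithm the FuzzyStringComparer uses, and the function names, sample street names and the 0.7 threshold are made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a, b):
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_pairs(left, right, threshold=0.8):
    """Compare every value in `left` with every value in `right`
    (a Cartesian product) and keep the pairs that reach the threshold."""
    matches = []
    for a, b in product(left, right):
        score = similarity(a, b)
        if score >= threshold:
            matches.append((a, b, round(score, 3)))
    return matches


# Hypothetical sample values, one list per dataset.
streets_a = ["HIGH STREET", "Station Road"]
streets_b = ["High St", "Station Rd", "Church Lane"]

pairs = fuzzy_pairs(streets_a, streets_b, threshold=0.7)
for a, b, score in pairs:
    print(a, "~", b, score)
```

The number of comparisons is the size of one set times the size of the other, which is why this only suits sets that aren't huge. If the ratios come out very low, it is worth checking case and stray whitespace first, which the sketch normalises before comparing.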
@gio
@tim_wood
It doesn't work, or maybe I'm doing something wrong: apparently the check is extremely fuzzy and nothing matches (the ratio value is very low) when I know I should get hits. However, maybe a regular FeatureMerger does work, because when I merge on street name, I do get matches. When I add the postal code attributes to the join clause, though, I get very few matches.
I think this is due to differing datatypes of the attributes in the datasets I want to merge (?)
One dataset is an excel file, and I was able to easily set the type of the postal code upon loading.
However, the other dataset is an FFS (see
https://knowledge.safe.com/questions/57248/wfs-data-not-coming-through.html?childToView=57489#comment-57489 on how I had to create it). I can't change the attribute type of the postal code attribute in this dataset. It's set to "buffer". How can I change this? Or is this even the reason why the FeatureMerger can't match when I add postal code to the join?
Hi
@tim_wood
If you add postal code, the join is on street name and postal code together.
That means, in your case, the street name and postal code combinations don't fully match. A street name may belong to two or more postal zones, or the data is wrong.
I usually join on a concatenation of postal code, house number and house letter, if available.
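To illustrate the concatenated join key, here is a small Python sketch. The attribute names (postcode, number, letter), the "|" separator and the sample records are assumptions for the example, not anything taken from the actual datasets. Every part is cast to text before joining, so a number stored as an integer in one source and as a string in the other still produces the same key.

```python
def join_key(postcode, house_number, house_letter=""):
    """Build a normalised key: cast every part to text, trim it,
    upper-case it and drop inner spaces."""
    parts = (postcode, house_number, house_letter or "")
    return "|".join(str(p).strip().upper().replace(" ", "") for p in parts)


def merge(requestors, suppliers):
    """Attach supplier attributes to each requestor with the same key.
    Returns (merged, unmerged) lists, loosely like FeatureMerger ports."""
    index = {
        join_key(s["postcode"], s["number"], s.get("letter")): s
        for s in suppliers
    }
    merged, unmerged = [], []
    for r in requestors:
        key = join_key(r["postcode"], r["number"], r.get("letter"))
        s = index.get(key)
        if s is None:
            unmerged.append(r)
        else:
            # Requestor values win on attribute name conflicts.
            merged.append({**s, **r})
    return merged, unmerged


# Hypothetical records: note the int vs str house number and the spacing.
requestors = [
    {"postcode": "1234 ab", "number": 12, "letter": "a", "street": "High Street"},
    {"postcode": "9999 ZZ", "number": 1, "street": "Nowhere"},
]
suppliers = [
    {"postcode": "1234AB", "number": "12", "letter": "A", "owner": "X"},
]

merged, unmerged = merge(requestors, suppliers)
```

In a workspace the same idea would be a concatenated attribute built on both sides before the FeatureMerger, with the unmerged output inspected to see which keys fail.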
Different string encoding can also cause merging to fail. Changing it can help.
To know what the problem is, we would need some sample data.
Maybe you can provide some?
https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/ffs/Reader_Directives.htm
"Buffers store unbounded length character or byte strings." I'm not completely sure what that means but it sounds like it's text. You could try using an AttributeCreator to copy the value into a new attribute then delete the old attribute.
I sometimes have to use an AttributeTrimmer to remove whitespace before/after the values.
For UK postcodes, the value may be written with a space in the middle or without e.g. "AA1 2BC" or "AA12BC". There are various ways to add the space if required (e.g. using a Regular Expression) or you could remove the space with a StringReplacer.
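As a rough illustration of the postcode clean-up, here is a Python sketch with hypothetical helper names. It relies on the fact that the inward part of a UK postcode is always the last three characters, so the space can be re-inserted by position once the value has been trimmed and stripped of whitespace.

```python
import re


def strip_spaces(postcode):
    """Remove all whitespace and upper-case the value, e.g. for use as a join key."""
    return re.sub(r"\s+", "", postcode).upper()


def add_space(postcode):
    """Return the postcode with a single space before the last three characters.
    Values too short to be a full postcode are returned compacted but unspaced."""
    compact = strip_spaces(postcode)
    if len(compact) < 5:
        return compact
    return compact[:-3] + " " + compact[-3:]


print(strip_spaces(" aa1 2bc "))
print(add_space("AA12BC"))
```

Whichever form you pick, apply the same normalisation to both datasets before the join, so that "AA1 2BC" and "AA12BC" end up identical.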