I have two datasets, 1 and 2, containing addresses. Each address consists of all four fields.
I'm trying to compare the two datasets to find matching addresses. I have already used a StringReplacer to remove all spaces from the postal codes and to replace all special characters in the street names.
In dataset 2, the addition2 field contains a lot of random information, as you can see. I need help putting together a Python script (to be used in the PythonCaller transformer, or any other transformer) that is able to:
- detect when there's more than one character in the addition2 field (I expect that field to contain only one character) and filter that record out;
- from the filtered records, take the first character (or maybe the first three?) of the addition2 field and use it to recheck the address in dataset 2 against address1, to see if there's a hit;
- if there's still no hit, I want FME to ignore the information in the addition2 field, treat it as an empty field, and then compare again with the address in dataset 1.
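
To make the three steps above concrete, here is a minimal plain-Python sketch of the logic I have in mind. It is only an illustration under assumptions: the field names postalcode, streetname, housenumber and addition1 are hypothetical placeholders (only addition2 is a real field name from my data), the sample records are made up, and prefix_len is an invented parameter for the "first or first three characters" question.

```python
# Hypothetical names for the three fixed address fields; only "addition2"
# is taken from the actual data, the rest are placeholders.
KEY_FIELDS = ("postalcode", "streetname", "housenumber")


def make_key(record, addition):
    """Build a comparable key from the fixed fields plus an addition value."""
    fixed = tuple(str(record.get(f) or "").strip().upper() for f in KEY_FIELDS)
    return fixed + (str(addition or "").strip().upper(),)


def match_record(record2, keys1, prefix_len=1):
    """Return (matched, addition_used, step) for one dataset 2 record."""
    addition2 = str(record2.get("addition2") or "").strip()

    # Expected case: empty or a single character, so compare as-is.
    if len(addition2) <= 1:
        return make_key(record2, addition2) in keys1, addition2, "direct"

    # Retry with only the first prefix_len characters of addition2.
    prefix = addition2[:prefix_len]
    if make_key(record2, prefix) in keys1:
        return True, prefix, "prefix"

    # Last attempt: treat addition2 as empty and compare once more.
    if make_key(record2, "") in keys1:
        return True, "", "no_addition"

    return False, addition2, "unmatched"


# Made-up sample data, for illustration only.
dataset1 = [
    {"postalcode": "1234AB", "streetname": "Mainstreet", "housenumber": "10", "addition1": "A"},
    {"postalcode": "5678CD", "streetname": "Churchlane", "housenumber": "3", "addition1": ""},
]
dataset2 = [
    {"postalcode": "1234AB", "streetname": "Mainstreet", "housenumber": "10", "addition2": "A"},
    {"postalcode": "1234AB", "streetname": "Mainstreet", "housenumber": "10", "addition2": "a 2nd floor"},
    {"postalcode": "5678CD", "streetname": "Churchlane", "housenumber": "3", "addition2": "rear entrance"},
    {"postalcode": "9999ZZ", "streetname": "Nowhere", "housenumber": "1", "addition2": "xyz"},
]

# Index dataset 1 once so every lookup is a simple set membership test.
keys1 = {make_key(r, r.get("addition1")) for r in dataset1}
results = [match_record(r, keys1) for r in dataset2]
```

Each result records which step produced the hit, so the records could afterwards be split by that value. Inside a PythonCaller I assume the values would be read with feature.getAttribute() and written back with feature.setAttribute() instead of using dictionaries, but I have not tried that.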
I'm not a programmer, but I do have the ambition to learn. However, I've only just noticed the mismatch in the addition2 field, and there's no time to get my Python learning groove on right now...
Suggestions for a workflow will do, too.