Hi Kam,
If the string always consists of: organization name (one or more of any character) <space> street name (one or more characters, excluding spaces) <space> 'Street' <space> shire name (one or more characters, excluding spaces) <space> lot number (one or more of any character), then a StringSearcher with the following expression would extract the elements: ^(.+)\s([^\s]+\sStreet)\s([^\s]+)\s(.+)$
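As a quick way to check the expression outside FME, here is a minimal Python sketch. The sample string and the function name split_address are made up for illustration; the thread does not show a real input.

```python
import re

# The expression from the post: organisation, "<word> Street", shire, lot number.
PATTERN = re.compile(r"^(.+)\s([^\s]+\sStreet)\s([^\s]+)\s(.+)$")


def split_address(text):
    """Return (organisation, street, shire, lot) or None if the pattern fails."""
    match = PATTERN.match(text)
    return match.groups() if match else None


# Hypothetical sample string, not taken from the original data.
sample = "Acme Widgets Ltd High Street Nottinghamshire Lot 12"
parts = split_address(sample)
print(parts)
```

Note that the street part only matches a single word before 'Street', so a name such as "Castle Bridge Street" would leave "Castle" in the organisation group.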
Takashi
Hi Takashi,
I guess that is one way of doing it, but I would need to repeat this process for Road, Drive, Way, Crescent, etc. It would also not be very accurate, as Nottingham has quite a few double-worded street names, e.g. Castle Bridge Road.
Is there any way of doing it outside of regex too? I.e. look in a DBF/CSV file of street names, find the street name that appears in the string, and then use that street name in the regex.
Example:
Dataset 1 containing the full string
Dataset 2 containing street names
Both datasets enter one transformer
Transformer looks at first feature and checks to see if there is a street name within the string
If so, then the StringSearcher expression = StreetName.*
This would then strip out everything from the start of the street name onwards, and it would do this for every feature in which it finds a street name from dataset 2.
I hope this makes sense.
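The lookup idea above could be sketched roughly as follows in Python, for example as a starting point for a PythonCaller. The function name strip_from_street and the sample values are hypothetical, and the street names are assumed to have already been read from the DBF/CSV file into a list.

```python
import re


def strip_from_street(text, street_names):
    """Return (text before the street name, matched street name or None)."""
    # Try longer names first so "Castle Bridge Road" wins over "Bridge Road".
    for street in sorted(street_names, key=len, reverse=True):
        found = re.search(r"\b" + re.escape(street) + r"\b", text, re.IGNORECASE)
        if found:
            # Equivalent to removing "StreetName.*" from the string.
            return text[:found.start()].rstrip(), found.group(0)
    return text, None


# Hypothetical data: dataset 2 as a list, one string from dataset 1.
streets = ["Bridge Road", "Castle Bridge Road", "High Street"]
result = strip_from_street("Acme Widgets Ltd Castle Bridge Road Nottingham 12", streets)
print(result)
```

Sorting by length is a design choice to handle multi-word street names; it does not help if an organisation name itself contains a street name.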
Hi,
I would first try to make a match / relation between the organisations and their respective street names. I would try to do this in the database and not using FME, using something like (untested):
select *
from organisation, address
where organisation.name like '%' || address.street || '%'
You could do this with a SQLCreator, for instance. The result should be something like one row for each organisation with the matching street name, like:
org_name
street_name
It would then be a simple matter of using a StringSearcher with a regexp that returns the part of org_name that precedes street_name.
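A rough way to try this out without a real database is an in-memory SQLite table in Python. This is only a sketch: the table contents are invented, and it assumes the organisation string is the one that contains the street name, so the LIKE is written in that direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisation (name TEXT);
    CREATE TABLE address (street TEXT);
    -- Hypothetical rows for illustration only.
    INSERT INTO organisation VALUES
        ('Acme Widgets Ltd Castle Bridge Road Nottingham 12');
    INSERT INTO address VALUES ('Castle Bridge Road'), ('High Street');
""")

# One row per organisation string with the street name it contains.
rows = conn.execute("""
    SELECT o.name AS org_name, a.street AS street_name
    FROM organisation o
    JOIN address a ON o.name LIKE '%' || a.street || '%'
""").fetchall()


def part_before(org_name, street_name):
    """Return the part of org_name that precedes street_name."""
    return org_name.split(street_name, 1)[0].rstrip()


results = [(part_before(org, street), street) for org, street in rows]
print(results)
```

SQLite's LIKE ignores ASCII case while the Python split does not, so the street names would need consistent casing for the last step to work.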
David
If that pattern, as Takashi suggests, is consistent, I would strip every line from right to left.
Using a regexp for the postal code / zip leaves city, street and org.
Stripping the city using a city dataset leaves street and org.
Then stripping streets using your address dataset would leave the org.
At least in your example string.
etc.
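A minimal Python sketch of the right-to-left stripping might look like this. Everything specific here is an assumption: the simplified UK-style postcode pattern, the function names, and the sample string, cities and streets.

```python
import re

# Simplified UK-style postcode at the end of the string (assumption, not a validator).
POSTCODE = re.compile(r"\s*[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}$", re.IGNORECASE)


def strip_suffix(text, candidates):
    """Remove the longest candidate that ends the text; return (rest, candidate)."""
    for cand in sorted(candidates, key=len, reverse=True):
        if text.endswith(cand):
            return text[: len(text) - len(cand)].rstrip(), cand
    return text, None


def peel(text, cities, streets):
    """Strip postcode, then city, then street from the right; return what is left."""
    text = POSTCODE.sub("", text.strip())
    text, city = strip_suffix(text, cities)
    text, street = strip_suffix(text, streets)
    return text, street, city


# Hypothetical inputs for illustration.
peeled = peel(
    "Acme Widgets Ltd Castle Bridge Road Nottingham NG1 1AA",
    cities=["Nottingham"],
    streets=["Bridge Road", "Castle Bridge Road"],
)
print(peeled)
```

Each step only looks at the end of the remaining string, so a street name that also appears inside the organisation name is left alone.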