Skip to main content
Hi FME'ers 

 

 

I want to be able to extract the organisation name out of a very messy string. Currently i have applied around 20 rules but there is one rule/concept that I just can't seem to work out.

 

 

Example String: The Random Organisation Random Street Randomshire RR20 2RR

 

 

I have another dataset that contains a list of all streets, i.e. 'Random Street' will be a record in this 'Street' dataset

 

 

How do i get FME to look through the street dataset and find in the string the name of the street and strip it, including the rest following the street, i.e. Random Street Randomshire RR20 2RR

 

 

Thanks!
Hi Kam,

 

 

If the string always consists of: organization name (one or more any characters) <space> street name (one or more any characters, except space) <space> 'Street' <space> shire name (one or more any characters, except space) <space> lot number (one or more any characters)   StringSearcher with the following expression would extract the elements: ^(.+)\\s(\^\\s]+\\sStreet)\\s(\^\\s]+)\\s(.+)$

 

Takashi
Hi Takashi,

 

 

I guess that is one way of doing it, but I would need to repeat this process for Road, Drive, Way, Crescent etc etc... And then it will not be very accurate as Nottingham has quite a few double worded street names, i.e. Castle Bridge Road

 

 

Is there any way of doing it outside of RegEX too? I.e look into a DBF/CSV file that contains the street name in the string and then uses that street name to be the Regex calculation

 

 

Example:

 

Dataset 1 containing the full string

 

Dataset 2 containing street names

 

 

Both datasets enter one transformer

 

Transformer looks at first feature and checks to see if there is a street name within the string

 

IF So then the string searcher value = StreetName.*

 

 

Which would then strip out the rest of the string from the street name start... and do this for every feature it finds with a street name from the dataset 2 containing street names

 

 

 

I hope this makes sense.

 

 


Hi,

 

 

I would first try to make a match / relation between the organisations and their respective street names. I would try to do this in the database and not using FME, using something like (untested):

 

 

select *

 

from organisation, address

 

where address.street like '%' || organisation.name || '%'

 

 

You could do this with a SQLCreator, for instance. The result should be something one row for each organisation with the matching street name, like:

 

 

org_name

 

street_name

 

 

It would then be a simple matter of using a StringSearcher with a regexp that returns the part of org_name that preceeds street_name.

 

 

David
if that pattern. like mr.Takashi suggests is consistent, i would strip every line from right to left.

 

Using a regexp for the postalcode and zip leaves city, street and org.

 

Stripping city usig a city dataset leaves street and org.

 

Then stripping streets using your adres dataset wold leave Org.

 

At least in your example string.

 

 

 

etc.

Reply