I have an attribute that contains a string representing a full street address (123 Somewhere St). I want to extract the number (123) and pass it to a different attribute (CustomerAddressNumber). I also need to extract the remaining characters (Somewhere St) and pass them to another attribute (CustomerAddressStreet). I'm sure this is simple, I just can't seem to get it straight at the moment.

Using regular expressions in the StringReplacer can do this for you. I would copy the original attribute into two new attributes and then replace the unwanted characters in each with nothing.

For the numeric characters the expression would be [0-9]*, for the letters [a-zA-Z]*

I am sure other regular expressions are possible.
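
As a rough illustration of the replace-with-nothing idea outside FME, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for the StringReplacer (FME's own regex engine may behave differently), and the variable names are made up for the example.

```python
import re

address = "123 Somewhere St"

# Remove every run of digits; what is left is the street part.
street = re.sub(r"[0-9]*", "", address)

# Remove every run of letters; what is left is the number part.
number = re.sub(r"[a-zA-Z]*", "", address)

# Both results keep the spaces that were in the original string,
# so trim them before storing the values.
street = street.strip()
number = number.strip()

print(number)
print(street)
```

The trimming step matters because the patterns only remove digits or letters, not the spaces between them.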


This works well, but it leads to one additional question. How would I remove the single space left at the beginning of the street name ( Somewhere St)? I tried adding |\s to the expression ([0-9]*|\s) and got this (SomewhereSt). It removes all spaces. I'm not very familiar with regular expressions.


Got it: (^\s|[0-9]*). Thanks for the help.
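
For comparison, a small Python sketch of the same two steps: strip the digits, then strip one leading whitespace character. Python's `re` module is an assumption here, not what FME uses internally. In Python, `^` only matches at the very start of the string, so the sketch applies the two patterns one after the other rather than as a single alternation.

```python
import re

address = "123 Somewhere St"

# Step 1: drop the digits, which leaves a leading space behind.
without_number = re.sub(r"[0-9]*", "", address)

# Step 2: drop a single whitespace character, but only at the start.
street = re.sub(r"^\s", "", without_number)

print(repr(without_number))
print(repr(street))
```

Anchoring with `^` is what keeps the space inside "Somewhere St" intact, unlike the unanchored `\s` that removed every space.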


I'll also point you to this exercise in the FME Desktop training manual.

In short it takes an address like "3305 W 10th Av" and splits it up into "3305" "W 10th Av". It doesn't use regex, instead it uses an AttributeSplitter. It's not a perfect solution (it assumes a maximum of four elements to the address) but it's definitely along the lines of what you are asking for.
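
The splitting idea can be sketched in plain Python as well. This is not the AttributeSplitter itself, just an illustration that assumes the number is always the first space-separated token; `split_address` is a hypothetical helper name.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (number, street) at the first space."""
    # maxsplit=1 keeps the rest of the address in one piece,
    # so the street may contain any number of words.
    parts = address.split(" ", 1)
    number = parts[0]
    street = parts[1] if len(parts) > 1 else ""
    return number, street


print(split_address("3305 W 10th Av"))
```

Splitting only once sidesteps the limit on the number of address elements, at the cost of assuming the number never contains a space.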


I would actually do it with a single StringSearcher with the expression:

^([0-9A-Z]*) ([0-9A-Z ]*)

(Note the space between the two parenthesized groups.)

The _match{0}.part would be the building number and the _match{1}.part would be the street.

An AttributeRenamer could rename them to simple attributes. (Note that the AttributeManager does not currently support renaming single elements of a list.)

That would allow for addresses like

350 5th Avenue or 221B Baker Street
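
To see what the two capture groups pick up, here is a minimal Python sketch of a pattern of this shape. It assumes case-insensitive matching (so that A-Z also covers lowercase letters) and uses Python's `re` module rather than the StringSearcher. Note that Python numbers groups from 1, whereas the FME list elements start at {0}.

```python
import re

# IGNORECASE is an assumption so that A-Z also matches lowercase letters.
PATTERN = re.compile(r"^([0-9A-Z]*) ([0-9A-Z ]*)", re.IGNORECASE)

results = {}
for address in ["123 Somewhere St", "350 5th Avenue", "221B Baker Street"]:
    m = PATTERN.match(address)
    if m:
        # group(1) is the building number, group(2) is the street.
        results[address] = (m.group(1), m.group(2))

for address, (number, street) in results.items():
    print(f"{address!r}: number={number!r}, street={street!r}")
```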


CustAddress contains "123 Somewhere St". With the StringSearcher configured as shown the result in field "_first_match" is "123 Somewhere". How do I get _match{0} and _match{1}?

Click on the Advanced tab and enter a name (_match) in the Subexpression Matches List Name field.

 


Got it, thanks.


It depends on the composition of your addresses.

Something like "street name, number, letter, postal code" would require

(.*)\s+(\d+\w{1})\s+(\d{4}\w{2})

Then expose the attributes; in my case that would be matched_part{0-3}
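
A minimal Python sketch of how a pattern of this shape breaks an address apart. The sample address and the `parse` helper are hypothetical, chosen only to fit the "street name, number plus letter, four digits plus two letters" layout; Python's `re` module stands in for the FME transformer.

```python
import re

PATTERN = re.compile(r"(.*)\s+(\d+\w{1})\s+(\d{4}\w{2})")


def parse(address):
    """Return (street, number, postal_code), or None if there is no match."""
    m = PATTERN.match(address)
    return m.groups() if m else None


# Hypothetical address in "street number+letter postalcode" form.
print(parse("Kerkstraat 12a 1234AB"))
print(parse("123 Somewhere St"))
```

An address that does not follow the expected layout simply produces no match, so the pattern has to be adapted to the data at hand.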

I recommend searching for sites on regexp.

There are full and good tutorials out on the net.

Check out the one by Jan Goyvaerts (RegexBuddy).

(It used to be free, now not so.)

Upon further reflection I would change the regex to

^([0-9A-Z]*) ([0-9A-Z -.]*)

For example:

124 Blvd. Saint-Germain
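
The effect of widening the second character class can be checked with a short Python sketch. It assumes case-insensitive matching and Python's `re` module. Inside a character class, ` -.` is read as a range from space to period, which happens to include both the hyphen and the period.

```python
import re

address = "124 Blvd. Saint-Germain"

# Narrow class: letters, digits and spaces only.
narrow = re.match(r"^([0-9A-Z]*) ([0-9A-Z ]*)", address, re.IGNORECASE)

# Wider class: also accepts the characters from space through period.
wide = re.match(r"^([0-9A-Z]*) ([0-9A-Z -.]*)", address, re.IGNORECASE)

print(repr(narrow.group(2)))
print(repr(wide.group(2)))
```

With the narrow class the street stops at the first period, while the wider class captures the whole street name.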


This issue is very old, but if you want one group of numbers in a simple variable, try [0-9]+. The plus makes a difference in the result: it requires at least one digit, whereas * also matches an empty string.
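
A quick Python sketch of the difference between the two quantifiers. This shows how Python's `re` module behaves and assumes a sample string that does not start with a digit; other engines may report matches differently.

```python
import re

text = "Somewhere St 123"

# * allows zero digits, so the first match is the empty string at position 0.
star = re.search(r"[0-9]*", text)

# + needs at least one digit, so the search moves on until it finds "123".
plus = re.search(r"[0-9]+", text)

print(repr(star.group()))
print(repr(plus.group()))
```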

