Question

Address parsing

  • 6 January 2014
  • 4 replies
  • 18 views

Hello,

 

I am looking for some advice on parsing addresses within FME Workbench.

 

Has anyone worked with automating address parsing to a standardized format?

 

Thanks 

 

Pat

4 replies

Userlevel 4
Hi,

 

 

In general, I would say that automated address parsing can run the gamut from quite easy to extremely complicated. Factors that influence the difficulty include how detailed the parsing needs to be and the number of countries / regions to support.

 

 

Personally, I'd probably try to find a ready-made solution rather than trying to make one myself, due to the number of pitfalls and hidden complexities.

 

 

I'm a big fan of Python (and by extension the PythonCaller transformer), so my first try would probably be something involving the pyparsing module and the streetAddressParser plugin. You might have to adapt the address parser settings if you're outside of the USA, but that looks rather easy to do with a little prior knowledge of Python.
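
To give a rough idea of what splitting a full address string looks like in Python, here is a minimal sketch. It uses the standard library `re` module rather than pyparsing, so it is a stand-in for the approach above and not the streetAddressParser itself. The pattern, the list of street types and the function name `parse_address` are all assumptions for illustration and only cover simple US-style addresses.

```python
import re

# Hypothetical pattern for simple US-style addresses such as "123 N Main St".
# The direction and street-type lists are illustrative assumptions, not complete.
ADDRESS_RE = re.compile(
    r"""^\s*
    (?P<number>\d+)\s+                              # house number
    (?:(?P<direction>N|S|E|W|NE|NW|SE|SW)\.?\s+)?   # optional prefix direction
    (?P<name>.+?)\s+                                # street name (lazy match)
    (?P<suffix>St|Ave|Rd|Blvd|Dr|Ln|Ct|Way)\.?      # street type
    \s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_address(text):
    """Return a dict of address parts, or None if the text does not match."""
    match = ADDRESS_RE.match(text)
    if match is None:
        return None
    return dict(match.groupdict())


print(parse_address("123 N Main St"))
print(parse_address("PO Box 12"))  # does not fit the pattern
```

Addresses that return `None` are the ones worth routing to a separate branch for manual review, which is where most of the hidden complexity tends to show up.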

 

 

Good luck!

 

 

David
Badge +14
David is right: this can be simple or very hard, depending on your requirements. Managing your address rules with FME is typically no problem; it gets tricky when addresses are entered that do not conform to the rules, and in those cases you almost have to predict what the issues might be. For example, you may have a few zip codes entered in your Building Number field. In that case you need to look for something that looks like a zip code (using a regular expression) and then put the matched values in the right field; you don't want to throw away data just because it is in the wrong field. This is possible with FME, as is validating whether a zip code is actually valid by checking it against a list of valid codes, and so on.
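
The zip code case above could be sketched in Python roughly like this, for example as logic to adapt inside a PythonCaller. The field names (`building_number`, `zip_code`, `zip_valid`), the function name `relocate_zip` and the sample code in the usage are assumptions for illustration. Since a five-digit building number can be perfectly legitimate, this sketch only moves a value when it also appears in the list of valid codes.

```python
import re

# US ZIP or ZIP+4, e.g. "53703" or "53703-1234".
ZIP_RE = re.compile(r"^\d{5}(?:-\d{4})?$")


def relocate_zip(record, valid_zips):
    """Move a zip-like building number into an empty zip_code field.

    record is a dict with hypothetical keys 'building_number' and 'zip_code';
    valid_zips is a set of known five-digit codes. Returns a new dict.
    """
    fixed = dict(record)
    value = (fixed.get("building_number") or "").strip()
    looks_like_zip = ZIP_RE.match(value) is not None
    # Only move the value if the zip field is empty and the code is known.
    if looks_like_zip and not fixed.get("zip_code") and value[:5] in valid_zips:
        fixed["zip_code"] = value
        fixed["building_number"] = None
    # Flag whether the resulting zip code is in the list of valid codes.
    fixed["zip_valid"] = (fixed.get("zip_code") or "")[:5] in valid_zips
    return fixed


valid = {"53703"}  # sample list of valid codes
print(relocate_zip({"building_number": "53703", "zip_code": ""}, valid))
```

Records that end up with `zip_valid` set to `False` keep their data untouched, so nothing is thrown away and they can be reviewed later.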
Thanks Dave,

 

Here is the complexity of this problem in a nutshell.

 

I have over 70 counties to process into one database. They have all submitted their parcels and addresses separately and in different schemas. Some have parsed addresses, some have full address fields that need to be parsed, and some have zip codes while others don't. Place name, prefix, suffix, direction... all of this needs to be put into a new, common schema.

 

I am not working alone; in fact, I am a part-time tech just trying to do my part as best I can.

 

The more I look, the more I feel that there is no one-size-fits-all transformer or tool for what needs to be done here.

 

Thanks again 

 

Pat
Badge +14
Tricky. You are right: it sounds like you might have to treat each input differently, so it will take quite a bit of effort, and no single transformer will be the silver bullet. I think the best you can do is pick out the county submissions closest to your target schema and deal with those first for some quick wins. Then look at the others and, for each one, decide which fields map to which fields in your target schema. If you take this approach, the SchemaMapper will certainly be of some use, though mapping your data to a conflated intermediate schema might be an interesting starting point before you do the cleansing and parsing to the final schema. Best of luck.
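
The field-mapping idea can be illustrated outside FME with a small Python sketch. This is not how the SchemaMapper works internally; it only shows the lookup-table principle. The county names, source field names and target fields below are all made up for the example.

```python
# Hypothetical per-county lookup tables: source field name -> target field name.
FIELD_MAPS = {
    "county_a": {"HOUSE_NO": "number", "STREET": "street_name", "ZIP": "zip_code"},
    "county_b": {"addr_num": "number", "st_name": "street_name", "full_addr": "full_address"},
}

# Hypothetical common (intermediate) schema.
TARGET_FIELDS = ["number", "street_name", "zip_code", "full_address"]


def map_to_target(county, record):
    """Rename a county record's fields to the common schema.

    Target fields with no source value are set to None; source fields
    without a mapping are dropped.
    """
    mapping = FIELD_MAPS[county]
    out = {field: None for field in TARGET_FIELDS}
    for source_field, value in record.items():
        target_field = mapping.get(source_field)
        if target_field is not None:
            out[target_field] = value
    return out


print(map_to_target("county_a", {"HOUSE_NO": "12", "STREET": "Main", "ZIP": "53703"}))
```

Keeping one table per county means a new submission only needs a new table, and counties that deliver just a full address string land in `full_address`, ready for a later parsing step.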
