Question

Add country attribute by searching for words

Forum|Forum|12 years ago
December 3, 2013
4 replies
72 views

dallasbarr

Hello,

I am fairly new to using FME but I wanted to solve a problem using this software.

I have a spreadsheet with all kinds of news articles which I would like to assign spatially (and automatically) to a specific country. A country name is often mentioned in the articles. I was wondering whether there is some kind of search transformer that could look for country names in the articles and add a new attribute holding the name of the country that was found.

I have two readers: the spreadsheet and a shp file of all the countries. It would be useful to read the search terms (all the country names) from the shp file.

Any suggestions? thx in advance


takashi
December 3, 2013

Hi,

Just an idea. Assume that the spreadsheet feature has an attribute named "article", that the shape feature has an attribute named "country", and that an article could contain multiple country names.

(FME 2013 SP4)

1. Put a FeatureMerger transformer and send the spreadsheet features to its REQUESTOR port.
2. Send the shape features to a ListBuilder to create a feature which has a list attribute containing all country names.
   List Name: _list
3. Send the created feature to the SUPPLIER port of the FeatureMerger.
4. Specify the same constant value (e.g. "1") for both the Requestor and Supplier "Join On" parameters of the FeatureMerger. This performs an unconditional merge.
5. Connect a ListExploder to the MERGED port of the FeatureMerger.
   List Attribute: _list{}
6. Connect a StringSearcher to the LIST_FOUND port of the ListExploder.
   Attribute: article (refer to attribute)
   Regular Expression: country (refer to attribute)

Features having a country name and the associated article will be output from the MATCHED port of the StringSearcher. Then you can merge the associated article to the original shape feature with a second FeatureMerger, using "country" as the key attribute.

I think this approach works, but it might be inefficient because it explodes list attributes. If possible, also consider replacing the ListExploder and the StringSearcher with a PythonCaller.
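Outside FME, the logic of the steps above can be sketched in plain Python. This is only an illustration of the pairing and search, not FME code: the function name `assign_countries` and the sample data are made up, and `re.escape` is an addition here so that each country name is matched literally, whereas the StringSearcher setup above uses the name as a regular expression.

```python
import re

def assign_countries(articles, countries):
    """Pair every article with every country name it mentions."""
    pairs = []
    for article in articles:            # unconditional merge: each article sees the full list
        for country in countries:       # equivalent of exploding the list
            # Case-insensitive literal search (assumption: names are plain text)
            if re.search(re.escape(country), article, re.IGNORECASE):
                pairs.append((article, country))
    return pairs

# Hypothetical sample data
articles = ["Floods hit France and Spain.", "Local election results"]
countries = ["France", "Spain", "Peru"]
pairs = assign_countries(articles, countries)
```

An article that mentions two countries yields two pairs, and an article that mentions none yields nothing, which corresponds to the MATCHED output described above.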

Takashi

Why not inspect features with Visual/Data Preview and Feature/Record Information before writing them into a destination dataset?


dallasbarr
Author
December 3, 2013

Hi Takashi,

Thank you for the response. However, I think I already have a problem at the FeatureMerger transformer. When I inspect its output, all the spreadsheet articles are assigned to the same country (French Polynesia in my case).

As a result, the final output is all unmerged, but every feature does have a new attribute called "country" with the value French Polynesia.

So I suspect something is wrong in the FeatureMerger. Maybe it has something to do with the unconditional merging? (French Polynesia is also the first country in the list, so maybe the constant value of 1 refers to this?)

Db


dallasbarr
Author
December 3, 2013

Never mind I got it to work, you were right takashi. Thank you so much for your help!


takashi
December 4, 2013

It's my pleasure. FYI, the ListExploder and the StringSearcher can be replaced with a PythonCaller with this script, and it will be a little more efficient.

-----

import fmeobjects, re

class CountryFinder(object):
    def __init__(self):
        pass

    def input(self, feature):
        article = feature.getAttribute('article')
        countries = feature.getAttribute('_list{}.country')
        if not article or not countries:
            return
        feature.removeAttrsWithPrefix('_list')
        for country in countries:
            if re.search(country, str(article), re.IGNORECASE):
                newFeature = feature.cloneAttributes()
                newFeature.setAttribute('country', country)
                self.pyoutput(newFeature)

    def close(self):
        pass

-----

If an article can be tokenized appropriately, it could become more efficient. However, since a country name could consist of multiple words (e.g. French Polynesia), tokenizing will not be easy.
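One possible way around the tokenizing problem, as a rough sketch only: instead of splitting the article into words, compile all country names into a single regular expression and scan each article once. This is plain Python rather than PythonCaller code. The helper names `build_country_pattern` and `find_countries` are hypothetical, and the sketch assumes country names are plain text whose lower-cased form identifies them uniquely.

```python
import re

def build_country_pattern(countries):
    """Compile one case-insensitive pattern that matches any country name."""
    # Longest names first, so "Guinea-Bissau" is tried before "Guinea"
    names = sorted(set(countries), key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    # \b keeps "France" from matching inside a longer word
    return re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)

def find_countries(article, pattern, canonical):
    """Return the distinct country names found, in order of first appearance."""
    found = []
    for match in pattern.finditer(article):
        name = canonical[match.group(0).lower()]  # map back to the original spelling
        if name not in found:
            found.append(name)
    return found

# Hypothetical sample data
countries = ["France", "French Polynesia", "Guinea", "Guinea-Bissau"]
pattern = build_country_pattern(countries)
canonical = {c.lower(): c for c in countries}
hits = find_countries("Flights from french polynesia to Guinea-Bissau resumed.",
                      pattern, canonical)
```

Multi-word names such as French Polynesia need no special handling here because the name is matched as a whole, and each article is scanned once rather than once per country.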

