
Question

Python Exception : unbalanced parenthesis

11 years ago
December 4, 2013
18 replies
81 views

dallasbarr
8 replies

Hello,

I get the following error:

Python Exception <error>: unbalanced parenthesis
Traceback (most recent call last):
  File "<string>", line 16, in input
  File "C:\Program Files\FME\fmepython27\lib\re.py", line 142, in search
    return _compile(pattern, flags).search(string)
  File "C:\Program Files\FME\fmepython27\lib\re.py", line 245, in _compile
    raise error, v # invalid expression
error: unbalanced parenthesis
Error encountered while calling method `input'
PythonFactory failed to process feature
A fatal error has occurred. Check the logfile above for details

when using the following script:

import fmeobjects, re

class FeatureProcessor(object):
    def __init__(self):
        pass

    def input(self, feature):
        SOURCE = feature.getAttribute('SOURCE')
        MATCH = feature.getAttribute('_list{}.MATCH')
        if not SOURCE or not MATCH:
            return
        feature.removeAttrsWithPrefix('_list')
        for MATCH in MATCH:
            if re.search(MATCH, SOURCE, re.IGNORECASE):
                newFeature = feature.cloneAttributes()
                newFeature.setAttribute('MATCH', MATCH)
                self.pyoutput(newFeature)

    def close(self):
        pass

Any suggestions?

Thanks in advance

Anonymous
0 replies
11 years ago
December 4, 2013

Make sure that your columns are indented properly as Python uses indentation for code blocks. Each line in a code block must begin at the same column as the first line in the code block.

Use your text editor to view hidden characters and do not mix tabs and spaces. A Python-aware editor like PyScripter (Windows) or TextWrangler (Mac) can help you spot these errors.

takashi
7723 replies
11 years ago
December 4, 2013

Hi,

That error can occur when the pattern (the first argument of the re.search function) contains parentheses ( ). Check whether any MATCH string does. The pattern must be a valid regular expression, so special characters in the string used as the pattern (i.e. MATCH in this case) have to be escaped beforehand. For example: ( --> \( and ) --> \)
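A minimal sketch to check this outside FME, assuming Python 3 (the thread itself uses FME's Python 2.7). The helper name `is_valid_regex` and the sample pattern are made up for illustration; the idea is just to test each MATCH value before it reaches re.search:

```python
import re

def is_valid_regex(text):
    """Return True if text compiles as a regular expression."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False

# Hypothetical MATCH value with a stray closing parenthesis.
print(is_valid_regex("S-24AJ)"))
# The same value with the parenthesis escaped by hand.
print(is_valid_regex(r"S-24AJ\)"))
```

Running something like this over the whole list should show which MATCH values trigger the exception.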

Takashi

takashi
7723 replies
11 years ago
December 5, 2013

If MATCH may contain special characters (metacharacters for regular expressions), the re.escape function could be useful. This function returns a string in which every character except letters and digits is escaped.

-----
if re.search(re.escape(MATCH), SOURCE, re.IGNORECASE):
-----

Is the script from my example?

> Add country attribute by searching for words

Sorry, I didn't notice that a country name may contain metacharacters.
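To illustrate why escaping matters even when the parentheses are balanced, here is a small sketch, assuming Python 3 and sample strings I made up. A value like "S-24(AJ)" is a valid pattern, so no exception is raised, but unescaped it is read as a group and silently fails to match the literal text:

```python
import re

source = "Production at S-24(AJ) in Japan has resumed"
sector = "S-24(AJ)"  # parentheses are regex metacharacters

# Unescaped, "(AJ)" is a group, so the pattern means "S-24AJ"
# and does not match the literal text in source.
unescaped_hit = re.search(sector, source, re.IGNORECASE)

# Escaped, every metacharacter is matched literally.
escaped_hit = re.search(re.escape(sector), source, re.IGNORECASE)

print(unescaped_hit is None)
print(escaped_hit.group(0))
```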

dallasbarr
Author
8 replies
11 years ago
December 5, 2013

Hi Takashi,

Thanks again for your help, I got the script to work but I did not yet reach my final goal.

My final goal is to load news articles (in spreadsheets) into FME (as the 'SOURCE' attribute), which would then be automatically linked to a spatial location (the 'MATCH'). These locations are not always countries but can also be, for example, sectors (which apparently have some metacharacters in them).

The problem I have now is that many different countries have sectors with the same name, for example Sector 5 in Japan and Sector 5 in Belgium. As a result, a news article that mentions Sector 5 in its body (the SOURCE) is often linked to the wrong Sector 5. What would be the best way to link sector names only if the country names match as well? Do I implement it in the script (I have no Python knowledge whatsoever), or do I use FME transformers to link the countries before linking the sectors?

Kind regards,

takashi
7723 replies
11 years ago
December 5, 2013

If possible, could you please show us concrete schemas of the spreadsheet and shapefile (related field columns and some sample contents) ?

takashi
7723 replies
11 years ago
December 5, 2013

Especially I'd like to know schema of the shape file. I guess the table contains 2 attributes - country name and sector name. Does the table look like this?

dallasbarr
Author
8 replies
11 years ago
December 5, 2013

Hi Takashi,

The shapefiles are indeed something like this:

SECTOR    | COUNTRY | FUNCTION   | ...
Blok-99   | Japan   | commercial |
S-24(AJ)  | Japan   | Industry   |
Blok-99   | Iran    | Nuclear    |
Blok 5    | Belgium | commercial |
Sector 99 | Belgium | Industry   |

This is a bit different from the first exercise you helped me with (where I had to retrieve the country name from a news article).

The source data now already has columns listing the sector and the country, so I think it would be fairly easy to match them based on these two criteria.

Example:

ARTICLE                         | SECTOR  | COUNTRY |
Nuclear powerplant 1st birthday | Blok-99 | Iran    |
50% of on hello kitty           | Blok-99 | Japan   |

However, I would also like to build a script that can take an article (which mentions the country and the sector in its text) as input.

For example:

"In Iran, the Nuclear powerplant in Blok-99 has celebrated its first birthday"

First I would like to match the SOURCE to a MATCH based on the country, in this case Iran. In Iran, no two sectors have the same name, so once the source has been matched to Iran, I would like it to look through all the sectors in Iran and see if there is a match.

What I have so far is your suggestion, which successfully matches the article with a country if its name is stated in the article.

Hope this is a bit clear :)

takashi
7723 replies
11 years ago
December 5, 2013

It's clear now. See my first post in the previous thread.

> Add country attribute by searching for words

Before FeatureMerger

You can see that the ListBuilder creates these lists, since the input table has 2 fields named SECTOR and COUNTRY.

_list{}.SECTOR
_list{}.COUNTRY

After FeatureMerger

The features from the ListExploder will have SECTOR and COUNTRY. After this, there is a StringSearcher which searches for the country name in the article. You don't need to change the workflow before this StringSearcher. Add a second StringSearcher to search for the sector name in the same article. Then you get articles, each of which contains a correct pair of COUNTRY and SECTOR. Even if you search for SECTOR before searching for COUNTRY, the result will be the same.

A Python example would look like this.

-----
import fmeobjects, re

class LocationFinder(object):
    def __init__(self):
        pass

    def input(self, feature):
        source = feature.getAttribute('SOURCE')
        countries = feature.getAttribute('_list{}.COUNTRY')
        sectors = feature.getAttribute('_list{}.SECTOR')
        if not source or not countries or not sectors:
            return
        feature.removeAttrsWithPrefix('_list')
        for country, sector in zip(countries, sectors):
            if re.search(re.escape(country), str(source), re.IGNORECASE) \
                and re.search(re.escape(sector), str(source), re.IGNORECASE):
                newFeature = feature.cloneAttributes()
                newFeature.setAttribute('COUNTRY', country)
                newFeature.setAttribute('SECTOR', sector)
                self.pyoutput(newFeature)

    def close(self):
        pass
-----
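For anyone who wants to try the matching logic without FME, here is a rough standalone sketch, assuming Python 3. The function name `find_locations` is hypothetical, and plain lists stand in for the `_list{}` attributes; the sample rows are the ones posted earlier in this thread:

```python
import re

def find_locations(source, countries, sectors):
    """Return (country, sector) pairs whose names both occur in source."""
    hits = []
    for country, sector in zip(countries, sectors):
        # Both names must be found, so "Blok-99" alone is not enough.
        if (re.search(re.escape(country), source, re.IGNORECASE)
                and re.search(re.escape(sector), source, re.IGNORECASE)):
            hits.append((country, sector))
    return hits

countries = ["Japan", "Japan", "Iran", "Belgium", "Belgium"]
sectors = ["Blok-99", "S-24(AJ)", "Blok-99", "Blok 5", "Sector 99"]
article = ("In Iran, the Nuclear powerplant in Blok-99 "
           "has celebrated its first birthday")

print(find_locations(article, countries, sectors))
# [('Iran', 'Blok-99')]
```

Only the Iran row is returned, because the Japan row shares the sector name but its country is not mentioned in the article.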

takashi
7723 replies
11 years ago
December 5, 2013

Tip: If the shape table has many attributes other than SECTOR and COUNTRY, consider removing them with the AttributeRemover or the AttributeKeeper before the ListBuilder, which may improve efficiency.

takashi
7723 replies
11 years ago
December 5, 2013

Another approach just occurred to me. Maybe the InlineQuerier can be used effectively. I'm home now and will try it tomorrow.

dallasbarr
Author
8 replies
11 years ago
December 5, 2013

Thank you Takashi, you are too kind!

I got the script to work (with the two StringSearchers)! Because there are 33000 different sectors, it does take a while to run, about 7 minutes. I will try to run it with a Python script tomorrow to see if it's faster.

Thank you again!

takashi
7723 replies
11 years ago
December 6, 2013

Hi,

I think efficiency is important here. A more efficient Python script is also possible, but it would be a little complicated, which I think is not preferable from the viewpoints of understandability and maintainability. I expect the InlineQuerier to be simpler and also more efficient.

InlineQuerier Settings Example

Assume the shape features have attributes named SECTOR and COUNTRY, and the spreadsheet features have attributes named ARTICLE and SOURCE.

Inputs

Table: Location
Columns:
SECTOR text
COUNTRY text

Table: NewsArticle
Columns:
ARTICLE text
SOURCE text

Outputs

Output Port: Matched
SQL Query:
-----
select a.fme_feature_content, b.ARTICLE, b.SOURCE
from Location as a cross join NewsArticle as b
where b.SOURCE like '%'||a.SECTOR||'%'
and b.SOURCE like '%'||a.COUNTRY||'%'
-----
Geometry: First Feature

With the settings above, "Location" and "NewsArticle" will be created as input ports of the InlineQuerier. Send the shape features to "Location" and the spreadsheet features to "NewsArticle". You then get shape features that have the associated ARTICLE and SOURCE as attributes.

The "cross join" and the where clause in the SQL statement are the key points. "a.fme_feature_content" selects all content (including geometry) from table "a", i.e. Location, so the InlineQuerier treats the shape features as something like the REQUESTOR on the FeatureMerger.
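The query can also be tried outside FME. Below is a rough sketch, assuming Python 3 and its built-in sqlite3 module with an in-memory database. `fme_feature_content` exists only inside the InlineQuerier, so plain columns are selected instead; the sample rows and the article IDs "A1" and "A2" are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Location (SECTOR text, COUNTRY text)")
conn.execute("create table NewsArticle (ARTICLE text, SOURCE text)")
conn.executemany("insert into Location values (?, ?)", [
    ("Blok-99", "Japan"), ("S-24(AJ)", "Japan"), ("Blok-99", "Iran"),
    ("Blok 5", "Belgium"), ("Sector 99", "Belgium"),
])
conn.executemany("insert into NewsArticle values (?, ?)", [
    ("A1", "In Iran, the Nuclear powerplant in Blok-99 "
           "has celebrated its first birthday"),
    ("A2", "Hello Kitty sale in Blok-99, Japan"),
])

# Every location is paired with every article; the where clause keeps
# only pairs where the article text contains both sector and country.
rows = conn.execute("""
    select a.SECTOR, a.COUNTRY, b.ARTICLE
    from Location as a cross join NewsArticle as b
    where b.SOURCE like '%' || a.SECTOR || '%'
      and b.SOURCE like '%' || a.COUNTRY || '%'
    order by b.ARTICLE, a.COUNTRY
""").fetchall()

print(rows)
# [('Blok-99', 'Iran', 'A1'), ('Blok-99', 'Japan', 'A2')]
```

One thing to keep in mind with this approach: `%` and `_` inside a sector or country name act as wildcards in a LIKE pattern, so names containing those characters may match more than intended.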

"a.fme_feature_content" can be replaced with "a.*". The InlineQuerier uses SQLite internally, limitations basically depend on SQLite specifications.

If possible, let us know for future reference which solution is more efficient on the actual data.

Takashi

dallasbarr
Author
8 replies
11 years ago
December 6, 2013

Hi Takashi,

I got the InlineQuerier to work and it's a lot faster and simpler than the ListExploder and StringSearchers.

Whereas the StringSearchers use 3'45" of computing time, the InlineQuerier query does it in 9". I did not get the Python script to work.

The percentage of matches is similar, but slightly different. I will now polish the scripts up, try to improve the recognition capabilities, and see which one works best. I will definitely let you know the outcome. Thank you once more, it is so nice to be helped by someone halfway around the world.

Very kind regards

david_r
8355 replies
11 years ago
December 6, 2013

Hi,

just a quick heads-up regarding the InlineQuerier: The LIKE operator used for the matching is only case-insensitive for characters inside the ASCII range. For all others it is case sensitive, which may influence the matching.

Example:

"AFGHANISTAN" LIKE "afghanistan" = TRUE
"FØROYAR" LIKE "føroyar" = FALSE

Hopefully this will not be an issue for you.
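A quick way to check this behaviour on a given SQLite build is sketched below, assuming Python 3 and its built-in sqlite3 module (the helper name `like` is hypothetical). The result of the non-ASCII comparison depends on how SQLite was compiled, so it is printed rather than taken for granted. The last line shows one possible workaround: normalising case before the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def like(text, pattern):
    """Evaluate 'text LIKE pattern' in SQLite and return a bool."""
    row = conn.execute("select ? like ?", (text, pattern)).fetchone()
    return bool(row[0])

# ASCII letters are compared case-insensitively by LIKE.
print(like("AFGHANISTAN", "afghanistan"))

# Non-ASCII letters are not case-folded by a default SQLite build.
print(like("FØROYAR", "føroyar"))

# Workaround sketch: lower-case both sides before comparing.
print(like("FØROYAR".lower(), "føroyar".lower()))
```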

David

takashi
7723 replies
11 years ago
December 6, 2013

David, thanks for the caution. Yes, case sensitivity could be an issue depending on the requirement. Unfortunately, it seems that case sensitivity cannot be controlled in the SQL syntax for the InlineQuerier. I also hope it will not be an issue in your project.

dallasbarr
Author
8 replies
11 years ago
December 6, 2013

Maybe I could use the StringCaseChanger at the beginning of the script? Or does this also not affect the non-ASCII characters?

david_r
8355 replies
11 years ago
December 6, 2013

Hi,

the StringCaseChanger works for extended characters as well, so that might be a good solution.

David

dallasbarr
Author
8 replies
11 years ago
December 10, 2013

Hi all

A short heads-up on my script so far. I have two readers:

a. NewsArticle

ID-NA | Text | Country_S | Sector_S

1. ...

... ...

30 ...

b. Shapefiles of all different sectors

ID-S | Sector_dB | Country_dB |

1. ....

... ....

30000 .....

The main goal is to assign the different news articles to the correct sector. I am doing this with the InlineQuerier query that Takashi suggested. This query's output is all the articles which are matched both by country and by sector (let's say 14 out of the 30 articles).

I would now like to use the rest of the articles (that have not been assigned to a specific sector) to be linked at the country level. I have a shapefile of all the countries.

I have come up so far with a query that links all the articles to a country, but I only need the ones that are not yet linked to a sector.

Is there an easy filter method or even better a query to efficiently do this?
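One possible query shape for this is an anti-join with "not exists". The sketch below assumes Python 3 with the built-in sqlite3 module and made-up table and column names (`NewsArticle`, `SectorMatch`, `ID_NA`, `ID_S`, `Body`); `SectorMatch` stands in for the output of the sector query. Whether this fits the InlineQuerier as-is depends on how the matched features are fed back in, so treat it as a starting point only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table NewsArticle (ID_NA integer, Body text)")
# Hypothetical table holding the articles already linked to a sector.
conn.execute("create table SectorMatch (ID_NA integer, ID_S integer)")
conn.executemany("insert into NewsArticle values (?, ?)",
                 [(1, "article one"), (2, "article two"), (3, "article three")])
conn.executemany("insert into SectorMatch values (?, ?)", [(1, 10), (3, 12)])

# Keep only the articles that have no row in SectorMatch.
unmatched = conn.execute("""
    select n.ID_NA, n.Body
    from NewsArticle as n
    where not exists (
        select 1 from SectorMatch as m where m.ID_NA = n.ID_NA
    )
    order by n.ID_NA
""").fetchall()

print(unmatched)
# [(2, 'article two')]
```

The remaining articles could then go through the country-level query on their own.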
