
Hi. I'm trying to use a regex in the PythonCaller transformer to search a string and return the search result to a new attribute. The regex works fine in the StringSearcher transformer, but I'm getting a bit stuck with the Python code. So far I have the following code. My table has an attribute called "text_line_data", which contains the string to search. I'm trying to extract the result of the regex search into a new attribute called PyTrackExtents, but I can't work out how to complete the code. Can anyone help?

import fme
import fmeobjects
import re

def AttributeExtractor(feature):
    TextLine = feature.getAttribute('text_line_data')
    feature.setAttribute("PyTrackExtents", TextLine)
    p = re.compile('.{13}\\..{8}\\..{8}\\..{4}')
    m = p.search("PyTrackExtents")


Hi

Try something like this:

import fme
import fmeobjects
import re
def FeatureProcessor(feature):
    text_line = feature.getAttribute('text_line_data')
    matches = re.findall(r'.{13}\..{8}\..{8}\..{4}', text_line)
    if matches:
        feature.setAttribute('PyTrackExtents', matches[0])

Note that there is no point in compiling the regex for each feature. If you need the (potential) speed gains from compiling your regex, you should use the class definition of the PythonCaller, where you compile the expression only once when the object is instantiated, e.g.

import fme
import fmeobjects
import re
class FeatureProcessor(object):
    def __init__(self):
        self.comp_re = re.compile(r'.{13}\..{8}\..{8}\..{4}')
    def input(self,feature):
        text_line = feature.getAttribute('text_line_data')
        matches = self.comp_re.findall(text_line)
        if matches:
            feature.setAttribute('PyTrackExtents', matches[0])
        self.pyoutput(feature)

Also note that the two above solutions will only return the first result, so if you want to be able to catch multiple numbers per line, you should replace this line:

            feature.setAttribute('PyTrackExtents', matches[0])

with this:

            feature.setAttribute('PyTrackExtents', matches)

This will create an FME list attribute PyTrackExtents{} containing all the matches found.
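To make the difference between keeping the first match and keeping all of them concrete, here is a minimal sketch in plain Python, outside FME. The sample line is invented for illustration and only needs to contain two tokens of the right shape; the FME side (setAttribute) is deliberately left out.

```python
import re

# Hypothetical sample line containing two tokens that fit the pattern.
text_line = ('ABCDEFGHIJKLM.12345678.abcdefgh.WXYZ '
             'NOPQRSTUVWXYZ.87654321.hgfedcba.ZYXW')

pattern = re.compile(r'.{13}\..{8}\..{8}\..{4}')

# findall() returns a list with every non-overlapping match, in order.
matches = pattern.findall(text_line)

# Taking element 0 keeps only the first match; guard against an empty list.
first_only = matches[0] if matches else None

print(len(matches))
print(first_only)
```

Passing `matches` instead of `first_only` to setAttribute is what would give one list element per match.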

David


If this is the order of methods you are using, it looks like you are searching in the wrong order. The line feature.setAttribute("PyTrackExtents", TextLine) creates a new attribute on the feature with the same data as the input. Python can't search the FME attributes directly; it needs either a string or a reference to a string variable. Right now it looks like you are searching for some GUID in the literal string 'PyTrackExtents', where it will never be found.

Just a quick unchecked guess that might be closer to what you want to achieve:

TextLine = feature.getAttribute('text_line_data')
matchObject = re.search(r'.{13}\..{8}\..{8}\..{4}', TextLine)
if matchObject is not None:
    feature.setAttribute("PyTrackExtents", matchObject.group(0))

Note that the search can return None and that you will need to get the string from the matchObject to pass it to FME. For further reference, please check: https://docs.python.org/2/library/re.html
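The None handling can be tried outside FME with a small helper. This is only a sketch: the function name first_match and the sample string are made up for illustration, and the extra check for a None input assumes that the attribute may be missing on some features.

```python
import re

PATTERN = re.compile(r'.{13}\..{8}\..{8}\..{4}')

def first_match(text):
    """Return the first matching substring, or None if there is none."""
    if text is None:
        # Assumption: a missing attribute arrives as None; searching it would raise TypeError.
        return None
    match = PATTERN.search(text)
    # search() returns a match object or None; group(0) is the whole matched text.
    return match.group(0) if match is not None else None

sample = 'ABCDEFGHIJKLM.12345678.abcdefgh.WXYZ'
print(first_match(sample))
print(first_match('no match here'))
```

In the PythonCaller, the returned string (when it is not None) is what would be handed to feature.setAttribute.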


A backslash in a string literal has to be escaped by one more backslash. Alternatively, consider adding an r or R prefix to the literal. See here.

The Python Language Reference > Lexical analysis > Literals > String Literals

Oh, sorry. Escaping is not essential in this case. See the answers from other users.
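For what it's worth, the reason it still works is that Python leaves an unrecognised escape such as \. in the string unchanged, although newer Python 3 versions emit a warning for it. The doubled-backslash and raw-string spellings avoid that warning and give exactly the same pattern, as this small sketch shows (the sample string is invented):

```python
import re

# Two spellings of the same pattern that avoid the invalid-escape warning.
escaped = '.{13}\\..{8}\\..{8}\\..{4}'   # every backslash doubled
raw = r'.{13}\..{8}\..{8}\..{4}'          # raw string literal

# Both literals produce the identical pattern string.
print(escaped == raw)

sample = 'ABCDEFGHIJKLM.12345678.abcdefgh.WXYZ'
print(re.search(raw, sample) is not None)
```

The raw string is usually the easier one to read and to compare against the expression used in the StringSearcher.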


Thanks everyone. I've got it working based on the first answer. That's saved me a lot of time!

Eric



I'm now looking at using the class definition as suggested by david_r above. Using the template in the PythonCaller, I think the code would be as follows, which has a couple of slight changes to the version suggested by David: in particular, the word "pass" on line 3 and at the end, and the "close" statement. Both versions seem to work OK.

Are the extra bits important? I'm not sure what they do 

    def __init__(TEST):
        TEST.comp_re = re.compile(r'.{13}\..{8}\..{8}\..{4}')
        pass
    def input(TEST,feature):
        text_line = feature.getAttribute('text_line_data')
        matches = TEST.comp_re.findall(text_line)
        if matches:
            feature.setAttribute('PyTrackExtents2', matches[0])
        else:
            feature.setAttribute('PyTrackExtents2', "No matches")
        TEST.pyoutput(feature)
    def close(TEST):
        pass



The Python "pass" statement does absolutely nothing, see here for more info.

You can therefore safely remove both the "pass" statements, but since that would leave the close() method empty, you would also have to remove the method definition. 
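A quick way to see why the empty close() has to go entirely is to let Python compile both variants. This is only an illustrative sketch; the helper name compiles and the two source snippets are made up for the purpose.

```python
def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        # IndentationError ("expected an indented block") is a SyntaxError subclass.
        return False

# A method whose only statement is "pass" is valid...
with_pass_ok = compiles("def close(self):\n    pass\n")

# ...but the same definition with the body removed is not.
empty_body_ok = compiles("def close(self):\n")

print(with_pass_ok, empty_body_ok)
```

The "pass" in __init__ is different: that method already contains another statement, so the "pass" there can simply be deleted.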

EDIT: By the way, your use of "TEST" in various places in your code does not conform to standard Python coding practice, where the first parameter of a method is conventionally named "self". It is not incorrect as such, but it might be quite confusing for anybody picking up your code later :-)
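To illustrate the point about naming, the two toy classes below behave identically; only the name of the first method parameter differs. The class names and the "value" attribute are invented for this sketch.

```python
class WithTest(object):
    def __init__(TEST):
        # Works, because Python passes the instance as the first argument
        # whatever that parameter is called.
        TEST.value = 1

class WithSelf(object):
    def __init__(self):
        # Same behaviour, using the conventional name.
        self.value = 1

print(WithTest().value == WithSelf().value)
```

Sticking to "self" simply makes the code look like what other Python readers expect.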

