
Hi all,

 

I need a hand integrating this Python regex script into a PythonCaller. I basically want it to run as shown below. I have it all working in Python, but I have now been asked to do it in FME (which is what we use at work), and I'm struggling to understand how to integrate Python into the FME workflow.

If you could help me set this up in a basic manner, I'm sure I will get the hang of it, but I have been struggling for a day or two.

(I also know I could do this without Python, but I want to extend beyond this functionality later and prefer Python.)

My Python script is along the lines of:

import re

def check_filename(filename):
    # Returns (True, "") on a match, or (False, filename) otherwise.
    pattern = r'^([A-Z]{3}-\d{3}-[A-Z]-[A-Z]{3}-MOD-\d{2}-[A-Z]{3}-[A-Z]{3}-\d{4})(-[A-Z]-\d{2})?(\.\w{3,4})$'
    match = re.match(pattern, filename)

    return bool(match), filename if not match else ""

def check_all_files():
    # filenames_, passes and fails are lists defined elsewhere in the script.
    for filename in filenames_:
        result, reason = check_filename(filename)
        if result:
            passes.append(filename)
        else:
            fails.append((filename, 'Not meeting schema or placed incorrect folder. Please amend to this schema.'))


I've simplified the workbench and attached. Any tips are appreciated.


You could do this without a PythonCaller:

  1. Expose the format attribute "fme_basename" on both readers
  2. Replace the PythonCaller with a StringSearcher on "fme_basename"
  3. The StringSearcher's two output ports (Matched and NotMatched) will tell you which filenames matched your regex and which did not
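Before pasting the expression into a StringSearcher, it can help to check it in plain Python. The following is only a sketch: the pattern is the one from the question, the sample filenames are made up for illustration, and the `partition` helper is a hypothetical name that simply mimics the idea of two output ports.

```python
import re

# Pattern from the question; it expects the full filename, extension included.
PATTERN = re.compile(
    r'^([A-Z]{3}-\d{3}-[A-Z]-[A-Z]{3}-MOD-\d{2}-[A-Z]{3}-[A-Z]{3}-\d{4})'
    r'(-[A-Z]-\d{2})?(\.\w{3,4})$'
)


def partition(names):
    """Split names into (matched, not_matched), like two output ports."""
    matched, not_matched = [], []
    for name in names:
        (matched if PATTERN.match(name) else not_matched).append(name)
    return matched, not_matched


# Hypothetical sample names, invented for this example.
samples = [
    'ABC-123-X-DEF-MOD-01-GHI-JKL-0001.dwg',       # valid
    'ABC-123-X-DEF-MOD-01-GHI-JKL-0001-P-02.pdf',  # valid, optional suffix
    'abc-123-x-def-mod-01-ghi-jkl-0001.dwg',       # lowercase, rejected
    'ABC-123-X-DEF-MOD-01-GHI-JKL-0001',           # no extension, rejected
]
matched, not_matched = partition(samples)
print(matched)
print(not_matched)
```

Note that the last sample fails only because the extension is missing, which is the situation you get when matching against "fme_basename" directly.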

This should work in your PythonCaller:

import fme
import fmeobjects
import re
 
 
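def check_filename(filename):
    # Returns (True, "") on a match, or (False, filename) otherwise.
    pattern = r'^([A-Z]{3}-\d{3}-[A-Z]-[A-Z]{3}-MOD-\d{2}-[A-Z]{3}-[A-Z]{3}-\d{4})(-[A-Z]-\d{2})?(\.\w{3,4})$'
    match = re.match(pattern, filename)
    print(match)  # debug output, can be removed once everything works
    
    return bool(match), filename if not match else ""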
 
 
class CheckFilenames(object):
    """Template Class Interface:
    When using this class, make sure its name is set as the value of the 'Class
    to Process Features' transformer parameter.
    """
 
    def __init__(self):
        """Base constructor for class members."""
        pass
 
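    def input(self, feature):
        # 'fme_basename' must be exposed on the reader. Fall back to an empty
        # string so a missing attribute fails the check instead of raising.
        filename_ = feature.getAttribute('fme_basename') or ''
        result, reason = check_filename(filename_)
        # Stored as the string "True" or "False" for a downstream Tester.
        feature.setAttribute('result', str(result))
        feature.setAttribute('reason', reason)
        self.pyoutput(feature)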
 
    def close(self):
        """This method is called once all the FME Features have been processed
        from input().
        """
        pass
 
    def process_group(self):
        """When 'Group By' attribute(s) are specified, this method is called 
        once all the FME Features in a current group have been sent to input().
 
        FME Features sent to input() should generally be cached for group-by 
        processing in this method when knowledge of all Features is required. 
        The resulting Feature(s) from the group-by processing should be emitted 
        through self.pyoutput().
 
        FME will continue calling input() a number of times followed
        by process_group() for each 'Group By' attribute, so this 
        implementation should reset any class members for the next group.
        """
        pass
 
    def has_support_for(self, support_type):
        """This method returns whether this PythonCaller supports a certain type.
        The only supported type is fmeobjects.FME_SUPPORT_FEATURE_TABLE_SHIM.
        
        :param int support_type: The support type being queried.
        :returns: True if the passed in support type is supported.
        :rtype: bool
        """
        if support_type == fmeobjects.FME_SUPPORT_FEATURE_TABLE_SHIM:
            # If this is set to return True, FME will pass features to the input() method that
            # come from a feature table object. This allows for significant performance gains
            # when processing large numbers of features.
            # To enable this, the following conditions must be met:
            #   1) features passed into the input() method cannot be copied or cached for later use
            #   2) features cannot be read or modified after being passed to self.pyoutput()
            #   3) Group Processing must not be enabled
            # Violations will cause undefined behavior.
            return False
 
        return False

After the PythonCaller, insert a Tester to check for the value of "result", either "True" or "False".
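To see what the Tester will receive, here is a rough sketch that runs the same tagging logic outside FME. `FakeFeature` is a hypothetical stand-in written for this example (it is not part of fmeobjects), and the attribute name `filename_with_ext` and the sample values are assumptions as well. The point is only that "result" ends up as the string "True" or "False", so the Tester should compare against those strings.

```python
import re

PATTERN = re.compile(
    r'^([A-Z]{3}-\d{3}-[A-Z]-[A-Z]{3}-MOD-\d{2}-[A-Z]{3}-[A-Z]{3}-\d{4})'
    r'(-[A-Z]-\d{2})?(\.\w{3,4})$'
)


class FakeFeature:
    """Hypothetical stand-in for an FME feature; NOT part of fmeobjects."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)

    def getAttribute(self, name):
        # Returns None when the attribute is missing.
        return self._attrs.get(name)

    def setAttribute(self, name, value):
        self._attrs[name] = value


def tag(feature, attr='filename_with_ext'):
    """Set 'result' and 'reason' the same way the input() method does."""
    name = feature.getAttribute(attr) or ''
    ok = bool(PATTERN.match(name))
    feature.setAttribute('result', str(ok))  # the string 'True' or 'False'
    feature.setAttribute('reason', '' if ok else name)
    return feature


# Hypothetical sample values, invented for this example.
features = [
    tag(FakeFeature(filename_with_ext='ABC-123-X-DEF-MOD-01-GHI-JKL-0001.dwg')),
    tag(FakeFeature(filename_with_ext='wrong-name.dwg')),
    tag(FakeFeature()),  # attribute missing entirely
]

# Plain-Python equivalent of a Tester checking result = "True".
passed = [f for f in features if f.getAttribute('result') == 'True']
failed = [f for f in features if f.getAttribute('result') == 'False']
print(len(passed), len(failed))
```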

Important: "fme_basename" does not contain the file extension, so you'll either have to add it manually or adapt your regex to account for this.
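As an illustration of the second option, one possible adaptation is to keep the naming part of the pattern and drop the extension group when matching against the basename. This is a sketch under assumptions: the names `STEM`, `FULL_NAME` and `BASENAME_ONLY` and the sample filename are made up here, and `os.path.splitext` is only used to imitate a value without its extension.

```python
import os
import re

# Naming part of the pattern from the thread, without anchors or extension.
STEM = (
    r'([A-Z]{3}-\d{3}-[A-Z]-[A-Z]{3}-MOD-\d{2}-[A-Z]{3}-[A-Z]{3}-\d{4})'
    r'(-[A-Z]-\d{2})?'
)

# Original behaviour: the extension is required.
FULL_NAME = re.compile('^' + STEM + r'(\.\w{3,4})$')
# Adapted for values that carry no extension.
BASENAME_ONLY = re.compile('^' + STEM + '$')

# Hypothetical sample filename, invented for this example.
full = 'ABC-123-X-DEF-MOD-01-GHI-JKL-0001.dwg'
base, ext = os.path.splitext(full)  # base has no extension, ext is '.dwg'

print(bool(FULL_NAME.match(base)))        # extension missing
print(bool(BASENAME_ONLY.match(base)))    # adapted pattern
print(bool(FULL_NAME.match(base + ext)))  # extension added back manually
```

The trade-off is that the adapted pattern no longer validates the extension, so if the extension matters it would need to be checked separately.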



Amazing, thank you.

