I use AttributeCreator to prepare a list of columns (A, B, C, D, etc..) for an Excel writer. My problem is that the list is 4000 items long. I have to manually expose every one of the column labels, which is not feasible. I have looked up and down and did not find a solution. Is there a way with PythonCaller to label a feature as exposed?
Â
Â
Â
Hi @fjcharp,
Could you share us your Workspace template ( .FMWT )?
Thanks,
Danilo
Hi @fjcharp,
Could you share us your Workspace template ( .FMWT )?
Thanks,
Danilo
Â
I certainly can. moe.zipÂ
The model reads in a table that I need to pivot and write to Excel. The steps are self-explanatory. The reader included here reads a small subset of the complete set (for testing purposes), but the subset is still large enough to require more columns than the ones I manually exposed.Â
You can ignore the PythonCaller component: it is work in progress that simply allowed me to examine the features output by AttributeCreator.Â
So the problem is: manually exposing an excessive number of attributes in AttributeCreator is painful and not practical. So my thought was: why don't I create the Excel column name attribute in the lists features (emulating what AttributeCreator does) and then expose these newly created attributes? That's where I hit the proverbial wall :-)Â
FredericHi @fjcharp, I think it would be easier to rename the attribute to destination Excel column name before aggregating the item features for each original CSV field name, rather than once creating a list then renaming its elements.
FME can rename the attribute to destination Excel column name and also configure destination feature type schema automatically, if you implement your workspace as a dynamic workflow. I think the method described in this article would be suitable to this case.
Dynamic Workflows: Destination Schema is Derived from a Schema Feature
Additionally, the NumToAlphaConverter custom transformer from FME Hub can be used effectively to generate a series of Excel column names based on a sequence of numbers.See the attached workspace example to learn more: moe-2.zip (FME 2017.0.1.1)Hope this helps.Hi @fjcharp, I think it would be easier to rename the attribute to destination Excel column name before aggregating the item features for each original CSV field name, rather than once creating a list then renaming its elements.
FME can rename the attribute to destination Excel column name and also configure destination feature type schema automatically, if you implement your workspace as a dynamic workflow. I think the method described in this article would be suitable to this case.
Dynamic Workflows: Destination Schema is Derived from a Schema Feature
Additionally, the NumToAlphaConverter custom transformer from FME Hub can be used effectively to generate a series of Excel column names based on a sequence of numbers.See the attached workspace example to learn more: moe-2.zip (FME 2017.0.1.1)Hope this helps.Â
Congratulations @takashiTakashi,
Wow... Thank you so much for taking the time. The modified template is quite the tutorial. I have learned a lot from it.
Frederic
In this case, replacing more transformers with Python codes could increase the performance drastically. For example:
# PythonCaller Script Example
import fmeobjects
# Get Excel Column Name (A,B,C,...) corresponding to a column index.
# index: 0-based column index
def xlsxColumnName(index):
    alpha = ''
    while 0 <= index:
        alpha = chr(index % 26 + 65) + alpha
        index = index // 26 - 1
    return alpha
class FeatureProcessor(object):
    def __init__(self):
        self.column_index = 0
        self.is_target = lambda attr: \
            not '(1900/01/00)' in attr \
            and not attr.startswith('fme_') \
            and not attr.startswith('csv_')
           Â
    def input(self, feature):
        # Collect target attribute names.
        attribs = da for a in feature.getAllAttributeNames() if self.is_target(a)]
       Â
        # First column (column index = 0) values are attribute names.
        # Assume the first incoming feature has all target attributes.
        if self.column_index == 0:
            # Create and output a schema feature at first.
            numColumns = int(FME_MacroValues 'NUM'])
            schema = fmeobjects.FMEFeature()
            schema.setAttribute('fme_feature_type_name', \
                FME_MacroValuesÂ'DestExcelSheetName'])
            schema.setAttribute('fme_schema_handling', 'schema_only')
            for i in range(numColumns):
                schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
                schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
            self.pyoutput(schema)
           Â
            item = fmeobjects.FMEFeature()
            for attr in attribs:
                item.setAttribute('_attr_name', attr)
                item.setAttribute('A', attr)
                self.pyoutput(item)
            self.column_index = 1
       Â
        # Explode the incoming feature with the target attributes.
        item = fmeobjects.FMEFeature()
        columnName = xlsxColumnName(self.column_index)
        for attr in attribs:
            item.setAttribute('_attr_name', attr)
            item.setAttribute(columnName, feature.getAttribute(attr))
            self.pyoutput(item)
        self.column_index += 1
There still is a room to improve the performance.
To be continued (maybe).
Hi @fjcharp,
Could you share us your Workspace template ( .FMWT )?
Thanks,
Danilo
Â
FredericÂ
Â
In this case, replacing more transformers with Python codes could increase the performance drastically. For example:
# PythonCaller Script Example
import fmeobjects
# Get Excel Column Name (A,B,C,...) corresponding to a column index.
# index: 0-based column index
def xlsxColumnName(index):
    alpha = ''
    while 0 <= index:
        alpha = chr(index % 26 + 65) + alpha
        index = index // 26 - 1
    return alpha
class FeatureProcessor(object):
    def __init__(self):
        self.column_index = 0
        self.is_target = lambda attr: \
            not '(1900/01/00)' in attr \
            and not attr.startswith('fme_') \
            and not attr.startswith('csv_')
           Â
    def input(self, feature):
        # Collect target attribute names.
        attribs = da for a in feature.getAllAttributeNames() if self.is_target(a)]
       Â
        # First column (column index = 0) values are attribute names.
        # Assume the first incoming feature has all target attributes.
        if self.column_index == 0:
            # Create and output a schema feature at first.
            numColumns = int(FME_MacroValues 'NUM'])
            schema = fmeobjects.FMEFeature()
            schema.setAttribute('fme_feature_type_name', \
                FME_MacroValuesÂ'DestExcelSheetName'])
            schema.setAttribute('fme_schema_handling', 'schema_only')
            for i in range(numColumns):
                schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
                schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
            self.pyoutput(schema)
           Â
            item = fmeobjects.FMEFeature()
            for attr in attribs:
                item.setAttribute('_attr_name', attr)
                item.setAttribute('A', attr)
                self.pyoutput(item)
            self.column_index = 1
       Â
        # Explode the incoming feature with the target attributes.
        item = fmeobjects.FMEFeature()
        columnName = xlsxColumnName(self.column_index)
        for attr in attribs:
            item.setAttribute('_attr_name', attr)
            item.setAttribute(columnName, feature.getAttribute(attr))
            self.pyoutput(item)
        self.column_index += 1
There still is a room to improve the performance.
To be continued (maybe).
Â
Â
Hmmm... This is a good seed for a more general-purpose pivot function. I had done something similar, but not being sufficiently experienced with schemas, I took the easy way out and wrote the output directly to a CSV file. I had also let the Aggregator create a list of columns items before invoking my Python Caller. The code is very rough... I was under the gun to get something out, so I slapped something together as fast as I could. I am going to spend some time now to clean this up and integrate your approach.Â
Â
???????????,Â
FredericÂÂ
Â
???????? ^^
Further, the CSV reader and all transformers in the previous example can be replaced with a single PythonCreator, and it has increased the performance much.
# PythonCreator Script Example
import fmeobjects, csv
class FeatureCreator(object):
    def close(self):
        # Return True if the CSV field is NOT a target to be processed.
        notTarget = lambda field: \
            '(1900/01/00)' in field
           Â
        # Return Excel Column Name (A,B,C,...) corresponding to a column index.
        # index: 0-based column index
        def xlsxColumnName(index):
            name = ''
            while 0 <= index:
                name = chr(index % 26 + 65) + name
                index = index // 26 - 1
            return name
        # Create and output a schema feature.
        schema = fmeobjects.FMEFeature()
        attrToValue = {
            'fme_feature_type_name' : FME_MacroValues 'DestExcelSheetName'],
            'fme_schema_handling' : 'schema_only',
        }
        for attr, value in attrToValue.items():
            schema.setAttribute(attr, value)
        for i in range(int(FME_MacroValuesn'NUM'])):
            schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
            schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
        self.pyoutput(schema)
       Â
        # Get cell values from source CSV table and save them into a dictionary.
        # key   = 0-based index of source field (destination row)
        # value = dictionary {destination column name : cell value}
        data = {}
        with open(FME_MacroValuesi'SourceDataset_CSV2'], 'r') as csvfile:
            nontargets = None
            for i, row in enumerate(csv.reader(csvfile)):
                # Collect non-target keys from the first row (i.e. field names).
                if i == 0:
                    nontargets = {k for k, v in enumerate(row) if notTarget(v)}
                c = xlsxColumnName(i)
                for k, v in enumerate(row):
                    if k not in nontargets:
                        data.setdefault(k, {}) c] = v
                   Â
        # Create and output data features for each row in destination Excel table.
        for k in sorted(data.keys()):
            feature = fmeobjects.FMEFeature()
            for c, v in data:k].items():
                feature.setAttribute(c, v)
            self.pyoutput(feature)
Dynamic Schema Configuration in the Excel Writer Feature Type
If you don't need to write the field names (A,B,C...) into the first row,
- set 'No' to the Output Field Names parameter,
- and
set 'Yes' to the Use Attribute Names As Column Positions parameter.