Solved

AttributeCreator and long list of attributes to expose

7 years ago
30 August 2017
10 replies
35 views

fjcharp
7 replies

I use AttributeCreator to prepare a list of columns (A, B, C, D, etc..) for an Excel writer. My problem is that the list is 4000 items long. I have to manually expose every one of the column labels, which is not feasible. I have looked up and down and did not find a solution. Is there a way with PythonCaller to label a feature as exposed?

Best answer by takashi

In this case, replacing more transformers with Python codes could increase the performance drastically. For example:

# PythonCaller Script Example
import fmeobjects

# Get Excel Column Name (A,B,C,...) corresponding to a column index.
# index: 0-based column index
def xlsxColumnName(index):
    alpha = ''
    while 0 <= index:
        alpha = chr(index % 26 + 65) + alpha
        index = index // 26 - 1
    return alpha

class FeatureProcessor(object):
    def __init__(self):
        self.column_index = 0
        self.is_target = lambda attr: \
            not '(1900/01/00)' in attr \
            and not attr.startswith('fme_') \
            and not attr.startswith('csv_')
            
    def input(self, feature):
        # Collect target attribute names.
        attribs = [a for a in feature.getAllAttributeNames() if self.is_target(a)]
        
        # First column (column index = 0) values are attribute names.
        # Assume the first incoming feature has all target attributes.
        if self.column_index == 0:
            # Create and output a schema feature at first.
            numColumns = int(FME_MacroValues['NUM'])
            schema = fmeobjects.FMEFeature()
            schema.setAttribute('fme_feature_type_name', \
                FME_MacroValues['DestExcelSheetName'])
            schema.setAttribute('fme_schema_handling', 'schema_only')
            for i in range(numColumns):
                schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
                schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
            self.pyoutput(schema)
            
            item = fmeobjects.FMEFeature()
            for attr in attribs:
                item.setAttribute('_attr_name', attr)
                item.setAttribute('A', attr)
                self.pyoutput(item)
            self.column_index = 1
        
        # Explode the incoming feature with the target attributes.
        item = fmeobjects.FMEFeature()
        columnName = xlsxColumnName(self.column_index)
        for attr in attribs:
            item.setAttribute('_attr_name', attr)
            item.setAttribute(columnName, feature.getAttribute(attr))
            self.pyoutput(item)
        self.column_index += 1

There still is a room to improve the performance.

To be continued (maybe).

View original

Did this help you find an answer to your question?

fjcharp
Author
7 replies
7 years ago
August 30, 2017

And just to expand a bit on my question... With PythonCaller, I can add the Excel column name attribute to the individual features, essentially duplicating what AttributeCreator does. But this new attribute does not get exposed, thus my question.

+41

danilo_fme
Evangelist
1997 replies
7 years ago
August 30, 2017

Hi @fjcharp,

Could you share us your Workspace template ( .FMWT )?

Thanks,

Danilo

fjcharp
Author
7 replies
7 years ago
August 30, 2017

danilo_fme wrote:

Hi @fjcharp,

Could you share us your Workspace template ( .FMWT )?

Thanks,

Danilo

I certainly can. moe.zip

The model reads in a table that I need to pivot and write to Excel. The steps are self-explanatory. The reader included here reads a small subset of the complete set (for testing purposes), but the subset is still large enough to require more columns than the ones I manually exposed.

You can ignore the PythonCaller component: it is work in progress that simply allowed me to examine the features output by AttributeCreator.

So the problem is: manually exposing an excessive number of attributes in AttributeCreator is painful and not practical. So my thought was: why don't I create the Excel column name attribute in the lists features (emulating what AttributeCreator does) and then expose these newly created attributes? That's where I hit the proverbial wall :-)

Frederic

+17

takashi
Contributor
7538 replies
7 years ago
August 31, 2017

Hi @fjcharp, I think it would be easier to rename the attribute to destination Excel column name before aggregating the item features for each original CSV field name, rather than once creating a list then renaming its elements.

FME can rename the attribute to destination Excel column name and also configure destination feature type schema automatically, if you implement your workspace as a dynamic workflow. I think the method described in this article would be suitable to this case.

Dynamic Workflows: Destination Schema is Derived from a Schema Feature

Additionally, the NumToAlphaConverter custom transformer from FME Hub can be used effectively to generate a series of Excel column names based on a sequence of numbers.See the attached workspace example to learn more: moe-2.zip (FME 2017.0.1.1)Hope this helps.

+41

danilo_fme
Evangelist
1997 replies
7 years ago
August 31, 2017

takashi wrote:

Dynamic Workflows: Destination Schema is Derived from a Schema Feature

Hi @fjcharp, the @takashi did give you a good solution your case using Dynamic Workflow.

Congratulations @takashi

fjcharp
Author
7 replies
7 years ago
August 31, 2017

Takashi,

Wow... Thank you so much for taking the time. The modified template is quite the tutorial. I have learned a lot from it.

Frederic

+17

takashi
Contributor
7538 replies
Best Answer
7 years ago
September 1, 2017

In this case, replacing more transformers with Python codes could increase the performance drastically. For example:

# PythonCaller Script Example
import fmeobjects

# Get Excel Column Name (A,B,C,...) corresponding to a column index.
# index: 0-based column index
def xlsxColumnName(index):
    alpha = ''
    while 0 <= index:
        alpha = chr(index % 26 + 65) + alpha
        index = index // 26 - 1
    return alpha

class FeatureProcessor(object):
    def __init__(self):
        self.column_index = 0
        self.is_target = lambda attr: \
            not '(1900/01/00)' in attr \
            and not attr.startswith('fme_') \
            and not attr.startswith('csv_')
            
    def input(self, feature):
        # Collect target attribute names.
        attribs = [a for a in feature.getAllAttributeNames() if self.is_target(a)]
        
        # First column (column index = 0) values are attribute names.
        # Assume the first incoming feature has all target attributes.
        if self.column_index == 0:
            # Create and output a schema feature at first.
            numColumns = int(FME_MacroValues['NUM'])
            schema = fmeobjects.FMEFeature()
            schema.setAttribute('fme_feature_type_name', \
                FME_MacroValues['DestExcelSheetName'])
            schema.setAttribute('fme_schema_handling', 'schema_only')
            for i in range(numColumns):
                schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
                schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
            self.pyoutput(schema)
            
            item = fmeobjects.FMEFeature()
            for attr in attribs:
                item.setAttribute('_attr_name', attr)
                item.setAttribute('A', attr)
                self.pyoutput(item)
            self.column_index = 1
        
        # Explode the incoming feature with the target attributes.
        item = fmeobjects.FMEFeature()
        columnName = xlsxColumnName(self.column_index)
        for attr in attribs:
            item.setAttribute('_attr_name', attr)
            item.setAttribute(columnName, feature.getAttribute(attr))
            self.pyoutput(item)
        self.column_index += 1

There still is a room to improve the performance.

To be continued (maybe).

fjcharp
Author
7 replies
7 years ago
September 1, 2017

danilo_fme wrote:

Hi @fjcharp,

Could you share us your Workspace template ( .FMWT )?

Thanks,

Danilo

???????????

Frederic

fjcharp
Author
7 replies
7 years ago
September 1, 2017

takashi wrote:

In this case, replacing more transformers with Python codes could increase the performance drastically. For example:

# PythonCaller Script Example
import fmeobjects

# Get Excel Column Name (A,B,C,...) corresponding to a column index.
# index: 0-based column index
def xlsxColumnName(index):
    alpha = ''
    while 0 <= index:
        alpha = chr(index % 26 + 65) + alpha
        index = index // 26 - 1
    return alpha

class FeatureProcessor(object):
    def __init__(self):
        self.column_index = 0
        self.is_target = lambda attr: \
            not '(1900/01/00)' in attr \
            and not attr.startswith('fme_') \
            and not attr.startswith('csv_')
            
    def input(self, feature):
        # Collect target attribute names.
        attribs = [a for a in feature.getAllAttributeNames() if self.is_target(a)]
        
        # First column (column index = 0) values are attribute names.
        # Assume the first incoming feature has all target attributes.
        if self.column_index == 0:
            # Create and output a schema feature at first.
            numColumns = int(FME_MacroValues['NUM'])
            schema = fmeobjects.FMEFeature()
            schema.setAttribute('fme_feature_type_name', \
                FME_MacroValues['DestExcelSheetName'])
            schema.setAttribute('fme_schema_handling', 'schema_only')
            for i in range(numColumns):
                schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
                schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
            self.pyoutput(schema)
            
            item = fmeobjects.FMEFeature()
            for attr in attribs:
                item.setAttribute('_attr_name', attr)
                item.setAttribute('A', attr)
                self.pyoutput(item)
            self.column_index = 1
        
        # Explode the incoming feature with the target attributes.
        item = fmeobjects.FMEFeature()
        columnName = xlsxColumnName(self.column_index)
        for attr in attribs:
            item.setAttribute('_attr_name', attr)
            item.setAttribute(columnName, feature.getAttribute(attr))
            self.pyoutput(item)
        self.column_index += 1

There still is a room to improve the performance.

To be continued (maybe).

Takashi,

Hmmm... This is a good seed for a more general-purpose pivot function. I had done something similar, but not being sufficiently experienced with schemas, I took the easy way out and wrote the output directly to a CSV file. I had also let the Aggregator create a list of columns items before invoking my Python Caller. The code is very rough... I was under the gun to get something out, so I slapped something together as fast as I could. I am going to spend some time now to clean this up and integrate your approach.

???????????,

Frederic

+17

takashi
Contributor
7538 replies
7 years ago
September 2, 2017

???????? ^^

Further, the CSV reader and all transformers in the previous example can be replaced with a single PythonCreator, and it has increased the performance much.

# PythonCreator Script Example
import fmeobjects, csv
class FeatureCreator(object):
    def close(self):
        # Return True if the CSV field is NOT a target to be processed.
        notTarget = lambda field: \
            '(1900/01/00)' in field
            
        # Return Excel Column Name (A,B,C,...) corresponding to a column index.
        # index: 0-based column index
        def xlsxColumnName(index):
            name = ''
            while 0 <= index:
                name = chr(index % 26 + 65) + name
                index = index // 26 - 1
            return name

        # Create and output a schema feature.
        schema = fmeobjects.FMEFeature()
        attrToValue = {
            'fme_feature_type_name' : FME_MacroValues['DestExcelSheetName'],
            'fme_schema_handling' : 'schema_only',
        }
        for attr, value in attrToValue.items():
            schema.setAttribute(attr, value)
        for i in range(int(FME_MacroValues['NUM'])):
            schema.setAttribute('attribute{%d}.name' % i, xlsxColumnName(i))
            schema.setAttribute('attribute{%d}.fme_data_type' % i, 'fme_real64')
        self.pyoutput(schema)
        
        # Get cell values from source CSV table and save them into a dictionary.
        # key   = 0-based index of source field (destination row)
        # value = dictionary {destination column name : cell value}
        data = {}
        with open(FME_MacroValues['SourceDataset_CSV2'], 'r') as csvfile:
            nontargets = None
            for i, row in enumerate(csv.reader(csvfile)):
                # Collect non-target keys from the first row (i.e. field names).
                if i == 0:
                    nontargets = {k for k, v in enumerate(row) if notTarget(v)}
                c = xlsxColumnName(i)
                for k, v in enumerate(row):
                    if k not in nontargets:
                        data.setdefault(k, {})[c] = v
                    
        # Create and output data features for each row in destination Excel table.
        for k in sorted(data.keys()):
            feature = fmeobjects.FMEFeature()
            for c, v in data[k].items():
                feature.setAttribute(c, v)
            self.pyoutput(feature)

Dynamic Schema Configuration in the Excel Writer Feature Type

If you don't need to write the field names (A,B,C...) into the first row,

set 'No' to the Output Field Names parameter,
and
set 'Yes' to the Use Attribute Names As Column Positions parameter.

Reply

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos

AttributeCreator and long list of attributes to expose

10 replies

Reply

Most helpful members this week

Recently Solved Questions

Coordinate system is not setting for Revit to GIS conversion

Cloned list between two sets of values based on a rank

JMS message - valid json but containing an unescaped right brace, job fails to submit

Travelling Salesman

Creating New Features from Attribute Values

Community Stats

Cookie policy

Cookie settings

Reply

Related topics

Splitting a long string to fit into a number of shorter fieldsicon

A List query in FME 2022 doesn't work in FME 2024icon

Copy AutoCAD Attribute Text from Block to GISicon

Where are the attributes created by attributecreator?icon

How to expose/use attachments from jsonicon

Most helpful members this week

Recently Solved Questions

Coordinate system is not setting for Revit to GIS conversion

Cloned list between two sets of values based on a rank

JMS message - valid json but containing an unescaped right brace, job fails to submit

Travelling Salesman

Creating New Features from Attribute Values

Community Stats

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings