
I may have encountered an issue when reading Excel sheets whose attribute names start with the same prefix as existing FME format attribute names.

I first ran into this while using the SchemaMapper transformer, but when I investigated the Excel reader, I noticed similar behaviour.

The attribute names I wanted to use were:

  • fme_geometry_name
  • fme_type_name

If a sheet with these attribute names is loaded into the SchemaMapper transformer or the Excel reader, I notice the following behaviour:

  • SchemaMapper
    Dataset can be loaded and Parameter window works as expected. When clicking Next, a popup with 'Reading Dataset' appears and disappears very quickly. I am not able to finish the SchemaMapper wizard.
  • Excel Reader
    Dataset can be loaded, but after completing the Add Reader wizard, the FeatureType object on the canvas does not contain the added attributes.

When repeating the process with the following attribute names, everything behaves as expected:

  • geometry_name_fme
  • type_name_fme

My impression is that the issue is caused by the similarity between these names and the FME format attributes. To my knowledge, the documentation does not mention any restriction on this.

I added a test.xlsx with four sheets, containing data mentioned above. This file can be used to reproduce the issue.

I would like to know whether this is a real issue or whether I am missing something.

Experienced in FME Desktop versions:

  • 2016.1.0.1 (20160516 - Build 16494 - WIN64)
  • 2017.1.0.0 (20170731 - Build 17539 - WIN64)

Hi @g_karssenberg,

 

Very interesting! I have reproduced both issues you mention above, and I think the best course of action is to send this to our development team for further clarification. Your question about whether FME should support attribute names that are similar to format parameter names is something to dive deeper on. Thanks for posting this, and I will come back with further insights!

 


After talking with our team here, I learned that this issue is related to how we detect the difference between user attributes and format or fme attributes.

 

 


In some cases, we know exactly which attributes are of which type (user, format, fme). This is common when we have well-known or stored schema formats. However, for schema-less formats (for example Excel, CSV, etc.), where we need to scan the data features to determine the schema, we may make assumptions about which attributes fall into which categories.

Specifically, we would often choose to strip off any fme_* attributes or format attributes such as csv_*.

 

 

This tells me that, in general, the best practice is to avoid naming attributes with fme_* or format-specific prefixes (such as csv_*), especially for schema scanning formats.
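
As a rough illustration of this advice, the header names of a sheet could be checked before it is handed to FME. The sketch below is plain Python and is not FME's own detection logic. The prefix list (only fme_ and csv_, the two mentioned above), the case-insensitive match, and both helper names are assumptions made for the example.

```python
# Prefixes that may be treated as fme/format attributes during schema scanning.
# Assumption: only the two prefixes named in this thread; extend per format.
RESERVED_PREFIXES = ("fme_", "csv_")


def find_reserved_names(names, prefixes=RESERVED_PREFIXES):
    """Return the names that start with a reserved prefix (case-insensitive)."""
    lowered = tuple(p.lower() for p in prefixes)
    return [n for n in names if n.lower().startswith(lowered)]


def move_prefix_to_suffix(name, prefixes=RESERVED_PREFIXES):
    """Turn 'fme_type_name' into 'type_name_fme'; leave other names unchanged."""
    for prefix in prefixes:
        if name.lower().startswith(prefix.lower()):
            stem = name[len(prefix):]
            return f"{stem}_{prefix.rstrip('_')}"
    return name


headers = ["fme_geometry_name", "fme_type_name", "id"]
print(find_reserved_names(headers))
print([move_prefix_to_suffix(h) for h in headers])
```

The rename helper mirrors the workaround from the original question, where geometry_name_fme and type_name_fme were read without problems.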

 

 



 

 

Thanks Brian. I will avoid these names in the future for the mentioned formats.

 

Probably this is something that should be added to the related format documentation. In that way, everybody can know that these names should be avoided and why.

 

 

