Hi everyone,I have a series of regularly produced Excel files, each containing a number of sheets. The number and name of the sheets in each file isn't fixed.I can read data from all the sheets no problem by keeping the reader parameters blank - but annoyingly for these files the header starts on row 2 and in some cases junk is written into row 1. As such, FME picks up attribute names from row 1, so for one sheet the attribute for column B might end up named"995" and for the next sheet it might be "376", so I end up with a lot of attributes where there should be one - ideally just called "B".Is there any way to force the Excel reader to always use column letter as the attribute name regardless of how many sheets are in a file and what they are called?

Column letters as attribute names for Excel reader with unknown number of sheets

+20

debbiatsafe
Safer
648 replies
5 years ago
March 19, 2020

Hi @gtrathen

As mentioned in the Excel reader documentation, setting the Field Names Row value to 0 (or leaving it blank) will set column names to column letters.

If the column Field Names Row value is set to zero (0), then the column names will be set to the Excel column letters A, B, C, etc. This can be useful if there are no field names, or if they are not consistent.

I hope this information helps.

G

gtrathen
Author
4 replies
5 years ago
March 19, 2020

debbiatsafe wrote:

Hi @gtrathen

As mentioned in the Excel reader documentation, setting the Field Names Row value to 0 (or leaving it blank) will set column names to column letters.

If the column Field Names Row value is set to zero (0), then the column names will be set to the Excel column letters A, B, C, etc. This can be useful if there are no field names, or if they are not consistent.

I hope this information helps.

Hi @debbiatsafe thanks for the reply! The field names row is defined per named sheet in the file as far as I can tell - so if I set the value to 0 and run it for a series of sheets it works for the first file (and any successive file which has a sheet named the same thing) but for any other sheets that weren't in the original definition it doesn't apply. So in the image below - where the sheets are CP4, CP5, CP6 in the file used for definition of parameters, the next might be CP1, CP2, CP3 - and for those sheets which weren't covered by the parameters, it reverts to using row 2 for the field names.

Is there a way to tell the reader "whatever sheets you encounter in the file - the Field Names Row will be 0"?

+20

debbiatsafe
Safer
648 replies
5 years ago
March 19, 2020

gtrathen wrote:

Hi @debbiatsafe thanks for the reply! The field names row is defined per named sheet in the file as far as I can tell - so if I set the value to 0 and run it for a series of sheets it works for the first file (and any successive file which has a sheet named the same thing) but for any other sheets that weren't in the original definition it doesn't apply. So in the image below - where the sheets are CP4, CP5, CP6 in the file used for definition of parameters, the next might be CP1, CP2, CP3 - and for those sheets which weren't covered by the parameters, it reverts to using row 2 for the field names.

Is there a way to tell the reader "whatever sheets you encounter in the file - the Field Names Row will be 0"?

Hi @gtrathen

Try setting the Additional File Settings in the Multiple File Parameters section of the Excel reader parameter to 'Use Current Settings'. By default, this parameter is set to 'Use FME Default Settings'.

G

gtrathen
Author
4 replies
5 years ago
March 19, 2020

debbiatsafe wrote:

Hi @gtrathen

Try setting the Additional File Settings in the Multiple File Parameters section of the Excel reader parameter to 'Use Current Settings'. By default, this parameter is set to 'Use FME Default Settings'.

Hi @debbiatsafe, I get the same outcome with that toggled on - if you'll excuse my crude MS paint editing:

+20

debbiatsafe
Safer
648 replies
5 years ago
March 19, 2020

gtrathen wrote:

Hi @debbiatsafe, I get the same outcome with that toggled on - if you'll excuse my crude MS paint editing:

I wasn't able to replicate the behaviour you are seeing using 2020.0 (build 20200). What version of FME are you using? In addition, are you able to share your workspace and some sample data?

G

gtrathen
Author
4 replies
5 years ago
March 19, 2020

debbiatsafe wrote:

I wasn't able to replicate the behaviour you are seeing using 2020.0 (build 20200). What version of FME are you using? In addition, are you able to share your workspace and some sample data?

Thanks again for taking a look at this.

I'm using 2019.1.3.1 (build 19643). I've attached a stripped down workspace with a creator and two feature readers (dir/file path and then Excel) in the configuration I was using. I've also attached two sample files which are representative of the input files I am dealing with.

For me - the attributes for Sample_1 come out as B,C,D,E,F,G,H but for Sample_2 they come out as B,C,3,4,5,G,9 - taking the numeric values from the first row.

excel.fmwSample_1.xlsxSample_2.xlsx

+20

debbiatsafe
Safer
648 replies
5 years ago
March 20, 2020

gtrathen wrote:

Thanks again for taking a look at this.

I'm using 2019.1.3.1 (build 19643). I've attached a stripped down workspace with a creator and two feature readers (dir/file path and then Excel) in the configuration I was using. I've also attached two sample files which are representative of the input files I am dealing with.

For me - the attributes for Sample_1 come out as B,C,D,E,F,G,H but for Sample_2 they come out as B,C,3,4,5,G,9 - taking the numeric values from the first row.

excel.fmwSample_1.xlsxSample_2.xlsx

Hi @gtrathen

Thank you for providing the data and workspace. I believe you are seeing different behaviour due to using a FeatureReader--I was using a regular reader.

I haven't been able to get the behaviour of the regular Excel reader within a FeatureReader in regards to the Additional File Settings parameter. I will file a bug report for this issue.

As a workaround, would you consider using the Excel reader instead? I see you're using a PATH FeatureReader to filter for Excel files. This can also be done within the Excel reader itself using the Select Multiple Folders/Files option.

G

gtrathen
Author
4 replies
5 years ago
March 21, 2020

debbiatsafe wrote:

Hi @gtrathen

Thank you for providing the data and workspace. I believe you are seeing different behaviour due to using a FeatureReader--I was using a regular reader.

I haven't been able to get the behaviour of the regular Excel reader within a FeatureReader in regards to the Additional File Settings parameter. I will file a bug report for this issue.

As a workaround, would you consider using the Excel reader instead? I see you're using a PATH FeatureReader to filter for Excel files. This can also be done within the Excel reader itself using the Select Multiple Folders/Files option.

Hi @debbiatsafe - apologies I should have mentioned I was using FeatureReaders in the original question.

Using Excel readers should do the trick if they don't suffer the same issue, but I also found a workaround that is suitable in this specific instance while I was playing around.

Because the numeric attribute names that sneak in are always sequential the attributes can be sorted and then renamed with a PythonCaller modified from another question here.

Resulting in:

The output is slightly misleading as "XL_A" may not actually be column A but it is at least consistent across every file and sheet that the FeatureReader encounters. If the column names could not be sorted then the above could not be guaranteed to work.

Again, thank you very much for all your help!

+26

becchr
Influencer
106 replies
5 years ago
March 21, 2020

gtrathen wrote:

Hi @debbiatsafe - apologies I should have mentioned I was using FeatureReaders in the original question.

Using Excel readers should do the trick if they don't suffer the same issue, but I also found a workaround that is suitable in this specific instance while I was playing around.

Because the numeric attribute names that sneak in are always sequential the attributes can be sorted and then renamed with a PythonCaller modified from another question here.

Resulting in:

The output is slightly misleading as "XL_A" may not actually be column A but it is at least consistent across every file and sheet that the FeatureReader encounters. If the column names could not be sorted then the above could not be guaranteed to work.

Again, thank you very much for all your help!

thank you @gtrathen for sharing this solution/workaround

mtaftferguson
Contributor
15 replies
3 years ago
January 14, 2022

gtrathen wrote:

Hi @debbiatsafe thanks for the reply! The field names row is defined per named sheet in the file as far as I can tell - so if I set the value to 0 and run it for a series of sheets it works for the first file (and any successive file which has a sheet named the same thing) but for any other sheets that weren't in the original definition it doesn't apply. So in the image below - where the sheets are CP4, CP5, CP6 in the file used for definition of parameters, the next might be CP1, CP2, CP3 - and for those sheets which weren't covered by the parameters, it reverts to using row 2 for the field names.

Is there a way to tell the reader "whatever sheets you encounter in the file - the Field Names Row will be 0"?

@debbiatsafe I'm in a similar position -- is the bug still outstanding? I'm using 2021.1 and am also finding it impossible to read inconsistently-formatted Excels with a FeatureReader without ending up with a lot of junk attribute names.

+20

debbiatsafe
Safer
648 replies
3 years ago
January 19, 2022

gtrathen wrote:

Hi @debbiatsafe thanks for the reply! The field names row is defined per named sheet in the file as far as I can tell - so if I set the value to 0 and run it for a series of sheets it works for the first file (and any successive file which has a sheet named the same thing) but for any other sheets that weren't in the original definition it doesn't apply. So in the image below - where the sheets are CP4, CP5, CP6 in the file used for definition of parameters, the next might be CP1, CP2, CP3 - and for those sheets which weren't covered by the parameters, it reverts to using row 2 for the field names.

Is there a way to tell the reader "whatever sheets you encounter in the file - the Field Names Row will be 0"?

Hi @mtaftferguson

Our development team has evaluated the issue and is planning on adding an option that allows you to set whether automatic or manual schemas are generated (similar to CSV). This should hopefully resolve the issue reported.

This work hasn't been implemented yet but is on our team's radar.

mtaftferguson
Contributor
15 replies
3 years ago
January 19, 2022

gtrathen wrote:

Hi @debbiatsafe thanks for the reply! The field names row is defined per named sheet in the file as far as I can tell - so if I set the value to 0 and run it for a series of sheets it works for the first file (and any successive file which has a sheet named the same thing) but for any other sheets that weren't in the original definition it doesn't apply. So in the image below - where the sheets are CP4, CP5, CP6 in the file used for definition of parameters, the next might be CP1, CP2, CP3 - and for those sheets which weren't covered by the parameters, it reverts to using row 2 for the field names.

Is there a way to tell the reader "whatever sheets you encounter in the file - the Field Names Row will be 0"?

Thank you for the update! I was able to adapt the above renaming script to rename the "junk" attributes in my data so that they were at least exposable, which was sufficient in this case but it will be nicer to have that option.

Column letters as attribute names for Excel reader with unknown number of sheets

12 replies

Reply

Helpful Members This Week

Recently Solved Questions

Using one AttributeRounder for different accuracies

Create date segments of two table with overlap of times

Automate Fanout of columns/splitting attributes to different output by attribute name

Tracing Multiple Networks from Sources to Valves Without Python

FME Flow version control how to use different branch

Community Stats

Latest FME

Cookie policy

Cookie settings

Reply

Related Topics

zos api: change surface for geometric image analysisicon

Change wavelength in Geometric Image Analysisicon

How to stop Field type setting automatically changing while using python API in interactive mode?icon

OpticStudio 23.1 Release Notes

Ansys Zemax OpticStudio 2023 R1 Release Notes

Helpful Members This Week

Recently Solved Questions

Using one AttributeRounder for different accuracies

Create date segments of two table with overlap of times

Automate Fanout of columns/splitting attributes to different output by attribute name

Tracing Multiple Networks from Sources to Valves Without Python

FME Flow version control how to use different branch

Popular Tags

Community Stats

Latest FME

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded

Cookie policy

Cookie settings