Skip to main content

Hi

 

i am trying to read lots of excel using the feature reader transformer,  passing in the locations form a directory path reader. all the excel have different tab names but the data always start on row 10. As the dataset is a file path the option for telling it to start on row 10 are blanked out. anyone know how i get around this?

Many thanks,

Andy

 

 

I have different ways to work around this. One is to use the Generic Reader, then filter the rows you don’t need and manually rename the columns from column0, column1 etc to the real column names.

If column names / column positions are not fixed I explode all data to features, create headers from the header cells and rebuild the tables using aggregators. But this is a last resort option as it creates a lot of features (one for every cell) and is processing time intensive.

Maybe someone else has a better solution, Excel is not my favorite format as it gives users so much freedom to build creative things which are hard to read as real data.


Thanks @nielsgerrits.  i tried that but i does not bring in every column  as column 1, column 2, etc. it names the columns based on what ever someone has decided to include in the first row, so it makes it appear the data is in more columns when they are exposed. once you get to the data in row 10 it is in consistent columns across al excel files. i can do it using the excel reader, but it means il have to add extra files every time someone adds ones.


Thanks @nielsgerrits.  i tried that but i does not bring in every column  as column 1, column 2, etc. it names the columns based on what ever someone has decided to include in the first row, so it makes it appear the data is in more columns when they are exposed. once you get to the data in row 10 it is in consistent columns across al excel files. i can do it using the excel reader, but it means il have to add extra files every time someone adds ones.

For me this does put it in col_1, col_2, col_3:

Attached workspace with featurecaching.

 


What I’ve done is creating a non-published User Parameter like this:

$(_XLS_INPUT_PATH)\*.xlsx

The *.xslx works just fine, no matter how many excels there are in the path. The $(_XLS_INPUT_PATH) is a Published User Parameter that allows the end-user to select a directory where the excels that he or she needs to process are stored. Then the Source of an ordinary XLSX-Reader is set to the Unpublished User Parameter.

Maybe that will work for you? 


Many thanks,

 

Im on FME 2022.1 and the feature reader has been upgraded for your version. Maybe its just a case of waiting till we upgrade and then using your solution. Thanks


Many thanks,

 

Im on FME 2022.1 and the feature reader has been upgraded for your version. Maybe its just a case of waiting till we upgrade and then using your solution. Thanks

Also works with 2021. Attached sample with featurecaching.

 


What I’ve done is creating a non-published User Parameter like this:

$(_XLS_INPUT_PATH)\*.xlsx

The *.xslx works just fine, no matter how many excels there are in the path. The $(_XLS_INPUT_PATH) is a Published User Parameter that allows the end-user to select a directory where the excels that he or she needs to process are stored. Then the Source of an ordinary XLSX-Reader is set to the Unpublished User Parameter.

Maybe that will work for you? 

This is a working solution as well. I personally prefer to use a Directory reader to scan a folder and be able to use a sample set or test / filter out specific files.


@s.jager  thanks!! Yep i think that will work for me. I did not think about wildcards in the file path. Doing that in the excel reader should do it!


Many thanks,

 

Im on FME 2022.1 and the feature reader has been upgraded for your version. Maybe its just a case of waiting till we upgrade and then using your solution. Thanks

Also works with 2021. Attached sample with featurecaching.

 

i just tried this and it works as you said. maybe when i was switching between generic and excel in the feature reader it was just behaving out of sorts. Sorry was equally due Top Answer.


Reply