
Goal: 

I am trying to standardize the raw CSV files. The output file should have the following columns:

Date	Time	Level	Temperature	Elevation

Background Info: 

Each file might have multiple tabs. I only need to process the sheets whose names start with 20 (2011, 2012, 2019, 2020, etc.). In some sheets there is header information in column A above the table, so the table header starts at a different row number from sheet to sheet.

Issue: 

I am using a FeatureReader to detect the schema dynamically. If a sheet starts with header information, the transformer treats that information as the start of the table and reads the data as a new table. This creates new column names, which turns into a larger issue downstream.

Normally there are over 200 files with multiple tabs, but I have prepared some sample data and included my workbench. Looking for help.

For this reason I prefer to use the Generic Reader to dynamically read Excel files. Then you get col_1, col_2, etc., which you can rename yourself. Attached is a sample with your data.
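For anyone who wants to see the same idea outside of a workspace, here is a rough Python sketch of the logic (it could be adapted to a PythonCaller, but this is not what the attached sample does). It assumes each sheet has already been read as plain rows without a schema, that the real header row contains the five target column names exactly as written in the question, and that the function names `is_year_sheet`, `find_header_row` and `standardize` are just made up for illustration.

```python
import re

# Target schema taken from the question.
TARGET_COLUMNS = ["Date", "Time", "Level", "Temperature", "Elevation"]


def is_year_sheet(name):
    """Return True for sheet names that start with '20', e.g. '2011' or '2020'."""
    return re.match(r"20", name.strip()) is not None


def find_header_row(rows, expected=TARGET_COLUMNS):
    """Return the index of the first row that contains every expected
    column name (case-insensitive), or None if no such row exists.
    Assumption: the raw header uses the same names as the target schema."""
    wanted = {c.lower() for c in expected}
    for i, row in enumerate(rows):
        cells = {str(v).strip().lower() for v in row if v is not None}
        if wanted <= cells:
            return i
    return None


def standardize(rows, expected=TARGET_COLUMNS):
    """Skip any info lines above the table and return the data rows as
    dicts keyed by the expected column names."""
    idx = find_header_row(rows, expected)
    if idx is None:
        return []
    header = [str(v).strip().lower() if v is not None else "" for v in rows[idx]]
    positions = [header.index(c.lower()) for c in expected]
    out = []
    for row in rows[idx + 1:]:
        # Ignore completely empty rows below the header.
        if all(v is None or str(v).strip() == "" for v in row):
            continue
        out.append(
            {c: (row[p] if p < len(row) else None) for c, p in zip(expected, positions)}
        )
    return out
```

The point is the same as with the generic col_1, col_2 approach: read everything as untyped rows first, find where the table really starts, and only then assign the column names.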



Hi @nielsgerrits, I tried so many things but never thought about the Generic Reader. It definitely gave me a better direction, thanks a lot!



Happy I could help. :)

