Solved

Need help to read Excel header info and hourly data and output to SQL Server tables


Badge +4

I have over 100 Excel files with standard format: each Excel file has multiple sheets corresponding to multiple days in one month on recorded hourly well flow data, and the format for each sheet is fixed. The top part of sheet has well header info while the lower part are recorded hourly well flow data. I need to combine the well header info and hourly flow data into standard table format and load into SQL Server database table. It is easy to read the hourly flow data into. My challenge is to also read well header info from the same sheet , combine with hourly flow data, and write all these data into a table. I am looking for examples and help on similar Excel data loading. Thanks in advance.

icon

Best answer by takashi 20 December 2019, 04:12

View original

10 replies

Userlevel 2
Badge +17

https://s3.us-west-2.amazonaws.com/community-migration/QAComments/workflow-example381193.zipHi @jhu_20, if you created external tables which define schema for header (map cell - row and column to item) and body (map column to item) like the screenshot below, you could create a workspace to map the contents of header and table body read from the Excel sheets to destination tables according to the schema definition.

It's hard to explain how to implement the workspace with my poor English, sorry. See the attached workspace example and related files to learn more. workflow-example.zip (FME 2019.2.1)

Hope this helps.

Source Dataset (Simplified)

data

 

Schema Definition Example

schema-mapping-tables

Badge +4

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.

Badge +4

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.

Takashi, Thanks for your wonderful suggestion. I finally got it to work based on your suggestion and your sample workbench. However, I still have two issues that need help: (1) The default Excel reader will start at Field Names row 1 and Cell Range 2:. This does not work for me. I have to manually change to 0 (Field Names row) and B1:Y35 (cell range) for all sheets in order to get it work for me; Otherwise, I will get extra blank data columns due to formatting within original excel files (2) The number of sheets in each Excel file varies. Is any way that I can detect and automate this part. I noted that the Excel reader will detect and automatically list all sheets as inputs. I have to manually connect these sheets. Since I will need to process over 100 Excel files. It will be great to automate this part:

problem 1:

Problem 2:

Userlevel 2
Badge +17

Try the steps below.

1. When you add the Excel reader, select just a single sheet and configure Field Names Row and Cell Range appropriately,

change the Additional File Settings parameter to "Use Current Settings" and make sure that the Scan Schema in Dynamic Workflow is unchecked.

 

2. Add the reader with the Single Merged Feature Type option.

 

3. Clear Value in the Feature Types to Read parameter, in order to read all sheets from the specified Excel file.

 

4. Unlink the parameter from User Parameter optionally.

Done.

 

Hope this helps.

Userlevel 2
Badge +17

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.

Try the steps below.

1. When you add the Excel reader, select just a single sheet and configure Field Names Row and Cell Range appropriately,

change the Additional File Settings parameter to "Use Current Settings" and make sure that the Scan Schema in Dynamic Workflow is unchecked.

 

2. Add the reader with the Single Merged Feature Type option.

 

3. Clear Value in the Feature Types to Read parameter, in order to read all sheets from the specified Excel file.

 

4. Unlink the parameter from User Parameter optionally.

Done.

 

Hope this helps.

Badge +4

Takashi,

Thanks for your quick response. I will try your suggestions and let you know.

Badge +4

Takashi, Thanks for your great help. With your help, I finally got it to work and can load data successfully from all sheets in the Excel file. My next step is to modify this workbench and start process similar 100 plus other Excel files exactly in the same format, i.e. each file has the fixed format for head data and content and also has multiple sheets. I will have all these files with different filenames but in one folder. So my next question is how to modify the current workbench:

"H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlow\\Silverchair 4H Report.xlsm"

and to process multiple Excel files in the same format:

H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlowData (sample folder with multiple Excel files)

What is the quickest way to accomplish this step ? Please help. Thank you.

Userlevel 2
Badge +17

Takashi, Thanks for your great help. With your help, I finally got it to work and can load data successfully from all sheets in the Excel file. My next step is to modify this workbench and start process similar 100 plus other Excel files exactly in the same format, i.e. each file has the fixed format for head data and content and also has multiple sheets. I will have all these files with different filenames but in one folder. So my next question is how to modify the current workbench:

"H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlow\\Silverchair 4H Report.xlsm"

and to process multiple Excel files in the same format:

H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlowData (sample folder with multiple Excel files)

What is the quickest way to accomplish this step ? Please help. Thank you.

No need to modify the workspace. You can select all the Excel files within one or more folders as the dataset for the Excel reader.

Badge +4

Takashi, You are great and thanks again. I will check out your suggestion.

https://s3.us-west-2.amazonaws.com/community-migration/QAComments/workflow-example381193.zipHi @jhu_20, if you created external tables which define schema for header (map cell - row and column to item) and body (map column to item) like the screenshot below, you could create a workspace to map the contents of header and table body read from the Excel sheets to destination tables according to the schema definition.

It's hard to explain how to implement the workspace with my poor English, sorry. See the attached workspace example and related files to learn more. workflow-example.zip (FME 2019.2.1)

Hope this helps.

Source Dataset (Simplified)

data

 

Schema Definition Example

schema-mapping-tables

Hi Takashi

Can you provide the example workflow-example.zip (FME 2019.2.1), i can't access it?

Neil

Reply