Skip to main content
Solved

Need help to read Excel header info and hourly data and output to SQL Server tables


jhu_20
Contributor
Forum|alt.badge.img+4

I have over 100 Excel files with standard format: each Excel file has multiple sheets corresponding to multiple days in one month on recorded hourly well flow data, and the format for each sheet is fixed. The top part of sheet has well header info while the lower part are recorded hourly well flow data. I need to combine the well header info and hourly flow data into standard table format and load into SQL Server database table. It is easy to read the hourly flow data into. My challenge is to also read well header info from the same sheet , combine with hourly flow data, and write all these data into a table. I am looking for examples and help on similar Excel data loading. Thanks in advance.

Best answer by takashi

Try the steps below.

1. When you add the Excel reader, select just a single sheet and configure Field Names Row and Cell Range appropriately,

change the Additional File Settings parameter to "Use Current Settings" and make sure that the Scan Schema in Dynamic Workflow is unchecked.

 

2. Add the reader with the Single Merged Feature Type option.

 

3. Clear Value in the Feature Types to Read parameter, in order to read all sheets from the specified Excel file.

 

4. Unlink the parameter from User Parameter optionally.

Done.

 

Hope this helps.

View original
Did this help you find an answer to your question?

10 replies

takashi
Influencer
  • December 18, 2019

https://s3.us-west-2.amazonaws.com/community-migration/QAComments/workflow-example381193.zipHi @jhu_20, if you created external tables which define schema for header (map cell - row and column to item) and body (map column to item) like the screenshot below, you could create a workspace to map the contents of header and table body read from the Excel sheets to destination tables according to the schema definition.

It's hard to explain how to implement the workspace with my poor English, sorry. See the attached workspace example and related files to learn more. workflow-example.zip (FME 2019.2.1)

Hope this helps.

Source Dataset (Simplified)

data

 

Schema Definition Example

schema-mapping-tables


jhu_20
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • December 18, 2019

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.


jhu_20
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • December 20, 2019
jhu_20 wrote:

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.

Takashi, Thanks for your wonderful suggestion. I finally got it to work based on your suggestion and your sample workbench. However, I still have two issues that need help: (1) The default Excel reader will start at Field Names row 1 and Cell Range 2:. This does not work for me. I have to manually change to 0 (Field Names row) and B1:Y35 (cell range) for all sheets in order to get it work for me; Otherwise, I will get extra blank data columns due to formatting within original excel files (2) The number of sheets in each Excel file varies. Is any way that I can detect and automate this part. I noted that the Excel reader will detect and automatically list all sheets as inputs. I have to manually connect these sheets. Since I will need to process over 100 Excel files. It will be great to automate this part:

problem 1:

Problem 2:


takashi
Influencer
  • Best Answer
  • December 20, 2019

Try the steps below.

1. When you add the Excel reader, select just a single sheet and configure Field Names Row and Cell Range appropriately,

change the Additional File Settings parameter to "Use Current Settings" and make sure that the Scan Schema in Dynamic Workflow is unchecked.

 

2. Add the reader with the Single Merged Feature Type option.

 

3. Clear Value in the Feature Types to Read parameter, in order to read all sheets from the specified Excel file.

 

4. Unlink the parameter from User Parameter optionally.

Done.

 

Hope this helps.


takashi
Influencer
  • December 20, 2019
jhu_20 wrote:

Takashi,

Thanks for sharing your suggestions and sample workbench. I will review in details and let you know.

Try the steps below.

1. When you add the Excel reader, select just a single sheet and configure Field Names Row and Cell Range appropriately,

change the Additional File Settings parameter to "Use Current Settings" and make sure that the Scan Schema in Dynamic Workflow is unchecked.

 

2. Add the reader with the Single Merged Feature Type option.

 

3. Clear Value in the Feature Types to Read parameter, in order to read all sheets from the specified Excel file.

 

4. Unlink the parameter from User Parameter optionally.

Done.

 

Hope this helps.


jhu_20
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • December 20, 2019

Takashi,

Thanks for your quick response. I will try your suggestions and let you know.


jhu_20
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • December 23, 2019

Takashi, Thanks for your great help. With your help, I finally got it to work and can load data successfully from all sheets in the Excel file. My next step is to modify this workbench and start process similar 100 plus other Excel files exactly in the same format, i.e. each file has the fixed format for head data and content and also has multiple sheets. I will have all these files with different filenames but in one folder. So my next question is how to modify the current workbench:

"H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlow\\Silverchair 4H Report.xlsm"

and to process multiple Excel files in the same format:

H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlowData (sample folder with multiple Excel files)

What is the quickest way to accomplish this step ? Please help. Thank you.


takashi
Influencer
  • December 23, 2019
jhu_20 wrote:

Takashi, Thanks for your great help. With your help, I finally got it to work and can load data successfully from all sheets in the Excel file. My next step is to modify this workbench and start process similar 100 plus other Excel files exactly in the same format, i.e. each file has the fixed format for head data and content and also has multiple sheets. I will have all these files with different filenames but in one folder. So my next question is how to modify the current workbench:

"H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlow\\Silverchair 4H Report.xlsm"

and to process multiple Excel files in the same format:

H:\\Data\\ResevoirCharacterization\\Andrew\\WellFlowData (sample folder with multiple Excel files)

What is the quickest way to accomplish this step ? Please help. Thank you.

No need to modify the workspace. You can select all the Excel files within one or more folders as the dataset for the Excel reader.


jhu_20
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • December 24, 2019

Takashi, You are great and thanks again. I will check out your suggestion.


  • September 24, 2020
takashi wrote:

https://s3.us-west-2.amazonaws.com/community-migration/QAComments/workflow-example381193.zipHi @jhu_20, if you created external tables which define schema for header (map cell - row and column to item) and body (map column to item) like the screenshot below, you could create a workspace to map the contents of header and table body read from the Excel sheets to destination tables according to the schema definition.

It's hard to explain how to implement the workspace with my poor English, sorry. See the attached workspace example and related files to learn more. workflow-example.zip (FME 2019.2.1)

Hope this helps.

Source Dataset (Simplified)

data

 

Schema Definition Example

schema-mapping-tables

Hi Takashi

Can you provide the example workflow-example.zip (FME 2019.2.1), i can't access it?

Neil


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings