Question

define reader attribute names via commandline

  • 4 October 2016
  • 6 replies
  • 2 views

Badge +4

Hi,

is it possible to define the reader attribute names via commandline or via an external file?

I have excelsheets where the column name is different e.g. somtimes they are called FMEA, von_km etc. The excel sheet is uploaded by users on a website. On the website they specify the worksheetname and the column name how they are called in their excel file.

An other problem is that the they are not always in the same collumn position.

Thank you for your help.


6 replies

Userlevel 5
Badge +25

Tough one! If the order of the columns would always be the same you could ignore the line with the column names and just use col0, col1 and so on.

However, if the order changes too it'll become very difficult to work with the data. This may be a case where the AttributeDereferencer could come in handy if you can somehow parse the attribute name from that website. You'll also need to set it up as a generic reader with a single merged feature type and you will not have access to your attributes when designing the workspace.

Hope this help...

Badge +4

I think I found one solution. I wrote a python script which gets the information from the website, than it extracts the column position from the excelfile of the wanted field. In FME I used a generic reader and an AttributeExploder, with this I get all columns positions and values. With a tester and a published tester value I can filter the desired columns. The python script passes the extracted column to the subprocess.list2cmdline and starts via subprocess the fme workspace :)

Userlevel 4
Badge +25

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Userlevel 4
Badge +25
Nice question. I love puzzles like this one - especially when I can come up with an answer!

 

 

Userlevel 4
Badge +25

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Oh. If you are calling this from the command line you can just use the syntax --FEATURE_TYPES "SheetName" --ColName "ColName" substituting the correct values.

 

 

 

Badge +4

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Very nice! Thank you

 

 

Reply