Question

define reader attribute names via commandline

8 years ago
October 4, 2016
6 replies
6 views

clemensb
Contributor
30 replies

Hi,

is it possible to define the reader attribute names via commandline or via an external file?

I have excelsheets where the column name is different e.g. somtimes they are called FMEA, von_km etc. The excel sheet is uploaded by users on a website. On the website they specify the worksheetname and the column name how they are called in their excel file.

An other problem is that the they are not always in the same collumn position.

Thank you for your help.

+49

redgeographics
Celebrity
3635 replies
8 years ago
October 4, 2016

Tough one! If the order of the columns would always be the same you could ignore the line with the column names and just use col0, col1 and so on.

However, if the order changes too it'll become very difficult to work with the data. This may be a case where the AttributeDereferencer could come in handy if you can somehow parse the attribute name from that website. You'll also need to set it up as a generic reader with a single merged feature type and you will not have access to your attributes when designing the workspace.

Hope this help...

clemensb
Author
Contributor
30 replies
8 years ago
October 4, 2016

I think I found one solution. I wrote a python script which gets the information from the website, than it extracts the column position from the excelfile of the wanted field. In FME I used a generic reader and an AttributeExploder, with this I get all columns positions and values. With a tester and a published tester value I can filter the desired columns. The python script passes the extracted column to the subprocess.list2cmdline and starts via subprocess the fme workspace :)

+44

mark2atsafe
Safer
2519 replies
8 years ago
October 5, 2016

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

+44

mark2atsafe
Safer
2519 replies
8 years ago
October 5, 2016

Nice question. I love puzzles like this one - especially when I can come up with an answer!

+44

mark2atsafe
Safer
2519 replies
8 years ago
October 5, 2016

mark2atsafe wrote:

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Oh. If you are calling this from the command line you can just use the syntax --FEATURE_TYPES "SheetName" --ColName "ColName" substituting the correct values.

clemensb
Author
Contributor
30 replies
8 years ago
October 7, 2016

mark2atsafe wrote:

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Very nice! Thank you

Reply

Rich Text Editor, editor1

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

Cookie settings

We use 3 different kinds of cookies. You can choose which cookies you want to accept. We need basic cookies to make this site work, therefore these are the minimum you can select. Learn more about our cookies.

Basic
Functional

Normal
Functional + analytics

Complete
Functional + analytics + social media + embedded videos + marketing

define reader attribute names via commandline