Skip to main content
Question

define reader attribute names via commandline


clemensb
Contributor
Forum|alt.badge.img+5

Hi,

is it possible to define the reader attribute names via commandline or via an external file?

I have excelsheets where the column name is different e.g. somtimes they are called FMEA, von_km etc. The excel sheet is uploaded by users on a website. On the website they specify the worksheetname and the column name how they are called in their excel file.

An other problem is that the they are not always in the same collumn position.

Thank you for your help.

6 replies

redgeographics
Celebrity
Forum|alt.badge.img+49

Tough one! If the order of the columns would always be the same you could ignore the line with the column names and just use col0, col1 and so on.

However, if the order changes too it'll become very difficult to work with the data. This may be a case where the AttributeDereferencer could come in handy if you can somehow parse the attribute name from that website. You'll also need to set it up as a generic reader with a single merged feature type and you will not have access to your attributes when designing the workspace.

Hope this help...


clemensb
Contributor
Forum|alt.badge.img+5
  • Author
  • Contributor
  • October 4, 2016

I think I found one solution. I wrote a python script which gets the information from the website, than it extracts the column position from the excelfile of the wanted field. In FME I used a generic reader and an AttributeExploder, with this I get all columns positions and values. With a tester and a published tester value I can filter the desired columns. The python script passes the extracted column to the subprocess.list2cmdline and starts via subprocess the fme workspace :)


mark2atsafe
Safer
Forum|alt.badge.img+43
  • Safer
  • October 5, 2016

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt


mark2atsafe
Safer
Forum|alt.badge.img+43
  • Safer
  • October 5, 2016
Nice question. I love puzzles like this one - especially when I can come up with an answer!

 

 


mark2atsafe
Safer
Forum|alt.badge.img+43
  • Safer
  • October 5, 2016
mark2atsafe wrote:

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Oh. If you are calling this from the command line you can just use the syntax --FEATURE_TYPES "SheetName" --ColName "ColName" substituting the correct values.

 

 

 


clemensb
Contributor
Forum|alt.badge.img+5
  • Author
  • Contributor
  • October 7, 2016
mark2atsafe wrote:

OK @clemensb, I have a solution, and it took a surprisingly short time to get to work!

Here's a screenshot:

The Reader is an Excel reader in dynamic mode (single feature type). It's VERY IMPORTANT to set the parameter Scan Schema in Dynamic Workflow to Yes.

There is a published parameter through which to define the column to use. The website would pass that to the workspace.

The other reader is a schema reader that reads the list of attributes and merges them onto the Excel rows.

The ListSearcher searches for the column name in the list of attributes. The AttributeDereferencer (nice one Hans) copies the value of the chosen column into an attribute called result.

In short, if I specify ColB as the column to use, I get values b1, b2, b3, b4 (which are the cell values for column B in my source data). But you can change the column names - and positions - and this should still work OK.

If there are multiple sheets in your Excel file, then you would need to filter out the correct one from the schema and in the Excel reader.

Workspace attached here: dynamicexcel.fmwt

Very nice! Thank you

 

 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings