Question

?rename attributes


Badge

Hello, I would like to delete the first record/row of my table and use its values as the attribute names instead, e.g. I would like to call the second attribute "Unit" instead of "A". I can't specify that the first row is the header when I'm reading the file because I need to do some modifications in FME before the renaming. Which transformer/s should I use?

 

 

Secondly I would like to delete columns F-H by multiplying row 3, 4, 6 and 8. The aim is to have only one column that is called "Replacing".

 

 

Thank you for your help!


8 replies

Userlevel 2
Badge +12

I would just skip line one during the read.

Then use the AttributeManager to rename A to Unit, B to CellName etc and create a new attribute Replacing with value E + F + G + H.

Badge

@erik_jan Unfortunately I can't skip the first row during the read. It's part of the transformations of the excel file before the renaming (and I need to read this row in the beginning to actually know the column names because they are changing from week to week- I'll import regularly the excel file that we receive once a week).

Badge +22

So the first row contains the column names, which vary week to week, but will always contain one or more columns called "Replacing", of which you want the product of all the legitimate values. (Product not sum?)

 

 

Badge

@jdh Yes, you got it partially right. The attribute names vary, that's why I want to read the first row when importing. Then I want to use this first row to rename my attributes and delete this row as a record (in my case only 8 rows would remain). After that I want to multiply all rows that contain more than one value for "replacing". For row 3 I would like to duplicate it and then have only one value for the attribute 'replacing" per row. As a result I would like to have only 6 columns in my table (Replacement for Cancelled Cells, Unit, Cellname, Celltitle, Edition and Replacing) but 14 rows (first row deleted, 2x row 3, 4x row 4, 2x row 6, 2x row 8). Hope this time I explained it well enough.

I tried what @erik_jan suggested (to use the Attribute Manager) but this is not working. Instead of multiplying row 3, it stores all values from E to H in one attribute. That is not what I want. The result is the same number of rows.

 

Badge +22

I think it was the multiplying that threw us.

 

 

You want one feature for each instance of a column named "Replacing".

 

 

How about using a featureReader with the dataset set to an user parameter, all feature types and SingleOutputPort.

This should produce features with the attribute names from the first column. With the extra Replacing being named Replacing00, Replacing01.

Which is what you need for the ListExpressionPopulator (Source attribute expression: ^Replacing) which will take all your 'Replacing' attributes and turn them into a list.

Follow that by a ListExploder and you've 'multiplied' your rows that have multiple ReplacingValues.

 

 

You'll want to test to remove the empty values, and probably do some attribute cleanup to remove the Replacing00, Replacing01 original attributes.
Badge +22

@jdh Yes, you got it partially right. The attribute names vary, that's why I want to read the first row when importing. Then I want to use this first row to rename my attributes and delete this row as a record (in my case only 8 rows would remain). After that I want to multiply all rows that contain more than one value for "replacing". For row 3 I would like to duplicate it and then have only one value for the attribute 'replacing" per row. As a result I would like to have only 6 columns in my table (Replacement for Cancelled Cells, Unit, Cellname, Celltitle, Edition and Replacing) but 14 rows (first row deleted, 2x row 3, 4x row 4, 2x row 6, 2x row 8). Hope this time I explained it well enough.

I tried what @erik_jan suggested (to use the Attribute Manager) but this is not working. Instead of multiplying row 3, it stores all values from E to H in one attribute. That is not what I want. The result is the same number of rows.

 

I think it was the multiplying that threw us.

 

 

You want one feature for each instance of a column named "Replacing".

 

 

How about using a featureReader with the dataset set to an user parameter, all feature types and SingleOutputPort.

This should produce features with the attribute names from the first column. With the extra Replacing being named Replacing00, Replacing01.

Which is what you need for the ListExpressionPopulator (Source attribute expression: ^Replacing) which will take all your 'Replacing' attributes and turn them into a list.

Follow that by a ListExploder and you've 'multiplied' your rows that have multiple ReplacingValues.

 

 

You'll want to test to remove the empty values, and probably do some attribute cleanup to remove the Replacing00, Replacing01 original attributes.
Badge

@jdh Thanks for your answer. It's working. The only thing that I still don't know how to solve is the import (to have the first row as a header). For this question I didn't mentioned all the other difficulties I have with this excel file. The excel file is actually looking like this : https://knowledge.safe.com/questions/28361/data-e

The replacing part is only a part of the whole excel file. And the row number of the first row for the replacing part varies from to week to week (different data volume each week) so I can't specify a certain line number when importing.

Moreover in this case the replacing columns are from column E to H. Next week , it could be from E to J or K, that's why I want to avoid to rename the columns manually. I want to to avoid as well to modify the excel file manually before importing as it is potentially a source of error.

Badge +22

@jdh Thanks for your answer. It's working. The only thing that I still don't know how to solve is the import (to have the first row as a header). For this question I didn't mentioned all the other difficulties I have with this excel file. The excel file is actually looking like this : https://knowledge.safe.com/questions/28361/data-e

The replacing part is only a part of the whole excel file. And the row number of the first row for the replacing part varies from to week to week (different data volume each week) so I can't specify a certain line number when importing.

Moreover in this case the replacing columns are from column E to H. Next week , it could be from E to J or K, that's why I want to avoid to rename the columns manually. I want to to avoid as well to modify the excel file manually before importing as it is potentially a source of error.

ah, that complicates things immensely.

 

 

You should be able to adapt Takashi's solution for separating out the different parts of the excel file.

 

 

Add a sampler immediately after the variableRetriver grouped by your variable name, with 1 N features.

 

This separates the column name row from the data row.

 

 

Use a FeatureMerger with the sampled as a supplier and the notSampled as the requestor. Group by your variable name, join on 1,1 with an accumulation mode of prefix supplier (fname_).

 

 

This leaves you with only your data features, with a set of attributes containing values and a set of attributes containing the names.

 

eg

 

A: aa

 

B:12

 

C: abc

 

fname_A: Unit

 

fname_B: Cell Name

 

fname_C: Cell Title

 

 

etc.

 

 

you should be able to manipulate that into key,value pairs and then into attributes.

 

It should in theory be possible to do just with fme transformers, but it'll probably be easier within a PythonCaller.

 

Reply