Skip to main content
Solved

looping over rows, columns, perform operations on the columns and store values in new columns


I am just one week into FME (a friend recommended it to me)

I want to loop through a 761,000 rows by 25 columns excel document, for the each row

  1. count columns with values>0 and store in a column called count
  2. for the values > 0, add them and store in a new column called anim_loc
  3. return the names of the column title that has values greater than zero and store in a column called anim_names

Please how can i achieve these with FME.

Best answer by fmelizard

Hi @whizzkid247, I've attached a workspace to my post here. In it, I've calculated the count values with the Aggregator, the Anim_loc values using FME's ability to utilize Excel functions, and the Anim_names attribute with lists.

 

 

looping-excel.fmw
View original
Did this help you find an answer to your question?

13 replies

erik_jan
Contributor
Forum|alt.badge.img+18
  • Contributor
  • July 24, 2017

I think the StatisticsCalculator transformer will be your friend after reading the Excel file using the Excel reader.

For filtering out values of 0 you want to use the Tester transformer.


gio
Contributor
Forum|alt.badge.img+15
  • Contributor
  • July 25, 2017

@whizzkid247

Can do it like this.

excel-non-empty-column-counting.png


lars_de_vries
Forum|alt.badge.img+10

You want to check a lot of features (25*761000). I think the best way to solve this is using a Python script within a PythonCaller transformer.

I have been looking into a solution using an AttributeExploder and a Aggregator and it works quite quickly with the sample file, like @gio suggested as well. However I used a cloner to reproduce the same number of features and it will take like an hour or more to process the 761.000 features.

Personally I am not so good with Python, but maybe someone else can help with that.

Also I have been looking for a solution within Excel, and the first two columns can be filled very easy by using the COUNTIF(range, criteria) function for Count and SUM() function for Anim_loc. I am not sure how to get the list of Anim_names.


gio wrote:

@whizzkid247

Can do it like this.

excel-non-empty-column-counting.png

Thanks i am still trying this out. I am new to FME so i need to look at the documentation as well as watch videos on how to set the parameters

 

 


gio wrote:

@whizzkid247

Can do it like this.

excel-non-empty-column-counting.png

@gio I do not understand where to set the TEST in the Tester transformer.

 

i.e. How do i input in the Test Clauses?

 

TEST @Value(_attr_name) NOT_LIKE fme_%

 

TEST @Value(_attr_name) NOT_LIKE xlsx_%

 

TEST @Value(_attr_name) NOT_= xlsx_row_id

 

TEST @Value(_attr_value) NOT_= 0

 


lars_de_vries wrote:

You want to check a lot of features (25*761000). I think the best way to solve this is using a Python script within a PythonCaller transformer.

I have been looking into a solution using an AttributeExploder and a Aggregator and it works quite quickly with the sample file, like @gio suggested as well. However I used a cloner to reproduce the same number of features and it will take like an hour or more to process the 761.000 features.

Personally I am not so good with Python, but maybe someone else can help with that.

Also I have been looking for a solution within Excel, and the first two columns can be filled very easy by using the COUNTIF(range, criteria) function for Count and SUM() function for Anim_loc. I am not sure how to get the list of Anim_names.

@lars_de_vries yes i can easily do the first two columns with excel, but i face two challenges;

 

First, excel crashes because of the size of the data, and

 

second, the most important column to my work is the last column ie. Anim_names

 

 

could you point me to a tutorial on how to use the pythonCaller transformer?

 


fmelizard
Safer
Forum|alt.badge.img+18
  • Safer
  • Best Answer
  • July 25, 2017

Hi @whizzkid247, I've attached a workspace to my post here. In it, I've calculated the count values with the Aggregator, the Anim_loc values using FME's ability to utilize Excel functions, and the Anim_names attribute with lists.

 

 

looping-excel.fmw

fmelizard wrote:

Hi @whizzkid247, I've attached a workspace to my post here. In it, I've calculated the count values with the Aggregator, the Anim_loc values using FME's ability to utilize Excel functions, and the Anim_names attribute with lists.

 

 

looping-excel.fmw
@TiaAtSafe exactly what i needed. Thank you so much!!

 

 

The count is not working properly (it was supposed to count values that are greater than zero). I can do this in excel.

 

 

Thank you again!

 


chriswilson
Supporter
Forum|alt.badge.img+11
  • Supporter
  • July 26, 2017
whizzkid247 wrote:
@lars_de_vries yes i can easily do the first two columns with excel, but i face two challenges;

 

First, excel crashes because of the size of the data, and

 

second, the most important column to my work is the last column ie. Anim_names

 

 

could you point me to a tutorial on how to use the pythonCaller transformer?

 

The python caller documentation is here: https://docs.safe.com/fme/2017.0/html/FME_Desktop_Documentation/FME_Transformers/Transformers/pythoncaller.htm

 

 

If you are in FME and add any transformer to the canvas (the large white area) you can then right click on it and select Help to bring up documentation on that transformer, which is often very good at explaining what it does.

 

 


lars_de_vries
Forum|alt.badge.img+10
Hi @whizzkid247,

 

 

I saw your remark on the Workspace provided by @TiaAtSafe. Hereby I provide you my solution, where the count does work. Still you need to take into account that for the full amount you stated in your question, the workspace will run over an hour.

 

It also shows that there are many solutions to the same problem.

 

xlsx-read-explode-and-aggregate-v000a.fmw

 


lars_de_vries wrote:
Hi @whizzkid247,

 

 

I saw your remark on the Workspace provided by @TiaAtSafe. Hereby I provide you my solution, where the count does work. Still you need to take into account that for the full amount you stated in your question, the workspace will run over an hour.

 

It also shows that there are many solutions to the same problem.

 

xlsx-read-explode-and-aggregate-v000a.fmw

 

Thank you @lars_de_vries

lars_de_vries wrote:
Hi @whizzkid247,

 

 

I saw your remark on the Workspace provided by @TiaAtSafe. Hereby I provide you my solution, where the count does work. Still you need to take into account that for the full amount you stated in your question, the workspace will run over an hour.

 

It also shows that there are many solutions to the same problem.

 

xlsx-read-explode-and-aggregate-v000a.fmw

 

@lars_de_vries Thank you once more for your assistance! Please could you explain what this (, ){2,} and (^, |, $) represents in the text to replace field of both StringReplacerParameters?

 

 


lars_de_vries
Forum|alt.badge.img+10
lars_de_vries wrote:
Hi @whizzkid247,

 

 

I saw your remark on the Workspace provided by @TiaAtSafe. Hereby I provide you my solution, where the count does work. Still you need to take into account that for the full amount you stated in your question, the workspace will run over an hour.

 

It also shows that there are many solutions to the same problem.

 

xlsx-read-explode-and-aggregate-v000a.fmw

 

Hi @whizzkid247,

 

The (,) {2,} means two or more instances of ', '. The (^, |, $) means starts with ', ' or ends with ', '. The '|' sign means or '^' means "begins with" and '$' means "ends with" .

 

 


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings