Solved

StatisticsCalculator error with Python 3.6+


Hey FME'ers,

 

I am currently updating all my workbenches from Python 2.7 to Python 3.6+. While doing so I ran into a problem with the StatisticsCalculator, which gives me the error: "'<' not supported between instances of 'float' and 'str'".

As far as I understand, this is because some of the data is numeric while the rest are strings. To avoid the error I tried to force all data to strings, but that did not do the trick. I suspect this has to do with the fact that Python 3+ uses UTF-8 as default. Note that I only want to calculate the 'Total Count Attribute'.
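For reference, the same message can be reproduced in plain Python 3, outside FME. This is only a minimal sketch with made-up values; the `compare` helper is hypothetical and not part of FME. It suggests the error comes from ordering a number against a string, which Python 3 refuses to do, and not from the text encoding.

```python
def compare(a, b):
    """Return a < b, or the error text if Python cannot order the two values."""
    try:
        return a < b
    except TypeError as exc:
        return f"TypeError: {exc}"

# A float against a string cannot be ordered in Python 3.
print(compare(1.5, "abc"))
# -> TypeError: '<' not supported between instances of 'float' and 'str'

# Two strings can be ordered (character by character).
print(compare("1.5", "abc"))
# -> True
```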

Attached is a simple example of my problem. My Workbench version is 2018.1, but I encountered the same problem in FME 2019.

Any suggestions?

Thanks in advance!


Best answer by hollyatsafe 27 August 2019, 18:59


19 replies

This looks like one for Safe.

If you just want a total count, you could, as a workaround, use a Counter with a Count Start of 1 followed by a Sampler that samples the last feature.
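The idea behind this workaround can be sketched in plain Python: number every feature starting at 1 and keep only the last one, whose number is then the total. This is an illustration only, not FME code; the feature dictionaries, the `total_count` helper and the `_count` attribute name are assumptions made for the example.

```python
def total_count(features, count_start=1, count_attr="_count"):
    """Number the features like a counter, then keep the last one like a sampler."""
    last = None
    for number, feature in enumerate(features, start=count_start):
        # Copy the feature and attach its running number.
        last = {**feature, count_attr: number}
    return last  # None when there are no features at all

# Hypothetical features with a mixed numeric/string attribute.
features = [{"value": 12.5}, {"value": "n/a"}, {"value": 7}]
print(total_count(features))
# -> {'value': 7, '_count': 3}
```

Because nothing is compared or sorted here, mixed data types in the attribute do not matter.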

Thanks egomm for the reply! Your solution is a very good workaround. Nevertheless I'm still curious how to solve this with the StatisticsCalculator.

Although I can see the error when I run your workspace, if I replace the StatisticsCalculator the error disappears.

I think I remember there was an issue with the StatisticsCalculator before. Maybe this is an old transformer?

I updated the transformer so I don't think that could be the issue.

Which version are you using?

StatisticsCalculator version 7 (within FME 2018.1)

Hi @louisput,

Thank you for bringing this issue to our attention. I have been able to reproduce the error in the latest FME Desktop beta. It looks like Python doesn't naturally handle the sorting of strings and numbers, so when an attribute contains both data types the transformer is currently unable to process these.

I have filed bug FMEENGINE-61301 as we should be able to do better here. In the meantime I believe the solution provided by egomm is the best workaround.

Update: This issue was fixed for FME 2020.0
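For readers who hit the same limitation in their own Python scripts, one common way to sort mixed numbers and strings is to supply a sort key that puts both kinds of value on a comparable footing. The sketch below is an illustration in plain Python and is not the fix that was shipped in FME, which this thread does not describe; `mixed_key` and the sample values are hypothetical.

```python
def mixed_key(value):
    """Sort key: numeric values first (by number), then everything else as text."""
    try:
        return (0, float(value), "")
    except (TypeError, ValueError):
        # Not parseable as a number: fall back to its text form.
        return (1, 0.0, str(value))

values = [10, "abc", "9", 2.5, "12a"]
ordered = sorted(values, key=mixed_key)
print(ordered)
# -> [2.5, '9', 10, '12a', 'abc']
```

Note that this key treats numeric-looking strings such as "9" as numbers, which may or may not be what a given workflow needs.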

Thanks for the reply!

If you want the sum instead of the total count, you can do the following as a workaround for StatisticsCalculator:

  1. AttributeCreator to create a new attribute which is the same for all records
  2. ListBuilder to create a list which will be attached to all records, with the Selected Attributes set to the attribute you want to sum
  3. ListSummer to sum all the values in the list
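The three steps above can be sketched in plain Python to show what each one contributes. This is an illustration only, not FME code; the `amount` and `_group` attribute names and the sample features are assumptions made for the example.

```python
# Hypothetical features with a numeric attribute to sum.
features = [{"amount": 2}, {"amount": 3.5}, {"amount": 4}]

# 1. AttributeCreator: give every record the same attribute value,
#    so that all records fall into a single group.
for feature in features:
    feature["_group"] = "all"

# 2. ListBuilder: collect the attribute to sum into one list for that group.
amounts = [f["amount"] for f in features if f["_group"] == "all"]

# 3. ListSummer: add up the values in the list.
total = sum(float(a) for a in amounts)
print(total)
# -> 9.5
```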

If you need to use a 'GroupBy' in the Counter, refer to @takashi's solution here: https://knowledge.safe.com/content/idea/23363/allows-select-multiple-attribute-value-for-counter.html
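As a rough picture of what a grouped count does, here is a plain Python sketch that keeps one running counter per combination of group attribute values. It is not a reproduction of the linked solution; `count_by_group`, the `_count` attribute and the sample features are hypothetical.

```python
from collections import defaultdict

def count_by_group(features, group_attrs, count_attr="_count"):
    """Number features within each group and return the per-group totals."""
    counters = defaultdict(int)
    numbered = []
    for feature in features:
        # The group key is the tuple of the selected attribute values.
        key = tuple(feature.get(attr) for attr in group_attrs)
        counters[key] += 1
        numbered.append({**feature, count_attr: counters[key]})
    return numbered, dict(counters)

features = [
    {"type": "road", "id": 1},
    {"type": "rail", "id": "2"},
    {"type": "road", "id": 3.0},
]
numbered, totals = count_by_group(features, ["type"])
print(totals)
# -> {('road',): 2, ('rail',): 1}
```

Grouping relies on equality and hashing only, so attributes that mix numbers and strings do not trigger the comparison error.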

At a client of mine, in 2018, a data stream with an attribute holding a mix of numeric and numeric+character values was sent through a StatisticsCalculator to find the _min and _max values. This process stopped working after the upgrade to 2019.1.2.

When building the workaround using a Sorter, two Samplers (first and last) and an Aggregator, I discovered that the old StatisticsCalculator route often did not yield the correct/expected answers.

In short: it previously did not crash and now it does, but it never consistently yielded the correct results.

At first glance the output of the 2018 StatisticsCalculator looks quite plausible, but looking into the 5% of instances that did not match, I found that my Sorter/Sampler route produced the correct _min and _max where the old StatisticsCalculator failed.
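To illustrate how a minimum and maximum can look plausible most of the time and still be wrong in some cases, the plain Python sketch below sorts the same values once as text and once as numbers, then takes the first and last item (the sort-and-sample idea). Whether the old StatisticsCalculator actually compared values as text is an assumption on this editor's part and is not confirmed anywhere in this thread; the values are made up.

```python
# Hypothetical attribute values, all stored as strings.
values = ["9", "10", "100", "25"]

# Sorted as text: compared character by character.
as_text = sorted(values)
text_min, text_max = as_text[0], as_text[-1]

# Sorted as numbers: compared by numeric value.
as_numbers = sorted(values, key=float)
num_min, num_max = as_numbers[0], as_numbers[-1]

print(as_text, text_min, text_max)
# -> ['10', '100', '25', '9'] 10 9
print(as_numbers, num_min, num_max)
# -> ['9', '10', '25', '100'] 9 100
```

When all values have the same number of digits the two orderings agree, which is one way such a discrepancy could stay hidden in most groups.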

@hollyatsafe: If interested I could try to compile a set of workspaces and data for you to analyse...

Kind regards,

Martin

Hi @martinkoch,

Definitely! If this appeared to work in an earlier version but the results were actually incorrect, it would be great to see an example workspace demonstrating this. It would be useful in the testing phase of this bug once it has been marked as fixed.

- Thanks, Holly

I expected as much, so I prepared something today.

Let me know how to send this to you (you at Safe probably have my e-mail address..)

Awesome, thank you @martinkoch! Please could you send this to and include 'Attn Holly' in the email subject.

I ran into a similar problem using the NullAttributeCounter transformer with an Excel spreadsheet that has fields with both numerical and string data types. These fields are already set as string. What can I do to set all values to one constant value and the rest to null? I would really like to use the NullAttributeCounter transformer, as it worked perfectly on a separate dataset. I would like to be able to just create a new Excel spreadsheet that is "1" for a value and the rest null. I am using FME 2018 Server on Citrix.

Hi @jencinas,

The NullAttributeCounter is not owned by Safe Software but rather is available from the FME Hub and was created by another FME user. Therefore, if there is a problem with this transformer or you'd like to see an enhancement, please feed this back to that user directly by commenting on their transformer on the Hub. Hopefully they'll be able to work with you to get this fixed.

Hi @hollyatsafe,

I was wondering if this fix could also be applied to the "AttributePivoter" transformer? I am getting the exact same error when trying to assign an attribute that contains both numbers and strings to the "Attribute to Analyze" parameter.

I am currently using FME(R) 2021.0.0.0 (20210305 - Build 21302 - WIN64) with, I think, the most up-to-date version of the transformer.

Cheers,

David

Hi @ponting13,

Update: This issue has been fixed in FME 2021.1.

It looks like we are aware that this is also an issue with the AttributePivoter (internal reference: FMEENGINE-48446), but currently there is no planned work for this.

As a workaround, perhaps you can use the StatisticsCalculator. Example 2 in the Pivot Tables using FME article demonstrates how this could be done.
