
Let's say I have 10000 features going through my workflow and 7000 do not pass a test. I want to send a single message (to whatever) and not a message for each feature.

I know there are many ways of doing this, but I'm looking for the most efficient one. Maybe my approach is wrong, but I've considered any transformer that lets me get a single feature, after which I build the message with an AttributeManager or an AttributeCreator:

  • Sampler: 9999 features would go through the NotSampled port
  • StatisticsCalculator: I actually have no statistics to calculate
  • Aggregator: may be the best option
  • Any other?

 

 

Hi @dms2,

Go for the Aggregator and set the "Accumulation Mode" parameter to whatever suits you best, perhaps "Use Attributes from One Feature".

Hope that helps!


I think setting the "Accumulation Mode" parameter to "Use Attributes from One Feature" is the best option. I guess only one feature is read, so it should be faster than any other parameter combination.


Go with the Sampler. The features going through the NotSampled port will not be processed or output if you turn off Feature Caching (which you should do anyway for performance reasons). The Aggregator waits until all features have been collected before it outputs the merged feature, which increases the amount of memory used by the process.

To be sure, you can do some performance testing to compare.
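To illustrate why the two approaches differ in memory use, here is a rough sketch in plain Python. It is only an analogy, not how FME implements either transformer: `failed_features`, `first_feature_streaming` and `first_feature_blocking` are made-up names, and the dictionaries stand in for features.

```python
def failed_features(n):
    """Hypothetical stand-in for the features that did not pass the test."""
    return ({"id": i} for i in range(n))


def first_feature_streaming(features):
    """Sampler-like: keep one feature and let the rest pass by unstored."""
    kept = None
    held = 0  # number of features kept in memory
    for feature in features:
        if kept is None:
            kept = feature
            held = 1
        # every later feature is dropped immediately
    return kept, held


def first_feature_blocking(features):
    """Aggregator-like: store everything, then emit one feature at the end."""
    stored = list(features)  # all features are in memory at this point
    kept = stored[0] if stored else None
    return kept, len(stored)


streamed, held_streaming = first_feature_streaming(failed_features(7000))
blocked, held_blocking = first_feature_blocking(failed_features(7000))
print(streamed == blocked, held_streaming, held_blocking)  # True 1 7000
```

Both functions end up with the same single feature to build the message from; the difference is how many features have to be held before that one feature is available.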


This. But why not test it yourself? Using 1,000,000 features generated by a Creator, the results on my old laptop are:

  • Sampler
    • FME Session Duration: 0.7 seconds. (CPU: 0.3s user, 0.3s system)
    • END - ProcessID: 9416, peak process memory usage: 46564 kB, current process memory usage: 46564 kB
  • Aggregator (Attributes Only / Drop Incoming Attributes)
    • FME Session Duration: 3.3 seconds. (CPU: 2.7s user, 0.5s system)
    • END - ProcessID: 11440, peak process memory usage: 425876 kB, current process memory usage: 47232 kB
  • StatisticsCalculator (Analyze _creation_instance / No statistics)
    • FME Session Duration: 48.7 seconds. (CPU: 47.5s user, 0.5s system)
    • END - ProcessID: 14940, peak process memory usage: 95540 kB, current process memory usage: 88996 kB
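If you want to put those numbers side by side, a small Python snippet like the one below can pull the duration and peak memory out of the log lines and express them relative to the Sampler run. The log lines are copied from the results above; the regular expressions and the `runs` and `ratios` names are just my own choices for this sketch.

```python
import re

# Log lines copied from the three test runs above.
runs = {
    "Sampler": (
        "FME Session Duration: 0.7 seconds. (CPU: 0.3s user, 0.3s system)",
        "END - ProcessID: 9416, peak process memory usage: 46564 kB, "
        "current process memory usage: 46564 kB",
    ),
    "Aggregator": (
        "FME Session Duration: 3.3 seconds. (CPU: 2.7s user, 0.5s system)",
        "END - ProcessID: 11440, peak process memory usage: 425876 kB, "
        "current process memory usage: 47232 kB",
    ),
    "StatisticsCalculator": (
        "FME Session Duration: 48.7 seconds. (CPU: 47.5s user, 0.5s system)",
        "END - ProcessID: 14940, peak process memory usage: 95540 kB, "
        "current process memory usage: 88996 kB",
    ),
}

DURATION = re.compile(r"FME Session Duration: ([\d.]+) seconds")
PEAK_KB = re.compile(r"peak process memory usage: (\d+) kB")

# Extract (seconds, peak kB) for each run.
parsed = {}
for name, (duration_line, memory_line) in runs.items():
    seconds = float(DURATION.search(duration_line).group(1))
    peak_kb = int(PEAK_KB.search(memory_line).group(1))
    parsed[name] = (seconds, peak_kb)

# Express every run relative to the Sampler run.
base_seconds, base_kb = parsed["Sampler"]
ratios = {
    name: (round(seconds / base_seconds, 1), round(peak_kb / base_kb, 1))
    for name, (seconds, peak_kb) in parsed.items()
}

for name, (time_ratio, memory_ratio) in ratios.items():
    print(f"{name}: {time_ratio}x time, {memory_ratio}x peak memory")
```

On these figures the Aggregator run took roughly 4.7 times as long and peaked at roughly 9.1 times the memory of the Sampler run, while the StatisticsCalculator run took roughly 70 times as long with about twice the peak memory.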

Thanks for doing the test. That saved me some precious time.

Anyway, I thought there would be some other "more elegant" or obvious option that I hadn't considered. Often there's a 'hidden' parameter or trick that makes transformers do what you thought they were not intended to do.

