
Has anyone else noticed a significant decrease in performance when the ListBuilder's Input Is Ordered parameter is set to Ordered by Group?

With 100,000 features in 10,000 groups of 10 features each, setting Input Is Ordered to No gives a processing time of 8 seconds, even though the transformer is blocking. Setting it to Ordered by Group gives a processing time of 74 seconds (almost 10x slower), even though the transformer is then non-blocking.

No parallel processing is involved.

I assume the data is actually ordered the right way. If not, this is easily explained, as it would create too many lists.
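
To illustrate why misordered input would produce too many lists, here is a minimal Python sketch. It is not FME code. It assumes a non-blocking list builder simply closes the current list whenever the group key changes, and `build_lists_streaming` and the sample features are hypothetical.

```python
from itertools import groupby

def build_lists_streaming(features, key):
    """Close a list each time the group key changes.

    This models the assumed ordered-input behaviour; it is not FME's code.
    """
    return [list(group) for _, group in groupby(features, key=key)]

def group_of(feature):
    # Hypothetical feature layout: (group_id, value)
    return feature[0]

ordered = [("a", 1), ("a", 2), ("b", 3), ("b", 4)]
shuffled = [("a", 1), ("b", 3), ("a", 2), ("b", 4)]

print(len(build_lists_streaming(ordered, group_of)))   # 2 lists, one per group
print(len(build_lists_streaming(shuffled, group_of)))  # 4 lists, each group is split
```

With correctly ordered input the list count equals the group count, so a surplus of lists would be the first thing to check.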



The data is correctly ordered.

 

 


Both options produce identical results, though the Ordered: No option has a much higher peak memory consumption, which is expected.
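
The memory difference can be sketched in Python. This is only a conceptual model, not how FME implements the ListBuilder. It assumes the blocking mode holds every feature until the input ends, while the ordered mode holds one group at a time. Both function names are hypothetical, and the sketch does not reproduce or explain the slowdown reported in this thread.

```python
from collections import defaultdict
from itertools import groupby

def build_lists_blocking(features, key):
    """Assumed 'No' behaviour: keep every feature until the input ends."""
    groups = defaultdict(list)
    for feature in features:
        groups[key(feature)].append(feature)
    held = sum(len(g) for g in groups.values())  # features held at the end of input
    return list(groups.values()), held

def build_lists_ordered(features, key):
    """Assumed 'Ordered by Group' behaviour: finish a list when the key changes."""
    lists, peak = [], 0
    for _, group in groupby(features, key=key):
        current = list(group)
        peak = max(peak, len(current))  # only the open group needs to be held
        lists.append(current)  # a real streaming builder would emit this downstream
    return lists, peak

def group_of(feature):
    # Hypothetical feature layout: (group_id, value)
    return feature[0]

# Same shape as the test data in the question: 10,000 groups of 10 features
features = [(g, i) for g in range(10_000) for i in range(10)]

blocking_lists, blocking_held = build_lists_blocking(features, group_of)
ordered_lists, ordered_peak = build_lists_ordered(features, group_of)

print(blocking_lists == ordered_lists)  # True: identical results
print(blocking_held, ordered_peak)      # 100000 10
```

Under these assumptions the ordered mode should need far less memory and no more work per feature, which is why the measured slowdown is surprising.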

 

 


I figured it would be.

 

 


I smell a bug. I know that for some processing functions, when we turn on "ordered by group", another layer is added above the actual processing unit (a "factory" in old-school FME lingo). That must be causing this. I just verified this and will have a ticket/case created. In my case it goes from 2.6 seconds without ordering by group to 77 seconds with it. Not much of an optimization :-)



listbuilderorderedby.fmw in case anyone wants to try this at home...

 

 

