Question

Making the aggregator less 'blocking' when using multiple group by attributes

  • 6 November 2018
  • 8 replies
  • 3 views

Userlevel 1
Badge +21

A workspace has an Aggregator that aggregates features using multiple Group By attributes. One of the Group By attributes is ordered, but the others are not. Any suggestions on how this could be processed so that the ordering of that one attribute is taken into account, rather than the Aggregator holding on to every feature until the end?



Userlevel 2
Badge +17

I think the only way is to sort the features by all the Group By attributes beforehand and set the Input is Ordered by Group parameter in the Aggregator to Yes.
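
To illustrate the idea outside FME, here is a minimal plain-Python sketch (not FME code; the function name `aggregate_sorted` and the sample attributes are made up for the example). Once the input is sorted by every group-by attribute, each group can be emitted as soon as the key changes, so only one group is held at a time. The sort itself still has to see all the features first.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(features, group_keys, value_key):
    """Yield (group, total) pairs from features already sorted by group_keys."""
    key_fn = itemgetter(*group_keys)
    for group, members in groupby(features, key=key_fn):
        # Only the current group is held; it is emitted when the key changes.
        yield group, sum(f[value_key] for f in members)

# Hypothetical sample features, deliberately unsorted.
features = [
    {"region": "B", "type": "road", "length": 5},
    {"region": "A", "type": "rail", "length": 2},
    {"region": "A", "type": "road", "length": 3},
    {"region": "B", "type": "road", "length": 1},
]

# The sort is the blocking step: sorted() must consume every feature first.
ordered = sorted(features, key=itemgetter("region", "type"))
result = list(aggregate_sorted(ordered, ["region", "type"], "length"))
```

With two or more group-by attributes the group key is a tuple; with a single attribute `itemgetter` returns the bare value instead.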

Userlevel 6
Badge +33

Or read the source data in separate sessions, one per Group By value, and process and aggregate the features of each group on its own. I'm not sure whether this works with your data or workflow, though.

Or temporarily write the features out and then read them back in sessions, one per Group By attribute value. Depending on the volume and the storage solution, though, that may be too much I/O.
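
A rough plain-Python sketch of the temporary-file idea, again not FME code: features are split into one JSON-lines file per group value, and each file is then aggregated separately. The helper names `partition_to_files` and `aggregate_file`, the file naming, and the sample data are all assumptions for illustration. It also assumes the group values are safe to use as file names and that there are few enough groups to keep one file handle open per group.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

def partition_to_files(features, group_key, directory):
    """Write each feature to a JSON-lines file named after its group value."""
    handles = {}
    try:
        for f in features:
            name = str(f[group_key])  # assumes the value is a safe file name
            if name not in handles:
                handles[name] = open(Path(directory) / f"{name}.jsonl", "w")
            handles[name].write(json.dumps(f) + "\n")
    finally:
        for h in handles.values():
            h.close()
    return sorted(handles)

def aggregate_file(path, other_keys, value_key):
    """Sum value_key per combination of other_keys within a single group file."""
    totals = defaultdict(int)
    with open(path) as fh:
        for line in fh:
            f = json.loads(line)
            totals[tuple(f[k] for k in other_keys)] += f[value_key]
    return dict(totals)

# Hypothetical sample features.
features = [
    {"region": "B", "type": "road", "length": 5},
    {"region": "A", "type": "rail", "length": 2},
    {"region": "A", "type": "road", "length": 3},
    {"region": "B", "type": "road", "length": 1},
]

with tempfile.TemporaryDirectory() as tmp:
    groups = partition_to_files(features, "region", tmp)
    results = {
        g: aggregate_file(Path(tmp) / f"{g}.jsonl", ["type"], "length")
        for g in groups
    }
```

Every feature is written once and read once, which is the extra I/O mentioned above; in exchange, only one group's totals are in memory at a time during aggregation.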

Userlevel 1
Badge +21

I think the only way is to sort the features by all the Group By attributes beforehand and set the Input is Ordered by Group parameter in the Aggregator to Yes.

Does that not just move the blocking problem to the sorter?

Userlevel 2
Badge +17

I don't think you can avoid the blocking problem. However, performance could be better if you insert a Sorter before the Aggregator, so I think it's worth trying.

Userlevel 4
Badge +13

I see the issue, and it is a tricky one. Once the ordered attribute changes, you'd like to flush out everything in the Aggregator, because you know no more features can arrive for the previous value. The only trick I could think of is to use a custom transformer. If you make a custom transformer containing just an Aggregator, turn on its "parallel group by" with only the one attribute you know is ordered, and indicate that the input is ordered, then you *should* get what you want. I haven't tried it, but see below and attached. Let us know if this works.

 

flushedaggy.fmw
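
The behaviour being aimed for here, flushing everything whenever the ordered attribute changes, can be sketched in plain Python. This is only an illustration of the logic, not how the Aggregator or the attached workspace works internally; the function name `aggregate_flush_on_change` and the sample attributes are hypothetical.

```python
_UNSET = object()  # sentinel meaning "no feature seen yet"

def aggregate_flush_on_change(features, ordered_key, other_keys, value_key):
    """Sum value_key per (ordered_key, *other_keys), flushing on each change of ordered_key."""
    current = _UNSET  # value of the ordered attribute currently being collected
    totals = {}       # partial sums for the current ordered value only
    for f in features:
        if current is not _UNSET and f[ordered_key] != current:
            # The ordered attribute changed, so the previous value is complete.
            for rest, total in totals.items():
                yield (current, *rest), total
            totals = {}
        current = f[ordered_key]
        rest = tuple(f[k] for k in other_keys)
        totals[rest] = totals.get(rest, 0) + f[value_key]
    # Flush whatever is left after the last feature.
    for rest, total in totals.items():
        yield (current, *rest), total

# Hypothetical input: ordered by "day", but not by "type".
features = [
    {"day": 1, "type": "road", "length": 5},
    {"day": 1, "type": "rail", "length": 2},
    {"day": 1, "type": "road", "length": 1},
    {"day": 2, "type": "rail", "length": 4},
    {"day": 2, "type": "road", "length": 3},
]
result = list(aggregate_flush_on_change(features, "day", ["type"], "length"))
```

Memory use is bounded by the number of distinct combinations of the unordered attributes within one value of the ordered attribute, rather than by the whole dataset.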

Userlevel 1
Badge +21

I'll take a look at that option tomorrow. I'm trying to remove the requirement to use a WorkspaceRunner, and the only remaining issue is the Aggregator.

Userlevel 1
Badge +21

Putting the Aggregator inside the custom transformer actually adds 30 seconds to the run time for half a million features. The only difference I saw in the output was that the features came out of the Aggregator ordered by the parallel group by attribute.

Userlevel 4
Badge +13

Hmm, that is really too bad. We're working on this part of FME right now and hope we can make it better in future. But for now it looks like this ended up being a dead end.
