Question

Making the aggregator less 'blocking' when using multiple group by attributes


ebygomm
Influencer

A workspace has an Aggregator that aggregates features using multiple Group By attributes. The features arrive ordered by one of those attributes, but not by the others. Any suggestions on how this could be processed so that the Aggregator takes advantage of the ordered attribute, rather than holding on to everything until the last feature arrives?

8 replies

takashi
Contributor
  • November 6, 2018

I think the only way is to sort the features by all the Group By attributes beforehand and set Yes to the Input is Ordered by Group parameter in the Aggregator.
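To illustrate why fully sorted input lets an aggregator release groups early, here is a minimal plain-Python sketch. It is not FME's implementation; the feature dictionaries and the attribute names `region`, `kind` and `value` are made up for illustration. Once the features are sorted by every Group By attribute, a group is complete as soon as the key changes, so only one group has to be held at a time.

```python
from itertools import groupby

def aggregate_sorted(features, group_by):
    """Yield one aggregate per group; assumes features are sorted by group_by."""
    def key(feature):
        return tuple(feature[attr] for attr in group_by)

    for group_key, members in groupby(features, key=key):
        members = list(members)  # only the current group is held in memory
        yield {
            **dict(zip(group_by, group_key)),
            "count": len(members),
            "total": sum(f["value"] for f in members),
        }

# Hypothetical features, deliberately out of order.
features = [
    {"region": "B", "kind": "x", "value": 1},
    {"region": "A", "kind": "y", "value": 2},
    {"region": "A", "kind": "x", "value": 3},
    {"region": "A", "kind": "y", "value": 4},
    {"region": "B", "kind": "x", "value": 5},
]
group_by = ("region", "kind")

# The sort still has to see every feature before it can emit the first one.
features.sort(key=lambda f: tuple(f[attr] for attr in group_by))
results = list(aggregate_sorted(features, group_by))
```

The aggregation step itself streams, but the sort in front of it does not, so this mainly reduces how much the aggregation step has to hold at once.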


nielsgerrits
VIP
takashi wrote:

I think the only way is to sort the features by all the Group By attributes beforehand and set Yes to the Input is Ordered by Group parameter in the Aggregator.

Or read the source data in sessions, one per group-by value, and process and aggregate the features of each session separately. I'm not sure whether this works with your data / workflow, though.

Or temporarily write the features out and then read them back in sessions, one per group-by attribute value. Depending on the volume and the storage solution, that may be too much I/O, though.
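As a rough plain-Python sketch of the write-then-read-per-group idea (not an FME workflow; the JSON-lines files, the function names and the `region` / `kind` attributes are all assumptions made for illustration): features are first split into one temporary file per value of one attribute, and each file is then aggregated on its own, so only one partition's groups are in memory at a time.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path

def partition_to_disk(features, attr, directory):
    """Write each feature to a JSON-lines file named after its attr value.

    Assumes the values are safe to use as file names and that there are
    few enough distinct values to keep one open file per value.
    """
    handles = {}
    try:
        for feature in features:
            value = str(feature[attr])
            if value not in handles:
                path = Path(directory) / f"{value}.jsonl"
                handles[value] = open(path, "w", encoding="utf-8")
            handles[value].write(json.dumps(feature) + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)

def aggregate_partition(path, other_attrs):
    """Count features per combination of other_attrs within one partition."""
    counts = defaultdict(int)
    with open(path, encoding="utf-8") as lines:
        for line in lines:
            feature = json.loads(line)
            counts[tuple(feature[attr] for attr in other_attrs)] += 1
    return dict(counts)

# Hypothetical features.
features = [
    {"region": "A", "kind": "x"},
    {"region": "B", "kind": "x"},
    {"region": "A", "kind": "y"},
    {"region": "A", "kind": "x"},
]

with tempfile.TemporaryDirectory() as tmp:
    values = partition_to_disk(features, "region", tmp)
    summary = {
        value: aggregate_partition(Path(tmp) / f"{value}.jsonl", ("kind",))
        for value in values
    }
```

Every feature is written once and read once, which is the extra I/O cost mentioned above.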


ebygomm
Influencer
  • Author
  • November 6, 2018
takashi wrote:

I think the only way is to sort the features by all the Group By attributes beforehand and set Yes to the Input is Ordered by Group parameter in the Aggregator.

Does that not just move the blocking problem to the sorter?


takashi
Contributor
  • November 6, 2018
ebygomm wrote:

Does that not just move the blocking problem to the sorter?

I don't think you can avoid the blocking problem. However, performance could be better if you insert a Sorter before the Aggregator, so I think it's worth trying.


fmelizard
Contributor
  • November 6, 2018

I see the issue and it is a tricky one. Once one of the attributes changes, you'd like to flush out everything in the aggregator because you know that one attribute was ordered. The only trick I could think of was to make use of custom transformers. If you make a custom xformer of just an aggregator, and then turn on its "parallel group by" to have just the one attribute you know is ordered, and indicate that it is ordered, then you *should* get what you want. I didn't try but see below and attached. Let us know if this works.

 

flushedaggy.fmw
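The flushing idea described in this post can be sketched in plain Python. This only illustrates the logic and is not how FME or the attached workspace implements it; the function names and the `day` / `kind` attributes are hypothetical. Features are assumed to arrive ordered by one attribute only. Sub-groups for the current value of that attribute are collected in a dictionary and released as soon as the value changes.

```python
_NOTHING = object()  # sentinel meaning "no feature seen yet"

def _flush(current, buckets, ordered_attr, other_attrs):
    """Emit one aggregate per sub-group collected for the current value."""
    for sub_key, members in buckets.items():
        yield {
            ordered_attr: current,
            **dict(zip(other_attrs, sub_key)),
            "count": len(members),
        }

def aggregate_partially_ordered(features, ordered_attr, other_attrs):
    """Group by ordered_attr plus other_attrs.

    Assumes the input is ordered by ordered_attr only; the other
    attributes may arrive in any order within each run.
    """
    current = _NOTHING
    buckets = {}
    for feature in features:
        value = feature[ordered_attr]
        if value != current:
            # The ordered attribute changed, so every bucket is complete.
            yield from _flush(current, buckets, ordered_attr, other_attrs)
            current, buckets = value, {}
        sub_key = tuple(feature[attr] for attr in other_attrs)
        buckets.setdefault(sub_key, []).append(feature)
    # Release whatever is left once the input ends.
    yield from _flush(current, buckets, ordered_attr, other_attrs)

# Hypothetical stream: ordered by "day", unordered by "kind".
stream = [
    {"day": 1, "kind": "x"},
    {"day": 1, "kind": "y"},
    {"day": 1, "kind": "x"},
    {"day": 2, "kind": "y"},
    {"day": 2, "kind": "y"},
]
out = list(aggregate_partially_ordered(stream, "day", ("kind",)))
```

At any moment only the features sharing the current `day` value are held, which is the behaviour the question asks for.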


ebygomm
Influencer
  • Author
  • November 6, 2018
fmelizard wrote:

I see the issue and it is a tricky one. Once one of the attributes changes, you'd like to flush out everything in the aggregator because you know that one attribute was ordered. The only trick I could think of was to make use of custom transformers. If you make a custom xformer of just an aggregator, and then turn on its "parallel group by" to have just the one attribute you know is ordered, and indicate that it is ordered, then you *should* get what you want. I didn't try but see below and attached. Let us know if this works.

 

flushedaggy.fmw

I'll take a look at that option tomorrow. I'm trying to remove the requirement to use a WorkspaceRunner, and the only issue now is the Aggregator.


ebygomm
Influencer
  • Author
  • November 9, 2018
ebygomm wrote:

I'll take a look at that option tomorrow. I'm trying to remove the requirement to use a WorkspaceRunner, and the only issue now is the Aggregator.

Putting the Aggregator inside the custom transformer actually adds 30 seconds to the run time for half a million features. The only difference I saw in the output was that the features came out of the Aggregator ordered by the parallel group-by attribute.


fmelizard
Contributor
  • November 10, 2018
ebygomm wrote:

Putting the Aggregator inside the custom transformer actually adds 30 seconds to the run time for half a million features. The only difference I saw in the output was that the features came out of the Aggregator ordered by the parallel group-by attribute.

Hmm, that is really too bad. We're working on this part of FME right now and hope we can make it better in future. But for now it looks like this ended up being a dead end.
