Question

Why was the aggregate parameter in the Clipper removed (PR#81764) in 2019?


I don't want multi-part features in my data, so now I have to use a Deaggregator after every Clipper.


4 replies

Userlevel 2
Badge +16

Good point.

I would add the request to bring it back as an idea.

I will vote for it.

Userlevel 1
Badge +11

Hi @apadilla,

Thanks for linking the issue (which was resolved for 2019.0 in FMEENGINE-13364 to remove the Create Aggregates option). It looks like the option to Create Aggregates was removed because it could be confusing for some users. Relaying some comments from the issue: originally the option did something like "if my input into the Clipper didn't have an aggregate, don't give me an aggregate in the output." Our team thought it likely that any behaviour a user could want from a similar flag can be obtained more easily at the workspace level. Please feel free to add it as an Idea here to let us know what your preference is on that.

We just upgraded to 2019 and I agree, the aggregate parameter in the Clipper was invaluable to our processes and should not have been removed. Data validation software often requires that there be no Multi-Part features, and using FME was the fastest, most efficient way to Clip data without worrying about whether Multi-Part features were being created. Now we have to run additional processes in order to get the data to meet standards.

I think improvements in documentation could have explained the concept more clearly to users rather than getting rid of the option completely for those of us that do use it.

Userlevel 4
Badge +25

Hi @kmh3bd (and @apadilla)

I'm adding an apology here, because I wrote such a lengthy answer. I tend to do that when I'm unclear what the correct response is, and then edit it as I go along. There's a "too long; didn't read" response at the foot of the answer in case you prefer that. Anyway...

Can I ask what you would like to see this parameter do on the Clipper transformer? The problem isn't what it does to simple, single geometries, but when you consider that there are (or can be) multiple levels of aggregate and sub-aggregate in either input or output.

If you look at the Aggregator and Deaggregator transformers, then there are multiple parameters and I think the feeling was that it's more flexible for our users to use these transformers (either before or after the Clipper) to handle such scenarios, instead of us trying to do it in one parameter in the Clipper.

For example, an attribute called _clipped is created in the output features. One simple way to solve the issue would be to filter features (Tester) where _clipped = yes and apply the Deaggregator to them.

Additionally, you could use an AggregateFilter before the Clipper to identify features that are not aggregates, give them a flag (AttributeCreator), and add that to the post-Clipper Tester filtering. That way you are only deaggregating features that were single to start with (Clipped = yes, and WasNotAggregate = yes).
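The flag-then-filter workflow described above can be sketched outside FME in a few lines of plain Python. This is only an illustration of the logic, not FME code: the Feature class, the function names and the "more than one part means aggregate" rule are assumptions made for the sketch, and the Clipper itself is simulated by hand. Only the _clipped attribute and the WasNotAggregate flag come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    # Hypothetical stand-in for an FME feature: a name, a list of geometry
    # parts (more than one part is treated as an aggregate) and attributes.
    name: str
    parts: list
    attrs: dict = field(default_factory=dict)


def flag_non_aggregates(features):
    """Pre-Clipper step (stand-in for AggregateFilter + AttributeCreator)."""
    for f in features:
        f.attrs["WasNotAggregate"] = "no" if len(f.parts) > 1 else "yes"
    return features


def deaggregate_if_flagged(features):
    """Post-Clipper step (stand-in for Tester + Deaggregator)."""
    out = []
    for f in features:
        clipped = f.attrs.get("_clipped") == "yes"
        was_single = f.attrs.get("WasNotAggregate") == "yes"
        if clipped and was_single:
            # Split into one feature per part, copying the attributes.
            for i, part in enumerate(f.parts):
                out.append(Feature(f"{f.name}_{i}", [part], dict(f.attrs)))
        else:
            # Features that arrived as aggregates are passed through untouched.
            out.append(f)
    return out


road = Feature("road", ["line"])
islands = Feature("islands", ["island_a", "island_b"])
flag_non_aggregates([road, islands])

# Simulated Clipper result: the single road came out as two pieces held in
# one aggregate; the islands were clipped but remain a two-part aggregate.
road.parts = ["piece_a", "piece_b"]
road.attrs["_clipped"] = "yes"
islands.attrs["_clipped"] = "yes"

result = deaggregate_if_flagged([road, islands])
result_names = [f.name for f in result]
print(result_names)
```

The point of the extra flag is that only the road, which was single going in, gets split; the islands stay together because they were an aggregate to start with.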

It just makes it a bit more flexible than for FME to decide what should and shouldn't be aggregated or deaggregated. We don't really want to add all of the Aggregator and Deaggregator parameters into the Clipper, and a single parameter doesn't cover many options.

Just to emphasize how complicated it could get, these are the sort of questions our geometry team will need me to answer if I ask for that parameter back...

  • If a single feature is clipped, and there are multiple pieces inside and out, should that be an aggregate?
    • Should it be a separate aggregate for inside/outside?
    • Should it be a single aggregate, but separate sub-aggregates?
    • Should there be no aggregates at all?

In short, should we ever create an aggregate of clipped output?

  • If an aggregate is clipped, with multiple parts inside/outside, should it be deaggregated?
    • If it should be deaggregated, should all of the aggregate be fully deaggregated (multiple levels)?
    • Or should only the clipped part of the aggregate be deaggregated?
    • For example, an aggregate of all US states is clipped by a polygon in the mid Pacific. Only Hawaii (itself an aggregate of islands) falls inside the clip boundary.
      • Should Hawaii stay an aggregate, but a different feature to the rest of the US?
      • Should Hawaii stay an aggregate, in the same feature as the rest of the US, but a different sub-aggregate?
      • Should Hawaii be deaggregated into separate features (the rest stays as an aggregate)?
      • Should all of the data be deaggregated into single features? (US and Hawaii)

In short, should a clip action destroy an aggregate, and to what extent?
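To make the "to what extent" part concrete, here is a rough sketch of the difference between splitting one level of an aggregate and splitting every level. It uses nested Python lists as a stand-in for nested aggregates; the function names and the sample data are assumptions for illustration only and say nothing about how FME stores geometry.

```python
def deaggregate_one_level(geom):
    """Split only the top-level aggregate; sub-aggregates stay intact."""
    return list(geom) if isinstance(geom, list) else [geom]


def deaggregate_fully(geom):
    """Recursively split every level down to single geometries."""
    if not isinstance(geom, list):
        return [geom]
    flat = []
    for part in geom:
        flat.extend(deaggregate_fully(part))
    return flat


# A cut-down US aggregate: two single polygons plus Hawaii, which is
# itself an aggregate of islands (strings stand in for polygons).
usa = ["Alaska", "Texas", ["Oahu", "Maui", "Kauai"]]

one_level = deaggregate_one_level(usa)
fully = deaggregate_fully(usa)

print(len(one_level))  # 3 pieces, Hawaii is still an aggregate
print(len(fully))      # 5 pieces, every island is its own piece
```

Both results are reasonable answers to "deaggregate this", which is why a single yes/no parameter struggles to cover the cases listed above.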

  • If an aggregate is clipped, and there is only a single piece either inside or outside, should that single piece still be an aggregate?
    • For example, an aggregate of all US states is clipped by a polygon in the north Pacific. Only Alaska (a single polygon) falls inside the clip boundary.
      • Should Alaska be a single polygon feature?
      • Should Alaska be an aggregate feature with one piece?
    • And if the polygon cut through Alaska (or one of the Hawaiian islands in the above example)...
      • Should the outside part of Alaska be a single polygon or a single-piece aggregate?
      • Should the inside part of Alaska:
        • Be a single polygon
        • Be a single-piece aggregate
        • Remain a part of the US aggregate

In short, are single-piece aggregates valid or not?

I think the problem with the existing parameter is that it mixed up its approach, so even if it worked for your scenarios, it didn't work for other users' scenarios. And even if I answered all the above questions, it's still fitting only one particular scenario. That's why we decided to remove the parameter, to avoid all ambiguity and let users apply the Aggregator/Deaggregator to get exactly what they needed.

If we were to put a parameter back into the Clipper then I think we'd want to put a limit on what might be expected to be processed.

For example, the SpatialFilter has a parameter called Support Mode, where you can choose whether to support Aggregates or not. We could perhaps put the same on the Clipper? If so, support aggregates = yes would mean that we do as now (single features split into parts would become an aggregate). If support aggregates = no, then we wouldn't create aggregates, but also incoming aggregates would get shunted to the Rejected port. That's the trade-off.

Would that be acceptable to you?
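As a rough sketch of how that Support Mode trade-off might behave, the snippet below routes features the way the proposal describes. Everything in it is hypothetical: the function name, the dictionary layout and the pre-computed "pieces" (standing in for whatever the clip would produce) are assumptions, and no such parameter exists on the Clipper today.

```python
def clip_with_support_mode(features, support_aggregates):
    """Route features according to the proposed (hypothetical) Support Mode.

    Each feature is a dict with:
      "name"   - an identifier
      "parts"  - incoming geometry parts (more than one = aggregate)
      "pieces" - the pieces a clip would produce (simulated input)
    Returns (output, rejected).
    """
    output, rejected = [], []
    for f in features:
        incoming_is_aggregate = len(f["parts"]) > 1
        if not support_aggregates and incoming_is_aggregate:
            # No-aggregates mode: incoming aggregates go to Rejected.
            rejected.append(f["name"])
        elif support_aggregates:
            # Current behaviour: split pieces are kept together in one feature.
            output.append((f["name"], f["pieces"]))
        else:
            # No-aggregates mode: every piece becomes its own feature.
            output.extend((f["name"], [p]) for p in f["pieces"])
    return output, rejected


features = [
    {"name": "road", "parts": ["line"], "pieces": ["a", "b"]},
    {"name": "islands", "parts": ["i1", "i2"], "pieces": ["i1"]},
]

with_aggregates = clip_with_support_mode(features, support_aggregates=True)
without_aggregates = clip_with_support_mode(features, support_aggregates=False)
print(with_aggregates)
print(without_aggregates)
```

In the no-aggregates case the output is guaranteed to contain only single-part features, at the cost of the islands feature never being clipped at all.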

Too Long; Didn't Read version...

  • There are many, many different scenarios for incoming and outgoing geometry
  • The old parameter supported only one specific scenario
  • Users with that scenario were OK. Everyone else was confused. That's why we removed it.
  • The Aggregator/Deaggregator give you all the options you need to handle your specific scenario
  • If we were to put a parameter back, it would need to be a very restrictive one. Like a no-aggregates-are-supported-at-all mode.

Hopefully reading all this (perhaps a couple of times!) makes it clearer? Writing it certainly helped me to understand. Do let me know if you would still like to get a parameter back. I think I can make a very good case to have a no-aggregates mode. But I don't think I can get the parameter back exactly as it was before, because its behaviour was so unpredictable depending on the incoming data structure.

Regards

Mark