
I have a table of 1600 polygons (feature set A), most of which overlap one another; they represent distance discs around a point of origin. I want to relate those polygons to another feature set (B) and calculate, for each input polygon, the percentage of its area that intersects B.

I can’t use the AreaOnAreaOverlayer as is, since its documentation explicitly states that it doesn’t expect the input features to self-intersect, and that is not true for A.

When I filter for a single polygon, the AreaOnAreaOverlayer generally gets me there, but I want the output for all polygons.

 

The SpatialRelator doesn’t give me the intersection geometry of the features, so I can’t calculate the % area overlap.

 

My preference would be to use geopandas.overlay and iterate over each polygon from A, but my input data is stored in ArcSDE, so converting it to GeoDataFrame objects and then back into FME feature objects to write to SDE again is more hassle than it seems worth.

 

I tried using arcpy.analysis.Intersect, but the documentation indicates I should write to a gdb, and I want to concatenate the output first since I still want to capture the output for each polygon. Additionally, this function doesn’t accept an FME feature object as its first argument.

Any help would be really appreciated!

Maybe you can change your workflow by first performing an AreaOnAreaOverlayer on featureset A. 

This will give you a featureset with polygons that do not overlap.


You could combine @geomancer’s idea with the Matcher (with the geometry check switched on), which will also catch duplicate geometries. That way you clean up your dataset A.

To compute the % overlap between polygons from dataset A and B a nice solution is provided here:
SpatialFilter - strange results? | Community (safe.com)


> Maybe you can change your workflow by first performing an AreaOnAreaOverlayer on featureset A.  

This doesn’t really work because I need to maintain the ID of the source polygon from A. The levels of overlay are deeply nested as you can see in the screenshot. This screenshot is a small area within Feature set A; hundreds of polygons can self intersect. Each polygon represents a driving distance corresponding to a certain point (which is the disc’s ID)

 

When I do the self-overlay on Featureset A, areas are lost.

This is one polygon (ID=207)

 

but after running the AreaOnAreaOverlayer, filtering for 207 produces:

I would expect the exterior boundary to be maintained.


There is not really going to be a clean polygon that doesn’t intersect some other polygon.


When using AreaOnAreaOverlayer, you can keep all the ID's of the original features in a list (enable ‘Generate List’). Later on in your workflow you can still use those ID's.

Are there any aggregate features in your dataset? AreaOnAreaOverlayer can either reject or deaggregate aggregates, and the transformer may currently be set to reject them.


The issue wasn’t the aggregate or deaggregate, but what I actually needed was to prefix the output attributes when I use the list exploder.

One thing that is frustrating about this is that the self overlay produces an unnecessarily large number of features which then makes remaining downstream processing so much slower.

My ~1500 input features, when self-overlaid, create about 350,000 new features, and after list explosion we are at over 11.6M, which takes over 50 GB of memory. Exploding the feature list from feature set B adds another layer to this. I do the explosion because I have to relate attributes from dataset C to the overlaid dataset and make calculations based on the new area.

When I do an overlay of 1 feature from A on B, the max number of new features I expect for that 1 is ~50-60.

If I iterated over A, the max number of features from A overlain with B is somewhere around 70K
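To put the figures above side by side, here is a rough back-of-the-envelope calculation. It only uses the approximate counts quoted in this post; the variable names are made up for illustration.

```python
# Approximate counts quoted in the post above.
input_features = 1_500
after_self_overlay = 350_000
after_list_explode = 11_600_000
per_feature_max = 60        # upper end of the "~50-60" estimate per A feature
iterated_total = 70_000     # the poster's estimate when iterating A over B

# Average number of list entries carried by each self-overlay fragment.
avg_list_length = after_list_explode / after_self_overlay

# How much larger the exploded self-overlay is than the iterated approach.
blowup_vs_iterated = after_list_explode / iterated_total

# Hard ceiling for the iterated approach if every A feature hit the maximum.
upper_bound_iterated = input_features * per_feature_max

print(f"average list entries per fragment: {avg_list_length:.1f}")
print(f"exploded self-overlay vs iterated: {blowup_vs_iterated:.0f}x")
print(f"ceiling for iterated approach: {upper_bound_iterated:,}")
```

So each self-overlay fragment carries about 33 list entries on average, and the exploded result is roughly 166 times larger than the iterated estimate.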

My next step to try is batch processing.


> My next step to try is batch processing.

Having dealt with similar overlay challenges on complex polygon datasets, I’d note that there are some other optimisation techniques you can trial to reduce the processing time and the number of intermediate features.

 

To “clean” dataset A, I think, judging from the screenshot above, what you may be missing in the initial AreaOnAreaOverlayer is Group By A.ID, A.DistanceID. That should resolve the initial self-intersections within each feature and clean them per feature.

The next logical rule we can introduce is that everything within the 1 km driving-distance isochrone must also lie within the 2 km, 3 km, 4 km, etc. isochrones, so there is no need to explode out the 1 km isochrone as part of the 2 km, 3 km, etc. isochrones. Implicitly, the 1 km polygon is a subpolygon of the 2 km and 3 km polygons, and so on. We don’t need to prove this with geographic intersection testing to know it is true.

What we can do is use a descending-order Sorter plus an AreaOnAreaOverlayer to output non-overlapping driving-distance delta-isochrone polygons per driving start point ID.

 

 

The Sorter makes sure that “Use Attributes From One Feature” picks the right feature, that being the last one to enter the transformer (at least in FME 2021).

This simplifies the polygons into delta-isochrones and minimises the attribution, with no lists needed. With the data simplified this way, far fewer lists and features have to be exploded in the final AreaOnAreaOverlayer that intersects the simplified A and B polygon sets.
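As a rough illustration of the delta-isochrone idea, here is a minimal Python sketch of the bookkeeping only, not of the geometry. It assumes each smaller isochrone lies entirely inside the next larger one for the same start point; the record layout, IDs and area values are made up.

```python
from collections import defaultdict

# Hypothetical records: (start_point_id, distance_km, area_km2). The areas are
# made-up numbers; the sketch assumes each smaller isochrone lies entirely
# inside the next larger one for the same start point.
isochrones = [
    (207, 1, 3.0),
    (207, 3, 26.0),
    (207, 2, 11.5),
    (311, 2, 9.0),
    (311, 1, 2.5),
]

def delta_rings(records):
    """Return {start_point_id: [(distance_km, ring_area_km2), ...]}, largest first."""
    by_point = defaultdict(list)
    for point_id, distance, area in records:
        by_point[point_id].append((distance, area))

    rings = {}
    for point_id, items in by_point.items():
        items.sort(reverse=True)  # descending distance, like the Sorter step
        result = []
        for i, (distance, area) in enumerate(items):
            # Subtract the next smaller isochrone; the innermost keeps its full area.
            inner_area = items[i + 1][1] if i + 1 < len(items) else 0.0
            result.append((distance, area - inner_area))
        rings[point_id] = result
    return rings

rings = delta_rings(isochrones)
for point_id, parts in rings.items():
    print(point_id, parts)
```

The ring areas for one start point add back up to the area of its largest isochrone, which is why no lists are needed to keep track of the nesting.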

 

 


@bwn FeatureSet A is already filtered for the largest desired disc. The issue is not that a single feature self intersects / consists of multiple polygons. It’s that each feature in the FeatureSet has extensive intersection with other neighboring features. Group by doesn’t work when overlaying B to A since B doesn’t have the unique ID in A. (This is why I have to overlay).

I have 1500 input features representing 75 mile driving ranges for 1500 points. For EACH input I need to make new calculations based on the area percent of the overlay with featureset B. Each input of A has no relationship with any other feature in A.


> Group by doesn’t work when overlaying B to A since B doesn’t have the unique ID in A. (This is why I have to overlay).

 

Understood. I had to grapple with the same thing when looking at complex overlays between 700,000 overlapping forecast polygons and 600,000 overlapping property polygons. A raw AreaOnAreaOverlayer without grouping produces a huge number of features, and the resulting ListExploder output runs to several million features.

 

However, this can be optimised with Grouping, by first using far less expensive Spatial relationship transformers.

So let’s take a heavily overlapping dataset of 1,600 features in A and 1,600 features in B. Features in A overlap their neighbouring A features, and features in B overlap their neighbouring B features. We don’t care about either of those, because we are only interested in A vs B, so we need a way to stop the AreaOnAreaOverlayer from looking at these overlaps.

Sample screenshot of sample Features
 


If we do this the “naive” way, we end up with a large number of overlaps: A vs A, B vs B and A vs B. The A vs B overlaps are also further fragmented wherever they are broken by the A vs A and B vs B overlaps, so they need further post-processing to dissolve/aggregate.

And the ListExploder output feature set gets very large.

 

To optimise this, first establish a spatial relationship grouping table of A.ID vs B.ID.

For this I’m going to use the SpatialRelator, but I’ve also used the NeighborFinder for finer control of overlap/adjacency testing.

 

This gives the spatial relationship table of A.ID vs B.ID pairs.

 

So now we can add B.ID as an attribute to feature dataset A, and similarly add A.ID as an attribute to feature dataset B. In this sample dataset there are 6,241 “pairs” of A.ID vs B.ID where the features intersect or are otherwise spatially related. These pairs of A and related B polygons are what feed into the AreaOnAreaOverlayer, which then processes 6,241 groups individually and, per group, only looks at the overlay between 1 feature from A and 1 feature from B.

As a result, this substantially cuts down on the amount of Overlay Polygons output.
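The pair-table-then-overlay idea can be sketched outside FME as well. The following is a minimal Python sketch, not the workspace itself: axis-aligned rectangles stand in for the real polygons so that it runs with no GIS library, and all IDs and coordinates are made up.

```python
from itertools import product

# Assumption: axis-aligned rectangles (xmin, ymin, xmax, ymax) stand in for the
# real polygons. IDs and coordinates are hypothetical.
A = {"A1": (0, 0, 4, 4), "A2": (2, 2, 6, 6), "A3": (10, 10, 12, 12)}
B = {"B1": (3, 3, 5, 5), "B2": (0, 0, 1, 1), "B3": (20, 20, 21, 21)}

def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])

def intersection(r, s):
    """Return the overlapping rectangle, or None if r and s share no area."""
    xmin, ymin = max(r[0], s[0]), max(r[1], s[1])
    xmax, ymax = min(r[2], s[2]), min(r[3], s[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)

# Step 1: cheap relationship table of (A.ID, B.ID) pairs that actually overlap.
pairs = [
    (a_id, b_id)
    for a_id, b_id in product(A, B)
    if intersection(A[a_id], B[b_id]) is not None
]

# Step 2: overlay one A feature against one B feature per pair. A1 and A2
# overlap each other, but that overlap is never computed.
overlap_pct = {}
for a_id, b_id in pairs:
    inter = intersection(A[a_id], B[b_id])
    overlap_pct[(a_id, b_id)] = 100.0 * area(inter) / area(A[a_id])

for key, pct in overlap_pct.items():
    print(key, f"{pct:.2f}% of the A feature")
```

Each result row keeps its A.ID, so the source polygon is never lost. The all-pairs test in step 1 is only acceptable for a small sketch; a spatial index would normally do that job.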


It also considerably reduces the output of the ListExploder, and because of the Group By, we can use a Tester to filter out the remnants of A and B that were not an A<->B overlap.

 

This leaves behind just the overlapping polygon of A and B, whose area can be compared back to the parent feature(s).
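The filter-and-compare step might look like this in plain Python. This is a sketch with a made-up fragment structure (a group key, the set of source datasets covering the fragment, and its area), not the actual attributes the transformers emit.

```python
# Hypothetical fragments as they might leave a grouped overlay: each carries
# its group key (A.ID, B.ID), the set of source datasets covering it, and its area.
fragments = [
    {"group": ("A1", "B1"), "sources": {"A"}, "area": 15.0},
    {"group": ("A1", "B1"), "sources": {"B"}, "area": 3.0},
    {"group": ("A1", "B1"), "sources": {"A", "B"}, "area": 1.0},
    {"group": ("A2", "B1"), "sources": {"A"}, "area": 12.0},
    {"group": ("A2", "B1"), "sources": {"A", "B"}, "area": 4.0},
]

# Hypothetical areas of the parent A features, keyed by A.ID.
parent_area = {"A1": 16.0, "A2": 16.0}

# Tester-like step: keep only fragments covered by both A and B.
overlaps = [f for f in fragments if f["sources"] == {"A", "B"}]

# Compare each remaining fragment back to its parent A feature.
pct_of_parent = {
    f["group"]: 100.0 * f["area"] / parent_area[f["group"][0]]
    for f in overlaps
}

for group, pct in pct_of_parent.items():
    print(group, f"{pct:.2f}%")
```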
 

 

