Question

AreaOnAreaOverlayer Bad Performance - Painful Hours to process 1,300,000 records

  • 24 July 2024
  • 2 replies
  • 48 views

Hi All,

I have created a workbench that includes a few AreaOnAreaOverlayer transformers. For some reason, one of them takes hours and hours to union cadastral data (1,200,000 records) with some protected areas, PHI (100,000 records). I have checked the log file, and the AreaOnAreaOverlayer always goes through the following 4 steps:

  1. Performing low-level intersection at phase #1
  2. Performing low-level intersection at phase #2
  3. Searching for topologically significant nodes
  4. Breaking curves at topologically significant nodes

I have attached the log file as .TXT for reference. For some reason, AreaOnAreaOverlayer_3 slows the process down at "Breaking curves at topologically significant nodes". This transformer sits in the middle of the workflow, which means it is being fed with data that comes from other transformers. Here is a screenshot of the log file:

...and here another screenshot of the settings of the problematic transformer:

I have also run it with the Quick Translator, with all the data on my local drive, and it still takes hours and hours to run.

It’s worth mentioning that the AreaOnAreaOverlayer runs within a reasonable time if I run the transformer separately in a blank FME workspace with the same data (OSMM and PHI polygons), but I can’t do that as I need to bring the data in from the previous steps of the workflow.

Any ideas on how to improve the performance of the workflow?

Thanks :) 

2 replies

I see in your screenshot you have feature caching on; that’s probably not helping.

You could also do some high-level pre-filtering to reduce the number of features you’re looking at. One approach would be to:

  1. Create bounding boxes of your protected areas (or set the SpatialFilter up to only use bounding boxes)
  2. Use a SpatialFilter to then select the cadastral areas that intersect them
  3. Run those through the AreaOnAreaOverlayer
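
To make the idea concrete outside of FME, here is a minimal Python sketch of the bounding-box pre-filter described in the steps above. It is not how the SpatialFilter is implemented; the names `bbox`, `bboxes_intersect` and `prefilter` are made up for illustration, and polygons are assumed to be plain lists of (x, y) tuples. The brute-force loop is only meant to show the logic; at 1,200,000 x 100,000 features you would want a spatial index rather than comparing every pair.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(ring: Sequence[Point]) -> BBox:
    """Axis-aligned bounding box of a polygon ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prefilter(cadastral: Sequence[Sequence[Point]],
              protected: Sequence[Sequence[Point]]) -> List[Sequence[Point]]:
    """Keep only cadastral polygons whose box hits at least one protected-area box.

    Hypothetical helper: brute force, for illustration only.
    """
    protected_boxes = [bbox(p) for p in protected]
    kept = []
    for parcel in cadastral:
        parcel_box = bbox(parcel)
        if any(bboxes_intersect(parcel_box, box) for box in protected_boxes):
            kept.append(parcel)
    return kept
```

Only the parcels that survive this cheap test need to go through the expensive overlay. Parcels that are dropped cannot intersect any protected area, since their boxes do not even touch.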

 

There are also other approaches where you could do a similar type of pre-filtering that would enable you to use the Group By functionality, but these depend on the exact topology of your data (i.e. can protected areas overlap, can one cadastral unit be in two protected areas, etc.).


Thanks @hkingsbury. I think one of the issues is that my protected areas dataset is poor and contains thousands of vertices per polygon, which slows the process. I’ll try to generalize (e.g. Douglas) my datasets and then run them through the AreaOnAreaOverlayer following your advice :)
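
For anyone wondering what Douglas generalization does to the vertex count, below is a rough Python sketch of the classic Douglas-Peucker algorithm on an open line of (x, y) tuples. It is a textbook version, not FME's implementation, and `douglas_peucker` and `perpendicular_distance` are just illustrative names. Two caveats: it recurses, so very long lines could hit Python's recursion limit, and simplifying polygons one by one can open gaps or overlaps between neighbours that share a boundary.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        # a and b coincide: fall back to point-to-point distance
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / math.hypot(dx, dy)


def douglas_peucker(points: Sequence[Point], tolerance: float) -> List[Point]:
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior vertex farthest from the chord first -> last
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        # Everything in between is close enough: keep only the end points
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared vertex


line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (4, 3), (5, 0)]
simplified = douglas_peucker(line, 0.1)
```

With a tolerance of 0.1 the two nearly collinear vertices are dropped and the spike at (4, 3) is kept. The tolerance is in the units of the data, so it needs to be chosen with the coordinate system in mind.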
