Solved

Optimizing Point-in-Polygon overlay operations

  • 24 November 2015
  • 8 replies

I am trying to optimize a large collection of "Point in Polygon" overlay operations. The workflow is this:

1. The user selects a point on the screen (this will be published to an FME Server solution once successfully prototyped).

2. Workbench accepts the point, "drills down" through all of the attached data sources, and reports the findings in a PDF report.

My question is this: some of these data sets have a very large number of features. The assessor parcel layer, for example, has 350,000 polygons to process. Power poles (for proximity searching) number around 130,000 records. Is there a way to "pre-process" these data sets so that I am not searching through 350,000 parcels when looking through a few city blocks' worth would go much faster? More simply put, is there a way to restrict the overlay operation to a smaller geographic subset, such as constructing a special geometry (a polygon?) that the PointOnAreaOverlayer would use to search with, instead of sequentially looking at all 350,000 records?
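
To make the idea concrete, here is a rough Python sketch (outside of FME, using the shapely library) of the kind of restriction I have in mind; the geometries, coordinates and search distance are only placeholders:

```python
# Not FME, just an illustration of the concept: restrict the point-in-polygon
# test to a small window around the pick point instead of walking all
# 350,000 parcels. Geometries and distances below are placeholders.
from shapely.geometry import Point, Polygon, box

parcels = [Polygon([(0, 0), (100, 0), (100, 100), (0, 100)])]  # stand-in for the parcel layer
pick_point = Point(50, 50)                                     # the user's selected point
search_distance = 350.0                                        # feet

x, y = pick_point.x, pick_point.y
window = box(x - search_distance, y - search_distance,
             x + search_distance, y + search_distance)

# Only parcels that overlap the small window are given the exact test.
candidates = [p for p in parcels if window.intersects(p)]
hits = [p for p in candidates if p.contains(pick_point)]
print(len(hits))
```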

Regards, Lynn

OS: Windows 7, 64-bit

FME Workbench: 2015.0 (Build 15253 - WIN64)


Best answer by mark2atsafe 11 December 2015, 16:55


8 replies

Firstly, I think you are using the right transformer; a SpatialFilter or a Clipper could also be used, but in my previous testing (pre-2015) the overlay tools were quicker. However, I think your problem here is one of workbench design first principles: minimise the amount of data you read into your pipeline. I'd advise storing your reference data in a database that FME can perform spatial selects against: ArcSDE, Oracle Spatial, PostGIS and file geodatabases can all be used, I think, and there are probably lots of others. Then I'd look at using the FeatureReader transformer with a spatial select. This way you don't need to run expensive read jobs to import all of the data every time.

Completely agree. The original question doesn't specify what system the polygons are stored in, but if that system has a spatial index, the FeatureReader transformer can perform the overlay operation very efficiently by "outsourcing" the spatial processing to the underlying database. @sclkimly -- let us know if this is not clear, and also where your original data is being held, and we can provide more insight. But it should be FeatureReader FTW!
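
To make that concrete, here is a rough sketch (not what FME generates, only an illustration) of the kind of query the database ends up running when a spatial filter is pushed down to, say, PostGIS. The connection details, table name (parcels), column names (apn, geom) and SRID are all assumptions:

```python
# A sketch of the database-side point-in-polygon query (PostGIS via psycopg2).
# Connection details, table/column names, and the SRID are assumptions.
import psycopg2

x, y = 6230000.0, 2121000.0   # pick point coordinates (placeholders)
srid = 2226                   # assumed state-plane SRID of the parcel layer

conn = psycopg2.connect("dbname=gis user=fme password=secret host=localhost")
cur = conn.cursor()

# ST_Contains is index-assisted in PostGIS: the spatial (GiST) index discards
# nearly all of the 350,000 parcels, and only the few remaining candidates get
# the exact point-in-polygon test.
cur.execute(
    """
    SELECT apn
      FROM parcels
     WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s, %s), %s))
    """,
    (x, y, srid),
)
print(cur.fetchall())
cur.close()
conn.close()
```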

If both datasets (the point and the polygons) are in the same database, even a SQLCreator will do the job. That way (as with the FeatureReader) the database does the overlay calculation.
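
As a sketch only, with made-up table and column names, the statement pasted into the SQLCreator could look something like this:

```python
# Roughly the kind of statement one might paste into a SQLCreator when the
# pick point and the parcels live in the same PostGIS database. The table and
# column names (pick_points, parcels, geom, apn) are assumptions for illustration.
SQLCREATOR_STATEMENT = """
SELECT pt.id, pcl.apn
  FROM pick_points AS pt
  JOIN parcels     AS pcl
    ON ST_Contains(pcl.geom, pt.geom)   -- the database does the point-in-polygon work
 WHERE pt.id = 1                        -- the single pick point of interest
"""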


I gave it a test (1 point against 20,000 building footprints) and found the FeatureReader with a "Within" spatial filter was twice as quick as using the PointOnAreaOverlayer, and used about 20% less memory.

And that's using an AutoCAD DWG as the building footprint source. If it were a spatial database, it would be even quicker.

Another thing I tested was using a Clipper (buffer the point into a very small polygon and clip with that first). It too is quicker than the PointOnAreaOverlayer, but not as good as the FeatureReader.

These were only small tests, but I think they indicate the way to go.
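
For anyone curious about the mechanism behind those numbers, here is a rough illustration in plain Python (shapely 2.x, placeholder geometries; it says nothing about FME internals): a spatial index is built over the polygons once, and each pick point then only touches the handful of candidates the index returns instead of the whole layer.

```python
# Illustration only (shapely 2.x), not FME internals: an STRtree spatial index
# returns bounding-box candidates, so the exact point-in-polygon test runs on
# a handful of features instead of the whole layer.
from shapely.geometry import Point, Polygon
from shapely.strtree import STRtree

footprints = [Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
              Polygon([(20, 20), (30, 20), (30, 30), (20, 30)])]  # placeholder features
tree = STRtree(footprints)        # built once, reused for every pick point

pick = Point(5, 5)
candidate_ids = tree.query(pick)                  # bounding-box candidates (indices)
hits = [footprints[i] for i in candidate_ids
        if footprints[i].contains(pick)]          # exact test on the few candidates
print(len(hits))
```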


Thank you, all, for the ideas to try. For speed's sake, I am using a local file geodatabase, as much of the queried information is fairly static. My goal is to cut down the amount of time it takes to research a proposed work area (e.g. is permitting needed? from what agency? what is the proximity to sensitive areas? etc.). I like Mark's answer about using a "micro clipper" to pre-process each of the queried feature classes. That said, I am still seeing Workbench "process" the entire set of records for a given feature reader. I am only testing the parcels at this point because there are 300,000 in the service territory. Hmm.

Again, thank you all for the input. I will keep reading any further posts placed here, as I still have much to learn about the most efficient way to approach this task.



I have a sample pick point (a point object in a file geodatabase) that I am using to jump-start the whole process. I would like to apply a spatial search envelope to my 300k features: add +/- 350 feet to the point's XY coordinates, then pass those values to the FeatureReader's search envelope. How do I make that association? Do I create a custom (user) variable, then reference that in the FeatureReader? I know what I want to do, it's just a challenge to put the Lego blocks together in the proper sequence...


Use a Bufferer to create a 350-foot-radius circle and pass that to the FeatureReader. If you need a rectangle, use a BoundingBoxReplacer after the Bufferer.
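
If it helps to see the arithmetic spelled out, the envelope is just the point's coordinates padded by the search distance. A plain-Python sketch (not an FME transformer; the coordinate values and key names are placeholders):

```python
# Plain-Python sketch of the search envelope: the pick point's coordinates
# padded by 350 feet in each direction. Coordinate values are placeholders in
# the same feet-based coordinate system as the parcel layer.
SEARCH_DISTANCE = 350.0                 # feet

x, y = 6230000.0, 2121000.0             # pick point (placeholder values)

envelope = {
    "min_x": x - SEARCH_DISTANCE,
    "min_y": y - SEARCH_DISTANCE,
    "max_x": x + SEARCH_DISTANCE,
    "max_y": y + SEARCH_DISTANCE,
}
print(envelope)
```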


I was mistakenly using a Reader object and not a FeatureReader transformer. I can now see all of the spatial filters that people have been mentioning. Now I just pass the bounding box (as computed from the pick point) to the Initiator port. VERY, very fast reading now.
