I've been tasked with optimising a workbench that's quite slow.
Basically, I take in a very large DEM raster (2 m resolution) and need to clip it against a vector dataset of 50 m grid squares. Each grid square then pulls the underlying raster cell values (RasterCellCoercer), and statistics are calculated (StatisticsCalculator) using a group-by on the clipping square.
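To make the goal concrete, the per-square statistics amount to something like the NumPy sketch below. It is only an illustration outside FME, and it assumes the 50 m squares line up exactly with the raster cell boundaries, so that each square covers a 25 x 25 block of 2 m cells. The function name `block_stats`, the `nodata` value of -9999 and the in-memory array are hypothetical and not part of the actual workspace.

```python
import numpy as np

def block_stats(dem, block=25, nodata=-9999.0):
    """Per-block count, mean, min and max of a 2D array, ignoring nodata cells.

    Assumes (hypothetically) that grid squares align with cell boundaries,
    so a 50 m square over 2 m cells is a block of 25 x 25 cells.
    """
    rows, cols = dem.shape
    if rows % block or cols % block:
        raise ValueError("raster dimensions must be a multiple of the block size")

    # View the raster as (row_blocks, block, col_blocks, block) so that
    # reducing over axes 1 and 3 yields one value per grid square.
    tiles = dem.reshape(rows // block, block, cols // block, block)
    valid = tiles != nodata

    count = valid.sum(axis=(1, 3))
    total = np.where(valid, tiles, 0.0).sum(axis=(1, 3))
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(count > 0, total / count, np.nan)

    # Mask nodata with +/- infinity so it never wins the min/max.
    lowest = np.where(valid, tiles, np.inf).min(axis=(1, 3))
    highest = np.where(valid, tiles, -np.inf).max(axis=(1, 3))
    lowest = np.where(count > 0, lowest, np.nan)
    highest = np.where(count > 0, highest, np.nan)

    return {"count": count, "mean": mean, "min": lowest, "max": highest}

# Tiny 50 x 50 cell example, i.e. a 2 x 2 arrangement of grid squares.
dem = np.full((50, 50), 2.0)
dem[:25, :25] = 1.0          # top-left square is all 1.0
dem[0, 25] = -9999.0         # one nodata cell in the top-right square
dem[25:, 25:] = -9999.0      # bottom-right square is entirely nodata
stats = block_stats(dem)
```

The point of the sketch is that the statistics are computed directly on the cell array, without creating one feature per cell. A raster of 4.5 billion cells would not fit comfortably in memory as a single array, so a real version would have to work through it tile by tile.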
I've got the clipping part down to about 30 minutes.
The problem is the RasterCellCoercer. That part takes 15 hours! It creates about 300,000,000 points (yes, 300 million!) from the raster. The source raster has about 4.5 billion cells, so the other 4.2 billion are either NoData or were culled completely. I can't think of a way to get the number lower, because that's how many points the features require.
Any suggestions for a faster way to extract the DEM values so that they can be grouped for a StatisticsCalculator pass?
Would somehow coercing the raster into a point cloud be faster? (And if so, how?)
Thanks,
Jonathan