Question

Binning a large area

  • 14 January 2019
  • 1 reply
  • 2 views

I want to create contiguous 30 meter hex grids for binning spatial point data for ongoing summary analysis. Ideally, I'd like these grids to cover the entire earth, but creating and working with that volume of spatial data would be difficult if not impossible. I've thought about creating large rectangular grids as a reference, with the hex grids overlaid on them, subdivided, and stored in groups (files) keyed to the rectangular grids.

When evaluating new point data, a workflow could reference the rectangular grids that intersect the points and open only the corresponding group of hex grids.

Would this approach work with FME? Is there another approach to binning and summarizing point data on a very large scale?


1 reply

There is a HexBinner transformer on the FME Hub (and a HexSampler and HexReplacer).

However... looking at the What3Words website, their technology uses a 3m x 3m square grid. That gives them 56,666,666,666,667 grid squares to cover the entire world. On that basis your 30m grid will be over half a trillion features (each 30m x 30m cell covers 100 of their squares, so roughly 567 billion cells).

Even if you just used point features to represent the centre points of a grid, that's a lot of data. Approximately 7,500GB of coordinates in plain text format, by my calculations.
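As a rough check on those numbers, here is a small Python sketch. The 13 bytes per record is my own assumption (about what a short "x,y" line takes in plain text), not a measured figure, so treat the result as an order of magnitude only.

```python
# Rough size estimate for a global 30 m grid stored as centre points.
W3W_SQUARES = 56_666_666_666_667   # 3 m x 3 m squares, the figure quoted above
AREA_RATIO = (30 // 3) ** 2        # one 30 m cell covers 100 of those squares

# Integer division, rounded to the nearest whole cell.
cells_30m = (W3W_SQUARES + AREA_RATIO // 2) // AREA_RATIO

BYTES_PER_RECORD = 13              # assumption: one short "x,y\n" text record
total_gb = cells_30m * BYTES_PER_RECORD / 1e9

print(f"{cells_30m:,} cells, roughly {total_gb:,.0f} GB as plain text")
```

With that assumed record size the total comes out in the same ballpark as the 7,500GB figure. Longer coordinates or polygon geometry would push it much higher.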

So I don't think you're going to be able to create a grid dataset for the entire earth. Even if you divide it into pieces, that's still the same amount of data, albeit in smaller chunks.

So, if you want to do spatial operations, like point-in-polygon, I think what you'll have to do is generate the grids on the fly. For example, if I know a point feature to be analyzed has an X/Y of 488686.9, 5456639.5 then I divide those by 30 (since you have a 30m grid) and round down to an integer. Then multiply by 30 again.

That gives 488670, 5456610, which is the minimum (lower-left) corner of the 30m cell containing the point. So start creating a grid using those coordinates as the origin (say with a 2DGridCreator transformer, or another method if you want to use hexes).
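The divide, round down, multiply step can be sketched in a few lines of Python. The function name `snap_to_grid` is just something I made up for illustration, and it assumes the grid is aligned to 0,0 of a projected coordinate system in metres.

```python
import math

def snap_to_grid(x, y, cell=30):
    """Return the lower-left corner of the square cell containing (x, y).

    Assumes a projected coordinate system in metres with the grid
    aligned to the coordinate origin (0, 0).
    """
    # Divide by the cell size, round down, then multiply back up.
    return math.floor(x / cell) * cell, math.floor(y / cell) * cell

print(snap_to_grid(488686.9, 5456639.5))  # (488670, 5456610)
```

Using `math.floor` rather than `int()` matters if coordinates can be negative, because `int()` truncates towards zero instead of rounding down.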

Then you have a local grid, which has predictable coordinates, and can be generated very easily and quickly to evaluate data against.
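To show how cheap a local grid is to build, here is a minimal Python sketch that generates square cell extents from an origin. It is only an illustration of the idea, not what the 2DGridCreator does internally, and `local_grid` is a hypothetical helper name.

```python
def local_grid(origin_x, origin_y, cols, rows, cell=30):
    """Yield (min_x, min_y, max_x, max_y) for each square cell, row by row."""
    for j in range(rows):
        for i in range(cols):
            min_x = origin_x + i * cell
            min_y = origin_y + j * cell
            yield (min_x, min_y, min_x + cell, min_y + cell)

# A 3 x 2 block of 30 m cells starting at the example origin.
cells = list(local_grid(488670, 5456610, cols=3, rows=2))
for extent in cells:
    print(extent)
```

Because every cell extent follows directly from the origin and the cell size, nothing needs to be stored between runs.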

Personally I like the hex idea. It's a little more complicated than a rectangular grid, but has obvious advantages.

Of course, the elephant in the room that isn't being mentioned is the coordinate system. The above coordinate is in a UTM zone, and each UTM zone only covers a narrow strip of the globe. I'm not sure what coordinate system you would use in metres that covers the entire world.

Anyway, I hope something here helps. In short, I don't think a grid dataset covering the entire world is feasible. You'd need some sort of algorithm to take point coordinates and generate a matching grid from them.
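For hexes, one such algorithm is to convert each point to an axial hex index (q, r) and round to the nearest hexagon, so the bin ID is computed rather than looked up. This is a minimal Python sketch of that general technique, not how the HexBinner works. It assumes pointy-top hexagons, a projected coordinate system in metres, and that "30 metre" means the centre-to-vertex distance (`size`). The function names are hypothetical.

```python
import math

def point_to_hex(x, y, size=30.0):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y)."""
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _axial_round(q, r)

def _axial_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    s = -q - r                      # third cube coordinate (q + r + s = 0)
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the component with the largest rounding error
    # so that the three values still sum to zero.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr

def hex_centre(q, r, size=30.0):
    """Return the centre (x, y) of the hexagon with axial index (q, r)."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r

q, r = point_to_hex(488686.9, 5456639.5)
print((q, r), hex_centre(q, r))
```

The (q, r) pair can then be used as a group-by key for summarising points, and the hexagon geometry only has to be built for bins that actually contain data.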
