Question

How to process dataset in batches?

  • April 22, 2016
  • 2 replies
  • 110 views

Hi,

I have a workspace that reads from a large spatial dataset, then runs a dissolve and aggregate before writing to a PostGIS database. This process can take a very long time, and I would like to run it in batches, e.g. take the first 1000 unique objects and run the process on those, then the next 1000, and so on. Any ideas, please?

I have looked at some of the batch processing documentation, but I'm not sure it would help here.

thanks


2 replies

patrick_koning
Contributor

I think a WorkspaceRunner would be helpful.

Get only the IDs of the objects with a SQLCreator, split them up into portions, and feed them to the workspace you created.
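A minimal sketch of that idea, driven from outside FME with a small script rather than a WorkspaceRunner transformer (the batching logic is the same either way). The workspace name, the WHERE_CLAUSE published parameter, the table and column names, and the connection string are all placeholder assumptions, not something from the original post:

```python
import subprocess
import psycopg2

BATCH_SIZE = 1000
WORKSPACE = "batch_dissolve.fmw"  # hypothetical child workspace

# 1. Get only the IDs, as suggested above (the SQLCreator step).
conn = psycopg2.connect("dbname=gis user=fme")
with conn, conn.cursor() as cur:
    cur.execute("SELECT id FROM source_table ORDER BY id")
    ids = [row[0] for row in cur.fetchall()]

# 2. Split them into portions of 1000 unique objects.
batches = [ids[i:i + BATCH_SIZE] for i in range(0, len(ids), BATCH_SIZE)]

# 3. Feed each portion to the workspace (the WorkspaceRunner step), assuming
#    the workspace filters its reader with a WHERE_CLAUSE published parameter.
for batch in batches:
    where = "id IN ({})".format(",".join(str(i) for i in batch))
    subprocess.run(["fme", WORKSPACE, "--WHERE_CLAUSE", where], check=True)
```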


mark2atsafe
Safer
  • April 22, 2016

The Dissolver transformer has a parallel processing mode, which would be just as good as a WorkspaceRunner.

But in either case, the problem you'll have is: what happens when two features should be dissolved together but fall into two separate batches? You'd have to run everything through a second time to make sure those polygons get dissolved.

To be honest, the better route might be to load all the data into PostGIS first and use the ST_Union function to dissolve it there. The performance might well be better.
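For reference, a minimal sketch of that route, assuming the raw features have already been loaded into a source_table with a geom column and a region attribute to group by (all of those names, and the connection string, are placeholders):

```python
import psycopg2

conn = psycopg2.connect("dbname=gis user=fme")
with conn, conn.cursor() as cur:
    # Dissolve inside the database: one ST_Union per group, in a single pass,
    # so nothing can end up split across two batches.
    cur.execute("""
        INSERT INTO dissolved_table (region, geom)
        SELECT region, ST_Union(geom)
        FROM source_table
        GROUP BY region
    """)
```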