Hi everyone,

 

I have a simple workbench with a reader that reads 20+ feature classes from a nationwide dataset in a file geodatabase.

From there the workbench selects 11 feature classes and uses a Clipper transformer (the clip polygon is a province) to clip out a certain area and save it to a new file geodatabase.

 

The problem is that the dataset has millions of features, and the machine I run it on runs out of memory after 5+ hours of processing.

 

Is there a way to split up the workbench so it reads/transforms/writes 1 to 3 feature classes at a time and waits until the memory is free again before starting the next 1-3 feature classes? (Maybe by embedding multiple workbenches?)

 

Thanks in advance!

 

Before looking into splitting the workspace up, have a look at the setup of the Clipper. Is the clip polygon getting read in before the 20+ feature classes, and is the Clipper set to Clippers First? If it's set up correctly, with no blocking transformers, millions of features shouldn't cause memory problems. Also, don't run it with feature caching turned on.



Thanks for the reply.

 

I did turn off feature caching.

 

The clip polygon is read from an Esri Shape reader at the start of the workbench (and contains 2 polygons).

Setup is basically:

 

Clip polygon reader --------+
                            |
GDB feature1 --------- Clipper1 -------- GDBfeature1 (writer)
GDB feature2 --------- Clipper2 -------- GDBfeature2 (writer)
GDB feature3 --------- Clipper3 -------- GDBfeature3 (writer)

....

 

 

 


Run batchwise.

Create a batch attribute, or find an existing attribute you can use to split the data, maybe state, zip code or city.

Or:

Sort the geodatabase and use rownum (or an equivalent query for your geodatabase; rownum is Oracle syntax, PostgreSQL/PostGIS uses row_number()),

and run it from the command line with a WHERE clause (WHERE rownum > 0 AND rownum < 1000000, for Oracle).

Something like this. Hope this helps.
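
To make the batch idea concrete, here is a minimal sketch in Python that runs the workspace once per batch from the command line, passing the row range in as a published parameter. The FME path, workspace name, parameter name (BATCH_WHERE) and row counts are all assumptions; substitute whatever your own workspace actually exposes.

```python
import subprocess

# Assumed locations and parameter name -- adjust to your own setup.
FME = r"C:\Program Files\FME\fme.exe"
WORKSPACE = r"C:\fme\clip_province.fmw"
BATCH_SIZE = 1_000_000
TOTAL_ROWS = 5_000_000          # rough size of the largest feature class

for start in range(0, TOTAL_ROWS, BATCH_SIZE):
    # Oracle-style row range, as suggested above.
    where = f"rownum > {start} and rownum < {start + BATCH_SIZE + 1}"
    # Each batch is a separate fme.exe process, so its memory is fully
    # released before the next batch starts.
    subprocess.run([FME, WORKSPACE, "--BATCH_WHERE", where], check=True)
```

Because every batch runs as its own process, memory is handed back to the operating system between batches, which is effectively the "wait till the memory is free again" behaviour asked about in the original post.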

 


I'm assuming that you're reading in all the data each time it runs (even though you're clipping data for one region)?

 

To limit the amount of data being read in, make use of FeatureReaders and their built-in spatial filter.

 

First, read in the region you want to use as the clipper, then use this feature to trigger FeatureReader(s) with the spatial filter set to "Intersects". This will then only read in the data that intersects the area you're clipping, which will greatly reduce the volume of data you're working with and speed things up.
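
Outside of FME, the same "filter while reading" idea can be sketched in a few lines of Python with GDAL/OGR, just to show why it beats clipping after everything has been loaded. The file paths and layer name below are made up for illustration only.

```python
from osgeo import ogr

# Illustrative paths only -- not from the original post.
clip_layer = ogr.Open(r"C:\data\province.shp").GetLayer(0)
clip_geom = clip_layer.GetNextFeature().GetGeometryRef().Clone()

gdb = ogr.Open(r"C:\data\nationwide.gdb")       # OpenFileGDB driver
layer = gdb.GetLayerByName("feature1")

# The spatial filter is applied at read time, so features that do not
# intersect the province are never loaded into memory at all.
layer.SetSpatialFilter(clip_geom)
print(layer.GetFeatureCount(), "features intersect the clip area")
```

The FeatureReader's spatial filter gives you the equivalent behaviour inside FME: the clip polygon arrives first and only intersecting features are read.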

