
Hi!

I have a simple process that queries a SQL Server database, creates JSON, then passes each feature to a Python script.

My issue is that the JSONTemplater's RAM usage grows and grows, storing features even after they have already been passed to Python (up to 64 GB for about 300,000 rows).

How can I split my feature stream into n-sized batches (e.g., take the first 20,000 features, run the JSONTemplater and Python, free the RAM, then take the next 20,000 features and repeat)?
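
To make the goal concrete, here is a minimal plain-Python sketch of the behaviour I'm after (assuming pyodbc and a hypothetical my_table with id/attr columns; this is only an illustration, not my actual workspace):

import json
import pyodbc  # assumed driver; any DB-API 2.0 connector works the same way

BATCH_SIZE = 20_000

def process_batch(features):
    """Stand-in for the JSONTemplater + Python step."""
    payload = json.dumps({"features": features})
    print(f"processed {len(features)} features ({len(payload)} bytes of JSON)")

# Hypothetical connection string, table, and columns.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
)
cursor = conn.cursor()
cursor.execute("SELECT id, attr FROM my_table ORDER BY id")

while True:
    rows = cursor.fetchmany(BATCH_SIZE)  # at most BATCH_SIZE rows held at once
    if not rows:
        break
    process_batch([{"id": r.id, "attr": r.attr} for r in rows])
    # rows and the batch list go out of scope on each iteration,
    # so their memory can be reclaimed before the next fetch

conn.close()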

If there is an ID field you can use in the SQL Server table, you can set the WHERE clause on the table's reader in FME.

Like this: WHERE ID >= $(MIN_VALUE) AND ID < $(MAX_VALUE)

Then run the workspace with "Prompt and Run" and supply the ID range for the WHERE clause.

It does not need to be an ID field (it can be any distinctive field).


Hmm... that solution wouldn't allow FME to loop over the min/max values until the whole dataset has been processed in a single run.

Could you use a SQLCreator to select your min/max values, then a FeatureMerger (unconditional merge on 1 = 1) to add the min/max values to each feature?
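
As a rough sketch of the idea outside FME (assuming pyodbc and the same hypothetical my_table as the earlier sketch; in the workspace itself the SQLCreator would run the min/max query and the FeatureMerger would attach the values to every feature):

import pyodbc  # assumed connector, as in the earlier sketch

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes"
)
cursor = conn.cursor()

# The query a SQLCreator would run: one row with the overall min/max ID.
cursor.execute("SELECT MIN(id) AS min_id, MAX(id) AS max_id FROM my_table")
min_id, max_id = cursor.fetchone()
conn.close()

# Half-open [lo, hi) ID ranges, one per batch of at most 20,000 IDs.
BATCH_SIZE = 20_000
ranges = [(lo, min(lo + BATCH_SIZE, max_id + 1))
          for lo in range(min_id, max_id + 1, BATCH_SIZE)]
print(ranges[:3])  # e.g. [(1, 20001), (20001, 40001), (40001, 60001)]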

Hi,

Maybe the WorkspaceRunner is useful here: it can run the same workspace the required number of times, each time with a different set of source features.
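
As a rough sketch of that idea in plain Python (assuming the fme executable is on the PATH, a hypothetical my_workspace.fmw, and the MIN_VALUE/MAX_VALUE published parameters from the earlier answer):

import subprocess

BATCH_SIZE = 20_000
min_id, max_id = 1, 300_000  # in practice, taken from a min/max query

# One FME run per ID range. Each run is a separate process, so all of its
# memory is released when it exits.
for lo in range(min_id, max_id + 1, BATCH_SIZE):
    subprocess.run(
        ["fme", "my_workspace.fmw",  # hypothetical workspace path
         "--MIN_VALUE", str(lo),
         "--MAX_VALUE", str(lo + BATCH_SIZE)],
        check=True,
    )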

Pratap


Well, in the end, adding an Aggregator made the garbage collection work as intended.

