
So I have a big job to process and I’m trying to work out some theoretical numbers for the client (a rough estimate of the processing time range, depending on various factors). I want to leverage the fmeworker limit of 16 parallel threads (assuming a machine with essentially unlimited resources), because the fme.exe limit is 8.

I have a batch job of many tiles with many rasters and features which all need to be processed at once. The easy thing to do is to just throw it at the WorkspaceRunner, but the license only gives me 7-8 parallel threads (which is probably more than enough, to be fair; most likely CPU will not be the bottleneck).

Anyway, for fun I ran a test to see what the limit was on the number of parallel groups you can send through a custom transformer with parallel processing turned on. I pretty quickly got the following error:
 

An error has occurred. Check the logfile above for details
f_6 (TransformFact): Transformer 'ThreadLimitTester': Unable to create worker; try reducing the parallelism level (currently 'EXTREME') or the number of groups (7663 seen so far, 1000 considered borderline, 10000 close to upper limit in best possible scenario)


I thought this was a nice error: “1000” considered borderline. The error occurred whether the transformer was set to process when the group changed or to wait until all features were received.

Has anyone run a job with tens or hundreds of thousands of groups? I assume each group will take 1-2 minutes to process, so the more parallel threads the better here.
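For a rough sense of scale, here is a back-of-envelope estimate using the numbers above (100,000 groups at ~1.5 minutes each, the midpoint of the 1-2 minute guess, with the 8- and 16-worker limits mentioned earlier). This is an ideal-case sketch that assumes perfect load balancing and no per-worker startup cost:

```python
def estimated_hours(n_groups: int, minutes_per_group: float, workers: int) -> float:
    """Ideal-case wall-clock estimate: total work divided evenly across workers."""
    return n_groups * minutes_per_group / workers / 60.0

if __name__ == "__main__":
    for workers in (8, 16):
        h = estimated_hours(100_000, 1.5, workers)
        print(f"{workers} workers: ~{h:,.0f} hours ({h / 24:.1f} days)")
```

Even in the best case that is on the order of a week of wall-clock time at 16 workers, which is why the thread count matters so much here.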

I can definitely design it in a way that uses a mixture of the WorkspaceRunner plus a custom transformer, but I don’t want to make it more complicated than it needs to be.

Unfortunately it seems like FME Server/FME Flow is not an option.

Interesting!

How have you got the custom transformer configured in terms of blocking? Are you waiting for all features to be received before running, or have you got it set to ‘When Groups Change’?



I tried it with both.

It could be that my synthetic setup was the issue, not sure. I just fed a Creator with 100,000 features into a transformer set to wait 5,000 seconds per feature, so it would never get through.

It seems there is never any waiting at all; it just tries to submit, submit, submit!

Most likely, if there were more upstream processing or some rate limiting before the features hit the custom transformer, the issue might go away. If I used a Decelerator to slow the arrival of features to roughly match the processing time, perhaps it would work.

As it is, I think I will try to use the WorkspaceRunner just to be on the safe side. I think I would still use the groups, but batch them so the total number of groups per workspace never exceeds 200 or so.
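The batching step described above is straightforward to sketch: split the full list of group IDs into chunks, with each chunk going to one child workspace launched via the WorkspaceRunner. The 200-per-workspace cap and the helper name are just illustrations of the approach:

```python
def chunk_groups(group_ids, max_per_workspace: int = 200):
    """Split group IDs into batches, one batch per child workspace,
    keeping the per-process group count well under the observed limit."""
    return [group_ids[i:i + max_per_workspace]
            for i in range(0, len(group_ids), max_per_workspace)]
```

With 100,000 groups and a cap of 200, that yields 500 child-workspace runs, each safely below the point where the “Unable to create worker” error appeared.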


I often run WorkspaceRunners above 1,000 and it’s completely fine! Sometimes I need a Decelerator at first if feature caching is set on the feature readers of the child workbench, to avoid clashing. I usually go for “Minimal” (when using the Data Interoperability extension, as it’s limited to 4 parallel workbenches). The WorkspaceRunner is set to “Wait for Job to Complete”, so I wonder if the limit has more to do with that setting being set to “No”.



It’s fine for child workspaces, for sure; the batching works differently and doesn’t require a port for inter-process communication (at least as I understand it). Communication between processes goes over ephemeral ports, and there are only about 15,000 or so of those, I think.
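The “about 15,000” figure is in the right ballpark. The IANA default dynamic/ephemeral range is 49152-65535, i.e. 16,384 ports; many Linux systems instead default to 32768-60999 (28,232 ports). Either way the pool is finite, which would cap how many port-based worker connections can exist at once:

```python
# IANA default dynamic/ephemeral port range (RFC 6335): 49152-65535.
IANA_RANGE = (49152, 65535)
ports_available = IANA_RANGE[1] - IANA_RANGE[0] + 1
print(ports_available)  # 16384
```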

But I do seem to remember running more like 20,000+ groups via custom transformer parallel processing in the past, though I might be wrong there.

