
I'm wondering if anyone can think of a way to control the number of jobs being run on FME Server.

 

I have a process whereby I need to run some complex analysis on FME Server. To do this, I need to split the process into a large number of jobs (the current process has 250,000 jobs). This is the only way to do it without crashing FME due to the number of features that would need to be held in memory.

 

On desktop you can control the number of concurrent processes in the WorkspaceRunner; however, this option doesn't seem to exist for the FMEServerJobSubmitter. I've also tried using the REST API and that has the same problem. If I run the process in sequence, it takes far too long. If I run it in parallel, it tries to load all 250,000 jobs into the FME Server queue and crashes the application.

 

So I'm looking for a way to tell FME Server to queue a certain number of jobs at a time, in batches.

 

Once the first batch is done, it would release the second batch. This would enable me to run in parallel without crashing the application server.
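To make it concrete, this is roughly the behaviour I'm after, sketched in Python against the v3 REST API (the server URL, token, repository, workspace and parameter names below are all placeholders, and the job status values are my assumption):

```python
import time
import requests

# All of the values below are placeholders for illustration.
SERVER = "https://myfmeserver.example.com"
TOKEN = "my-fme-token"
REPO, WORKSPACE = "MyRepository", "analysis.fmw"
HEADERS = {"Authorization": f"fmetoken token={TOKEN}", "Accept": "application/json"}

BATCH_SIZE = 1000  # how many jobs to keep in the queue at once

def submit(params):
    """Submit one job asynchronously and return its job id."""
    url = f"{SERVER}/fmerest/v3/transformations/submit/{REPO}/{WORKSPACE}"
    body = {"publishedParameters": [{"name": k, "value": v} for k, v in params.items()]}
    r = requests.post(url, json=body, headers=HEADERS)
    r.raise_for_status()
    return r.json()["id"]

def wait_for(job_ids, poll_seconds=30):
    """Poll until every job in the batch has left the queue."""
    pending = set(job_ids)
    while pending:
        time.sleep(poll_seconds)
        for jid in list(pending):
            r = requests.get(f"{SERVER}/fmerest/v3/transformations/jobs/id/{jid}", headers=HEADERS)
            r.raise_for_status()
            # Terminal status names assumed; adjust to what your server actually returns.
            if r.json().get("status") not in ("SUBMITTED", "QUEUED", "PULLED"):
                pending.discard(jid)

# 250,000 parameter sets, one per job (made-up parameter name).
all_params = [{"TILE_ID": str(i)} for i in range(250000)]

# Submit one batch, wait for it to finish, then release the next batch.
for start in range(0, len(all_params), BATCH_SIZE):
    job_ids = [submit(p) for p in all_params[start:start + BATCH_SIZE]]
    wait_for(job_ids)
```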

 

Any suggestions welcome. Multiple JobSubmitters isn't the answer, as the jobs still run one at a time.

 

Out of curiosity, which application crashes when loading 250'000 jobs in the server queue? FME Server?


Hi David

 

Yes, it's the FME Application Server Service


Would it be feasible to submit those 250'000 (which is just a humongous number, btw) jobs in blocks and wait a reasonable amount of time between the blocks (e.g. using a Decelerator)? That way you might avoid exploding the server queue.
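Very roughly, something like this, where submit_job() and the parameter list are just stand-ins for however you actually submit (the REST call or an FMEServerJobSubmitter):

```python
import time

def submit_job(params):
    # Stand-in for the real submission (REST call, FMEServerJobSubmitter, etc.).
    print("submitting", params)

all_parameter_sets = [{"TILE_ID": i} for i in range(250000)]  # made-up parameters

BLOCK_SIZE = 5000
WAIT_SECONDS = 1800  # rough guess at how long a block takes to drain; tune to your jobs

# Push one block, pause (the Decelerator equivalent), then push the next block.
for start in range(0, len(all_parameter_sets), BLOCK_SIZE):
    for params in all_parameter_sets[start:start + BLOCK_SIZE]:
        submit_job(params)
    time.sleep(WAIT_SECONDS)
```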


As a temporary hack, could you have an interim workspace that handles sub-batches?

Let's say you can queue 5,000 jobs at a time without issue. The controller workspace calls the interim workspace 50 times (250,000 / 5,000), with Wait for Job to Complete set to Yes. The interim workspace calls the production workspace 5,000 times, also with Wait for Job to Complete set to Yes.



If the parameterization is too difficult to pass through the interim workspace (e.g. unique, non-incremental parameters on the production workspace), the controller workspace can write the parameters to a file (FeatureWriter) or a database, and the interim workspace would read that file, parse it, and pass the parameters to the production workspace.
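As a sketch of that idea (the file name and parameter names here are made up): the controller writes one row per production job tagged with a batch number, and each interim job only picks up the rows for its own batch.

```python
import csv

BATCH_SIZE = 5000

# Controller side: one row per production job, tagged with the batch it belongs to.
parameter_sets = [(str(i), "north") for i in range(250000)]  # made-up (tile_id, region) pairs
with open("job_parameters.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["batch", "tile_id", "region"])
    for i, (tile_id, region) in enumerate(parameter_sets):
        writer.writerow([i // BATCH_SIZE, tile_id, region])

# Interim side: given a batch number, return only that batch's rows, then submit
# one production job per row (the submission itself is omitted here).
def rows_for_batch(batch_no):
    with open("job_parameters.csv", newline="") as f:
        return [row for row in csv.DictReader(f) if int(row["batch"]) == batch_no]
```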



I thought about this and think this is the way I will do it. Just hoping there was a better way.


Hi @david_whiteside

What version of FME Server are you using?

In 2018/2019 you should be able to send a larger number of jobs, and the queue should use the available memory on the system automatically... We do have a customer that regularly sends 30,000 jobs, but that is on 2018. Jobs do vary in size, and this can impact the memory used by the new queue (and the old one).

The only caveat is that if there is some sort of restart on the system, only 50,000 jobs would be re-queued, to safeguard against memory issues on restart. This is configurable if you find that the queue/memory/system can safely take more jobs in the queue.

I know of another way this could be done, but it isn't any prettier than chunking and using a Decelerator to send the jobs in batches. I think my approach would also be more of a hack than the ideas suggested below, meaning you would be going into places we'd not normally recommend - the FME Server System database. Thus, I'll not bother sharing it! 🙂

 

