Question

Cancel Subjob automatically if Mainjob gets canceled

  • 4 October 2018
  • 4 replies
  • 2 views

I have set up a directory watch with FME Server (something like https://knowledge.safe.com/articles/51817/directory-watch-publisher-tutorial-2017.html).

Some data cannot be processed by FME Server, so I set a cancellation timeout of 30 minutes.

After this time the main workspace (convert.fmw) gets canceled, but the subworkspace, which is normally called by convert.fmw through an FMEServerJobSubmitter, does not get canceled, so these jobs fill up the job queue as "zombie" jobs.

Is there a way I can get rid of these zombie-jobs in the queue?


4 replies


Hi @rdbath_

In the advanced section at the bottom of the FMEServerJobSubmitter you can add some directives for the child job. This is available in FME 2018.1 and 2017.1.

Hope this helps.

Itay


Yes, as @itay notes, the transformer should have the same parameter to expire after a set time.

I guess the problem there is that it's a completely separate action from what happens to the main workspace. Because the child is submitted as a separate job, it's hard to tie the two together so that if one fails the other does too.

So the only way I can see to fully handle this scenario is to use the REST API (say, as a call in the main workspace) to find the child job ID and record that information somewhere. Then periodically run a script (another workspace, perhaps) to check for orphaned jobs, using the REST API again to delete them.
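As a rough illustration of that cleanup idea, here is a minimal Python sketch. It assumes you have already recorded which child job IDs belong to which parent job ID, and that you can obtain the set of job IDs that are currently queued or running. The host name, the token, the helper names, and the endpoint path are all assumptions on my part, so check them against the REST API documentation for your FME Server version before relying on them.

```python
import urllib.request

# Assumptions: placeholder host and token, not real values.
BASE_URL = "https://fmeserver.example.com/fmerest/v3"
TOKEN = "REPLACE_ME"


def find_orphans(recorded, active_ids):
    """Return child job IDs that are still active although their parent is not.

    recorded   -- dict mapping parent job ID -> list of child job IDs
                  (the information the main workspace wrote down)
    active_ids -- set of job IDs currently queued or running on the server
    """
    orphans = []
    for parent_id, child_ids in recorded.items():
        if parent_id in active_ids:
            continue  # parent is still alive, leave its children alone
        orphans.extend(c for c in child_ids if c in active_ids)
    return sorted(orphans)


def build_delete_request(job_id, state="queued"):
    """Build (but do not send) a DELETE request for one job.

    The path and the token header format are assumed here; verify both
    for your server version.
    """
    url = f"{BASE_URL}/transformations/jobs/{state}/{job_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"fmetoken token={TOKEN}"},
    )
```

A scheduled script could call find_orphans with the recorded IDs and then send each request with urllib.request.urlopen(build_delete_request(job_id)). Keeping the orphan check separate from the HTTP call means the logic can be tried out without touching the server.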

Or perhaps a Server expert can tell us of a better way?

Hi @rdbath_,

As of 2017.1 all child jobs should be cancelled if the parent is cancelled through a condition, such as a timeout on a schedule or submitted job, or if manually cancelled. Would you be able to share information on how you are cancelling the master job and your build number?

 

 

@RichardAtSafe: I am using FME Server 2017.0.1 - Build 17288 - linux-x64. The main job has a timeout of 30 minutes. After this time the main job gets canceled, but the subjob stays in the queue.

 
