Solved

Rerun canceled job automatically

  • 20 June 2024
  • 5 replies
  • 49 views

In FME Flow, you can set up automation so that a workspace restarts if the job fails. However, if a job is canceled (for instance, it exceeds the max. queue time), this is not possible. Or have I missed something?

5 replies


Hi @stezi ,

What Engine Management rule do you have set up to cancel running jobs?

You are correct, the job retry in Automations is for failed jobs; intentionally cancelled jobs will not be resubmitted.

Let me know your thoughts!

Kezia


Hey Kezia. Where can I configure this? I cannot find anything in the Engine Management section or in the documentation.


Hi @stezi ,

There isn’t an Engine Management rule to cancel running jobs. However, the Advanced settings on the Run Workspace page can cancel jobs based on Queued Job Expiry Time and Running Job Expiry Time. You can also build your own workflow with FME Flow REST API calls.

An idea to resubmit a cancelled job could be to set up an Automation with the FME Flow Systems Event trigger and use the Filter Action to filter for messages that contain the word “Cancelled”. A cancelled job will log like this:

7662	Error Message Logged
Event Description: Triggered whenever an error message is logged to fmeserver.log.
Event Title: Error Message Logged
Message: Job 46: Cancelled running job
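As a rough illustration of the filtering step, here is a minimal Python sketch that checks whether a logged message reports a cancelled job and pulls out the job ID. The pattern is based only on the sample message above, and the function name `cancelled_job_id` is hypothetical; other FME Flow versions may word the message differently.

```python
import re

# Assumption: the message follows the sample above, "Job <id>: Cancelled ...".
CANCELLED_JOB = re.compile(r"Job (\d+): Cancelled", re.IGNORECASE)


def cancelled_job_id(message):
    """Return the job ID if the message reports a cancelled job, else None."""
    match = CANCELLED_JOB.search(message)
    return int(match.group(1)) if match else None
```

The returned ID could then be handed to the workspace that resubmits the job.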

You can then author a workspace that resubmits the job with a REST API call.


Please let me know if I’ve misunderstood your question.

Kezia


Your second idea seems to be what I’ve been looking for, although I do not fully understand the part “You can then author a workspace to resubmit the job using the REST API call.”

I think it would be nice to have a checkbox “Retry on canceled” in the “run workspace” element in an automation that would manage everything for the user :)

 


Hi @stezi ,

There are a couple of FME Flow REST API endpoints that you can use to submit/execute the job.

Here is an example workspace for gathering information on a failed job. The workspace that you author can be tailored to submit the job using the appropriate endpoint.
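To make the resubmit step more concrete, here is a minimal Python sketch of what such a call might look like outside of a workspace. It assumes the V3 REST API path `/fmerest/v3/transformations/submit/{repository}/{workspace}` and the `fmetoken token=...` authorization header; please check both against the REST API documentation for your FME Flow version, since newer releases also offer a V4 API. The host, token, repository, and workspace names are placeholders, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_resubmit_request(base_url, token, repository, workspace,
                           published_parameters=None):
    """Build (but do not send) a POST request that submits a job.

    Assumption: the endpoint path and header format follow the V3 REST API.
    """
    url = "{}/fmerest/v3/transformations/submit/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(repository, safe=""),
        urllib.parse.quote(workspace, safe=""),
    )
    body = json.dumps({"publishedParameters": published_parameters or []})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "fmetoken token={}".format(token),
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


def resubmit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything is submitted. Inside FME, the same call could be made with an HTTPCaller in the workspace that the Automation runs.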

 

If a job has a max queue time set and that time expires, the job is assumed not to be submitted at all. It has neither failed nor been cancelled; it was simply sitting in the queue. Executing it then requires manual intervention. Having Retry on Cancel could be a good idea... I’d recommend submitting an idea here and sharing your use case.
