Solved

FME Flow - remove queued jobs from fmeserver database table fme_jobs?

  • 15 April 2024
  • 3 replies
  • 41 views

Badge +2

Our FME Server was patched and rebooted, which left it in a bad state. The automations and schedules kept triggering jobs but no jobs were running. It wasn’t noticed over the weekend and we now have over 160,000 queued jobs and I can’t get the FME server to respond (it hangs at the login screen, API calls just hang). I’ve restarted services, the machine etc etc. I think the issue is the fmejobrestore starts restoring all these jobs to the queue when I restart the fme server services.

 

I’d like to cancel all the queued jobs in the back end database. Can I remove the records from the fme_jobs table? Will this cause any issues?

 

Thank you!

Kathy

icon

Best answer by steveatsafe 16 April 2024, 23:42

View original

3 replies

Userlevel 5
Badge +29

Short answer, yes you can. Before you go down that route, what do the server logs say? Have you tried making API calls to delete jobs, are they successful?


I would be cautious of deleting everything in one go, I’d look to start with less important jobs and see if you can get it to a state where you can log in and clear through the UI

Badge +11

Hi @kathyjbryce  and @hkingsbury
Kathy, thanks for posting the question and I’m super glad you came into Support so we could assist you.  

Hamish, Kathy did try API calls but no joy. You are correct about the caution for deleting everything in one go.  You may want to only delete some records and not all. 

 

Richard and I joined Kathy earlier today to review the system in question.

This is a distributed FME Server 2022 installation with a SQL Server Database hosting the FME Server Database.

In the end we did ‘truncate the fme_jobs table’. 

  1. Stopped all FME Server services and confirmed in Task Manager
    1. Ensure all FME* processes have stopped.  There should be no remaining processes.
    2. Ensure redis-server.exe (or in FME Flow 2023+ memurai.exe) has stopped.  This is the Job Queue of FME Server (Flow).  Terminate the process if it has not stopped with the rest of FME Server/Flow.
  2. Using SQL Server Management Studio (SSMS), truncate the fme_jobs table. NOTE: This will reset the job id to 1 due to the identity id on the table.
  3. Reset the job_id to the previous max value in the SQL Server database where the FME Server Database is hosted.
    1. Determine the max job id by reviewing /logs/engines/jobs to find the maximum job id listed in the job logs.
    2. In SSMS run DBCC CHECKIDENT (N'[FMESERVER].[dbo].[fme_jobs]', RESEED, #####), where ##### is the maximum job id value found in step 1.  It will be a number, for example, the system has run 121,221 jobs.  So use a value above this… 121,300.  This will ensure the system doesn’t overwrite old job logs or produce errors when jobs are completed and archived to the fme_job_history table.
    3. In SSMS confirmed the correct JOB ID value using SELECT IDENT_CURRENT('YourTableName');
    4. Start FME Server Core & FME Server Application Server. Leave the engines off for now.  Log in to the FME Server Web UI.
    5. Run a job as a test to see what JOB_ID it has to confirm the RESEED worked.  Once you have confirmed this, you can start the FME Server Engines.

In this case the server had so many jobs in the fme_jobs table (by the time we connected more that 160,000 likely) that were attempting to be requeued to the Job Queue (redis-server.exe) that it impacted the performance of FME Server Core (fmeserver.exe), and thus the Web UI, making it unusable to cancel the jobs in the normal fashion.  The only way to resolve this was to clear the queued jobs that were in the fme_jobs table.

If you encounter a system where this has happened, and you are not able to interact with the FME Server Web UI, this will probably be the only way to resolve it.  When FME Server (or Flow) is restarted, any jobs that were previously queued or actively running will be requeued to the Job Queue. 

For FME Server Database hosted on Oracle and PostgreSQL (and the embedded FME Server Database), a similar approach can be taken, but reseting the job id will need different syntax.    

Let us know if you have encountered this problem.

Badge +2

Thanks Steve for the great summary!

Reply