
Hi all,

I'm trying to troubleshoot a scheduling anomaly affecting jobs that were scheduled to run early this morning.

What has happened is that several jobs repeated on their own instead of executing once at the scheduled time and stopping when complete. As a result, one job notified (Slack bot and email) three times that it had written 1 updated record to a Postgres log table, but the record is not actually duplicated in the table, so on further inspection the result (the distinct count) is correct. However, the notifications I have set up in the workspaces are not reflecting the output correctly.
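To illustrate what I mean by the distinct count still being correct, a check along these lines comes back empty, i.e. no UUID was written more than once (table and column names here are placeholders, not my real schema):

    -- Placeholder names: production.survey_records and uuid stand in for the
    -- real table and key column. Any rows returned would indicate a duplicate.
    SELECT uuid, COUNT(*) AS copies
    FROM production.survey_records
    GROUP BY uuid
    HAVING COUNT(*) > 1;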

I suspect what might be happening is that the set (batch) of 23 tasks (workspaces), which perform the same process on 23 tables in a PostGIS db, are overrunning and interfering with each other. The tables all reside under one schema in the database. Each workspace reads in a master (staging) db and the current production db, and the output of both is sent through a ChangeDetector transformer. Any records that don't match a UUID in the production db are considered new, so they get sent to the update port and then written to the production database. I'm simply using the INSERT parameter on the writer, which might be part of the problem, since there is the fme_db_operation attribute specifically for this, but I didn't think I really needed it just to append new records to the tables. Maybe that is my oversight; please advise. These jobs are set up on the server as 23 individually scheduled tasks, all with the same start time, midnight.
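Purely to illustrate what I mean by appending new records keyed on the UUID (not what the FME writer actually generates under the hood), a hand-written append that stays safe under a re-run could look like the SQL below. All names are placeholders, and it assumes a unique constraint on the UUID column in the production table:

    -- Placeholder names: staging.parcels, production.parcels and uuid are stand-ins.
    -- With a UNIQUE constraint on production.parcels.uuid, running the same job a
    -- second time inserts nothing instead of duplicating rows.
    INSERT INTO production.parcels (uuid, geom, attrs)
    SELECT s.uuid, s.geom, s.attrs
    FROM staging.parcels AS s
    ON CONFLICT (uuid) DO NOTHING;

That kind of re-run safety is really the behaviour I'm after if a schedule misfires and a job executes more than once.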

If there is a better way to perform database updates from 23 individual surveys on a weekly schedule, I'd really like to learn how.

If they interfere, then wouldn't serial execution resolve this?

Maybe have those tasks that have interdependency not run simultaneously?


@gio, thanks. Yes, that is what I'm trying next: serial scheduling, with each task 5 minutes apart. None of them are interdependent at this point in the ETL, so I was just trying to be "efficient" and have them run in parallel, but maybe I'm not thinking it through. Most take only a few seconds to run, even as database updates; just a few take several minutes to an hour+ to complete. I'm still stumped, though, on the notifications and why they would run a second or even third time after completing correctly the first time.
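To pin down how many times each job actually wrote this morning, I'm planning to query the log table with something along these lines (again, table and column names are placeholders for my real schema):

    -- Placeholder names: etl.write_log, job_name and written_at are stand-ins.
    -- Counts how many writes each job logged since midnight, with first/last timestamps.
    SELECT job_name,
           COUNT(*)        AS writes,
           MIN(written_at) AS first_write,
           MAX(written_at) AS last_write
    FROM etl.write_log
    WHERE written_at >= CURRENT_DATE
    GROUP BY job_name
    ORDER BY writes DESC;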

