Question

Is there a way to control order of processing/dependency within an automation?

  • 8 September 2020
  • 3 replies
  • 18 views


I have an automation workflow (FME Server 2019) built from a number of workspaces that process an input file. The automation is triggered by a file landing in a directory (Directory Watch); the data passes through the workspaces and gets loaded to the database at the end (if it follows the happy path), otherwise someone is notified of the failure, etc. This works great for one file end-to-end, but when multiple files land in the directory, the first file runs through workspace A, then the second file runs through workspace A, and so on, rather than the first file running through workspace A, then workspace B, etc., completing everything before the second file starts. Is there a way to include this dependency?

I've tried adding a merge action, but it isn't accepted: everything is in sequence, there are no parallel streams, and I'm asking it to run/continue regardless of whether the workspace action passes or fails.

Is this possible in the latest version?

The reason I need one file to be processed end-to-end is because some of the workspaces load data to a staging database and later workspaces expect 'their' data to be in the staging database, not the data of a different file.

 

Any help much appreciated

Thanks

Mary


3 replies


Hi @1spatialmary​ ,

I don't think there is a straightforward way to do this, but it may be something you can achieve by making use of FME Server Job Queue Priority.

 

Imagine a scenario where I have one FME Server engine to work with. When I drop multiple files, let's say 5, into my directory, each of these events submits a workspace to run. However, because I only have one engine, 1 job runs and the other 4 enter the queue.

 

So the problem is that Workspace B is only submitted for a given file when that file's Workspace A job completes, at which point it goes to the end of the queue. By default all jobs are submitted with the same priority, meaning Workspace B becomes the 5th job in the queue and 4 other jobs have to run before the engine is free. However, if Workspace B is assigned a higher-priority job queue, it is placed at the top of the queue and becomes the next job to be picked up once the engine is available.

 

I don't think this will work perfectly, because at the moment the first Job A completes and submits Workspace B, the next job already in the queue will be picked up, but it may be something you can work with.
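To make the queue behaviour concrete, here is a toy priority-queue model in Python (purely illustrative; this is not the FME Server API, and the priority numbers are made up):

```python
import heapq

# Conceptual model of the engine's job queue:
# lower number = higher priority; ties run in submission order.
queue = []
counter = 0

def submit(job, priority):
    global counter
    heapq.heappush(queue, (priority, counter, job))
    counter += 1

# Five files land: five Workspace A jobs enter the queue at default priority 5.
for i in range(1, 6):
    submit(f"A(file{i})", priority=5)

# The single engine picks up A(file1); on completion it submits B(file1)
# to a higher-priority queue.
priority, _, running = heapq.heappop(queue)
submit("B(file1)", priority=1)

# B(file1) now sits ahead of the waiting A jobs.
print([job for _, _, job in sorted(queue)])
# -> ['B(file1)', 'A(file2)', 'A(file3)', 'A(file4)', 'A(file5)']
```

Without the priority bump, B(file1) would have sorted behind all four remaining A jobs instead of jumping to the front.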

 

Order Before Queues:

[screenshot]

Define Queues:

[screenshot]

[screenshot]

Order After Queues:

[screenshot]

If your FME Server has multiple engines, you may also want to restrict this Automation to a Job Queue with a single designated engine to help control this. Otherwise, if you have 5 engines and 5 files are dropped in, all of the Workspace A jobs will run in parallel.

 

I'd highly recommend posting an Idea to request an enhancement that allows Automation events to be queued, e.g. run an Automation to completion before triggering the next event.

Hi @1spatialmary​ 

I may have a workaround for you, but it's a little tricky to put in place. For this workaround, you will need:

  • A Database with 2 tables
    • Table name: WORKFLOW_MANAGER
      • Column: STATUS as BYTE
        • Where 0 = Nothing in progress
        • Where 1 = Something is running
    • Table name: FILES_TO_PROCESS
      • Column: FILENAME as VARCHAR(100)
      • Column: DEPOSIT_DATETIME as DATETIME
      • Column: STATUS as BYTE
        • Where 0 = Not started
        • Where 1 = In processing

Workspace_A triggered by the Directory Watch will no longer process the file; instead, it will write a row into FILES_TO_PROCESS with the file to be processed, the deposit datetime, and STATUS set to 0.

You also need to create another workspace, let's call it FILES_MANAGER_LOADER, and a Schedule to run it, let's say every 10 seconds.

In FILES_MANAGER_LOADER you have a few things to do, but the main idea is:

  • If the STATUS in table WORKFLOW_MANAGER is 0:
    • Select the first record from FILES_TO_PROCESS ordered by DEPOSIT_DATETIME
    • Check its STATUS
      • If 0:
        • Update STATUS to 1 in WORKFLOW_MANAGER
        • Update STATUS to 1 WHERE FILENAME = @Value(FILENAME)
        • Launch Workspace B to process this file with WAIT FOR JOB TO COMPLETE set to YES
        • Once Workspace B is done, delete this file's record from FILES_TO_PROCESS
        • Update STATUS to 0 in WORKFLOW_MANAGER
      • If 1:
        • Nothing to do; a file is being processed right now.
  • If the STATUS in table WORKFLOW_MANAGER is 1:
    • Do nothing and let the process end immediately, because something is already running.

It's hard to explain every step in detail, but the main idea is to have a manager workspace that runs, one by one, every file deposited in the watched directory. This manager needs to know whether something is being processed right now and wait to launch the next file until the previous one is done.
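The manager logic can be sketched in Python against an in-memory SQLite database (illustrative only; `run_workspace_b` is a hypothetical stand-in for submitting Workspace B with "wait for job to complete" set to yes, and the schema is simplified):

```python
import sqlite3

def manager_tick(conn, run_workspace_b):
    """One scheduled run of the manager workspace (sketch)."""
    cur = conn.cursor()
    (busy,) = cur.execute("SELECT STATUS FROM WORKFLOW_MANAGER").fetchone()
    if busy == 1:
        return None  # a file is already being processed; exit immediately
    row = cur.execute(
        "SELECT FILENAME FROM FILES_TO_PROCESS WHERE STATUS = 0 "
        "ORDER BY DEPOSIT_DATETIME LIMIT 1").fetchone()
    if row is None:
        return None  # nothing waiting
    (filename,) = row
    cur.execute("UPDATE WORKFLOW_MANAGER SET STATUS = 1")
    cur.execute("UPDATE FILES_TO_PROCESS SET STATUS = 1 WHERE FILENAME = ?",
                (filename,))
    conn.commit()
    run_workspace_b(filename)  # blocks until Workspace B finishes
    cur.execute("DELETE FROM FILES_TO_PROCESS WHERE FILENAME = ?", (filename,))
    cur.execute("UPDATE WORKFLOW_MANAGER SET STATUS = 0")
    conn.commit()
    return filename

# Minimal in-memory demo: two queued files are processed oldest-first.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WORKFLOW_MANAGER (STATUS INTEGER);
CREATE TABLE FILES_TO_PROCESS (FILENAME TEXT, DEPOSIT_DATETIME TEXT, STATUS INTEGER);
INSERT INTO WORKFLOW_MANAGER VALUES (0);
INSERT INTO FILES_TO_PROCESS VALUES ('b.csv', '2020-09-08 10:05', 0);
INSERT INTO FILES_TO_PROCESS VALUES ('a.csv', '2020-09-08 10:01', 0);
""")
processed = []
while manager_tick(conn, processed.append) is not None:
    pass
print(processed)  # -> ['a.csv', 'b.csv'], oldest deposit first
```

Because `run_workspace_b` blocks until Workspace B completes, no second file can start while one is in flight, which is exactly the end-to-end ordering the question asks for.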

 

Hope this workaround resolves your issue.

Regards,

Jean

Hi @1spatialmary,

Another idea: if you have more than one engine, you can reserve Engine A exclusively for this process, so every call is queued to Engine A. In that case, each workspace must finish before the next one is launched on Engine A.

 

Regards,

Jean
