Question

WorkspaceRunner with “Wait for Job to Complete” = Yes and a set number of concurrent processes

  • 13 January 2022
  • 8 replies
  • 45 views

Is it possible to run a WorkspaceRunner with “Wait for Job to Complete” = Yes while also setting the number of concurrent processes? It should work the same way as “Wait for Job to Complete” = No, the only difference being when the success or failure of each run is reported.


8 replies

Userlevel 4
Badge +25

No, because job 2 has to wait until job 1 has finished, so there's no way it can use concurrent processes. That's so users can ensure job 1 finishes first, e.g. in case job 2 needs to read the data job 1 created.

 

We could - I suppose - add a Group-By option, so that you could wait for each job to complete in a group, but process each group separately. If that's sort of what you want then perhaps you could post to the Ideas section so we can gauge the demand?

 

Alternatively, perhaps job 1 includes a WorkspaceRunner that is concurrent, so it runs something else on multiple processes (job 1a, 1b, 1c, etc) while the original workspace waits for it (job 1) to finish.
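
The Group-By idea above can be sketched outside FME in plain Python. This only illustrates the scheduling pattern (jobs wait for each other inside a group, while groups run side by side); it is not how WorkspaceRunner is implemented. The names `run_grouped`, `group_key`, and `run_job` are hypothetical, and `run_job` stands in for whatever launches one job and blocks until it finishes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_grouped(jobs, group_key, run_job, max_workers=4):
    """Run jobs sequentially within each group, with groups running concurrently.

    run_job is a hypothetical callable that starts one job and blocks until
    that job has finished.
    """
    groups = defaultdict(list)
    for job in jobs:
        groups[group_key(job)].append(job)

    def run_group(group_jobs):
        # Inside a group, each job starts only after the previous one returned.
        return [run_job(job) for job in group_jobs]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {key: pool.submit(run_group, members)
                   for key, members in groups.items()}
        # result() blocks, so this returns only when every group is done.
        return {key: future.result() for key, future in futures.items()}
```

With this shape, a job that needs the output of an earlier job only has to share a group with it; unrelated groups do not hold each other up.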

Userlevel 2
Badge +17

I think it would be nice to add Group By option to WorkspaceRunner.

 

In the interim, try creating a custom transformer that contains a WorkspaceRunner (Wait for Job to Complete = Yes) and exposing its Group By and Parallel Processing parameters. You can then run the custom transformer in parallel for each input feature by setting Group By together with a parallel processing mode. I think it's a bit of a hack, but it could be effective in some cases.

See the attached workspace examples to learn more.

See also "Custom Transformers and Parallel Processing" to learn about parallel processing.

 

I'm not sure how the maximum number of parallel processes is determined, but it could differ depending on the parallel processing level and the number of CPU cores.

I remember that there was documentation about the maximum number of processes for the old parallel processing parameter, but I cannot find any specific explanation for the current one. Could you please clarify how the maximum number of parallel processes is determined in current FME? @mark2atsafe

 

Badge +1

Hello! I have more or less the same request: I'm fetching a series of XML files through parallel processing with a WorkspaceRunner, and I need to recreate the indexes on the DB after the import. That step is connected to the Summary port of the WorkspaceRunner. The problem is that the Summary port fires as soon as the last WorkspaceRunner job begins; it does not wait for every job to finish. I need to continue only after all jobs are done.

 

Badge +15

Did you try using a FeatureHolder after the Succeeded output port?

In theory it would have to wait until it knows that every workspace has succeeded. Then you can run the process you want.

Badge +1

Both ports (Succeeded and Summary) trigger at the same time, with the last feature. However, the WorkspaceRunner subprocesses keep running for a long time (more than 5 minutes) after the WorkspaceRunner reports successful runs. This seems like unexpected behaviour.


Badge +1

@mark2atsafe As I already posted in this thread, I need to run the workspaces concurrently to improve performance, but I also need to wait until all of the processes are done before continuing with the workflow (rebuilding indexes in the DB, etc.). I don't need to wait for any particular job, only for whichever job finishes last. I would expect the Summary port to be able to wait for parallel processes to finish. There's no other way to know how long to wait: I tried a Decelerator, but the run times of the workspaces vary significantly.

The jobs have no attribute to group by; processing them in any order is fine.
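
For reference, the behaviour requested here (a capped number of concurrent jobs, then a single "everything is done" point) looks like this when written as a plain Python script rather than with WorkspaceRunner. This is a sketch under assumptions: `run_all_then_continue` is a hypothetical helper, and the demo commands are stand-in Python one-liners. To drive FME from such a script, each command would be the FME command line for the child workspace, something like `["fme", "child.fmw", "--SOURCE", path]`, where the workspace and parameter names are made up for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all_then_continue(commands, max_concurrent=4):
    """Run every command, at most max_concurrent at a time.

    Returns the exit codes in the same order as commands, and only
    returns once every command has finished.
    """
    def run(cmd):
        # subprocess.run blocks until this one child process has exited.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # list() consumes the whole iterator, i.e. waits for all jobs.
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Stand-in child jobs with different run times (assumption: real jobs
    # would be FME command lines instead).
    commands = [
        [sys.executable, "-c", f"import time; time.sleep({n * 0.1})"]
        for n in range(6)
    ]
    exit_codes = run_all_then_continue(commands, max_concurrent=3)
    if all(code == 0 for code in exit_codes):
        print("All jobs finished; safe to rebuild the indexes now.")
```

Because the function returns only after the slowest job has exited, no fixed delay is needed, however much the run times vary.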

Badge +2

I also think it would be helpful to allow concurrent processes while the "Wait for Job to Complete" parameter is set to Yes. My use case is batch processing of individual dates (the child workspace processes a single date; the parent workspace passes all dates to a WorkspaceRunner). Each batch is independent of all the others, so there is no reason for one run to wait for another. My primary reason for wanting "Wait for Job to Complete" is so that my WorkspaceRunner receives error messages from the child workspace. As it stands, the child workspace can fail silently (e.g. due to SQL Server errors), so I need to build in clunky completion validation to ensure that everything ran as expected.

 

@mark2atsafe Is there an alternative approach, where I can run concurrent processes AND get error messages back from the child workspace?

 

Thanks!

Graeme Brown
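
One possible shape for "concurrent AND error messages back", sketched as a plain Python script rather than as a WorkspaceRunner setting: launch each child as its own process, cap the concurrency, and keep the exit code and stderr text of every child that fails. This assumes the child signals failure through a non-zero exit code and writes something useful to stderr, which is an assumption about the child process, not a documented guarantee. `run_and_report` and the date labels are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_and_report(jobs, max_concurrent=4):
    """Run labelled commands concurrently and collect the failures.

    jobs maps a label (for example a date) to a command list.
    Returns {label: (exit_code, stderr_text)} for every job that exited
    with a non-zero code.
    """
    def run(cmd):
        done = subprocess.run(cmd, capture_output=True, text=True)
        return done.returncode, done.stderr.strip()

    failures = {}
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = {pool.submit(run, cmd): label for label, cmd in jobs.items()}
        for future in as_completed(futures):
            code, err = future.result()
            if code != 0:
                failures[futures[future]] = (code, err)
    return failures


if __name__ == "__main__":
    # Stand-in jobs: one succeeds, one fails with a message on stderr.
    demo = {
        "2022-01-01": [sys.executable, "-c", "pass"],
        "2022-01-02": [sys.executable, "-c",
                       "import sys; sys.stderr.write('boom'); sys.exit(2)"],
    }
    for label, (code, err) in run_and_report(demo).items():
        print(f"{label} failed with exit code {code}: {err}")
```

A child that exits with code 0 despite an internal error would still slip through, so this reduces the need for completion validation rather than removing it.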

Badge +2

Hi @mark2atsafe - I am wondering if you have an update on this. I am encountering this same issue for a new project (wanting to catch errors of workspaces that run in parallel via a WorkspaceRunner transformer), and in my search for answers came across this thread.

Thanks,
Graeme Brown
