
Hello all,

I am trying to have my second workbench (Data Validation) start running once my first workbench (Data Creation) is completed. I have tested using the WorkspaceRunner transformer, but it is slow.

Is there a more efficient way to have my second workbench start and run once my first workbench is completed? Since WorkspaceRunner seems pretty slow, I'm wondering whether the best way to do this is in FME Server.

Thank you for any suggestions!

David

The best way would be FME Server!

You can trigger the second workspace to run on success after the first workspace, using notifications.

Using FME Desktop, you have 2 obvious options:

Using the WorkspaceRunner (as you mentioned) or using a batch file with two calls to FME.

The batch file option does not guarantee that the first workspace has been successful, which is a disadvantage.

I would use FME Server if available and the WorkspaceRunner otherwise (it should not be slower than running the two workspaces from a batch file, and it allows for a success check).



Thanks @erik_jan! I was thinking of FME Server as well. I am fairly new to Server. How do I set a trigger for the second workspace to run?

Thanks!


You could also do the Data Validation in the same workspace as the writing: use the FeatureWriter instead of regular writers, and use the output ports of the FeatureWriter transformer to start the Data Validation. The FeatureWriter only passes features on through its output ports after the writing has finished.

This would most likely give you the best performance.



Hi @david_prosack88,

 

Have a look at this documentation: Notifications

 

It explains how to set up a workspace run, triggered by a completion of another workspace.

 

In general: set up a publication (the Data load workspace finished successfully), a topic, and a subscription (run the Data Validation workspace).
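To make the publication / topic / subscription idea a bit more concrete, here is a minimal sketch of the pattern in plain Python. It is not the FME Server API: the `NotificationBus` class, the topic name `DATA_CREATION_SUCCESS` and the workspace file names are all made up for illustration. It only shows that the validation step is started by a message published on success, and that nothing happens when the first step fails.

```python
from collections import defaultdict


class NotificationBus:
    """Toy stand-in for topics and subscriptions (NOT the FME Server API)."""

    def __init__(self):
        # topic name -> list of handler functions subscribed to that topic
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler subscribed to this topic
        for handler in self._subscribers[topic]:
            handler(message)


def run_creation(bus, succeed):
    """Placeholder for the first workspace; publishes only when it succeeds."""
    if succeed:
        bus.publish("DATA_CREATION_SUCCESS", {"workspace": "data_creation.fmw"})
    return succeed


bus = NotificationBus()
started = []  # records which follow-up workspaces were triggered

# Subscription: when the topic receives a message, start the validation step
bus.subscribe("DATA_CREATION_SUCCESS",
              lambda message: started.append("data_validation.fmw"))

run_creation(bus, succeed=False)  # failure: nothing is published
run_creation(bus, succeed=True)   # success: validation is triggered once
print(started)
```

On FME Server itself the same wiring is done through the Notifications configuration described in the documentation, not in code.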

 



I thought about consolidating too! But I'm afraid that combining the two workbenches into one will cause the workbench to crash. Is there a recommended size for a workbench?

 



Just a general recommendation to use as few transformers as possible.

 

My own opinion: Getting the job done (and being able to maintain the workspace) is more important than limiting transformers.

 



I agree that less can be better. But because of the different validation rules that were requested, the data validation workbench consists of around 80 different transformers. Would it be best to run this larger-than-normal workbench in FME Server instead of consolidating everything into one workbench?

 

 



As both workspaces are to run in sequence, I do not think it matters a lot.

 

You have no option to run both in parallel (on different FME Server engines) as the first has to finish before the second can start.

 

The main advantage is the ability to start the second automatically after the first succeeds and to detect success or failure of the first.

 

 


I have never tested it, but you say 'since WorkspaceRunner is pretty slow'. What makes the WorkspaceRunner a slow solution?



 

Notifications can be a tricky thing, especially if the FME Server environment is new to you.

There is also an analogue of the WorkspaceRunner at server level, called the FMEServerJobSubmitter. This is just a transformer that starts a new job on FME Server.

 

 



I'm wondering why the WorkspaceRunner would be slow as well. In my experience, unless you launch tens or hundreds of child workspaces using the WorkspaceRunner, the performance is pretty much identical to running the same work inside the main workspace.


It's probably me using the WorkspaceRunner incorrectly. I'm using the FeatureWriter to write to a shapefile, and the flow then proceeds to the Validation workbench through a WorkspaceRunner. About 500 features are being processed, and it takes about one and a half minutes for one feature to get through the WorkspaceRunner. One and a half minutes doesn't sound slow, but it is not workable once I start processing large files.

Again, maybe I'm not using the correct parameters in the transformer, and that is what makes the WorkspaceRunner slow.

 

 



We've got workbenches that are upwards of 1500 transformers. FME will handle 80 just fine.

 

 



 

 

Create one workspace that creates all the data and one workspace that validates all the data. Then create a third workspace with one Creator transformer and two WorkspaceRunners. This is just as fast as running both workspaces individually.

I can understand that it is very slow if you call the WorkspaceRunner for each feature. In that case, convert the second workspace to a custom transformer.
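To put rough numbers on the per-feature case, here is a small sketch using the figures from the post above (about 500 features, about 1.5 minutes per WorkspaceRunner call). It assumes the calls run one after another and that every call costs about the same; both are assumptions, not measurements.

```python
FEATURES = 500          # number of features mentioned in the thread
MINUTES_PER_CALL = 1.5  # approximate time for one WorkspaceRunner call


def total_hours(calls, minutes_per_call):
    """Total wall-clock time in hours, assuming strictly sequential calls."""
    return calls * minutes_per_call / 60


# One WorkspaceRunner call per feature: 500 separate child runs
per_feature_hours = total_hours(FEATURES, MINUTES_PER_CALL)
print(per_feature_hours)  # 12.5
```

With a single Creator feature in front of the WorkspaceRunner there is only one call, so the launch cost is paid once instead of 500 times.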

 

 


Hi David.

Whenever I need to control the flow of workspaces being run, I do it with a "master" workspace, that runs the "slave" workspaces with WorkspaceRunner. In this transformer, you can opt to wait for completion of the workspace.



Hi @erik_jan,

Could you please tell me why you think a batch execution cannot be checked for success? I always use this option and check the return value of the batch, and I have never had problems with it.



Do you get a log file for the overall process? Using a "Runner" workspace, calling "child" workspaces, log files are provided for all steps to be investigated.



Yes, calling the batch with the "LOG_STANDARDOUT Yes" option provides the full log.
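For anyone who wants to try the batch route with a return-value check, here is a minimal sketch in Python rather than a .bat file. It assumes, as described above, that the FME command-line call returns a non-zero exit code when a workspace fails. The function name `run_in_sequence` is made up, and the demo uses Python itself as a stand-in command so it runs without FME installed. With FME you would pass something like `["fme", "data_creation.fmw", "LOG_STANDARDOUT", "YES"]`, where the file names are hypothetical and the exact argument syntax should be checked against your FME version.

```python
import subprocess
import sys


def run_in_sequence(commands):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the exit codes of the commands that were actually run.
    """
    exit_codes = []
    for command in commands:
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        if result.returncode != 0:
            break  # do not start the next workspace after a failure
    return exit_codes


# Stand-in commands (assumption: real use would call the fme executable instead)
creation_ok = [sys.executable, "-c", "raise SystemExit(0)"]
creation_failed = [sys.executable, "-c", "raise SystemExit(1)"]
validation = [sys.executable, "-c", "print('validating')"]

print(run_in_sequence([creation_ok, validation]))      # both steps run
print(run_in_sequence([creation_failed, validation]))  # validation is skipped
```

The second call returns a single exit code, which shows that the validation step never started after the failed first step.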



How do you accomplish this? How do you kick off the first slave workspace? The WorkspaceRunner requires an input, so what is the input? For the second workspace, I just connected the first workspace as the input. But what is the input for the first?



You can initiate any workspace with a Creator transformer.

