
I have an intermittent problem where an FMEServerJobWaiter fails to wait for a job to complete and rejects it after 30 minutes or so, when in fact the job is still running and completes successfully (it gets rejected with a 502 gateway error).

Is there a difference in the way the FMEServerJobWaiter works compared to the Wait for Jobs to Complete parameter in the FMEServerJobSubmitter? Might that give more reliable performance?

Using FME2021.2.

Hi @btl, thank you for your post!

The FMEServerJobWaiter is designed to wait for FME Server jobs to start or finish based on their job IDs. The FMEServerJobSubmitter, on the other hand, submits jobs to run on the server. The FMEServerJobSubmitter does, however, have an option to wait until all submitted jobs are completely processed before proceeding.

 

Depending on your use case and how you need to process things, the FMEServerJobSubmitter with the option "Wait for jobs to complete" enabled may provide better results. There may also be a better way to chain your jobs together so they run more effectively. If you are able to provide more context on what you are trying to achieve, I'd be happy to help!

 

Kate



Hi Kate - it's a process that we've had running successfully on FME 2019 for some time but with a move to 2021 on an Azure server this issue has popped up. The process just extracts some data from a db and writes several CSVs before putting them in a zip and moving it to another location.

I have a runner workspace where the FMEServerJobSubmitter kicks off a few jobs with a worker workspace, which extracts the CSVs. One job handles a particularly large file (almost 2 GB); this is the problematic one, while the other jobs handle groups of smaller CSVs. The FMEServerJobWaiter then waits for the jobs to finish so all the files can be zipped.

Hopefully, as I'm just updating this process to a newer version, it can be tweaked rather than needing a major reconfiguration of how it works :)



Thank you for those details! Just to confirm as well, is your Workbench the same version/build as your server? If so, did you upgrade the transformers to the latest versions (if applicable)?


I would assume the waiting mechanism is similar, but I'm not 100% sure. It's possible the FMEServerJobSubmitter keeps the connection open and waits for a response, while the FMEServerJobWaiter almost certainly polls the server periodically to check the status.

My guess is that the FMEServerJobWaiter uses the REST service to check a job's status, but it could be another process.

It's hard to say why the 502 error is happening, but presumably it's related to some kind of routing. If the server is small, though, it might simply not have the capacity to handle the responses properly.
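To illustrate that polling idea, here is a minimal sketch of a status poll that treats transient gateway errors (like the 502 above) as retryable instead of rejecting the job outright. This is an assumption about how such a waiter could behave, not how the FMEServerJobWaiter actually works internally; the terminal status names and the `/fmerest/v3/transformations/jobs/id/<jobid>` endpoint are taken from the FME Server REST API v3, and `fetch_status` is a hypothetical helper standing in for the real HTTP call:

```python
import time

# Terminal FME Server job states (per the REST API v3; exact names may
# vary between FME versions -- treat this set as an assumption).
TERMINAL_STATES = {"SUCCESS", "FME_FAILURE", "JOB_FAILURE", "ABORTED"}

def poll_job(fetch_status, interval=30, max_transient_errors=5):
    """Poll a job until it reaches a terminal state.

    fetch_status() should return (http_code, state_or_None), e.g. from
    GET /fmerest/v3/transformations/jobs/id/<jobid>.  Transient gateway
    errors (502/503/504) are retried instead of rejecting the job.
    """
    transient = 0
    while True:
        code, state = fetch_status()
        if code in (502, 503, 504):
            # Gateway hiccup: count it, but only give up after several
            # consecutive failures rather than rejecting on the first one.
            transient += 1
            if transient > max_transient_errors:
                raise RuntimeError("too many gateway errors while polling")
        elif code == 200:
            transient = 0  # any good response resets the error counter
            if state in TERMINAL_STATES:
                return state
        else:
            raise RuntimeError(f"unexpected HTTP {code}")
        time.sleep(interval)

# Example with a stubbed fetcher: two 502s, then the job reports SUCCESS.
responses = iter([(502, None), (502, None), (200, "SUCCESS")])
print(poll_job(lambda: next(responses), interval=0))  # prints SUCCESS
```

In a real workflow, `fetch_status` would issue the GET request with an `fmetoken` header and parse the JSON `status` field; the key design point is that a gateway error only rejects the job after repeated consecutive failures.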

Depending on your FME Server setup, you could try changing the web connection to use https://localhost instead of the server URL (this only works for non-distributed systems). This would likely remove any odd routing effects when resolving the URL.

Another thing you can try is adding another FMEServerJobWaiter onto the Rejected port. This won't solve the underlying issue, but it should reduce its frequency.



That's right, both the same build and all transformers upgraded.

