Hi everyone,

 

I have a parent workspace that consists of a series of FMEServerJobSubmitter transformers:

Each FMEServerJobSubmitter is set up as follows:

Submit Jobs: In Sequence

Wait for the Jobs to Complete: Yes

 

When I call this parent workspace from the Web Interface, something weird happens. The first job is submitted straight away and becomes Completed after a second. The second one then only appears in Running jobs after about 45 seconds. In the meantime it is not in Queued, and one of the engines is idle.

The same happens for the rest of the workspaces: each one only appears in Running about 45 seconds after the previous one is Completed.

It is worth mentioning that we installed FME Server as distributed components (3-tier).

If it is installed as Express, it works fine.

 

Any help, please?

PS We are using the latest build (20596)

Can you share a screenshot of the configuration of the first FMEServerJobSubmitter? I'm assuming that they are all configured similarly.


Hi David, please see the attached screenshot below.


Have you looked in the FME Server logs to see exactly what is happening with the jobs and the engines? In particular, check logs / engine / current / fmeprocessmonitorengine.log


I have checked the logs and this is what I found.

Job 312 finished at 22:25:45 and job 313 started at 22:26:29, which is 44 seconds later.
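For anyone wanting to measure the same gap on their own server, here is a minimal Python sketch of the arithmetic. The function name `gap_seconds` is just illustrative, and it assumes both timestamps fall on the same day and use the `HH:MM:SS` form quoted above.

```python
from datetime import datetime

def gap_seconds(finished: str, started: str) -> float:
    """Seconds between one job finishing and the next one starting.

    Assumes both times are HH:MM:SS on the same day.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(started, fmt) - datetime.strptime(finished, fmt)
    return delta.total_seconds()

# Times for jobs 312 and 313 as quoted above
print(gap_seconds("22:25:45", "22:26:29"))  # 44.0
```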

Now, let's look at the log:

So, job 312 was submitted at 10:25:44 and finished at 10:25:45, and then there was a problem delivering the results, plus some other activity, before job 313 was submitted. I'm not sure where the problem is.

I have checked Windows/System32/drivers/etc/hosts on the server where the web application is installed, and it is exactly as explained here:

https://knowledge.safe.com/articles/333/request-times-out-network-error-between-tomcat-and.html

 


I have also noticed that if I go to Licensing & Engines -> Engine page, after a few seconds I get this error.

I did what was explained if you hit the link, but it did not solve the problem. I have a feeling that this may be connected to the issue I am having.


I agree, if there are e.g. internal timeouts it could explain some delays. I think a closer look at the server system logs and network configuration is in order.


I know this is old... but I wanted to share some information that is potentially related to this.

 

I have recently seen the reported behaviour when the FME Server System Database was hosted on SQL Server AlwaysOn, but the multisubnetfailover=true flag had not been added to the FME Server Database Connection in the FME Server config files. As a result, the two cores/engines were interacting with different databases (behind the curtains of AlwaysOn). This disrupted how the jobs were processed and slowed them down considerably (apparently by delaying jobs being pulled from the queue).

 

The customer, in my case, added the flag and jobs were no longer delayed.

Please review the documentation for Changing the Database Provider.

 

Other clustered databases could show similar behaviour, but so far I have little evidence of this and only one possible suspect. Please ensure that DB_JDBC_URL in the FME Server config file(s) is set properly for the clustered database you are using.
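As a quick sanity check, something like the following Python sketch could be used to confirm that the flag is present in a config file. It is only an illustration under assumptions: it assumes the setting appears as a single `DB_JDBC_URL=...` line, that extra properties are appended with semicolons in the usual SQL Server JDBC URL style, and that property names can be compared case-insensitively. The host and database names in the sample are made up.

```python
def jdbc_url_properties(config_text):
    """Return the ;key=value properties of the first DB_JDBC_URL line.

    Keys are lower-cased. Returns None if no DB_JDBC_URL line is found.
    Assumes a simple KEY=value config layout with '#' comment lines.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or not line.startswith("DB_JDBC_URL"):
            continue
        _, _, url = line.partition("=")
        props = {}
        # Everything after the first ';' is a list of key=value properties
        for part in url.strip().split(";")[1:]:
            key, sep, value = part.partition("=")
            if sep:
                props[key.strip().lower()] = value.strip()
        return props
    return None

def has_multisubnet_failover(config_text):
    """True only if the URL carries multiSubnetFailover=true."""
    props = jdbc_url_properties(config_text)
    return bool(props) and props.get("multisubnetfailover", "").lower() == "true"

# Hypothetical config line; host and database names are placeholders
sample = "DB_JDBC_URL=jdbc:sqlserver://sqlhost.example:1433;databaseName=fmeserver;multiSubnetFailover=true"
print(has_multisubnet_failover(sample))
```

If there is more than one config file (for example one per core or engine host in a distributed install), each would need the same check.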

 

If you have any questions don't hesitate to post a new question or reach out to Safe Software Support.