Question

Engines disappear when active core is down in a fault tolerant installation

  • 11 February 2020
  • 2 replies
  • 12 views

Hi,

I have installed a fault-tolerant FME Server 2019.2. There are two core servers and four engine servers with one engine each; there are no engines on the core servers. A third-party load balancer takes care of the active/passive routing to the core machines. When I look at the Deployment Status in the web UI, all four engines are tied to the engine manager on the active core host.

The issue appears if I stop the FME Server services on the active core host to test the active/passive behaviour. When the active core is down, there are no engines left, and the only online core has taken over the job routing. If I send jobs to the FME Server, they all end up in the queue. When I start the services again, the engines show up again and look like they are tied to the now-following engine manager.

This should not be the case. The engines should switch to the remaining core when the first one goes down. This behaviour appeared after an upgrade (uninstall/install) from 2019.0 to 2019.2. Before the upgrade the engines never disappeared; they just moved to the second core. I can't find any setting to get the right behaviour. As far as I can see, I have followed the installation documentation for the fault-tolerant installation.

Thanks,

/Per Angerud

Hi @swedper,

I'm sorry we took so long to reply to this question.

Where are you with this problem? This should not be the case.

I'm curious about what the installation steps were to build this environment.

In my testing of 2019.2, all distributed engines behaved as you expected and moved to the other core. If you are still experiencing this, I would invite you to submit a case at www.safe.com/support so we can review your system.

 

Again, I'm really sorry for our delay in responding to your post.

Hi,

 

I did some new tests and it actually works as expected. I was too quick on the trigger and didn't give FME Server the time it needed to switch the engines to the passive node. I found that the first time this happens, it takes a while (about 5 minutes) before anything happens. The second time, the switch is faster.
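For anyone repeating this failover test, a small polling helper can keep you from judging the result too early. This is only a rough sketch in Python: `count_engines` is a hypothetical callable that you would have to supply yourself (for example, something that reads the engine count your deployment reports), and the 600-second default timeout is my own assumption, chosen to leave some margin over the roughly 5 minutes seen above. It does not call any FME Server API.

```python
import time
from typing import Callable


def wait_for_engines(
    count_engines: Callable[[], int],  # hypothetical: returns engines currently registered
    expected: int,
    timeout_s: float = 600.0,  # assumption: margin over the ~5 minutes observed
    poll_s: float = 15.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll until at least `expected` engines are registered.

    Returns the number of seconds waited. Raises TimeoutError if the
    engines have not appeared within `timeout_s` seconds.
    """
    start = clock()
    while True:
        if count_engines() >= expected:
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError(
                f"fewer than {expected} engines after {timeout_s:.0f} s"
            )
        sleep(poll_s)
```

You would stop the services on the active core, then call `wait_for_engines(my_counter, expected=4)` and only treat the failover as broken if it raises `TimeoutError`. The `clock` and `sleep` parameters are there so the helper can be tried out without real waiting.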

Thanks,

/Per

