How does FME Server Failover work under the covers?


If we log the messages sent by the failover system, we see the following.

This set of messages is repeated 3 times.

fail_message arrived at 16:48:58

{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,073.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:51

{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,075.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:50

{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,071.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:45

{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,072.", "ws_topic": "fail_message" }

Then we see this sent once:

fail_message arrived at 17:23:42

{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,071.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:41

{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,073.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:40

{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,072.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:37

{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,075.", "ws_topic": "fail_message" }

There are 4 port numbers in these messages (the log prints them with a thousands separator, so "7,071" is port 7071). They correspond to the following components:

7073 = scheduling

7075 = publishers

7071 = core

7072 = notification requests? (part of core)
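
To make the port mapping above easier to apply, here is a minimal Python sketch that pulls the host and port out of one fail_message payload and looks up the component. The function name `describe_fail_message` and the dictionary `PORT_COMPONENTS` are made up for illustration and are not part of FME Server. The 7072 entry is still only a guess, as noted in the list.

```python
import json
import re

# Port-to-component mapping as listed in this post. The 7072 entry is an
# unconfirmed guess, so it is labelled as such.
PORT_COMPONENTS = {
    7071: "core",
    7072: "notification requests (unconfirmed)",
    7073: "scheduling",
    7075: "publishers",
}

# Matches "host <name> and port <number>". The number may contain a
# thousands separator, e.g. "7,073".
HOST_PORT_RE = re.compile(r"host (?P<host>[\w-]+) and port (?P<port>[\d,]+)")


def describe_fail_message(payload: str) -> dict:
    """Extract host, port and component from one fail_message JSON payload."""
    msg = json.loads(payload)["msg"]
    match = HOST_PORT_RE.search(msg)
    if match is None:
        raise ValueError(f"no host/port found in message: {msg!r}")
    # Drop the thousands separator before converting to an integer.
    port = int(match.group("port").replace(",", ""))
    return {
        "host": match.group("host"),
        "port": port,
        "component": PORT_COMPONENTS.get(port, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        '{ "msg": "FAILOVER Node: Heartbeat not detected on host '
        'AP-FAIL-CORE2 and port 7,073.", "ws_topic": "fail_message" }'
    )
    print(describe_fail_message(sample))
```

Any port that is not in the list is reported as "unknown" rather than guessed.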

"AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2"

^- this means: AP-FAIL-CORE2 is dead.

fmeserver.log:

ACTIVE Node: Heartbeat not detected on host AP-FAIL-CORE1 and port 7,071

ACTIVE Node: Taking over jobs from host AP-FAIL-CORE1.
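
For anyone who wants to follow the sequence automatically, this is a rough Python sketch that sorts the message texts quoted in this post into three kinds of event: missed heartbeat, failover executed, and job takeover. The patterns are derived only from the lines shown here, so other wordings may exist that it does not catch. The function name `classify` and the event labels are my own, not FME Server terms.

```python
import re

# Patterns built from the message texts quoted in this post only.
FAILOVER_RE = re.compile(
    r"^(?P<role>\w+) Node: (?P<reporter>[\w-]+) has executed failover "
    r"operation on host (?P<host>[\w-]+)"
)
HEARTBEAT_RE = re.compile(
    r"^(?P<role>\w+) Node: Heartbeat not detected on host (?P<host>[\w-]+)"
)
TAKEOVER_RE = re.compile(
    r"^(?P<role>\w+) Node: Taking over jobs from host (?P<host>[\w-]+)"
)


def classify(msg: str) -> dict:
    """Return the event type, the reporting node role and the affected host."""
    for event, pattern in (
        ("failover_executed", FAILOVER_RE),
        ("heartbeat_missed", HEARTBEAT_RE),
        ("takeover", TAKEOVER_RE),
    ):
        match = pattern.match(msg)
        if match:
            result = {
                "event": event,
                "role": match.group("role"),
                "host": match.group("host"),
            }
            # Only the failover message names the node that did the work.
            if event == "failover_executed":
                result["executed_by"] = match.group("reporter")
            return result
    return {"event": "other", "role": None, "host": None}


if __name__ == "__main__":
    for line in (
        "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,073.",
        "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,071.",
        "ACTIVE Node: Taking over jobs from host AP-FAIL-CORE1.",
    ):
        print(classify(line))
```

In every case the "host" field is the machine being reported as down, not the one writing the message.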

I don't know how these work:

# FAILOVER_SCHEDULER_OWNER - This is the scheduler owner name of the host to be monitored. By default the
# FAILOVER_MONITOR_HOST value is used which by default corresponds with the SCHEDULER_OWNER setting
# of the monitored host.

# FAILOVER_TRANSFORMATION_OWNER - This is the transformation owner name of the host to be monitored. By default the
# FAILOVER_MONITOR_HOST value is used which by default corresponds with the TRANSFORMATION_OWNER setting
# of the monitored host.
