
If we log the messages sent from the failover system we see the following.

 

This set of messages is repeated 3 times.

 

 

fail_message arrived at 16:48:58
{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,073.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:51
{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,075.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:50
{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,071.", "ws_topic": "fail_message" }

fail_message arrived at 16:48:45
{ "msg": "FAILOVER Node: Heartbeat not detected on host AP-FAIL-CORE2 and port 7,072.", "ws_topic": "fail_message" }

 

 

Then we see this sent once:

 

 

fail_message arrived at 17:23:42
{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,071.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:41
{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,073.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:40
{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,072.", "ws_topic": "fail_message" }

fail_message arrived at 17:23:37
{ "msg": "ACTIVE Node: AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2 and port 7,075.", "ws_topic": "fail_message" }

 

 

There are 4 port numbers in these messages. They correspond to the following components:

 

7073 = scheduling

 

7075 = publishers

 

7071 = core

 

7072 = notification requests? core
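
In case it helps anyone reading these logs, here is a rough Python sketch of how the fail_message payloads could be decoded into role, host, port, and component. The function name `describe_fail_message` and the `PORT_COMPONENTS` table are my own (hypothetical, not part of FME), the mapping is just the list above copied as-is (including the uncertain 7072 entry), and it assumes the message text always follows the "host X and port N" pattern seen in this thread.

```python
import json
import re

# Port-to-component mapping copied from the post; the 7072 entry is uncertain there.
PORT_COMPONENTS = {
    7073: "scheduling",
    7075: "publishers",
    7071: "core",
    7072: "notification requests? core",
}

# Matches "host <name> and port <number>"; the logged port contains a
# thousands separator (e.g. "7,073"), so commas are allowed in the number.
HOST_PORT = re.compile(r"host (?P<host>\S+) and port (?P<port>[\d,]+)")


def describe_fail_message(raw):
    """Return (node_role, host, port, component) for one fail_message payload."""
    payload = json.loads(raw)
    msg = payload["msg"]
    # The text before " Node:" is the role of the sender (FAILOVER or ACTIVE).
    role = msg.split(" Node:", 1)[0]
    match = HOST_PORT.search(msg)
    if match is None:
        raise ValueError(f"no host/port found in: {msg!r}")
    # Strip the thousands separator before converting to an integer.
    port = int(match.group("port").replace(",", ""))
    component = PORT_COMPONENTS.get(port, "unknown")
    return role, match.group("host"), port, component
```

Feeding it the first message above should give the FAILOVER role, host AP-FAIL-CORE2, port 7073, and the "scheduling" component.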

"AP-FAIL-CORE1 has executed failover operation on host AP-FAIL-CORE2"

 

 

^- this means: AP-FAIL-CORE2 is dead.

fmeserver.log:

ACTIVE Node: Heartbeat not detected on host AP-FAIL-CORE1 and port 7,071


ACTIVE Node: Taking over jobs from host AP-FAIL-CORE1.

I don't know how these work:

 

 

# FAILOVER_SCHEDULER_OWNER - This is the scheduler owner name of the host to be monitored. By default the
# FAILOVER_MONITOR_HOST value is used which by default corresponds with the SCHEDULER_OWNER setting
# of the monitored host.
#
# FAILOVER_TRANSFORMATION_OWNER - This is the transformation owner name of the host to be monitored. By default the
# FAILOVER_MONITOR_HOST value is used which by default corresponds with the TRANSFORMATION_OWNER setting
# of the monitored host.

