Cluster member is Down after reboot / policy installation / running 'cpstart'
Cluster - 3rd party, ClusterXL
Output of the 'cphaprob state' command shows that the state of a cluster member is 'Down'.
Output of the 'cphaprob list' command on the problematic cluster member shows that the state of the Critical Device 'Synchronization' is 'problem'.
Forcing Full Synchronization (per sk37029) on the problematic cluster member does not resolve the issue.
Debugging Full Synchronization (per sk37030) on the problematic cluster member shows the following:
fwclient_connected: connection failed
fwsyncn_connected: connection to X.X.X.X failed. error=-1
fwsync_set_fwlddist_state: FWACTLDCHGS ioctl returned success (mask: 80, value: 0, instance -2)
fwsync: command failed. Try adding '-d' for debugging information
Traffic capture on the problematic cluster member shows that traffic on TCP port 256 (used by FWD daemons for Full Synchronization) is sent to the working peer member, but there is no reply.
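The "sent, but no reply" symptom above can also be confirmed with a simple TCP connection attempt. The following is a minimal sketch, not a Check Point tool: the function name tcp_port_reachable and the default timeout are assumptions made for illustration, and the script only reports whether a TCP handshake to port 256 completes. A silently dropped SYN shows up as a timeout, which the function reports as unreachable.

```python
import socket


def tcp_port_reachable(host: str, port: int = 256, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within `timeout` seconds.

    A refused connection, a timeout (for example, a SYN that is silently
    dropped by a rulebase), or any other socket error is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable-host errors.
        return False


# Example call (the peer address is a placeholder, not taken from the article):
#   tcp_port_reachable("192.0.2.1")
```

A False result alone does not identify the cause. It only narrows the problem to the path or the policy between the two members, which is what the FW Monitor and kernel debug checks examine next.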
FW Monitor (with flag '-p all') on the working cluster member shows that traffic on TCP port 256 is not passing all the expected Inbound chains, and there are no reply packets.
Kernel debug on the working cluster member ('fw ctl debug -m fw + drop') shows that it drops Full Synchronization traffic (TCP port 256) from the problematic peer member on the Clean Up rule:
;fw_log_drop: Packet proto=6 IP_of_Problematic_Member:Some_Port -> IP_of_Working_Member:256 dropped by fw_handle_first_packet Reason: Rulebase drop - rule Number_of_Clean_Up_Rule;
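When the kernel debug output is long, the relevant drop lines can be filtered offline. The sketch below assumes the line layout shown in the sample above and nothing more. The names DropRecord, parse_drop and is_full_sync_drop are hypothetical helpers, and the addresses, source port and rule number in SAMPLE are placeholder values, not real output.

```python
import re
from typing import NamedTuple, Optional


class DropRecord(NamedTuple):
    proto: int
    src: str
    sport: int
    dst: str
    dport: int
    func: str
    reason: str


# Pattern derived only from the sample line in this article (assumption:
# other versions may format drop messages differently).
DROP_RE = re.compile(
    r"fw_log_drop: Packet proto=(?P<proto>\d+) "
    r"(?P<src>\S+):(?P<sport>\d+) -> (?P<dst>\S+):(?P<dport>\d+) "
    r"dropped by (?P<func>\S+) Reason: (?P<reason>.*?);?\s*$"
)


def parse_drop(line: str) -> Optional[DropRecord]:
    """Parse one kernel debug line; return None if it is not a drop message."""
    m = DROP_RE.search(line)
    if m is None:
        return None
    return DropRecord(
        proto=int(m["proto"]),
        src=m["src"],
        sport=int(m["sport"]),
        dst=m["dst"],
        dport=int(m["dport"]),
        func=m["func"],
        reason=m["reason"],
    )


def is_full_sync_drop(rec: DropRecord) -> bool:
    """True for dropped TCP (protocol 6) packets destined to port 256."""
    return rec.proto == 6 and rec.dport == 256


# Placeholder values for illustration only.
SAMPLE = (
    ";fw_log_drop: Packet proto=6 192.0.2.2:51234 -> 192.0.2.1:256 "
    "dropped by fw_handle_first_packet Reason: Rulebase drop - rule 12;"
)
record = parse_drop(SAMPLE)
if record is not None and is_full_sync_drop(record):
    print(f"Full Sync drop from {record.src}: {record.reason}")
```

Because the pattern is matched with a search, any prefix that the debug output places before the message should not prevent a match.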
The current security policy does not allow the cluster members to communicate on TCP port 256 (used by FWD daemons for Full Synchronization).