When to use the 'fwha_freeze_state_machine_timeout' parameter
Symptoms
  • During Security Policy installation, a failover might occur in ClusterXL configured in High Availability mode.
Solution

Policy installation may, in certain cases, cause a cluster member to initiate a failover in ClusterXL configured in High Availability mode.

To prevent this situation, you can use the global kernel parameter fwha_freeze_state_machine_timeout.

This parameter sets the number of seconds for which the state of each cluster member in ClusterXL configured in High Availability mode is "frozen". The freeze starts when the policy installation starts on the member and lasts until the countdown reaches zero.

  • To get the current timeout of the "freeze" mechanism, run this command on each member:

    [Expert@HostName]# fw ctl get int fwha_freeze_state_machine_timeout
  • To enable the "freeze" mechanism on-the-fly, run this command on each member:

    [Expert@HostName]# fw ctl set int fwha_freeze_state_machine_timeout VALUE_IN_SECONDS

    where VALUE_IN_SECONDS is an integer timeout in seconds, in either DEC or HEX format (default value: 0, which means the mechanism is disabled; see Notes 4 and 5 for version-specific defaults and formats).
  • To disable the "freeze" mechanism on-the-fly, run this command on each member:

    [Expert@HostName]# fw ctl set int fwha_freeze_state_machine_timeout 0
  • To set the value of the "freeze" mechanism permanently:

    Follow sk26202 - Changing the kernel global parameters for Check Point Security Gateway and set the timeout in seconds in DEC format.
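
The commands above can be wrapped in a small helper that validates the value before anything is applied. This is a minimal sketch, not a Check Point tool: the function names are hypothetical, and the fwkern.conf line format (parameter=value in $FWDIR/boot/modules/fwkern.conf) is an assumption about what sk26202 prescribes, so confirm it against that SK before use. The helpers only print the command or line; they do not change the gateway.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of any Check Point CLI.
PARAM="fwha_freeze_state_machine_timeout"

# Return 0 only for a non-negative decimal integer (DEC format).
is_valid_timeout() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print (do not run) the on-the-fly command for the given timeout in seconds.
freeze_set_cmd() {
    is_valid_timeout "$1" || { echo "invalid timeout: $1" >&2; return 1; }
    echo "fw ctl set int $PARAM $1"
}

# Print the line for the permanent setting.
# ASSUMPTION: sk26202 uses parameter=value lines in
# $FWDIR/boot/modules/fwkern.conf; verify against the SK.
freeze_conf_line() {
    is_valid_timeout "$1" || { echo "invalid timeout: $1" >&2; return 1; }
    echo "$PARAM=$1"
}
```

For example, `freeze_set_cmd 30` prints `fw ctl set int fwha_freeze_state_machine_timeout 30`, which can then be run in Expert mode on each member, while `freeze_set_cmd abc` is rejected with an error.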

 

Notes:

  1. This feature applies only to ClusterXL configured in High Availability mode.

  2. In versions R75.46 and lower: if the kernel parameter 'fwha_freeze_state_machine_timeout' is enabled (set to a non-zero value per sk25971 / sk26202) and Cluster Member priorities are changed, then during policy installation the cluster configuration on the members is not updated correctly, even though the output of the "cphaprob state" command shows that the Member IDs and their states have changed. For a workaround, refer to sk66064 - Change of Cluster Member priority when the kernel parameter 'fwha_freeze_state_machine_timeout' is enabled may cause network outage.

  3. On VSX cluster members, the "freeze" mechanism applies only to the cluster member itself (Virtual System 0). It does not apply to any other Virtual Systems.

  4. Starting in version R75.40VS, the "freeze" mechanism is enabled by design: the default value of the 'fwha_freeze_state_machine_timeout' parameter is 30 seconds.
    (In R75.40VS / R76 and above in VSX mode, VS0 monitors and performs the state change lock even when other Virtual Systems get the policy.)

  5. Set the desired value in DEC format first. If the value is not accepted, set it in HEX format instead.
    In versions R75.30, R75.40 and above, set the timeout value only in DEC format.

  6. When the "freeze" mechanism is enabled, the following messages will appear in /var/log/messages file during policy installation:

    ;FW-1: fwha_state_freeze: FREEZING state machine at CURRENT_STATE (time=HTU,caller=fwha_set_conf);
    ;FW-1: fwha_state_freeze: ENABLING state machine at CURRENT_STATE (time=HTU,caller=policy change - finished changes (fwha_start));

    The following messages in /var/log/messages file are normal during the boot of the machine:

    ;FW-1: fwha_state_freeze: FREEZING state machine at FAILURE (time=HTU,caller=fwha_set_conf);
    ;FW-1: fwha_state_freeze: ENABLING state machine at FAILURE (time=HTU,caller=policy change - finished changes (fwha_start));

    By design, a member that has just started after a reboot is first in the "Initialized" state and then in the "Down" state. After it performs Full Sync with the peer member and installs the policy, the member moves to the next allowed state, Active or Standby. The "freeze" mechanism starts working while the member is still loading its configuration, so it freezes the state at "failure" (i.e., not Active or Standby).

    Note: "HTU" stands for "HA Time Unit". All internal time in ClusterXL is measured in HTUs (the times in cluster debug output also appear in HTUs). The formula in the code is:
    1 HTU = 10 x fwha_timer_base_res = 10 x 10 milliseconds = 100 ms

  7. This parameter is not related to the State Synchronization mechanism in any way. It is related to what Check Point calls the "state machine", which determines the state of each machine in the cluster (Active, Standby, or Down). When the state of a machine changes, a failover takes place. In some cases the state changes during policy installation, and an unwanted failover may follow. Correctly setting fwha_freeze_state_machine_timeout should prevent the unwanted failover.

  8. Correctly setting fwha_freeze_state_machine_timeout should also prevent unwanted failovers in a 3rd party cluster, especially when the 3rd party environment may bring the cluster down during policy installation. In 3rd party environments, the state of a cluster member is determined by the 3rd party environment, whereas in ClusterXL it is determined by the ClusterXL state machine code, which may cause unwanted failovers during policy installation.
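
The HTU arithmetic and the log messages from Note 6 can be handled with a few shell helpers. This is a rough sketch under stated assumptions: the function names are hypothetical, and the filter matches only the message text quoted in Note 6, so it may need adjusting if the messages differ on a given version.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.
# 1 HTU = 100 ms, per the formula in Note 6.
HTU_MS=100

# Convert a count of HTUs to milliseconds.
htu_to_ms() {
    echo $(( $1 * HTU_MS ))
}

# Convert a timeout in seconds to HTUs.
seconds_to_htu() {
    echo $(( $1 * 1000 / HTU_MS ))
}

# Read log lines on stdin and keep only the "freeze" mechanism messages
# (both the FREEZING and the ENABLING variants quoted in Note 6).
freeze_events() {
    grep -E 'fwha_state_freeze: (FREEZING|ENABLING) state machine'
}
```

For example, `seconds_to_htu 30` shows how many HTUs the 30-second default timeout spans, and `freeze_events < /var/log/messages` lists the freeze and re-enable messages recorded around each policy installation.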

 

Applies To:
  • This SK replaces sk115642
