ClusterXL Sync Statistics - output of 'cphaprob syncstat' command
Solution

For R80.20 and higher

See the ClusterXL Administration Guide for your version > Section "Monitoring Delta Synchronization".

This section explains the output parameters of the 'show cluster statistics sync' and 'cphaprob syncstat' commands.

Example output from a cluster member:

Delta Sync Statistics
Sync status: OK
Drops:
Lost updates................................. 0
Lost bulk update events...................... 0
Oversized updates not sent................... 0
Sync at risk:
Sent reject notifications.................... 0
Received reject notifications................ 0
Sent updates:
Total generated sync messages................ 12316
Sent retransmission requests................. 0
Sent retransmission updates.................. 0
Peak fragments per update.................... 1
Received updates:
Total received updates....................... 12
Received retransmission requests............. 0
Queue sizes (num of updates):
Sending queue size........................... 512
Receiving queue size......................... 256
Fragments queue size......................... 50
Timers:
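The counters in the example output can be collected programmatically. The following is a minimal sketch, assuming the output has been saved as text and that every counter line has the form "name, run of dots, integer" as in the example above. The function name parse_syncstat and the SAMPLE text are illustrative, not part of any Check Point tool. The "Sync status:" line is not a dotted counter line, so this sketch skips it.

```python
import re

# A counter line is a field name, a run of dots, and an integer,
# e.g. "Lost updates................................. 0".
FIELD_RE = re.compile(r"^(?P<name>.+?)\s*\.{2,}\s*(?P<value>\d+)$")


def parse_syncstat(text):
    """Group dotted 'name....value' lines under their section headers."""
    stats = {}
    section = None  # counters seen before any "Section:" header go under None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = FIELD_RE.match(line)
        if match:
            stats.setdefault(section, {})[match.group("name")] = int(match.group("value"))
        elif line.endswith(":"):
            # A header such as "Drops:" starts a new section.
            section = line[:-1]
            stats.setdefault(section, {})
    return stats


# Shortened copy of the example output above.
SAMPLE = """\
Delta Sync Statistics
Sync status: OK
Drops:
Lost updates................................. 0
Sent updates:
Total generated sync messages................ 12316
Queue sizes (num of updates):
Sending queue size........................... 512
"""

stats = parse_syncstat(SAMPLE)
print(stats["Sent updates"]["Total generated sync messages"])  # 12316
```

The result is a dictionary keyed by section name, so the counters described in the sections below can be looked up by the same names that appear in the output.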

The "Sync status:" section

This section shows the status of the Delta Sync mechanism. It is one of these:

  • Sync status: OK
  • Sync status: Off - Full-sync failure
  • Sync status: Off - Policy installation failure
  • Sync status: Off - Cluster module not started
  • Sync status: Off - SIC failure
  • Sync status: Off - Full-sync checksum error
  • Sync status: Off - Full-sync received queue is full
  • Sync status: Off - Release version mismatch
  • Sync status: Off - Connection to remote member timed-out
  • Sync status: Off - Connection terminated by remote member
  • Sync status: Off - Could not start a connection to remote member
  • Sync status: Off - cpstart
  • Sync status: Off - cpstop
  • Sync status: Off - Manually disabled sync
  • Sync status: Off - Was not able to start for more than X second
  • Sync status: Off - Boot
  • Sync status: Off - Connectivity Upgrade (CU)
  • Sync status: Off - cphastop
  • Sync status: Off - Policy unloaded
  • Sync status: Off - Hibernation
  • Sync status: Off - OSU deactivated
  • Sync status: Off - Sync interface down
  • Sync status: Fullsync in progress
  • Sync status: Problem (Able to send sync packets, unable to receive sync packets)
  • Sync status: Problem (Able to send sync packets, saving incoming sync packets)
  • Sync status: Problem (Able to send sync packets, able to receive sync packets)
  • Sync status: Problem (Unable to send sync packets, unable to receive sync packets)
  • Sync status: Problem (Unable to send sync packets, saving incoming sync packets)
  • Sync status: Problem (Unable to send sync packets, able to receive sync packets)
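The status values listed above fall into four families: OK, Off with a reason, Fullsync in progress, and Problem with a send/receive combination. The sketch below shows one way to split a status line along those families. It is an illustration only: the function name classify_sync_status and the returned dictionary keys are assumptions, and the patterns are taken from the list above rather than from any formal specification.

```python
import re

# "Problem (<Able|Unable> to send sync packets, <receive state> sync packets)"
PROBLEM_RE = re.compile(
    r"^Problem \((Able|Unable) to send sync packets, "
    r"(able to receive|unable to receive|saving incoming) sync packets\)$"
)


def classify_sync_status(line):
    """Split a 'Sync status: ...' line into a state and its detail."""
    prefix = "Sync status:"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError("not a 'Sync status:' line")
    value = line[len(prefix):].strip()

    if value == "OK":
        return {"state": "ok"}
    if value == "Fullsync in progress":
        return {"state": "fullsync"}
    if value.startswith("Off - "):
        return {"state": "off", "reason": value[len("Off - "):]}
    match = PROBLEM_RE.match(value)
    if match:
        return {
            "state": "problem",
            "can_send": match.group(1) == "Able",
            "receive": match.group(2),
        }
    # Anything not in the list above is reported as-is.
    return {"state": "unknown", "raw": value}


off = classify_sync_status("Sync status: Off - SIC failure")
problem = classify_sync_status(
    "Sync status: Problem (Unable to send sync packets, saving incoming sync packets)"
)
print(off["reason"])        # SIC failure
print(problem["can_send"])  # False
```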

The "Drops:" section

This section shows statistics for drops on the Delta Sync network.

Field Description
Lost updates

Shows how many Delta Sync updates this cluster member considers as lost (based on sequence numbers in CCP packets).

If this counter shows a value greater than 0, this cluster member lost Delta Sync updates.

Possible mitigation: Increase the size of the Sending Queue or the size of the Receiving Queue, depending on the counter "Received reject notifications":

  • Increase the size of the Sending Queue, if the counter "Received reject notifications" is increasing.
  • Increase the size of the Receiving Queue, if the counter "Received reject notifications" is not increasing.

Lost bulk update events

Shows how many times this cluster member missed Delta Sync updates in bulk (bulk update = twice the size of the local Receiving Queue).

This counter increases when this cluster member receives a Delta Sync update with a sequence number much greater than expected. This probably indicates networking issues that cause massive packet drops.

This counter increases when the number of missed Delta Sync updates is more than twice the local Receiving Queue Size.

Possible mitigation:

  • If the counter's value is steady, this might indicate a one-time synchronization problem that can be resolved by running a manual Full Sync. See sk37029.
  • If the counter's value keeps increasing, it is probable that there are networking issues. Increase the sizes of both the Receiving Queue and the Sending Queue.

Oversized updates not sent

Shows how many oversized Delta Sync updates were discarded before sending them.

This counter increases when a Delta Sync update is larger than the local Fragments Queue Size.

Possible mitigation:

  • If the counter's value is steady, increase the size of the Sending Queue.
  • If the counter's value keeps increasing, contact Check Point Support.
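
The mitigation rules for the three "Drops:" counters can be summarized as a small decision function. This is a sketch under stated assumptions: it compares two snapshots of the counters taken some time apart, treats "steady" as "unchanged between the snapshots" and "keeps increasing" as "higher in the later snapshot", and uses the hypothetical function name drop_mitigations. It only restates the guidance above and does not replace it.

```python
def drop_mitigations(previous, current):
    """Suggest mitigations from two counter snapshots (dicts of field -> int)."""

    def increased(field):
        # Assumption: "increasing" means higher than in the earlier snapshot.
        return current.get(field, 0) > previous.get(field, 0)

    advice = []

    if current.get("Lost updates", 0) > 0:
        if increased("Received reject notifications"):
            advice.append("Lost updates: increase the Sending Queue size")
        else:
            advice.append("Lost updates: increase the Receiving Queue size")

    if current.get("Lost bulk update events", 0) > 0:
        if increased("Lost bulk update events"):
            advice.append("Lost bulk update events: check the network, "
                          "increase both queue sizes")
        else:
            advice.append("Lost bulk update events: run a manual Full Sync (sk37029)")

    if current.get("Oversized updates not sent", 0) > 0:
        if increased("Oversized updates not sent"):
            advice.append("Oversized updates not sent: contact Check Point Support")
        else:
            advice.append("Oversized updates not sent: increase the Sending Queue size")

    return advice


# Hypothetical snapshots, for illustration only.
before = {"Lost updates": 2, "Received reject notifications": 0,
          "Lost bulk update events": 1, "Oversized updates not sent": 0}
after = {"Lost updates": 5, "Received reject notifications": 3,
         "Lost bulk update events": 1, "Oversized updates not sent": 0}

for suggestion in drop_mitigations(before, after):
    print(suggestion)
```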

The "Sync at risk:" section

This section shows statistics that indicate the Sending Queue is at full capacity and rejects Delta Sync retransmission requests.

Field Description
Sent reject notifications

Shows how many times this cluster member rejected Delta Sync retransmission requests from its peer cluster members because this cluster member does not hold the requested Delta Sync update anymore.

Received reject notifications

Shows how many reject notifications this cluster member received from its peer cluster members.

The "Sent updates:" section

This section shows statistics for Delta Sync updates sent by this cluster member to its peer cluster members.

Field Description
Total generated sync messages

Shows how many sync messages were generated. This counts Delta Sync updates, Retransmission Requests, Retransmission Acknowledgments, and so forth.

Sent retransmission requests

Shows how many times this cluster member asked its peer cluster members to retransmit specific Delta Sync update(s).

Retransmission requests are sent when certain Delta Sync updates (with a specified sequence number) are missing, while the sending cluster member already received Delta Sync updates with advanced sequences.

Note - Compare the number of Sent retransmission requests to the Total generated sync messages of the other cluster members.

A large counter value can imply connectivity problems. If the counter value is unreasonably high (more than 30% of the Total generated sync messages of the other cluster members), contact Check Point Support with the entire output and a detailed description of the network topology and configuration.

Sent retransmission updates

Shows how many times this cluster member retransmitted specific Delta Sync update(s) at the request of its peer cluster members.

Peak fragments per update

Shows the peak number of fragments in the Fragments Queue on this cluster member (usually, should be 1).

The "Received updates:" section

This section shows statistics for Delta Sync updates that were received by this cluster member from its peer cluster members.

Field Description
Total received updates

Shows the total number of Delta Sync updates this cluster member received from its peer cluster members.

This counts only Delta Sync updates (not Retransmission Requests, Retransmission Acknowledgments, and others).

Received retransmission requests

Shows how many retransmission requests this cluster member received from its peer cluster members.

A large counter value can imply connectivity problems. If the counter value is unreasonably high (more than 30% of the Total generated sync messages on this cluster member), contact Check Point Support with the entire output and a detailed description of the network topology and configuration.
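
The 30% guideline for retransmission requests is a simple ratio. The sketch below shows the arithmetic, assuming the counters have already been read from the output. The function names and the sample numbers other than 12316 (the "Total generated sync messages" value from the example output) are hypothetical. For "Sent retransmission requests" the denominator is the Total generated sync messages of the other cluster members. For "Received retransmission requests" it is the Total generated sync messages on this cluster member.

```python
THRESHOLD = 0.30  # "more than 30%" is described above as unreasonably high


def retransmission_ratio(requests, total_generated):
    """Return requests / total_generated, or None when the total is 0."""
    if total_generated <= 0:
        return None
    return requests / total_generated


def is_unreasonably_high(requests, total_generated):
    """True when the ratio exceeds the 30% guideline."""
    ratio = retransmission_ratio(requests, total_generated)
    return ratio is not None and ratio > THRESHOLD


# Hypothetical request counts against the example total of 12316 messages.
print(is_unreasonably_high(4000, 12316))  # True  (about 32.5%)
print(is_unreasonably_high(1200, 12316))  # False (about 9.7%)
```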

The "Queue sizes (num of updates):" section

This section shows the sizes of the Delta Sync queues.

Field Description
Sending queue size

Shows the size of the cyclic queue, which buffers all the Delta Sync updates that were already sent until this cluster member receives an acknowledgment from the peer cluster members.

This queue is needed for retransmitting the requested Delta Sync updates.

Each cluster member has one Sending Queue.

Default: 512 Delta Sync updates, which is also the minimum value.

Receiving queue size

Shows the size of the cyclic queue, which buffers the received Delta Sync updates, in two cases:

  • When Delta Sync updates are missing, this queue is used to hold the remaining received Delta Sync updates until the lost Delta Sync updates are retransmitted (cluster members must keep the order in which they save the Delta Sync updates in the kernel tables).
  • When a Delta Sync update is fragmented, this queue is used to re-assemble it.

Each cluster member has one Receiving Queue.

Default: 256 Delta Sync updates, which is also the minimum value.

Fragments queue size

Shows the size of the queue, which is used to prepare a Delta Sync update before moving it to the Sending Queue.

Notes:

  • This queue must be smaller than the Sending Queue.
  • This queue must be significantly smaller than the Receiving Queue.

Default: 50 Delta Sync updates, which is also the minimum value.
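
The minimum values and the ordering rules for the three queues can be checked together before changing any of them. The following sketch assumes the three sizes are already known as integers. The function name check_queue_sizes is hypothetical. Because the text above does not quantify "significantly smaller", the sketch only verifies that the Fragments Queue is strictly smaller than the Receiving Queue.

```python
# Minimum (and default) sizes, in Delta Sync updates, as listed above.
MIN_SENDING = 512
MIN_RECEIVING = 256
MIN_FRAGMENTS = 50


def check_queue_sizes(sending, receiving, fragments):
    """Return a list of rule violations for the given queue sizes."""
    problems = []
    if sending < MIN_SENDING:
        problems.append("Sending Queue is below the minimum of 512")
    if receiving < MIN_RECEIVING:
        problems.append("Receiving Queue is below the minimum of 256")
    if fragments < MIN_FRAGMENTS:
        problems.append("Fragments Queue is below the minimum of 50")
    if fragments >= sending:
        problems.append("Fragments Queue must be smaller than the Sending Queue")
    if fragments >= receiving:
        # "Significantly smaller" is not quantified, so only strict order is checked.
        problems.append("Fragments Queue must be smaller than the Receiving Queue")
    return problems


print(check_queue_sizes(512, 256, 50))   # [] (the defaults satisfy every rule)
print(check_queue_sizes(512, 256, 300))  # one violation: Fragments vs Receiving
```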

The "Timers:" section

This section shows the Delta Sync timers.

Field Description
Delta Sync interval (ms)

Shows the interval at which this cluster member sends the Delta Sync updates from its Sending Queue.

The base time unit is 100 ms (or 1 tick).

Default: 100 ms, which is also the minimum value.
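
Because the base time unit is 100 ms (1 tick), an interval in milliseconds converts to ticks by integer division. The sketch below illustrates this. The function name interval_to_ticks is hypothetical, and rejecting values that are not whole multiples of 100 ms is an assumption made for the illustration, not a documented rule.

```python
TICK_MS = 100  # base time unit: 1 tick = 100 ms


def interval_to_ticks(interval_ms):
    """Convert a Delta Sync interval in milliseconds to ticks."""
    if interval_ms < TICK_MS:
        raise ValueError("interval is below the 100 ms minimum")
    if interval_ms % TICK_MS != 0:
        # Assumption: intervals are whole numbers of ticks.
        raise ValueError("interval is not a whole number of ticks")
    return interval_ms // TICK_MS


print(interval_to_ticks(100))  # 1
print(interval_to_ticks(300))  # 3
```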

The "Reset on XXX (triggered XXX)" section

Shows the date and the time of the last statistics reset.

In parentheses, it shows how the last statistics reset was triggered - manually, or by fullsync.


For R80.10 and lower

Refer to the ClusterXL Administration Guide (R55, R60, R61, R62, R65, R70, R71, R75, R75.20, R75.40, R75.40VS, R76, R77.x) - Chapter 'Monitoring and Troubleshooting Gateway Clusters' - 'Troubleshooting Synchronization'.

Example:

Sync Statistics (IDs of F&A Peers - 1 2 3 4 5 6 7 ):

Other Member Updates:
Sent retransmission requests...................  165
Avg missing updates per request................  1
Old or too-new arriving updates................  5661
Unsynced missing updates.......................  0
Lost sync connection (num of events)...........  4354
Timed out sync connection .....................  1

Local Updates:
Total generated updates .......................  9180670
Recv Retransmission requests...................  1073
Recv Duplicate Retrans request.................  2564

Blocking Events................................  0
Blocked packets................................  0
Max length of sending queue....................  4598
Avg length of sending queue....................  0
Hold Pkts events...............................  1
Unhold Pkt events..............................  1
Not held due to no members.....................  16
Max held duration (sync ticks).................  0
Avg held duration (sync ticks).................  11

Timers:
Sync tick (ms).................................  100
CPHA tick (ms).................................  100

Queues:
Sending queue size.............................  512
Receiving queue size...........................  256

Output section: Explanation and Limits

IDs of F&A Peers

The F&A (Flush and Ack) peers are the cluster members that this member recognizes as being part of the cluster. The IDs correspond to IDs and IP addresses shown by the 'cphaprob state' command.

Other Member Updates:

The statistics in this section relate to Delta Sync updates generated by other cluster members, or to Delta Sync updates that were not received from the other members. Updates inform about changes in the connections handled by the cluster member, and are sent from and to members. Updates are identified by sequence numbers.

Sent retransmission requests

The number of retransmission requests, which were sent by this member. Retransmission requests are sent when certain packets (with a specified sequence number) are missing, while the sending member already received updates with advanced sequences.

Limits: Has to be less than 30% of "Total generated updates" ON OTHER MEMBERS.

Avg missing updates per request

Each retransmission request can contain up to 32 missing consecutive sequences. The value of this field is the average number of requested sequences per retransmission request.

Limits: More than 20 can imply connectivity problems.

Old or too-new arriving updates

The number of arriving Delta Sync updates where the sequence number is too low, which implies it belongs to an old transmission, or too high, to the extent that it cannot belong to a new transmission.

Limits: Has to be less than 10% of "Total generated updates" ON THIS MEMBER. (Note: when several Sync networks are configured, this counter grows very fast because all Sync networks work in parallel.)

Unsynced missing updates

The number of missing Delta Sync updates, for which the receiving member stopped waiting. It stops waiting when the difference in sequence numbers between the newly arriving updates and the missing updates is larger than the length of the "Receiving Queue".

Limits: Should be 0 - less than 1% of "Total generated updates" is acceptable.

Lost sync connection (number of events)

The number of events, in which synchronization with another member was lost and regained due to either Security Policy installation on the other member, or a large difference between the expected and received sequence number.

During each policy installation, the Delta Sync mechanism is reinitialized on each member. During the reinitialization, this counter is increased by several (at most) on each member, because Sync is lost and regained.

Limits: In an ideal situation, should be 0. If the value keeps growing without policy installation, it indicates connectivity problems between the members.

Timed out sync connection

The number of events, in which the member declares another member as not connected. The member is considered as disconnected because no CCP packets with ACK were received from that member for a period of time (1 second), even though there are Flush and Ack packets being held for that member.

Limits: Should be 0 - a positive value indicates connectivity problems.

Local Updates:

The statistics in this section relate to Delta Sync updates generated by the local cluster member. Updates inform about changes in the connections handled by the cluster member, and are sent from and to members. Updates are identified by sequence numbers.

Total generated updates

The number of Delta Sync updates generated by the Sync mechanism since the statistics were last reset (with the 'cphaprob -reset syncstat' command). Its value is the same as the difference between the sequence number when applying the 'cphaprob -reset syncstat' command, and the current sequence number.

Limits: Can have any value.

Recv Retransmission requests

The number of received retransmission requests. A member requests retransmissions when it is missing specified packets with lower sequence numbers than the ones already received.

Limits: Should be less than 30% of "Total generated updates" ON THIS MEMBER.

Recv Duplicate Retrans request

The number of duplicated retransmission requests received by the member. Duplicate requests were already handled, and so are dropped.

Limits: Should be less than 30% of "Total generated updates" ON THIS MEMBER.

Blocking Events

Under extremely heavy load conditions, the cluster member may block new connections (refer to sk43896). This counter shows the number of times that the cluster member started blocking new connections due to Sync overload.

Limits: If the "Block New Connections" mechanism is enabled (per sk43896), then a positive value indicates heavy load.

Blocked packets

The number of packets that were blocked because the cluster member was blocking all new connections (see 'Blocking Events' above). The number of blocked packets is usually one packet per new connection attempt.

Limits: Higher than 5% of "Avg length of sending queue" can imply connectivity problems.

Max length of sending queue

The size of the Sending Queue is fixed and by default, it is 512 sync words. This size is controlled via kernel parameter fw_sync_sending_queue_size.

As newer Delta Sync updates with higher sequence numbers enter the queue, older Delta Sync updates with lower sequence numbers drop off the end of the queue. An older update could be dropped from the queue before the member receives an ACK about that Delta Sync update from all the other members.

This counter is the difference between the current Delta Sync sequence number and the last sequence number, for which the member received an ACK from all the other members.

Limits: If the "Block New Connections" mechanism is enabled (per sk43896), then should be less than "Sending queue size".

Avg length of sending queue

The average value of the 'Max length of sending queue', since last reboot or since the Sync statistics were reset.

Limits: If "Block New Connections" is enabled (per sk43896), then should be less than 80% of "Sending queue size".

Hold Pkts events

The number of events, where the Delta Sync update required Flush and Ack, and so was kept within the system until an ACK arrived from all the other functioning members.

Limits: Should be the same as "Unhold Pkt events".

Unhold Pkt events

The number of events, when the member received all the required ACKs from the other functioning members.

Limits: Should be the same as "Hold Pkts events".

Not held due to no members

The number of packets, which should have been held within the system, but were released because there were no other operating members.

Limits: Should be 0 - a positive value indicates a connectivity problem between the members.

Max held duration (sync ticks)

The maximal time in cluster ticks (1 tick equals 100 ms), for which a held packet was delayed in the system for Flush and Ack purposes.

Limits: Should be less than 50 - a positive value indicates a connectivity problem between the members.

Avg held duration (sync ticks)

The average duration in cluster ticks (1 tick equals 100 ms), for which the held packets were delayed within the system for Flush and Ack purposes.

Limits: Should be about the Round-Trip Time (RTT) of the Sync network. A larger value indicates a connectivity problem.

Timers:

The values in this section relate to internal timers that control Sync and cluster related actions.

Sync tick (ms)

Timer interval for Delta Sync operations. The value is controlled via kernel parameter fwha_timer_sync_res per sk41471.

Default value is 100 ms (minimal possible value).

CPHA tick (ms)

Timer interval for cluster operations (excluding Delta Sync). The value is controlled via kernel parameter fwha_timer_cpha_res per sk43872.

Default value is 100 ms (minimal possible value).

Queues:

The values in this section relate to the sizes of Delta Sync Queues.

Sending queue size

The Sending Queue on the cluster member stores locally generated Delta Sync updates. Updates in the Sending Queue are replaced by more recent updates. In a highly loaded cluster, updates are therefore kept for less time. If a member is asked to retransmit an update, it can only do so if the update is still in its Sending Queue.

Each member has one Sending Queue.

The value is controlled via kernel parameter fw_sync_sending_queue_size per sk82080.

Default value is 512 sync words (minimal possible value).

Receiving queue size

The Receiving Queue on the cluster member keeps the updates from each cluster member until it has received a complete sequence of updates.

Each member keeps a Receiving Queue for each of the peer members.

The value is controlled via kernel parameter fw_sync_recv_queue_size per sk82080.

Default value is 256 sync words (minimal possible value).
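
Several of the limits above are numeric and can be checked mechanically against the counters. The sketch below applies those limits to the values from the R80.10 example output. It is an illustration under stated assumptions: the function name violated_limits is hypothetical, the peer total of 9000000 is an invented value (the example shows only the local member), and limits that need outside knowledge (policy installations, RTT of the Sync network, whether "Block New Connections" is enabled) are left out.

```python
def violated_limits(stats, peer_total_generated):
    """Return the names of counters that break the numeric limits above."""
    total = stats["Total generated updates"]

    def at_least(value, fraction, base):
        # True when a non-zero value reaches the given fraction of base.
        return value > 0 and value >= fraction * base

    checks = {
        "Sent retransmission requests":
            at_least(stats["Sent retransmission requests"], 0.30, peer_total_generated),
        "Avg missing updates per request":
            stats["Avg missing updates per request"] > 20,
        "Old or too-new arriving updates":
            at_least(stats["Old or too-new arriving updates"], 0.10, total),
        "Unsynced missing updates":
            at_least(stats["Unsynced missing updates"], 0.01, total),
        "Timed out sync connection":
            stats["Timed out sync connection"] > 0,
        "Recv Retransmission requests":
            at_least(stats["Recv Retransmission requests"], 0.30, total),
        "Recv Duplicate Retrans request":
            at_least(stats["Recv Duplicate Retrans request"], 0.30, total),
        "Hold Pkts events":
            stats["Hold Pkts events"] != stats["Unhold Pkt events"],
        "Not held due to no members":
            stats["Not held due to no members"] > 0,
        "Max held duration (sync ticks)":
            stats["Max held duration (sync ticks)"] >= 50,
    }
    return [name for name, violated in checks.items() if violated]


# Counter values copied from the example output above.
example = {
    "Sent retransmission requests": 165,
    "Avg missing updates per request": 1,
    "Old or too-new arriving updates": 5661,
    "Unsynced missing updates": 0,
    "Timed out sync connection": 1,
    "Total generated updates": 9180670,
    "Recv Retransmission requests": 1073,
    "Recv Duplicate Retrans request": 2564,
    "Hold Pkts events": 1,
    "Unhold Pkt events": 1,
    "Not held due to no members": 16,
    "Max held duration (sync ticks)": 0,
}

# 9000000 is a hypothetical "Total generated updates" value for the peers.
print(violated_limits(example, peer_total_generated=9000000))
# ['Timed out sync connection', 'Not held due to no members']
```

For the example output, only the two counters that "should be 0" are flagged. The percentage-based limits are all met by a wide margin.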

