Full Synchronization in cluster might fail due to insufficient buffer size - cluster member would be in 'Down' state because 'Synchronization' device reports its state as 'problem'
Symptoms
  • When one of the cluster members goes down while users are logged on, that member sometimes cannot return to either Standby or Active state.

  • The output of the cphaprob list command shows that the "Synchronization" device reports its state as "problem".

Cause

There is a problem with the Full Sync operation:

During a Full Sync process, the Server machine goes over the kernel tables marked for sync and synchronizes their data bucket by bucket (for hash tables). For each bucket, it walks all nodes in the linked list that the bucket points to and copies their data into a buffer that is sent to user space.

The problem: if the Server machine fails to copy one of the nodes because the buffer is not big enough, it tries again and again to copy the same nodes into the buffer. That is, each time it goes back to the first node in the current bucket and copies the linked list into the buffer from the beginning, in the same order as before. The node sizes have not changed and the buffer is still too small, so the copy fails again when the buffer reaches its limit, and this repeats indefinitely.
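
The retry behavior described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the actual kernel code: the names Node, make_bucket, copy_bucket and sync_bucket_with_retries are hypothetical, and max_attempts is an artificial guard added so that the demonstration terminates (the behavior described above has no such limit).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One entry in a hash bucket's linked list (hypothetical model)."""
    payload: bytes
    next: Optional["Node"] = None


def make_bucket(sizes):
    """Build a linked list whose nodes carry payloads of the given sizes."""
    head = None
    for size in reversed(sizes):
        head = Node(b"x" * size, head)
    return head


def copy_bucket(head, buffer_size):
    """Copy a bucket into a fixed-size buffer, always starting at the head.

    Returns (bytes_copied, completed). The copy stops at the first node
    that does not fit into the remaining buffer space.
    """
    used = 0
    node = head
    while node is not None:
        if used + len(node.payload) > buffer_size:
            return used, False
        used += len(node.payload)
        node = node.next
    return used, True


def sync_bucket_with_retries(head, buffer_size, max_attempts):
    """Retry the copy from the head of the bucket with an unchanged buffer.

    max_attempts exists only so that this demo terminates.
    """
    attempts = []
    for _ in range(max_attempts):
        used, done = copy_bucket(head, buffer_size)
        attempts.append(used)
        if done:
            return True, attempts
    return False, attempts


# Three 40-byte nodes need 120 bytes, but the buffer holds only 100.
bucket = make_bucket([40, 40, 40])
ok, attempts = sync_bucket_with_retries(bucket, buffer_size=100, max_attempts=5)
print(ok, attempts)
```

In this model every attempt stops at the same node after copying the same number of bytes, because neither the node sizes nor the buffer size changes between attempts. Without the artificial attempt limit, the loop would never finish.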

