Cluster status change notifications causing redundant ADLOG reconf leading to AD Query outage
Symptoms
  • Security Gateway / Cluster loses connection to the Domain Controller (DC) randomly.
    As a result, all users are disconnected from the Internet and cannot work.

  • Output of 'adlog a dc' command during the issue shows:
    Domain controllers:
    Domain Name    IP Address    Connection state    Events in the last hour
    ========================================================================
    Ignored domain controllers on this gateway:
    No ignored domain controllers found.
    
  • Domain Controller (DC) is up and running.
    Pings from Security Gateway / Cluster to DC during the issue are passing without losses.

  • Restarting the PDP and PEP processes does not help - this random issue persists.

  • Debug of the PDP daemon ("pdp debug set all all") repeatedly shows, in the $FWDIR/log/pdpd.elg file, that the ADLOG reconf starts but is then interrupted because a new reconf is triggered:
    ...............
    [ADQUERY(TD::Critical)] virtual void pdp::ADEvents::notifyClusterStatus(const NAC::IS::IClusterXLStatusObserver::STATUS&): Received cluster status change. calling internal reconf
    [ADQUERY(TD::Events)] void pdp::ADEvents::reconf(): Running AD Query engine - calling reconf
    [ADLOG(TD::Events)] virtual bool ADLOG::AdLogExtractor::reconf(): called in main thread ID: ...
    [ADLOG(TD::Events)] virtual bool ADLOG::AdLogExtractor::reconf(): reconf is delayed for 5 seconds from now..
    ...............
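
To gauge how often this pattern occurs, the debug messages quoted above can be counted in a copy of the log. The following is a minimal sketch, not a Check Point tool: the function name and the result keys are hypothetical, and the only strings it matches are the ones shown in the excerpt above.

```python
import re

# Substrings taken verbatim from the pdpd.elg excerpt above.
TRIGGER = "Received cluster status change. calling internal reconf"
DELAYED = re.compile(r"reconf is delayed for (\d+) seconds")


def summarize_reconf_activity(lines):
    """Count cluster-status-triggered reconfs and delayed-reconf messages.

    `lines` is any iterable of log lines, e.g. an open file object.
    The function name and the result keys are illustrative only.
    """
    triggers = 0
    delayed = 0
    for line in lines:
        if TRIGGER in line:
            triggers += 1
        if DELAYED.search(line):
            delayed += 1
    return {"cluster_status_triggers": triggers, "delayed_reconfs": delayed}


# The log lines from the excerpt above, used as sample input.
sample = [
    "[ADQUERY(TD::Critical)] virtual void pdp::ADEvents::notifyClusterStatus(const NAC::IS::IClusterXLStatusObserver::STATUS&): Received cluster status change. calling internal reconf",
    "[ADQUERY(TD::Events)] void pdp::ADEvents::reconf(): Running AD Query engine - calling reconf",
    "[ADLOG(TD::Events)] virtual bool ADLOG::AdLogExtractor::reconf(): called in main thread ID: ...",
    "[ADLOG(TD::Events)] virtual bool ADLOG::AdLogExtractor::reconf(): reconf is delayed for 5 seconds from now..",
]
summary = summarize_reconf_activity(sample)
print(summary)
```

To run it against a real log, pass an open file object for a copy of pdpd.elg instead of `sample`. A large trigger count over a short capture window is consistent with the symptom described here, but the sketch does not prove the cause on its own.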
    
Cause

When running AD Query in a cluster environment, only the Active member should run ADLOG. Each time the AD Query process receives a cluster state change notification, it either stops ADLOG or performs a reconf, depending on the state of the local member. In many cases this action is redundant, because the state change comes from another cluster member and the state of the local member has not changed.

A rapid flow of such notifications (e.g., because of a flapping interface) can restart the reconf again and again so that it never completes, leading to an AD Query outage.
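
The mechanism can be illustrated with a small simulation. This is a simplified model under stated assumptions, not Check Point's implementation or fix: it assumes every reconf trigger cancels a pending reconf and schedules a new one 5 seconds later (the delay shown in the debug output), and the member names, function names, and the `filter_redundant` switch are all hypothetical.

```python
RECONF_DELAY = 5.0  # seconds; from "reconf is delayed for 5 seconds" in the debug output


def reconf_triggers(events, local_member, filter_redundant):
    """Return timestamps of notifications that trigger a reconf.

    events: (timestamp, member, new_state) tuples sorted by timestamp.
    With filter_redundant=True, only a real state change of the local
    member triggers a reconf (an assumed mitigation, for comparison only).
    """
    last_state = {}
    triggers = []
    for timestamp, member, state in events:
        changed_locally = member == local_member and last_state.get(member) != state
        last_state[member] = state
        if not filter_redundant or changed_locally:
            triggers.append(timestamp)
    return triggers


def completed_reconfs(triggers, delay=RECONF_DELAY):
    """Return completion times of reconfs that were not pre-empted.

    Assumed model: a trigger arriving before the pending reconf's
    completion time cancels it and schedules a new one.
    """
    done = []
    for i, start in enumerate(triggers):
        finish = start + delay
        next_trigger = triggers[i + 1] if i + 1 < len(triggers) else None
        if next_trigger is None or next_trigger >= finish:
            done.append(finish)
    return done


# Local member stays Active; the peer flaps every 2 seconds for one minute.
events = [(0.0, "member_a", "Active")]
events += [
    (float(t), "member_b", "Down" if (t // 2) % 2 else "Standby")
    for t in range(2, 62, 2)
]

naive = completed_reconfs(reconf_triggers(events, "member_a", filter_redundant=False))
filtered = completed_reconfs(reconf_triggers(events, "member_a", filter_redundant=True))
print(naive)     # [65.0]
print(filtered)  # [5.0]
```

In this model, reacting to every notification means no reconf completes until 5 seconds after the peer stops flapping, while ignoring notifications that do not change the local member's state lets the first reconf complete at the 5-second mark. If the flapping never stops, the unfiltered case never completes a reconf.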

