High CPU, high IOWait utilization on random CPUs, and delayed CLI outputs on various commands
Symptoms
  • The Security Gateway is under high load. The output of the top command on the Security Gateway shows that at least one CPU core is at 100% load.
  • There is a high IOWait utilization on random CPUs.
  • There are delayed CLI outputs on various commands.
    Example: fw stat.
  • Notes:
    • The Security Gateway runs Linux kernel 3.10.
    • The Security Gateway uses RAID.
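One possible way to confirm the per-CPU IOWait symptom is to compare two snapshots of /proc/stat. The sketch below is illustrative only and is not part of the official procedure; the function name iowait_delta and the snapshot file paths are hypothetical. It assumes the standard Linux /proc/stat layout, in which the fifth value on each cpuN line is the iowait counter.

```shell
#!/bin/sh
# Print the iowait share (percent) per CPU between two /proc/stat snapshots.
# Columns on a "cpuN" line: user nice system idle iowait irq softirq steal ...
# iowait_delta is a hypothetical helper name, not a Check Point command.
iowait_delta() {
    # $1 = earlier snapshot file, $2 = later snapshot file
    awk '
        /^cpu[0-9]+ / {
            total = 0
            # Sum user..steal (fields 2-9); guest time is already part of user.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            if (FNR == NR) { t0[$1] = total; w0[$1] = $6; next }
            dt = total - t0[$1]
            if (dt > 0) printf "%s %.1f\n", $1, 100 * ($6 - w0[$1]) / dt
        }
    ' "$1" "$2"
}
```

Example usage (the file names are placeholders): run cat /proc/stat > /tmp/stat.a, wait a few seconds, run cat /proc/stat > /tmp/stat.b, then call iowait_delta /tmp/stat.a /tmp/stat.b. A high value on cores that change from one run to the next would match the symptom described above.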
Cause

When CPU consumption is constantly high, no resources are left to run the scheduled RCU (Read-Copy-Update) callbacks, so RCU processing stalls.

RAID uses RCU in a critical section that holds the file system. When RCU callbacks stall, this section is not released, and it blocks attempts to access the disk. For example, a process that needs to read data from the registry is delayed.

Once the CPU consumption is reduced, the system is able to run RCU callbacks quickly enough and the RAID critical section is released without delay.


Solution
The Check Point kernel provides the following parameter to allow CPU rescheduling even under heavy load: kiss_kthread_allow_resched (default value 0, disabled).


Instructions for Security Gateway R80.30 (kernel 3.10) and above

To force the rescheduling mechanism, permanently set the kernel parameter kiss_kthread_allow_resched to 1 on each relevant Security Gateway / cluster member, as described in sk26202 (Changing the kernel global parameters for Check Point Security Gateway):

  1. Set the parameter on the fly and make it persistent:

    [Expert@HostName]# fw ctl set -f int kiss_kthread_allow_resched 1
  2. Confirm that the parameter appears in the $FWDIR/boot/modules/fwkern.conf file:

    [Expert@HostName]# cat $FWDIR/boot/modules/fwkern.conf
  3. Verify that the running value is 1:

    [Expert@HostName]# fw ctl get int kiss_kthread_allow_resched
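Step 2 can also be scripted rather than checked by eye. The following is a minimal sketch, not an official tool: the function name resched_is_persisted is hypothetical, and it assumes that fwkern.conf holds one name=value entry per line. Adjust the pattern if the entry is written differently on your system.

```shell
#!/bin/sh
# Return 0 if the given fwkern.conf-style file sets the parameter to 1,
# 1 if it does not, and 2 if the file cannot be read.
# Assumption: entries are written one per line as "name=value".
# resched_is_persisted is a hypothetical helper name.
resched_is_persisted() {
    conf_file=$1
    [ -r "$conf_file" ] || return 2
    grep -Eq '^[[:space:]]*kiss_kthread_allow_resched[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf_file"
}
```

Example usage in Expert mode: resched_is_persisted "$FWDIR/boot/modules/fwkern.conf" && echo "persistent value is set". This checks only the boot-time configuration; the running value is still confirmed with the fw ctl get int command in step 3.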
This solution has been verified for the specific scenario, described by the combination of Product, Version and Symptoms. It may not work in other scenarios.
