Performance analysis for Security Gateway NGX R65 / R7x
Solution

In addition, refer to sk98348 - Best Practices - Security Gateway Performance.

 

The following commands and files can be used to monitor and troubleshoot the performance of the Security Gateway.

 

Performance problems on a Security Gateway can be divided into the following categories:

  • Problems with CPU on Security Gateway
  • Problems with memory on Security Gateway
  • Problems with traffic that is passing through Security Gateway
  • Problems with acceleration on Security Gateway

 


 

Problems with CPU on Security Gateway

 

Key words:
--- load
--- softirq
--- interrupts

Commands:

(A) cpstat -f cpu os

  • displays internal statistics for OS about CPU as collected by Check Point
  • helpful in monitoring CPU utilization

 

(B) cpstat -f multi_cpu os

  • displays internal statistics for OS about all CPUs as collected by Check Point
  • helpful in monitoring CPU utilization

 

(C) top

  • displays dynamic real-time view of a running system on Linux
  • helpful in monitoring different aspects of CPU utilization
  • look at the amount of "Idle"
  • look at the load in "User Space"
  • look at the load in "System (kernel) Space"
  • look at the amount of "SoftIRQ"
  • look at the amount of "IOwait"
  • collect this output continuously during the problem
  • output differs between Linux kernel 2.4 and Linux kernel 2.6 (press 1 to show each CPU core separately and Shift+W to save this configuration)
  • man page - http://linux.die.net/man/1/top


(D) ps auxwf

  • displays information about the current processes (daemons)
  • helpful in detecting problems in User Space (memory, CPU)
  • look at the amount of "CPU", "MEM", "VSZ", "RSS", "TIME" consumed by the daemons
  • collect this output over a period of time to see the trend of memory consumption
  • man page - http://linux.die.net/man/1/ps

 

(E) cat /proc/interrupts

  • displays the number of interrupts for each IRQ
  • helpful in monitoring interrupts on CPU cores from different devices (mostly, NICs)
  • verify that the interfaces do not share the same IRQ number, which is problematic with affinity
  • man page - http://linux.die.net/man/5/proc
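
The shared-IRQ check described above can be sketched in Python. This is only an illustration: the function name `shared_irqs` and the sample text are hypothetical, and the parser assumes the older layout in which each line has one counter per CPU, a single interrupt-controller column, and a comma-separated device list.

```python
def shared_irqs(interrupts_text):
    """Return {irq: [devices]} for every IRQ line that lists more than one device."""
    lines = interrupts_text.strip().splitlines()
    n_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        # Assumed layout: <one counter per CPU> <controller> <device>[, <device>...]
        if len(fields) <= n_cpus + 1:
            continue                        # rows such as NMI/ERR have no device column
        devices = [d.strip() for d in " ".join(fields[n_cpus + 1:]).split(",")]
        if len(devices) > 1:
            shared[irq.strip()] = devices
    return shared


# Illustrative sample, not taken from a real gateway.
SAMPLE = """\
           CPU0       CPU1
  0:     500000          0    IO-APIC-edge  timer
 16:       1200       3400   IO-APIC-level  eth0, eth3
 17:       8000          0   IO-APIC-level  eth1
NMI:          0          0
"""

print(shared_irqs(SAMPLE))   # {'16': ['eth0', 'eth3']}
```

On a gateway, the text could be read from /proc/interrupts instead of the sample string.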

 

(F) vmstat X [Y]

  • displays information about processes, memory, paging, block IO, and CPU activity
  • helpful in monitoring different aspects of CPU utilization and memory utilization
  • look at the "procs" section - counter "r" (number of processes waiting for CPU)
  • look at the "memory" section - all counters
  • look at the "swap" section - reading "si" and writing "so" in swap file
  • look at the "io" section - reading "bi" and writing "bo" on hard disk
  • look at the "system" section at "cs" (number of Context Switches)
  • look at the "cpu" section - all counters
  • collect this output continuously during the problem
  • man page - http://linux.die.net/man/8/vmstat


(G) cat /proc/cpuinfo

  • displays a collection of CPU and system architecture dependent items about CPU
  • helpful in collecting information about CPU cores (architecture, vendor, number)
  • multi-CPU (SMP) machines will show information for each CPU
  • man page - http://linux.die.net/man/5/proc

 

(H) dmesg

  • displays boot messages and messages from various FireWall mechanisms
  • helpful in detecting problems in kernel and in execution of functions
  • man page - http://linux.die.net/man/8/dmesg

Recommendations:
When using SecureXL, disable Hyper-Threading in the BIOS (on Check Point appliances it is disabled by default). This applies to Intel processors prior to "Intel Nehalem (Core i7)", where this technology was improved (called Simultaneous Multi-Threading).


Related Solution: sk36846 (Interpreting high SoftIRQ values)

 


 

Problems with memory on Security Gateway

 

Key words:
--- memory
--- swap

Note: Refer to sk22343 (What is the maximum memory supported by SecurePlatform) and sk71001 (High Connection Capacity (64-bit) on Gaia)

Commands:

(A) cpstat -f memory os

  • displays internal statistics for OS about memory as collected by Check Point
  • helpful in monitoring memory utilization

 

(B) fw ctl pstat

  • displays FireWall internal statistics about memory and traffic.
  • helpful in monitoring memory utilization, traffic counters, ClusterXL Sync counters.
  • no single field that indicates a problem - need to interpret all counters together.
  • collect the output before and after the suspected problem.
  • use different flags to get more data (fw ctl pstat -flag)
    - 'h' for HMEM
    - 's' for SMEM
    - 'k' for KMEM
    - 'l' for Handles (kbufs)
  • counters are reset when Check Point services are stopped.
  • under memory, the "allocations" counter always grows and may wrap around.
  • HMEM
    - failures under HMEM do not indicate a real memory problem; they only mean that HMEM is full and should have been configured larger.
    - "failed allocations" under HMEM (only) do not indicate any problem.
  • SMEM
    - failures under SMEM mean that the Check Point memory limit was reached, OS memory was exhausted, or a large non-sleep allocation was requested; they indicate some shortage.
    - "failed allocations" under SMEM do not necessarily mean that a user's allocation failed; an HMEM extension may have failed instead.
    - "failed free" under SMEM means an overrun or the freeing of an invalid pointer, which indicates a bug.
  • KMEM
    - failures under KMEM mean that an application asked for memory and could not get it; usually, this is a memory problem.
    - "failed allocations" under KMEM mean that the application did not get memory.

 

(C) fw tab -t connections -s

  • displays summary about connections in Connections Table.
  • helpful in monitoring the amount of concurrent connections.
  • collect the output several times to see how fast the #VALS counter changes.
  • calculate the ratio of the #SLINKS counter to the #VALS counter (a ratio greater than 4-5 indicates a problem).
  • compare the #PEAK counter to the limit of Connections Table (fw tab -t connections | head -n 3 | grep limit).
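
The two checks above (the #SLINKS to #VALS ratio and #PEAK against the table limit) can be sketched as follows. The function name `connections_summary`, the sample numbers, and the limit of 25000 are hypothetical; the sketch assumes the usual two-line summary with a header row containing #VALS, #PEAK and #SLINKS.

```python
def connections_summary(output, table_limit=None):
    """Parse the 'fw tab -t connections -s' summary and derive simple health ratios."""
    lines = [line for line in output.splitlines() if line.strip()]
    row = dict(zip(lines[0].split(), lines[1].split()))   # header name -> value
    vals, peak, slinks = (int(row[key]) for key in ("#VALS", "#PEAK", "#SLINKS"))
    summary = {
        "vals": vals,
        "peak": peak,
        "slinks": slinks,
        # per the text above, a ratio greater than 4-5 indicates a problem
        "slinks_per_val": slinks / vals if vals else 0.0,
    }
    if table_limit:
        summary["peak_usage"] = peak / table_limit
    return summary


# Illustrative sample, not taken from a real gateway.
SAMPLE = """\
HOST                  NAME                               ID #VALS #PEAK #SLINKS
localhost             connections                      8158 12000 15000   36000
"""

result = connections_summary(SAMPLE, table_limit=25000)
print(result["slinks_per_val"], result["peak_usage"])   # 3.0 0.6
```

Running the same parse on several consecutive samples shows how quickly #VALS changes.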

 

 

(D) cat /proc/meminfo


(E) cat /proc/slabinfo


(F) vmstat X [Y]

  • displays information about processes, memory, paging, block IO, and CPU activity.
  • helpful in monitoring different aspects of CPU utilization and memory utilization.
  • look at the "procs" section - counter "r" (number of processes waiting for CPU).
  • look at the "memory" section - all counters.
  • look at the "swap" section - reading "si" and writing "so" in swap file.
  • look at the "io" section - reading "bi" and writing "bo" on hard disk.
  • look at the "system" section at "cs" (number of Context Switches).
  • look at the "cpu" section - all counters.
  • collect this output continuously during the problem.
  • man page - http://linux.die.net/man/8/vmstat


(G) dmesg

  • displays boot messages and messages from various FireWall mechanisms.
  • helpful in detecting problems in kernel and in execution of functions.
  • man page - http://linux.die.net/man/8/dmesg

Recommendations:
Decrease the timeouts for TCP and UDP in
SmartDashboard -> 'Policy' menu -> Global Properties -> Stateful Inspection:

  • decrease "TCP end timeout"
  • decrease "UDP virtual session timeout"

(refer to the Performance Pack Administration Guide - chapter "Performance Tuning and Measurement Hints" - "Performance Tuning" - "Amount of Concurrent Connections and Hash Size" - "Increasing the Number of Concurrent Connections")


Formula:
[maximum number of concurrent connections] = [session establishment rate] x [TCP end timeout]

 

Note: decreasing the 'TCP session timeout' also allows the number of concurrent connections to be increased.
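
A short worked example of the formula, as a sketch only. The function name and the figures (2000 sessions per second, timeouts of 20 and 5 seconds) are hypothetical and chosen only to show how the timeout scales the result.

```python
def max_concurrent_connections(session_rate_per_sec, tcp_end_timeout_sec):
    """[maximum number of concurrent connections] = [session establishment rate] x [TCP end timeout]"""
    return session_rate_per_sec * tcp_end_timeout_sec


# Hypothetical figures, for illustration only.
print(max_concurrent_connections(2000, 20))   # 40000
print(max_concurrent_connections(2000, 5))    # 10000
```

At the same session establishment rate, a timeout that is four times shorter gives a result that is four times smaller.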


 


 

 

Problems with traffic that is passing through Security Gateway

 

Key words:
--- drops
--- rate
--- throughput
--- latency
--- NIC receiving buffer
--- NIC sending buffer
--- NIC RX buffer
--- NIC TX buffer
--- RX-DRP
--- ring size
--- traffic blend

 


 

Commands:

(A) netstat -ni

  • displays a table of all network interfaces.
  • helpful in monitoring the amount of traffic and the drops on NICs.
  • look at "RX-ERR", "RX-DRP", "RX-OVR" and "TX-ERR", "TX-DRP", "TX-OVR".
  • man page - http://linux.die.net/man/8/netstat
  • the 'RX-OK' and 'TX-OK' columns show how many packets have been received (RX) or transmitted (TX) error-free.
  • the 'RX-ERR' and 'TX-ERR' columns show how many packets have been received (RX) or transmitted (TX) damaged.
  • the 'RX-DRP' and 'TX-DRP' columns show how many received packets (RX) and transmitted packets (TX) have been dropped.
  • the 'RX-OVR' and 'TX-OVR' columns show how many received packets (RX) and transmitted packets (TX) have been lost because of an overrun.
    RX-OVR = the number of times the receiver hardware was unable to hand received data to a hardware buffer: the internal FIFO buffer of the chip is full, but it still tries to handle incoming traffic; most likely, the input rate of traffic exceeded the ability of the receiver to handle the data.
  • the 'Flg' column shows the flags that have been set for this interface - these characters are one-character versions of the long flag names that are displayed in the output of 'ifconfig' command:
    A = this interface will receive all Multicast addresses
    B = a Broadcast address has been set
    D = debugging is turned on
    L = this interface is a loopback device
    M = all packets are received (promiscuous mode)
    m = master
    N = trailers are avoided
    O = ARP is turned off for this interface
    P = this is a Point-to-Point connection
    R = interface is running
    s = slave
    U = interface is up
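
To compare drops between interfaces, the RX-DRP column can be related to RX-OK. The following is a minimal sketch: the function name `rx_drop_percent` and the sample table are hypothetical, and the percentage is only a rough indicator (RX-DRP divided by RX-OK), not a Check Point metric.

```python
def rx_drop_percent(netstat_ni_output):
    """Return {interface: RX-DRP as a percentage of RX-OK} from 'netstat -ni' text."""
    lines = netstat_ni_output.splitlines()
    # Locate the column header row, which starts with "Iface".
    start = next(i for i, line in enumerate(lines) if line.split()[:1] == ["Iface"])
    header = lines[start].split()
    result = {}
    for line in lines[start + 1:]:
        values = line.split()
        if len(values) != len(header):
            continue                        # skip blank or malformed rows
        row = dict(zip(header, values))
        rx_ok, rx_drp = int(row["RX-OK"]), int(row["RX-DRP"])
        result[row["Iface"]] = 100.0 * rx_drp / rx_ok if rx_ok else 0.0
    return result


# Illustrative sample, not taken from a real gateway.
SAMPLE = """\
Kernel Interface table
Iface   MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0  1000000      0   2500      0   900000      0      0      0 BMRU
lo    16436   0     5000      0      0      0     5000      0      0      0 LRU
"""

print(rx_drop_percent(SAMPLE))   # {'eth0': 0.25, 'lo': 0.0}
```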

 

(B) ifconfig IF_NAME

  • displays the status of the currently active interfaces.
  • helpful in monitoring the amount of traffic and the drops on NICs.
  • look at "errors", "dropped", "overruns", "frame", "carrier".
  • man page - http://linux.die.net/man/8/ifconfig

 

(C) /var/log/messages*

  • displays the OS log.
  • helpful in overall monitoring of the system.
  • check for relevant messages about interfaces, links, any abnormal messages.

 

(D) dmesg

  • displays boot messages and messages from various FireWall mechanisms.
  • helpful in detecting problems in kernel and in execution of functions.
  • man page - http://linux.die.net/man/8/dmesg

 

(E) netstat -an

  • displays both listening and non-listening sockets.
  • helpful in monitoring queue of incoming traffic to specific application and outgoing traffic from specific application.
  • under "Active Internet connections" look at "Recv-Q" and at "Send-Q":
    - 'Recv-Q' is the data (in bytes) that has not yet been pulled from the socket buffer by the application (the value should be as close to 0 as possible).
    - 'Send-Q' is the data (in bytes) that the sending application has given to the transport, but that has yet to be ACKnowledged by the receiving TCP (the value should be as close to 0 as possible; a large number may indicate a network bottleneck).
  • man page - http://linux.die.net/man/8/netstat
    [Expert@FW]# netstat -anp
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
    tcp        0   2368 172.30.20.69:22         172.30.20.228:3676      ESTABLISHED 16179/0
    udp   256956      0 0.0.0.0:9282            0.0.0.0:*
    [Expert@FW]#
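
The queue check can be automated with a small filter such as the sketch below. The function name `busy_sockets` and the `threshold` parameter are hypothetical; the two socket lines are the ones from the output above.

```python
def busy_sockets(netstat_output, threshold=0):
    """Return (proto, local address, Recv-Q, Send-Q) for sockets whose queues exceed threshold."""
    busy = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Socket rows start with the protocol, followed by Recv-Q and Send-Q.
        if len(fields) < 5 or fields[0] not in ("tcp", "udp"):
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        if recv_q > threshold or send_q > threshold:
            busy.append((fields[0], fields[3], recv_q, send_q))
    return busy


# The two socket lines from the example output above.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0   2368 172.30.20.69:22         172.30.20.228:3676      ESTABLISHED 16179/0
udp   256956      0 0.0.0.0:9282            0.0.0.0:*
"""

for entry in busy_sockets(SAMPLE):
    print(entry)
```

Both sockets are reported here: the TCP socket has data waiting in Send-Q, and the UDP socket has a large Recv-Q.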
    

 

(F) netstat -s

  • displays summary statistics for each protocol.
  • helpful in monitoring traffic.
  • under "Ip" look at "incoming packets discarded".
  • under "Icmp" look at "ICMP messages failed".
  • under "Tcp" look at "bad segments received".
  • under "Udp" look at "packet receive errors".
  • man page - http://linux.die.net/man/8/netstat

 

(G) fw ctl pstat

  • displays FireWall internal statistics about memory and traffic
  • helpful in monitoring memory utilization, traffic counters, ClusterXL Sync counters
  • no single field that indicates a problem - need to interpret all counters together
  • collect the output before and after the suspected problem
  • use different flags to get more data (fw ctl pstat -flag)
    - 'h' for HMEM
    - 's' for SMEM
    - 'k' for KMEM
    - 'l' for Handles (kbufs)
  • counters are reset when Check Point services are stopped
  • under memory, the "allocations" counter always grows and may wrap around
  • HMEM
    - failures under HMEM do not indicate a real memory problem; they only mean that HMEM is full and should have been configured larger
    - "failed allocations" under HMEM (only) do not indicate any problem
  • SMEM
    - failures under SMEM mean that the Check Point memory limit was reached, OS memory was exhausted, or a large non-sleep allocation was requested; they indicate some shortage
    - "failed allocations" under SMEM do not necessarily mean that a user's allocation failed; an HMEM extension may have failed instead
    - "failed free" under SMEM means an overrun or the freeing of an invalid pointer, which indicates a bug
  • KMEM
    - failures under KMEM mean that an application asked for memory and could not get it; usually, this is a memory problem
    - "failed allocations" under KMEM mean that the application did not get memory
  • under "Connections" look at "W total, X TCP, Y UDP, Z ICMP" to understand the traffic blend
  • under "Fragments" look at "duplicates" (attack, or simple duplicate) and "failures" (failed due to lack of resources)
  • on ClusterXL members refer to "Sync" section (refer to ClusterXL Administration Guide for explanations)

 

(H) fw tab -t connections -s

  • displays summary about connections in Connections Table
  • helpful in monitoring the amount of concurrent connections
  • collect the output several times to see how fast the #VALS counter changes
  • calculate the ratio of the #SLINKS counter to the #VALS counter (a ratio greater than 4-5 indicates a problem)
  • compare the #PEAK counter to the limit of Connections Table (fw tab -t connections | head -n 3 | grep limit)

 

 

 

(I) arp -an | wc -l

 

(J) ethtool IF_NAME

  • displays Ethernet card settings.
  • helpful in collecting the data about the interface's speed, duplex, link.
  • check every line of the output.
  • man page - http://linux.die.net/man/8/ethtool

 

(K) ethtool -i IF_NAME

  • displays information about associated driver.
  • helpful in detecting problems with current NIC driver.
  • use driver with NAPI.
  • use the latest version of the driver.
  • man page - http://linux.die.net/man/8/ethtool

 

(L) ethtool -S IF_NAME

  • displays NIC-specific and driver-specific statistics.
  • helpful in monitoring traffic through this NIC.
  • check every line that contains "error", "drop", "buffer", "fail".
  • man page - http://linux.die.net/man/8/ethtool
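
Because the output of 'ethtool -S' can be long, the keyword check above can be sketched as a filter that keeps only non-zero counters. The function name `suspicious_counters` is hypothetical, and the counter names in the sample are illustrative only; real names depend on the NIC driver.

```python
KEYWORDS = ("error", "drop", "buffer", "fail")


def suspicious_counters(ethtool_output):
    """Return {counter: value} for non-zero counters whose name contains a keyword."""
    found = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if not sep or not value.isdigit():
            continue                        # skips the "NIC statistics:" title line
        if int(value) > 0 and any(word in name.lower() for word in KEYWORDS):
            found[name] = int(value)
    return found


# Illustrative sample; counter names differ between drivers.
SAMPLE = """\
NIC statistics:
     rx_packets: 1843201
     tx_packets: 1502233
     rx_errors: 0
     rx_dropped: 0
     rx_no_buffer_count: 317
     rx_missed_errors: 42
     tx_timeout_count: 0
"""

print(suspicious_counters(SAMPLE))   # {'rx_no_buffer_count': 317, 'rx_missed_errors': 42}
```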

 

(M) ethtool -g IF_NAME

 

(N) fwaccel stat

 

(O) fw ctl multik stat

  • displays status of CoreXL instances and summary for traffic that passes through each instance (current number and peak number of concurrent connections).
  • helpful in detecting problems with CoreXL and with traffic that went through each instance.

 

(P) fgate stat

  • displays status of FloodGate-1 and summary for traffic that passed through QoS.
  • helpful in detecting problems with FloodGate-1 and with traffic that went through QoS.

 

(Q) ps auxwf

  • displays information about the current processes (daemons).
  • helpful in detecting problems in User Space (memory, CPU).
  • look at the amount of "CPU", "MEM", "VSZ", "RSS", "TIME" consumed by the daemons.
  • collect this output over a period of time to see the trend of memory consumption.
  • man page - http://linux.die.net/man/1/ps

 

(R) cat /proc/interrupts

  • displays the number of interrupts for each IRQ.
  • helpful in monitoring interrupts on CPU cores from different devices (mostly NICs).
  • verify that the interfaces do not share the same IRQ number, which is problematic with affinity.
  • man page - http://linux.die.net/man/5/proc

 

(S) WireShark - Statistics menu

  • displays different statistics about traffic
  • helpful in analyzing the traffic
  • open the traffic capture file and use different options
    - Summary
    - Protocol Hierarchy
    - Conversations
    - Endpoints
    - IO Graphs
    - Flow graph
  • www.wireshark.org

Recommendations:
- use PCI-E/PCIe (PCI Express) interface cards instead of PCI-X
- use interface driver with NAPI


Formula:
[maximum number of concurrent connections] = [session establishment rate] x [TCP end timeout]



 


 

 

Problems with acceleration on Security Gateway

 

Key words:
--- acceleration
--- acceleration path (SecureXL processes the packet)
--- medium path (SecureXL processes the packet except when IPS is required)
--- Firewall path (SecureXL is not able to process the packet, and passes it to CoreXL)
--- forwarded traffic

Commands:

(A) fwaccel stat

 

(B) fwaccel stats

  • displays statistics for SecureXL.
  • helpful in detecting problems with non-accelerated traffic.
  • calculate the ratio of "F2F" counter to "Accelerated" counter (the lower the better).
  • check the statistics on the device (use 'fwaccel stats -s').
  • check for "dropped" traffic (use 'fwaccel stats -d' in versions R70 and higher).
  • check "TCP violations" counter.
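
The ratio check can be written as a small helper. This is a sketch: the function name `f2f_ratio` and the packet counts are hypothetical, and the two values are assumed to have been read manually from the "F2F" and "Accelerated" counters.

```python
def f2f_ratio(f2f_packets, accelerated_packets):
    """Ratio of F2F packets to accelerated packets (the lower the better)."""
    if accelerated_packets == 0:
        # Nothing was accelerated: the ratio is unbounded if any packet was F2F.
        return float("inf") if f2f_packets else 0.0
    return f2f_packets / accelerated_packets


# Hypothetical counter values, for illustration only.
print(f2f_ratio(2000, 8000))   # 0.25
```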


(C) fwaccel conns

  • displays Connections in SecureXL.
  • helpful in detecting problems with non-accelerated traffic.


Flags:

F = Forward to Firewall - the connection is not accelerated
U = Unidirectional - the connection can pass data on either C2S or S2C - data packets from the opposite direction will be F2F'ed
N = NAT is being performed on the connection by the device
A = Accounting is performed on the connection (the connection is viewed by either rulebase accounting or SmartView Monitor)
C = Encryption is done on the connection by the device
W = the connection is in wire mode
P = Partial (versions R70 and higher)
S = Streaming - PXL (versions R70 and higher)

 

(D) fwaccel templates

  • displays Connection Templates in SecureXL.
  • helpful in detecting problems with Connection Templates.

 

Flags:

F = Forward to Firewall - the connection is not accelerated
U = Unidirectional - the connection can pass data on either C2S or S2C - data packets from the opposite direction will be F2F'ed
N = NAT is being performed on the connection by the device
A = Accounting is performed on the connection (the connection is viewed by either rulebase accounting or SmartView Monitor)
C = Encryption is done on the connection by the device
W = the connection is in wire mode
P = Partial (versions R70 and higher)
S = Streaming - PXL (versions R70 and higher)
D = Drop Template
L = Log drop action

 

(E) cat /proc/interrupts

  • displays the number of interrupts for each IRQ.
  • helpful in monitoring interrupts on CPU cores from different devices (mostly NICs).
  • verify that the interfaces do not share the same IRQ number, which is problematic with affinity.
  • man page - http://linux.die.net/man/5/proc

 

(F) top

  • displays dynamic real-time view of a running system on Linux
  • helpful in monitoring different aspects of CPU utilization
  • look at the amount of "Idle"
  • look at the load in "User Space"
  • look at the load in "System (kernel) Space"
  • look at the amount of "SoftIRQ"
  • look at the amount of "IOwait"
  • collect this output continuously during the problem
  • output differs between Linux kernel 2.4 and Linux kernel 2.6 (press 1 to show each CPU core separately and Shift+W to save this configuration)
  • man page - http://linux.die.net/man/1/top


(G) vmstat X [Y]

  • displays information about processes, memory, paging, block IO, and CPU activity
  • helpful in monitoring different aspects of CPU utilization and memory utilization
  • look at the "procs" section - counter "r" (number of processes waiting for CPU)
  • look at the "memory" section - all counters
  • look at the "swap" section - reading "si" and writing "so" in swap file
  • look at the "io" section - reading "bi" and writing "bo" on hard disk
  • look at the "system" section at "cs" (number of Context Switches)
  • look at the "cpu" section - all counters
  • collect this output continuously during the problem
  • man page - http://linux.die.net/man/8/vmstat


(H) sim affinity -l

  • displays affinity of physical interfaces and CPU cores
  • helpful in detecting problems with SIM Affinity that lead to poor CPU utilization
  • use static affinity ('sim affinity -s' command; the $PPKDIR/boot/modules/sim_aff.conf file will be created)
  • for Clear traffic
    - with SecureXL - use dual affinity
    - without SecureXL - use single affinity
  • for VPN traffic
    - with SecureXL - use dual affinity
    - without SecureXL - use single affinity

Example from Linux kernel 2.4

 

We can see that:

    - Sync = mapped to IRQ 19 and affinity is set to CPU core 3
    - Mgmt, Exp1-4, Lan2, Lan3 = mapped to IRQ 18 and affinity is set to CPU core 2
    - Exp1-1, Lan6, Lan7 = mapped to IRQ 16 and affinity is set to CPU core 0
    - Exp1-2, Exp1-3, Lan1, Lan8 = mapped to IRQ 17 and affinity is set to CPU core 1
    - Lan4, Lan5 = mapped to IRQ 19 and affinity is set to CPU core 3
    - Exp2-1 = mapped to IRQ 28 and affinity is set to CPU core 4
    - Exp2-2 = mapped to IRQ 29 and affinity is set to CPU core 5
    - Exp2-3 = mapped to IRQ 30 and affinity is set to CPU core 6
    - Exp2-4 = mapped to IRQ 31 and affinity is set to CPU core 7

Analysis:
Take, for example, Exp1-2, Exp1-3, Lan1 and Lan8:
these interfaces share IRQ 17, and their affinity is set to CPU core 1.
This means that CPU core 1 might do more work than the other cores
if all 4 interfaces are used at the same time.
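
The same analysis can be repeated for every CPU core by grouping the interfaces listed above by their assigned core. In this sketch the mapping is copied from the list above; the names `AFFINITY` and `interfaces_per_core` are hypothetical.

```python
from collections import defaultdict

# (interfaces, IRQ, CPU core), copied from the list above.
AFFINITY = [
    (["Sync"], 19, 3),
    (["Mgmt", "Exp1-4", "Lan2", "Lan3"], 18, 2),
    (["Exp1-1", "Lan6", "Lan7"], 16, 0),
    (["Exp1-2", "Exp1-3", "Lan1", "Lan8"], 17, 1),
    (["Lan4", "Lan5"], 19, 3),
    (["Exp2-1"], 28, 4),
    (["Exp2-2"], 29, 5),
    (["Exp2-3"], 30, 6),
    (["Exp2-4"], 31, 7),
]


def interfaces_per_core(affinity):
    """Group interface names by the CPU core that handles their interrupts."""
    per_core = defaultdict(list)
    for interfaces, _irq, core in affinity:
        per_core[core].extend(interfaces)
    return dict(per_core)


for core, names in sorted(interfaces_per_core(AFFINITY).items()):
    print(f"CPU core {core}: {len(names)} interface(s): {', '.join(names)}")
```

With this mapping, CPU cores 1 and 2 each serve 4 interfaces, CPU cores 0 and 3 each serve 3, and CPU cores 4 to 7 each serve a single interface.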

Important note:
IRQ sharing on Linux kernel 2.4 must be consistent with the input of the 'sim affinity' command.
For example, Exp1-1 and Lan7 share the same IRQ (16).
In that case, Exp1-1 stays on its assigned CPU core only if Lan7 is assigned to the same CPU core
(if Exp1-1 is assigned to CPU core 0 and Lan7 is assigned to CPU core 1,
both Exp1-1 and Lan7 will use CPU core 1).

Example from Linux kernel 2.6

 

Analysis:
We can see that each interface uses a different IRQ, hence all CPU cores can be used.

 

(I) fw ctl multik stat

  • displays status of CoreXL instances and summary for traffic that passes through each instance (current number and peak number of concurrent connections)
  • helpful in detecting problems with CoreXL that lead to poor CPU utilization

 

Note: we can see that CPU #0 and CPU #1 do not run a CoreXL FW instance.

(J) fw ctl affinity -l -r -v -a

  • displays affinity of CoreXL instances and CPU cores
  • helpful in detecting problems with CoreXL and FW Affinity that lead to poor CPU utilization
  • use static affinity if there is no SecureXL ('fw ctl affinity -s' command; $FWDIR/conf/fwaffinity.conf file)

 

Note: we can see that CPU #0 and CPU #1 do not run a CoreXL instance.

(K) dmesg

  • displays boot messages and messages from various FireWall mechanisms
  • helpful in detecting problems in kernel and in execution of functions
  • man page - http://linux.die.net/man/8/dmesg

Recommendations:
When using SecureXL, disable Hyper-Threading in the BIOS (on Check Point appliances it is disabled by default). This applies to Intel processors prior to "Intel Nehalem (Core i7)", in which this technology was improved (called Simultaneous Multi-Threading).


Related Solutions:

The following solutions provide the same "fix" for various problems:

Instructions:
1. Set the following kernel parameter in $FWDIR/boot/modules/fwkern.conf file (refer to sk26202)
cphwd_handle_link_collision=1

2. Set the following kernel parameter in $PPKDIR/boot/modules/simkern.conf file (refer to sk26202)
sim_resolve_link_collision=1

3. Restart the Security Gateway

Applies To:
  • This solution replaces sk33856.
