Table of Contents:
- Background
- Solution
- Instructions
  - For R77.30
  - For R80.10 and above
- Kernel Debug
- Kernel parameters
- Troubleshooting
- Related solutions
Note: For R80.40 and higher versions, refer to sk165153 - GNAT port allocation feature.
Background
In certain configurations, the number of ports available for a CoreXL FW instance might not be enough, which leads to dropped packets.
In such a case, the log "NAT Hide failure - there are currently no available ports for hide operation" appears repeatedly in SmartView Tracker / SmartLog for the dropped Hide NAT connections.
In general, source ports for NAT functionality are divided into three ranges:
Range name | Port numbers | Comments |
Low | 600 - 1023 | None |
High | 10000 - 60000 | Used for standard connections (usually). |
Extra, or Global | 60001 - 65000 | Mainly used for standard features that are not supported by CoreXL (in which case, the traffic will be processed only by CoreXL FW Instance #0 - see sk61701). These ports should be configured per sk86401. |
The port range used for the port allocation depends on the connection's service.
The maximal number of concurrent hidden connections is counted per combination of Destination "X" and NAT address "Y", and is limited by the size of the low or high port range according to the division above.
The maximal number of concurrent connections that use extra ports is the size of the extra ports range.
These port ranges are allocated statically during policy installation.
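As a quick sanity check on the ranges listed above, the size of each range follows from its inclusive bounds. The sketch below only does that arithmetic; the helper name range_size is illustrative and is not a Check Point command.

```shell
# Size of an inclusive port range: last - first + 1.
# range_size is a hypothetical helper used only for this illustration.
range_size() {
    echo $(( $2 - $1 + 1 ))
}

echo "Low:   $(range_size 600 1023) ports"     # 424
echo "High:  $(range_size 10000 60000) ports"  # 50001
echo "Extra: $(range_size 60001 65000) ports"  # 5000
```

These totals are upper bounds for a whole gateway; the factors listed next determine how much of each range a single CoreXL FW instance actually receives.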
The port distribution is based on the following factors:
- Number of CoreXL FW instances
- Whether cluster is enabled, and the number of cluster members:
  - For ClusterXL in High Availability mode, each Standby member has a port quota reserved for it, and the rest of the ports are assigned to the Active cluster member.
  - For ClusterXL in Load Sharing mode, the port range is divided by the number of cluster members.
- Whether SecureXL is enabled
- Whether VPN blade is enabled
Notes about SecureXL:
- When enabling SecureXL NAT Templates, SecureXL gets a share of port ranges from the ranges dedicated to the specific CoreXL FW instance.
- SecureXL range is static and cannot be shared.
- Starting in R80.20, SecureXL does not have a separate port pool and does not affect the number of ports. Starting in R80.20, all templates start in the Firewall.
Two common NAT issues may occur when port ranges are allocated statically:
- NAT is not being performed, and there is no log indicating NAT problems.
  In such a case, refer to sk86401 - change the value of 'hide_max_high_port'.
- NAT occasionally runs out of available ports, and a log is sent with the description "NAT Hide failure - there are currently no available ports for hide operation".
Additional notes:
- Sometimes NAT ports are used even if the customer does not use NAT - e.g., for cluster sync.
- The feature is not supported on R76SP versions and on R80.20SP.
Solution
Starting in R77.30, Security Gateways / Cluster members are able to allocate NAT ports dynamically.
NAT port allocation was changed from static to dynamic:
Each CoreXL FW instance requests a range of ports as needed from the full range assigned to the cluster member.
Only those CoreXL FW instances that require NAT ports request ranges.
The ranges are also keyed by the Destination IP address, so each Destination IP address gets a separate allocation.
Additional information:
- When not using the Dynamic NAT port allocation, each CoreXL FW instance receives a fixed number of ports determined by the number of CoreXL FW instances.
  For example, if a Security Gateway is configured with 12 CoreXL FW instances, then around 4133 high ports are available for each CoreXL FW instance in Cluster HA.
  The current dispatching algorithm takes into consideration the Source IP address, Destination IP address and IP protocol when selecting the CoreXL FW instance to inspect a connection.
  This means that connections to the same web site (same Destination IP address and IP protocol) can go to the same CoreXL FW instance.
  The Security Gateway gets the NAT state of <dest=WebSite,HideIP=ClusterVIP,Proto=HTTP>, so more than 4133 connections to the same web site server that reach the same CoreXL FW instance will exhaust the NAT range for that CoreXL FW instance.
- With the Dynamic NAT port allocation, when the dispatching algorithm sends these connections to the same small number of CoreXL FW instances, each of those CoreXL FW instances can use ports that would otherwise have been allocated to unused CoreXL FW instances.
- The more common (and more historical) cause of this issue is a network with Hide NAT configuration that has multiple connections to a single IP address - e.g., a Load Balancer IP address, Proxy server, or DNS server.
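To put the 12-instance example in perspective, it can be compared with a plain even split of the high range. This is only a rough sketch: the helper name even_split is hypothetical, and the calculation assumes nothing beyond integer division of the 50001 high ports (10000 - 60000), ignoring any reserved quotas.

```shell
# Upper bound on ports per CoreXL FW instance under a plain even split.
# Assumption: no quotas are reserved (real static allocation reserves some,
# for example for Standby members), so actual figures are lower.
even_split() {
    echo $(( $1 / $2 ))
}

even_split 50001 12   # prints 4166
```

The even split gives 4166 ports, slightly above the "around 4133" quoted for Cluster HA, which is consistent with part of the range being reserved as described in Background. Either way, a few thousand concurrent connections to one destination through one instance are enough to exhaust a static share.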
Special considerations for VSX:
On VSX Gateway / VSX Cluster members, this feature (with its default configuration) may impact performance if it is enabled with a low number of CoreXL FW instances.
To avoid such issues, it is recommended to change the high ports quota by setting the kernel parameter fwx_nat_dynamic_high_port_allocation_size to the value in the table below (for details, refer to the "Troubleshooting" section):
Configured number of CoreXL FW instances per Virtual System | Default state of the Dynamic NAT port allocation | Recommended value |
2 | Disabled | fwx_nat_dynamic_high_port_allocation_size=1500 |
4 | Disabled | fwx_nat_dynamic_high_port_allocation_size=700 |
8 | Enabled | fwx_nat_dynamic_high_port_allocation_size=500 |
more than 8 | Enabled | Keep the default value |
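The VSX recommendations above can be expressed as a small lookup, which may be handy when preparing the fwkern.conf line for several systems. This is a sketch only: the function name vsx_recommended_quota is hypothetical, and instance counts that the recommendations do not list (for example 3 or 6) are reported as "unlisted" rather than guessed.

```shell
# Map the number of CoreXL FW instances per Virtual System to the
# recommended fwx_nat_dynamic_high_port_allocation_size value.
# Hypothetical helper; covers only the counts listed in this article.
vsx_recommended_quota() {
    case "$1" in
        2) echo 1500 ;;
        4) echo 700 ;;
        8) echo 500 ;;
        *)
            if [ "$1" -gt 8 ] 2>/dev/null; then
                echo default      # more than 8: keep the default value
            else
                echo unlisted     # not covered by the recommendations
            fi
            ;;
    esac
}

value=$(vsx_recommended_quota 4)
echo "fwx_nat_dynamic_high_port_allocation_size=$value"
```

For 4 instances this prints the line fwx_nat_dynamic_high_port_allocation_size=700, in the form expected by fwkern.conf.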
Instructions
For R77.30
It is recommended to install the latest Take of the R77.30 Jumbo Hotfix Accumulator, which includes fixes for various Dynamic Port Allocation issues.
Important Note: The value of any kernel parameter must be identical on all members of the cluster.
- To check the current value of the kernel parameter:
  [Expert@HostName]# fw ctl get int fwx_nat_dynamic_port_allocation
- To set the value of the kernel parameter permanently (Note: setting the value of this kernel parameter on-the-fly is not supported):
  Follow sk26202 - Changing the kernel global parameters for Check Point Security Gateway.
  For Gaia / SecurePlatform OS:
  - Create the $FWDIR/boot/modules/fwkern.conf file (if it does not already exist):
    [Expert@HostName]# touch $FWDIR/boot/modules/fwkern.conf
  - Edit the $FWDIR/boot/modules/fwkern.conf file in Vi editor:
    [Expert@HostName]# vi $FWDIR/boot/modules/fwkern.conf
  - Add the following line (spaces are not allowed):
    fwx_nat_dynamic_port_allocation=1
    For performance reasons, it is also recommended to change the default value of the high ports quota by adding the following line (spaces are not allowed):
    fwx_nat_dynamic_high_port_allocation_size=100
  - Save the changes and exit from Vi editor.
  - Check the contents of the $FWDIR/boot/modules/fwkern.conf file:
    [Expert@HostName]# cat $FWDIR/boot/modules/fwkern.conf
  - Reboot each cluster member.
- Regarding VSX Gateway: because fwkern.conf is shared between the Virtual Systems, the configuration is the same for all Virtual Systems.
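Because spaces are not allowed in fwkern.conf (and the generic procedure later in this article adds that comments are not allowed either), it can be worth checking the file before the reboot. The following is a minimal sketch, not a Check Point tool: the function name check_fwkern_conf is hypothetical, and it only looks for whitespace and '#' characters.

```shell
# Report lines in a fwkern.conf-style file that contain whitespace or '#'.
# Returns 0 if the file looks clean, 1 if offending lines were found,
# 2 if the file cannot be read. Hypothetical helper for illustration.
check_fwkern_conf() {
    [ -r "$1" ] || return 2
    # grep prints "line-number:content" for each offending line
    if grep -n '[[:space:]#]' "$1"; then
        return 1
    fi
    return 0
}

# Example on a gateway (path taken from the steps above):
# check_fwkern_conf "$FWDIR/boot/modules/fwkern.conf" && echo "no spaces or comments found"
```

A clean result does not prove the parameter names or values are valid; it only catches the formatting mistakes called out in the steps.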
For R80.10 and above
Dynamic NAT port allocation is enabled by default on systems with more than 5 CoreXL FW instances - the value of the kernel parameter fwx_nat_dynamic_port_allocation is set to 1.
When the number of CoreXL FW instances is less than 6, the Dynamic NAT port allocation is disabled by default. (Refer to the table below for the parameter setting for such systems.)
Important Note: The value of any kernel parameter must be identical on all members of the cluster.
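The R80.10 default described above reduces to a single threshold on the number of CoreXL FW instances. The sketch below restates that rule; the function name is hypothetical, and using 0 to represent "disabled" is an assumption (the article only states that the enabled value is 1). It does not read anything from a gateway.

```shell
# Default value of fwx_nat_dynamic_port_allocation on R80.10, as described
# in this article: 1 (enabled) with more than 5 CoreXL FW instances.
# Assumption: "disabled" corresponds to the value 0.
dynamic_nat_default() {
    if [ "$1" -gt 5 ]; then
        echo 1
    else
        echo 0
    fi
}

dynamic_nat_default 4   # prints 0
dynamic_nat_default 8   # prints 1
```

On a real system, the value in effect should be confirmed with "fw ctl get int fwx_nat_dynamic_port_allocation" rather than inferred from the instance count.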
Kernel Debug
To debug issues with NAT port allocation, follow this action plan:
Important Note: These steps must be performed on all members of the cluster.
- Prepare the debug:
  [Expert@HostName]# fw ctl debug 0
  [Expert@HostName]# fw ctl debug -buf 32000
  [Expert@HostName]# fw ctl debug -m fw + xlate xltrc nat conn drop
- Verify the debug:
  [Expert@HostName]# fw ctl debug -m fw
  You should see:
  Kernel debugging buffer size: 32000KB
  Module: fw
  Enabled Kernel debugging options: error warning xlate xltrc nat conn drop
- Start the debug:
  [Expert@HostName]# fw ctl kdebug -T -f > /var/log/debug_$(uname -n).txt
- Replicate the issue / wait for the issue to occur.
- Stop the debug:
  Press CTRL+C, and run:
  [Expert@HostName]# fw ctl debug 0
- Send the following files to Check Point Support for analysis:
  - /var/log/debug_MEMBER_HOSTNAME.txt file from each cluster member (compress the file with gzip command)
  - CPinfo file from the Security Management Server
  - CPinfo file from each cluster member
  - /var/log/messages* files from each cluster member
  - output of dmesg command from each cluster member
  - /var/log/dmesg file from each cluster member
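The compression step for the debug output can be sketched as below. The function name compress_debug_file is hypothetical; the only real command used is standard gzip, with -f so that an older archive from a previous run is overwritten. Note that gzip replaces the original file with the .gz file.

```shell
# Compress a debug output file and print the name of the resulting archive.
# Hypothetical helper; gzip removes the original file after compressing it.
compress_debug_file() {
    gzip -9 -f "$1" && echo "$1.gz"
}

# Example on a cluster member (file name taken from the steps above):
# compress_debug_file "/var/log/debug_$(uname -n).txt"
```

Run it only after the debug has been stopped, so that the file is no longer being written.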
Kernel parameters
Troubleshooting
- Possible Issue:
  Within seconds, a busy Security Gateway can use and reuse the NAT port range 10,000 - 11,000 numerous times. Depending on how fast the connections are cleared on the remote end, and depending on the Security Gateway hardware, the reused ports may get converted from TCP "SYN" to TCP "ACK" (known as the Check Point smart reuse feature). This can cause timeouts and delays for clients.
  Solution:
  Contact Check Point Support to get a Hotfix for this issue (ID 01926907).
  A Support Engineer will make sure the Hotfix is compatible with your environment before providing the Hotfix.
  For faster resolution and verification, please collect CPinfo files from the Security Management Server and the Security Gateways involved in the case.
  Code was improved: A new kernel parameter fwx_nat_dynamic_port_allocation_entry_timeout was added to enhance the Dynamic Hide NAT feature. In addition, for performance reasons, the default value of the kernel parameter fwx_nat_dynamic_high_port_allocation_size (the high ports quota) was changed from 0 to 100.
  * Note: The default value of fwx_nat_dynamic_high_port_allocation_size on R80.10 is 0.
  Any number above 0 is used as the port quota.
  The default behaviour (when the value is set to 0) depends on the number of CoreXL FW instances:
  - Less than 25 instances: the value will be 100.
  - 25 instances or more: the value will be 2500 divided by the number of instances (for example, if there are 30 instances, the value will be 2500/30 = 83).
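The R80.10 rule for the effective high ports quota can be written out as a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, and integer (truncating) division is assumed because the example in the text gives 2500/30 = 83.

```shell
# Effective high ports quota on R80.10, following the rule described above.
#   $1 = configured fwx_nat_dynamic_high_port_allocation_size (0 = default)
#   $2 = number of CoreXL FW instances
# Assumption: the division truncates, matching the 2500/30 = 83 example.
effective_high_port_quota() {
    if [ "$1" -gt 0 ]; then
        echo "$1"                 # explicit value is used as the quota
    elif [ "$2" -lt 25 ]; then
        echo 100
    else
        echo $(( 2500 / $2 ))
    fi
}

effective_high_port_quota 0 30    # prints 83
effective_high_port_quota 0 12    # prints 100
effective_high_port_quota 500 30  # prints 500
```

With exactly 25 instances the rule gives 2500/25 = 100, so the two branches agree at the boundary.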
Note: This fix is included in:
On the Security Gateway / each cluster member, set the desired values for these kernel parameters:
- To check the current value of this kernel parameter:
  [Expert@HostName]# fw ctl get int PARAMETER
- To set the desired value for this kernel parameter on-the-fly (does not survive reboot):
  [Expert@HostName]# fw ctl set int PARAMETER VALUE
- To set the desired value for this kernel parameter permanently:
  Follow sk26202 - Changing the kernel global parameters for Check Point Security Gateway.
  For Gaia / SecurePlatform OS:
  - Create the $FWDIR/boot/modules/fwkern.conf file (if it does not already exist):
    [Expert@HostName]# touch $FWDIR/boot/modules/fwkern.conf
  - Edit the $FWDIR/boot/modules/fwkern.conf file in Vi editor:
    [Expert@HostName]# vi $FWDIR/boot/modules/fwkern.conf
  - Add the following line (spaces and comments are not allowed):
    PARAMETER=VALUE
  - Save the changes and exit from Vi editor.
  - Check the contents of the $FWDIR/boot/modules/fwkern.conf file:
    [Expert@HostName]# cat $FWDIR/boot/modules/fwkern.conf
  - Reboot the Security Gateway.
  - Verify that the new value was set:
    [Expert@HostName]# fw ctl get int PARAMETER
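When the same parameter is changed more than once, manual editing can leave duplicate PARAMETER=VALUE lines in fwkern.conf. The sketch below shows one way to make the edit repeatable. It is an illustration, not a Check Point utility: the function name set_fwkern_param is hypothetical, and it assumes parameter names contain only letters, digits and underscores. Back up the file before using anything like this on a production gateway.

```shell
# Set NAME=VALUE in a fwkern.conf-style file, replacing any existing
# assignment of NAME so the parameter appears exactly once.
#   $1 = file path, $2 = parameter name, $3 = value
# Hypothetical helper for illustration only.
set_fwkern_param() {
    touch "$1"
    # Keep every line except an existing assignment of this parameter.
    # grep exits non-zero when nothing is left to print, which is fine here.
    grep -v "^$2=" "$1" > "$1.tmp" || true
    # Append the new assignment with no spaces, as the file format requires.
    echo "$2=$3" >> "$1.tmp"
    mv "$1.tmp" "$1"
}

# Example on a gateway (path taken from the steps above):
# set_fwkern_param "$FWDIR/boot/modules/fwkern.conf" fwx_nat_dynamic_high_port_allocation_size 100
```

The change still takes effect only after a reboot, and the value must be identical on all members of the cluster.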
Applies To:
- 01453536 , 01515863 , 01532431 , 01533591
- 01539025 , 01544341 , 01544561 , 01549649
- 01559836 , 01565204 , 01575939
- 01555320
- 01603651 , 01604729
- 01614832
- 02028177 , 02119256 , 02082869 , 02064273 , 02058535 , 02031181
- 01926907, 01961074, 02027022, 01960451, 02003486, 01995744, 02005808, 02011926, 02079049, 02044254, 01955909