To deploy a new High Availability solution that supports VPN termination, it is recommended to use the "CloudGuard IaaS High Availability" solution.
It is recommended to upgrade existing CloudGuard IaaS Cluster solutions to CloudGuard IaaS High Availability solutions. Refer to the CloudGuard IaaS High Availability Admin Guide for upgrade instructions (See "Additional Information" section).
The template version can be identified on each cluster member in the /etc/cloud-version file.
For template versions older than 20180301, see sk122793.
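For example, from the Expert mode shell on a cluster member:
[Expert@HostName:0]# cat /etc/cloud-version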
Recent updates
Starting from template version 20180301, the following changes were made to the cluster template:
Cluster members' public IP addresses are attached directly to their network interface cards.
In addition to the public cluster address, a private cluster address is attached to the Active member and is reattached on failover.
A load balancer is no longer deployed automatically. If a load balancer is required for publishing services, it must be deployed to the Cluster's resource group. By default, its name should be "frontend-lb". In case another name is used, see the section "Using different resource groups, naming constraints and permissions". Follow the instructions in this article for configuring the load balancer.
The names of load balancer NAT rules for publishing services must begin with "cluster-vip" in order to be updated on cluster failover.
Table of Contents
Prerequisites
Method of operation
Example Environment
Deployment using a Solution Template
Setting up the route tables of the Frontend and Backend subnets
Setting up internal subnets and their route tables
Setting up routes on the cluster members to the Internal subnets
SmartConsole Configuration
Configuring cluster interfaces
VPN configuration
Testing and Troubleshooting
Azure HA daemon and its configuration file
Changing the Credentials
Using a different Azure cloud environment
Working with a proxy
Deployments without a public cluster IP address
Using different resource groups, naming constraints and permissions
Known Limitations and Issues
Related solutions
Prerequisites
It is assumed that the reader is familiar with general Azure concepts and features such as:
Virtual Networks
Virtual Machines
Load Balancers
Availability Sets
Public IP addresses
User Defined Routes (UDR)
Role Based Access Control (RBAC)
Method of operation
A Check Point cluster in a non-Azure environment uses multicast or broadcast to perform state synchronization and health checks across cluster members.
Since multicast and broadcast are not supported in Azure, the Check Point cluster members in Azure communicate with each other using unicast. In addition, in a regular ClusterXL cluster working in High Availability mode, cluster members use Gratuitous ARP to announce the MAC address of the Active member that is associated with the Virtual IP address (both during normal operation and when a cluster failover occurs).
In Azure, this functionality is implemented by making API calls to Azure instead. When an Active cluster member fails, the Standby cluster member is promoted to Active and takes ownership of the cluster resources.
As part of this process, this member (see the example after this list):
Associates the cluster private and public IP addresses with its external network interface.
Modifies any User Defined Routes that were pointing to the member that went down to point to itself.
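As an illustration only (the Azure CLI is not part of this solution), you can observe which member currently holds the cluster IP configuration; the resource group and NIC names below are placeholders for your deployment:
az network nic ip-config list --resource-group <cluster-resource-group> --nic-name <member-external-nic> --output table
After a failover, the "cluster-vip" IP configuration should be listed on the NIC of the newly Active member.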
Azure API authentication:
To be able to automatically make API calls to Azure, the cluster members need to be provided with Azure Active Directory credentials. This is achieved using the Role-Based Access Control (RBAC) feature of Azure.
Example Environment
To best explain the configuration steps, we will be using the following example environment. Make sure to replace the IP addresses in the example environment with those of your environment when you follow the configuration steps below.
The diagram above depicts a virtual network in Azure divided into 4 subnets, designated:
Frontend
Backend
Web
App
The Check Point cluster is comprised of 2 members, designated Cluster Member 1 and Cluster Member 2. Each member has 2 interfaces. The two cluster members are part of the same Azure Availability Set.
By putting the cluster members in an Availability Set, you guarantee that the two are in separate fault domains as well as separate update domains, as explained in the "Manage the availability of virtual machines" article.
In this diagram, the cluster is protecting 2 web applications. Each web application consists of:
A Public IP address
A Web Server
An Application Server
In addition, the cluster provides:
Site to site VPN connectivity to an on-premises network through an on-premises gateway (not depicted)
Remote access VPN connectivity to allow mobile users to access resources in the virtual network.
Public and private cluster addresses (VIPs).
The following components in the example environment are not deployed by the solution template and, if required, need to be deployed and configured manually according to the guidelines in this article:
Load balancer and load balancer NAT rules.
When deploying a load balancer, it must be deployed to the Cluster's resource group. By default, its name should be "frontend-lb". In case another name is used, see the section "Using different resource groups, naming constraints and permissions".
Backend hosts, subnets and routing tables (Web, App).
IP addresses
Several static IP addresses are used as follows:
Name | Attached To | Use
Cluster Public Address | The external interface of the active member | VPN
Cluster Private Address | The external interface of the active member | Destination of External Load Balancer NAT rules for backend services
Member 1 Public Address | The external interface of member 1 | External management of member 1
Member 2 Public Address | The external interface of member 2 | External management of member 2
WebApp1 | The Azure Load Balancer | Internet Access to WebApp1
WebApp2 | The Azure Load Balancer | Internet Access to WebApp2
When a cluster failover occurs, the cluster member that gets promoted to be the active member uses an Azure API to associate itself with the cluster private and public IP addresses.
Failover time: This change normally takes effect under 2 minutes.
We use the Inbound NAT rules feature of the Azure Load Balancer to forward traffic arriving from the Internet as follows:
Note: The following ports cannot be used: 80, 443, 444, 8082 and 8880.
Front-end IP address | Front-end TCP port | Destination IP address | Destination port
WebApp1 | HTTP | Active cluster member | 8081
WebApp2 | HTTP | Active cluster member | 8083
When a cluster failover occurs, the cluster member that gets promoted to be the active member uses an Azure API to reconfigure the load balancer to send the traffic belonging to the web applications to the cluster private address.
Note that the names of these rules must start with "cluster-vip" in order for them to be updated automatically.
Failover time: This change normally takes effect under 3 minutes.
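For reference, a sketch of creating such a NAT rule with the Azure CLI, assuming the default load balancer name "frontend-lb" and the WebApp1 forwarding from the table above (the rule name, frontend IP configuration name, and resource group are placeholders):
az network lb inbound-nat-rule create --resource-group <cluster-resource-group> --lb-name frontend-lb --name cluster-vip-webapp1 --protocol Tcp --frontend-port 80 --backend-port 8081 --frontend-ip-name <webapp1-frontend-ip>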
The active cluster member uses Network Address Translation (NAT) in order to forward traffic belonging to the 2 web applications to the appropriate web server:
No. | Original Source | Original Destination | Original Service | Translated Source | Translated Destination | Translated Service | Install On
1 | Any | Member1-external-private | TCP_8081 | = Original | S Web1 | S http | Policy Targets
2 | Any | Member2-external-private | TCP_8081 | = Original | S Web1 | S http | Policy Targets
3 | Any | Member1-external-private | TCP_8083 | = Original | S Web2 | S http | Policy Targets
4 | Any | Member2-external-private | TCP_8083 | = Original | S Web2 | S http | Policy Targets
Note: Member1-external-private host object and Member2-external-private host object represent the private addresses of the external interface of member 1 and member 2 respectively.
Traffic between the Web subnet and the Application subnet is governed by User Defined Routes (UDR). Each subnet is associated with its own routing table.
When a cluster failover occurs, the cluster member that gets promoted to be the active member uses an Azure API to reconfigure the routing tables to send traffic to itself. Failover time: This change normally takes effect under 20 seconds.
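For reference, a sketch of such a User Defined Route created with the Azure CLI; the route table name, address prefix, and next-hop address (the Active member's internal IP address) are placeholders for the example environment:
az network route-table route create --resource-group <cluster-resource-group> --route-table-name <web-subnet-route-table> --name to-app-subnet --address-prefix <app-subnet-prefix> --next-hop-type VirtualAppliance --next-hop-ip-address 10.0.1.10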
Deployment using a Solution Template
This template can create a new virtual network, or it can deploy into an existing virtual network.
The template does not create the Web and App subnets - you need to add these subnets yourself.
The template does not create an external load balancer. If you need one for publishing services to the Internet, create it according to the section "Using different resource groups, naming constraints and permissions" and configure it yourself.
The template does not deploy any web or application VMs.
VMs launched in the backend subnets might require Internet access in order to finalize their provisioning. You should launch these VMs only after you have applied NAT hide rules on the cluster to support this type of connectivity.
After you deploy the template, the cluster members automatically execute the Check Point First Time Configuration Wizard based on the parameters provided. Once the First Time Configuration Wizard completes, the cluster members are expected to reboot.
We will now create an Azure Active Directory service account whose credentials will be used by the cluster members to make API calls into Azure. We will be creating a non-privileged service account with a fixed but strong password. We will assign that service account the required privileges to the resources we are creating and to other resources that are changed on failover, according to the table in the "Using different resource groups, naming constraints and permissions" section.
You should have installed and configured Azure PowerShell version 1.0 or higher. For additional information on how to install it, refer to the "How to install and configure Azure PowerShell" article.
You should have installed and configured the Azure AD PowerShell module. For additional information on how to install it, refer to the AzureADHelp article.
Open a PowerShell window and run the following commands:
#######################################################################
# Parameters
#######################################################################
# Set this variable to the subscription ID
$SubscriptionId = ""
# Set this variable to the name of the resource group you have created
$ResourceGroup = ""
#######################################################################
$ErrorActionPreference = "Stop"
# Generate a Random password
Add-Type -AssemblyName System.Web
$Password = [System.Web.Security.Membership]::GeneratePassword(20,10)
# Login:
Add-AzureRmAccount
Set-AzureRmContext -SubscriptionId $SubscriptionId
$tenant = (Get-AzureRmSubscription).TenantId
# Create a new Azure AD application and a service principal
$AppName = "check-point-cluster-"+[guid]::NewGuid()
$azureAdApplication = New-AzureRmADApplication -DisplayName $AppName -HomePage "https://localhost/$AppName" -IdentifierUris "https://localhost/$AppName" `
    -Password (ConvertTo-SecureString $Password -AsPlainText -Force) -EndDate (Get-Date).AddYears(10)
New-AzureRmADServicePrincipal -ApplicationId $azureAdApplication.ApplicationId
# Wait till the new application is propagated
Start-Sleep -Seconds 15
# Assign the service with permission to modify the resources in the resource group
New-AzureRmRoleAssignment -ResourceGroupName $ResourceGroup -ServicePrincipalName $azureAdApplication.ApplicationId.Guid -RoleDefinitionName Contributor
Write-Host "ApplicationId:" $azureAdApplication.ApplicationId.Guid
Write-Host "Password : $Password"
To create the service account using the Azure portal:
- When you create the Azure Active Directory Application
When asked to provide a name for your application, enter a unique name, such as: check-point-cluster-UNIQUEID. Replace UNIQUEID with a unique string to distinguish it from other service accounts.
Select Web app / API under the Application Type.
For the Sign-on URL, enter: https://localhost/APPNAME where APPNAME is the application name you entered above.
- When you get the application ID and authentication key:
You can name the key chkp-cluster
Select Never expires.
- When you assign the application a role:
Assign the Contributor role.
Assign the role at the scope of the cluster resource group.
If any of the following resources are not in the cluster resource group, assign the application this role at the scope of their resource groups:
Network interfaces used by the cluster members
Virtual Network
Peered Virtual Networks
Any route table used by the subnets in the Virtual Network or peered Virtual Networks
In both cases (using PowerShell or the Azure portal) write down the following items:
The resource group name
The location it was created in
The application ID
The automatically generated password (if you used PowerShell) or the automatically generated key (if you used the portal)
Set the client_id and client_secret on each of the cluster members.
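A sketch of this step, assuming your version's azure-ha-conf utility (used later in this article with the debug flags) also accepts credential flags; verify the exact flag names on your version:
[Expert@HostName:0]# azure-ha-conf --client-id <APPLICATION-ID> --client-secret <PASSWORD> --force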
Setting up the route tables of the Frontend and Backend subnets
In this section, we will ensure that the route tables associated with the cluster Frontend and Backend subnets are set up correctly.
You need to follow this section only if you have deployed the cluster into an existing virtual network. If you have opted to let the template create a new virtual network, you should skip this step. The route table associated with the Frontend subnet should consist of the following routes:
SmartConsole Configuration
Connect with Check Point SmartConsole to the Check Point Management Server.
Create a new Check Point Cluster: in the Cluster menu, click Cluster...
Select Wizard Mode.
Enter the cluster object's name (e.g., checkpoint-cluster). In the Cluster IPv4 Address field, enter the public address allocated for the cluster and click the Next button.
Note: The Cluster IP address can be seen in the Azure portal by selecting the Active Member's primary NIC -> IP configurations -> "cluster-vip".
Example:
Click on Add button to add the cluster members.
Configure cluster member properties:
In the Name field, enter the first cluster member's name (e.g., member1).
In the IPv4 Address field:
If you are managing the cluster from the same virtual network, then enter the member's private IP address
Otherwise, enter the member's public IP address
In the Activation Key field, enter the SIC (Secure Internal Communication) key you used in the previous step.
In the Confirm Activation Key field, re-enter the key and click Initialize. The Trust State field should show: Trust established. Click OK.
Example:
Repeat Steps 5-6 to add the second cluster member. Click the Next button.
Example:
In the new window, click on Next:
Configure all (2) interfaces as Cluster Synchronization interfaces. Click Next.
Choose Cluster + Sync, and enter the cluster's private VIP address.
Example:
Click "OK" and exit the cluster object configurations dialog
Go to "Network Management" -> disable "Anti-Spoofing" from the Network Interfaces.
To provide Internet connectivity to the internal subnets, create the following NAT rules:
No. | Original Source | Original Destination | Original Service | Translated Source | Translated Destination | Translated Service | Install On | Comment
1 | VNET | VNET | Any | = Original | = Original | = Original | Check Point cluster | Avoid NAT inside the VNET
2 | App-subnet | App-subnet | Any | = Original | = Original | = Original | Check Point cluster | Automatic rule (see the network object data).
3 | App-subnet | Any | Any | H App-subnet (Hiding Address) | = Original | = Original | Check Point cluster | Automatic rule (see the network object data).
4 | Web-subnet | Web-subnet | Any | = Original | = Original | = Original | Check Point cluster | Automatic rule (see the network object data).
5 | Web-subnet | Any | Any | H Web-subnet (Hiding Address) | = Original | = Original | Check Point cluster | Automatic rule (see the network object data).
Notes:
NAT rule #1 is a manual NAT rule whose purpose is to avoid NAT inside the VNET. Make sure that it is placed above the automatic NAT rules.
NAT rules #2-5 are automatic NAT rules created as follows:
For each internal subnet, create a network object with the following properties:
Example:
On the NAT tab:
Check the box Add Automatic Address Translation rules
In the Translation method section, select Hide and Hide behind Gateway
In the Install on Gateway field, select the cluster object
Example:
If a load balancer is required for publishing services, it must be deployed to the Cluster's resource group. By default, its name should be 'frontend-lb'. In case another name is used, see the section 'Using different resource groups, naming constraints and permissions'.
In addition, set up NAT rules to allow incoming connections as below. Note that the names of the NAT rules must begin with "cluster-vip" in order to be updated on failover.
Choose a "Name" that starts with "cluster-vip".
"Target Virtual Machine" - choose the Active Member VM.
"Network IP configuration" - choose "member-ip" of the Active member.
Press "OK".
Configure and install the Security policy on the cluster.
Testing and Troubleshooting
Use the cphaprob state and the cphaprob -a if commands on each cluster member to validate that the cluster is operating correctly. The output of the cphaprob state command on both cluster members must show identical information (except for the "(local)" string).
Example:
[Expert@HostName:0]# cphaprob state
Cluster Mode: High Availability (Active Up) with IGMP Membership
Number     Unique Address    Assigned Load    State
1 (local)  10.0.1.10         0%               Active
2          10.0.1.20         100%             Standby
Use the cluster configuration test script, located on each cluster member, to verify the cluster configuration.
This script verifies:
The configuration file is defined in $FWDIR/conf/azure-ha.json (this file is created by the ARM template).
A Primary DNS server is configured and is working
The machine is set up as a cluster member.
IP forwarding is enabled on all network interfaces of the Cluster Member.
It is possible to use the APIs to retrieve information about the cluster's resource group.
It is possible to log in to Azure with the Azure credentials in the $FWDIR/conf/azure-ha.json file.
Calibration of ClusterXL configuration for Azure.
To get the latest version of the test script:
Note - Perform Steps 2 - 8 on each Cluster Member.
Unpack the TGZ file:
# tar -zxvf /<path to the downloaded script package>/Azure_cluster_ha_testing_sk110194.tgz
Back up the current $FWDIR/scripts/azure_ha_test.py script:
# mv -v $FWDIR/scripts/azure_ha_test.py{,_backup}
Copy the latest script to the $FWDIR/scripts/ directory:
# cp -v /<path to the downloaded script package>/azure_ha_test.py $FWDIR/scripts/
Assign the required permissions:
# chmod -v 755 $FWDIR/scripts/azure_ha_test.py
To run the script on each Cluster Member:
Connect to the command line.
Log in to the Expert mode.
Run the script with this command (do not change the syntax):
# $FWDIR/scripts/azure_ha_test.py
If all tests were successful, the script shows: All tests were successful! Otherwise, an error message is displayed with information to troubleshoot the problem.
A list of common configuration errors:
Message | Recommendation
The attribute [ATTRIBUTE] is missing in the configuration | The Azure HA configuration file is missing or malformed
Primary DNS server is not configured | The cluster member is not configured with a DNS server
Failed to resolve [HOST] / Failed in DNS resolving test | Confirm that DNS resolution on the cluster member works
You do not seem to have a valid cluster configuration | Make sure that the cluster configuration on the Check Point Management is complete and that the security policy on it is installed
IP forwarding is not enabled on Interface [INTERFACE-Name] | Use PowerShell to enable IP forwarding on all the network interfaces of the cluster members
Failed to read configuration file: /opt/CPsuite-R77/fw1/conf/azure-ha.json | The Azure HA configuration file is missing or malformed
Testing credentials... [Exception] | Failed to login using the provided credentials. See the exception text to understand why
Testing authorization... [Exception] | Make sure the Azure Active Directory service account you created is designated as a contributor to the cluster resource group
Simulate a cluster failover. For example, shut down the internal interface of the Active cluster member:
[Expert@HostName:0]# ip link set dev eth1 down
In a few seconds, you should see that the 2nd cluster member reports itself as the Active member.
Browse to the Azure portal and confirm that in all the routing tables associated with internal subnets, the routes are pointing to the interface of the member that was promoted to active. Note: You might need to refresh the Azure portal to see the changes.
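To restore the interface after the test:
[Expert@HostName:0]# ip link set dev eth1 up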
If you are experiencing issues:
Make sure that you have set up an Azure Active Directory service account with Contributor privileges on the resource group in which the cluster is deployed, or with Contributor and Reader privileges on your chosen resources.
To make the networking changes automatically, the cluster members need to communicate with Azure. This requires HTTPS connections over TCP/443 to the Azure endpoints.
Make sure that the Security Policy installed on the Security Gateway allows this type of communication.
Azure HA daemon and its configuration file
The Check Point clustering solution in Azure uses a dedicated process that is responsible for making API calls to Azure when a cluster failover takes place. This daemon uses a configuration file, $FWDIR/conf/azure-ha.json, located on each cluster member.
When you deploy the above solution from the supplied template, this file is created automatically. The configuration file is in JSON format and contains the following attributes:
Attribute name | Type | Value
debug | Boolean | true or false
subscriptionId | String | subscription ID
location | String | resource group location
environment | String | name of the environment
resourceGroup | String | resource group name
credentials | String | client credentials
proxy | String | name of the proxy
virtualNetwork | String | name of the virtual network
clusterName | String | name of the cluster
lbName | String | name of the load balancer
You can verify that the daemon in charge of communicating with Azure is running on each cluster member:
[Expert@HostName:0]# cpwd_admin list | grep -E "PID|AZURE_HAD"
The output should have a line similar to:
APP        PID   STAT  #START  START_TIME            MON  COMMAND
AZURE_HAD  3663  E     1       [12:58:48] 15/1/2016  N    python /opt/CPsuite-R77/fw1/scripts/azure_had.py
Notes:
The script should appear in the output
The STAT column should show "E" (stands for "Executing")
The #START column should show "1" (how many times this script was started by the Check Point WatchDog)
To troubleshoot issues related to this daemon, enable debugging printouts:
To enable debug printouts, run:
[Expert@HostName:0]# azure-ha-conf --debug --force
To disable debug printouts, run:
[Expert@HostName:0]# azure-ha-conf --no-debug --force
Debug output is written to the $FWDIR/log/azure_had.elg* files.
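For example, to follow the debug output:
[Expert@HostName:0]# tail -f $FWDIR/log/azure_had.elg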
Using a different Azure cloud environment
If you are deploying your cluster in a special Azure environment, such as Azure Government, Azure China, or Azure Germany, the $FWDIR/conf/azure-ha.json file should contain the following attribute:
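For example, a minimal fragment for Azure Government; the environment name shown is an assumption based on common Azure environment identifiers, so verify the accepted values for your version:
{
    ...
    "environment": "AzureUSGovernment",
    ...
}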
Working with a proxy
In some deployments, access to the Internet is only possible through a web proxy.
To allow the cluster members to make API calls to Azure through the proxy, edit the $FWDIR/conf/azure-ha.json file and add the following attribute:
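A sketch, assuming the attribute takes the proxy address and port (the value below is a hypothetical placeholder; verify the expected format for your version):
{
    ...
    "proxy": "http://proxy.example.com:8080",
    ...
}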
Deployments without a public cluster IP address
In some cases, a customer might wish to deploy a cluster without a public address. This is typically the case when the cluster is not expected to face the Internet, but rather only to inspect traffic between subnets in the virtual network, or between virtual networks. In such cases, there is no need for a public address to be set up.
To disable the cluster public IP addresses, in the Azure portal:
Go to the interface resource of the active member -> click IP configurations -> choose the IP configuration -> click Disabled.
For example:
Wait until the public address is detached.
In SmartConsole:
Use the cluster private IP address as the cluster's main IP address.
Use the cluster members' private IP addresses.
Using different resource groups, naming constraints and permissions
The following resources need to be in the same resource group as the cluster members:
The Cluster public IP address
The following resources can be in any resource group:
Virtual Network
Route tables
Network Interfaces
Storage account
In case an external Load Balancer is added, it must be in the same resource group as the cluster members.
Naming Constraints
In case an external Load Balancer is added, it is recommended to name it "frontend-lb".
Note:
If another name is used for the external Load Balancer, it should be changed in the $FWDIR/conf/azure-ha.json configuration file by editing the value of "lbName". After changing the value, reload the cluster Azure configuration by running:
$FWDIR/scripts/azure_ha_cli.py reconf
This change must be done on both cluster members.
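For example, if the load balancer is named "my-custom-lb" (a hypothetical name), the relevant line in $FWDIR/conf/azure-ha.json would read:
{
    ...
    "lbName": "my-custom-lb",
    ...
}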
The names of the Load Balancer's Inbound NAT rules must start with "cluster-vip".
The cluster members' names in Azure should match the cluster name with a suffix of '1' and '2'.
The cluster public IP address name in Azure should match the configuration file. By default it should match the cluster name.
Permissions
Starting from the following image versions, it is possible to assign the service principal permissions to specific Azure resources.
Image versions can be identified on each cluster member in the /etc/in-azure file:
gey_hvm-56-288
ogu_GAR1-20-289
Refer to sk116585 on how to determine the image version.
To allow the cluster to update the necessary Azure resources on failover, the service principal should be assigned at least the following roles on these resources, or on their respective resource groups:
Resource type | Role
Load Balancer Public IP | Virtual Machine Contributor
Load Balancer | Network Contributor
CloudGuard Virtual machines | Reader
Cluster Public IP | Network Contributor
Public cluster members IP | Virtual Machine Contributor
Virtual Network | Virtual Machine Contributor
Any route table used by subnets in the VNET | Network Contributor
The network interfaces used by the cluster member | Virtual Machine Contributor
Note: If a load balancer is not deployed and specific permissions are used, the load balancer attribute must be removed from the configuration file on both cluster members:
Edit the $FWDIR/conf/azure-ha.json file.
Remove the 'lbName' attribute:
{
    ...
    "lbName": "frontend-lb",
    ...
}
Apply the change by executing:
$FWDIR/scripts/azure_ha_cli.py reconf
This feature is available starting from template version: 20180301.
Known Limitations and Issues
The feature is only available in Azure Resource Manager deployments. It is not supported with Azure Service Manager (also known as classic) deployments.
Only two members per cluster are supported.
Running the Security Management Server on the cluster members is not supported.
Only High Availability mode (Active/Standby) is supported. Load Sharing modes are not supported.
VRRP is not supported.
Failover times: Cluster failover time depends on the types of cluster resources used. For more information, see the Example Environment section.
Services consumed by the cluster members via VPN are only reachable by the active member. The standby member will be able to reach those services once it becomes active.