
High Availability Cluster

Deploying a high-availability (HA) SocketXP Gateway cluster involves setting up and configuring an active server and a standby server. This setup ensures that one server is always available to handle gateway functions.

At any given time, only one server will assume the active role and perform all gateway functions, such as forwarding traffic and managing connections. This mode of high availability (HA) is called active-standby mode.

SocketXP IoT Platform High Availability Cluster Setup

How SocketXP Active-Standby Mode Works

Typically, HA clusters function in active-active mode or active-standby mode.

The SocketXP Gateway HA Cluster operates exclusively in active-standby mode. This means that at any point in time, only one server will be active. The active server will receive and forward all incoming and outgoing traffic. All IoT devices connect to the active server, and all traffic from these devices, apps, or end-users will be handled by it.

The standby server will remain idle and act as a backup. It will become active and start forwarding traffic only when the current active server goes down or becomes completely unresponsive.

How are active server failures detected, and how is traffic forwarding switched over to the standby server? Specifically, who detects this failure and takes the necessary corrective action?

This is where Virtual IP (VIP) and Virtual Router Redundancy Protocol (VRRP) concepts become essential. These will be discussed in detail in the next section.

Virtual IP Address (VIP) and VRRP

The SocketXP Gateway HA cluster utilizes the concepts of Virtual IP address (VIP) and Virtual Router Redundancy Protocol (VRRP) to operate effectively in active-standby mode. These technologies work together to provide a single, consistent point of access to the active gateway server.

The cluster nodes will elect one server as the master (or active) server. Traffic will always be forwarded to the master (active) server in the cluster.

Although VRRP originated as a redundancy protocol for network routers, it applies equally well to server infrastructure.

VRRP allows a set of routers (or servers, in our case) to be grouped together to form a single virtual router. This virtual router has a Virtual IP Address (VIP) associated with it. One of the routers in the group is elected as the master, owning the VIP and forwarding traffic destined for that IP address. The other routers act as backups. If the master router fails, a backup router automatically takes over the master role, assumes the VIP, and continues forwarding traffic, providing transparent failover.

The SocketXP Gateway HA cluster utilizes the same logic.

A Virtual IP address (VIP) will be used to route all traffic (received from the internet through the NAT/Firewall) to the active server in the SocketXP Gateway HA cluster. Whenever a failover occurs, the newly elected master (active) server will take over the VIP and send out a gratuitous (unsolicited) ARP. This ensures that all other devices on the network update their ARP tables to direct traffic for the VIP to the newly active server's MAC address.
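One way to observe the ARP update described above is to look up the MAC address that a neighboring host has cached for the VIP, before and after a failover. The following is a minimal sketch, not part of SocketXP or keepalived: the helper name vip_mac is hypothetical, and the address 10.1.1.100 is the example VIP used in the Configuration section of this guide.

```shell
#!/bin/sh
# vip_mac reads `ip neigh show` output on stdin and prints the MAC address
# (the field following "lladdr") cached for the given IP address.
# The function name is hypothetical and used only for illustration.
vip_mac() {
    awk -v vip="$1" '
        $1 == vip {
            for (i = 2; i < NF; i++) {
                if ($i == "lladdr") { print $(i + 1); exit }
            }
        }'
}

# Example, run on another host in the same subnet as the cluster:
#   ip neigh show | vip_mac 10.1.1.100
```

If the printed MAC address changes after the active server is stopped, the neighbor has learned the new owner of the VIP.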

NAT and Public IP address

The VIP associated with the HA cluster will be NAT'ed to a public IP address when communicating with IoT devices, apps, or end-users on the internet. The *.socketxp.example.com domain must resolve in DNS to this public IP address, which is associated with the VIP.
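The DNS requirement can be checked from any machine on the internet. The sketch below is illustrative only: the helper name contains_ip, the hostname gateway.socketxp.example.com, and the address 203.0.113.10 (a documentation-range IP standing in for your public IP) are all assumptions.

```shell
#!/bin/sh
# contains_ip reads one address per line on stdin (for example, the output
# of `dig +short`) and succeeds if the expected address is among them.
# -x matches whole lines only; -F treats the address as a literal string.
contains_ip() {
    grep -qxF -- "$1"
}

# Example (hypothetical hostname and placeholder public IP):
#   dig +short gateway.socketxp.example.com A | contains_ip 203.0.113.10 \
#       && echo "DNS resolves to the public IP"
```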

The VIP can be managed by running keepalived on each server in the cluster. keepalived will use the Virtual Router Redundancy Protocol (VRRP) to manage the VIP within the cluster by electing the master and standby nodes.

Note:

SocketXP Gateway server will work with any HA load-balancing solution that supports the active-standby model. SocketXP will also work with any Virtual IP address (VIP) management software that implements VRRP, such as keepalived or vrrpd.

The SocketXP Gateway implementation is independent of the load-balancer solution discussed in this document. SocketXP Gateway configuration looks the same on both the master and the standby server. No special configuration is required in the SocketXP Gateway server to implement the HA function.

What is Keepalived

Keepalived is a software package that provides simple and robust load balancing and high availability (HA) for Linux systems.

Keepalived implements VRRP and is commonly used in server environments to provide HA functionality.

Download and Install Keepalived

Keepalived can be easily installed on most Linux distributions using the distribution's package manager. Here's how to do it on some common distributions:

Debian/Ubuntu

sudo apt update
sudo apt install keepalived

Red Hat/CentOS/Fedora

sudo yum install keepalived
#   or
sudo dnf install keepalived  #   On Fedora and newer CentOS/RHEL

For other distributions, please refer to their respective documentation.

Configuration

Here is a sample keepalived configuration file for the master and the standby server to run VRRP and manage the VIP in the cluster.

The main configuration file for keepalived is typically located at /etc/keepalived/keepalived.conf on most Linux distributions.

The Virtual IP (VIP) address used in this example is 10.1.1.100/24. Each server in the cluster has its own IP address configured on the VRRP interface (eth0).

For example, the master node has an IP address of 10.1.1.5/24, and the backup node has an IP address of 10.1.1.10/24 in this case.


Keepalived config on the master server:

vrrp_instance VI_1 {
    state MASTER
    interface eth0  #   Replace with your actual network interface
    virtual_router_id 51  #   Must be the same across all nodes
    priority 255
    advert_int 5  #   Advertisement interval in seconds
    unicast_src_ip 10.1.1.5
    authentication {
        auth_type PASS
        auth_pass your_password
    }
    virtual_ipaddress {
        10.1.1.100/24  #   Your Virtual IP Address
    }
    #   'nopreempt' is not included here
}

Keepalived config on the standby server:

vrrp_instance VI_1 {
    state BACKUP
    interface eth0  #   Replace with your actual network interface
    virtual_router_id 51  #   Must be the same across all nodes
    priority 200
    advert_int 5  #   Advertisement interval in seconds
    unicast_src_ip 10.1.1.10
    authentication {
        auth_type PASS
        auth_pass your_password
    }
    virtual_ipaddress {
        10.1.1.100/24  #   Your Virtual IP Address
    }
    nopreempt  # included only in the standby
}

Explanation:

Master Server: The master server has higher priority and takes over the Virtual IP (VIP).

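Standby Server: The standby server has lower priority and takes over the VIP when the master fails. The nopreempt option is intended to let the standby retain the VIP after the master recovers, so that the recovered master becomes the new backup server. Note that keepalived honors nopreempt only on a node whose initial state is BACKUP, and the option controls whether that node preempts a lower-priority master, so confirm the failback behavior against the keepalived documentation.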

Important:

After making changes to the keepalived.conf file, you must restart the Keepalived service for the changes to take effect. Use the following command:

sudo systemctl restart keepalived
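After the restart, it can be useful to confirm which node currently holds the VIP. The following is a minimal sketch under the example values used in this guide (VIP 10.1.1.100 on eth0); the helper name has_vip is hypothetical.

```shell
#!/bin/sh
# Values mirror the example in this guide; override via the environment.
VIP="${VIP:-10.1.1.100}"
IFACE="${IFACE:-eth0}"

# has_vip reads `ip -o -4 addr show` output on stdin and succeeds if the
# given address appears as an inet address (the prefix length is ignored).
has_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), a, "/")
                    if (a[1] == vip) found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }'
}

if ip -o -4 addr show dev "$IFACE" 2>/dev/null | has_vip "$VIP"; then
    echo "active: $VIP is configured on $IFACE"
else
    echo "standby: $VIP is not configured on $IFACE"
fi
```

Running this on both servers should report the VIP on exactly one of them. The service state and logs can be inspected with sudo systemctl status keepalived and sudo journalctl -u keepalived.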

Keepalived Tracking Scripts:

You can also implement and use custom scripts to monitor the health of the active server or any application running on it using the following keepalived configuration settings. These scripts indicate the true health of the active server to keepalived, providing a more robust health check than relying solely on VRRP advertisement packets.

...
...

vrrp_script chk_httpd {
    script "/etc/keepalived/check_httpd.sh"
    interval 3     #   Check every 3 seconds
    weight -50     #   Decreases the server's priority by 50 if the script fails
    rise 2          #   Requires 2 successful checks to become healthy
    fall 3          #   Requires 3 failed checks to become unhealthy
}

vrrp_script chk_network {
    script "/etc/keepalived/check_network.sh" #   Check network connectivity
    interval 5     #   Check every 5 seconds
    weight -20     #   Decreases the server's priority by 20 if the script fails
    rise 2
    fall 2
}

track_script {
    chk_httpd
    chk_network
}
...
...

For more information about these settings and the meaning of each variable in the config, please refer to the official keepalived documentation.
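The contents of the tracking scripts are not prescribed by this guide. As an illustration, a script such as /etc/keepalived/check_httpd.sh might look like the sketch below. keepalived treats an exit status of 0 as a successful check and a non-zero status as a failure. The URL http://127.0.0.1:80/ and the CHECK_URL variable are assumptions; replace them with an endpoint your gateway actually serves.

```shell
#!/bin/sh
# Hypothetical /etc/keepalived/check_httpd.sh
# Exit status 0 means healthy; any non-zero status counts as a failed check.

# Assumed local health endpoint; override with the CHECK_URL variable.
CHECK_URL="${CHECK_URL:-http://127.0.0.1:80/}"

check_http() {
    # --fail      : return non-zero on HTTP status 400 or above
    # --max-time 2: give up after 2 seconds, below the 3 second interval
    curl --fail --silent --output /dev/null --max-time 2 "$1"
}

# Run the check only when executed under the script's own file name, so the
# function can also be sourced and exercised on its own.
case "${0##*/}" in
    check_httpd.sh)
        check_http "$CHECK_URL"
        exit $?
        ;;
esac
```

The script must be executable (chmod +x /etc/keepalived/check_httpd.sh) for keepalived to run it.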

Note:

This guide explains how the SocketXP Gateway HA Cluster works and uses keepalived as an example tool to implement the HA function. It does NOT cover the full set of functions and features provided by keepalived. Please refer to the keepalived documentation for more information.