Wednesday, 14 February 2018

Creating a high availability setup for Linux on Power

This article describes high availability (HA), disaster recovery (DR), and failover for Linux on Power virtual machines (VMs) or logical partitions (LPARs). The solution described in this article works for all Linux distributions available for IBM® POWER8® and later processor-based servers. The open source components used in this solution are Distributed Replicated Block Device (DRBD) and heartbeat, which are available for all supported distributions. We have used Ubuntu 16.04, which is supported on IBM Power® servers, to explain and verify the solution.

We are using DRBD for this solution because it is a software-based, shared-nothing, replicated storage solution that mirrors the content of block devices (such as hard disks, partitions, and logical volumes) between hosts. DRBD mirrors data in real time: replication occurs continuously and transparently while applications modify the data on the device. Mirroring can be synchronous or asynchronous. With synchronous mirroring, applications are notified of write completion only after the write operation has been carried out on all (connected) hosts. With asynchronous mirroring, applications are notified of write completion as soon as the write has completed locally, which is usually before it has propagated to the other hosts.
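
DRBD expresses this choice through replication protocols that are selected per resource in the configuration. As a brief sketch using the DRBD 8.x syntax (protocol C is synchronous, protocol A is asynchronous, and protocol B is an intermediate, semi-synchronous mode):

resource r0 {
     protocol C;     # complete writes only after they reach the peer's disk
     # ... device, disk, and node definitions as shown later in this article
}

Recent DRBD releases default to protocol C, which is the behavior we rely on in this article.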

Heartbeat is an open source program that provides cluster infrastructure capabilities (cluster membership and messaging) to client servers, which is a critical component in an HA server infrastructure. Heartbeat is typically used in conjunction with a cluster resource manager or, as in this article, with a replicated storage solution such as DRBD, to achieve a complete HA setup.

This article demonstrates how to create a HA cluster with two nodes by using DRBD, heartbeat, and a floating IP.

Goal


After reading through this article, you will be able to set up an HA environment consisting of two Ubuntu 16.04 servers in an active/passive configuration. This is accomplished by pointing a floating IP, which is how users access their services or websites, at the primary (active) server unless a failure is detected. When the heartbeat service detects that the primary server is unavailable, the secondary server automatically runs a script to reassign the floating IP to itself. Subsequent network traffic to the floating IP is then directed to the secondary server, which acts as the active server until the primary server becomes available again, at which point the primary server reassigns the floating IP to itself. (You can prevent the primary node from automatically taking the role back by disabling the auto_failback option.)

Requirements


We need the following setup to be in place before we proceed with failover:

◈ Two servers or VMs with Ubuntu 16.04 installed. These will act as the primary and secondary servers for the application and web services.
◈ One floating IP that will act as the IP address for the application and web services.
◈ One additional disk for each VM, for installing the application and web services. The disks need not be shared between the VMs.

Installing DRBD


First, we need to install DRBD on both servers and create resource groups on free disks, which need not be shared between the VMs. This means local disks can be used for this solution; storage area network (SAN) disks are not required.

We can install DRBD along with its dependent packages. Run the following command to install DRBD on both servers; on Ubuntu 16.04, this pulls in the drbd8-utils package, which provides the drbdadm administration tool.

apt-get install drbd* -y

Figure 1. Install DRBD

apt-get install "linux-image-extra-`uname -r`"

This dependent package provides the DRBD kernel module for the currently running kernel.

Figure 2. Install kernel extra packages

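To confirm that the DRBD kernel module is now available, you can query it with modinfo:

modinfo drbd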

Installing heartbeat


The next step is to install heartbeat on both servers. The simplest way to install heartbeat is to use the apt-get command.

apt-get install heartbeat

Figure 3. Install heartbeat package

After successfully installing the heartbeat package, you need to configure it for high availability.

Now that we have installed the packages required for DRBD and HA, we can begin the DR and HA configuration. First, we will configure DRBD and then configure heartbeat.

Configuring DRBD


To configure DRBD, you need a storage resource (a disk, directory, or mount point), which will be defined as a DRBD resource group (in our example, referred to as r0). This resource contains all the data that needs to be moved from the primary to the secondary node when a failover happens.

We need to define the resource group r0 in the /etc/drbd.d/r0.res file. The r0.res file should look as shown below:

resource r0 {
     device    /dev/drbd1;    # virtual block device that applications use
     disk      /dev/sdc;      # backing disk on each node
     meta-disk internal;      # keep DRBD metadata on the backing disk itself
     on drbdnode1 {
          address   172.29.160.151:7789;    # replication address and port of node 1
     }
     on drbdnode2 {
          address   172.29.160.51:7789;     # replication address and port of node 2
     }
}

Here, /dev/drbd1 is the DRBD device name that applications will use, and in our case we are using the disk /dev/sdc as the backing storage device. Note that all the nodes involved in the HA setup (in our case, two) need to be defined in the r0.res file.

After the file is created on both participating nodes, we need to run the following commands. In our example, we are initially making drbdnode1 the primary node and drbdnode2 the backup node, so whenever drbdnode1 fails, drbdnode2 should take over as the primary.

Run the following commands on both nodes.

modprobe drbd
/etc/init.d/drbd start

Figure 4. Creating kernel module for DRBD

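Once the service is running, the kernel exposes DRBD status through /proc/drbd, so a quick check on either node confirms that the module is loaded:

cat /proc/drbd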

Now, initialize the r0 resource group using the following command. DRBD metadata must be created on each node's backing disk, so run this on both drbdnode1 and drbdnode2:

drbdadm create-md r0

Figure 5. Creating metadata

Then, define drbdnode1 as the primary node and drbdnode2 as the secondary node by running the following commands on the respective nodes. Note that the very first promotion of a freshly created resource typically requires drbdadm primary --force r0, because both copies start out in the Inconsistent state and the forced promotion triggers the initial synchronization.

drbdadm primary r0      # run on drbdnode1
drbdadm secondary r0    # run on drbdnode2

Figure 6. Overview of primary node

Figure 7. Overview of secondary node

After setting the nodes as primary and secondary, start the DRBD resource group. We need to do this to make the DRBD resource active and ready to use.

Create a file system on the primary node using the following command:

root@drbdnode1:~# mkfs.ext4 /dev/drbd1

Then, create the mount point (if it does not already exist) and mount the disk on the primary node.

root@drbdnode1:~# mkdir -p /data
root@drbdnode1:~# mount /dev/drbd1 /data

Figure 8. Check if file system is mounted

We have now completed the DRBD configuration. Next, we need to configure heartbeat, which automates the failover in case of a disaster.

Configuring heartbeat


In order to get the required cluster up and running, we must set up the following heartbeat configuration files in /etc/ha.d, identically on both servers:

◈ ha.cf: Contains the global configuration of the heartbeat cluster, including its member nodes. This file makes both nodes aware of the network interfaces and cluster nodes to be monitored for heartbeat purposes.
◈ authkeys: Contains a security key that gives nodes a way to authenticate to the cluster.
◈ haresources: Specifies the services that are managed by the cluster and the node that is the preferred owner of each service. Note that this file is not used in a setup with a cluster resource manager (CRM) such as Pacemaker; our heartbeat-only setup does use it.

Create the ha.cf file

On both servers, open /etc/ha.d/ha.cf:

vi /etc/ha.d/ha.cf

We need to add the details of each node in our cluster as shown in Figure 9.

Figure 9. Review ha.cf file

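Figure 9 shows our file; as a hedged example of what ha.cf could contain (the node names and interface come from our setup, while the timing values are common illustrative defaults, not prescriptive):

logfacility     local0         # send heartbeat logs to syslog
keepalive       2              # seconds between heartbeat packets
deadtime        10             # seconds of silence before a node is declared dead
udpport         694            # UDP port used for heartbeat traffic
bcast           ibmeth0        # interface on which heartbeats are broadcast
auto_failback   off            # keep services on the secondary after a failover
node            drbdnode1      # must match uname -n on the first node
node            drbdnode2      # must match uname -n on the second node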

Next, we'll set up the cluster's authorization key.

Create the authkeys file

The authorization key is used to allow cluster members to join a cluster. We can just generate a random key for this purpose.

On the primary node, run the following commands to generate a suitable authorization key and store it in an environment variable named AUTH_KEY:

if [ -z "${AUTH_KEY}" ]; then
  export AUTH_KEY="$(command dd if='/dev/urandom' bs=512 count=1 2>'/dev/null' \
      | command openssl sha1 \
      | command cut --delimiter=' ' --fields=2)"
fi

Then create the /etc/ha.d/authkeys file. Because the generated key must be expanded into the file, write it from the same shell session rather than typing $AUTH_KEY into an editor:

printf 'auth1\n1 sha1 %s\n' "$AUTH_KEY" > /etc/ha.d/authkeys

The resulting file has two lines: auth1 selects authentication scheme 1, and the second line defines scheme 1 as SHA-1 with the generated key.

Figure 10. Generating authkeys

Ensure that the file is readable only by the root user:

chmod 600 /etc/ha.d/authkeys

Now, copy the /etc/ha.d/authkeys file from your primary node to your secondary node.
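
For example, assuming root SSH access between the nodes, scp does the job:

scp /etc/ha.d/authkeys root@drbdnode2:/etc/ha.d/authkeys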

On the secondary server, make sure to set the permissions of the authkeys file:

chmod 600 /etc/ha.d/authkeys

Both servers should have an identical /etc/ha.d/authkeys file.

Create the haresources file

The haresources file should contain details of the hosts participating in the cluster. The preferred host is the node that should run the associated services whenever it is available. If the preferred host is not reachable by the cluster, one of the other nodes takes over. In other words, the secondary server takes over if the primary server goes down.

On both servers, open the haresources file in your favorite editor. We'll use vi.

vi /etc/ha.d/haresources

Now add the following line to the file, substituting in your primary node's name (drbdnode1 in our example):

drbdnode1 floatip

This configures drbdnode1 as the preferred host for the floatip service, which is currently undefined.
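
Heartbeat looks for resource scripts with this name in /etc/ha.d/resource.d/ (or /etc/init.d/) and invokes them with start and stop arguments. A minimal, hypothetical floatip script, reusing the interface and address configured in the floating IP section below, could look like this:

#!/bin/sh
# /etc/ha.d/resource.d/floatip -- minimal sketch of a floating IP resource script
case "$1" in
    start)
        # bring the floating IP up as an alias on this node
        ifconfig ibmeth0:0 9.126.160.53 netmask 255.255.192.0 up
        ;;
    stop)
        # drop the alias so the peer can claim the floating IP
        ifconfig ibmeth0:0 down
        ;;
esac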

Figure 11. Review haresources file

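Alternatively, instead of a custom script, the drbddisk script (typically shipped with the DRBD utilities) and the Filesystem and IPaddr scripts (shipped with heartbeat) can tie the DRBD role, the file system, and the floating IP together in one haresources line. A hedged sketch for our setup (the /18 prefix corresponds to the 255.255.192.0 netmask used later):

drbdnode1 drbddisk::r0 Filesystem::/dev/drbd1::/data::ext4 IPaddr::9.126.160.53/18/ibmeth0

With such a line, heartbeat promotes the DRBD resource, mounts /data, and brings up the floating IP on whichever node currently runs the services.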

Configuring floating IP

Now we will configure the floating IP on the primary VM, where the service will be running first. The floating IP should be active on only one server at a time. This IP is where we host our application or web service, and in case of a failover, we move this IP to the other server. In the following example, we configure the floating IP on drbdnode1; in case of failover, it needs to move to drbdnode2.

ifconfig ibmeth0:0 9.126.160.53 netmask 255.255.192.0 up

Figure 12. Confirm floating IP is assigned

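On systems where ifconfig is deprecated in favor of iproute2, an equivalent command (a sketch; the /18 prefix matches the 255.255.192.0 netmask above) would be:

ip addr add 9.126.160.53/18 dev ibmeth0 label ibmeth0:0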

Testing high availability


The next step is to start the DRBD and heartbeat services one after the other, in the following order.

Run the following command on the primary node:

drbdadm primary r0

Run the following command on the secondary node:

drbdadm secondary r0

Start heartbeat services on both cluster nodes using the following command:

service heartbeat start

Now, to check which VM is primary and which is secondary, run the drbd-overview command on both VMs.

Figure 13. Check DRBD status

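On the primary node, drbd-overview shows this node's role first and the peer's role second; illustrative output (device sizes and exact formatting vary by DRBD version) looks like this:

 1:r0/0  Connected Primary/Secondary UpToDate/UpToDate /data ext4 50G 52M 47G 1%

A Connected state with UpToDate/UpToDate disks means replication is healthy on both sides.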

After completing the primary and secondary setup, and verifying that our services are working as intended, initiate the failover test.

You can perform the failover test in the following scenarios:

◈ Reboot the primary node using the reboot or shutdown -r command.
◈ Halt the primary node using the halt command.
◈ Stop the heartbeat service on the primary node using the service heartbeat stop command.

After running one of these scenarios, monitor the failover using the drbd-overview command. Within a few seconds, you should notice that the secondary node has taken over the primary role and that all the services are up on this node. The floating IP also moves along with the failover of the services.

Figure 14. Successful failover

Figure 14 shows that drbdnode2 has taken over as the primary node, indicating a successful failover.
