Ubuntu RAID Pain Relief

Recently, while working on a client project, we upgraded the Ubuntu version of our base Amazon EC2 AMI to 12.04.1 LTS (Precise Pangolin). Some time later we hit a problem when replacing a failed ElasticSearch node built from this image: when the instance rebooted, the md device changed from /dev/md0 to /dev/md127, and our /raiddrive mount did not come back up because fstab expected the array to be on /dev/md0. Why this happens is the subject of much debate if you search the interwebs, and it's a problem that has bitten many.
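
The symptom is easy to confirm: the array is still assembled, just under a different device node. A quick check along these lines (not part of the fix) shows it:

# See which md devices actually exist after boot
cat /proc/mdstat

# Inspect the renamed array - the members and UUID are unchanged,
# only the device node is different
sudo mdadm --detail /dev/md127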

At the time we really wanted to get the node going again quickly, so the workaround was simply to mount the array on /dev/md127 instead. This is a bad smell that would probably come back and bite us later if left too long.
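
For the record, the quick-and-dirty workaround was roughly this (a sketch of what we did; /raiddrive is the mount point from our setup):

# Temporary workaround: mount the array on whatever device node it came up as
sudo mount /dev/md127 /raiddrive
# Pointing the fstab entry at /dev/md127 instead of /dev/md0 also "works",
# but that is exactly the bad smell mentioned above.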

We soon had to replace all of our ElasticSearch clusters anyway to upgrade from 0.19.9 to 0.20.6, so it was time to solve the problem properly. First, an upgrade of Ubuntu to 12.04.2, and then some extra configuration to fix the mounting problem.
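
The Ubuntu upgrade itself is the boring part; on a running 12.04 instance it is just the normal point-release update (a sketch, assuming you are updating in place rather than baking a fresh AMI):

sudo apt-get update
sudo apt-get dist-upgrade -y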

This is the bash script to mount 4 x EBS volumes into a RAID 0 array.

#!/bin/bash
set -e -x

sudo apt-get install -y mdadm xfsprogs

# Give the array a name - this is the key to it keeping the same device
# node across reboots.
sudo mdadm --create /dev/md0 --name esdata --level 0 --chunk=256 --metadata=1.1 --raid-devices=4 /dev/xvdh /dev/xvdi /dev/xvdj /dev/xvdk

# Record the member devices and the resulting array definition in
# mdadm.conf so the array is assembled under the same name on every boot.
echo DEVICE /dev/xvdh /dev/xvdi /dev/xvdj /dev/xvdk | sudo tee /etc/mdadm/mdadm.conf

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# Update the initramfs - very important after mdadm change
sudo update-initramfs -u

# Create an XFS filesystem on the new array
sudo mkfs.xfs /dev/md0

# Don't hang the EC2 instance boot if the volume isn't ready yet (nobootwait)
echo "/dev/md0 /raiddrive xfs nodiratime,nobootwait 0 0" | sudo tee -a /etc/fstab

sudo mkdir -p /raiddrive
sudo mount /raiddrive
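
After a reboot it is worth confirming the array really does come back as /dev/md0 under the name we gave it (a quick sanity check, not part of the script):

# The array should reappear as md0 and report the name we set (esdata)
cat /proc/mdstat
sudo mdadm --detail /dev/md0 | grep -i name

# And the fstab entry should have mounted it
df -h /raiddrive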

Voila, your RAID array will now survive a restart and stay on the same device number.

Posted on April 22, 2013.