Mdadm

From Briki
Jump to: navigation, search

Overview

Several physical disks (/dev/sdX) or partitions (/dev/sdX1) of equal size are joined into a single array.

Creating a RAID array

  • (Recommended) Create a partition on each disk. Note:
    • Use optimal alignment, with "-a optimal" (this doesn't appear to have any obvious effect on behaviour though!)
    • Use the "GPT" partition table format (to handle disks > 2TB)
    • Name the partition "primary" (note that this is free text)
    • Use 0% for partition start (this will normally mean that the partition start will be at the 1MB boundary, which gives optimal alignment)
    • End 100MB before the end of the disk (this is to allow for slight variances in exact size of similar disks)
    • Set partition type to raid (0xFD00); this is optional, but may encourage some tools to avoid writing directly to the disk (and avoid corrupting the array)
parted -a optimal -s /dev/sdX -- mklabel gpt mkpart primary 0% -100MB set 1 raid on


  • Create a RAID 5 array over 3 partitions:
    • Note, the default metadata version is now 1.2 for create commands
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdX1 /dev/sdY1 /dev/sdZ1
  • Wait (potentially several days) for the array to be built
  • Once built, save the current raid setup to /etc, to allow for automounting on startup:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
  • Update the initial boot image for all current kernel versions to include the new mdadm.conf:
update-initramfs -u -k all
  • Start the array:
mdadm --assemble /dev/md0 /dev/sdX1 /dev/sdY1 /dev/sdZ1
  • From this point, just treat the array (/dev/md0) as a normal physical disk.

Convert RAID 1 array to RAID 5

  • Create partition on the new disk as for creating a new array
  • Add the new partition to the array:
mdadm --add /dev/md0 /dev/sdX1
  • Convert the array to RAID 5, with the correct number of devices:
mdadm --grow --level=5 --raid-devices=3
  • Wait (potentially several days) for the array to be reshaped
  • Grow the partition / volume on /dev/md0

Recovering from disk failure

  • Check the disk status in mdadm:
mdadm --detail /dev/md0
  • If the disk is already marked as failed, then skip this step. Otherwise:
mdadm /dev/md0 --fail /dev/sdX1
  • From this point, the array will continue to operate in "degraded" mode
  • Remove the failed disk:
mdadm /dev/md0 --remove /dev/sdX1
  • To more easily determine the disk for physical removal from the machine (once powered off), note down the serial number as reported by:
hdparm -i /dev/sdX | grep SerialNo
  • Add a replacement disk:
mdadm /dev/md0 --add /dev/sdY1
  • Wait (potentially several days) for the array to be resynced

Recover from a dirty reboot of a degraded array

If the server shuts down uncleanly (eg. due to a power cut) when the array is degraded, it will refuse to automatically assemble the array on startup (with a dmesg error of the form "cannot start dirty degraded array"). This is to because the data may be in an inconsistent state. In this situation:

  • Check that the good disks have the same number of events. If the numbers differ slightly, that suggests some of the data being written when the server shutdown wasn't written fully, and is probably corrupt (hopefully this will just mean a logfile with some bad characters, or similar).
mdadm --examine /dev/sdX /dev/sdY /dev/sdZ | grep Events
  • Assuming the number of events is the same (or very similar), forcibly assemble the array.
mdadm --assemble --force /dev/md0 /dev/sdX1 /dev/sdY1 /dev/sdZ1

Useful Commands

cat /proc/mdstat 
Display a summary of current raid status
mdadm --detail /dev/md0 
Display raid information on array md0
mdadm --examine /dev/sdf 
Display raid information on device/partition sdf