Rebuilding a degraded software mirrored RAID array

CNET staff
5 min read

Apple offers a couple of software RAID solutions for people who want either to combine two disks into one larger volume or to mirror two disks for data redundancy in case a drive fails. With hardware RAID solutions, you can usually swap out a failed disk and the controller will rebuild the array automatically; with Apple's software RAID, however, rebuilding is not so straightforward.

When a mirrored RAID array goes into degraded mode, there is no loss of function because the data is redundant, so the repair is not an emergency. A two-disk mirror has no redundancy left until it is rebuilt, though, so for the sake of data and workflow safety the rebuild should be done at your first convenience. After all, workflow continuity is the benefit of a software mirrored RAID over standalone disks when you are not using a hardware solution that supports hot-swapping and automatic rebuilding.

The way to rebuild an array is to format the corrupted disk (or swap it out for a new, freshly formatted one), and then use the Terminal to repair the mirror. For start-up disks this must be done when booted off another source (e.g., the OS X installation DVD). There are ways to rebuild arrays using the Disk Utility application, but that can sometimes fail and cause further problems, so we are outlining how to do this strictly through the Terminal.

Primary approach

First, be sure all your data is backed up in an easily restorable manner (e.g., with Time Machine), then launch the Terminal and run the following command to check the RAID setup:

sudo diskutil checkRAID

This can be done when booted off the RAID array, and should print out a list of the RAID arrays and information about them as follows:

 server:~ admin$ sudo diskutil checkRAID
 password:
 RAID SETS
 ---------
 ===============================================================================
 Name:                 Boot RAID
 Unique ID:            2E725E1C-645E-4E12-A894-418D521CEAB7
 Type:                 Mirror
 Status:               Online
 Size:                 82213994496 B
 Device Node:          disk3
 Apple RAID Version:   2
 -------------------------------------------------------------------------------
 #   Device Node       UUID                                   Status
 -------------------------------------------------------------------------------
 0   disk0s3           BEFB54EC-C405-410F-BEB0-73A1FDA0F5E2   Online
 1   disk2s3           1F8148B9-E173-4862-9C5E-F026A5FCDD6B   Online
 ===============================================================================

In this output, the key items to find are the device nodes of the RAID drives (disk0 and disk2) and the final device node of the array (disk3). We will use these identifiers later in rebuilding the array.

If a drive is degraded, you will see something other than "Online" listed under "Status," indicating that this is the drive to replace. If the system boots off the RAID array, you will need to perform the repair after booting off the OS X installation DVD; otherwise, you can continue in the current Terminal session. You will also need to power off the system to replace the faulty disk, unless a spare drive is already installed (for mirrored RAID arrays, the new drive must be at least as large as the current disks in the array).
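The status check described above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name "degraded_members" is our own, and it assumes the member rows of the "checkRAID" output keep the layout shown in the example (index, device node, UUID, status). It prints every member whose status is anything other than "Online."

```shell
# Print "device uuid status" for each RAID member whose status is not "Online".
# Reads `diskutil checkRAID` output on stdin. Assumption: member rows look like
# the example above (index, device node, UUID, status).
degraded_members() {
    awk '
        $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9A-Fa-f]+(-[0-9A-Fa-f]+)+$/ {
            # The status may span several words, so rejoin fields 4..NF.
            status = $4
            for (i = 5; i <= NF; i++) status = status " " $i
            if (status != "Online") print $2, $3, status
        }
    '
}

# Usage: sudo diskutil checkRAID | degraded_members
```

For the healthy array in our example this prints nothing, since both members report "Online."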

The next thing to do is to list the devices on the system with the following command:

sudo diskutil list

This will output a list of the disk devices along with partition, size, and identifier information. In our example computer we see the following:

 server:~ admin$ sudo diskutil list
 password:
 /dev/disk0
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     Apple_partition_scheme                        *76.7 Gi    disk0
   1:        Apple_partition_map                         31.5 Ki    disk0s1
   2:                 Apple_Boot Boot OSX                128.0 Mi   disk0s2
   3:                 Apple_RAID                         76.6 Gi    disk0s3
 /dev/disk1
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     Apple_partition_scheme                        *76.7 Gi    disk1
   1:        Apple_partition_map                         31.5 Ki    disk1s1
   2:                  Apple_HFS Server TM Backup        76.6 Gi    disk1s3
 /dev/disk2
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     Apple_partition_scheme                        *76.7 Gi    disk2
   1:        Apple_partition_map                         31.5 Ki    disk2s1
   2:                 Apple_Boot Boot OSX                128.0 Mi   disk2s2
   3:                 Apple_RAID                         76.6 Gi    disk2s3
 /dev/disk3
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:                  Apple_HFS Boot RAID              *76.6 Gi    disk3
 /dev/disk4
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     Apple_partition_scheme                        *2.7 Ti     disk4
   1:        Apple_partition_map                         64.0 Ki    disk4s1
   2:                  Apple_HFS Data1                   2.7 Ti     disk4s3
 /dev/disk5
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     Apple_partition_scheme                        *698.6 Gi   disk5
   1:        Apple_partition_map                         31.5 Ki    disk5s1
   2:                  Apple_HFS Data2                   698.5 Gi   disk5s3

From this information, identify the drive that needs replacing, then power off the system and replace it. In this case, disk0 and disk2 each contain an "Apple_RAID" partition, making them members of the array that appears as device "disk3" and is named "Boot RAID." The other disks in the system are standalone disks that are not members of software RAID setups.
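On a system with many disks, picking out the array members by eye can be tedious. As a rough sketch (the function name "raid_member_slices" is hypothetical, and it assumes the column layout of the "diskutil list" output shown above), the member partitions can be filtered out like this:

```shell
# Print the identifier of every Apple_RAID partition found in `diskutil list`
# output read from stdin. Assumption: the partition type is the second
# whitespace-separated field and the identifier is the last one, as above.
raid_member_slices() {
    awk '$2 == "Apple_RAID" { print $NF }'
}

# Usage: sudo diskutil list | raid_member_slices
```

Against the listing above, this would report disk0s3 and disk2s3; any disk that does not appear is not part of a software RAID set.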

With the new disk installed, you will see it display without the "Apple_RAID" label. Identify the disk based on its size, and then run the following command to rebuild the array:

sudo diskutil repairMirror MIRRORNODE NEWDISKNODE

In our example, if we had a new drive device called "disk6" (the "NEWDISKNODE") and there was a fault with the array on disk3 (the "MIRRORNODE"), we would reference the array and the new disk as follows:

sudo diskutil repairMirror disk3 disk6

Pressing enter will rebuild the array accordingly. To ensure data integrity, we recommend not touching the array while it is being rebuilt, but doing so technically should not harm anything.
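Because swapping the two arguments or mistyping an identifier could point the repair at the wrong disk, it may be worth checking the identifiers before running anything. The sketch below is our own illustration: "is_disk_id" and "repair_mirror_cmd" are hypothetical helper names, and the check assumes both arguments are whole-disk identifiers such as disk3 and disk6. It only prints the command so you can review it first.

```shell
# Succeed only if the argument looks like a whole-disk identifier (disk<N>).
is_disk_id() {
    case ${1#disk} in
        "$1"|''|*[!0-9]*) return 1 ;;  # no "disk" prefix, no number, or a non-digit
        *) return 0 ;;
    esac
}

# Validate both identifiers, then print (not run) the repairMirror command.
repair_mirror_cmd() {
    for id in "$1" "$2"; do
        if ! is_disk_id "$id"; then
            echo "not a whole-disk identifier: $id" >&2
            return 1
        fi
    done
    if [ "$1" = "$2" ]; then
        echo "mirror node and new disk node must differ" >&2
        return 1
    fi
    echo "sudo diskutil repairMirror $1 $2"
}

# Usage: repair_mirror_cmd disk3 disk6
```

Printing rather than executing is a deliberate choice here: the command rebuilds onto the disk you name, so a final look at the identifiers is cheap insurance.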

Alternative approach

You can also use the "addToRAID" and "removeFromRAID" options to manage problematic RAID drives. To do this, you will need the UUIDs of the drives in use on a RAID array, so we recommend writing down, for each array you have, the UUIDs listed by the "checkRAID" option (see above) while the array is still healthy. If a drive is malfunctioning, you may not be able to find its UUID.
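Recording the UUIDs ahead of time can be done with a short filter rather than by hand. This is only a sketch under the same layout assumption as the "checkRAID" example above; the function name "member_uuids" and the output file name are hypothetical.

```shell
# Print "device uuid" for every member row in `diskutil checkRAID` output
# read from stdin. Assumption: rows are index, device node, UUID, status.
member_uuids() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^[0-9A-Fa-f]+(-[0-9A-Fa-f]+)+$/ { print $2, $3 }'
}

# Usage (raid-uuids.txt is a hypothetical file name):
#   sudo diskutil checkRAID | member_uuids > raid-uuids.txt
```

Keep the saved list somewhere other than the array itself, so it is still readable if the array cannot be mounted.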

To use these commands to remove a member disk from an array, first add a new drive to the current array, expanding the mirror onto it. This will set up a healthy array among the working disks:

sudo diskutil addToRAID member DEVICENODE RAIDNODE

In this command, we are adding the "DEVICENODE" device (e.g., disk6) to the "RAIDNODE" array (e.g., disk3) as a "member" drive and not a spare drive. The RAID array will be expanded to this device, and should continue functioning accordingly. When this is done (use the "checkRAID" command to monitor progress, which can take some time), you should then remove the nonfunctional drive from the array with the following command:

sudo diskutil removeFromRAID OLDDEVICEUUID RAIDNODE

In this command, the "OLDDEVICEUUID" is the UUID string for the problematic array drive, and "RAIDNODE" is the array device identifier. Looking at our example, if we wanted to remove "disk2" from our mirrored array, we would enter the following:

sudo diskutil removeFromRAID 1F8148B9-E173-4862-9C5E-F026A5FCDD6B disk3

While you can use device identifiers instead of UUIDs for drives that are present, a drive that is not functional and not showing up can only be referenced by its UUID in the "removeFromRAID" command. After this command is run, you can then use the "addToRAID" command to expand the RAID array onto new member disks. With the new disk present in the system (disk6 in our example), the following command will add it to the current array:

sudo diskutil addToRAID member disk6 disk3

After you have expanded the array, use the "checkRAID" command mentioned above to ensure the array is online and healthy. Rebooting the computer may be required to mount the array after expanding it.
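The final health check, or waiting on a long rebuild, can also be scripted. The following is a sketch only: "raid_set_status" is a hypothetical name, it relies on the "Status:" line of the "checkRAID" output shown earlier, and the polling loop in the comments assumes the system has a single RAID set.

```shell
# Print the overall status of each RAID set (the text after "Status:") from
# `diskutil checkRAID` output read from stdin. Member rows are ignored because
# their first field is an index, not "Status:".
raid_set_status() {
    awk '$1 == "Status:" { $1 = ""; sub(/^ +/, ""); print }'
}

# Example polling loop (assumes exactly one RAID set; it never exits if the
# array does not come back online, so interrupt it with Control-C if needed):
#   until [ "$(sudo diskutil checkRAID | raid_set_status)" = "Online" ]; do
#       sleep 60
#   done
```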
