unRAID does not (currently) provide a feature to remove a drive from the array without losing parity integrity. So if you do remove a disk, and while you are rebuilding parity with the remaining disks you hit a read error on one of them (or, worse yet, lose a drive), you have no way of recovering. If you had run a parity check just before the removal, this would be very unlikely, but if you didn't, the chances of encountering at least one read error on a large array are certainly significant.
Not too long ago, Tom revealed a command that I thought might allow a way to accomplish a drive removal while maintaining parity protection. I experimented with this yesterday, and proved it can be done. Below is the documentation of what I did.
To do this test I set up a test array with some spare drives. My parity is 300G, my data1 is 250G, and my data2 is an older 20G drive. My goal is to remove the 20G drive without losing parity protection. I first copied a bunch of data to the 20G drive to make sure it was full of non-zero data.

UPDATED 8/16/10 for new unRAID versions and to reference a larger buffer to increase speed.

UPDATED 1/16/12. These instructions rely on the "mdcmd set invalidslot ..." command, which was introduced into unRAID in release 4.3.2 and worked through version 4.7. However, in the 5.0 betas this functionality is broken in some betas and works differently than documented in this post in others. If you are beyond version 4.7, research the "set invalidslot" functionality in your version of unRAID, and be prepared to adapt the instructions here. Once the 5.0 release comes out, I will update these instructions for 5.0.
**** STANDARD DISCLAIMER: USE THIS PROCESS AT YOUR OWN RISK. ****

1: CLEAN BEGINNING
- Parity check in progress on the array. Notice that the parity check is past the 20G drive's size with no sync errors.

2: BEGIN FILL - TELNET
- Just ran the umount and dd commands to fill the 20G drive with zeros.
UPDATE: By increasing the blocksize, this step can be sped up considerably. (Thanks Joe L.) See later post for more info, but the syntax for this example would be:
dd if=/dev/zero bs=2048k of=/dev/md2

3: BEGIN FILL - WEB GUI
- Web GUI shortly after dd begins. Notice that the drive instantly becomes unformatted when you fill it with zeros. But the parity is still being updated as the drive is being zeroed.

4: END FILL - TELNET
- dd command is done. This took a long time (see "FINAL THOUGHTS" below).

5: END FILL - WEB GUI
- Web GUI after dd ends.

6: STOPPED THE ARRAY
- Just pressed the Stop button.

7: GO TO DEVICES TAB
- About to unassign Disk2 (20G drive).

8: UNASSIGN DEVICE
- Drive unassigned.

9: BACK TO MAIN PAGE
- Drive is missing.

10: RESTORE READY
- About to press the dreaded Restore button. This must be done in order to remove a drive from the array.
UPDATE: The Restore button was replaced with an "initconfig" command in later versions of unRAID. If the Restore button does not appear, simply run this command from a telnet prompt.

11: RESTORE PRESSED
- Technically parity protection is gone now, but the array is offline. Hard to think of a way you could lose data now. Parity will be restored in 15 seconds.

12: SET INVALIDSLOT
- Special command that tells unRAID that the array is already protected. Starting the array after this command will trigger a PARITY CHECK and not a PARITY REBUILD. (Remember that the array is protected during a parity check, but not during a parity rebuild.)

13: ARRAY JUST STARTED
- Back to the Web GUI and pressed the Start button. Notice that the parity CHECK automatically starts and is in progress. (Remember the array IS protected during a parity check.) If we hadn't run the set invalidslot command, parity would be rebuilding, and the array would be totally unprotected until the rebuild completed. Changing the parity rebuild to a parity check is the goal of this process. Although not recommended, you could stop the parity check and the array would still be protected.

14: PARITY CHECK - FINAL
- Parity check is past the size of the 20G drive. ZERO sync errors. Mission accomplished.
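Why the parity check comes back clean can be illustrated with a toy model. This is only a sketch, assuming single parity computed as the bitwise XOR of the data disks. Each "disk" here is reduced to one byte, and the byte values and the parity_of helper are made up for illustration. Once disk2 holds only zeros, it contributes nothing to the XOR, so the parity already on the parity drive is the same parity a rebuild without disk2 would produce.

```shell
#!/usr/bin/env bash
# Toy model: one byte per "disk"; parity is the XOR of all data disks.
# Assumption: single XOR parity. Values and helper name are hypothetical.

parity_of() {
    local p=0 b
    for b in "$@"; do
        p=$(( p ^ b ))
    done
    echo "$p"
}

data1=$(( 0xA7 ))   # made-up byte standing in for the 250G disk
data2=$(( 0x3C ))   # made-up byte standing in for the 20G disk

before=$(parity_of "$data1" "$data2")   # parity with the original data
data2=0                                 # the dd step: disk2 is now all zeros
zeroed=$(parity_of "$data1" "$data2")   # parity after being updated during the fill
removed=$(parity_of "$data1")           # parity computed with disk2 gone

echo "before=$before zeroed=$zeroed removed=$removed"
```

The zeroed and removed values match, which is the condition that lets the array start with a parity check instead of a rebuild.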
FINAL THOUGHTS
The only problem with this procedure is the amount of time it took to fill the drive with zeros with the "dd" command. The 20G drive is old and pretty slow (likely a 5400 RPM drive), but even so, filling it with zeros took 10+ hours (remember this is only a 20G drive). There has to be a faster way to accomplish this. If we can get past that performance problem, I think this is a very plausible way to remove a drive while maintaining parity protection.
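The zero-fill step can be rehearsed on a scratch file before pointing dd at a real array device. The following is a rough sketch and not part of the procedure I ran: the temporary file stands in for the md device, the 8 MB size is arbitrary, and the verification assumes a cmp that supports the -n byte limit (GNU cmp does). It uses the larger 2048k block size from the update in step 2, then confirms that nothing but zeros remains.

```shell
#!/usr/bin/env bash
# Rehearsal on a scratch file; nothing here touches a real disk.
target=$(mktemp)    # hypothetical stand-in for the md device
size_mb=8           # arbitrary small size for the rehearsal

# Fill the stand-in with non-zero data first, as was done with the 20G drive.
head -c $(( size_mb * 1024 * 1024 )) /dev/urandom > "$target"

# Zero-fill using the larger block size; conv=notrunc overwrites in place.
dd if=/dev/zero of="$target" bs=2048k count=$(( size_mb / 2 )) conv=notrunc 2>/dev/null

# Verify: compare the whole target against /dev/zero, limited to its length.
bytes=$(( $(wc -c < "$target") ))
if cmp -s -n "$bytes" "$target" /dev/zero; then
    result="all zeros"
else
    result="non-zero data found"
fi
echo "$result"

rm -f "$target"
```

Timing the dd line on a scratch file says little about a real array, since writes to an md device also update parity, but the same cmp check could in principle be run against the device after the fill and before unassigning it.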