Set Drive as Disabled? (Parity Swap Disabled)



The drive in question is showing signs of unacceptable behavior from wear and tear, so it needs to be replaced. It has not had any write errors, but has had some questionable read errors. The current data drive and the current parity drive are both 2TB in size. The new parity drive will be a nice Hitachi 4TB 7200rpm drive.

 

This scenario seems to fit perfectly for performing the Disabled Parity Swap move.

 

So my questions are: does the drive need to be marked as disabled? If so, is there a way to set a drive as disabled under either the unRAID 5 or unRAID 6 series? That is, aside from shutting down the server and disconnecting the drive in question, assuming that does indeed mark a disk as disabled and not just missing.

 

I feel like I've seen this done somewhere by feeding commands into the md driver, but can't quite seem to find it now.

 

Thanks for any help or pointers to where to find the answers.

 

The closest bit I'm finding on this is:

You can "fail a disk" by stopping the array, un-assigning it from its slot, starting the array with it un-assigned, and then stopping the array once more. Starting the array with it un-assigned will mark it as "failed".

 

More information is being found in this thread: http://lime-technology.com/forum/index.php?topic=29529.0


Yes, that is the most recent discussion on "parity swap disable" I'm aware of.

The key is to cause a drive/slot to show up red balled.

Then the option to perform the drive swap will show up.

Assign the old parity drive to the data slot and the new drive to the parity slot (old parity --> new data, new drive --> new parity).

 

As I understand it, you still have the old data drive, which is still working, as a backup.

So there is basically nothing to lose if it goes wrong.


Welp... Decided to give this a go on unRAID 5.05.

 

I resorted to downing the server and physically disconnecting the read-error 2TB data drive.

I brought the server back up and noticed the array was not started and the drive appeared as missing.

I started the array and the 2TB data drive appeared as disabled.

I stopped the array.

I assigned new 4TB drive as parity.

I assigned the old 2TB parity drive as the disabled 2TB data drive.

I noticed the [Copy] button was available, checked the checkbox, and clicked on [Copy].
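For anyone checking whether their own drives fit this procedure, the size constraints implied by the assignments above can be sketched in Python. This is only an illustration under stated assumptions: the `Drive` class and the `parity_swap_problems` function are hypothetical names, not part of unRAID, and the two rules encoded here (the new parity drive must be at least as large as the old one, and the old parity drive must be at least as large as the data drive it replaces) are my reading of unRAID's requirement that parity be no smaller than any data disk.

```python
from dataclasses import dataclass

TB = 10**12  # decimal terabyte, the way drive vendors count it


@dataclass(frozen=True)
class Drive:
    label: str        # hypothetical label, for readable messages only
    size_bytes: int


def parity_swap_problems(new_parity, old_parity, disabled_data):
    """Return a list of size problems; an empty list means the sizes fit."""
    problems = []
    # Assumption: the old parity contents are copied onto the new parity drive,
    # so the new drive cannot be smaller than the old one.
    if new_parity.size_bytes < old_parity.size_bytes:
        problems.append(f"{new_parity.label} is smaller than {old_parity.label}")
    # Assumption: the old parity drive takes over the disabled data slot,
    # so it must hold at least as much as the disk it replaces.
    if old_parity.size_bytes < disabled_data.size_bytes:
        problems.append(f"{old_parity.label} is smaller than {disabled_data.label}")
    return problems


# The drives from this thread: 4TB new parity, 2TB old parity, 2TB data.
print(parity_swap_problems(
    Drive("new 4TB parity", 4 * TB),
    Drive("old 2TB parity", 2 * TB),
    Drive("disabled 2TB data", 2 * TB),
))
```

With the sizes from this thread the list comes back empty, which matches the [Copy] button being offered.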

 

The array screen updated with the 4TB Parity drive and 2TB Data drive showing up as blue balled. Array status showed "Copying, 0% complete...". After a few minutes I clicked [Refresh] button and Array status showed "Copying, 2% complete...".

 

Now it's a matter of waiting things out and seeing how it turns out. If this produces a functional system, I will then perform a parity check to ensure everything should be fine.

 

After that, I will upgrade back to unRAID 6.0 beta 8 and add a second 4TB drive as a brand new data drive. I would have preferred replacing the 2TB drive directly with a 4TB drive, but there were too many complicating factors. Mostly, I didn't trust the 2TB data drive to return correct reads while I rebuilt parity on the 4TB drive and then reconstructed the 2TB drive onto the replacement 4TB data drive.

 

I will update this thread after the procedure is complete with the final verdict.

 

The lesson for me is not to assume a drive is fine just because it has no write errors.

 

I now consider the following features absolute requirements in a real NAS system:

  • Scheduled automated SMART tests.
  • Notification of SMART test failures.
  • Notification of all errors, READ errors in addition to WRITE errors.
  • Ability to easily reconstruct a drive onto another drive.
  • Ability to manually mark a drive as disabled, which removes the need for physical access to the server.
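
As a rough illustration of the first two items, here is a minimal Python sketch that asks `smartctl` (from smartmontools) for each drive's overall health verdict and reports anything that did not pass. It assumes smartmontools is installed, that the drives are ATA/SATA devices whose `smartctl -H` output contains the line `SMART overall-health self-assessment test result: ...`, and that something like cron runs the script on a schedule. The function names, the device list, and the `notify` callback are hypothetical placeholders; none of this is a feature of unRAID itself.

```python
import subprocess

HEALTH_MARKER = "SMART overall-health self-assessment test result:"


def health_passed(smartctl_output):
    """Return True only if the overall-health line reports PASSED."""
    for line in smartctl_output.splitlines():
        if line.startswith(HEALTH_MARKER):
            return line.split(":", 1)[1].strip() == "PASSED"
    return False  # no verdict line found: treat it as worth reporting


def check_drive(device):
    """Run `smartctl -H <device>` and interpret its output."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True, text=True, check=False,
    )
    return health_passed(result.stdout)


def report_unhealthy(devices, check=check_drive, notify=print):
    """Call notify() for every device whose health check did not pass."""
    failed = [dev for dev in devices if not check(dev)]
    for dev in failed:
        notify(f"SMART health check did not pass for {dev}")
    return failed


if __name__ == "__main__":
    # Hypothetical device names; adjust to the drives in the array.
    report_unhealthy(["/dev/sda", "/dev/sdb"])
```

The overall verdict is a coarse signal. An actual scheduled self-test could be started with `smartctl -t short <device>` and its result read back later, and `notify` could be swapped for an email or similar alert.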

 

 


Process is still going. Seems to be about 3.5 minutes per percent.

 

Sep 6 18:34:16 REAVER emhttp: copy: 54% complete

Sep 6 18:37:38 REAVER emhttp: copy: 55% complete

Sep 6 18:40:59 REAVER emhttp: copy: 56% complete

Sep 6 18:44:21 REAVER emhttp: copy: 57% complete

Sep 6 18:47:48 REAVER emhttp: copy: 58% complete

Sep 6 18:51:18 REAVER emhttp: copy: 59% complete

Sep 6 18:54:46 REAVER emhttp: copy: 60% complete

Sep 6 18:58:15 REAVER emhttp: copy: 61% complete

Sep 6 19:01:45 REAVER emhttp: copy: 62% complete

Sep 6 19:05:20 REAVER emhttp: copy: 63% complete
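
The "about 3.5 minutes per percent" figure can be checked directly from the syslog lines above. The following Python sketch parses them and extrapolates the remaining time, assuming the copy keeps its current pace. The function names are made up for this example, and the year is passed in only because syslog timestamps omit it (2014, going by the dates elsewhere in this thread).

```python
import re
from datetime import datetime

COPY_LOG = """\
Sep 6 18:34:16 REAVER emhttp: copy: 54% complete
Sep 6 18:37:38 REAVER emhttp: copy: 55% complete
Sep 6 18:40:59 REAVER emhttp: copy: 56% complete
Sep 6 18:44:21 REAVER emhttp: copy: 57% complete
Sep 6 18:47:48 REAVER emhttp: copy: 58% complete
Sep 6 18:51:18 REAVER emhttp: copy: 59% complete
Sep 6 18:54:46 REAVER emhttp: copy: 60% complete
Sep 6 18:58:15 REAVER emhttp: copy: 61% complete
Sep 6 19:01:45 REAVER emhttp: copy: 62% complete
Sep 6 19:05:20 REAVER emhttp: copy: 63% complete
"""

COPY_LINE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+emhttp: copy: (\d+)% complete"
)


def parse_copy_progress(lines, year=2014):
    """Return (timestamp, percent) pairs for every copy-progress line."""
    samples = []
    for line in lines:
        match = COPY_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            samples.append((stamp, int(match.group(2))))
    return samples


def seconds_per_percent(samples):
    """Average pace between the first and last sample."""
    if len(samples) < 2 or samples[-1][1] == samples[0][1]:
        raise ValueError("need at least two samples at different percentages")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    return (t1 - t0).total_seconds() / (p1 - p0)


samples = parse_copy_progress(COPY_LOG.splitlines())
pace = seconds_per_percent(samples)
remaining_min = (100 - samples[-1][1]) * pace / 60
print(f"{pace / 60:.2f} min per percent, about {remaining_min:.0f} min to go")
```

On these ten lines the pace works out to about 3.45 minutes per percent, so roughly two more hours from the 63% mark if nothing changes.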

 


The system finished copying the parity information from the old 2TB parity drive to the new 4TB parity drive. It then showed an Array Status of "Stopped. Upgrading disk/swapping parity." The parity disk is green-balled and the replacement 2TB data drive is orange-balled.

 

The next step was to check the "Yes I want to do this" box next to the [Start] button, whose description reads: "Start will expand the file system of the data disk (if possible); and then bring the array on-line and start Data-Rebuild."

 

After some time (30 seconds or so), the web console refreshed and showed Array Status as "Started. Data-Rebuild in progress.". The progress indicator shows total size of 2TB, 18.29 GB (1%) completed at estimated speed of 99.28 MB/sec and finish in 333 minutes.
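
The "333 minutes" estimate is consistent with simple arithmetic on the other numbers shown. A quick Python check, assuming the web console uses decimal units (1 GB = 1000 MB) and that "2TB" means roughly 2000 GB; `eta_minutes` is just a name for this example:

```python
def eta_minutes(total_gb, done_gb, speed_mb_per_s):
    """Minutes left if the current speed held for the rest of the drive."""
    remaining_mb = (total_gb - done_gb) * 1000  # assumption: 1 GB = 1000 MB
    return remaining_mb / speed_mb_per_s / 60


# Figures from the Data-Rebuild progress display quoted above.
print(round(eta_minutes(total_gb=2000, done_gb=18.29, speed_mb_per_s=99.28)))
```

This rounds to 333, matching the console. The estimate only extrapolates the current speed, so the real finish time can be later if the drives slow down further into the rebuild.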

 


Just a point to note: if you want to replace the 2TB data drive with a 4TB one, I would suggest doing it before going back to v6, or waiting for v6 beta 9. v6 beta 8 has (temporarily) disabled expanding the file system to use the full drive when replacing a drive with a larger one.

Link to comment

The replacement data drive rebuild has completed. The Array Status showed parity has not been checked.

 

I unchecked the box indicating "Correct any Parity-Check errors by writing the Parity disk with corrected parity.". The non-correcting parity check is now in progress. The check status shows 4TB with 0 sync errors, current position 15.42 GB at estimated 105.11 MB/s with finish time in 632 minutes.

 

 


It does appear to have completely worked. The parity check finished; Last checked on Sun Sep 7 21:41:36 2014 EDT, finding 0 errors.

 

The first disk rebuild was done only over the size of the actual data drive (2TB), while the final parity check was over the entire array (4TB).

 

Sep  6 21:51:34 REAVER kernel: mdcmd (41): check CORRECT

Sep  6 21:51:34 REAVER kernel: md: recovery thread woken up ...

Sep  6 21:51:34 REAVER kernel: md: recovery thread rebuilding disk1 ...

Sep  6 21:51:34 REAVER kernel: md: using 6688k window, over a total of 1953514552 blocks.

Sep  7 05:14:34 REAVER kernel: md: sync done. time=26579sec

Sep  7 05:14:34 REAVER kernel: md: recovery thread sync completion status: 0

<<...snip...snip...>>

Sep  7 09:55:55 REAVER kernel: mdcmd (46): check NOCORRECT

Sep  7 09:55:55 REAVER kernel: md: recovery thread woken up ...

Sep  7 09:55:55 REAVER kernel: md: recovery thread checking parity...

Sep  7 09:55:55 REAVER kernel: md: using 6688k window, over a total of 3907018532 blocks.

Sep  7 19:58:01 REAVER kernel: mdcmd (47): spindown 2

Sep  7 19:58:01 REAVER kernel: mdcmd (48): spindown 3

Sep  7 19:58:02 REAVER kernel: mdcmd (49): spindown 4

Sep  7 19:58:03 REAVER kernel: mdcmd (50): spindown 5

Sep  7 21:41:36 REAVER kernel: md: sync done. time=42340sec

Sep  7 21:41:36 REAVER kernel: md: recovery thread sync completion status: 0
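
The kernel lines above are enough to work out the average throughput of each pass. A small Python sketch, assuming the md driver's "blocks" are 1 KiB each (which is consistent with the 2TB and 4TB drive sizes here); the function name and the tuple layout are just for this example:

```python
import re

BLOCK_BYTES = 1024  # assumption: md reports sizes in 1 KiB blocks

KERNEL_LOG = """\
Sep  6 21:51:34 REAVER kernel: md: using 6688k window, over a total of 1953514552 blocks.
Sep  7 05:14:34 REAVER kernel: md: sync done. time=26579sec
Sep  7 09:55:55 REAVER kernel: md: using 6688k window, over a total of 3907018532 blocks.
Sep  7 21:41:36 REAVER kernel: md: sync done. time=42340sec
"""

TOTAL_RE = re.compile(r"over a total of (\d+) blocks")
DONE_RE = re.compile(r"sync done\. time=(\d+)sec")


def summarize_passes(lines):
    """Pair each 'over a total of N blocks' line with the next 'sync done' line.

    Returns a list of (size_bytes, seconds, average_MB_per_s) tuples.
    """
    passes = []
    pending_blocks = None
    for line in lines:
        total = TOTAL_RE.search(line)
        if total:
            pending_blocks = int(total.group(1))
            continue
        done = DONE_RE.search(line)
        if done and pending_blocks is not None:
            seconds = int(done.group(1))
            size = pending_blocks * BLOCK_BYTES
            passes.append((size, seconds, size / seconds / 1e6))
            pending_blocks = None
    return passes


for size, seconds, rate in summarize_passes(KERNEL_LOG.splitlines()):
    print(f"{size / 1e12:.2f} TB in {seconds / 3600:.1f} h, average {rate:.1f} MB/s")
```

Under that block-size assumption the rebuild averaged roughly 75 MB/s over about 7.4 hours, and the parity check roughly 94 MB/s over about 11.8 hours. Both are below the speeds shown near the start of each pass.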

 

  • 1 year later...

I have created an updated wiki page for the Parity Swap procedure: "The Parity Swap procedure".

 

I would really appreciate review and corrections, especially from Brit if he has time. It's wordy and has no pictures (afraid that's not my strong point), but I believe it has extra hand-holding, for both new users and all of us who rarely run it.

 

I've called it the 'Parity Swap' procedure, not the 'Swap Disable' procedure, which it's called more often.  I hope that's not a problem, and I can change it, but I think 'Parity Swap' is clearer, easier to understand.

 

It's not well tested.  I just used it successfully on my own v6.1 system, but not with a failed drive, so there may be behavioral quirks with other versions and situations.  PLEASE let us know!

