
Upgrade a drive easily



The process to upgrade a drive in an array is either convoluted or risky.

 

There should be a way to add a drive to an array and tell unRAID to replace an existing drive.

 

Methods are up for discussion, but making the original drive read-only, cloning it, and then swapping it in with little or no array downtime should be the drivers for the feature.

 

We should also have a verification step.

 

Drives keep getting bigger, and upgrading should be trivial and mostly automatic.

Link to comment

Thumbs up to this ... Frankly I'm a bit surprised this hasn't been a staple feature from the outset. 

 

That said, what you describe is only possible with a spare drive port. If all your ports are full then you have no choice but to add ports, or... pull the "old" drive, install the "new" drive, and rebuild from parity. If something goes wrong you at least still have the "old" drive, which you can put back.

 

But given a spare port, having a swap-in-place feature as described by NAS seems like a no-brainer.

Link to comment

Thumbs up to this ... Frankly I'm a bit surprised this hasn't been a staple feature from the outset. 

 

That said, what you describe is only possible with a spare drive port. If all your ports are full then you have no choice but to add ports, or... pull the "old" drive, install the "new" drive, and rebuild from parity. If something goes wrong you at least still have the "old" drive, which you can put back.

 

But given a spare port, having a swap-in-place feature as described by NAS seems like a no-brainer.

If your new drive is physically installed, you have a parity drive installed, and all the other disks in the array (other than the one being replaced) are online, then all you need to do is:

Stop the array (press the Stop button).

Using the drop-down box, assign the new drive in place of the one being replaced.

Start the array. (The contents of the old drive will be reconstructed onto the new one.)

 

There is very little downtime. The array is fully available at all times except for the minute or two it takes to assign the replacement disk to the slot being replaced.

 

That feature has been in unRAID since the very beginning. 

Link to comment

Trying to brain out the request, I think it has more to do with the array being unprotected during the data rebuild rather than offline. (At least, it isn't offline... so that shouldn't be the request.)

 

It does make sense to be able to point at a new drive and say "Use this to replace that" and copy the data over while maintaining the parity to deal with a failure during the rebuild.  Sounds complex to me, but in theory I can see how it could work.

 

 

Link to comment

But during that rebuild time you are not parity protected for as long as it takes to rebuild. No?

 

Doing it NAS's way means you are not parity protected for... well, never, as best I can tell. At all times you have at least one valid copy of the data drives and one valid copy of parity, all live at the same time (live vs. the "old" disk sitting on your desk and needing to be reinstalled).

 

As I said in my first post, sure, you do have the "old" drive sitting around just in case, but that is not as seamless. If by some unfortunate chance you suffer a drive failure, you have a lot more work to do in order to recover. With NAS's way, if drive Y fails while creating the copy of drive X onto drive Z, then the copy process stops and the array is taken offline. You then even have the option of using Z to replace Y without having to dig around your case again and risk cabling issues.

Link to comment

Just for clarity, the "risky" way I mentioned in the OP is using parity to rebuild the drive. Even with a 2TB HDD this is many hours of every drive spinning. Yes, it is slick, but it is inelegant and inherently risky, especially when a low-risk option is logically possible.

 

I was thinking that if this were broken into two steps, clone and replace, we would also gain cloning as a feature in its own right, which is useful in itself.
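As a stopgap, the clone step can already be done by hand once the array is stopped. A minimal sketch using dd, where /dev/sdX is the old drive and /dev/sdY is the equal-or-larger replacement (both placeholders; triple-check device names before running anything like this):

```
# With the array stopped so nothing writes to the source, clone the
# old drive onto the new one block for block.
dd if=/dev/sdX of=/dev/sdY bs=64M status=progress
```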

Link to comment

This would be an operation where the drive is "tentatively" added to the array, upon which a 'copy' operation is started in the raid engine to copy data from the "old" drive to the "new" drive, while also writing to both the old and new drives if writes take place during the copy. When the copy finishes, the array config is changed so that the new drive takes the place of the old drive, and the old drive is unassigned. If the copy fails, or a write to the new drive fails, the new drive is unassigned from the array.

 

That about describe it?

Link to comment

Sounds about right. You would have to copy the old drive's data and then zero the rest of the new drive's empty space for this to work.
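If the replacement is larger, zeroing the space past the cloned region is what keeps parity valid. A rough sketch, again with /dev/sdX as the old drive and /dev/sdY as the new one (placeholders), assuming the old drive's size is a whole number of MiB:

```
# Size of the old (smaller) drive in bytes.
OLD_BYTES=$(blockdev --getsize64 /dev/sdX)

# Zero the new drive from the end of the cloned region onward so the
# extra capacity is all zeroes and parity stays consistent.
dd if=/dev/zero of=/dev/sdY bs=1M seek=$((OLD_BYTES / 1048576)) status=progress
```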

 

NAS suggested making the existing data drive read-only so you would not have to worry about any writes during the upgrade process.

 

 

Link to comment

So... to take it to its logical conclusion, you would temporarily create a RAID1 volume from the old and new drives, rebuild the new member of that temporary RAID1, and then break the RAID1 after a successful build, with the option of having either drive become the permanent member of the full array, depending on any errors reported.

 

You wouldn't have to make any part of the array read-only; just make sure all writes happen to both halves of the RAID1.

 

If the new drive has more free space than the original, you could then expand the filesystem, permanently breaking the RAID1 relationship.
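For what it's worth, stock Linux md can already do roughly this with a superblock-less mirror. A sketch only, with placeholder device names, and no claim that unRAID's modified md driver would tolerate it as-is (also check that your mdadm version accepts "missing" with --build):

```
# Wrap the existing partition in a two-way mirror with the second
# slot empty. --build writes no md superblock, so the on-disk data
# is used exactly as it is.
mdadm --build /dev/md9 --level=1 --raid-devices=2 /dev/sdX1 missing

# Hot-add the new partition; the kernel copies the old member over
# and mirrors any writes that arrive while the resync runs.
mdadm /dev/md9 --add /dev/sdY1

# Watch the resync.
cat /proc/mdstat

# After a clean resync, drop the old member; the new drive carries on.
mdadm /dev/md9 --fail /dev/sdX1
mdadm /dev/md9 --remove /dev/sdX1
```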

 

I personally would LOVE to see the option of a super redundancy setup with the option of keeping the RAID1 enabled on specified volumes. It would enable you to designate a super safe share with double drive failure tolerated, at the cost of a drive slot. The rest of the array would still only tolerate a single drive failure.

Link to comment

I personally would LOVE to see the option of a super redundancy setup with the option of keeping the RAID1 enabled on specified volumes. It would enable you to designate a super safe share with double drive failure tolerated, at the cost of a drive slot. The rest of the array would still only tolerate a single drive failure.

 

:) I hesitated to bring this up as a new feature request since it seemed to get a lukewarm reception the last time I asked about it.

 

But yeah, the ability to have a share automatically mirrored across more than one drive for extra protection would be a nice middle ground before double parity arrives, for super-important, irreplaceable data. Of course, for data like that, off-site backup really is something to consider.

Link to comment

This would be an operation where the drive is "tentatively" added to the array, upon which a 'copy' operation is started in the raid engine to copy data from the "old" drive to the "new" drive, while also writing to both the old and new drives if writes take place during the copy. When the copy finishes, the array config is changed so that the new drive takes the place of the old drive, and the old drive is unassigned. If the copy fails, or a write to the new drive fails, the new drive is unassigned from the array.

 

That about describe it?

 

Absolutely perfect. When the old drive is unassigned you still have it, just in case, while you take a glimpse at the logs and browse the new drive. If you wish to repurpose the old drive, assign it to an unused port; it gets formatted and you're on your way. This would be nice if it's possible and you have the time, especially for upsizing a parity drive (say from 2TB to 3TB, or 3TB to 4TB).

 

 

Link to comment

This would be an operation where the drive is "tentatively" added to the array, upon which a 'copy' operation is started in the raid engine to copy data from the "old" drive to the "new" drive, while also writing to both the old and new drives if writes take place during the copy. When the copy finishes, the array config is changed so that the new drive takes the place of the old drive, and the old drive is unassigned. If the copy fails, or a write to the new drive fails, the new drive is unassigned from the array.

 

That about describe it?

 

That sounds exactly correct, and it removes the need for the read-only step, making it even more useful.

Link to comment
  • 5 months later...

Replacing a working disk is not as risky as portrayed. The original disk is still available if something goes wrong during a rebuild. A copy of the original config directory is also needed, so the array can be reverted to its prior configuration. The failure can then be handled.

 

Instead of maintaining a concurrent duplicate, it seems simpler to automate the backup of the required config files. If a failure occurs during a rebuild, the previous config and set of disks can be restored and the failure condition resolved.
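Scripting that backup is nearly a one-liner. A sketch, assuming the flash drive is mounted at /boot as usual:

```
# Snapshot the array configuration from the flash before the swap;
# restoring this directory returns the array to its previous layout.
cp -a /boot/config "/boot/config.bak.$(date +%Y%m%d)"
```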

 

EDIT: This does require that the drive being replaced is not written to during the upgrade. Maintaining a live duplicate would allow the drive to be written to during the upgrade.

Link to comment

Sorry dgaschk,

 

but if you do it the way it is currently done, it is much riskier than the OP's copy-over-and-replace idea.

 

Because currently the rebuild puts its load on the whole array.

 

The nicest way would be a preclear, so the drive is checked and there is no need to check parity; then just copy from old to new and replace.

...maybe, if you like, with a re-read after the copy process.

 

Parity fine, disk pre-checked, copied data checked (as an option), and no unnecessary load on the rest of the array.
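That optional re-read could be as simple as checksumming both drives over the copied region. A sketch with placeholder device names (/dev/sdX old, /dev/sdY new):

```
# Checksum the entire old drive.
md5sum /dev/sdX

# Checksum the same number of bytes from the start of the new drive;
# the two sums should match if the copy is good.
OLD_BYTES=$(blockdev --getsize64 /dev/sdX)
head -c "$OLD_BYTES" /dev/sdY | md5sum
```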

 

So the 'new' replace process would be much better.

 

Just my 2 cents...

 

Matthias

Link to comment

The disks all being accessed during a rebuild is immaterial: you access all the drives just as much during a parity check as you would during a drive rebuild. I'm not sure about everyone else, but I run monthly parity checks to ensure the array is healthy, so my disks see far more activity during normal use than they ever do during a disk replacement.

Link to comment

I am about to upgrade a 2TB disk in my array to a 3TB disk. Is there a way to simply copy the data from the 2TB disk to the new disk, and then put the new disk in place of the old one, to avoid a rebuild? Add the new drive outside the array (after a preclear, of course), copy the data, then swap new for old? I know the array will still freak out... but can this be mitigated with some simple commands?

 

Seems a waste to have the data rebuilt from parity when it can just be copied over to the new disk...
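For reference, the manual copy described above might look roughly like this; the device name, disk number, and mount point are all placeholders:

```
# Mount the precleared-and-formatted new disk somewhere temporary.
mkdir -p /mnt/newdisk
mount /dev/sdY1 /mnt/newdisk

# Copy everything from the old array disk, preserving attributes.
rsync -avh /mnt/disk3/ /mnt/newdisk/
```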

Link to comment

I am about to upgrade a 2TB disk in my array to a 3TB disk. Is there a way to simply copy the data from the 2TB disk to the new disk, and then put the new disk in place of the old one, to avoid a rebuild? Add the new drive outside the array (after a preclear, of course), copy the data, then swap new for old? I know the array will still freak out... but can this be mitigated with some simple commands?

 

Seems a waste to have the data rebuilt from parity when it can just be copied over to the new disk...

 

The safest course is to rebuild onto the new disk, because this maintains parity protection. Keep the old disk available in case something goes wrong. Make a backup of the config folder before starting, and perform a parity check before and after the rebuild.

 

Do not write to the disk during the rebuild. It needs to remain identical to the one being replaced in case failure recovery is required. Reading is fine, and the other disks can be used normally.

Link to comment

Sorry dgaschk,

 

but if you do it the way it is currently done, it is much riskier than the OP's copy-over-and-replace idea.

 

Because currently the rebuild puts its load on the whole array.

 

The nicest way would be a preclear, so the drive is checked and there is no need to check parity; then just copy from old to new and replace.

...maybe, if you like, with a re-read after the copy process.

 

Parity fine, disk pre-checked, copied data checked (as an option), and no unnecessary load on the rest of the array.

 

So the 'new' replace process would be much better.

 

Just my 2 cents...

 

Matthias

 

I agree that augmenting unRAID to perform a duplication is preferable. However, the first few posts argued that protection is not available during a rebuild, and this is not strictly true. The procedure I describe in my last post allows for recovery if something does go wrong during a rebuild. unRAID's safety model can be maintained during a rebuild with a few manual steps.

 

The disks all being accessed during a rebuild is immaterial: you access all the drives just as much during a parity check as you would during a drive rebuild. I'm not sure about everyone else, but I run monthly parity checks to ensure the array is healthy, so my disks see far more activity during normal use than they ever do during a disk replacement.

 

Totally agree. A few hours on drives designed to last tens of thousands of hours is immaterial.

Link to comment

Another approach that would keep a level of protection in this scenario is to add support for dual parity disks. That way one could still handle the failure of another drive while replacing and rebuilding a drive in the array. I thought support for dual parity disks was on the roadmap, so maybe that is the way to solve this problem, and it may be easier to implement?

Link to comment

I also have found the current disk upgrade procedure to be lacking in the "warm fuzzy" department.

I support the following:

Lock Source drive

Init Target drive (skip if precleared)

Copy files from Source to Target (verify optional; see the sketch below)

Unmap Source drive - Map Target drive

I understand that rebuild-in-place is still required for failed drives and "no free port" conditions.
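A sketch of what the lock, copy, and verify steps might look like from the shell, with placeholder mount points:

```
# 1. Lock the source: remount the old disk read-only so nothing can
#    change underneath the copy.
mount -o remount,ro /mnt/disk3

# 2. Copy files from Source to Target, preserving attributes.
rsync -avh /mnt/disk3/ /mnt/newdisk/

# 3. Optional verify: re-walk both trees comparing full checksums.
#    -c forces checksum comparison, -n makes it a dry run that only
#    reports any files that differ.
rsync -rcn --itemize-changes /mnt/disk3/ /mnt/newdisk/
```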

Link to comment

The latest versions of 5.0 have a "Parity is already valid." checkbox that appears after a new config has been created. It would be very helpful if there were a corresponding "Missing disk." checkbox next to each drive slot whenever "Parity is already valid." has been selected. Only a single disk could be selected as missing.

 

I'm going to double post in the rc forum as a feature request.

Link to comment

Archived

This topic is now archived and is closed to further replies.
