Unraid has written off a HDD within 24 hours of use



Hi folks,

 

I'm really hoping I am wrong, but it looks like Unraid has managed to destroy one of my array disks within about the first 24 hours of (fairly sedate) use.  Any help identifying the problem would be much appreciated.

 

I had just finished configuring a 3-drive array (3x 3TB WD Red, plus an older HDD pulled from a desktop for cache).  The initial parity sync was underway but I don't think it had completed, and I was working on getting Plex etc. installed and adding content.  All of a sudden I started getting heat warnings from both the cache and one of the two non-parity disks in the array, then I got an error alarm on that disk, and it now shows up as unmountable.  The warnings were as follows:

 

Event: unRAID Disk 1 temperature
Subject: Warning - Disk 1 is hot (45 C)

Event: unRAID Disk 1 error
Subject: Alert - Disk 1 in error state (disk dsbl)

Event: unRAID array errors
Subject: Warning - array has errors

Event: unRAID Parity sync:
Subject: Notice - Parity sync: finished (338 errors)

 

I have run a SMART diagnostic on the drive and it appears to be dead - namely:

Self-test execution status: 121 The previous self-test completed having the read element of the test failed.
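For anyone reading that status value: as far as I understand the ATA self-test status byte, the high four bits are the result code and the low four bits are the portion of the test that was still remaining, in tens of percent. The sketch below decodes 121 on that assumption. The function name and the wording of the status table are my own, not smartctl's.

```python
# Result codes carried in the high nibble of the ATA self-test
# execution status byte (wording paraphrased, not smartctl's exact text).
SELF_TEST_STATUS = {
    0: "completed without error, or no test has been run",
    1: "aborted by host",
    2: "interrupted by host reset",
    3: "fatal or unknown error, test could not complete",
    4: "completed, unknown element failed",
    5: "completed, electrical element failed",
    6: "completed, servo/seek element failed",
    7: "completed, read element failed",
    8: "completed, handling damage suspected",
    15: "self-test in progress",
}


def decode_self_test_status(value):
    """Split the status byte into (code, percent_remaining, meaning)."""
    code = (value >> 4) & 0xF          # high nibble: result code
    remaining = (value & 0xF) * 10     # low nibble: tens of percent left
    return code, remaining, SELF_TEST_STATUS.get(code, "reserved")


code, remaining, meaning = decode_self_test_status(121)
print(f"code {code}, {remaining}% remaining: {meaning}")
```

Read that way, 121 (0x79) is a read failure that stopped the test with roughly 90% of it still to go, which matches the message smartctl printed.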

 

The parity sync hadn't caught up with what I had done so I think I have lost my data, and apparently written off a fairly expensive drive under pretty minimal loading, so you will understand I'm a bit annoyed.  Have I indeed written off the drive, and if so how on Earth do I prevent it from happening again?  I'm amazed I have any cooling issues as the drives are mounted in a 2U server case that has a whole bank of fans for airflow - and surely Unraid is smarter than to keep pushing data down a drive that is reporting high temperatures?

 

As I say, hoping I am wrong - any advice would be very welcome!!

 

Guy

Link to comment

Running preclear on new disks before adding them to the array has multiple benefits. One of them is avoiding problems like yours: it stresses marginal drives into failing before you trust them with data. The other great thing is that array expansion later on will be more or less instantaneous with a precleared drive.

 

You appear to be under the impression that the system was under relatively light load. I disagree. If parity was being built, all drives were in use at the same time: the data drives were being read constantly, as fast as possible, and the parity drive was being written at the same rate. On top of this, you added content to Plex. I would assume this means copying files to the array and then having Plex scan the files and generate index files in the background. Aside from stressing the I/O system even more during parity generation, the Plex scan can also push the CPU to 100% use. Maxing out both the hard drives and the CPU would certainly explain rising temperatures…

 

That said, this is not a problem for a healthy drive in a decently cooled system.

 

If your components are overheating, you have a hardware problem of some kind.  Since both your cache drive and one of the data drives had temperature warnings, I would first check that the fans are fully functional and that nothing is obstructing the airflow. What was the ambient temperature? Were the drives with temperature warnings located next to each other?
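If you want to keep an eye on temperatures yourself while you test the airflow, here is a rough sketch that pulls the temperature out of the attribute table printed by `smartctl -A /dev/sdX`. It assumes the drive reports attribute 194 (Temperature_Celsius) with the temperature as the first number in the RAW_VALUE column, which is common but not universal. The function names are mine, the 45 C threshold is just the warning level from your notification, and the sample text is made up for illustration, not taken from your drive.

```python
WARN_C = 45  # warning level quoted in the Disk 1 notification


def drive_temperature(smartctl_output):
    """Return the raw value of attribute 194 as an int, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; RAW_VALUE is the last one and
        # may carry extras such as "(Min/Max 20/50)" after the number.
        if len(fields) >= 10 and fields[0] == "194":
            return int(fields[9])
    return None


def is_hot(smartctl_output, limit=WARN_C):
    """True when a temperature was found and it is at or above the limit."""
    temp = drive_temperature(smartctl_output)
    return temp is not None and temp >= limit


# Illustrative sample only (invented values in smartctl's table layout).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       24
194 Temperature_Celsius     0x0022   105   098   000    Old_age   Always       -       45 (Min/Max 20/46)
"""

print(drive_temperature(SAMPLE), is_hot(SAMPLE))
```

Comparing the readings for each drive bay at idle and under load should show fairly quickly whether the two hot drives share a dead spot in the airflow.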

Link to comment

Also, while I don't like my drives to get to 45, that default setting for the temp warning gives you a pretty safe margin.

 

Most likely infant mortality like I said. If you haven't already learned about preclear, see search tips in my sig.

 

If you didn't preclear, I would suggest starting over: preclear all your drives and return any that don't pass. It is very important that all your drives be trustworthy, since every bit of all the others will be required to rebuild if one of them fails. Then take things a little more slowly (preclear will give you a lesson in patience).
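To show why every bit on the other drives matters, here is a toy model of single parity as a plain XOR across the data disks, which is how single-parity schemes of this kind are usually described. It is a simplified illustration under that assumption, not Unraid's actual code, and the disk contents are made-up bytes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two tiny "data disks" with invented contents.
disk1 = b"\x10\x20\x30"
disk2 = b"\x0f\xf0\xaa"

# Building parity reads every data disk in full.
parity = xor_blocks([disk1, disk2])

# If disk1 dies, it is rebuilt from ALL surviving disks plus parity.
rebuilt = xor_blocks([disk2, parity])
print(rebuilt == disk1)

# One bad byte on a surviving disk silently corrupts the rebuild.
flaky_disk2 = b"\x0f\xf1\xaa"
bad_rebuild = xor_blocks([flaky_disk2, parity])
print(bad_rebuild == disk1)
```

The second case is the one to worry about: a marginal drive that was never tested can spoil the rebuild of a different drive.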
