hwilker

Getting a FAILED health report while rebuilding array

I've been running an array of 12 disks, with a cache drive and a hot spare, for several years now with no complaints and great ease.

 

The box is a Norco 4224 with IBM SAS cards and native SATA ports on an ASRock Z77 Extreme board, with an Intel Celeron G1610 and 4GB of memory. It has been running since about 2013 with virtually no incidents.

 

Tonight I wanted to replace an old 2TB drive with a new 8TB drive. I had already upgraded my parity drive to 8TB successfully some six months ago.

 

I took the array offline, removed the old disk, installed the new, larger disk in the same physical slot the old one had been in (just for a sense of safety), and restarted the array. Everything was going according to plan until I got a notice that the rebuilding disk was 'warm' at 45 degrees C. Several minutes later I got the same notice, but it now read 46 degrees.

 

I took the top off the case and discovered that the fan closest to the drive being rebuilt had stopped. I removed the lid completely, pointed a desk fan at the problem disk, and waited for the temperature to go down. It did about 20 minutes later, and the disk is now running at a comfortable 41 degrees.

 

But after about another 3 hours I got a popup fail message, which read "Notice [TOWER] - array health report [FAIL]. Array has 14 disks (including parity and cache)."

 

I looked in the log but couldn't see anything untoward. The array was still rebuilding, so I let it continue to see what would happen. Now, about another 1-1.5 hours later, it seems to be humming along. My plan, such as it is, is to let it finish and then run a parity check (assuming nothing further happens). I also purposely didn't repurpose the old disk that I was replacing, and since I didn't add any new content to the unRAID system during this process, I hope that if something is wrong I can simply put the old disk back in its slot temporarily until I know what's wrong. I've attached both the syslog from right after I noticed the problem, though my eye sees nothing wrong with it, and the diagnostics file requested by the how-to post. I also captured an image of the toast notification that warned me of the fail.
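
For anyone who would rather not eyeball a syslog line by line, here is a minimal Python sketch that flags lines containing a few suspect words. The keyword list and the two sample lines are made up for illustration; they are not an official unRAID pattern set and are not taken from the attached syslog.

```python
import re

# Hypothetical keyword list (an assumption, not an official unRAID pattern set).
SUSPECT = re.compile(r"\b(error|fail|failed|failure|timeout|reset)\b", re.IGNORECASE)


def suspicious_lines(syslog_text):
    """Return (line_number, line) pairs for lines that contain a suspect keyword."""
    hits = []
    for number, line in enumerate(syslog_text.splitlines(), start=1):
        if SUSPECT.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = "\n".join([
        "Aug 10 00:01:02 Tower kernel: example line with nothing wrong",
        "Aug 10 00:01:03 Tower kernel: example read error on a made-up device",
    ])
    for number, line in suspicious_lines(sample):
        print(number, line)
```

A keyword scan like this only narrows down where to look; a clean result does not prove the log is free of problems.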

 

What I would like to know is:

1. Is there any point in letting this process complete? If not, should I set the new hard drive aside and run a new preclear on it before trying to use it again? (I ran two preclear cycles before starting this process.)

 

2. If not, what should I do?

 

3. What do more trained eyes than mine glean from the log and diagnostics report?

 

I'm anxious, of course, to get my array back up and running, but I'm hoping that being deliberate will keep me from doing something rash. Hence letting the rebuild complete and keeping the old disk intact and untouched so it can be put back in the array.

 

Any help or guidance would be much appreciated.

 

 

hwilker

tower-diagnostics-20180810-0057.zip

tower-syslog-20180810-0033.zip

healthreportfail.JPG


A health report FAIL is normal during a rebuild; the array will return to healthy when it finishes.


Thx. Hope you're right. Would you do a parity check afterwards to be certain?

6 minutes ago, hwilker said:

Would you do a parity check afterwards to be certain?

I usually don't. If there are no errors during the rebuild it should be fine, but it's always good practice.

6 hours ago, johnnie.black said:

I usually don't. If there are no errors during the rebuild it should be fine, but it's always good practice.

"no errors during the rebuild"

 

If that FAIL in the red array health report popup isn't an error, what is it?


What's failing is the array health check during the rebuild, because one disk is still invalid. As soon as the rebuild finishes without errors the array turns healthy, as the next report you get will say.
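
As a rough illustration of that logic, here is a minimal Python sketch. It is not unRAID's actual code; the disk names, the state strings, and the `array_health` function are all made up for the example.

```python
def array_health(disk_states):
    """Return "PASS" only if every disk is valid, otherwise "FAIL".

    disk_states maps a hypothetical disk name to "valid" or "invalid".
    """
    if all(state == "valid" for state in disk_states.values()):
        return "PASS"
    return "FAIL"


# While disk2 is being rebuilt it still counts as invalid, so the report fails.
during_rebuild = {"parity": "valid", "disk1": "valid", "disk2": "invalid"}

# Once the rebuild finishes without errors, disk2 becomes valid again.
after_rebuild = dict(during_rebuild, disk2="valid")

if __name__ == "__main__":
    print(array_health(during_rebuild))
    print(array_health(after_rebuild))
```

The point is only that a single invalid disk is enough to fail the report, so a FAIL in the middle of a rebuild does not by itself mean the rebuild hit an error.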


Copyright © 2005-2018 Lime Technology, Inc.
unRAID® is a registered trademark of Lime Technology, Inc.