markguy Posted May 14, 2009

So, I've got 4.5.b6 running (although it did this with 4.5.b5 and whatever the most recent version of 4.4(.2?) was). This is a new box, with four drives sitting in it (only three being used with basic version at the moment). Boots up fine, starts a parity check and then roughly 30% of the way through that, I get a weird clicking every couple of seconds, the web UI goes away and nothing short of yanking the cord will get the box to shut down. I can still telnet in, which seems fairly odd. The clicking doesn't sound like a drive failure (although I don't know what else it could be), just a click. Oh, memtest went 20 cycles without an error.

EDIT: And the drive temps, when last I saw a report on them, were all less than 35C.

Thanks in advance!

pastebin of syslog.txt... and add ~15,000 more lines of the same error codes when you get to the end of that file. Even pastebin didn't want any part of that!

EDIT: Whoops. Hardware: Supermicro C2SEE, Intel Celeron E1400 (BX80557E1400), 2x1GB Crucial memory (CT2KIT12872BA1067), COOLMAX CU-700B 700W PSU.
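A syslog with ~15,000 repeats of the same errors is easier to post and read once the duplicates are collapsed. The following is only a rough sketch of one way to do that, assuming the classic syslog prefix (`May 14 10:00:01 hostname `) in front of each message. The function name `collapse_syslog` and the sample lines are hypothetical illustrations, not taken from the poster's actual log.

```python
import re
from collections import Counter

# Assumed classic syslog prefix: "May 14 10:00:01 hostname ".
PREFIX = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+")


def collapse_syslog(lines):
    """Return (message, count) pairs in first-seen order, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        # Drop the timestamp and hostname so identical errors compare equal.
        message = PREFIX.sub("", line.rstrip("\n"))
        if message:
            counts[message] += 1
    # Counter keeps insertion order, so this lists messages as first seen.
    return list(counts.items())


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    "May 14 10:00:01 Tower kernel: ata2.00: error: { UNC }",
    "May 14 10:00:04 Tower kernel: ata2.00: error: { UNC }",
    "May 14 10:00:04 Tower kernel: ata2: EH complete",
]

for message, count in collapse_syslog(SAMPLE):
    print(f"{count:6d}x  {message}")
```

To run it against a real file, something like `collapse_syslog(open("syslog.txt"))` should work, since the function accepts any iterable of lines.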
fitbrit Posted May 14, 2009

This sounds to me like you have a write error on a non-parity disk, and so unRAID is stopping the server to prevent any irreparable damage from occurring. I was seeing the same thing recently. If I'm right, try to figure out which drive it is. It could just be the SATA/power cables, but fix the problem ASAP before this happens to you: http://lime-technology.com/forum/index.php?topic=3785.0
SSD Posted May 14, 2009

Looks to me like you have a cabling problem to the drive. I would suspect that you are losing power, but it could be either cable. Of course, the drive could be bad as well; the only way to tell is to run a smartctl report (see the troubleshooting link in my sig).

Change the data cable and connect the drive to a different power connector from the PSU. Hopefully that will take care of it. If you have a backplane in play, inspect it carefully for any loose connections.
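For anyone following along, here is a minimal sketch of pulling the attribute table from `smartctl -A` and flagging the counters that usually matter for the "cable or drive?" question. It assumes smartmontools is installed and that the attribute rows use the usual ten-column layout. The function names, the `WATCHED` set, and the sample report values are hypothetical, not from the poster's drive. As a rough rule of thumb, non-zero reallocated or pending sectors tend to point at the drive itself, while a climbing UDMA CRC error count tends to point at the data cable.

```python
import subprocess

# Attributes commonly checked when deciding between a bad drive and bad cabling.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def parse_smart_attributes(report):
    """Extract {attribute_name: raw_value} from `smartctl -A` style output."""
    attrs = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9]
    return attrs


def suspect_attributes(attrs):
    """Return the watched attributes whose raw value is a non-zero integer."""
    return {
        name: int(raw)
        for name, raw in attrs.items()
        if name in WATCHED and raw.isdigit() and int(raw) > 0
    }


def read_smart_attributes(device):
    """Run smartctl on a device such as /dev/sdb (typically needs root)."""
    # smartctl uses non-zero exit codes as status bits, so do not raise on them.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)


# Hypothetical report excerpt, for illustration only.
SAMPLE_REPORT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

print(suspect_attributes(parse_smart_attributes(SAMPLE_REPORT)))
```

On the actual server, `suspect_attributes(read_smart_attributes("/dev/sdb"))` would be the equivalent call; the parsing is kept separate so it can be tried on a saved report first.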
GoChris Posted May 14, 2009

I've had this before. It would work for a period of time, then start clicking. I narrowed it down to a power splitter. I ran a different power connection to the drive, and the problem went away.
RobJ Posted May 15, 2009

I'm leaning toward bad sectors on the drive (sdb, Samsung HD501LJ), both because of the clicking (very bad sign!) and the media errors with the UNC flag. There are suspicious elements in the error sequences (BMDMA and 'Unhandled sense code'), but the uncorrectable media errors combined with the clicking sound more like failing sectors on the drive. As Brian said, get the SMART report for a more definitive answer.

Minor point: you probably have a jumper on the Seagate 1TB drive. Check the "Remove SATA150 Jumper" section of Improving unRAID Performance.

Minor point 2: ACPI had a small issue (worked around) while starting. You might keep an eye out for a BIOS upgrade.
markguy Posted May 15, 2009

Thanks for the help, folks. I switched out the Samsung drive, checked the cabling as well as I could and things seem to be working now (parity check finished with no errors). I have the smartctl reports just as a measure to ensure things are as stable as I can make them, but will have to put those up later. I'm out the door on a series of errands to... no doubt a futile effort... prepare for the arrival of our third kid next week.

Thanks again!