homejones Posted October 31, 2014

Hi,

Long-time user, currently on 5.0.5. I installed an SSD cache disk about 6 months ago and have been using Dynamix for some time now. I recently logged into the server and saw a number of messages reported by Dynamix (the messages below repeat in this order every 2 minutes):

unRAID Cache disk SMART message: 10/30/2014 12:56
Notice: Cache disk passed SMART health check
M4-CT128M4SSD2_00000000115209006A26 (sdj)

unRAID Cache disk SMART failure: 10/30/2014 12:58
Alert: Cache disk failed SMART health check
M4-CT128M4SSD2_00000000115209006A26 (sdj)

This is my log from today:

Oct 31 09:45:10 Tower last message repeated 195 times
Oct 31 09:45:11 Tower emhttp: clear: 2% complete
Oct 31 09:45:12 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:45:43 Tower last message repeated 117 times
Oct 31 09:46:44 Tower last message repeated 191 times
Oct 31 09:47:45 Tower last message repeated 189 times
Oct 31 09:47:56 Tower last message repeated 40 times
Oct 31 09:47:57 Tower emhttp: clear: 3% complete
Oct 31 09:47:58 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:48:29 Tower last message repeated 93 times
Oct 31 09:49:30 Tower last message repeated 202 times
Oct 31 09:50:31 Tower last message repeated 175 times
Oct 31 09:50:49 Tower last message repeated 63 times
Oct 31 09:50:50 Tower emhttp: clear: 4% complete
Oct 31 09:50:51 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:51:22 Tower last message repeated 92 times
Oct 31 09:52:24 Tower last message repeated 180 times
Oct 31 09:53:05 Tower last message repeated 131 times
Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: renewing lease of 10.0.1.100
Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: acknowledged 10.0.1.100 from 10.0.1.1
Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: leased 10.0.1.100 for 14400 seconds
Oct 31 09:53:06 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:53:38 Tower last message repeated 86 times
Oct 31 09:53:46 Tower last message repeated 33 times
Oct 31 09:53:47 Tower emhttp: clear: 5% complete
Oct 31 09:53:48 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:54:19 Tower last message repeated 95 times
Oct 31 09:55:20 Tower last message repeated 186 times
Oct 31 09:56:22 Tower last message repeated 186 times
Oct 31 09:56:46 Tower last message repeated 66 times
Oct 31 09:56:47 Tower emhttp: clear: 6% complete
Oct 31 09:56:47 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 09:57:18 Tower last message repeated 93 times
Oct 31 09:58:19 Tower last message repeated 179 times
Oct 31 09:59:20 Tower last message repeated 184 times
Oct 31 09:59:40 Tower last message repeated 74 times
Oct 31 09:59:41 Tower emhttp: clear: 7% complete
Oct 31 09:59:41 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 31 10:00:12 Tower last message repeated 99 times
Oct 31 10:01:13 Tower last message repeated 167 times

Is my cache drive dying? Is there anything specific I can do to fix this (aside from replacing it)?

Thank you.
WeeboTech Posted October 31, 2014

Run smartctl -a on the drive device (/dev/sdj, from what I can see in your post) and post the output here.
homejones Posted October 31, 2014

Thank you - will do and post back.

Just realized I accidentally posted this in the Unraid 6 forum - mods, please move as necessary. Thank you.
homejones Posted October 31, 2014

root@Tower:~# smartctl -a /dev/sdj
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: /8:0:0:0
Product:
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

root@Tower:~# smartctl -a /dev/sdj -T permissive
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: /8:0:0:0
Product:
scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46
>> Terminate command early due to bad response to IEC mode page

=== START OF READ SMART DATA SECTION ===
Error Counter logging not supported
Device does not support Self Test logging
homejones Posted October 31, 2014

... running two successive health tests from the command line results in:

root@Tower:~# smartctl -H /dev/sdj
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

root@Tower:~# smartctl -H /dev/sdj
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
Log Sense failed, IE page [scsi response fails sanity test]
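The two runs above differ in kind: the first reports a health status, while the second never gets a usable answer from the drive. That may be why the pass/fail notifications alternate. A minimal Python sketch for telling the two cases apart when the check is repeated is shown here. The function names and the "ok" / "unreadable" / "unknown" labels are made up for illustration, and the matched strings are only the ones that appear in the smartctl output posted in this thread, not a complete list of what smartctl can print.

```python
from collections import Counter


def classify_health_output(text):
    """Coarsely classify the text printed by `smartctl -H`.

    Only strings seen in this thread are matched (an assumption, not a
    full description of smartctl's output):
      "ok"         - the drive answered and reported healthy
      "unreadable" - smartctl could not get a sane response from the drive
      "unknown"    - anything else
    """
    if "SMART Health Status: OK" in text:
        return "ok"
    if "Log Sense failed" in text or "mandatory SMART command failed" in text:
        return "unreadable"
    return "unknown"


def tally(outputs):
    """Count how many captured outputs fall into each class."""
    return Counter(classify_health_output(o) for o in outputs)


# The two health-check results posted above, trimmed to the relevant line.
runs = [
    "=== START OF READ SMART DATA SECTION ===\nSMART Health Status: OK",
    "=== START OF READ SMART DATA SECTION ===\n"
    "Log Sense failed, IE page [scsi response fails sanity test]",
]
print(tally(runs))
```

If most of the "failed" checks turn out to be "unreadable" rather than a genuinely bad health status, that would point more toward a communication problem with the drive than toward a SMART-reported failure, though it does not rule out a failing drive.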
WeeboTech Posted October 31, 2014

I would look further back in the log to see if there were other ATA errors. Maybe there are SATA interface errors.

If you can still access the cache drive, I would back it up and/or run the mover. After that I would probably reboot the server. If the drive got into a bad state, a power cycle may reset it, or you may lose it 100%, which is why you should back it up while it's accessible.
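One way to look for ATA errors in a long syslog is to filter for kernel lines tagged with an ataN port and a suspicious keyword. The following is a rough Python sketch of that idea. The function name and the keyword list are assumptions chosen for illustration, and the regular expression assumes the usual "kernel: ataN[.NN]: message" layout of libata messages, so treat any hits as a starting point for reading the log, not as a diagnosis.

```python
import re

# Assumed layout of libata kernel messages: "kernel: ata3.00: <message>".
ATA_LINE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (.*)")

# Hypothetical keyword list; extend it to taste.
KEYWORDS = ("exception", "error", "failed", "reset", "timeout")


def find_ata_errors(lines):
    """Return (port, message) pairs for ATA kernel lines that look like errors."""
    hits = []
    for line in lines:
        match = ATA_LINE.search(line)
        if match is None:
            continue
        port, message = match.group(1), match.group(2).strip()
        if any(keyword in message.lower() for keyword in KEYWORDS):
            hits.append((port, message))
    return hits
```

To use it, pass the function an open log file, for example `find_ata_errors(open("/var/log/syslog"))`, where the path is assumed to be where the server keeps its syslog. The log excerpt posted earlier in this thread contains no ataN lines at all, so it would return an empty list.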
homejones Posted October 31, 2014

Hoping it's an I/O issue. I am swapping motherboards next weekend, so let's see what happens when I do that. Thank you!
WeeboTech Posted October 31, 2014

Do a power cycle on your server, then run smartctl -a on the drive and post the syslog. Let's see if that resets something internally.

I used to have an OCZ turbo model that would go offline like that intermittently. In addition, smartd would constantly report sectors going offline. That sure did make me nervous, as it was my VMware partition, which had an XP instance on it.