SMART errors on SSD cache disk


Hi,

 

Long-time user, currently on 5.0.5. I installed an SSD cache disk about 6 months ago and have been using Dynamix for some time now. I recently logged into the server and saw a number of messages reported by Dynamix (the messages below repeat in this order every 2 minutes):

 

unRAID Cache disk SMART message: 10/30/2014 12:56

Notice: Cache disk passed SMART health check

M4-CT128M4SSD2_00000000115209006A26 (sdj)

 

unRAID Cache disk SMART failure: 10/30/2014 12:58

Alert: Cache disk failed SMART health check

M4-CT128M4SSD2_00000000115209006A26 (sdj)

 

 

This is my log from today:

 

Oct 31 09:45:10 Tower last message repeated 195 times

Oct 31 09:45:11 Tower emhttp: clear: 2% complete

Oct 31 09:45:12 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:45:43 Tower last message repeated 117 times

Oct 31 09:46:44 Tower last message repeated 191 times

Oct 31 09:47:45 Tower last message repeated 189 times

Oct 31 09:47:56 Tower last message repeated 40 times

Oct 31 09:47:57 Tower emhttp: clear: 3% complete

Oct 31 09:47:58 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:48:29 Tower last message repeated 93 times

Oct 31 09:49:30 Tower last message repeated 202 times

Oct 31 09:50:31 Tower last message repeated 175 times

Oct 31 09:50:49 Tower last message repeated 63 times

Oct 31 09:50:50 Tower emhttp: clear: 4% complete

Oct 31 09:50:51 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:51:22 Tower last message repeated 92 times

Oct 31 09:52:24 Tower last message repeated 180 times

Oct 31 09:53:05 Tower last message repeated 131 times

Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: renewing lease of 10.0.1.100

Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: acknowledged 10.0.1.100 from 10.0.1.1

Oct 31 09:53:05 Tower dhcpcd[1061]: eth0: leased 10.0.1.100 for 14400 seconds

Oct 31 09:53:06 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:53:38 Tower last message repeated 86 times

Oct 31 09:53:46 Tower last message repeated 33 times

Oct 31 09:53:47 Tower emhttp: clear: 5% complete

Oct 31 09:53:48 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:54:19 Tower last message repeated 95 times

Oct 31 09:55:20 Tower last message repeated 186 times

Oct 31 09:56:22 Tower last message repeated 186 times

Oct 31 09:56:46 Tower last message repeated 66 times

Oct 31 09:56:47 Tower emhttp: clear: 6% complete

Oct 31 09:56:47 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 09:57:18 Tower last message repeated 93 times

Oct 31 09:58:19 Tower last message repeated 179 times

Oct 31 09:59:20 Tower last message repeated 184 times

Oct 31 09:59:40 Tower last message repeated 74 times

Oct 31 09:59:41 Tower emhttp: clear: 7% complete

Oct 31 09:59:41 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

Oct 31 10:00:12 Tower last message repeated 99 times

Oct 31 10:01:13 Tower last message repeated 167 times
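
For a rough sense of how often smartctl is being called, the "last message repeated N times" lines in a log like the one above can be added back onto the message they refer to. This is a minimal Python sketch, assuming the syslog layout shown above and the host name Tower. count_messages is a hypothetical helper name, not part of unRAID or Dynamix, and Python is assumed to be available on whatever machine reads the log.

```python
import re
from collections import Counter

# Matches syslog's repeat-compression line, e.g. "last message repeated 117 times".
REPEAT = re.compile(r"last message repeated (\d+) times")


def count_messages(lines, host="Tower"):
    """Tally messages, crediting each repeat count to the preceding message."""
    counts = Counter()
    previous = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop the "Mon DD HH:MM:SS host " prefix by splitting on the host name.
        _, _, message = line.partition(f" {host} ")
        if not message:
            continue
        match = REPEAT.fullmatch(message)
        if match:
            # A repeat line with no earlier message (start of the excerpt) is skipped.
            if previous is not None:
                counts[previous] += int(match.group(1))
        else:
            counts[message] += 1
            previous = message
    return counts


sample = [
    "Oct 31 09:45:11 Tower emhttp: clear: 2% complete",
    "Oct 31 09:45:12 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO",
    "Oct 31 09:45:43 Tower last message repeated 117 times",
    "Oct 31 09:46:44 Tower last message repeated 191 times",
]
for message, total in count_messages(sample).most_common():
    print(total, message)
```

Dividing the total for the smartctl line by the time span covered gives an approximate call rate, which may help tell a polling loop apart from a one-off check.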

 

Is my cache drive dying? Is there anything specific I can do to fix this (aside from replacing it)? Thank you.

Link to comment

root@Tower:~# smartctl -a /dev/sdj

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)

Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF INFORMATION SECTION ===

Vendor:              /8:0:0:0

Product:

>> Terminate command early due to bad response to IEC mode page

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

 

root@Tower:~# smartctl -a /dev/sdj -T permissive

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)

Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF INFORMATION SECTION ===

Vendor:              /8:0:0:0

Product:

scsiModePageOffset: response length too short, resp_len=47 offset=50 bd_len=46

>> Terminate command early due to bad response to IEC mode page

 

=== START OF READ SMART DATA SECTION ===

 

Error Counter logging not supported

 

Device does not support Self Test logging

Link to comment

... running two successive health tests from the command line results in:

 

root@Tower:~# smartctl -H /dev/sdj

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)

Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF READ SMART DATA SECTION ===

SMART Health Status: OK

 

root@Tower:~# smartctl -H /dev/sdj

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)

Copyright © 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

 

=== START OF READ SMART DATA SECTION ===

Log Sense failed, IE page [scsi response fails sanity test]
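
Since the same command gives different answers on consecutive runs, repeating it and tallying the outcomes may show how often the drive fails to respond. This is a sketch, assuming Python 3.7+ is available (it is not part of a stock unRAID 5 install), that smartctl is on the PATH, and that the output strings match the ones quoted above. classify_health and poll_health are hypothetical helper names.

```python
import subprocess
import time
from collections import Counter


def classify_health(output):
    """Map smartctl -H output to a coarse label, using the strings quoted above."""
    if "SMART Health Status: OK" in output:
        return "ok"
    if "Log Sense failed" in output:
        return "unreadable"
    return "other"


def poll_health(device, runs=10, delay=2.0):
    """Run smartctl -H several times and count each kind of result."""
    tally = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            ["smartctl", "-H", device],
            capture_output=True,
            text=True,
        )
        tally[classify_health(proc.stdout)] += 1
        time.sleep(delay)
    return tally


# Example call (needs root and a real device, so it is left commented out):
# print(poll_health("/dev/sdj"))
```

A mix of "ok" and "unreadable" results would be consistent with the alternating pass/fail notifications, and points toward the drive or its link intermittently not answering rather than a steady health failure.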

 

Link to comment

I would look further in the log to see whether there are other ATA errors. There may be SATA interface errors.

If you can still access the cache drive, I would back it up and/or run the mover.

 

After that I would probably reboot the server. If the drive got into a bad state, a power cycle may reset it, or you may lose it 100%.

Which is why you should back it up if it's accessible.

Link to comment

Do a power cycle on your server, then run smartctl -a on the drive. Post the syslog; let's see if that resets something internally.

 

 

I used to have an OCZ turbo model that would go offline like that intermittently.

In addition, smartd would constantly report sectors going offline. It sure did make me nervous, as that was my VMware partition, which had an XP instance on it.

Link to comment
