What Happened - Red Dot


smdion


Dec  9 08:34:12 Sean-unRAID kernel: BTRFS info (device loop8): The free space cache file (6471811072) is invalid. skip it
Dec  9 08:34:12 Sean-unRAID kernel: 
Dec  9 09:45:51 Sean-unRAID kernel: mdcmd (368): spindown 5
Dec  9 10:09:12 Sean-unRAID kernel: mdcmd (369): spindown 3
Dec  9 10:39:34 Sean-unRAID sshd[3304]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:34 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:34 Sean-unRAID sshd[3307]: Accepted password for root from 172.17.0.6 port 45154 ssh2
Dec  9 10:39:34 Sean-unRAID sshd[3307]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:39:38 Sean-unRAID sshd[3484]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:39 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:39 Sean-unRAID sshd[3485]: Accepted password for root from 172.17.0.6 port 45190 ssh2
Dec  9 10:39:39 Sean-unRAID sshd[3485]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:39:44 Sean-unRAID sshd[3915]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:44 Sean-unRAID sshd[3916]: Accepted password for root from 172.17.0.6 port 45258 ssh2
Dec  9 10:39:44 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:44 Sean-unRAID sshd[3916]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:39:49 Sean-unRAID sshd[4107]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:49 Sean-unRAID sshd[4108]: Accepted password for root from 172.17.0.6 port 45303 ssh2
Dec  9 10:39:49 Sean-unRAID sshd[4108]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:39:49 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:54 Sean-unRAID sshd[4272]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:54 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:54 Sean-unRAID sshd[4273]: Accepted password for root from 172.17.0.6 port 45341 ssh2
Dec  9 10:39:54 Sean-unRAID sshd[4273]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:39:59 Sean-unRAID sshd[4444]: Did not receive identification string from 172.17.0.6
Dec  9 10:39:59 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:39:59 Sean-unRAID sshd[4445]: Accepted password for root from 172.17.0.6 port 45379 ssh2
Dec  9 10:39:59 Sean-unRAID sshd[4445]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 10:40:04 Sean-unRAID sshd[4618]: Did not receive identification string from 172.17.0.6
Dec  9 10:40:04 Sean-unRAID emhttp: read_line: client closed the connection
Dec  9 10:40:04 Sean-unRAID sshd[4619]: Accepted password for root from 172.17.0.6 port 45423 ssh2
Dec  9 10:40:04 Sean-unRAID sshd[4619]: error: Received disconnect from 172.17.0.6: 10: 
Dec  9 11:04:41 Sean-unRAID kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Dec  9 11:04:41 Sean-unRAID kernel: sas: trying to find task 0xffff880038b59180
Dec  9 11:04:41 Sean-unRAID kernel: sas: sas_scsi_find_task: aborting task 0xffff880038b59180
Dec  9 11:04:41 Sean-unRAID kernel: sas: sas_scsi_find_task: task 0xffff880038b59180 is aborted
Dec  9 11:04:41 Sean-unRAID kernel: sas: sas_eh_handle_sas_errors: task 0xffff880038b59180 is aborted
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata8: end_device-1:1: cmd error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata7: end_device-1:0: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata8: end_device-1:1: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata9: end_device-1:2: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata10: end_device-1:3: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: ata8.00: failed command: SMART
Dec  9 11:04:41 Sean-unRAID kernel: ata8.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 30 pio 512 in
Dec  9 11:04:41 Sean-unRAID kernel:         res 40/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Dec  9 11:04:41 Sean-unRAID kernel: ata8.00: status: { DRDY }
Dec  9 11:04:41 Sean-unRAID kernel: ata8: hard resetting link
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata12: end_device-1:5: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata11: end_device-1:4: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata13: end_device-1:6: dev error handler
Dec  9 11:04:41 Sean-unRAID kernel: sas: ata14: end_device-1:7: dev error handler
Dec  9 11:04:42 Sean-unRAID kernel: sas: sas_form_port: phy1 belongs to port1 already(1)!
Dec  9 11:04:43 Sean-unRAID kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[1]:rc= 0
Dec  9 11:04:48 Sean-unRAID kernel: ata8.00: qc timeout (cmd 0xec)
Dec  9 11:04:48 Sean-unRAID kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec  9 11:04:48 Sean-unRAID kernel: ata8.00: revalidation failed (errno=-5)
Dec  9 11:04:48 Sean-unRAID kernel: ata8: hard resetting link
Dec  9 11:04:51 Sean-unRAID kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[1]:rc= 0
Dec  9 11:04:51 Sean-unRAID kernel: sas: sas_ata_task_done: SAS error 8a
Dec  9 11:04:51 Sean-unRAID kernel: sas: sas_ata_task_done: SAS error 8a
Dec  9 11:04:51 Sean-unRAID kernel: ata8.00: both IDENTIFYs aborted, assuming NODEV
Dec  9 11:04:51 Sean-unRAID kernel: ata8.00: revalidation failed (errno=-2)
Dec  9 11:04:51 Sean-unRAID kernel: mvsas 0000:02:00.0: Phy1 : No sig fis
Dec  9 11:04:55 Sean-unRAID kernel: sas: sas_form_port: phy1 belongs to port1 already(1)!
Dec  9 11:04:56 Sean-unRAID kernel: ata8: hard resetting link
Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: qc timeout (cmd 0x27)
Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: failed to read native max address (err_mask=0x4)
Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: HPA support seems broken, skipping HPA handling
Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: revalidation failed (errno=-5)
Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: disabled
Dec  9 11:05:01 Sean-unRAID kernel: ata8: hard resetting link
Dec  9 11:05:03 Sean-unRAID kernel: drivers/scsi/mvsas/mv_sas.c 1532:mvs_I_T_nexus_reset for device[1]:rc= 0
Dec  9 11:05:03 Sean-unRAID kernel: ata8: EH complete
Dec  9 11:05:03 Sean-unRAID kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg]  
Dec  9 11:05:03 Sean-unRAID kernel: Result: hostbyte=0x04 driverbyte=0x00
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg] CDB: 
Dec  9 11:05:03 Sean-unRAID kernel: cdb[0]=0x88: 88 00 00 00 00 01 2a 5d 51 70 00 00 00 08 00 00
Dec  9 11:05:03 Sean-unRAID kernel: end_request: I/O error, dev sdg, sector 5005726064
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg]  
Dec  9 11:05:03 Sean-unRAID kernel: Result: hostbyte=0x04 driverbyte=0x00
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg] CDB: 
Dec  9 11:05:03 Sean-unRAID kernel: cdb[0]=0x88: 88 00 00 00 00 00 00 00 d4 88 00 00 00 50 00 00
Dec  9 11:05:03 Sean-unRAID kernel: end_request: I/O error, dev sdg, sector 54408
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=5005726000
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54344
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54352
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54360
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54368
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54376
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54384
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54392
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54400
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54408
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54416
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg]  
Dec  9 11:05:03 Sean-unRAID kernel: Result: hostbyte=0x04 driverbyte=0x00
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg] CDB: 
Dec  9 11:05:03 Sean-unRAID kernel: cdb[0]=0x8a: 8a 00 00 00 00 01 2a 5d 51 70 00 00 00 08 00 00
Dec  9 11:05:03 Sean-unRAID kernel: end_request: I/O error, dev sdg, sector 5005726064
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=5005726000
Dec  9 11:05:03 Sean-unRAID kernel: md: recovery thread woken up ...
Dec  9 11:05:03 Sean-unRAID kernel: md: recovery thread has nothing to resync
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg]  
Dec  9 11:05:03 Sean-unRAID kernel: Result: hostbyte=0x04 driverbyte=0x00
Dec  9 11:05:03 Sean-unRAID kernel: sd 1:0:1:0: [sdg] CDB: 
Dec  9 11:05:03 Sean-unRAID kernel: cdb[0]=0x8a: 8a 00 00 00 00 00 00 00 d4 88 00 00 00 50 00 00
Dec  9 11:05:03 Sean-unRAID kernel: end_request: I/O error, dev sdg, sector 54408
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54344
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54352
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54360
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54368
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54376
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54384
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54392
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54400
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54408
Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54416
Dec  9 11:05:03 Sean-unRAID kernel: mvsas 0000:02:00.0: Phy1 : No sig fis
Dec  9 11:05:08 Sean-unRAID kernel: sas: sas_form_port: phy1 belongs to port1 already(1)!
Dec  9 12:14:39 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:39 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:39 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:42 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:42 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:42 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:52 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:52 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:52 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:55 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:55 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:55 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:56 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:56 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:56 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:58 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:58 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:14:58 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:15:23 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:15:23 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Dec  9 12:15:23 Sean-unRAID kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
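
For anyone trying to read the CDB lines in the log above: opcode 0x88 is READ(16) and 0x8a is WRITE(16), with the LBA in bytes 2-9 and the transfer length in bytes 10-13. Here is a rough Python sketch that decodes them. The helper name `decode_cdb16` is made up for illustration and is not part of unRAID or smartmontools.

```python
# Hypothetical helper for illustration only.
OPCODES = {0x88: "READ(16)", 0x8A: "WRITE(16)"}

def decode_cdb16(hex_bytes: str) -> dict:
    """Decode a 16-byte READ(16)/WRITE(16) CDB given as space-separated hex."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) != 16:
        raise ValueError("expected a 16-byte CDB")
    return {
        "op": OPCODES.get(raw[0], hex(raw[0])),
        "lba": int.from_bytes(raw[2:10], "big"),      # bytes 2-9, big-endian
        "blocks": int.from_bytes(raw[10:14], "big"),  # bytes 10-13, big-endian
    }

# The two distinct CDB payloads from the log, copied verbatim.
print(decode_cdb16("88 00 00 00 00 01 2a 5d 51 70 00 00 00 08 00 00"))
print(decode_cdb16("8a 00 00 00 00 00 00 00 d4 88 00 00 00 50 00 00"))
```

The decoded LBAs come out as 5005726064 and 54408, which match the sectors in the `end_request: I/O error, dev sdg` lines. The `md: disk1` sectors are 64 lower in each case, which would fit a partition starting at sector 64, but that part is a guess from the numbers rather than something the log states.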

 

Red thumbs-down on that disk for SMART. I'm 99.99% sure it's a bad drive, which is under warranty (thank god). Just looking for a second opinion.
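
If it helps with the second opinion, here is a small Python sketch for tallying the `md:` read and write errors per disk from a syslog. The function name `tally_md_errors` and the four-line sample are mine; the regex is only matched against the line format seen in the log above, so treat it as an assumption for other unRAID versions.

```python
import re
from collections import defaultdict

# Matches lines such as "md: disk1 read error, sector=54344".
MD_ERR = re.compile(r"md: (disk\d+) (read|write) error, sector=(\d+)")

def tally_md_errors(lines):
    """Return {(disk, kind): sorted sectors} for md read/write errors."""
    found = defaultdict(set)
    for line in lines:
        m = MD_ERR.search(line)
        if m:
            found[(m.group(1), m.group(2))].add(int(m.group(3)))
    return {key: sorted(val) for key, val in found.items()}

# A few lines copied from the log above; feed it open("/var/log/syslog")
# (path is an assumption) to check a whole log.
sample = [
    "Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=5005726000",
    "Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 read error, sector=54344",
    "Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=5005726000",
    "Dec  9 11:05:03 Sean-unRAID kernel: md: disk1 write error, sector=54344",
]
errors = tally_md_errors(sample)
for (disk, kind), sectors in sorted(errors.items()):
    print(disk, kind, len(sectors), "sectors")
```

In the posted log every sector that failed to read also failed to write in the same second, right after `ata8.00: disabled`. That pattern looks more like the whole device going away than like individual bad sectors, though the log alone cannot prove which.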

 

Running unRAID 6 beta12 (Beta6-12).

 

root@Sean-unRAID:~# smartctl -a -d ata /dev/sdg
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.17.4-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Input/output error

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.


I have an older motherboard, and my preferred setup is parity and cache on the motherboard with the data drives on an AOC-SAS2LP. This works under unRAID 5.x but not under 6.x, where I get failures on the data drives like the ones in your log. jphipps has seen something similar as well. Not sure if your setup has any similarities...


I too have seen that exact scenario elsewhere, in some long threads, where drives on a SAS controller were suddenly dropped and stayed unrecoverable until after a reboot. The drives were always completely fine, so it was a controller or controller driver problem, never resolved as far as I know.
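
To check a syslog for that kind of drop, something like the following Python sketch could summarize, per ATA port, which command failed, how many hard link resets followed, and whether the kernel ended up disabling the device. The function name `summarize_ata_events` is hypothetical, and the patterns are taken only from the message formats in the log posted above.

```python
import re

FAILED_CMD = re.compile(r"(ata\d+)\.\d+: failed command: (.+)$")
RESET = re.compile(r"(ata\d+): hard resetting link")
DISABLED = re.compile(r"(ata\d+)\.\d+: disabled")

def summarize_ata_events(lines):
    """Per ATA port: failed commands, hard reset count, disabled flag."""
    ports = {}

    def entry(port):
        return ports.setdefault(
            port, {"failed": [], "resets": 0, "disabled": False}
        )

    for line in lines:
        line = line.rstrip()
        if m := FAILED_CMD.search(line):
            entry(m.group(1))["failed"].append(m.group(2))
        elif m := RESET.search(line):
            entry(m.group(1))["resets"] += 1
        elif m := DISABLED.search(line):
            entry(m.group(1))["disabled"] = True
    return ports

# A subset of lines copied from the log at the top of the thread.
sample = [
    "Dec  9 11:04:41 Sean-unRAID kernel: ata8.00: failed command: SMART",
    "Dec  9 11:04:41 Sean-unRAID kernel: ata8: hard resetting link",
    "Dec  9 11:04:48 Sean-unRAID kernel: ata8: hard resetting link",
    "Dec  9 11:05:01 Sean-unRAID kernel: ata8.00: disabled",
]
summary = summarize_ata_events(sample)
print(summary)
```

A timed-out SMART command followed by resets and a disabled device, with no media errors reported beforehand, is the signature described here. It hints at the controller or its driver, but it does not rule out a failing drive on its own.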


Did you rebuild from parity on the same drive?

Yes, the first time this happened. As I experimented with it (I've done this a number of times now), I became convinced that I was seeing a driver problem and that the data drive was fine, so I switched to doing a New Config and rebuilding parity instead.
