several dockers just stopped when parity check started this morning



I woke up and checked the bandwidth report for my server overnight (I'm on a capped plan with unlimited data from midnight to 5am) and saw that it suddenly dropped to nothing at exactly 2am.

 

I checked and saw that I still have internet, so that wasn't the problem.  I then checked and found that SABnzbd and deluge were no longer responding.

 

I checked and saw that they (and OwnCloud and CouchPotato and SickRage) had all stopped.  I tried restarting them, but they just won't start now.  If I hit the start icon, I see the system attempt to start it (the IP loading banner shows for a bit), then it goes back to just being stopped.  The parity check is still running.

 

I guess I'll update to 6.1 today, but I've never seen this behavior before and thought I should report it.

 

The diagnostics file is too large to post (which is also weird).  The docker log has the attached output from when I tried to start the dockers, and the last lines repeat hundreds (thousands?) of times to the end of the log.

 

It seems my cache is no longer recognized, starting at exactly 2am, when the parity check started.

 

Is it really a cache disk problem or a system issue?

docker.zip

syslog.1.txt.zip


I woke up and checked the bandwidth report for my server overnight (I'm on a capped plan with unlimited data from midnight to 5am) and saw that it suddenly dropped to nothing at exactly 2am.

 

Sorry, I don't have anything to help fix your docker issue, but with regards to this ^ I am on a similar plan. A tip, if you haven't done it already: set the deluge scheduler to paused outside of your free hours. My free hours are 2am-8am, so I only allow deluge to run during those times. It's worked great! See the screenshot of the scheduler attached.

scheduler.png
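If you'd rather not rely on deluge's own scheduler (or want a backup for it), one rough alternative is to stop and start the whole container from cron on the unRAID host. This is only a sketch -- the container name "deluge" and the 2am/8am hours are assumptions to match the example above, and you'd need to persist the entries however your setup keeps cron across reboots:

# start deluge at the beginning of the free window, stop it at the end
0 2 * * * docker start deluge
0 8 * * * docker stop deluge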


Your dockers won't start because your cache drive was dropped from service:

 

Sep  1 01:31:10 media logger: mover finished
Sep  1 02:00:01 media kernel: mdcmd (157): check 
Sep  1 02:00:01 media kernel: md: recovery thread woken up ...
Sep  1 02:00:01 media kernel: md: recovery thread checking parity...
Sep  1 02:00:01 media kernel: md: using 2048k window, over a total of 3907018532 blocks.
Sep  1 02:03:38 media kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep  1 02:03:38 media kernel: ata8.00: failed command: IDENTIFY DEVICE
Sep  1 02:03:38 media kernel: ata8.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 5 pio 512 in
Sep  1 02:03:38 media kernel:         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep  1 02:03:38 media kernel: ata8.00: status: { DRDY }
Sep  1 02:03:38 media kernel: ata8: hard resetting link
Sep  1 02:03:38 media kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep  1 02:03:43 media kernel: ata8.00: qc timeout (cmd 0xec)
Sep  1 02:03:43 media kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Sep  1 02:03:43 media kernel: ata8.00: revalidation failed (errno=-5)
Sep  1 02:03:43 media kernel: ata8: hard resetting link
Sep  1 02:03:44 media kernel: ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Sep  1 02:03:54 media kernel: ata8.00: qc timeout (cmd 0xec)
Sep  1 02:03:54 media kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Sep  1 02:03:54 media kernel: ata8.00: revalidation failed (errno=-5)
Sep  1 02:03:54 media kernel: ata8: limiting SATA link speed to 1.5 Gbps
Sep  1 02:03:54 media kernel: ata8: hard resetting link
Sep  1 02:03:54 media kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep  1 02:04:24 media kernel: ata8.00: qc timeout (cmd 0xec)
Sep  1 02:04:24 media kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Sep  1 02:04:24 media kernel: ata8.00: revalidation failed (errno=-5)
Sep  1 02:04:24 media kernel: ata8.00: disabled
Sep  1 02:04:24 media kernel: ata8: hard resetting link
Sep  1 02:04:25 media kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep  1 02:04:25 media kernel: ata8: EH complete
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#9 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#9 CDB: opcode=0x28 28 00 57 9f 85 5f 00 00 10 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 1470072159
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 977145920
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#11 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: XFS (sdi1): metadata I/O error: block 0x3a3e1001 ("xlog_iodone") error 5 numblks 64
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#11 CDB: opcode=0x28 28 00 3c 33 e9 57 00 00 08 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 1010035031
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#13 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#13 CDB: opcode=0x2a 2a 00 1d 1c 20 0f 00 00 08 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 488382479
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#15 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#15 CDB: opcode=0x2a 2a 00 1d 1c 4d 8f 00 00 10 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 488394127
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#17 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#17 CDB: opcode=0x2a 2a 00 1d 21 40 0f 00 00 08 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 488718351
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#19 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#19 CDB: opcode=0x2a 2a 00 1f 5b 82 ff 00 00 10 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 526091007
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#21 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#21 CDB: opcode=0x2a 2a 00 57 54 52 60 00 00 01 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 1465143904
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#23 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#23 CDB: opcode=0x2a 2a 00 57 54 71 87 00 00 08 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 1465151879
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#25 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#25 CDB: opcode=0x2a 2a 00 57 5a da 67 00 00 08 00
Sep  1 02:04:25 media kernel: blk_update_request: I/O error, dev sdi, sector 1465571943
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#27 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
Sep  1 02:04:25 media kernel: sd 8:0:0:0: [sdi] tag#27 CDB: opcode=0x28 28 00 00 10 61 47 00 00 08 00
Sep  1 02:04:25 media kernel: XFS (sdi1): metadata I/O error: block 0x1d1c1fd0 ("xfs_buf_iodone_callbacks") error 5 numblks 8
Sep  1 02:04:25 media kernel: XFS (sdi1): metadata I/O error: block 0x7d70 ("xfs_trans_read_buf_map") error 5 numblks 8
Sep  1 02:04:25 media kernel: XFS (sdi1): page discard on page ffffea0008928c00, inode 0xc0a7bb88, offset 0.

 

My guess is a reboot might bring it back online, but who knows for how long.
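If it does come back after a reboot, it would be worth pulling the drive's SMART report straight away to see whether the disk itself logged anything around 2am. A quick sketch, assuming the cache is still /dev/sdi:

smartctl -a /dev/sdi          # full SMART attributes plus the device's error log
smartctl -l error /dev/sdi    # just the ATA error log entries

If the SMART error log is clean, that leans more toward the cable, the port, or the controller than the drive itself.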


I started another parity check last night, and SABnzbd died and cannot be restarted.  Here are the last few lines from the syslog...

 

Sep 3 09:25:16 media emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 1202400
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147113, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 3299552
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147114, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 1202400
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147115, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 3299552
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147116, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 1202400
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147117, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 3299552
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147118, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 1202400
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147119, flush 1, corrupt 0, gen 0
Sep 3 09:25:33 media kernel: blk_update_request: I/O error, dev loop0, sector 3299552
Sep 3 09:25:33 media kernel: BTRFS: bdev /dev/loop0 errs: wr 7, rd 147120, flush 1, corrupt 0, gen 0
Sep 3 09:25:34 media kernel: XFS (sdi1): xfs_log_force: error -5 returned.
Sep 3 09:25:36 media kernel: XFS (sdj1): xfs_log_force: error -5 returned.
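For reference, a few quick checks would show whether the same two devices dropped again (device names and paths are assumed from the earlier log and a stock unRAID layout; adjust to your setup):

cat /proc/partitions | grep -E 'sdi|sdj'    # are the cache and SSD still visible to the kernel?
losetup -a                                  # which file is backing /dev/loop0 (normally the docker.img)
btrfs device stats /var/lib/docker          # cumulative btrfs error counters for the docker image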


I'm just trying to test the power supply theory: that when all the drives spin up at the start of a parity check, the voltage dips too much and causes your cache drive to drop offline.

 

If the parity check started and you had no issues for, say, 15-30 minutes after it had begun, I would rule this theory out.
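One quick way to read that timing straight out of the log (assuming the usual /var/log/syslog location):

grep -E 'mdcmd \([0-9]+\): check|ata8' /var/log/syslog | head -n 20

That puts the parity-check start line and the first ata8 timeouts side by side, so the gap between spin-up and the drive dropping is easy to see.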


Good idea.  I hope you're wrong, since I've already replaced that power supply at least once.

 

What's weird is that this equipment (other than the new video card) has been the same through many parity checks.  I suppose the new video card could be pushing the load over the edge, but that would make me very sad.

 

Hopefully something can be done to stagger the power load to prevent this, if that is actually what's going on.

  • 4 months later...

Another month, another dying server when I woke up this morning.  After about 6 hours, one of my several user shares was unavailable, and my unassigned drive was 'blank' and barely recognized.  I could still PuTTY in, and could still access most of my user shares.  History tells me they would all have fallen offline eventually and, like last time, I'd have had to hard boot the server to restore usability, which would just force another parity check, so I cut my losses, stopped the parity check, and rebooted.

 

This has been happening for many months, and I'm not getting any real support from LimeTech to try to resolve it.  I know there is a lot of other stuff needing attention, and the fact that others aren't reporting this puts it at the bottom of the priority list, but I have not successfully completed a parity check in almost 6 months.

 

What can I do to help get to the bottom of this issue and resolve it?

media-diagnostics-20160201-0839.zip


Yes, it's more powerful than the old one.  I could try removing both video cards and trying again, but if the diagnostics don't point to this being the issue, then it seems a bit of a waste.  Unfortunately, I can't read the diagnostics well enough to point to a cause.  I do know from looking at it this morning that it took 20 minutes after the parity check started for the next entry in the log, which tells me it's not a power issue: all disks should have spun up when the check started, and I would expect a power issue to show up while (or shortly after) that power spike happened.

 

Feb  1 02:00:01 media kernel: mdcmd (57): check 
Feb  1 02:00:01 media kernel: md: recovery thread woken up ...
Feb  1 02:00:01 media kernel: md: recovery thread checking parity...
Feb  1 02:00:01 media kernel: md: using 2048k window, over a total of 3907018532 blocks.
Feb  1 02:20:27 media kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb  1 02:20:27 media kernel: ata8.00: failed command: SMART
Feb  1 02:20:27 media kernel: ata8.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 30 pio 512 in
Feb  1 02:20:27 media kernel:         res 40/00:00:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout)
Feb  1 02:20:27 media kernel: ata8.00: status: { DRDY }
Feb  1 02:20:27 media kernel: ata8: hard resetting link
Feb  1 02:20:27 media kernel: ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb  1 02:20:32 media kernel: ata8.00: qc timeout (cmd 0xec)
Feb  1 02:20:32 media kernel: ata8.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Feb  1 02:20:32 media kernel: ata8.00: revalidation failed (errno=-5)
Feb  1 02:20:32 media kernel: ata8: hard resetting link

 

It appears ata8.00 is my (brand new) cache disk.  I replaced it and the unassigned SSD a week or so ago.  Both of these drives are still connected to the same connections they were on previously.  I know the SSD is connected to my SATA card; I'm not positive whether the cache is connected to that same card or not.  I'd have to climb up into the attic and take the server apart to check.
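For reference, the port and controller mapping can usually be read from software without opening the case. A rough sketch (sysfs layouts vary a little between kernels, so treat it as approximate):

# map each sdX device to its ataN host
for d in /sys/block/sd*; do
  echo "$(basename "$d") -> $(readlink -f "$d" | grep -Eo 'ata[0-9]+' | head -n 1)"
done

# map each device to the PCI address of its controller
ls -l /dev/disk/by-path/ | grep -v part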


I don't have answers, but here's what I see. 

 

  6 onboard ports - 6 array data drives

  1st AsMedia card (2 ports) - 2 array data drives

  2nd AsMedia card (2 ports) - 2 array data drives

  Marvell 9215 card (4 ports) - parity drive (sdh, ata7, 1st port), Seagate 250GB (Cache drive) (sdi, ata8, 2nd port), Kingston 128GB (has Docker image) (sdj, ata9, 3rd port), empty port (ata10, 4th port)

 

20 minutes after Parity check started, Cache drive dropped out, unresponsive, and was quickly disabled by kernel.  You can ignore all subsequent errors concerning ata8 or sdi.  About a half minute later, the Kingston was unresponsive too, and was quickly disabled.  So you can ignore all errors involving ata9 and sdj from then on.  But the docker.img file was on it, so btrfs throws a bunch of errors and quits too, closing down all docker operations.

 

As best as I can tell, the Marvell controller failed when under high load to the parity drive on the first port, and dropped the other 2 drives.  It did NOT drop the parity drive though, so the parity check continued on without issue.  But the btrfs crash caused a Call Trace, which may have corrupted system operations, and which *may* be the cause of the subsequent issues you see later.

 

I think (just my opinion) that replacing the Marvell controller may be your best bet for success.  You don't have any empty ports off that controller, or I would test moving the parity drive to one.
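If it helps, the controllers themselves can be listed before swapping anything, just to confirm which card is which (output will obviously differ per board, but the Marvell and ASMedia cards should show up by name):

lspci -nn | grep -iE 'sata|ahci|marvell|asmedia'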


I don't have answers, but here's what I see. 

 

  6 onboard ports - 6 array data drives

  1st AsMedia card (2 ports) - 2 array data drives

  2nd AsMedia card (2 ports) - 2 array data drives

  Marvell 9215 card (4 ports) - parity drive (sdh, ata7, 1st port), Seagate 250GB (Cache drive) (sdi, ata8, 2nd port), Kingston 128GB (has Docker image) (sdj, ata9, 3rd port), empty port (ata10, 4th port)

 

20 minutes after Parity check started, Cache drive dropped out, unresponsive, and was quickly disabled by kernel.  You can ignore all subsequent errors concerning ata8 or sdi.  About a half minute later, the Kingston was unresponsive too, and was quickly disabled.  So you can ignore all errors involving ata9 and sdj from then on.  But the docker.img file was on it, so btrfs throws a bunch of errors and quits too, closing down all docker operations.

 

As best as I can tell, the Marvell controller failed when under high load to the parity drive on the first port, and dropped the other 2 drives.  It did NOT drop the parity drive though, so the parity check continued on without issue.  But the btrfs crash caused a Call Trace, which may have corrupted system operations, and which *may* be the cause of the subsequent issues you see later.

 

I think (just my opinion) that replacing the Marvell controller may be your best bet for success.  You don't have any empty ports off that controller, or I would test moving the parity drive to one.

That sounds like a good analysis and would explain what happened.

 

I'm sad to think my relatively new SATA card is bad, but it wouldn't surprise me.

 

Any suggestions for a good replacement?

 

Alternatively, what about moving the parity drive to a different controller, since it being pegged seems to be the cause of the failures of the other 2 drives?  Maybe swap it for my smallest array disk to see if that prevents this issue?


Another month, another dying server when I woke up this morning...

 

This has been happening for many months, and I'm not getting any real support from LimeTech to try to resolve it.  I know there is a lot of other stuff needing attention, and the fact that others aren't reporting this puts it at the bottom of the priority list, but I have not successfully completed a parity check in almost 6 months.

 

What can I do to help get to the bottom of this issue and resolve it?

 

Justin,

 

The bottom line is that we haven't responded because we cannot recreate your issue.  I truly believe you're dealing with issues with your hardware (storage controllers, cables, devices, and/or motherboard).  I know that isn't the answer you want to hear, but that's what seems to be indicated in the logs.  We're not seeing anything in your logs to point to unRAID itself.

 

- Jon


Justin,

 

The bottom line is that we haven't responded because we cannot recreate your issue.  I truly believe you're dealing with issues with your hardware (storage controllers, cables, devices, and/or motherboard).  I know that isn't the answer you want to hear, but that's what seems to be indicated in the logs.  We're not seeing anything in your logs to point to unRAID itself.

 

- Jon

 

Thanks for the update, Jon, I appreciate it.  I fully understand that this appears to be a problem unique to me, but I am nowhere near a Linux expert and am not skilled at reading the log files, so I honestly have NO IDEA what might be going on.  I'd hate to think the solution to my problem is to replace my hardware (storage controllers, cables, devices, power supply, motherboard, etc).  That's a bit like a car mechanic who just starts replacing parts to fix an issue they don't understand.

 

I was hoping to just get some feedback on what the logs pointed at, and perhaps some suggestions on what I can do to help pinpoint the cause and get to a solution.  It's when I get no response at all that I get frustrated, since all I know is that my system doesn't work as expected for some unknown reason.

 

With all that said, this seems like a solid analysis to me...

 

I don't have answers, but here's what I see. 

 

  6 onboard ports - 6 array data drives

  1st AsMedia card (2 ports) - 2 array data drives

  2nd AsMedia card (2 ports) - 2 array data drives

  Marvell 9215 card (4 ports) - parity drive (sdh, ata7, 1st port), Seagate 250GB (Cache drive) (sdi, ata8, 2nd port), Kingston 128GB (has Docker image) (sdj, ata9, 3rd port), empty port (ata10, 4th port)

 

20 minutes after Parity check started, Cache drive dropped out, unresponsive, and was quickly disabled by kernel.  You can ignore all subsequent errors concerning ata8 or sdi.  About a half minute later, the Kingston was unresponsive too, and was quickly disabled.  So you can ignore all errors involving ata9 and sdj from then on.  But the docker.img file was on it, so btrfs throws a bunch of errors and quits too, closing down all docker operations.

 

As best as I can tell, the Marvell controller failed when under high load to the parity drive on the first port, and dropped the other 2 drives.  It did NOT drop the parity drive though, so the parity check continued on without issue.  But the btrfs crash caused a Call Trace, which may have corrupted system operations, and which *may* be the cause of the subsequent issues you see later.

 

I think (just my opinion) that replacing the Marvell controller may be your best bet for success.  You don't have any empty ports off that controller, or I would test moving the parity drive to one.

 

That makes sense, and points to a possible solution.  Yeah, it involves replacing hardware, but at least it's not just a stab in the dark, which is what I would have had to do absent any help.

 

I responded that I may just move the parity drive to an onboard controller, to see if separating that constantly-used drive from the 2 drives that are crashing might alleviate the problem.  If that doesn't work, a new controller card seems like a good next step.

 

What would be a good/reliable 4 port SATA card I could purchase as a replacement?


I may be in the minority, but I have questions about the power supply being powerful enough for all those drives plus TWO video cards (presumably high-end video cards that draw a lot of power).  How about removing all hardware added since the last time the parity check completed successfully (September? August?) and seeing if a new parity check will finish properly?  Just another option.  I am no expert in reading the logs.


I may be in the minority, but I have questions about the power supply being powerful enough for all those drives plus TWO video cards (presumably high-end video cards that draw a lot of power).  How about removing all hardware added since the last time the parity check completed successfully (September? August?) and seeing if a new parity check will finish properly?  Just another option.  I am no expert in reading the logs.

 

Well, they aren't that high-end as video cards go.  Only one (an nVidia 550 Ti) has an external/additional power connection and specifies a max power requirement of 116W; the other (an nVidia GT 720) is a fanless card with no extra power connection and specifies a max power of 23W.  So, between both cards, even if they were running at full power draw (unlikely during a parity check), they only need a max of about 140W, and my power supply is 750W.  Seems reasonable to me.

 

See the screenshots of my server sitting at idle and with both cards playing movies.  It idles at 138W with all drives spun down and both VMs running (but not showing any video).  It only jumps to 190W when I spin up all drives and play movies on both VMs.  I have 3.5 times that power left in reserve.  Seems sufficient to me, but I could certainly be wrong.

idle.png

working.png


What would be a good/reliable 4 port SATA card I could purchase as a replacement?

 

If you have a 4X slot (wired, likely a 16X-length slot), a used Adaptec 1430SA works extremely well and is reasonably priced (~$30).

I'm sure there's a Syba 4-port card (or something similar) that is regularly used by users here.

With 4 ports, a 1X card would be a bit insufficient; however, if that's the only slot you have, so be it.
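Rough numbers on why the slot width matters, assuming a PCIe 1.x-era card like the 1430SA (~250 MB/s per lane) and typical spinning-disk reads of 100-180 MB/s during a parity check:

  x1 slot:  ~250 MB/s shared by 4 drives  = ~60 MB/s each (a real bottleneck)
  x4 slot:  ~1000 MB/s shared by 4 drives = ~250 MB/s each (plenty of headroom)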

 


What would be a good/reliable 4 port SATA card I could purchase as a replacement?

I would be tempted to go for an 8-port card even if you do not need it.  That gives you some spare ports in case any play up, and you could also use them for SSDs, as they can normally be fitted somewhere inside virtually any case.

I would be tempted to go for an 8-port card even if you do not need it.  That gives you some spare ports in case any play up, and you could also use them for SSDs, as they can normally be fitted somewhere inside virtually any case.

 

Good idea.  I'll try to look through the forums and see what is generally recommended.

 

I'm going to try moving the cables around and putting the parity drive on the motherboard SATA controller before I buy more stuff, in case that works better.  The cache and the unassigned drives shouldn't even see any traffic from a parity check, so it's only the parity drive on that controller card that is getting so much activity.  If I swap it for one of my smallest array drives, the card should only be getting hammered during that one drive's part of the check, which might not kill it.

 

If I still have problems, a new card it is (which will be the 3rd one I've purchased for this server :()

