brandon Posted September 12, 2011

I think I'm experiencing slow write performance (parity protected) with my array. It has 9 data drives plus the parity drive (no cache disk). Usually I get 8-12MB/s writing to the array over the network. According to ethtool, I'm connected at 1000Mb/s, full duplex. I didn't think anything of my slow speeds until I read more on the forums here. I've had roughly the same performance on version 4.7 and now on 5.0 b12a.

The CPU is an AMD Sempron LE-1200 with 1GB of DDR2 memory on a Gigabyte GA-MA78G-DS3H motherboard (HPA disabled). I have a Monoprice 2-port PCIe x1 (SIL3132) card and a Rosewill RC-218 4-port SATA II PCIe x4 (Marvell 88SX7042) card. The power supply is an Antec Signature 850W.

Right now I'm more concerned with the write test below. The numbers seem slow, but I'm not sure. If the numbers were high here and low for copies across the network, I could isolate it to a network issue. But it looks like writing to the array is slow internally within the server.

The first column is the average of two dd if=/dev/zero of=//mnt/disk#/test.dd count=8192000 test runs. The second column is hdparm -t. The parity drive is a 2TB Seagate Barracuda LP (5900rpm).

Disk          dd write (MB/s)  hdparm -t (MB/s)  Drive
Disk 1 (sdj)  16.9             80                500GB Western Digital Caviar Blue (7200rpm)
Disk 2 (sdh)  15.2             123               1TB Seagate Barracuda 7200.12 (7200rpm)
Disk 3 (sdg)  17               108               640GB Western Digital Caviar Blue (7200rpm)
Disk 4 (sdk)  12.5             113               1TB Samsung F1 (7200rpm)
Disk 5 (sdb)  19.5             75                500GB Hitachi P7K500 (7200rpm)
Disk 6 (sdc)  20.2             83                500GB Hitachi P7K500 (7200rpm)
Disk 7 (sdi)  16.1             108               1TB Samsung F1 (7200rpm)
Disk 8 (sde)  17.6             108               1TB Western Digital Caviar Green (5400rpm)
Disk 9 (sdd)  13.3             120               2TB Western Digital Caviar Green (5400rpm)

syslog-2011-09-12.txt
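One thing worth noting about the test command above: dd uses a 512-byte block size by default, so count=8192000 writes 8192000 x 512 bytes = 4000 MiB in roughly 8 million small write calls. A minimal sketch of the same test with an explicit, larger block size follows; this may reduce dd's own CPU overhead, though that is an assumption to verify on the actual server. The function name write_test and the 1 MiB block size are illustrative choices, not something from the thread.

```shell
#!/bin/sh
# write_test TARGET_FILE [SIZE_MIB]
# Writes SIZE_MIB mebibytes of zeros to TARGET_FILE in 1 MiB blocks and
# prints dd's summary line. conv=fsync makes dd flush the data to disk
# before reporting, so the page cache does not inflate the result.
# Default size is 4000 MiB, matching count=8192000 at 512 bytes per block.
write_test() {
    target=$1
    size_mib=${2:-4000}
    # dd prints its statistics on stderr; keep only the final summary line.
    dd if=/dev/zero of="$target" bs=1M count="$size_mib" conv=fsync 2>&1 | tail -n 1
}
```

For example, write_test /mnt/disk1/test.dd would write the same amount of data as the original command. The test file should be deleted afterwards.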
dgaschk Posted September 12, 2011

That PSU has four 12V rails. See this for PSU info: http://lime-technology.com/forum/index.php?topic=12219.0
brandon Posted September 12, 2011 Author

My mistake. I realized that the power supply in the server is actually a Thermaltake Toughpower 1000W Cable Management.

One more thing I realized is that the parity drive isn't 4K aligned. Could that be what is causing the performance issue? If so, I have a brand new 2TB Seagate LP 64MB cache drive that I'll preclear with 4K alignment and use to replace the existing parity drive.
dgaschk Posted September 12, 2011

The Thermaltake also has four 12V rails. The maximum amperage is 36A.
brandon Posted September 12, 2011 Author

What is nice about that power supply is that each modular connection is labelled with the voltage rail it belongs to. I've distributed the drives across the rails pretty evenly (which was easy, since half the drives use direct SATA power connections and the other half are in hotswap racks that take Molex).
Joe L. Posted September 12, 2011

4K alignment is ONLY an issue with 2TB EARS (so-called "advanced format") drives. Your parity drive is a Seagate, so alignment will not make any difference.

What will make a difference is replacing the parity drive with one that has a faster rotational speed. Since you have a large number of 7200 RPM drives, a 7200 RPM parity drive will make writing to those drives about a third faster. When writing to the array, the slowest-rotating drive involved dictates the overall write speed.
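As a rough worked example of the "about a third faster" figure: if array write speed scales with the rotational speed of the slowest drive involved (the rule of thumb above, taken here as an assumption rather than a measured result), the best-case speedup is simply the ratio of the two spindle speeds. The helper name rpm_ratio is hypothetical.

```shell
#!/bin/sh
# rpm_ratio FAST_RPM SLOW_RPM
# Prints FAST_RPM / SLOW_RPM to two decimals: the best-case write speedup
# under the assumption that throughput scales with rotational speed.
rpm_ratio() {
    LC_ALL=C awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

rpm_ratio 7200 5400   # 7200 RPM parity versus a 5400 RPM drive: prints 1.33
rpm_ratio 7200 5900   # 7200 RPM parity versus the 5900 RPM Seagate LP: prints 1.22
```

By this estimate, replacing the 5900rpm parity drive in this particular array would give closer to a 22% gain than a full third, and only for writes to the 7200rpm data drives.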
brandon Posted September 12, 2011 Author

So are the speeds I'm getting typical? It seems like others with all Green/LP drives are able to get faster speeds than me. I know that the Green/LP drives in my array are holding me back, but I'd assume I should still be able to hit 20MB/s on transfers. Most of the internal writes using dd are around 15MB/s.
prostuff1 Posted September 12, 2011

Perhaps some drives are registering as IDE instead of SATA and AHCI. Check the BIOS on the board, and if you see anything about IDE, disable it.
brandon Posted September 12, 2011 Author

I'm in AHCI mode in the BIOS. Is there anything I would need to do for my PCIe controller cards?
mbryanr Posted September 12, 2011

Which controller are these on?

Line 1102: Sep 8 15:34:51 fileserver kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Line 1114: Sep 8 15:55:20 fileserver kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Line 1134: Sep 8 16:20:14 fileserver kernel: ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Line 1339: Sep 10 02:12:00 fileserver kernel: ata6.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Line 1448: Sep 10 18:58:50 fileserver kernel: ata6.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Line 1691: Sep 12 12:25:29 fileserver kernel: ata6.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Line 1704: Sep 12 12:27:07 fileserver kernel: ata6.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen
Line 1717: Sep 12 12:27:27 fileserver kernel: ata6.00: exception Emask 0x50 SAct 0x0 SErr 0x280900 action 0x6 frozen

Are these normal for your system/card?
http://lime-technology.com/wiki/index.php?title=The_Analysis_of_Drive_Issues#Drive_interface_issue_.232
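A quick way to see which ports are throwing these exceptions, and how often, is to count them per ATA device in the syslog. The sketch below assumes the log lines have the same shape as the ones quoted above; the function name count_ata_exceptions is made up for illustration.

```shell
#!/bin/sh
# count_ata_exceptions SYSLOG_FILE
# Prints one line per ATA device ("ataN.NN COUNT") for every device that
# logged at least one "exception" line, sorted by device name.
count_ata_exceptions() {
    grep -oE 'ata[0-9]+\.[0-9]+: exception' "$1" \
        | cut -d: -f1 \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

Run against the attached log, for example as count_ata_exceptions syslog-2011-09-12.txt, this would list ata6.00 and ata7.00 with their respective counts.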
brandon Posted September 12, 2011 Author

How can I check which drives those correspond to? I will check in an hour or two when I'm at home.
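One way to map an ataN port from the syslog to an sdX device is to look at where each /sys/block/sdX symlink points. This is a sketch that assumes a kernel whose sysfs device paths include the ataN port name; on older kernels the path may not contain it, in which case the boot messages in the syslog are the fallback. The function name ata_to_dev and its optional second argument are hypothetical.

```shell
#!/bin/sh
# ata_to_dev PORT_NUMBER [SYS_BLOCK_DIR]
# Prints the sdX name(s) whose resolved sysfs path passes through ataPORT.
# SYS_BLOCK_DIR defaults to /sys/block and exists mainly to allow testing.
ata_to_dev() {
    port=$1
    sysblock=${2:-/sys/block}
    for link in "$sysblock"/sd*; do
        [ -e "$link" ] || continue
        # The slashes around ataN keep ata7 from also matching ata17.
        case "$(readlink -f "$link")" in
            */ata"$port"/*) basename "$link" ;;
        esac
    done
}
```

Calling ata_to_dev 7 and ata_to_dev 6 on the server would then name the devices behind the two ports in the log excerpts.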
brandon Posted September 12, 2011 Author

OK, so I checked the system now that I'm home from work. Both devices are on the onboard SATA ports. The ata7 device had a slightly loose SATA power cable. It was plugged in, but when I pushed on it, it moved maybe a millimetre and clicked in.

I had also forgotten that there was a Promise PCI SATA150 card installed (no drives on it), so I removed it. I also moved the ata7 device from the onboard SATA ports to the remaining port on the SIL3132 PCIe card. I figured I might as well, since I was getting SB600/700 softreset errors from the onboard ports.

One final change: I checked the motherboard manual and found that, because of the specific PCIe slot I was using for the Monoprice SATA controller, the PCIe x4 slot holding the Rosewill controller card was running at x1 speed. I moved the Monoprice card to the PCIe x16 slot, so the Rosewill controller can now use the full x4 link (rather than being limited to x1).

Attached is a new syslog and screenshot of my BIOS settings.

syslog-2011-09-121.txt
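Whether a card actually negotiated the wider link after a slot change can be checked from lspci -vv, which reports the link capability (LnkCap) and the current link status (LnkSta) for each PCIe device. The sketch below only parses that output; the function name link_widths and the bus address in the usage note are placeholders, and lspci usually needs root to show these fields.

```shell
#!/bin/sh
# link_widths
# Reads `lspci -vv` output for one device on stdin and prints the link
# width the device is capable of next to the width actually negotiated.
link_widths() {
    awk '
        /LnkCap:/ {
            if (match($0, /Width x[0-9]+/))
                cap = substr($0, RSTART + 6, RLENGTH - 6)
        }
        /LnkSta:/ {
            if (match($0, /Width x[0-9]+/))
                print "capable " cap ", negotiated " substr($0, RSTART + 6, RLENGTH - 6)
        }
    '
}
```

A call such as lspci -vv -s 02:00.0 | link_widths (with the controller's real bus address substituted) would show at a glance if an x4 card is still running at x1.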
brandon Posted September 12, 2011 Author

Attached are my AHCI BIOS settings.
mbryanr Posted September 12, 2011

I don't have a chance to look right now, but it looks like you found a few things that should improve your writes. I did find this on the SB600:

[ 2.056412] ata5: applying SB600 PMP SRST workaround and retrying

The above two are expected. It's a bug in SB600 controller being worked around.
http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-09/msg05203.html
brandon Posted September 12, 2011 Author

I just got a new drive today as I'm running out of space. It's preclearing now, so I'll redo the tests tomorrow or the day after, once it's part of the array (replacing one of the 500GB drives).
brandon Posted September 15, 2011 Author

The new drive I was going to add ended up with lots (1000+) of bad sectors during preclear, so it's on hold while I get it RMA'd.

I did run new tests on my file server. The numbers in brackets are the new results, and they are generally the same. I'll have to check when I go home, but the drives that write faster (1, 2, 6, 7) seem to be on the Rosewill PCIe x4 card. Could the motherboard's SATA controller be saturated? I have 5 drives on it (the parity drive is one of them). I have a spare Promise 4-port SATA300 PCI controller that I could dedicate to just the parity drive. Is that worth trying?

The first column is the average of two dd if=/dev/zero of=//mnt/disk#/test.dd count=8192000 test runs. The second column is hdparm -t. The parity drive is a 2TB Seagate Barracuda LP (5900rpm).

Disk          dd write (MB/s)  hdparm -t (MB/s)  Drive
Disk 1 (sdj)  16.9(15.6)       80(81)            500GB Western Digital Caviar Blue (7200rpm)
Disk 2 (sdh)  15.2(17.5)       123(92)           1TB Seagate Barracuda 7200.12 (7200rpm)
Disk 3 (sdg)  17(19.6)         108(108)          640GB Western Digital Caviar Blue (7200rpm)
Disk 4 (sdk)  12.5(13.         113(115)          1TB Samsung F1 (7200rpm)
Disk 5 (sdb)  19.5(14.4)       75(73)            500GB Hitachi P7K500 (7200rpm)
Disk 6 (sdc)  20.2(18.9)       83(84)            500GB Hitachi P7K500 (7200rpm)
Disk 7 (sdi)  16.1(18.4)       108(105)          1TB Samsung F1 (7200rpm)
Disk 8 (sde)  17.6(13.0)       108(108)          1TB Western Digital Caviar Green (5400rpm)
Disk 9 (sdd)  13.3(13.9)       120(122)          2TB Western Digital Caviar Green (5400rpm)
brandon Posted September 16, 2011 Author

Are the speeds I'm getting normal for my setup? Writes still seem slow, especially since those are the internal write speeds, before any network overhead is added.
dgaschk Posted September 16, 2011

Those speeds are slow. There is trouble with ata6 and ata7, but that should not affect the speed of the other drives unless one of them (i.e. ata6 or ata7) is the parity drive.
brandon Posted September 20, 2011 Author

Nope, neither ata6 nor ata7 is the parity drive.
brandon Posted September 20, 2011 Author

So I checked my memory usage and top processes while running dd (which should be the same as writing anything to the array, right?), and I'm concerned with the results.

top - 08:39:48 up 3 days, 13:30, 2 users, load average: 1.89, 1.70, 1.04
Tasks: 96 total, 2 running, 94 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.2%us, 18.3%sy, 0.0%ni, 50.1%id, 26.4%wa, 0.0%hi, 2.1%si, 0.0%st
Mem: 901436k total, 892476k used, 8960k free, 48316k buffers
Swap: 0k total, 0k used, 0k free, 722876k cached

PID   USER PR NI VIRT RES SHR S %CPU %MEM TIME+   COMMAND
11058 root 20 0  2304 692 520 R 75.3 0.1  2:42.12 dd

It looks like CPU usage is high for the dd process, and my load average is quite high too, right? I'm using a single-core Sempron LE-1200. Could that be holding things back?
brandon Posted September 21, 2011 Author

So I think I figured out what the problem is. I swapped the Sempron LE-1200 processor (2.1GHz single-core) for a Phenom II 7750 (2.7GHz dual-core), and speeds have increased. The old CPU was maxing out all the time during writes to the array.

The first column is the average of two dd if=/dev/zero of=//mnt/disk#/test.dd count=8192000 test runs. The second column is hdparm -t. The parity drive is a 2TB Seagate Barracuda LP (5900rpm).

The unbracketed numbers are from the original system, the round-bracketed ones are from after my initial attempts to fix the issue (an earlier post), and the square-bracketed ones are the current numbers.

Disk          dd write (MB/s)   hdparm -t (MB/s)  Drive
Disk 1 (sdj)  16.9(15.6)[25.5]  80(81)[80]        500GB Western Digital Caviar Blue (7200rpm)
Disk 2 (sdb)  15.2(17.5)[27.7]  123(92)[121]      1TB Seagate Barracuda 7200.12 (7200rpm)
Disk 3 (sdh)  17.0(19.6)[28.8]  108(108)[108]     640GB Western Digital Caviar Blue (7200rpm)
Disk 4 (sdk)  12.5(13.[22.7]    113(115)[114]     1TB Samsung F1 (7200rpm)
Disk 5 (sdc)  19.5(14.4)[26.4]  75(73)[75]        500GB Hitachi P7K500 (7200rpm)
Disk 6 (sdf)  20.2(18.9)[16.5]  83(84)[81]        500GB Hitachi P7K500 (7200rpm) ----> now a Seagate LP 2TB
Disk 7 (sdi)  16.1(18.4)[26.7]  108(105)[110]     1TB Samsung F1 (7200rpm)
Disk 8 (sde)  17.6(13.0)[24.5]  108(108)[122]     1TB Western Digital Caviar Green (5400rpm)
Disk 9 (sdd)  13.3(13.9)[22.0]  120(122)[121]     2TB Western Digital Caviar Green (5400rpm)

The new Seagate 2TB is on the Promise PCI controller. I moved it to that controller for the preclear, but I forgot to move it back to the Monoprice 2-port PCIe controller. I'm assuming it'll be fast when I hook it back up.
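To put the before and after dd figures in perspective, the relative change per disk can be worked out directly from the table. The helper below is only a convenience for that arithmetic; its name, pct_change, is not from the thread, and the inputs are the original and current dd numbers quoted above.

```shell
#!/bin/sh
# pct_change OLD NEW
# Prints the percentage change from OLD to NEW, to one decimal place.
pct_change() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 16.9 25.5   # Disk 1, original vs current: prints 50.9
pct_change 12.5 22.7   # Disk 4, original vs current: prints 81.6
pct_change 20.2 16.5   # Disk 6 (now on the Promise PCI card): prints -18.3
```

So the CPU swap coincides with roughly a 50 to 80 percent gain on these disks, while the one drive sitting on the PCI controller got slower.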