RC11 on X9SCL-F-O LGA w/ i3-2100 slow Parity Checks with new SAS2LP-MV8



It all started when I added the remaining 4 drives to my SASLP-MV8.  Prior to this I was using the 6 ports on the mobo and 4 ports on the SASLP-MV8, running unRAID 4.7.  With only 4 drives on the SASLP-MV8, parity checks started at 100MB/s and ended in the 70MB/s range.  All of the drives are 2TB EARS or EARX.  After adding the remaining 4 drives to the SASLP-MV8, my parity checks dropped to a constant 70MB/s.  That looked to me like a bandwidth restriction on the x4 PCIe bus, so I went out and got the SAS2LP-MV8 and upgraded unRAID to 5.0-RC11.  The parity check then dropped to less than 50MB/s.

I checked the SMART status on the drives and noticed two "current pending sector" counts on the parity drive (2TB EARS).  I ordered a 3TB WD Red, pre-cleared it, and added it to the array; the parity rebuild started at 108MB/s, was down to 60MB/s at the end of the first 2TB, and averaged 75MB/s overall.  It sounded like that drive was the slowdown and I had fixed my bottleneck.  Nope: this morning I started a parity check to make sure the rebuilt parity was correct, and it is checking at less than 50MB/s again.

I am running unmenu with very few packages installed (APC, PHP, SMART history).  I ran the user script to check hard drive speeds and they all show 100+MB/s (a minimal stand-alone version of such a test is sketched below, after the drive list).  I do have the SAS2LP-MV8 installed in an x8 PCIe slot.  I'm guessing this is why unRAID is still an RC.  Any suggestions on how to restore my parity check speed?

 

slot    model / serial                         dev    size (1K blocks)  temp  capacity
parity  WDC_WD30EFRX-68AX9N0_WD-WMC1T1589815   (sda)  2930266532        30°C  3 TB
disk1   WDC_WD20EARS-00MVWB0_WD-WCAZA1105446   (sdb)  1953514552        30°C  2 TB
disk2   WDC_WD20EARS-00MVWB0_WD-WCAZA1136222   (sdc)  1953514552        30°C  2 TB
disk3   WDC_WD20EARS-00MVWB0_WD-WMAZA1092316   (sdd)  1953514552        29°C  2 TB
disk4   WDC_WD20EARS-00MVWB0_WD-WMAZA3680791   (sde)  1953514552        30°C  2 TB
disk5   WDC_WD20EARS-00MVWB0_WD-WMAZA1058936   (sdf)  1953514552        34°C  2 TB
disk6   WDC_WD20EARS-00MVWB0_WD-WCAZA2643619   (sdn)  1953514552        30°C  2 TB
disk7   WDC_WD20EARS-00S8B1_WD-WCAVY5678574    (sdm)  1953514552        35°C  2 TB
disk8   WDC_WD20EARS-00MVWB0_WD-WMAZA4276078   (sdl)  1953514552        35°C  2 TB
disk9   WDC_WD20EARX-00PASB0_WD-WMAZA5701378   (sdo)  1953514552        27°C  2 TB
disk10  WDC_WD20EARX-00ZUDB0_WD-WCC1H0740724   (sdk)  1953514552        28°C  2 TB
disk11  WDC_WD20EARX-00ZUDB0_WD-WCC1H0748053   (sdj)  1953514552        29°C  2 TB
disk12  WDC_WD20EARX-00AZ6B0_WD-WCC070165700   (sdi)  1953514552        32°C  2 TB
disk13  WDC_WD20EARS-00MVWB0_WD-WMAZA3792065   (sdh)  1953514552        29°C  2 TB
flash   JD_FireFly - 2 GB 1.5 GB 763 232 0

syslog-2013-02-11.txt
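For reference, the drive-speed test mentioned above is an unmenu user script (a shell script).  A minimal stand-alone equivalent is sketched here purely as an illustration, assuming Python 3 is available: the device list is an example, it needs root, and it should be run on an idle array.

#!/usr/bin/env python3
# Rough sequential-read benchmark for each array drive (a sketch).
# The device list is an example -- substitute your own /dev/sdX names.
import subprocess
import time

DEVICES = ["/dev/sda", "/dev/sdb", "/dev/sdc"]  # extend to all array disks
READ_MB = 1024  # read 1 GiB from the start of each disk

for dev in DEVICES:
    start = time.monotonic()
    # dd from the raw device; iflag=direct bypasses the page cache
    subprocess.run(
        ["dd", f"if={dev}", "of=/dev/null", "bs=1M",
         f"count={READ_MB}", "iflag=direct"],
        check=True, capture_output=True)
    elapsed = time.monotonic() - start
    print(f"{dev}: {READ_MB / elapsed:.1f} MB/s")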

Link to comment

The SAS2LP is a PCIe x4 2.0 card. It will run at PCIe x4 even if inserted in a PCIe x8 slot. It may improve performance if the MB has PCIe 2.0 slots and the drives each operate at over 150MB/s; otherwise it should make no difference at all. I think the EARS and EARX drives operate at about 110MB/s maximum, so 8 of them will not saturate a PCIe x4 1.1 bus.

 

The problem is more likely caused by read errors or a bad or loose SATA cable. It's normal for drives to operate more slowly towards the end of the disk. The screenshot shows that the check is in the last part of the 2TB disks; the speed should improve once the check is past the 2TB point. What was the overall speed at completion?

Link to comment

Once past 2TB the parity check went up to 112MB/s and ended at 80MB/s.  The mobo has two PCIe v2.0 x8 slots and one PCIe v2.0 x4 in an x8 slot.  The old SASLP-MV8 was x4 and the SAS2LP-MV8 is x8.

 

The fact is that the SASLP-MV8 slowed down when I filled all 8 ports.  It is an x4 card that maxed out at 75MB/s with 8 drives attached.

 

I changed the card out for the SAS2LP-MV8 (x8) and it stayed at 50MB/s for the whole 2TB of the parity check.  That is why I changed the parity drive: it showed errors.  I don't have a backplane, just direct cable connections.  I might have to check the cables once more and maybe swap them out.  After I replaced the parity drive I did a rebuild of the parity drive.  The rebuild started out at 110MB/s; at 2TB it was 60MB/s; then from 2TB to 3TB it ran at 112-80MB/s.  Once the array was protected and up and running I did a parity check.  The check stayed at 50MB/s for the first 2TB and went up to 112MB/s.  It makes no sense to me: the parity check just reads from all drives, while the parity rebuild reads from 13 and writes to 1.

Link to comment

Not sure which part doesn't make sense.  If it's the part where the speed drops until the 2TB mark, then jumps much higher, please see the Parity Check Speed section of the Improving unRAID Performance wiki page for an explanation.

 

If what doesn't make sense is the 50MB/s top speed during the first 2TB, yes, that is a little troubling.  You may have reached the real-world performance level of that combination of hardware.  Just one idea: have you seen this post by dheg, and the associated thread?  Any chance it could apply to your card too?  Or perhaps there is a similar update for it?

Link to comment
I think the EARS and EARX drives operate at about 110MB/s maximum, so 8 of them will not saturate a PCIe x4 1.1 bus.

 

Each lane of the PCIe v1.x interface will, theoretically, run at 250MB/s.  Four lanes give 1000MB/s, and real-world performance may approach 800MB/s.  Eight drives at 110MB/s add up to 880MB/s and can, therefore, saturate a PCIe v1.x x4 connection.
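Spelled out as a quick calculation (a sketch; the ~80% real-world efficiency figure is an assumption):

# Back-of-the-envelope PCIe v1.x bandwidth budget.
LANE_MBPS = 250      # theoretical per-lane throughput for PCIe 1.x
lanes = 4
efficiency = 0.8     # assumed real-world protocol overhead

bus_mbps = LANE_MBPS * lanes * efficiency  # ~800 MB/s usable
drives, drive_mbps = 8, 110
demand_mbps = drives * drive_mbps          # 880 MB/s

print(f"usable bus: {bus_mbps:.0f} MB/s, demand: {demand_mbps} MB/s")
print("saturated" if demand_mbps > bus_mbps else "headroom")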

Link to comment
The check stayed at 50MB/s for the first 2TB and went up to 112MB/s.

 

That suggests, to me, that either:

 

    one (or more?) of your 2TB drives is underperforming.

 

or:

 

    your mv8 is not correctly negotiating a x8 connection.

 

Obviously, once the parity process passes 2TB, it only has one drive (the parity) to read/write.  It would be worrying if this couldn't achieve the full 110MB/s capability of the drive.
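One way to rule out the second possibility: lspci reports both what the card supports (LnkCap) and what was actually negotiated (LnkSta).  A rough sketch of pulling those lines out (the 'Marvell'/'SAS' match is a guess; adjust it to whatever lspci calls your card):

# Print the negotiated PCIe link width/speed for the HBA (a sketch).
import subprocess

out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout

header = ""
for line in out.splitlines():
    if line and not line[0].isspace():
        header = line                      # start of a new PCI device section
    elif "LnkCap:" in line or "LnkSta:" in line:
        if "Marvell" in header or "SAS" in header:  # crude filter
            print(header)
            print("   ", line.strip())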

 

I ran the user script to check hard drive speeds and they all show 100+MB/s.

 

Each and every one of them?  Parity processing cannot run any faster than the slowest drive.

 

I'm guessing this is why unRAID is still an RC.

 

I think it highly unlikely that this is anything to do with unRAID.  It could, conceivably, be a kernel or driver issue, but it's much more likely that there is some constraint within your hardware.

Link to comment
The part that doesn't make sense is that the parity rebuild did what it was supposed to do (started at 112MB/s) while the parity check didn't (started at 50MB/s).

 

Okay, I guess that does seem a little odd.  I would just point out that a parity check reads from all drives, without writing, while a parity build reads from all drives but one, then writes to the parity drive.
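For intuition: unRAID's parity is a bytewise XOR across the data disks, so a check reads every disk and verifies, while a build reads the data disks and writes the result.  A toy model (emphatically not unRAID's actual md code):

# Toy model of XOR parity: check reads N+1 disks, build reads N and writes 1.
from functools import reduce

data_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # pretend stripes

def xor(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

parity = xor(data_disks)                  # build: write this to the parity disk
assert xor(data_disks + [parity]) == bytes(len(parity))  # check: XOR of all is zero
print("parity consistent")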

Link to comment

Here is my log during the parity check.  It is a really boring read.  I'm guessing it is a driver issue with the SAS2LP-MV8.  The average parity check speed was 57MB/s.  I might just stick the SASLP-MV8 back in for now.

 

Feb 11 05:37:30 ACNAS01 kernel: mdcmd (71): check CORRECT

Feb 11 05:37:30 ACNAS01 kernel: md: recovery thread woken up ...

Feb 11 05:37:30 ACNAS01 kernel: md: recovery thread checking parity...

Feb 11 05:37:30 ACNAS01 kernel: md: using 1536k window, over a total of 2930266532 blocks.

Feb 11 10:59:37 ACNAS01 kernel: mdcmd (72): clear

Feb 11 13:47:45 ACNAS01 kernel: mdcmd (73): clear

Feb 11 18:09:00 ACNAS01 kernel: mdcmd (74): spindown 1

Feb 11 18:09:01 ACNAS01 kernel: mdcmd (75): spindown 2

Feb 11 18:09:01 ACNAS01 kernel: mdcmd (76): spindown 3

Feb 11 18:09:02 ACNAS01 kernel: mdcmd (77): spindown 5

Feb 11 18:12:12 ACNAS01 kernel: mdcmd (78): spindown 6

Feb 11 18:12:13 ACNAS01 kernel: mdcmd (79): spindown 7

Feb 11 18:12:14 ACNAS01 kernel: mdcmd (80): spindown 8

Feb 11 18:12:14 ACNAS01 kernel: mdcmd (81): spindown 9

Feb 11 18:12:14 ACNAS01 kernel: mdcmd (82): spindown 10

Feb 11 18:12:15 ACNAS01 kernel: mdcmd (83): spindown 11

Feb 11 18:12:15 ACNAS01 kernel: mdcmd (84): spindown 12

Feb 11 18:12:16 ACNAS01 kernel: mdcmd (85): spindown 13

Feb 11 20:12:56 ACNAS01 kernel: md: sync done. time=52526sec

Feb 11 20:12:56 ACNAS01 kernel: md: recovery thread sync completion status: 0
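From those two figures the average is easy to verify (md reports the array size in 1KiB blocks):

# Average parity-check speed implied by the syslog above.
blocks = 2930266532   # "total of 2930266532 blocks" (1 KiB each)
seconds = 52526       # "md: sync done. time=52526sec"

print(f"average: {blocks * 1024 / seconds / 1e6:.1f} MB/s")  # ~57 MB/s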

 

Link to comment

People will just blame your hardware, or your cables, etc. I personally would take it all with a grain of salt. The fact is, I have 2 servers with roughly the same hardware as you. Anything after unRAID 5.0 Beta12a results in 50-60MB/s parity checks; anything before results in 100+. Fresh parity rebuilds run at 100MB/s+ on any version (same as you).  The problem is that something was changed in unRAID; if you look back at the old betas you will see many people experienced this. It was never fixed, never acknowledged, etc. I've reported it many times, and now people seem to have forgotten about it and always blame the user's server(s) when someone reports it.

 

I'm 100% confident the problem is unRAID having some type of driver/kernel issue, probably with the SAS2LP cards. I don't see it ever being fixed, because the problem is apparently always on the user's side. I would tell you to email Tom and ask him to look into it, but I've only done that a dozen times with no replies back.

 

By the way, I've also run tests on each and every drive in my servers; the slowest one was 116MB/s. The fact that we get 100MB/s+ during parity rebuilds proves the bandwidth is there, and it is further proven by downgrading to 12a. The problem is that something in the newer versions of unRAID is slowing down parity syncs on our hardware. Sadly, I do NOT suggest downgrading; the older versions of unRAID had more serious bugs with these cards.

 

I think it highly unlikely that this is anything to do with unRAID.  It could, conceivably, be a kernel or driver issue, but it's much more likely that there is some constraint within your hardware.

 

I'm confident it is not his server. It is very likely a kernel or driver issue, and it really needs to be fixed. I spent about $1800 on my servers (not including hard drives) just so I could get faster parity checks and get off my old PCI-X interface. It worked great; then a few weeks later I upgraded unRAID and my new servers became slower than my old servers. Downgrading restores good performance, but then you have to deal with the SAS bug that randomly brings the whole server down. I've had to choose between speed and stability for almost a year now, and Tom has not acknowledged the problem. I was confident he would get this fixed, or at least figure out what was causing it, before 5.0 went final... but I'm having my doubts now.

Link to comment

People will just blame your hardware, or your cables, etc. [...] I'm 100% confident the problem is unRAID having some type of driver/kernel issue, probably with the SAS2LP cards.

 

Thank you.  I was hoping someone would say what you just said.  I have two servers, and my other server is the same except it has 2 SASLP-MV8 cards and parity checks at 75MB/s.  Looks like I will not be upgrading that server to 2 SAS2LP-MV8 cards any time soon.

Link to comment
I'm confident it is not his server. It is very likely a kernel or driver issue, and it really needs to be fixed.

 

Okay, with your additional evidence, there might be something to go on.  You're both using SAS2LP-MV8 cards ... are there any other commonalities in your hardware?

 

The SAS2LP uses a Marvell chip - perhaps there is something odd in the driver.  I don't know whether there is any connection, but I also have a Marvell-based controller (HighPoint RocketRAID 620) which had been performing badly ever since I installed it.  Parity checks went down from 100+MB/s to around 30MB/s as soon as I put one of my array drives on the RocketRAID.  A couple of weeks ago, I upgraded my system from an Intel DH55-TC/i3-530 to a Supermicro X9SCM-iiF/Xeon E3-1230.

 

I was very (pleasantly) surprised to find that the RocketRAID now transfers data as fast as the disk will allow and parity checks run at 100+MB/s again.

 

So, was this a software problem or a hardware problem? - I can't be sure, but changing the underlying hardware effected a cure.

 

For the record, the RocketRAID is a PCIe v2 x1 card and I'm using the basic AHCI driver on it.

 

I think that it might be worth canvassing for other SAS2LP-MV8 users to find out what transfer rates they achieve and what other hardware they have in their systems.  I suspect that this problem, and the slow writes reported on X9-SCM motherboards, are both caused by a combination of more than one item of hardware.

 

Link to comment

I think that it might be worth canvassing for other SAS2LP-MV8 users to find out what transfer rates they achieve and what other hardware they have in their systems.  I suspect that this problem, and the slow writes reported on X9-SCM motherboards, are both caused by a combination of more than one item of hardware.

 

Have you checked the motherboard PCIe settings?  Before upgrading to a newer ASRock Extreme4 board, I was using an older board, and my SASLP-MV8 was being limited to PCIe 1 speeds due to motherboard settings.  I changed the motherboard setting and started getting much better speeds.

 

I'm sure that isn't it, but I wanted to make sure the basics are covered.  My thought was that perhaps you set it specifically for the SASLP-MV8 and, once you replaced it with the SAS2LP-MV8, never changed it.

 

-Marcus

Link to comment

The SAS2LP is a PCIe x4 2.0 card. It will run at PCIe x4 even if inserted in a PCIe x8 slot. [...]

 

According to this link:

 

http://www.supermicro.com/products/accessories/addon/aoc-sas2lp-mv8.cfm

 

The AOC-SAS2LP-MV8 card has a PCIe x8 interface.  This is what I see in my syslog:

 

mvsas 0000:02:00.0: mvsas: PCI-E x8, Bandwidth Usage: 5.0 Gbps (Drive related)

 

All my drives are connected via an AOC-SAS2LP-MV8. I'm slowly moving to 4TB 7200rpm drives, but right now I have a lot of 2TB 5900rpm drives.  Still, my parity checks are often in the 80-85MB/s range ... and this is on 5.0-RC11a.

 

Never seen speeds of over 100MB/s during parity checks, but I'd love to.  Maybe once I complete the migration to the 4TB drives, but that's gonna take a while.

Link to comment

 

All my drives are connected via an AOC-SAS2LP-MV8. [...] Still, my parity checks are often in the 80-85MB/s range ... and this is on 5.0-RC11a.

 

What mobo are you using?

Link to comment

 

What mobo are you using?

 

Sorry, always meant to put together a .sig:

 

OS: UnRaid 5.0-rc11a

MB: ASUS M3A78-T AM2+/AM2

CPU: AMD Phenom II X6 1100T 3.3GHz (downclocked to 2 cores)

RAM: Mushkin 991762 4x4GB DDR2 800 PC2 6400 (16GB)

LAN: Marvell Yukon 88E8056 PCI-E Gigabit Ethernet Controller (on Motherboard)

SATA: 2 x AOC-SAS2LP-MV8 (SAS/SATA)

Cage: 3 x Icy Dock MB455SPF-B (5 in 3 Drive Cage)

PSU: SeaSonic X750 80 PLUS Gold 750W

CASE: Lian Li Armorsuit PC-P50

UPS: CyberPower CP1500AVRLCD 1500VA 900W

 

Drives:

 

3 x Hitachi_HDS724040ALE640 (4TB/7200rpm)

8 x SAMSUNG_HD204UI (2TB/5900rpm)

2 x Seagate_ST2000DL004_HD204UI (2TB/5400rpm?)

 

I did see over 100MB/s while expanding one of the 2TB drives to a 4TB drive (a rebuild), but we're back down to ~88MB/s for the parity check.

Link to comment
  • 2 weeks later...

I'm having the same issue with my hardware (see sig).

 

On rc5 my parity check speed would start at about 140MB/s and end at about 90MB/s. On rc11 it starts at a lousy 70-80MB/s and ends at about 40MB/s, doubling the time it takes from 10 hours on rc5 to almost 20 hours on rc11...

 

My SAS2LP is also running at 5Gbps.

 

Data transfer rates to and from the array over the network are the same as with rc5, so it looks like a parity check problem only.

Link to comment

Currently running 2 drives on a SAS2LP-MV8 and 3 on an M1015 here, all various WD Green drives.

 

My parity sync speeds are fine with RC11, starting at around 130MB/s and dropping to below 100MB/s towards the end. Speed is about the same both on bare-metal unRAID and on a virtual machine with the controllers passed through.

 

Link to comment

My parity sync speeds are fine with RC11, starting at around 130MB/s and dropping to below 100MB/s towards the end.

Strange that you don't experience it while others, like me, do... maybe some BIOS setting?

Unlikely, because if I simply reboot into rc5, the speed is back... ???

Link to comment

Just for a test, I've disabled SimpleFeatures and all plugins, rebooted, and am running a stock rc11, bare metal. It still won't go beyond 50MB/s at the start of a parity check... if I return to rc5, even with all plugins and SimpleFeatures and whatnot enabled, it starts out at a whopping 169MB/s for the first few minutes and then holds a steady 140MB/s...

 

WHAT THE F%$#K is wrong with this rc11?

Link to comment

Just for a test, I've disabled SimpleFeatures and all plugins, rebooted, and am running a stock rc11, bare metal. It still won't go beyond 50MB/s at the start of a parity check... [...]

 

Since all the RCs are available for download, I would suggest you try them one at a time so you can give Tom a narrower window of good vs. bad parity check speeds.

Link to comment

I'll see what I can do, but I think Tom knows best what changed when and why...

 

Right, but between rc5 and rc11, there were...

rc5
rc5-r8168
rc6-8168-test
rc6-8168-test2
rc7
rc8
rc8a
rc9
rc9a
rc10
rc10-test
rc11

So as you can see, there were many changes between rc5 and rc11; it would help to determine where the change occurred.

Link to comment

Tom doesn't necessarily know precisely what changed in each release - don't forget that he is dependent on the Linux kernel and drivers.

 

You don't have to try every intermediate release - try doing a 'binary chop' on the set of releases, i.e. try the one mid-way between rc5 and rc11 and use the result to determine which half of the set to investigate next.  That way you don't have to try more than four of the ten intermediate releases.
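In code terms, the chop looks something like this (a sketch; is_fast() stands in for "boot that release and time a parity check", and the example results are made up purely for illustration):

# Binary search for the first release that introduced the slowdown.
releases = ["rc5", "rc5-r8168", "rc6-8168-test", "rc6-8168-test2",
            "rc7", "rc8", "rc8a", "rc9", "rc9a", "rc10", "rc10-test",
            "rc11"]

def is_fast(release):
    # Placeholder: in reality, boot this release and measure the check speed.
    example = {"rc7": True, "rc8": False, "rc8a": False}  # made-up results
    return example.get(release, False)

lo, hi = 0, len(releases) - 1   # rc5 is known fast, rc11 known slow
while hi - lo > 1:
    mid = (lo + hi) // 2
    if is_fast(releases[mid]):
        lo = mid                # slowdown appeared after this release
    else:
        hi = mid                # slowdown is at or before this release
print("first slow release:", releases[hi])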

Link to comment
