Parity check reads



This is not a problem ... just something I've wondered about for a long time, and I'm curious if someone knows the reason ==> If I clear all the stats (so I can see the numbers based solely on the parity check) and then do a parity check, the number of reads is wildly different between my drives, despite the fact that they're all identical drives (in this case all 3TB WD Reds; the same thing is true on my other server with primarily 2TB drives).

 

It would logically seem that the same number of reads would be needed on all drives, since they're all reading exactly the same amount of data.    Any idea why the numbers are so different?

 


Bump to top => Anyone ??  Tom?  Joe? WeeboTech?  Prostuff1?

 

There's got to be some explanation as to why the reads are so wildly disparate (a factor of 3 or 4), even though all drives have exactly the same amount of data read.

 

My "guess" is that the reads counter is actually counting "read requests" ... and that different requests are of different sizes, based on the number of currently available read buffers.  But it'd be nice to confirm that ... or if that's not it, know just what causes the disparity.
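That guess can be illustrated with a toy simulation (hypothetical code, not anything from the unRAID driver): if the counter tallies issued requests rather than blocks, the same total data volume can yield very different counts depending on how many contiguous requests get combined into one, which varies with buffering and timing.

```python
# Toy model (not unRAID code): count "reads" as issued requests when the
# driver merges however many contiguous 4K blocks happen to be queued at
# each dispatch.  The merge size per dispatch is supplied as a repeating
# pattern to mimic timing-dependent buffering.

def issued_requests(total_blocks, merge_pattern):
    """Return how many physical requests are issued for total_blocks
    contiguous 4K blocks, merging per the repeating merge_pattern."""
    issued = 0
    remaining = total_blocks
    i = 0
    while remaining > 0:
        batch = min(merge_pattern[i % len(merge_pattern)], remaining)
        remaining -= batch
        issued += 1   # one merged request, regardless of its size
        i += 1
    return issued

blocks = 1_000_000  # every drive reads the same 1M logical 4K blocks
print(issued_requests(blocks, [8]))      # deep buffering -> 125000 requests
print(issued_requests(blocks, [2]))      # shallow buffering -> 500000 requests
print(issued_requests(blocks, [4, 16]))  # erratic timing -> 100000 requests
```

Same bytes read in every case; only the request count -- the number the stats page would show -- changes.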

 

Link to comment

This is an interesting one.

 

As I think about it, I have to consider: how fast is my button press vs. the CPU?

When I press that button, does everything stop while I take statistics?

 

Then as you suggest below, there are buffering conditions. Pre-read, etc, etc.

However, I would expect that a block read is a block read.

 

On one machine I am doing a parity generate.

I have 7781, 7772, and 7774 reads, with 7641 writes.  This is at 3GB.

 

Those numbers don't seem so skewed when you take into account md_num_stripes and the other md driver tunables.

 

Now after I've refreshed, I am at 3.37TB, just past the 3TB drive.

 

parity      = 64255315

4TB drive = 64255457

3TB drives are 5723317 and 5723319 (gee, why are those numbers different?)

 

Could it have something to do with filesystem housekeeping? Sure it can.

However in my case, the drives are unformatted.

So perhaps there's a few reads of the MBR or partition table.

 

 



 

My numbers are never anywhere near that close -- it's always been like that on both my old v4.7 system (4+ years) and my new v5 system (< 1 yr).  For example, I just did a "clear stats" and then started a parity check.    It'll take 7 1/2 hrs to complete, but I'll post the final numbers just for grins ... but a quick refresh shows that at 31.28GB into the array the parity disk has 427,171 reads; the data disks vary between a low of 306,737 reads and a high of 523,072 reads.

 

I've always wondered why the wide gap.  Not a big deal ... the system performs great ... but it just seems that logically, with all drives the same (3TB WD Reds), they should have the same number of reads.

 


A bit over 5 hours into the test ... thought I'd post the current state just in case I forget to later when it's done.

 

Currently at 2.32TB (72%).

Parity drive shows 18,258,160 reads

Data drives show between 15,500,278 reads and 33,788,485 reads.

 

Note that this is not at all unusual -- my v4.7 system has had similar disparities ever since I built it.    I've always wondered why the read counts didn't correlate well with each other;  but never bothered to ask.

 

Obviously it doesn't really matter -- I'd just like to know WHY the counts are so different !!

 


By the way, the tunable parameters don't make any notable difference.  I don't have records of the actual numbers, but there have always been significant differences in the read counts.  I did recently experiment with the tunable parameters a bit, and now run with md_sync_window set to 896 instead of the default ... it shaves nearly an hour off parity check times.  But that's NOT the reason the counters are different.

 


Here are mine.

 

 

ST4000DM000-1F2168_W3005993 (sdb) 3907018532 - 136    - 7631000 (Parity)

ST4000DM000-1F2168_W3005LMP (sdc) 3907018532 - 7631132 - 5

ST3000DM001-1CH166_Z1F2WE0X (sdd) 2930266532 - 5723317 - 5

ST3000DM001-1CH166_Z1F2WFKV (sde) 2930266532 - 5723319 - 5

 

That's amazing that yours are all effectively the same ... now I really wonder why mine aren't !!

 

Even your ratio is correct [i.e., the # of reads on the 4TB parity divided by the # of reads on your 3TB drives = 1.33 ... exactly what it logically should be (4/3)].
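That 4/3 relationship is easy to verify from the read counts quoted above (quick arithmetic check only, using the counts from WeeboTech's post):

```python
# Quick check: if the counters tracked logical reads, a full parity check
# should give a 4TB drive exactly 4/3 the reads of a 3TB drive.
reads_4tb = 7631132   # WeeboTech's 4TB data drive (sdc)
reads_3tb = 5723317   # one of his 3TB drives (sdd)

ratio = reads_4tb / reads_3tb
print(round(ratio, 4))   # prints 1.3333 -- the 4TB/3TB capacity ratio
```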

 


Also, you do realize, the mere act of mounting a filesystem causes reads and writes.

Any write involves the journal, which then creates other reads and writes.

 

Understand.  But I never do ANYTHING during a parity check ... don't even "Refresh" the display more than a few times over the entire period.    I simply (a) reset the stats ... so all the counters are zero;  and then (b) click the Check Now button (although if I'm timing it I'll first click the Spin Up button to spin all the drives up).

 

Nothing else.    Is your process different?

 


garycase, you are not mad; I notice the same ... and I'm sure it's not reads from anything else done at the same time as the parity check.  I know that when I start the array there are some (very few) reads/writes from mounting the file systems (a fixed amount for all HDDs, something like 22/11 or similar; I can't see the exact numbers now as I'm doing a full parity check, but it's near that).  However, after that I can leave it running for hours and they will not increase, for sure.  Also, I have near-stock unRAID: no plugins, no unmenu.

 

When doing a parity check, my parity drive (WD Red) shows about double the reads of my data drives (WD Greens), and even the WD Green ones show some considerable difference between them.  I would also like to know the reason for this ... maybe the differences in drive speed (Green near 120MB/s and Red near 150MB/s at the start of the drive, plus different access times, etc.) cause some weird read-ahead, or some synchronization scheme that may occasionally need to re-read data to stay in sync while trying to achieve the best possible performance?  Maybe only Tom can answer this...

 

85% through the parity check (sorry, I can't wait for it to finish now; I'll post the finished values tomorrow):

parity WDC_WD20EFRX-68AX9N0_WD-WMC300050457 (sda) 1953514552 31°C 2 TB - 14259728 22 0

disk1 WDC_WD20EARX-00PASB0_WD-WCAZAE848879 (sdc) 1953514552 34°C 2 TB 389.08 GB 6868085 11 0

disk2 WDC_WD20EARX-00PASB0_WD-WCAZAE866220 (sdd) 1953514552 31°C 2 TB 1.61 TB 5873243 11 0


Some time way back, pre-4.7, the unRAID driver maintained the read/write counts based on the reads/writes sent down to the disk drivers.  At that time the counters were almost identical during parity operations.  But at some point I changed the code to get the counters from a /proc file for each disk, maintained by the disk drivers (I can't remember the exact reason at the moment, but I remember it was necessary because it was a PITA to change at the time).  Now, the disk drivers try to do all kinds of I/O combining for efficiency's sake; that is, instead of sending say 10 contiguous 4K requests, they would rather send down a single 40K request.  To do that, a driver has to "wait" a bit before sending an I/O, for another contiguous I/O to arrive at the queue.  But at some point it has to quit waiting and get on with things.  So I think what you are seeing is simply anomalies associated with this activity, which ends up being very timing-dependent.
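For reference, the per-disk counters Tom describes live in /proc/diskstats, where the kernel tracks both requests actually issued to the disk and requests merged away before issue, while the real data volume is the sectors-read field.  A minimal parsing sketch (the sample line below is fabricated for illustration):

```python
# Sketch: reading the kernel's per-disk counters from /proc/diskstats.
# After the major/minor/name columns, the next fields are reads completed,
# reads merged (contiguous requests combined before dispatch), sectors
# read, and time spent reading.  The sample line is made up.

def parse_diskstats_line(line):
    f = line.split()
    return {
        "device": f[2],
        "reads_completed": int(f[3]),  # requests actually sent to the disk
        "reads_merged": int(f[4]),     # requests folded into neighbors
        "sectors_read": int(f[5]),     # real data volume, in 512B sectors
    }

sample = "8 16 sdb 64255315 121042 5860533160 451200 5 0 40 12 0 380000 463000"
stats = parse_diskstats_line(sample)
print(stats["device"], stats["reads_completed"], stats["reads_merged"])
# Two drives can show identical sectors_read yet different reads_completed,
# because how much merging happens is timing-dependent.
```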



 

Thanks Tom.    Sounds like my "guess" above was essentially correct ["... the reads counter is actually counting "read requests" ... and that different requests are of different sizes, based on the number of currently available read buffers."]

 

HOWEVER, this raises the question of why WeeboTech's counters are NOT different!!??

 


Maybe his STs are more equal in speed, thus not causing such waits and I/O combining?

 

I don't think that's the reason ... it's hard to get more "equal" than having six identical drives all on motherboard ports (my v5 system) ==> in fact his system has one "different" drive (the 4TB parity) ... and yet even that one has a counter that's perfectly in sync with the 3TB drives.

 

Hopefully WeeboTech is doing something "different" that may explain this -- he may have even recompiled the kernel and removed the change Tom referred to  :)


Tom => Just noted one more anomaly.    When I did the parity check to get the read counts I just posted, the system also showed 1 write/disk with 5 writes on the parity disk (which seems right since there are 5 other disks that had one write each).    I assume this was some kind of status info written to the disks, as the parity check was error-free.

 

However, I just now kicked off another parity check after upgrading to RC15 => and this time the write column is all zeroes !!  I assume that's just some minor difference between RC14 & RC15 ... right?

 

I don't recall with certainty, but I don't think I've usually seen ANY writes in that column with previous versions -- so I assume it had something to do with the earlier kernel you used for RC14 ... and that all zeroes is the "normal" thing to expect in the Writes column as long as you clear stats before kicking off the check.

 


Indeed, I didn't notice yesterday that WeeboTech's STs were different sizes; then surely they should be quite different in speed, etc... but maybe it's controller related or something.  One small thing I can also add about my system: my WD Reds are connected to SATA3 ports while the Greens are on SATA2.

 

Btw, my final values (with rc11a, still using it on my 'production' server) are:

 

parity WDC_WD20EFRX-68AX9N0_WD-WMC300050457 (sda) 1953514552 36°C 2 TB - 17038735 22 0

disk1 WDC_WD20EARX-00PASB0_WD-WCAZAE848879 (sdc) 1953514552 34°C 2 TB 389.08 GB 7871592 11 0

disk2 WDC_WD20EARX-00PASB0_WD-WCAZAE866220 (sdd) 1953514552 33°C 2 TB 1.61 TB 6798799 11 0

 

Concerning the '1' writes you are seeing ... the kernel in RC14 is 3.4.x, like in my RC11a, so there should be no huge changes there ... and I can tell you that on RC11a I don't get any writes at all during a parity check.  But I have always noticed that when I start the array there are some immediate reads/writes, and after some 15-30 seconds there is 1 write done to each HDD.  This may be what you are seeing, if you started the array and then kicked off the parity check before those 15-30 seconds passed?  On RC15 these '1' writes 15-30 seconds after array start indeed don't exist (and in fact the count of the ones done immediately after starting the array is also lower; I noticed that on RC13 too), but I guess it is most probably related to file system handling differences in the latest kernels.


WeeboTech =>    Do you have any idea why your counters are all the same??    As Tom noted, it's "normal" for the counts to be different ... the reason is basically what I had originally surmised -- the counters are counting the actual number of physical I/Os the driver makes, as opposed to the number of logical I/O requests (which are, of course, equal for all drives of the same size).

 

Given Tom's reply, however, it makes your results very interesting !!

 


WeeboTech =>    Do you have any idea why your counters are all the same??

Not the answer you are looking for, and probably not even correct, but consider that semi random results that appear to be a pattern are just that, semi random results. The human mind loves to find patterns in everything, and if given truly random results, will attempt to see patterns even if none truly exist.

 

Maybe his results just happen to be similar to each other.



 

No, his results are not in any way random -- the 3TB drives are all identical; and the 4TB drive has exactly 1.33 times the number of reads as the 3TB units.    THAT is not a random event  8)

