Improving Write Speed?



(Both disks being 7200 RPM will result in faster throughput than when either is rotating at 5400 RPM)

 

Not always true.  Areal density, along with the drive's buffering algorithms, can give a 5400 RPM drive faster writes in unRAID than a 7200 RPM drive.
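
A rough back-of-the-envelope example (the figures below are made up purely to show the arithmetic, not measured from any real drive): sustained throughput is roughly bytes-per-track times rotations-per-second, so a denser 5400 RPM platter can out-write a less dense 7200 RPM one.

# Hypothetical figures, just to illustrate the arithmetic:
# sustained MB/s ~= (KB per track) * (RPM / 60) / 1000
kb_per_track_green=1300   # denser 5400 RPM platter (made-up number)
kb_per_track_black=900    # older 7200 RPM platter (made-up number)
echo "5400 RPM: $(( kb_per_track_green * 5400 / 60 / 1000 )) MB/s"   # -> 117 MB/s
echo "7200 RPM: $(( kb_per_track_black * 7200 / 60 / 1000 )) MB/s"   # -> 108 MB/s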

 

Sustained, yes.

 

Given the way unRAID parity works, the disk still has to rotate a full 360 degrees to get back to the same sector, which is why rotational speed is an issue for parity and why I got a fairly hefty write increase with a WD Black as parity rather than a WD Green.
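
As I understand it, the reason the same sectors have to come around twice is that unRAID updates parity with a read-modify-write: new parity = old parity XOR old data XOR new data, so the data and parity sectors are read first and then written back one revolution later. A minimal sketch of that per-byte relation (the values are arbitrary):

# Read-modify-write parity update for a single byte (arbitrary example values):
old_data=0x3C; new_data=0xA5; old_parity=0x5A
new_parity=$(( old_parity ^ old_data ^ new_data ))
printf 'new parity byte: 0x%02X\n' "$new_parity"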

 

But yes, my 2TB WD Greens can write just a tad quicker than my 1TB WD Black.

 

It is a shame that the way UnRAID works slows things right down. :(

Link to comment

Even given that I understand the need for the parity drive to read, spin 360, and write ...

 

What I still don't quite get is why those reads and writes can't be batched, reordered (NCQ doesn't seem to help, sadly), etc., such that if you have 10 "things" to write, instead of:

 

This: (read, spin 360, write) x10

 

Do this: (read one) x10, spin 360, (write one) x10

 

That would provide a huge reduction of rotational overhead.

 

And I picked 10 arbitrarily, and I assume reading out and then writing out an entire file has some issues / unintended consequences I don't understand.  At the least, exceeding the buffer size would seem to be Bad.  But any level of batching of reads followed by writes would seem to help.

 

or OMG ... don't tell me this already happens and it would be way way worse if the system really did work a single bit/byte/block at a time :o

Link to comment
or OMG ... don't tell me this already happens and it would be way way worse if the system really did work a single bit/byte/block at a time :o

 

It does, sort of... but that is done at the controller/drive level.  Most modern drives have read-ahead, so you read a sector, and the rest of the track is loaded and in cache for the other 9 reads.  Writes are often buffered similarly.
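
If anyone wants to confirm that this drive-level caching is actually switched on, hdparm can report it per drive (the device name below is only an example; substitute your own disks):

# Check the drive's own read look-ahead and write cache (1 = enabled).
hdparm -A /dev/sdb   # drive read look-ahead
hdparm -W /dev/sdb   # drive write caching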

Link to comment

At the least, exceeding the buffer size would seem to be Bad.  But any level of batching of reads followed by writes would seem to help.

 

or OMG ... don't tell me this already happens and it would be way way worse if the system really did work a single bit/byte/block at a time :o

It is my understanding that unRAID works in units of a "stripe" of data, that is, some number of 512-byte blocks. http://lime-technology.com/forum/index.php?topic=14934.msg141474#msg141474

I think it is 288 blocks.  That is what the "md" driver issues to the "sd" driver.  The question is, what does the sd driver do?

 

Looking at the blockdev command, it seems the read-ahead buffer is set to 1024 (that is, 1024 512-byte blocks, or 512 KiB).  That should all fit in the typical 8 to 64 MB buffer on a disk.  Perhaps some playing with the tunables, in combination with enabling NCQ, might be interesting.
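
For anyone who wants to look at or experiment with that kernel read-ahead value directly, blockdev reports it in 512-byte sectors (so 1024 = 512 KiB); the device name below is only an example:

# Kernel read-ahead for a device, in 512-byte sectors (1024 = 512 KiB).
blockdev --getra /dev/sdb
# Experiment: bump read-ahead to 2048 sectors (1 MiB) and re-test.
blockdev --setra 2048 /dev/sdb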

Link to comment

At the least, exceeding the buffer size would seem to be Bad.  But any level of batching of reads followed by writes would seem to help.

 

or OMG ... don't tell me this already happens and it would be way way worse if the system really did work a single bit/byte/block at a time :o

It is my understanding that unRAID works in units of a "stripe" of data, that is, some number of 512-byte blocks. http://lime-technology.com/forum/index.php?topic=14934.msg141474#msg141474

I think it is 288 blocks.  That is what the "md" driver issues to the "sd" driver.  The question is, what does the sd driver do?

 

Looking at the blockdev command, it seems the read-ahead buffer is set to 1024 (that is, 1024 512-byte blocks, or 512 KiB).  That should all fit in the typical 8 to 64 MB buffer on a disk.  Perhaps some playing with the tunables, in combination with enabling NCQ, might be interesting.

 

Yeah, I already have NCQ turned on but I don't think I'm getting much of a boost.  The tunables now ... hmm, I guess the question is how hardware-specific they might be and how much effort Limetech put into optimizing them.

Link to comment

A drive supporting NCQ would also likely need fewer rotations for r/w than one which does not.

 

NCQ is disabled by default in the unRAID settings (Force NCQ disabled = Yes).

 

This is interesting. I wonder if Tom could elaborate as to why NCQ is disabled by default (perhaps to allow for the fact that not all of the drives in the array necessarily support it?). There isn't much info in the wiki from what I could see, other than this:

 

Force NCQ disabled - Disable native command queuing.

 

    Recommend leaving this as yes.

    Disable NCQ on all disk devices that support NCQ. This typically results in much better write throughput. A setting called "Force NCQ disabled [yes/no]" is also available in the Disk section of the Settings page of the System Management Utility to override this new behavior. That is, if this setting is 'yes', then we force NCQ off; if setting is 'no', we leave NCQ queue_depth as-is, ie, whatever linux driver sets it to.

 

Has anyone done much testing with NCQ enabled?
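
For anyone who does test it, the effective NCQ state can be checked per drive through sysfs; a queue depth of 1 means NCQ is effectively off, and 31 is the usual maximum. The device name in the commented line is just an example:

# Show the current NCQ queue depth for every sd device (1 = effectively off).
for d in /sys/block/sd*/device/queue_depth; do
    echo "$d: $(cat "$d")"
done
# To experiment on one drive (example device, run as root):
# echo 31 > /sys/block/sdb/device/queue_depth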

Link to comment

WARNING, "bright idea for someone else" on the way ...

 

So would it be possible to use a script to profile drive performance during typical writes and parity checks?  It would need to vary the tunables, then run some writes and parity checks over various portions of the drive (but crikey, not the whole drive) and log the results.

 

Thoughts?

Link to comment

WARNING, "bright idea for someone else" on the way ...

 

So would it be possible to use a script to profile drive performance during typical writes and parity checks?  It would need to vary the tunables, then run some writes and parity checks over various portions of the drive (but crikey, not the whole drive) and log the results.

 

Thoughts?

 

This should be easy using the existing performance script that is floating around in the forum.
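
A minimal sketch of what the read side of such a profiling pass might look like, using plain dd against a few positions across a disk (reads only, so nothing on the array is touched; /dev/sdc and the offsets are placeholders):

# Rough read-speed probe at a few positions across a disk (reads only).
DEV=/dev/sdc                       # placeholder - point at the disk to profile
for offset_gib in 0 500 1000 1500; do
    echo "=== offset ${offset_gib} GiB ==="
    dd if="$DEV" of=/dev/null bs=1M count=1024 \
       skip=$(( offset_gib * 1024 )) iflag=direct 2>&1 | tail -1
done

Wrapping that in an outer loop over the tunables and logging the dd rates would get most of the way to what's being proposed.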

Link to comment

A drive supporting NCQ would also likely need fewer rotations for r/w than one which does not.

 

NCQ is disabled by default in the unRAID settings (Force NCQ disabled = Yes).

 

This is interesting. I wonder if Tom could elaborate as to why NCQ is disabled by default (perhaps to allow for the fact that not all of the drives in the array necessarily support it?). There isn't much info in the wiki from what I could see, other than this:

 

Force NCQ disabled - Disable native command queuing.

 

    Recommend leaving this as yes.

    Disable NCQ on all disk devices that support NCQ. This typically results in much better write throughput. A setting called "Force NCQ disabled [yes/no]" is also available in the Disk section of the Settings page of the System Management Utility to override this new behavior. That is, if this setting is 'yes', then we force NCQ off; if setting is 'no', we leave NCQ queue_depth as-is, ie, whatever linux driver sets it to.

 

Has anyone done much testing with NCQ enabled?

 

See this thread (skip on down about 11 posts):  Write Performance Within the unRAID Server

 

That's basically where the NCQ/queue_depth setting started.  Things may have changed since then, since that is quite a few kernel versions back, so you are welcome to test for yourself.  Please let us know if you see different results now than we did then.

Link to comment

A drive supporting NCQ would also likely need fewer rotations for r/w than one which does not.

 

NCQ is disabled by default in the unRAID settings (Force NCQ disabled = Yes).

 

This is interesting. I wonder if Tom could elaborate as to why NCQ is disabled by default (perhaps to allow for the fact that not all of the drives in the array necessarily support it?). There isn't much info in the wiki from what I could see, other than this:

 

Force NCQ disabled - Disable native command queuing.

 

    Recommend leaving this as yes.

    Disable NCQ on all disk devices that support NCQ. This typically results in much better write throughput. A setting called "Force NCQ disabled [yes/no]" is also available in the Disk section of the Settings page of the System Management Utility to override this new behavior. That is, if this setting is 'yes', then we force NCQ off; if setting is 'no', we leave NCQ queue_depth as-is, ie, whatever linux driver sets it to.

 

Has anyone done much testing with NCQ enabled?

 

See this thread (skip on down about 11 posts):  Write Performance Within the unRAID Server

 

That's basically where the NCQ/queue_depth setting started.  Things may have changed since then, since that is quite a few kernel versions back, so you are welcome to test for yourself.  Please let us know if you see different results now than we did then.

Don't forget, we now have three tunable parameters that may be set in the unRAID GUI.  These too might be tested to see how they affect performance once NCQ is enabled.
Link to comment

Don't forget, we now have three tunable parameters that may be set in the unRAID GUI.  These too might be tested to see how they affect performance once NCQ is enabled.

 

I noticed in the Release Notes for 5.0-beta11 the following:

    emhttp: change md_sync_window default from 288 to 384 for better sync performance with fast drives

Those of us who started with unRAID versions older than 5.0-beta11 probably still have it set to 288, and may want to update it to 384.

 

The current default settings are:

set md_num_stripes 1280
set md_write_limit 768
set md_sync_window 384

But there is a small group of users with the following:

set md_num_stripes 10000
set md_write_limit 5000
set md_sync_window 2500

I'm hoping that someone knowledgeable about these settings will comment, tell us why and how they came up with those numbers, and what performance differences they have seen.
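
For anyone who wants to experiment outside the GUI, the tunables can apparently also be changed at runtime through unRAID's mdcmd helper - treat the exact command name and accepted values as an assumption and verify them on your own version before trying something like this:

# Sketch: step through a few md_sync_window values, pausing for a timed test at each.
# Assumes unRAID's mdcmd helper accepts "set md_sync_window N"; verify this first.
for window in 288 384 768 2500; do
    mdcmd set md_sync_window "$window"
    read -r -p "md_sync_window=$window - run your timed write/parity test, then press Enter"
done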

Link to comment

See this thread (skip on down about 11 posts):  Write Performance Within the unRAID Server

 

That's basically where NCQ/queue_depth setting started.  Things may have changed since then, since that is quite a few kernel versions back, so you are welcome to test for yourself.  Please let us know if you see different results now than we did then.

Don't forget, we now have three tunable parameters that may be set in the unRAID GUI.  These too might be tested to see how they affect performance once NCQ is enabled.

 

Quoted from the above thread:

 

Posted by limetech (Tom) 08 Mar 2009

 

I have been ignoring this issue, hoping it might be fixed in the Linux kernel update; apparently it wasn't.  unRAID does not 'tweak' the disk drivers in any way (maybe it should, though; see below).

 

I remember early on (like a couple of years ago), AHCI was very problematic & not all controllers supported it, so I always configure this 'off' in the BIOS.  In addition, I believe that NCQ in Linux is only implemented in the AHCI driver (could be wrong on that), hence I don't have much specific performance testing with AHCI/NCQ.

 

Anyway, one reason I haven't been much interested in NCQ is because it probably will not increase performance at all, and instead, if anything, might decrease performance or add instability.  This is because the Linux disk drivers are already highly optimized for ordering seeks.  By the time commands are sent down to the disk, the internal NCQ algorithm probably could not improve the ordering further.  Especially in a media server application where large files are accessed, NCQ will certainly make no difference.

 

NCQ, like queue reordering in SCSI, is mainly a meaningless feature precisely because the host O.S. driver is so much better at re-ordering the queue.  (Who are you going to trust to write better queuing code: kernel developers or hard drive firmware coders?  No offense to any hard drive firmware coders reading this - but you know I'm right.)

 

NCQ, like queue reordering in SCSI, makes sense if you have one hard disk being accessed asynchronously by two different computers - not bloody likely, and certainly not in SATA.  Historical note: NCQ is the evil step-child of SCSI tagged command queuing.  Tagged commands are necessary in SCSI because of the bus nature of SCSI (i.e., multiple devices that share the same physical bus).  After finally getting multiple-target code right, disk firmware designers noticed that tagged commands could be used to manage internal disk queues, and hey, those kernel developers don't know how to write proper queuing code, so what the heck, let's write it ourselves.  Then when the marketing folks discovered that this new-fangled multi-tasking O.S. (Windows) does perform better with disk-side queue management (because the hard drive firmware guys could write better queuing code than the Windows guys), the myth of NCQ was born!

 

I found this post really helpful in clarifying why NCQ might be better off disabled. I wonder how much development there has been on this in the last 3 years, and where NCQ sits with modern drives & chipsets and the newer Linux kernels. Is NCQ still the 'myth' it was back then?
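
For context, the kernel-side reordering Tom describes is done by the block layer's I/O scheduler (the "elevator"); it is easy to see which one is active for a given disk and to switch it for comparison (the device name is only an example):

# Show the active I/O scheduler for a disk; the bracketed entry is the current one.
cat /sys/block/sdb/queue/scheduler
# Switch schedulers to compare behaviour (run as root), e.g.:
# echo noop > /sys/block/sdb/queue/scheduler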

Link to comment

I wonder how much development there has been on this in the last 3 years, and where NCQ sits with modern drives & chipsets and the newer Linux kernels. Is NCQ still the 'myth' it was back then?

Actually, we are far more concerned with how NCQ affects "write" speed.  It might really help if it were managed by keeping entire cylinders in the internal disk cache AND by reading ahead the following cylinder(s).  Granted, that needs huge amounts of internal disk cache RAM, so it might make sense for that to occur in the low-level Linux driver.
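
For what it's worth, many drives report how much internal cache they actually have in their identify data, which hdparm can dump - the device name is an example and the exact wording of the output varies by drive and hdparm version:

# Look for the drive's reported cache/buffer size in its identify data.
hdparm -I /dev/sdb | grep -iE 'buffer|cache'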
Link to comment
