To Cache drive or not to Cache drive?



I often find myself explaining my rationale/philosophy behind using a cache drive to various new unRAID users.  Instead of repeating those same points over and over, I figured I would start this thread to both consolidate them all into one place, and to shed some light on the current situation of the advantages and disadvantages of using a cache drive circa unRAID 4.5.3 (after the massive write speed improvements).  I'm also adding this thread to the wiki in the FAQs section.

 

First, some of the obligatory stuff:

- A Cache drive is completely optional, and is only available to paying users of unRAID (meaning unRAID Plus or Pro licensees).

- What is a cache drive? - While a bit dated, this post contains LimeTech's official explanation of a cache drive, the mover script, and how to change the time(s) at which the mover script runs.

- How does a cache drive work? - Short answer: transparently.  Some Hero Members impart some knowledge on the underlying functionality of a cache drive.

 

Also, just to get it out of the way,

Let's talk speed

How much of a write speed improvement can you expect to see from using a cache drive?  That depends on your hardware.  Slower hardware will of course result in slower transfer speeds.  Assuming you have Gigabit LAN (GigE) capable network controllers on both sides of the transfer (the client PC and the unRAID server), a Gigabit LAN router or switch, appropriate network cables (Cat5e or Cat6), and modern SATA I or better hard drives, then you should see write speeds in the following ranges:

 

These numbers represent average transfer speeds when writing data to unRAID:

Without a cache drive

unRAID 4.5.3 - average 20-30 MB/s, peak reported 40 MB/s*

 

With a cache drive

unRAID 4.5.3 - average 50-60 MB/s, peak reported 118 MB/s*

 

So generally speaking, a server with a cache drive has write speeds 2-3x faster than the same server without a cache drive.

 

unRAID compared to other NAS solutions: GaryMaster's Benchmarks

 

*Note: These figures (especially the averages) are based upon my own personal observations of my server and of the reports from other reputable sources in these forums.  I have linked to sources where possible.

 

The Advantages (summary: speed, peace of mind, defragmented data)

  • Increased Perceived Write Speed - Emphasis on 'perceived'.  The real, behind-the-scenes write speed of your unRAID server is unchanged by the addition of a cache drive.  A cache drive simply grants you the fastest transfer that your hardware will allow by deferring the inevitable parity calculation until a later time (3:40 am server time by default, but this value can be changed; here's how).
     
  • Warm Spare (Note: requires that your cache drive be at least as large as your parity drive and your largest data drive) - The purpose of the Warm Spare is to minimize the amount of time your array is without parity protection after a drive failure.  A Warm Spare is a hard drive that you have installed in your server to prepare for the eventuality of one of your other hard drives failing.  When a drive does eventually fail, you simply stop the array (via the Main page), unassign the dead drive (via the Devices page), assign the Warm Spare (also on the Devices page), and then start the array (back on the Main page).  At this point, your data from the dead drive will be automatically rebuilt onto your Warm Spare - this will take many hours, but your array and all your data are still available for use during this time (though performance will likely be degraded).  In the classic application, a Warm Spare sits in your server preinstalled (therefore taking up a SATA/PATA slot and using a small amount of power), but constantly spun down and unused (as it is not assigned to a disk slot in unRAID).  However, why not use a Warm Spare as a cache drive while you wait around for another drive to fail?  This application will of course add a bit of wear and tear to your Warm Spare, but nothing outside the scope of ordinary use.
     
  • Tepid Spare (™ Rajahal ;)) - Inspired by the Warm Spare concept though fundamentally different - the Tepid Spare is concerned with giving you time to shop around for a new hard drive, NOT with decreasing the amount of time your array goes without parity protection. In this case the drive can be any size; it can even be your smallest drive (which is my recommendation).  The purpose of a Tepid Spare is to tide you over from the time in which you run out of space on your server (all data disks full) until you purchase a new drive to expand your array (either by adding it to the array or by replacing a smaller drive in the array).  This 'tide you over' strategy gives you the luxury of waiting for a hard drive to go on sale or to come in stock.  It gives you time to browse the Good Deals forum and to read up on the current batch of drives' failure rates, firmware issues, and other gotchas.  It gives you breathing room.
     
    To implement this Tepid Spare concept, simply take any drive of your choosing (I recommend your smallest data drive, so that most of your storage capacity is devoted to long-term storage) and install it as a cache drive as usual.  When you run out of storage capacity on your server, follow these steps:
        1) Run the mover script (via the Share page; Note: this may take a long time, depending on how much data is sitting on your cache drive and the speed of your hardware)
        2) Stop the array (via the Main page)
        3) Unassign the Tepid Spare from the Cache slot (via the Devices page)
        4) Assign the Tepid Spare to an unused disk slot (disk10, for example; also on the Devices page)
        5) Start the array (back on the Main page).  unRAID will clear the drive and add it to the parity protected array (this also may take a long time, and will involve complete server downtime, meaning that your data will be unavailable during this time).  Keep in mind that from this point forward you will see significantly slower write speeds to your server, since you are no longer using a cache drive.
     
    You now have some more space in your array to 'tide you over' until you can buy that new hard drive.  Once you do and successfully add it to your array following the standard procedure, you can do the following:
        1) Stop the array (via the Main page)
        2) Unassign the Tepid Spare from the disk slot (disk10 in our previous example, double check the disk slot to ensure you have it right; via the Devices page)
        3) Assign the new drive to the tepid spare's now empty slot (again, disk10 in our example)
        4) Assign the Tepid Spare to the Cache slot (also on the Devices page)
        5) Start the array (back on the Main page).  unRAID will clear your Tepid Spare then enable it as a cache drive once again.  The Tepid Spare's old contents will then be rebuilt onto the new drive, which will take many hours.  This process involves no server downtime.  At this point you will see your write speeds jump back up, since you are once again using a cache drive.
     
    A final note on the Tepid Spare concept: If you are using any of the 'fancier' features of User Share customization such as Includes and Excludes (found on the Shares page), you may have to reset them manually during this process.
     
  • Eliminating/reducing file fragmentation on your data disks - If you are a reasonably advanced Windows user, you probably know that it is a good idea to defragment your hard drive from time to time.  While not as much of an issue on unRAID, file fragmentation is still possible in the unRAID environment - a simple way to prevent it is to use a cache drive.  When the mover script runs, it writes your new files (which may or may not be fragmented, depending on how you copied them) to the protected array in a defragmented fashion (known as a sequential write, I believe).  Therefore, no matter how fragmented the data on your cache drive may be, by the time it makes it into your protected array it will be completely defragmented.
  • Running other software on top of unRAID (a.k.a. an 'Apps' drive) - (thanks to BRiT for pointing this out!) Since a cache drive is outside of the parity-protected array, it can be used for alternate software and unRAID add-ons that need to read and write data often.  Examples include running a full Slackware distro with unRAID from the HDD, the ability to multi-boot into a full Windows OS from the HDD, and the ability to use it as a swap drive/partition for Linux.  BubbaQ has detailed these processes here.  Edit: As of November 2011 the use of a cache drive is more or less required for certain add-ons, such as the Transmission torrent downloader, and the newsgroup downloading trifecta SABnzbd, SickBeard, and CouchPotato, all of which have been developed by prostuff1 (who has been a very busy beaver).
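For reference, the mover schedule mentioned above is just a cron entry under the hood.  As a sketch only (the exact path of the mover script varies by unRAID version - /usr/local/sbin/mover is an assumption you should verify on your own install), the default 3:40 am run corresponds to a crontab line like this:

```
# Run the mover at 3:40 am every day (the unRAID default time)
# NOTE: the mover path below is an assumption -- check your install
40 3 * * * /usr/local/sbin/mover

# Alternative example: run at noon and at midnight instead
# 0 0,12 * * * /usr/local/sbin/mover
```

The five fields are minute, hour, day-of-month, month, and day-of-week, so any schedule cron supports is fair game.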

 

The Disadvantages (summary: reduced storage size, risk)

  • 'Wasted' HDD and HDD slot - One of your hard drives and SATA/PATA slots is dedicated to your cache drive, instead of long term data storage
     
  • Short term risk of data loss - If your cache drive dies, it will take all of the data currently residing on it to the grave.  While it is possible to run ReiserFS data recovery software on your dead cache drive, there is still a good possibility of permanent data loss.  By default, the cache drive's mover script (which moves all of the day's new data from the unprotected cache drive and into the parity-protected array) runs at 3:40 am (based on your server's clock).  You can edit this value so that your mover script runs at any time and at any interval you want, for example, once at noon and once at midnight, or every hour on the hour (here's how).  This is a safety vs. performance trade-off; the more frequently the mover script runs, the 'safer' your data will be (since this minimizes the amount of time that your data is not protected by parity), however, if you happen to be using the server during the time the mover script is running, you may notice performance decreases.  Will it make your movie or music skip?  Most likely not, but if you have slower hardware, maybe.
  • Potential for long term data storage without parity protection - This is a very particular scenario that would only affect a user who is generally inattentive to the status of their unRAID server.  Say you are using a cache drive on your server, and you write 10 GB of new data to your server every night as an automated backup that runs while you sleep.  In normal operation, the 10 GB of new data is written first to the cache drive (at, say, midnight), then at 3:40 am, the data is written to the parity-protected array.  What happens when your array is at capacity?  The automated backup keeps running and adding 10 GB of new information to your cache drive every night.  As far as it knows, everything is working properly.  However, at 3:40 am, the mover script fails because it has nowhere to send the data.  The data therefore sits on the cache drive indefinitely, until the user intervenes.  If this situation persisted unattended, the automated backup program would eventually fail when the cache drive filled to capacity, but this could take a long time.  My concern is that at the current time there is no default warning system in place to inform the user of this situation.  The user would have to notice that the free space readings for the data drives are low, and that the free space readings for the cache drive are shrinking on a daily basis (instead of resetting every night, as per normal).  Keep in mind that this is a very particular hypothetical scenario - as far as I know, it has never actually happened.  As long as you make a point of checking your server's web management page periodically (once per week seems reasonable) and taking note of the status of your cache drive, then you won't have this issue.
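One way to close the 'no default warning system' gap described above is a small watchdog script of your own.  The sketch below is a hedged example, not an unRAID feature: it assumes a POSIX shell with df and awk available, and that your cache is mounted at /mnt/cache (the usual unRAID mount point - verify yours).  The demo call at the bottom checks /tmp just so the script runs anywhere.

```shell
#!/bin/sh
# Sketch: warn when the cache drive is nearly full -- a sign that the
# mover may be failing silently because the array itself is out of space.
# Assumptions: POSIX sh, df and awk available; on unRAID the cache is
# normally mounted at /mnt/cache.

check_cache_usage() {
    mount_point="$1"
    threshold="${2:-90}"   # warn above this use percentage
    # df -P prints one machine-parseable line per filesystem;
    # field 5 is the use percentage (e.g. "42%").
    used=$(df -P "$mount_point" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
    if [ "$used" -ge "$threshold" ]; then
        echo "WARNING: $mount_point at ${used}% -- did the mover run fail?"
    else
        echo "OK: $mount_point at ${used}%"
    fi
}

# Demonstrate on /tmp; on the server you would check /mnt/cache instead.
check_cache_usage /tmp
```

Run it from cron an hour or so after the mover's scheduled time, and pipe the WARNING line to whatever notification you already use (email, syslog, etc.).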
     

 

 

General Advice

Ultimately your decision to use or not to use a cache drive comes down to your usage patterns and your priorities.

 

You may want to use a Cache Drive if you...

A) Write a lot to your unRAID server ('a lot' being relative, of course, but I would say an average of 10 GB or more per day is 'a lot')

B) Write to your unRAID server often (daily)

C) Often start multiple data transfers simultaneously (the benefit here being reduced file fragmentation in the array - your transfer speeds will still slow down considerably if you initiate multiple simultaneous transfers)

D) Want to maximize your write speeds (if performance is a priority to you)

E) Have an extra disk (and slot to support it) that is small enough to be negligible for use as a data drive (again, this is relative; for me, anything under 500 GB is 'negligible')

 

You may not want to use a Cache Drive if you...

A) Don't write to your unRAID server very often (perhaps you use it as a media archive for your HTPC, so you read a lot but don't write much new data)

B) Never want your data to be without parity protection, even for a short time

C) Don't want to sacrifice a disk slot and disk to a purpose that won't grant you more storage capacity

D) Have a 'set it and forget it' attitude towards your unRAID server (due to the hypothetical scenario described above)

 

Also keep in mind that you can Mix and Match your cache drive use.  On the Share page, you can enable or disable use of the cache drive for each of your user shares.  This means you can have certain 'vital' shares that do not use the cache drive, and other 'nonvital' shares that do.  This is my approach.  All my media shares, Movies, TV, Music, etc. I consider to be nonvital, since all of their data is replaceable.  I allow all of these shares to use my cache drive, and I enjoy higher write speeds and the other benefits listed above because of it.  My vital shares are Backups, Pictures, and Documents.  I have disabled the cache drive for all of these shares.  Therefore, when I copy data to Backups, for example, the data is written directly into the parity protected array (at significantly slower speeds).

 

How to choose a cache drive

OK, so you've decided you want to use a cache drive...but which hard drive should you allocate for the purpose?  Whether you are reallocating an old hard drive or buying a new one, there are a few factors to consider.  First, what is your intended purpose behind using a cache drive?  There are three options, and each one will lead you to different drive parameters:

If your purpose in using a cache drive is:

A) Increased Perceived Write Speed, then you will want a drive that is as fast as possible.  For the fastest possible speed, you'll want an SSD (which has the added benefit of being low power) or a 10,000 rpm drive.  If you are on a tighter budget, a 7200 rpm drive such as a WD Black will do fine.  Unless you write a lot of very small files, the size of the hard drive's cache (8, 16, 32, 64 MB, etc.) won't matter much.  You can also eke a bit more performance out of a slower drive by short stroking it (meaning confine it to the outer tracks, which pass under the heads at a higher linear speed and can therefore be written to and read from faster), but this is of course an advanced maneuver.

B) As a Warm Spare, then you will want a drive that is the same size or larger than your parity disk.  This drive can be any speed you choose, just remember that in the event of a parity rebuild (if your parity disk dies) or a rebuild from parity (if a data disk dies), the process will proceed at the speed of your slowest disk.

C) As a Tepid Spare, then you can use a drive of any size and speed.  This is a good application for that slow, small drive that you have lying around, the one that would give you only a negligible increase in storage capacity (negligible being relative, of course).

 

The final consideration in choosing a cache drive is to think about the amount of data you expect to pass through it.  If you write ~10 GB per day, then any drive 10 GB or larger will do (a 30 GB SSD may be a good fit in this case).  If you write 100 GB in one day every few weeks, then you will want a cache drive that is larger than 100 GB.  If you attempt a data transfer that is larger than the size of your cache drive, the transfer will fail.  The transfer will not automatically 'spill over' into the parity protected array (though that may be improved in future releases of unRAID...one can always dream).
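Because an oversized transfer fails outright rather than spilling over, it can be worth checking up front whether a transfer will fit.  Here is a rough sketch of such a pre-flight check; the helper name fits_on_cache is my own invention, and the paths are illustrative (on the server itself, the destination would be /mnt/cache).

```shell
#!/bin/sh
# Sketch: will SOURCE fit in the free space at DEST?
# Uses du/df in 1 KB blocks; both paths are illustrative.

fits_on_cache() {
    src="$1"
    dest="$2"
    need=$(du -sk "$src" | awk '{print $1}')          # size of source, KB
    free=$(df -Pk "$dest" | awk 'NR==2 {print $4}')   # free space at dest, KB
    if [ "$need" -lt "$free" ]; then
        echo "fits: need ${need} KB, ${free} KB free at $dest"
    else
        echo "too big: need ${need} KB, only ${free} KB free at $dest"
    fi
}

# Demo with a small throwaway directory against /tmp;
# on unRAID you would check your transfer source against /mnt/cache.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/sample" bs=1024 count=16 2>/dev/null
fits_on_cache "$demo" /tmp
rm -rf "$demo"
```

Note this only compares totals; it doesn't account for other writes landing on the cache while your transfer runs, so leave yourself some headroom.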

 

Edit: (3/7/2011) I'll add that LimeTech has mentioned in several threads their intention to create a 'hot spare' option for unRAID, which would allow for the use of a cache drive AND a spare drive side-by-side.  Great news!

 

Edit: (8/13/2012) It has come to my attention that the Intel Atom CPUs are slow enough to cripple write performance when using modern SSDs as a cache drive! If your goal is a low power yet high performance server, then you will be best served by an i3 CPU (or better) as opposed to an Atom. For more info and a detailed account of the Atom's limitations, see jowi's plight later in this thread. The discussion spans four pages, so to skip right to the hard evidence of the Atom maxing out and limiting write speeds, click here. Evidence of fast, non-crippled SSD write speeds can be found here.


Hi Rajahal. I for one personally wouldn't go down the cache drive path (I'm paranoid about the data I copy). The 'Warm Spare' is an interesting idea and I didn't know it was possible to do this. It is a good idea to bring this up and have one dedicated thread to discuss this in detail. Thanks.

 


OK, whew, I think I'm finally done.  This is what happens when I have too much downtime at work... ::)

 

At this point I would like to open this up to discussion, comments, feedback, and so on.  I'm particularly interested to hear other Hero Members' take on my 'tepid spare' idea, since, as a product of my mind, it may be rife with errors and misjudgments.  :P


As well as the real time write benefit, I also like the cache disk as it keeps all my other disks spun down.

 

That way I can happily use it as scratch space for torrent client(s), news clients, web servers and anything else going on.

 

For most of the day I have only one disk - the cache - spun up, as opposed to at least two - parity and one data - and more likely many more whilst writes are occurring / lock files being held open.

 

It's a small power saving in the scheme of things but it's useful and helps justify leaving the machine on 24/7.

 

The more disks you have, the more chance of a benefit in this respect.

 

And of course all your other points stand valid as well!


I really like the idea of a Cache drive but have a question.

1) Can you add a cache drive after you have loaded up your server with the bulk of your information?

  a) for example you have 6TB of data laying around, you load it all into an unRAID setup with 10TB of space

    i) you do this because your cache drive is only say 80GB and trying to transfer 6TB in 80GB chunks each night will take forever

  b) once all the information is loaded you turn on the cache drive

  c) from this point forward you only transfer 80GB a day to the unRAID server


I really like the idea of a Cache drive but have a question.

1) Can you add a cache drive after you have loaded up your server with the bulk of your information?

  a) for example you have 6TB of data laying around, you load it all into an unRAID setup with 10TB of space

    i) you do this because your cache drive is only say 80GB and trying to transfer 6TB in 80GB chunks each night will take forever

  b) once all the information is loaded you turn on the cache drive

  c) from this point forward you only transfer 80GB a day to the unRAID server

 

Yes you can add the cache drive quite happily afterwards with no issue.

 

You can also choose to enable / disable it on a per user share basis so you don't necessarily have an 80GB a day limit. You may only use it for some shares.

 

You can also still write directly to the disk shares.

 

And you can manually fire the cache -> array 'mover' process at any time if you wish to reclaim space on the cache drive.

 


Here is what I think.

 

With the cache drive your write speeds are governed by the speed of the cache drive so if you put a WD Black in (or a Hitachi 2 TB which is a lot cheaper) you can get 100-120 MB/s of sequential write on large files (like media files).

 

Without a cache drive your write speeds are going to be max around 30 MB/s which is not very much on GbE (basically it sucks compared to a regular unprotected file share).

 

The only issue here is the small period of time when your data is not protected by parity, but really, if your data is that important you should be using backups, not just unRAID.  Hard drive failures are not very common these days (unless the drives are old or your system is poorly engineered, a.k.a. running too hot), so in my opinion it's not too much of an issue.

 

Summary: if you care about your write speeds and have a spare hard disk - yes it is very worth it.

 


But if you use a WD black or Hitachi 7200rpm, then you're stuck with those drives. The word "stuck" may be a bit strong here; those are fine drives. I just think they're ill-suited to function as media playback devices. They're quick, but relatively power hungry and hot. If you use one for a cache drive, you either have to be okay with assigning one as a data disk, or relegate it strictly to cache duty, which negates the benefit of the "warm spare" concept.

 

Conceptually, I think sticking with "green" drives for a media server makes total sense. I have some older 7200rpm drives in my system, but as I phase them out over the next couple of years I will replace them exclusively with "green" drives (which didn't exist when I was buying these drives). I also think the performance benefits of a 7200rpm drive are not that great in the unRAID environment: compared to a 7200rpm drive, a "green" drive's seek times are a bit higher and its long writes are a bit slower. But for a server with upwards of 20 drives that spend most of their time spun down, the slower, more power-efficient drives make far more sense.

 

Also, is anyone actually getting 100-120 MB/s sequential write speeds on a WD Black or similar cache drive? I suspect not. Since Rajahal has reported far less than that with a (budget) SSD, I would guess that you won't see those sorts of speeds from a platter-based drive. The best speeds I've seen reported from cache drive users are in the 75-80 MB/s range, which is nice, but frankly underwhelming. unRAID just isn't built for speed. I'm fairly "meh" about installing a cache drive for speed. I'm more excited about using it as a warm spare, with the write speed as a nice side benefit.

 

 


Here are the results I obtained when I initially tested unRAID performance. These are not using an actual cache drive, but from what I saw, the performance of a "no parity" config is the same as with a cache drive. I saw ~100 MB/s transfers for the modern SATA drives. I currently have a Hitachi 500 GB as the cache drive, and performance over the network is identical to what I benchmarked for this drive previously in Windows, both locally and remotely. This is all on commodity hardware, no fancy stuff.

 

Single disk tests with no parity drive - 6.5 GB file

Hitachi 500 GB 7200 rpm
write: 80-90 MB/s
read: 95-100 MB/s

Seagate 2 TB 5900 rpm
write: 95-100 MB/s
read: 100-120 MB/s

Hitachi 2 TB 7200 rpm
write: 90-100 MB/s
read: 100-120 MB/s
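Figures like these are easy to sanity-check yourself with a crude sequential-write test using dd.  This is only a rough sketch (GNU dd is assumed for conv=fsync, and the helper name is mine): write a scratch file on the filesystem under test and read off the throughput dd reports.

```shell
#!/bin/sh
# Sketch: crude sequential-write test, in the spirit of the benchmarks
# above.  Assumes GNU dd (for conv=fsync, which flushes to disk before
# reporting, so the figure reflects the media rather than the RAM cache).

seq_write_test() {
    target="${1:-/tmp}"   # filesystem to test
    mb="${2:-64}"         # scratch file size in MB
    scratch="$target/ddtest.$$"
    dd if=/dev/zero of="$scratch" bs=1M count="$mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$scratch"
}

# Demo against /tmp with a small file, just so it runs quickly anywhere.
seq_write_test /tmp 64
```

For a meaningful number, use a file several GB in size (larger than the drive's onboard cache) and point the test at the actual disk under test, not /tmp.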


Here's my cache drive question:

 

Let's say I'm using a 5-drive system: 1 is parity, 1 is cache, 3 are data.  The data is made up of a mixture of DV and MPEG files shot by camcorder.  When I open a project in Vegas Video or some similar software tool, it references, say, 6 files - some video, some audio tracks - all located on the data drives.  Now if I build the output video, it will get written to the cache drive, I assume.

 

During the night the output video will get moved from the cache drive to the data drive. 

If the next day I rebuild the output video with some changes will the file on the data drive be removed and the new file written to the cache drive?  Or will the one on the data disk remain until the nighttime move happens?


If the next day I rebuild the output video with some changes will the file on the data drive be removed and the new file written to the cache drive?  Or will the one on the data disk remain until the nighttime move happens?

 

As far as I know, the prior one will remain on the array while the updated one will be on the cache until it is overwritten.

When accessing via user shares, the matching file on the cache will take precedence.

 

But the only way to be really sure is to test it out.


Ok, next question is:

 

If I have a file on a data disk and I'm going to append data to it, does it:

 

A. Simply append data to the data disk file?

 

This one, I am pretty sure

 

I think the cache drive only comes into effect when you are writing a new file.  If you do a save of the current file, you will go directly to the data disk; if you do a 'save as', you will go to the cache drive.


Ok, next question is:

 

If I have a file on a data disk and I'm going to append data to it, does it:

 

A. Simply append data to the data disk file?

 

This one, I am pretty sure

 

I think the cache drive only comes into effect when you are writing a new file.  If you do a save of the current file, you will go directly to the data disk; if you do a 'save as', you will go to the cache drive.

 

 

Although, if the file only exists on the cache drive you will edit the copy on the cache drive.


Has anyone played with different drives to see what kind of results they have?  I have a spare 60 GB SSD and a 500 GB WD Blue. Now if the Blue drive can achieve write speeds that are still significantly faster than running without a cache drive, I'd rather use that, sell my 60 GB SSD, and use the money to buy 2 more TB of storage.


ok - I guess the same goes, if I connect via e-sata?

Via eSATA would be fine.  They will show up on the Devices page for assignment as long as they are plugged in before the array is started.  The only difference is an external case and a longer shielded cable to the drive.

 

Joe L.


  • Tepid Spare (™ Rajahal ;))

...

You now have some more space in your array to 'tide you over' until you can buy that new hard drive.  Once you do and successfully add it to your array following the standard procedure, you can do the following:

   1) Move/copy all the data off your Tepid Spare and onto your new disk

   2) Stop the array (via the Main page)

   3) Unassign the Tepid Spare from the disk slot (disk10 in our previous example, double check the disk slot to ensure you have it right; via the Devices page)

   4) Assign the Tepid Spare to the Cache slot (also on the Devices page)

   5) Start the array (back on the Main page).  unRAID will clear your Tepid Spare then enable it as a cache drive once again.  This process involves no server downtime.  At this point you will see your write speeds jump back up, since you are once again using a cache drive.

...

 

Just found this thread, and great work and layout, Rajahal!

 

I did notice a couple of things.  First, just a reminder about BRiT's post (Reply #3), those are really important uses of the Cache drive, especially for 'power' users.  Would be good to add them to your great All-about-the-Cache-Drive post.

 

Secondly, I couldn't help noticing the section above, from the description of the Tepid Spare.  (great idea by the way!)  As I know you know, you cannot just remove a drive from an unRAID array (step 3 above).  So I wonder if a better procedure would be to replace the Tepid spare with the new drive, and let it rebuild on to it.  That way, the array stays protected, and you no longer need Step 1, copying all of the data to the new drive.  Plus, the data in question is fully backed up on the Tepid Spare while it is being rebuilt onto the new drive, in case anything should go wrong with the rebuild.

 

And there's a 'drive drive' in the first line of the Warm Spare paragraph.


Thanks Rob, you are absolutely right, I'll make those changes.

 

Edit: I made the changes.  I still really don't know much about the newly added cache drive benefit (running software on top of unRAID).  You and all the mods are welcome to edit my post to elaborate, if you wish.  I figure any post like this should be treated more like a wiki page with more than one author.


Secondly, I couldn't help noticing the section above, from the description of the Tepid Spare.  (great idea by the way!)  As I know you know, you cannot just remove a drive from an unRAID array (step 3 above).  So I wonder if a better procedure would be to replace the Tepid spare with the new drive, and let it rebuild on to it.  That way, the array stays protected, and you no longer need Step 1, copying all of the data to the new drive.  Plus, the data in question is fully backed up on the Tepid Spare while it is being rebuilt onto the new drive, in case anything should go wrong with the rebuild.

 

And there's a 'drive drive' in the first line of the Warm Spare paragraph.

 

I'm still new to unRAID, but I would've thought you can remove cache drives because they're not REALLY in the array... I guess I'm wrong. Are you saying, once you implement cache drives in your array, you can't ever remove it anymore without rebuilding the entire parity? I guess the same would apply if someone were to shrink their array size (for whatever reason), right?

