SSDs as array drives question

Guest

I read in the getting started guide that this isn't a good idea because of the lack of TRIM support. What do people think of this? Is it a big issue? Does anyone else have an all-SSD array? Please forgive the newbie question  :)


Depends on your use case. I had an SSD-only array for some time but ended up moving the drives to a RAID10 cache pool due to the low write speed on the array; I tried various SSDs but never got sustained writes above 100-150MB/s. I still use a couple of SSDs on my main server for their low power use and read speed (write speed isn't important there because I use an NVMe device for cache).
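For anyone wanting to reproduce that kind of measurement, here is a rough sequential-write sketch using dd. The TARGET path is a placeholder assumption; point it at an actual array disk mount such as /mnt/disk1 to measure that disk.

```shell
# Rough sequential-write test. TARGET is a placeholder: point it at an
# array disk mount (e.g. /mnt/disk1) to measure that disk's write speed.
TARGET="${TARGET:-/tmp}"
result=$(dd if=/dev/zero of="$TARGET/writetest.bin" bs=1M count=64 conv=fsync 2>&1 | tail -n 1)
echo "$result"   # dd's summary line, ending in the achieved throughput
rm -f "$TARGET/writetest.bin"
```

Note this still goes through the page cache (add oflag=direct to bypass it where the filesystem supports direct I/O), so small test sizes can overstate the speed; conv=fsync at least forces the data to the device before dd reports.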


 

I think the question is: why would you want to use SSDs as array drives?

 

The array is designed to give you a large, cheap, protected storage space.

SSDs are not large (well, not the cheap ones anyway).

TRIM would mess with the already-calculated parity, so you would lose protection (though maybe there is a way around this).

 

SSDs are good for speedy data access, but if you're accessing your array over gigabit Ethernet, a decent HDD will saturate it (on reads at least; there is a slight slowdown when writing due to parity being calculated and written).

 

You would probably be better off building an array to protect your archived data (data that does not change very often) and then having a protected cache made up of SSDs.

You can assign shares to live on just the cache drive, so you would get the benefit of speed and avoid the problems of having SSDs in your array.


TRIM would mess with the already-calculated parity, so you would lose protection (though maybe there is a way around this).

 

No, TRIM doesn't work on array devices; parity is maintained (but write performance takes a hit).

 

SSDs are good for speedy data access, but if you're accessing your array over gigabit Ethernet, a decent HDD will saturate it (on reads at least; there is a slight slowdown when writing due to parity being calculated and written).

 

True, but they can be worth using for the read speeds when using 10GbE; I get 500MB/s read speed from my array SSDs.

 

Like I said, depends on your use case, but they can be a valid choice.


A file system issues TRIM commands as a result of deleting a file, to tell the SSD that the set of blocks which previously made up the file is no longer in use. The SSD can then mark those blocks as 'free'. Later, when the SSD's internal garbage collection runs, it knows that it doesn't have to preserve the contents of those blocks. This makes garbage collection more efficient. There are lots of articles that explain this.

 

The trouble this causes for parity-based array organizations is that the data returned from a TRIM'ed data block can be indeterminate.  This paper is a bit wordy but lays it out on p. 13:

 

Since the parity information must always be consistent with the data, it has to be updated after a TRIM command was processed. A useful drive characteristic that would ease the maintenance of a consistent state in parity-based RAIDs would be that subsequent read requests to a trimmed LBA range always return the same data. However, the SATA standard [14] tolerates that subsequent reads of trimmed logical blocks may return different data. The exact behavior of a subsequent read request to a trimmed sector is reported by a particular SATA device when the IDENTIFY DEVICE command [14] is issued to it. There are three possibilities: The first is that each read of a trimmed sector may return different data, i.e., an SSD shows non-deterministic trim behavior. We denote this variant of trim behavior as "ND_TRIM" for the rest of this report. The remaining two possibilities represent a deterministic trim behavior, where all subsequent reads of a trimmed logical block return the same data, which can be either arbitrary (denoted as "DX_TRIM") or contain only zero-valued bytes (denoted as "DZ_TRIM"). The SATA standard [14] leaves open for the variant DX_TRIM whether the returned data will be the same for different sectors or not.

 

To boil this down for unRAID: it should work to use SSDs in an unRAID P or P+Q array if TRIM is not used. This is the current behavior. However, note that:

a) Write performance can degrade faster on data disks, depending on how many file deletions take place.

b) The parity disk is also written for each data disk write.

c) The data disks really should be completely written first, because theoretically a block that was never written can, from the point of view of the SSD, return non-deterministic data. We have not seen this happen, but then again we have not run very many SSD arrays (it would show up as parity sync errors). Pre-writing is a pretty undesirable thing to do, however, since it guarantees slowing down subsequent writes.

d) If you don't want to pre-write the disks as above, then only use SSDs that support "DX_TRIM" or "DZ_TRIM", and instead of writing the disks with zeros, simply use the 'blkdiscard' command to TRIM the entire device first.
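As a sketch of step (d), a guarded wrapper might look like the following. DEVICE is a placeholder you must set yourself, and blkdiscard destroys all data on the device:

```shell
# DANGER: blkdiscard erases the ENTIRE device. DEVICE is a placeholder.
# Only proceeds when hdparm reports deterministic reads after TRIM.
DEVICE="${DEVICE:-}"
if [ -z "$DEVICE" ]; then
    msg="usage: set DEVICE=/dev/sdX first"
elif hdparm -I "$DEVICE" | grep -q "Deterministic read"; then
    blkdiscard -v "$DEVICE" && msg="discarded $DEVICE"
else
    msg="$DEVICE does not report deterministic TRIM; do not use it in the array"
fi
echo "$msg"
```

The check before the discard mirrors the hdparm inspection described below: only a drive reporting deterministic read behavior after TRIM is safe to prepare this way.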

 

You can use the 'hdparm' command to determine whether your SSDs have this support:

 

hdparm -I /dev/sdX   # substitute X with your SSD device assignment

 

You want to look near the end of the "Commands/features:" section for:

 

          *    Data Set Management TRIM supported

 

Following this you will either see:

 

          *    Deterministic read data after TRIM

 

or you will see this:

 

          *    Deterministic read zeros after TRIM

 

or you won't see either of the above (in which case, do not use the drive in an unRAID P or P+Q array).
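Putting the three cases together, a small helper (hypothetical, not part of unRAID; it parses `hdparm -I` output supplied on stdin) could classify a drive:

```shell
# Classify a drive's TRIM behavior from `hdparm -I` output on stdin.
# DZ_TRIM / DX_TRIM are the modes safe for a P or P+Q array.
classify_trim() {
    out=$(cat)
    if printf '%s' "$out" | grep -q "Deterministic read zeros after TRIM"; then
        echo "DZ_TRIM"
    elif printf '%s' "$out" | grep -q "Deterministic read data after TRIM"; then
        echo "DX_TRIM"
    elif printf '%s' "$out" | grep -q "Data Set Management TRIM supported"; then
        echo "ND_TRIM"   # TRIM exists, but reads after TRIM are non-deterministic
    else
        echo "NO_TRIM"
    fi
}
# Usage: hdparm -I /dev/sdX | classify_trim
```

The zeros check comes first because a drive reporting "Deterministic read zeros after TRIM" should be classified DZ_TRIM, not DX_TRIM.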

 

In a future release we do plan to add proper TRIM support for array disks. Here's a heads-up on that: in order to support TRIM in an unRAID P or P+Q array, we must add code to the md/unraid driver, and all SSDs in the array must support either "DX_TRIM" or "DZ_TRIM" mode as described above. In addition, there's a really good chance we will only support SSDs with "DZ_TRIM", since supporting "DX_TRIM" is a lot more work  ;)

 



Thanks for the detailed info. I hope to see TRIM support on the array in the future; the reason I stopped using my SSD-only array was the deteriorating write performance caused by the lack of TRIM: never more than 100 to 150MB/s, and sometimes much less, like 50MB/s writes.

 

I did use it a lot and never got a sync error. When I get the chance I'll check which type of TRIM each different model uses.

 

I do still use a couple of SSDs on my main array; for these I only care about the read speed, though TRIM support would always be nice for endurance. I've also never had a single sync error in the roughly 6 months I've been using them. I just checked and they are DZ_TRIM.


Thanks for the detailed info. I hope to see TRIM support on the array in the future; the reason I stopped using my SSD-only array was the deteriorating write performance caused by the lack of TRIM: never more than 100 to 150MB/s, and sometimes much less, like 50MB/s writes.

I did use it a lot and never got a sync error. When I get the chance I'll check which type of TRIM each different model uses.

I do still use a couple of SSDs on my main array; for these I only care about the read speed, though TRIM support would always be nice for endurance. I've also never had a single sync error in the roughly 6 months I've been using them. I just checked and they are DZ_TRIM.

 

That's good info, thanks. I'm still a bit worried about parity updates slowing down writes even with TRIM. This is because with those DZ_TRIM devices we can treat a TRIM like a "write all zeros" and update parity accordingly, but parity itself will probably not be all zeros. A refinement would be to check whether the data to be written to parity is all zeros and, if so, send down a TRIM instead of doing the actual write. I'm not sure how this would affect performance; I think TRIM is one of those commands that causes a queue drain, which may also impact performance.  ::)
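The all-zeros test in that refinement is cheap to sketch. Here is a hypothetical helper (not unRAID code; the function name and the 4 KiB block size are illustrative assumptions) that compares one block against /dev/zero:

```shell
# Hypothetical helper: succeeds (exit 0) when the first 4 KiB of the given
# file or block device are all zero bytes, i.e. the parity write could in
# principle be replaced by a TRIM of that range.
is_zero_block() {
    cmp -s -n 4096 "$1" /dev/zero
}
```

`cmp -n` limits the comparison to the first N bytes, and /dev/zero supplies an endless stream of zero bytes to compare against.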


Superb summary.

 

Can I suggest that, in the interim, an easy indicator be added to emHTTP ASAP showing which of the three states ("DX_TRIM", "DZ_TRIM", or "Unsupported") applies to each drive.

 

This will raise visibility in the community and start the natural process of recommendations and, more importantly, removals.


Have there been any updates on TRIM support for SSDs in the main array?

 

I would like to keep my music and documents on an SSD so that all of my large HDDs (movie drives) can stay spun down on the array. I have been trying to set up the Sync Docker to keep it on unassigned devices, but am having no luck.

 

I am curious whether TRIM is coming soon to the main array, so I can store my SSD there for now. I am not concerned about write speeds since I will be using a cache drive. My main goals are fast seek times, data parity, low power consumption, and drive health.

3 minutes ago, Twisted said:

Have there been any updates on TRIM support for SSDs in the main array?

 

I would like to keep my music and documents on an SSD so that all of my large HDDs (movie drives) can stay spun down on the array. I have been trying to set up the Sync Docker to keep it on unassigned devices, but am having no luck.

 

I am curious whether TRIM is coming soon to the main array, so I can store my SSD there for now. I am not concerned about write speeds since I will be using a cache drive. My main goals are fast seek times, data parity, low power consumption, and drive health.

Another approach for you would be to set up a cache pool and get redundancy that way. I have several cache-only shares for frequently accessed files like music.


@trurl Thank you for the advice. I was going to go this way initially, but I just bought a 500GB SSD and I have a 250GB SSD lying around, and I didn't want to send back the 500 for a 250. I am worried 250GB may not be enough space. I already have a 250GB M.2 drive for my VMs and Dockers, so I don't see any other use case for the 250GB SSD. If I can't get Sync to work and TRIM support is not coming, I may just send back the 500GB SSD.

14 hours ago, Twisted said:

but I just bought a 500GB SSD and I have a 250GB SSD lying around, and I didn't want to send back the 500 for a 250. I am worried 250GB may not be enough space.

If you don't need the mirror protection of the RAID pool but want to treat it as a JBOD pool, that is also possible.

Edited by Jcloud

11 minutes ago, Twisted said:

How do you handle redundancy?

The system doesn't; it's just JBOD. For Docker/apps I use the CA backup utility. For my VMs I make a manual copy of my VHDs to a vmBackup folder on my array. Everything else is handled by the Mover for the shares. That just leaves the Steam games I keep on the cache, and if I lost those I'd only lose time, since I can download them from Steam again (for me this isn't a problem).

 

So you're correct, my cache is not protected. I'm fine with that, and I take steps to cover my bum. :)


My challenge is that I need redundancy, so I need to put an SSD in the main array or figure out how to set up Sync. If TRIM support is coming, I am not going to hassle with trying to find an alternate solution.

4 minutes ago, Twisted said:

My challenge is that I need redundancy, so I need to put an SSD in the main array or figure out how to set up Sync. If TRIM support is coming, I am not going to hassle with trying to find an alternate solution.

 

I would not hold my breath for unRAID to officially support SSD arrays. I don't know how SSD parity would ever work. The whole idea of unused space has no real meaning with parity, unless we treat a block of parity as unused when the corresponding blocks on all disks are unused; unRAID would have to be very aware of the inner workings of the file systems to know that. And if you filled a disk, parity would always be considered full and could never be trimmed, hence would slow down. I think we're looking at a paradigm shift before SSDs can be supported. You could use SSDs in the array if parity is a spinner and TRIM runs periodically (let's say every 2 months), provided that when you trim all the SSDs, you rebuild parity. Not sure it is worth it.

 

So I'd stick with spinners for the array. If you have a specific use case that requires faster speed and redundancy, go with a cache pool and you'll get both. Or go with a RAID card in the unRAID server and run it as a UD alongside the array.

15 minutes ago, SSD said:

You could use SSDs in the array if parity is a spinner and TRIM runs periodically (let's say every 2 months), provided that when you trim all the SSDs, you rebuild parity.

That triggered an idea. Perhaps offer the option to trim the SSD data drives and rebuild parity (zeroing it first if it was an SSD as well), with the warning that until the operation has completed, your array data (ALL OF IT) is at risk. For some, that could be an acceptable risk in return for the performance gain.

 

Some SSDs have good enough spare-sector wear leveling and internal management that they don't suffer from performance drops nearly as badly as other models. Perhaps if the incentive were there, we could figure out which specific SSDs can be used as-is right now. @johnnie.black did some testing a while back which seemed to indicate that certain models worked fine in unRAID as array drives in certain circumstances, even without TRIM enabled.

3 hours ago, jonathanm said:

That triggered an idea. Perhaps offer the option to trim the SSD data drives and rebuild parity (zeroing it first if it was an SSD as well), with the warning that until the operation has completed, your array data (ALL OF IT) is at risk. For some, that could be an acceptable risk in return for the performance gain.

 

Some SSDs have good enough spare-sector wear leveling and internal management that they don't suffer from performance drops nearly as badly as other models. Perhaps if the incentive were there, we could figure out which specific SSDs can be used as-is right now. @johnnie.black did some testing a while back which seemed to indicate that certain models worked fine in unRAID as array drives in certain circumstances, even without TRIM enabled.

 

Rebuilding parity is a particularly expensive operation for an SSD, given its limited number of writes.


Has anyone tried using a Sync Docker to pull data off of an unassigned SSD and copy it to the main array nightly? That seems to be the best option for those of us who have a random stack of SSD sizes lying around.

8 minutes ago, Twisted said:

Has anyone tried using a Sync Docker to pull data off of an unassigned SSD and copy it to the main array nightly? That seems to be the best option for those of us who have a random stack of SSD sizes lying around.

 

I have not. Instead of using an unassigned-device SSD, could you make the SSD the cache? Then I think the Mover would do what you want.


The only challenge is that I have 250GB and 500GB SSD drives. I could send back the 500GB, but I think my data would fill up most of a 250GB cache drive. I was hoping to find an alternative so I can use what I have.

Edited by Twisted

8 hours ago, SSD said:

And if you filled a disk, parity would always be considered full and never able to be trimmed - hence would slow down.

There is no obligatory need to trim an SSD. It's just a question of which SSD you buy: an SSD with a hidden overprovisioning pool will handle block erases in the background as it rotates flash blocks in and out of the overprovisioning pool, making it possible to treat the SSD as a normal HDD.

 

So a RAID with SSDs can work very well, and there are already lots of products that run RAID on SSDs.

3 hours ago, pwm said:

There is no obligatory need to trim an SSD. It's just a question of which SSD you buy: an SSD with a hidden overprovisioning pool will handle block erases in the background as it rotates flash blocks in and out of the overprovisioning pool, making it possible to treat the SSD as a normal HDD.

 

So a RAID with SSDs can work very well, and there are already lots of products that run RAID on SSDs.

 

Looking at the Samsung 960 PRO, the endurance of the 512GB version is 400TB. That basically means it can be filled about 800 times. That is a pretty big number.

 

But I am assuming that figure reflects a usage pattern in which wear is effectively managed. If wear were poorly managed (let's say the hardest-hit blocks get hit 10x more than other parts), that number goes from 800 to 80. (Maybe that is not realistic; I am not an expert. But without TRIM I am not sure.) Running frequent parity rebuilds that rewrite every sector would take chunks out of its lifespan, and that would add up.
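The arithmetic behind that "800 times" figure, using the 960 PRO numbers above (decimal units assumed, so the exact result lands a little under 800):

```shell
# 400 TB rated write endurance on a 512 GB drive: number of complete fills.
tbw_gb=$((400 * 1000))             # 400 TB expressed in (decimal) GB
capacity_gb=512
fills=$((tbw_gb / capacity_gb))
echo "$fills"                      # 781, i.e. roughly 800 complete fills
echo $((fills / 10))               # 78: the same drive with 10x worse wear behavior
```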

 

Generally I would say that the cache pool is a better place for data requiring high-speed access; that, or a HW RAID controller designed with SSD usage in mind, running as a UD. High-speed access to a large media library really doesn't make a lot of sense.

26 minutes ago, SSD said:

But I am assuming that figure reflects a usage pattern in which wear is effectively managed. If wear were poorly managed (let's say the hardest-hit blocks get hit 10x more than other parts), that number goes from 800 to 80. (Maybe that is not realistic; I am not an expert. But without TRIM I am not sure.)

The SSD is responsible for wear-leveling, so it will regularly change the internal mapping of which flash blocks are used for the different LBAs written by the OS.

 

A more troubling issue is write amplification: very small writes can count as large writes when it comes to actual drive wear. If the SSD has 128 kB flash blocks, it isn't always possible for the drive to fit 128 kB of writes into a block before it needs to erase the block and restart. And when it's time to erase a flash block, the drive sometimes has part of the block still in use, and so has to copy that data to a new block before it can append the new write; the drive then performs additional internal writes besides the writes sent over the SATA cable.

 

Some SSDs have a specific SMART attribute that reports the amount of write amplification.
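If smartmontools is installed, you can look for such attributes. The device path is a placeholder, and the attribute names in the pattern are vendor-specific assumptions (many consumer drives expose none of them):

```shell
# Look for wear / write-amplification-related SMART attributes.
# /dev/sdX is a placeholder; attribute names vary by vendor and may be absent.
wear=$(smartctl -A /dev/sdX 2>/dev/null | grep -Ei 'wear|lbas_written|amplif' \
    || echo "no matching attributes reported")
echo "$wear"
```

On drives that don't report a write-amplification figure directly, comparing host writes (e.g. Total_LBAs_Written) against the drive's own NAND-write counter, where one exists, gives a rough estimate.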

8 minutes ago, pwm said:

The SSD is responsible for wear-leveling. So it will regularly change the internal mapping what flash blocks that will be used for different LBA written from the OS.

 

A more troubling issue is write amplification - how very small writes will count as large writes when it comes to actual drive wear. If the SSD have 128 kB flash blocks, then it isn't always possible for the drive to fit 128 kB of writes on the block before it needs to erase the block and restart. And when it's time to erase a flash block, the drive sometimes has part of the block in use and so have to copy that data to a new block before it can append the new write - so the drive then performs additional internal writes besides the writes sent over the SATA cable.

 

Some SSD has a specific SMART attribute that specifies the amount of write amplification.

 

Would Plex metadata create a significant write-amplification issue? Plex metadata files are tiny, but there are a zillion of them!




Copyright © 2005-2018 Lime Technology, Inc.
unRAID® is a registered trademark of Lime Technology, Inc.