SSDs seem to work fine within unRAID but probably kill parity when thrashed



Now I'm confused.  If TRIM on an SSD can move data around and possibly corrupt parity, that would mean the SSD knows and understands the filesystem present on it and updates the file allocation tables and B-trees accordingly.  In other words, TRIM would actively modify your filesystem, and I find that very hard to believe.

In theory, everything should be fine: a zero written to a location should always stay a zero, etc. However, what happens to overwritten files is unclear to me in the context of parity. A normal HDD overwrites in place, but an SSD writes to a new physical location and marks the original physical location as garbage to be erased later in a bulk operation. Where does the "extra" space get allocated to and from? Does the number of writeable locations change? With an unaware OS, it's recommended to leave a percentage of the SSD unallocated so background garbage collection has room to work. unRAID uses the whole presented size. Does the SSD refuse the write if it has no more clean pages?

 

Like I said, I am unclear, and have no idea what I'm talking about, just processing what I have read elsewhere.

Link to comment

OK. I researched a bit more, and the problem would be due to wear levelling combined with TRIM: any given sector that has been trimmed may not contain the same data that it did prior to the trim. Wear levelling means that the next time sector 123 is written, it may not be the same sector that was written to earlier, and since TRIM doesn't actually zero a sector but merely marks it as not in use, the sector may not contain the same data after a trim operation.

 

As far as unRAID goes, btrfs and xfs both support TRIM automatically, but unRAID's md driver may interfere with it.

 

So I'm going to concede the point and agree that SSDs would intermittently give parity errors.

Link to comment

That should never be the case.

 

The contents of LOGICAL sector 123 should always be what we last wrote to LOGICAL sector 123. LOGICAL sector 123 may not be PHYSICAL sector 123, it may be PHYSICAL sector 98675309. The parity operations should always operate on the logical sectors.
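To make the logical/physical split concrete, here is a toy bash model of that indirection (purely illustrative; no real firmware works like this, and all the names are made up): the host only ever addresses logical sectors, and a read always returns whatever was last written there, no matter which physical page the drive picked.

#!/bin/bash
# toy flash translation layer: logical sector -> physical page
declare -A ftl      # logical sector number -> physical page number
declare -A flash    # physical page number  -> contents

write_sector() {                     # write_sector LOGICAL DATA
  local phys=$RANDOM                 # "wear levelling": pick some fresh physical page
  flash[$phys]="$2"
  ftl[$1]=$phys
}
read_sector() {                      # read_sector LOGICAL
  echo "${flash[${ftl[$1]}]}"
}

write_sector 123 "old contents"
write_sector 123 "new contents"      # the rewrite lands on a different physical page...
read_sector 123                      # ...but logical sector 123 still reads "new contents"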

Link to comment

That should never be the case.

 

The contents of LOGICAL sector 123 should always be what we last wrote to LOGICAL sector 123. LOGICAL sector 123 may not be PHYSICAL sector 123, it may be PHYSICAL sector 98675309. The parity operations should always operate on the logical sectors.

Agreed. The only penalty we SHOULD see is speed, when the SSD is forced to move data around to rewrite a whole page because it has run out of clean pages to write to. However, there are reports of parity not staying consistent when using SSDs. Any speculation on the cause?
Link to comment

That should never be the case.

 

The contents of LOGICAL sector 123 should always be what we last wrote to LOGICAL sector 123. LOGICAL sector 123 may not be PHYSICAL sector 123, it may be PHYSICAL sector 98675309. The parity operations should always operate on the logical sectors.

Agreed. The only penalty we SHOULD see is speed, when the SSD is forced to move data around to rewrite a whole page because it has run out of clean pages to write to. However, there are reports of parity not staying consistent when using SSDs. Any speculation on the cause?

 

Depends on the filesystem and how it supports TRIM on SSDs.

If it's a filesystem that supports TRIM directly, when a file is deleted does it send a trim command for the logical sectors used by that file, or does it write 0s to those logical sectors? If it only sends the trim command, then the SSD is effectively writing 0s behind the back of the unRAID parity scheme. That would cause massive parity consistency issues.
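As a rough illustration of why that would show up as sync errors (assuming single XOR parity, with one byte standing in for a whole sector):

# two "data disks" and the parity the array last wrote for them
d1=0xA5; d2=0x3C
parity=$(( d1 ^ d2 ))                      # 0x99, stored on the parity disk
echo $(( (d1 ^ d2) == parity ))            # 1 -> parity check passes

# the SSD zeroes d1's logical sector after a trim, with no matching parity update
d1=0x00
echo $(( (d1 ^ d2) == parity ))            # 0 -> "md: parity incorrect"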

Link to comment

From Wikipedia

 

 

Trim irreversibly deletes from the SSD the data Trim affects. Recovery of data deleted by Trim is not possible. This is unlike a magnetic drive from which deleted data may often be recovered.

 

In other words, because the now unused block has been erased on the drive behind the scenes of the filesystem, parity is automatically void.

 

Link to comment

It looks like using BTRFS or XFS on an SSD as a data disk in a parity-protected array would cause massive parity corruption issues, regardless of the discard method enabled.

 

If it runs fstrim or is mounted with -o discard, it will change sector contents behind the back of the parity protection mechanism.

 

https://btrfs.wiki.kernel.org/index.php/FAQ#Does_Btrfs_support_TRIM.2Fdiscard.3F

Does Btrfs support TRIM/discard?

 

There are two ways how to apply the discard:

  • during normal operation on any space that's going to be freed, enabled by mount option discard
  • on demand via the command fstrim

   

"-o discard" can have some negative consequences on performance on some SSDs or at least whether it adds worthwhile performance is up for debate depending on who you ask, and makes undeletion/recovery near impossible while being a security problem if you use dm-crypt underneath (see http://asalor.blogspot.com/2011/08/trim-dm-crypt-problems.html ), therefore it is not enabled by default. You are welcome to run your own benchmarks and post them here, with the caveat that they'll be very SSD firmware specific.

 

The fstrim way is more flexible as it allows applying trim to a specific block range, or can be scheduled for a time when the filesystem performance drop is not critical.

 

http://xfs.org/index.php/FITRIM/discard

 

Modes of Operation

  • Realtime discard -- As files are removed, the filesystem issues discard requests automatically
  • Batch Mode -- A user procedure that trims all or portions of the filesystem
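For reference, the two discard modes described above boil down to something like this on the command line (device and mount point are placeholders, and running either against a parity-protected unRAID data disk is exactly the scenario being questioned here):

# 1) realtime / online discard, enabled at mount time
mount -t xfs -o noatime,discard /dev/sdX1 /mnt/somedisk

# 2) batch mode: trim all free space on demand (or from a scheduled cron job)
fstrim -v /mnt/somedisk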

Link to comment

Following up on my earlier post about sync errors with SSDs, I’ve been doing some more testing and this only happens for me using some Kingston 120GB V300 SSDs that had HPA set to 32GB, so they were the same size as my other SSDs.

 

I didn’t have sync errors with my other SSDs, so I disabled HPA on the Kingstons to confirm, and the errors went away. I never thought HPA could cause this, but in a way it’s good news.

 

With HPA on, I just had to turn off the server, wait a few hours, and errors would appear. I took note of the actual sync errors; it was always sector 0 and various sectors at the end of the disk. Example:

 

 

unRAID Parity check: 07-01-2016 22:50
Notice [TESTV6] - Parity check finished (0 errors)
Duration: 1 minute, 25 seconds. Average speed: 376.7 MB/s

unRAID Parity check: 08-01-2016 07:41
Notice [TESTV6] - Parity check finished (3 errors)
Duration: 1 minute, 25 seconds. Average speed: 376.7 MB/s

Jan 8 07:38:59 Testv6 kernel: md: parity incorrect, sector=0
Jan 8 07:39:41 Testv6 kernel: md: parity incorrect, sector=31267616
Jan 8 07:39:41 Testv6 kernel: md: parity incorrect, sector=31267624

unRAID Parity check: 08-01-2016 13:32
Notice [TESTV6] - Parity check finished (5 errors)
Duration: 1 minute, 25 seconds. Average speed: 376.7 MB/s

Jan 8 13:30:26 Testv6 kernel: md: parity incorrect, sector=0
Jan 8 13:31:09 Testv6 kernel: md: parity incorrect, sector=31266920
Jan 8 13:31:09 Testv6 kernel: md: parity incorrect, sector=31266928
Jan 8 13:31:09 Testv6 kernel: md: parity incorrect, sector=31267616
Jan 8 13:31:09 Testv6 kernel: md: parity incorrect, sector=31267624

 

 

Before removing HPA, I also used 2 of these SSDs in a cache pool and there were no issues; I filled them with data and ran a scrub, always with 0 errors.

 

scrub status for 50270b05-b2af-4f13-b37d-2867f59dfc43
scrub started at Fri Jan  8 22:51:52 2016 and finished after 00:01:00
total bytes scrubbed: 55.63GiB with 0 errors

scrub status for 50270b05-b2af-4f13-b37d-2867f59dfc43
scrub started at Sat Jan  9 07:41:35 2016 and finished after 00:01:00
total bytes scrubbed: 55.63GiB with 0 errors

 

 

 

So SSDs are probably safe to use on the protected array, but it’s still a good idea to do frequent parity checks in the beginning to make sure there are no issues.

 

Link to comment

Thanks Jonnie -- good to know.

 

I've been thinking of buying one of the frequent Newegg specials with 4 of the 120GB Kingston SSDs for ~ $150 just to set up a small test server similar to yours.    I also have a couple of older 64GB and 80GB SSDs I could toss in the test unit.    I already have one test server, but it's using some old spinners, and every time you post another test result I think I should have an SSD-based one just so everything would be FAST  :) :)

Link to comment

What filesystem were you using? If you are using xfs or btrfs do you ever run "fstrim"?

 

Also, can you try the following scenario using btrfs or xfs:

 

Ensure parity check completes without errors.

Create and then delete a handful of files, at least 64 MB in size and not zero-filled, on the various SSD data drives.

Run the "fstrim" utility on the SSD drive.

Run parity check again.

 

I suspect fstrim will wreck parity.

 

I think using something like /dev/random as the input device for "dd" would suffice for non-zero file generation. The non-zero data is so that when the sectors get zeroed out / reclaimed by fstrim, it should trigger parity mismatches. The 64 MB size is just so it's large enough to be a block that's able to be reclaimed by the SSD during the fstrim run.
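Something along these lines should do it (a sketch only; the path, file count, and sizes are placeholders, and /dev/urandom is used instead of /dev/random because the latter can block):

DISK=/mnt/disk1                       # an SSD data disk in the parity-protected array

# 1) create a handful of 64 MB files filled with non-zero (random) data
for i in 1 2 3 4 5; do
  dd if=/dev/urandom of="$DISK/trimtest_$i.bin" bs=1M count=64
done
sync

# 2) delete them again so the filesystem marks those blocks as free
rm -f "$DISK"/trimtest_*.bin
sync

# 3) trim the freed space -- if this zeroes sectors behind unRAID's back,
#    the next parity check should report sync errors
fstrim -v "$DISK"

# 4) run a parity check and compare against the error-free baseline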

Link to comment

What filesystem were you using? If you are using xfs or btrfs do you ever run "fstrim"?

 

Also, can you try the following scenario using btrfs or xfs:

 

Ensure parity check completes without errors.

Create and then delete a handful of files, at least 64 MB in size and not zero-filled, on the various SSD data drives.

Run the "fstrim" utility on the SSD drive.

Run parity check again.

 

I suspect fstrim will wreck parity.

 

I think using something like /dev/random as the input device for "dd" would suffice for non-zero file generation. The non-zero data is so that when the sectors get zeroed out / reclaimed by fstrim, it should trigger parity mismatches. The 64 MB size is just so it's large enough to be a block that's able to be reclaimed by the SSD during the fstrim run.

 

I did try earlier, but I can only trim the cache; if I try an array disk I get an error.

 

root@Testv6:~# fstrim -v /mnt/cache
/mnt/cache: 4292886528 bytes were trimmed

 

disk1 (xfs) and disk3 (btrfs) give the same error:

root@Testv6:~# fstrim -v /mnt/disk1
fstrim: /mnt/disk1: FITRIM ioctl failed: Operation not supported

 

 

 

Link to comment

Perhaps on /dev/md1 then?

But that's a device, and I think fstrim wants a filesystem.

 

Might not be an issue then, but the only way to try it out is using the "discard" option at mount time, and that's a bit harder to put in place.

 

root@Testv6:~# fstrim -v /dev/md1
fstrim: /dev/md1: not a directory

 

Yep, it only works on mount points. Could LT have disabled trim on array devices?

 

I don’t mind trying the discard option, if someone can tell me how to do it.
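One way to check whether the block devices even advertise discard support (device names are examples, and I'm not certain unRAID's custom md driver exposes these sysfs entries the same way a stock kernel device does):

# non-zero DISC-GRAN / DISC-MAX means the device accepts discard requests
lsblk --discard /dev/sdb /dev/md1

# equivalent low-level check for the md device; 0 means FITRIM/discard will be rejected
cat /sys/block/md1/queue/discard_max_bytes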

Link to comment

I'm not sure how to remount or alter the data disk mounting options in unRAID.

 

On my system with XFS drives, the following is in the syslog showing how it mounts:

 set -o pipefail ; mount -t xfs -o noatime,nodiratime /dev/md1 /mnt/disk1|& logger

 

To enable online discard (TRIM of freed blocks as they are released), one also needs to add "discard" to the list of options. The mount command would be:

mount -t xfs -o noatime,nodiratime,discard /dev/md1 /mnt/disk1 

 

From the btrfs mount options (it's the same for XFS): https://btrfs.wiki.kernel.org/index.php/Mount_options

discard

Enables discard/TRIM on freed blocks. This can decrease performance on devices that do not support queued TRIM command, like SATA prior to revision 3.1.
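A quick way to confirm which options a data disk is currently mounted with; "discard" will show up in this output once online TRIM is actually enabled (the mount point is an example):

grep ' /mnt/disk1 ' /proc/mounts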

Link to comment

So I did another test. At the moment I have an array with 4 x Kingston 120GB SSDs; disk1 (XFS) had about 60GB used. I deleted 30GB of files and did a parity check, which as expected finished with 0 errors. I then assigned disk1 to the cache slot and ran fstrim:

 

root@Testv6:~# fstrim -v /mnt/cache
/mnt/cache: 89387737088 bytes were trimmed

 

I put disk1 back in the original array and, after trusting parity, ran another parity check:

 

unRAID Parity check: 10-01-2016 19:44
Notice [TESTV6] - Parity check finished (4700498 errors)
Duration: 6 minutes, 34 seconds. Average speed: 304.7 MB/s

 

Still, this does not answer 2 questions:

-would unRAID update parity if fstrim was run with disk1 in the array?

-most modern SSDs have background garbage collection routines; will these cause the same issue?

 

The test server is usually on for only a few minutes at a time, and internal SSD garbage collection may need some idle time to work. Next weekend I’m going to use one of each of the different SSDs I can get my hands on, which should be 5 or 6 different brands, delete some data, then leave the server on for over 24 hours; this should be enough for garbage collection to do its thing. Naturally, this test will only be conclusive if there are sync errors.

 

Link to comment

I think fstrim would wreck parity, since it is not considered a block write command and would not be seen as a write to the device, so parity would not be updated. The unRAID md device system would need to be patched to treat an FITRIM command for a data drive sector as a "write all 0s" and update parity appropriately.
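For what it's worth, the parity update for such a patch would be the same read-modify-write XOR that a normal write already uses; a one-byte sketch (single XOR parity assumed, with bytes standing in for whole sectors):

old_data=0xA5
new_data=0x00                                  # trimmed sector treated as "write all 0s"
old_parity=0x99
new_parity=$(( old_parity ^ old_data ^ new_data ))
printf 'updated parity byte: 0x%02X\n' "$new_parity"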

 

As for drives with automatic garbage collection, it all depends on how smart they are. The logical sectors would still have valid data on them as far as the drive is concerned, so those sectors wouldn't be reaped and cleared. What the drive could do then is maybe erase and reorganize the logical sectors into more consolidated physical blocks, depending on its wear levelling algorithm.

 

I'm definitely looking forward to the results of the extended test next weekend.

 

8)

Link to comment

So I was able to start testing before the weekend; on Thursday I built an array of 10 SSDs with 9 different models from 8 brands.

 

I filled them all with data, then deleted about half, waited 24 hours, and did a parity check. I filled them up again and then deleted all the data, waited another 24 hours, and did another parity check. Finally, I filled them about halfway, waited 12 hours, and did another parity check. All checks finished with 0 sync errors. (The last time I could not wait 24 hours, since I’m having a problem with one of my servers that I need to troubleshoot.)

 

While this test is not conclusive, I believe it’s probably safe to use SSDs with parity. TRIM does not work with SSDs on the array, and this is probably a good thing for now, as it could otherwise invalidate parity. Write performance won’t be optimal, but I’m sure LT will look into that in the future as SSDs become bigger and cheaper.

 

Screenshot below with SSDs used.

[attached screenshot: 3.png]

Link to comment
