jpeltoniemi

Utilizing SSD as a smart read cache


2 minutes ago, trurl said:

As you have probably already discovered, this is Slackware Linux.

Yeah. I'm mostly used to Debian-based distros, but luckily Slackware is not nightmarishly different from Debian.

5 minutes ago, trurl said:

As far as I know there isn't really any "3rd party" except open source. ... I think this includes both the parity and user share implementation.

3rd party, as in not in-house code. Hopefully with documentation somewhere. I'd assume (like that's done me any good so far) that parity is Limetech's own implementation, but I'm not so sure about user shares, since software like UnionFS exists. Maybe they didn't fit unRAID's requirements, or maybe Limetech wanted to reinvent the wheel just for fun (I do this all the time). Let's hope someone has more accurate information :)


unRAID 'shfs' was written long before other currently available union-type filesystems.  Originally it just returned symlinks.  Anyway, the current implementation allows for certain unRAID-specific functions, such as supporting different allocation methods (like split level), hard links, and preserving COW flags when possible.

 

'shfs' operates similarly to other union file systems.  The 'top' branch is always '/mnt/cache' and the other branches are, in order, '/mnt/disk1', '/mnt/disk2', ... '/mnt/diskN'.
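To make the branch order concrete, here is a minimal sketch in C (purely illustrative - this is not Limetech's actual shfs code; the disk count and helper name are made up) of resolving a user-share path against those branches in order:

```c
/* Sketch of union-style branch resolution: probe the 'top' branch
 * (/mnt/cache) first, then /mnt/disk1 .. /mnt/diskN in order, and
 * return the first branch where the path exists.  Illustration only,
 * not unRAID's actual implementation. */
#include <stdio.h>
#include <unistd.h>

#define NUM_DISKS 28  /* made-up array size for the example */

/* Fills 'resolved' with the backing path; returns 0 on success, -1 if not found. */
static int resolve_branch(const char *share_path, char *resolved, size_t len)
{
    /* 'top' branch: the cache drive */
    snprintf(resolved, len, "/mnt/cache/%s", share_path);
    if (access(resolved, F_OK) == 0)
        return 0;

    /* then the array disks, in order */
    for (int i = 1; i <= NUM_DISKS; i++) {
        snprintf(resolved, len, "/mnt/disk%d/%s", i, share_path);
        if (access(resolved, F_OK) == 0)
            return 0;
    }
    return -1;
}
```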

 

I don't mind answering specific questions, but most of the info is sprinkled around in docs on the website, the wiki, and forum posts.  Sorry about the 'search' capability of the forum, which fairly sucks; we're looking into that.  Probably we should get around to writing a formal doc on how 'shfs' (and other things) work... added to the todo list...

1 hour ago, limetech said:

unRAID 'shfs' was written long before other currently available union-type filesystems.  Originally it just returned symlinks.  Anyway, the current implementation allows for certain unRAID-specific functions, such as supporting different allocation methods (like split level), hard links, and preserving COW flags when possible.

I was just beginning to think that the shfs in unRAID maybe isn't what can be found on Google. Good, now I can stop trying to make sense of how that shfs fits into this equation :D I'll forgive myself for this one, since with Greyhole you actually mount storage pools using CIFS.

1 hour ago, limetech said:

'shfs' operates similarly to other union file systems.  The 'top' branch is always '/mnt/cache' and the other branches are, in order, '/mnt/disk1', '/mnt/disk2', ... '/mnt/diskN'.

Nice! I guess this means I can put whatever I want in cache and it'll just work, as long as I make sure that mover won't touch them. There shouldn't be a problem even if I disable mover and write my own.

1 hour ago, limetech said:

I don't mind answering specific questions, but most of the info is sprinkled around in docs on the website, the wiki, and forum posts.  Sorry about the 'search' capability of the forum, which fairly sucks; we're looking into that.  Probably we should get around to writing a formal doc on how 'shfs' (and other things) work... added to the todo list...

TBH the docs could be better. If I may suggest a quick improvement, just a list of essential keywords, config files and scripts would be a big help in getting started with tuning unRAID. That said, I have to commend your involvement with the community. It kind of makes me want to spend my last money on a license right now instead of waiting for payday :D

2 hours ago, limetech said:

Probably we should get around to writing a formal doc on how 'shfs' (and other things) work... added to the todo list...

Just a quick question - does shfs keep static inode allocations until reboot, or is FUSE allowed to reuse inode allocations?

 

Inode reuse is an issue for NFS shares.

2 hours ago, jpeltoniemi said:

I guess this means I can put whatever I want in cache and it'll just work, as long as I make sure that mover won't touch them

 

If you set 'use cache' for the share to 'no' or 'only', the 'mover' will not operate on that share at all.

 

2 hours ago, pwm said:

Just a quick question - does shfs keep static inode allocations until reboot, or is FUSE allowed to reuse inode allocations?

 

By default FUSE will free inodes after a delay (looking at the FUSE source code, the minimum is 10 seconds).  That is what the 'fuse_remember' tunable is for on the Settings/NFS page.  That is also the reason we tie that tunable to the NFS settings page.  When NFS is enabled, the default value of 330 corresponds to 5 1/2 minutes (when NFS is not enabled, a setting of 0 is passed to FUSE).  The typical client-side NFS handle cache time-to-live is 5 minutes.  You have to be careful with this setting, however, because if you are doing operations that touch a huge number of files, the FUSE memory footprint can keep growing.  This is related to an issue we're looking into right now.
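For anyone building their own FUSE filesystem, the corresponding knob in libfuse's low-level interface is the 'remember' mount option (which this tunable presumably maps to). A minimal sketch, assuming a typical libfuse 2.x low-level setup - not unRAID's code:

```c
/* Minimal sketch: forward a "remember" value to libfuse's low-level
 * interface so looked-up inodes are kept for ~330 seconds instead of
 * being forgotten shortly after the kernel drops them.  Option name as
 * documented for libfuse 2.9+; verify against your libfuse version. */
#define FUSE_USE_VERSION 26
#include <fuse_lowlevel.h>

int main(int argc, char *argv[])
{
    struct fuse_args args = FUSE_ARGS_INIT(argc, argv);

    /* roughly equivalent to unRAID's fuse_remember=330 */
    if (fuse_opt_add_arg(&args, "-oremember=330") == -1)
        return 1;

    /* ... then proceed with the usual fuse_mount()/fuse_lowlevel_new()
     * session setup, passing 'args' along ... */

    fuse_opt_free_args(&args);
    return 0;
}
```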

 

9 hours ago, limetech said:

 

If you set 'use cache' for the share to 'no' or 'only', the 'mover' will not operate on that share at all.

 

 

By default FUSE will free inodes after a delay (looking at the FUSE source code, the minimum is 10 seconds).  That is what the 'fuse_remember' tunable is for on the Settings/NFS page.  That is also the reason we tie that tunable to the NFS settings page.  When NFS is enabled, the default value of 330 corresponds to 5 1/2 minutes (when NFS is not enabled, a setting of 0 is passed to FUSE).  The typical client-side NFS handle cache time-to-live is 5 minutes.  You have to be careful with this setting, however, because if you are doing operations that touch a huge number of files, the FUSE memory footprint can keep growing.  This is related to an issue we're looking into right now.

 

Yes, one reason I asked was that I was specifically thinking about that memory leak thread. I have developed some FUSE applications of my own, but they supply inode values from a database (they work as a "time machine" and can present arbitrary disk snapshots based on backup times). But since my FUSE code will always supply the same inode for each presented file, I haven't seen any issue with leaked memory, even when the database contains many hundreds of millions of files.

6 hours ago, pwm said:

my FUSE code will always supply the same inode for each presented file

 

You're using the low-level interface?

1 minute ago, limetech said:

 

You're using the low-level interface?

Yes.

 

For simpler things - like presenting individual streams of a BD image - I use the high-level interface.


But the VFS for the backup server solution uses the low-level interface and hands over the database record ID as inode to FUSE. I haven't found much information about how FUSE itself handles inodes, though - it has its own inode field in its structures but seems to always duplicate the inode value I supplied.
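A stripped-down sketch of what that looks like with the low-level interface (the db_lookup() helper and db_record struct here are hypothetical stand-ins, not the actual backup-server code):

```c
/* Stripped-down sketch of a libfuse low-level lookup handler that
 * reports a database record ID as the inode.  The db_lookup() helper
 * and db_record struct are hypothetical stand-ins. */
#define FUSE_USE_VERSION 26
#include <fuse_lowlevel.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct db_record { uint64_t id; mode_t mode; off_t size; };

/* hypothetical database lookup; returns 0 on success */
int db_lookup(fuse_ino_t parent, const char *name, struct db_record *rec);

static void vfs_lookup(fuse_req_t req, fuse_ino_t parent, const char *name)
{
    struct db_record rec;
    if (db_lookup(parent, name, &rec) != 0) {
        fuse_reply_err(req, ENOENT);
        return;
    }

    struct fuse_entry_param e;
    memset(&e, 0, sizeof(e));
    e.ino          = rec.id;    /* FUSE's own inode field             */
    e.attr.st_ino  = rec.id;    /* the same value shows up in stat()  */
    e.attr.st_mode = rec.mode;
    e.attr.st_size = rec.size;
    e.attr_timeout  = 1.0;
    e.entry_timeout = 1.0;
    fuse_reply_entry(req, &e);
}

/* registered via the low-level ops table */
static const struct fuse_lowlevel_ops vfs_ops = { .lookup = vfs_lookup };
```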

7 minutes ago, pwm said:

hands over the database record ID as inode to FUSE

 

Your DB record ID is passed as the st_ino value you mean?  Do you see issues with NFS stale file handles?

Just now, limetech said:

 

Your DB record ID is passed as the st_ino value you mean?  Do you see issues with NFS stale file handles?

Yes, I place the DB record ID into the st_ino field before handing it over to the FUSE code.


I haven't used it much with NFS shares - most browsing is done either over SMB (with the FUSE VFS running on the storage server) or by running a copy of the FUSE VFS on the client machine and streaming the actual file data over a TLS-encrypted tunnel from the storage server.

 

I should really set up a dedicated test system stressing NFS - especially since the VFS also includes all my media files and could present suitable movie or music selections to media players. I already know my older Popcorn Hour and QNAP media players work much better with NFS than SMB for some movie titles.

 

One important difference here compared to shfs in unRAID is that my VFS only allows viewing of archived file data - i.e. read-only access. Writes to the storage server happen by having a backup client scan a directory tree and "check in" changes to the storage pool, but this doesn't involve any FUSE code. The VFS code just gets a notification that more file data has been "committed" to the storage server, so it can check whether new and/or changed files should be made visible in the presented VFS.

34 minutes ago, pwm said:

Yes, I place the DB record ID into the st_ino field before handing it over to the FUSE code.


I haven't used it much with NFS shares - most browsing is done either over SMB (with the FUSE VFS running on the storage server) or by running a copy of the FUSE VFS on the client machine and streaming the actual file data over a TLS-encrypted tunnel from the storage server.


I should really set up a dedicated test system stressing NFS - especially since the VFS also includes all my media files and could present suitable movie or music selections to media players. I already know my older Popcorn Hour and QNAP media players work much better with NFS than SMB for some movie titles.


One important difference here compared to shfs in unRAID is that my VFS only allows viewing of archived file data - i.e. read-only access. Writes to the storage server happen by having a backup client scan a directory tree and "check in" changes to the storage pool, but this doesn't involve any FUSE code. The VFS code just gets a notification that more file data has been "committed" to the storage server, so it can check whether new and/or changed files should be made visible in the presented VFS.

 

Sounds like a nice project.  Here's what happens with NFS.  As you know, all I/O is referenced against a file handle, which is an opaque field (though in practice it's easy to see how an OS forms a file handle; that's a major security hole IMHO and one reason I really hate NFS and would really like to rip it out of unRAID... but I digress...).  What you will find, if you ever have to support NFS, is that older clients, especially older media/DVD players, only support a 32-bit file handle field, even though the NFSv3 spec permits 64 bits.  You must keep this in mind if you want to be compatible with those devices.

 

Edit: that bit of knowledge just saved you days of debugging... you're welcome :D
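If anyone wants a cheap guard against that case, something along these lines would do (a hypothetical helper, not code from unRAID or the VFS discussed above):

```c
/* Hypothetical sanity check: warn when an inode value won't round-trip
 * through a client that only handles 32-bit inode/handle fields, as with
 * the older NFSv3 media players mentioned above. */
#include <stdint.h>
#include <stdio.h>

static void warn_if_ino_exceeds_32bit(uint64_t ino)
{
    if (ino > UINT32_MAX)
        fprintf(stderr,
                "warning: inode %llu does not fit in 32 bits; "
                "older NFS clients may misbehave\n",
                (unsigned long long)ino);
}
```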


Yes, it's a fun project. It also makes sure my unRAID machines don't get too bored since they are part of the storage pool - committed files are normally sent to multiple storage pools for redundancy. :)

 

But you did catch me with the NFS handle size. I think the physical files still just about fit in 32-bit numbers, but virtual views are created using inode values way outside the 32-bit range.

 

A lot can be said about NFS security, especially since most equipment runs NFSv3, which is limited to host-based authentication. But it works quite well for read-only access to media files.



