Serious issue with AFP/NFS: random directories/files temporarily not accessible



I'm really trying to get a workable AFP or NFS solution running between my Mac Mini and my unRAID box. Unfortunately, SMB is a non-starter for me: I have thousands of files, many with long names or special characters that seem to be fundamentally incompatible with SMB/CIFS.

 

I first noticed issues when I was copying files onto the array via rsync to an AFP mount. During a large transfer (I was basically doing a terabyte at a time), I noticed that a handful of files would inexplicably fail with permission errors. I wrote this issue up in the AFP forum here: http://lime-technology.com/forum/index.php?topic=21155.0 but I didn't get much traction over there.

 

When I saw that the NFS stale file handle issue was addressed, I upgraded to -rc6-8168-test and tried switching to NFS.  Indeed, I did not get stale file handle errors, but I did see some very similar behavior where files and directories would just "go missing" and then mysteriously reappear.  I wrote this issue up in this post here: http://lime-technology.com/forum/index.php?topic=21377.msg191887#msg191887

 

Given the similarity between these two issues, it seems like it may be something more fundamental with how AFP/NFS are playing with the user shares.

 

The issue seems worse under load, but I've seen it happen "at rest" too, as in the NFS issue writeup.  It'll basically just be sitting there, and then refuse to see or enter existing/readable directories when I "cd" into them.

 

In part I'm using unRAID for storage for a media server, and I have a scanner process that looks for new files every so often or when something is added.  One super-annoying consequence of this issue in this application is that TV shows and movies will disappear during one scan, then reappear during the next scan with no other config or underlying FS changes in between.  It also just seems like a very serious issue if you can't count on all your files and directories being accessible at any given time.

 

I'm attaching a full syslog where I do a fresh boot and then do a couple of scans from my media server.  As you'll see, there's very little to go on, except for a few messages at the very end (generated during the scans) about duplicate .DS_Store objects.  I don't know if that's a clue or not. 

 

Nothing else that I can think to check seems amiss with the system -- all my drives are green, parity is valid, etc.

 

If anyone has ideas, please let me know what else I can do here.  I'm more than willing to do any kind of tweaking and debugging to figure this out, but I'm kind of at a dead end.  Thanks a lot in advance...

syslog-2012-07-23.txt

Link to comment

I turned off all NFS sharing, and I'm having a hard time nailing down this problem to where I can recreate the "ls" problem using AFP.

 

To be sure, it's still failing/missing random directories and files in the media scans (and rsyncs and large transfers with lots of files), but when I go to do the "ls" it lists the directory.  It seems like when I use AFP the problem is much more transient -- I can run an rsync and it'll fail on a handful of files once, and run it just a second or two later and it will work for those same files.
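One way to pin down when paths drop out, rather than catching it by hand with "ls", is to poll a saved list of known-good paths and log every miss with a timestamp. This is only a rough sketch: the function name `probe_once`, the mount point `/Volumes/Media`, and the file names are hypothetical and not from the posts above.

```shell
#!/bin/sh
# Sketch only: check that every path listed in a file is still visible.
# Prints "MISSING <timestamp> <path>" for each path that is not, and
# returns 1 if anything was missing, 0 otherwise.
probe_once() {
    missing=0
    while IFS= read -r p; do
        if [ ! -e "$p" ]; then
            printf 'MISSING %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$p"
            missing=1
        fi
    done < "$1"
    return "$missing"
}

# Example use (hypothetical paths):
#   find /Volumes/Media -maxdepth 2 -type d > expected.txt   # while all is well
#   while :; do probe_once expected.txt >> probe.log; sleep 30; done
```

The timestamps in the log can then be lined up against the unRAID syslog to see whether the misses coincide with anything on the server side.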

 

I have to ask -- are people out there heavily using AFP/NFS with Mac clients and not seeing this issue?  It seems like the vast majority of unRAID users are just using SMB.  I'm pretty bummed because I've invested a lot of time and money in this solution (not realizing this limitation) but this is definitely not workable as-is.

Link to comment

I'm using AFP and have used NFS under 4.7. No problems with either.

 

Disable sharing for the share first, then delete .AppleDB, .AppleDesktop, and .AppleDouble from the share (i.e. delete .Apple*). Make sure to delete any duplicates that may exist on multiple disks included in the share, e.g. "rm -r /mnt/disk*/sharename/.Apple*".

Link to comment

Hey, thanks for the reply and for taking a look at my issue.  I actually didn't realize AFP was supported under 4.7, but in any case 4.7 doesn't seem to support my NIC (newer Intel Gbit on an Asus motherboard), so that's not an immediate option.  I actually wouldn't mind picking up a compatible PCIe NIC, but I'd rather not be stuck in 4.7 land as unRAID moves on to 5.0 and beyond.

 

Previously, I had deleted all .DS_Stores and set my Mac Mini not to write them to network drives anymore.  They have not reappeared, and I'm no longer getting the error in the log about the duplicate objects, so that seems good.
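The post does not say how the Mac was set to stop writing .DS_Store files to network drives. One commonly documented way, shown here as an assumption rather than as what was actually done, is the `DSDontWriteNetworkStores` preference in the `com.apple.desktopservices` domain:

```shell
# Run on the Mac client, not on the unRAID box.
# The uname guard only keeps the snippet harmless if pasted elsewhere.
if [ "$(uname)" = "Darwin" ]; then
    # Tell Finder not to create .DS_Store files on network volumes.
    defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
    # Read the value back to confirm the preference was stored.
    defaults read com.apple.desktopservices DSDontWriteNetworkStores
fi
```

The setting is per user and typically takes effect after logging out and back in.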

 

Per your suggestion, I just deleted all the .Apple*'s, but I had to use the following:

 

find . -name ".Apple*" -printf '"%p"\n' | xargs rm -Rf

 

Nevertheless, I verified they are gone.  I'll keep an eye on it to see if things improve.  Thanks again for your help.
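For anyone repeating this cleanup, a variant that passes names NUL-separated avoids the manual quoting and also copes with names that contain quote characters. This is a sketch, assuming GNU or BSD `find` and `xargs` with `-print0`/`-0` support; the function names are made up for illustration and `/mnt/disk*/sharename` is the placeholder path from the earlier reply.

```shell
# Dry run: list the .Apple* entries that would be removed under the given roots.
list_apple_metadata() {
    # -prune keeps find from descending into directories that match.
    find "$@" -name '.Apple*' -prune -print
}

# Remove the .Apple* entries under the given roots.
remove_apple_metadata() {
    # -print0 / -0 pass names NUL-separated, so spaces and quotes are safe.
    find "$@" -name '.Apple*' -prune -print0 | xargs -0 rm -rf
}

# Example use (placeholder path; disable sharing for the share first):
#   list_apple_metadata /mnt/disk*/sharename
#   remove_apple_metadata /mnt/disk*/sharename
```

Running the listing function first shows exactly what the removal would touch.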

Link to comment

Unfortunately, even after deleting all the Apple metadata I could find (.DS_Stores, the .Apple*'s and a bunch of ._[filename] files that somehow got created), the behavior of the server is exactly the same, with files and directories being intermittently inaccessible.

 

I am racking my brain for why this seems to be happening only to me. I'm using standard, modern hardware across the board, everything is on a hardwired gigabit connection with a single switch between the client and the unRAID box, there's nothing special about my Mac Mini or OS X install, all config options for AFP are at their defaults, etc.

 

Any other ideas welcome.  Thanks again for taking a look.

Link to comment

I may have the same issue with RC3. I have noticed folders "shadowed" out, and some issues with iTunes. I was just able to look at one right now: the drive where the folder resides was spun down, and after forcing a spin-up on all drives, the folder became available. It could be that my unRAID box is a 10-year-old Dell Pentium and I am also copying stuff to it. I don't know. Just a data point...

Link to comment
