gfjardim Posted May 10, 2012
OK, I've tracked down the AFP bug, and it's definitely caused by the database being stored on the disks/shares. In some scenarios the Berkeley DB files are stored on multiple disks, and the Netatalk database query timeout is set to seven seconds, so if the disks are in standby when the share is accessed, the AFP export will be mounted with temporary CNIDs. Even if the DB is stored on a single disk, if the spin-up/read task takes more than 7 seconds, the query will time out and temporary CNIDs will be used.
So there are two options: a) increase the query timeout; or b) store those DB files on a flash disk. I've done the second one with a mounted virtual disk, and it worked. I've searched through the Netatalk documentation, and apparently the first one can only be accomplished by patching the Netatalk source code.
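A rough way to see whether a spun-down disk could trip the seven-second timeout described above is to time a first read against it. This is only a sketch: `exceeds_timeout` and `CNID_TIMEOUT` are made-up names, the value 7 is copied from the post rather than read from Netatalk, and `/mnt/disk1` in the comment is an assumed path.

```shell
#!/bin/bash
# Hypothetical helper, not part of Netatalk or unRAID.
# The 7-second figure comes from the post above; it is not queried from Netatalk.
CNID_TIMEOUT=7

# Run a command, print how long it took (whole seconds), and
# succeed (exit 0) only if it took longer than CNID_TIMEOUT.
exceeds_timeout() {
    local start end elapsed
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed: ${elapsed}s"
    [ "$elapsed" -gt "$CNID_TIMEOUT" ]
}

# Example (assumed path): list a directory on a disk that may be in standby.
# if exceeds_timeout ls /mnt/disk1; then
#     echo "first read was slower than the CNID timeout"
# fi
```

Note that a directory listing may be answered from cache without waking the disk, so a fast result does not prove the disk was already spinning.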
Interstellar Posted May 10, 2012
Quote: OK, I've tracked down the AFP bug, and it's definitely caused by the database being stored on the disks/shares. In some scenarios the Berkeley DB files are stored on multiple disks, and the Netatalk database query timeout is set to seven seconds, so if the disks are in standby when the share is accessed, the AFP export will be mounted with temporary CNIDs. Even if the DB is stored on a single disk, if the spin-up/read task takes more than 7 seconds, the query will time out and temporary CNIDs will be used. So there are two options: a) increase the query timeout; or b) store those DB files on a flash disk. I've done the second one with a mounted virtual disk, and it worked. I've searched through the Netatalk documentation, and apparently the first one can only be accomplished by patching the Netatalk source code.
YES! I hate the fact that AFP requires all the disks to spin up just to access the DB files.
Also, I'm still having errors and unsteady transfer rates:
May 10 16:40:14 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:40:39 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:41:12 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:41:38 Tower last message repeated 15 times
May 10 16:41:39 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:41:39 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:41:53 Tower last message repeated 8 times
May 10 16:42:39 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:43:39 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:44:22 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:44:47 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:45:47 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:46:04 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:46:40 Tower last message repeated 4 times
May 10 16:46:46 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:46:47 Tower logger: Thu May 10 10:08:23 BST 2012 - Hard Drives active, resetting counter
May 10 16:46:50 Tower kernel: r8169 0000:02:00.0: eth0: link up (Network)
May 10 16:47:27 Tower last message repeated 2 times
The transfer rate bounces around from 0MB/sec to 100MB/sec. I'm going to be testing the hardware in Windows over the weekend, so I will test the NIC to see whether it is the hardware or the drivers.
Attached: syslog-2012-05-10.txt
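To put a number on how often the link is renegotiating in a syslog like the one above, the "link up" lines can be counted together with the "last message repeated N times" lines that follow them. This is a sketch under assumptions: `count_link_up` is a made-up name, and it assumes a "repeated" line refers to the log line immediately before it.

```shell
#!/bin/bash
# Hypothetical helper: count eth0 "link up" events in a syslog file,
# including repeats that syslog collapsed into "last message repeated N times".
count_link_up() {
    awk '
        /eth0: link up/ { n++; last_was_link = 1; next }
        /last message repeated [0-9]+ times/ {
            # Only add repeats when the previous real message was a link-up line.
            if (last_was_link) {
                for (i = 1; i <= NF; i++)
                    if ($i == "repeated") n += $(i + 1)
            }
            next
        }
        { last_was_link = 0 }
        END { print n + 0 }
    ' "$1"
}

# Example (file name taken from the attachment above):
# count_link_up syslog-2012-05-10.txt
```

A steadily climbing count while a transfer is running would point at the NIC, cable, switch port, or driver rather than at AFP.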
limetech Posted May 10, 2012
Quote: OK, I've tracked down the AFP bug, and it's definitely caused by the database being stored on the disks/shares.
Great work, thank you!
Quote: In some scenarios the Berkeley DB files are stored on multiple disks, and the Netatalk database query timeout is set to seven seconds, so if the disks are in standby when the share is accessed, the AFP export will be mounted with temporary CNIDs.
Interesting. I've found other places, both in the kernel and in other Linux subsystems, where possible disk spin-up time is not taken into account when defining various timeouts. This is apparently another one.
Quote: Even if the DB is stored on a single disk, if the spin-up/read task takes more than 7 seconds, the query will time out and temporary CNIDs will be used. So there are two options: a) increase the query timeout; or b) store those DB files on a flash disk. I've done the second one with a mounted virtual disk, and it worked. I've searched through the Netatalk documentation, and apparently the first one can only be accomplished by patching the Netatalk source code.
This issue will provide a good illustration of what should go in a "beta" and what should go in an "rc". I've been thinking a lot about how Netatalk handles the CNID translation and how best to integrate this into unRaid. My conclusion is that the best overall solution is to let the user configure a separate space to store the DB, along with some buttons to facilitate creating a new DB, checking the DB, etc. The idea is to store the DB on the Cache drive or another external drive (perhaps an SSD). I think this "feature" will help AFP users quite a bit and make for a better AFP experience with unRaid. However, the issue can be "fixed" by patching the Netatalk code to increase the DB access timeout.
The first solution is probably on the order of 100 or so lines of new code; the second probably means modifying 1 line of code (though it does require finding that line and understanding the code at least well enough to know this would probably work). So the former will have to wait for 5.1; the latter I will put into -rc4.
savestheday Posted May 10, 2012
I think there is a way to configure Netatalk via config file, no? I'm gonna have to look through my notes but pretty sure I do some AFP config on boot.
Either way, this is great news. AFP is very painful right now for us Mac users. Thx Tom!
gfjardim Posted May 10, 2012 Author
Quote: I think there is a way to configure Netatalk via config file, no? I'm gonna have to look through my notes but pretty sure I do some AFP config on boot. Either way, this is great news. AFP is very painful right now for us Mac users. Thx Tom!
To change the database storage path, all you need to do is change AppleVolumes.default from:
:DEFAULT: cnidscheme:dbd options:upriv,usedots,nodev
to:
:DEFAULT: cnidscheme:dbd dbpath:/path-of-choice options:upriv,usedots,nodev
I do that with "sed" in my "go" file to replace that line in "/etc/netatalk/AppleVolumes.default-".
I've looked briefly into the Netatalk source and found no easy way to increase the timeout interval. It's not exposed as a variable, so Tom is probably right about the amount of coding required.
PS: Maybe those two variables at lines 50 and 51 in this file can do the trick? http://fossies.org/linux/misc/netatalk-2.2.2.tar.gz:a/netatalk-2.2.2/libatalk/cnid/dbd/cnid_dbd.c
savestheday Posted May 10, 2012
Quote: I think there is a way to configure Netatalk via config file, no? I'm gonna have to look through my notes but pretty sure I do some AFP config on boot. Either way, this is great news. AFP is very painful right now for us Mac users. Thx Tom!
Quote: To change the database storage path, all you need to do is change AppleVolumes.default from: :DEFAULT: cnidscheme:dbd options:upriv,usedots,nodev to: :DEFAULT: cnidscheme:dbd dbpath:/path-of-choice options:upriv,usedots,nodev I do that with "sed" in my "go" file to replace that line in "/etc/netatalk/AppleVolumes.default-". I've looked briefly into the Netatalk source and found no easy way to increase the timeout interval. It's not exposed as a variable, so Tom is probably right about the amount of coding required. PS: Maybe those two variables at lines 50 and 51 in this file can do the trick? http://fossies.org/linux/misc/netatalk-2.2.2.tar.gz:a/netatalk-2.2.2/libatalk/cnid/dbd/cnid_dbd.c
Ahh yeah, I knew it was something! This is what I have in my GO script:
cp /boot/scripts/AppleVolumes.default- /etc/netatalk/AppleVolumes.default-
My AppleVolumes.default- has some special settings towards the bottom:
:DEFAULT: cnidscheme:dbd options:upriv,usedots,nocnidcache dbpath:/mnt/cache/.appledbloc/$v
Not sure that dbpath has ever improved it that much, but maybe I should consider moving that to the flash drive.
Would love to have your SED command as it's prolly much cleaner and SED confuses the crap out of me. Thx!
gfjardim Posted May 10, 2012 Author
Quote: Would love to have your SED command as it's prolly much cleaner and SED confuses the crap out of me. Thx!
This is what I use:
sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/vmware\//g" /etc/netatalk/AppleVolumes.default-
Each slash or quote has to be escaped with a backslash.
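The escaping in the command above can be avoided by giving sed a different delimiter, and the edit can be made safe to run more than once. The following is a minimal sketch, not the author's method: `set_dbpath` is a made-up function name, it assumes GNU sed (for `-i`), and it assumes the chosen path contains no `|` or `&` characters.

```shell
#!/bin/bash
# Hypothetical helper: add a dbpath option after "cnidscheme:dbd" in a
# Netatalk volumes file, unless a dbpath option is already present.
set_dbpath() {
    local dbpath=$1 conf=$2
    # Skip the edit if a dbpath is already configured, so repeated runs
    # of the go script do not stack several dbpath options on one line.
    if grep -q 'dbpath:' "$conf"; then
        return 0
    fi
    # Using | as the s-command delimiter means slashes in the path
    # need no backslashes.
    sed -i "s|cnidscheme:dbd|cnidscheme:dbd dbpath:${dbpath}|" "$conf"
}

# Example for a go script (paths taken from the posts in this thread):
# set_dbpath /vmware/ /etc/netatalk/AppleVolumes.default-
```

The `grep` guard also leaves alone a file such as the one posted earlier that already carries its own `dbpath:` setting.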
savestheday Posted May 10, 2012
Quote: Would love to have your SED command as it's prolly much cleaner and SED confuses the crap out of me. Thx!
Quote: This is what I use: sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/vmware\//g" /etc/netatalk/AppleVolumes.default- Each slash or quote has to be escaped with a backslash.
Much appreciated!
limetech Posted May 10, 2012
Quote: Not sure that dbpath has ever improved it that much, but maybe I should consider moving that to the flash drive.
I wouldn't put the DB on the flash device because it could shorten the device's life if too many writes happen during operation.
Interstellar Posted May 10, 2012
Quote: Would love to have your SED command as it's prolly much cleaner and SED confuses the crap out of me. Thx!
Quote: This is what I use: sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/vmware\//g" /etc/netatalk/AppleVolumes.default- Each slash or quote has to be escaped with a backslash.
What line(s) of code would I need to put into my go script to put the database:
A) Into memory (/var/tmp?)
B) cache drive in folder "Other" so path is "/mnt/cache/Other"
I'd like to test both options!
Quote: Not sure that dbpath has ever improved it that much, but maybe I should consider moving that to the flash drive.
Quote: I wouldn't put the DB on the flash device because it could shorten the device's life if too many writes happen during operation.
Could this be stored in memory? It is no issue if the database is lost (I regularly delete them), but for 5.1, if you could save it to memory and then read/write it at array start/stop, could that speed things up? That would really speed things up for people with lots of memory!
Thanks
gfjardim Posted May 10, 2012 Author
Quote: A) Into memory (/var/tmp?)
sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/var\/tmp\//g" /etc/netatalk/AppleVolumes.default-
Quote: B) cache drive in folder "Other" so path is "/mnt/cache/Other"
sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/mnt\/cache\/Other\//g" /etc/netatalk/AppleVolumes.default-
You can store it in memory, e.g. /var/tmp, but every time you restart your server it will need to rebuild the database from scratch, which can take a lot of time on your first access to the share. Another possible problem is that if you have a lot of files, the database can easily exceed 20MB in size.
Interstellar Posted May 11, 2012
Quote: A) Into memory (/var/tmp?) sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/var\/tmp\//g" /etc/netatalk/AppleVolumes.default- B) cache drive in folder "Other" so path is "/mnt/cache/Other" sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/mnt\/cache\/Other\//g" /etc/netatalk/AppleVolumes.default- You can store it in memory, e.g. /var/tmp, but every time you restart your server it will need to rebuild the database from scratch, which can take a lot of time on your first access to the share. Another possible problem is that if you have a lot of files, the database can easily exceed 20MB in size.
Well, if the database were then written to disk at shutdown, wouldn't that solve the issue? But I suspect that requires more work!
Cheers, will try the code now!
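The "write it to disk at shutdown" idea could be sketched as a pair of copies: restore the saved database into memory at boot, and copy it back before power-off. Nothing in this thread confirms that unRAID or Netatalk supports this, so treat it purely as an illustration: `sync_cnid_db` is a made-up name, `/var/tmp/cnid` and `/mnt/cache/Other/cnid` are assumed directories, and Netatalk is assumed to be stopped while the copy runs so the database files are not changing underneath it.

```shell
#!/bin/bash
# Hypothetical helper: copy the contents of one database directory to another.
sync_cnid_db() {
    local src=$1 dst=$2
    # Nothing to copy yet (e.g. first boot before any database exists).
    [ -d "$src" ] || return 0
    mkdir -p "$dst"
    # "src/." copies the directory contents, including dotfiles,
    # and -a preserves ownership, permissions, and timestamps.
    cp -a "$src"/. "$dst"/
}

# At boot, before Netatalk starts (assumed paths):
# sync_cnid_db /mnt/cache/Other/cnid /var/tmp/cnid
# At shutdown, after Netatalk has stopped:
# sync_cnid_db /var/tmp/cnid /mnt/cache/Other/cnid
```

An unclean shutdown would lose whatever changed since the last copy, which is the trade-off against keeping the database on the cache drive directly.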
madburg Posted May 11, 2012
Quote: ... So the former will have to wait for 5.1, the latter I will put into -rc3.
Re-reading this, so we should look forward to testing this out on a cache drive in -RC4, yes?
limetech Posted May 11, 2012
Quote: ... So the former will have to wait for 5.1, the latter I will put into -rc3.
Quote: Re-reading this, so we should look forward to testing this out on a cache drive in -RC4, yes?
Right, a mis-type, fixed.
Bizarro Posted May 12, 2012
Updated to RC3, running a parity check at around 71-84MB/sec across 14 drives (19TB); seems to run OK. I'd say the NFS issues will continue to happen; otherwise running OK. Only major addons are Simplefeatures, Unmenu, and Plex.
Can't wait until NFS and AFP are fixed so I don't have to use SMB.
jgs2n Posted June 8, 2012
Just wondering if there is a plan to deal with this, since the potential changes for rc4 did not work out. It would be great to have Time Machine work without issue.
limetech Posted June 8, 2012
Quote: Just wondering if there is a plan to deal with this, since the potential changes for rc4 did not work out. It would be great to have Time Machine work without issue.
Yes, working on a fix.
Interstellar Posted June 8, 2012
If it is of any interest, I have been running with the config as follows:
B) cache drive in folder "Other" so path is "/mnt/cache/Other"
sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/mnt\/cache\/Other\//g" /etc/netatalk/AppleVolumes.default-
It has greatly sped up access times and stopped the database errors. I have not, however, been running it with TM, as I've stopped TM'ing to the server (performance reasons).
Tom, would it be even faster if this database were loaded into memory at boot and written back at shutdown? It will of course increase boot/shutdown times, but given we rarely turn our servers off, this is a possibility for people who want the speed and have the memory?
papester Posted August 24, 2012
Quote: Just wondering if there is a plan to deal with this, since the potential changes for rc4 did not work out. It would be great to have Time Machine work without issue.
Quote: Yes, working on a fix.
Has a solution to this issue been released in any of the -rc releases?
thejinx0r Posted August 29, 2012
Hi, I just saw this thread, and I want to say that this worked for me too.
mmccurdy Posted August 30, 2012
Quote: If it is of any interest, I have been running with the config as follows: B) cache drive in folder "Other" so path is "/mnt/cache/Other" sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/mnt\/cache\/Other\//g" /etc/netatalk/AppleVolumes.default-
Having just found this thread, I tried this in a last-ditch effort to solve my AFP issues (http://lime-technology.com/forum/index.php?topic=21638.0), and I got a popup error message on my Mac Mini after about 20 or 30 minutes of light disk usage saying that there was something wrong with the CNID database, and that it was using a temporary one and switching to read-only mode.
Something to consider maybe if AFP support remains a priority, which it seems like it may not be.
Interstellar Posted October 26, 2012
I've been running a new command as follows:
sed -i "s/cnidscheme:dbd/cnidscheme:dbd dbpath:\/boot\//g" /etc/netatalk/AppleVolumes.default-
And it keeps giving errors:
Oct 26 20:06:29 Tower afpd[3129]: afp_disconnect: primary reconnect failed
Oct 26 20:06:29 Tower cnid_dbd[16692]: dbd_add(DID: 3075/"Folder", dev/ino 0x0/0x91): Cannot add CNID: 3184
Oct 26 20:06:29 Tower afpd[3129]: ===============================================================
Oct 26 20:06:29 Tower afpd[3129]: INTERNAL ERROR: Signal 11 in pid 3129 (2.2.3) (Errors)
Oct 26 20:06:29 Tower afpd[3129]: ===============================================================
Oct 26 20:06:29 Tower afpd[3129]: BACKTRACE: 10 stack frames:
Oct 26 20:06:29 Tower afpd[3129]: #0 /usr/sbin/afpd(netatalk_panic+0x2d) [0x8097edd] (Drive related)
Oct 26 20:06:29 Tower afpd[3129]: #1 /usr/sbin/afpd() [0x809804d]
Oct 26 20:06:29 Tower afpd[3129]: #2 [0xb7740400]
Oct 26 20:06:29 Tower afpd[3129]: #3 /usr/sbin/afpd(dir_add+0x453) [0x80618a3]
Oct 26 20:06:29 Tower afpd[3129]: #4 /usr/sbin/afpd() [0x806555d]
Oct 26 20:06:29 Tower afpd[3129]: #5 /usr/sbin/afpd(afp_over_dsi+0x565) [0x80549c5]
Oct 26 20:06:29 Tower afpd[3129]: #6 /usr/sbin/afpd() [0x8053b24]
Oct 26 20:06:29 Tower afpd[3129]: #7 /usr/sbin/afpd(main+0xa07) [0x8071757]
Oct 26 20:06:29 Tower afpd[3129]: #8 /lib/libc.so.6(__libc_start_main+0xe6) [0xb732ab86]
Oct 26 20:06:29 Tower afpd[3129]: #9 /usr/sbin/afpd() [0x80539a1]
Any ideas why it would do this?