chrisbirkinshaw Posted March 6, 2013
If I don't do the rsync backups to UnRAID then I get around 1 month before it needs a reboot. The rsync backup always killed it, though that was with Plex running. Let's see how it fares without. Already 24 hours with no issues...
speeding_ant Posted March 8, 2013
Moving back to rc10 resolved the issue for me.
EMKO Posted March 11, 2013
So the web interface gets killed if you run out of memory? Why doesn't it start back up? How is it that I can run a lot of plugins without any of them closing, but the web interface does, even when I removed a bunch of plugins that I don't use, like couchpotato?
limetech Posted March 11, 2013
For those with this problem... these might sound like strange questions, but:
1. Do you have NFS enabled? Meaning, do you have Settings/Enable NFS set to 'Yes'?
If the answer to 1 was Yes, then:
2. Do you have cache_dirs running?
ClunkClunk Posted March 11, 2013
1. No.
2. No.
(rc10 for me though, I haven't yet moved to rc11, so discount my answer as needed)
bobbintb Posted March 12, 2013
Yes and yes. But I think I have also had the issue with NFS and without cache dirs. Not positive though.
liuzhen Posted March 12, 2013
I am also suffering from "Transport endpoint is not connected" errors in my log, which result in my user shares becoming unreachable, though my disk1 and disk2 drives are still reachable. My webgui still works though.

I've discovered it's something to do with the Plex Media Server scan, which scans my media library. I don't think it's a RAM issue, as I have 6 GB in my tower and it always had at least 3 GB free during the 'crashes' (whilst I was testing yesterday).

I've tried the scripts in the OP but they haven't done anything. I tried running rc8, rc10 and rc11 but they all have the same problems. I do not have NFS enabled. I just want to run Plex, Sab, Sickbeard and couchpotato.

If anyone has any suggestions it would be very much appreciated.
bobbintb Posted March 12, 2013
liuzhen wrote:
    I am also suffering from "Transport endpoint is not connected" errors in my log, which result in my user shares becoming unreachable, though my disk1 and disk2 drives are still reachable. My webgui still works though. I've discovered it's something to do with the Plex Media Server scan, which scans my media library. I don't think it's a RAM issue, as I have 6 GB in my tower and it always had at least 3 GB free during the 'crashes' (whilst I was testing yesterday). I've tried the scripts in the OP but they haven't done anything. I tried running rc8, rc10 and rc11 but they all have the same problems. I do not have NFS enabled. I just want to run Plex, Sab, Sickbeard and couchpotato. If anyone has any suggestions it would be very much appreciated.

I've had the exact same issues and I don't use Plex. 12 GB RAM. The fix listed here *seems* to help, as the issue is less frequent, but it still happens. It is probably just coincidence that it is less frequent though. I get no OOM errors in my log. I have new RAM so I doubt it's bad, but I'm going to run a memtest soon.
ajeffco Posted March 12, 2013
1) Yes
2) Yes and no. Normally yes, but also no, as I removed all plugins as detailed in my thread about it and the problem still occurred.

I didn't try disabling NFS, although I have no problem doing so if there's thought it might add some value to troubleshooting.

The only difference from the OP is that my console never goes away; it just gets confused and starts spitting out things like "Share deleted" if one tries to modify a share. Also, the disk shares don't go offline; only the user share goes offline when this happens.

Al
bobbintb Posted March 19, 2013
Still getting this issue. Tried every fix I could find. Nothing helps. In fact it is worse now than ever. It now happens every time I turn on my server and the usual fix for it no longer works. It seems to be permanently down.
htpcnewbie Posted March 23, 2013
Used to have this issue while running rc8 and attributed it to utserver. Moved to transmission since then and I am on rc11. The issue resurfaced again now.

I have the following in my go:

    #To prevent 'Transport endpoint error'
    pgrep -f "/usr/local/sbin/emhttp" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done
    pgrep -f "/usr/sbin/smbd" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done
    pgrep -f "in.telnetd" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done
    pgrep -f "/usr/local/sabnzbd/SABnzbd.py" | while read PID; do echo 1000 > /proc/$PID/oom_score_adj; done

Noticed that smb is still running but I cannot access /mnt/user:

    root@Tower:~# ps aux | grep smb
    root 2283 0.0 0.0 15808 3324 ? Ss Mar20 0:00 /usr/sbin/smbd -D
    root 2294 0.0 0.0 15820 1496 ? S Mar20 0:00 /usr/sbin/smbd -D
    root 4515 0.0 0.1 16644 4292 ? S 20:35 0:00 /usr/sbin/smbd -D
    root 24275 0.0 0.0 2452 584 pts/3 S+ 23:16 0:00 grep smb
    root@Tower:~# cd /mnt/user
    bash: cd: /mnt/user: Transport endpoint is not connected
    root@Tower:~# ps -elf | grep emhttp
    4 S root 1161 1 0 80 0 - 2957 inet_c Mar20 ? 00:00:26 /usr/local/sbin/emhttp
    0 S root 24027 22116 0 80 0 - 613 pipe_w 23:11 pts/2 00:00:00 grep emhttp
    root@Tower:~# cat /proc/1161/oom_adj
    -17
    root@Tower:~# cat /proc/1161/oom_score
    0
    root@Tower:~# cat /proc/1161/oom_score_adj
    -1000

Looks like oom_score is 0. Going to try setting oom_score to -17 in a desperate attempt to keep smb and emhttp going.. shooting in the dark.

Getting tired of these issues, please need a fix.
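A note on reading the values above: on Linux, oom_score is a read-only score computed by the kernel, oom_adj is the legacy knob (where -17 means "never kill"), and oom_score_adj is its replacement (where -1000 means the same thing). So the output already shows emhttp fully exempt from the OOM killer, which suggests the share failure has another cause. A minimal sketch for checking several processes at once follows; the function name report_oom is made up for illustration.

```shell
#!/bin/bash
# Print oom_score_adj and oom_score for each PID given on the command line.
# report_oom is a hypothetical helper name, not part of unRAID.
report_oom() {
    local pid
    for pid in "$@"; do
        if [ -r "/proc/$pid/oom_score_adj" ]; then
            printf '%s adj=%s score=%s\n' "$pid" \
                "$(cat "/proc/$pid/oom_score_adj")" \
                "$(cat "/proc/$pid/oom_score")"
        else
            # The PID has exited or never existed.
            printf '%s not-running\n' "$pid"
        fi
    done
}

# Example: report_oom $(pgrep -f /usr/local/sbin/emhttp) $(pgrep -f /usr/sbin/smbd)
```

If every protected process reports adj=-1000 and the shares still drop, the OOM killer is probably not what is taking them down.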
randall526 Posted March 24, 2013
I've been working this issue for a few weeks now, and it has gotten a little worse. Now just opening up the Plex manager causes the "Transport endpoint is not connected" error in my syslog. I've tried everything from looking for a corrupt file the Plex manager may be reading, to doing a reiserfs file check, to enabling NFS in the hope that the Plex server could still see its shares (nope). Doing some Google work, I've read that SMB uses two ports, and that this symptom has shown up when a client starts using the other port, which was solved by blocking it with iptables (haven't tried yet).

If I nail this one, I'll post it, but if anyone else is working this issue: it has been rather frustrating after dropping over 1k on this server, with Plex being the primary purpose I got it with unRAID.

My setup is unRAID 5.0-rc11 and Plex version v0.9.7.17.469-1f0b170, the latest of each. For plugins: simple features, unmenu, unRAIDWeb and unraid torrent, the latest version of each as far as I know.

I've also had to add the below to my go file to get Plex running. I toyed with the out-of-memory issue with the last 2 lines as well, to no avail.

    mv /usr/lib/libstdc++.so.6 /usr/lib/libstdc++.so.6.orig
    ln -s /usr/lib/libstdc++.so.6.0.13 /usr/lib/libstdc++.so.6
    pgrep -f "/usr/local/sbin/emhttp" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done
    pgrep -f "/usr/sbin/smbd" | while read PID; do echo -1000 > /proc/$PID/oom_score_adj; done

I was looking into Sickbeard, CouchPotato and recompiling the kernel for TVheadend, but I don't have enough faith in my server's ability to run Plex yet. I've always been able to solve my problems with a lot of Google work without posting, but since I dropped this amount on this server I decided it was time to post, as my investment has been driving me up a wall.
htpcnewbie Posted March 24, 2013
To the above post: I have been having the transport endpoint errors on rc8 without Plex. At that time my plugins were: Sab, Couchpotato, Sickbeard, utserver, Unmenu and simpleplugins. So the root of this issue seems not to be related to one plugin but to something more fundamental. I am not proficient enough in Linux to debug the issue, but I can provide syslogs and help with debugging.

With rc11 it has become so much worse that I cannot do a 4-cycle preclear without losing my unRAID access. I got a new 3TB HDD and have tried preclearing twice. The first time it was with 4 cycles (had to stop during the 3rd since there was no access to smb) and the second time it was 2 cycles (smb crashed during the 2nd). So the effective usability of the unRAID server without regular reboots seems to be iffy.

It would be helpful if Tom@Limetech or someone who knows the details of unRAID addressed this issue seriously.
dgaschk Posted March 24, 2013
Try rc12a. You may have an incompatible add-on.
randall526 Posted March 24, 2013
I wasn't even aware of RC12, didn't see it on the main download site. I was hoping for a new version to test too.
randall526 Posted March 24, 2013
Transferred and replaced the image as defined here: http://lime-technology.com/wiki/index.php?title=UnRAID_Server_Version_5.0-beta_Release_Notes

System booted fine, but the issue remains. Might be more Plex related. The second I start accessing the Plex web manager, bwam. Interestingly, I can upload movies and let the settings I configured before these issues auto-find new movies and update them in time, but I lost the ability to manage the server.

    Mar 24 21:36:30 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Data Storage Transport endpoint is not connected
    Mar 24 21:36:30 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Movies Transport endpoint is not connected
    Mar 24 21:36:30 Tower emhttp: get_filesystem_status: statfs: /mnt/user/plex Transport endpoint is not connected
    Mar 24 21:36:32 Tower mountd[24915]: refused mount request from 192.168.2.150 for /Data Storage (/): not exported <==== get this after enabling NFS as well.
    Mar 24 21:36:32 Tower mountd[24915]: refused mount request from 192.168.2.150 for /Data Storage (/): not exported
    Mar 24 21:36:33 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Application Support Transport endpoint is not connected
    Mar 24 21:36:33 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Backups Transport endpoint is not connected
    Mar 24 21:36:33 Tower emhttp: get_filesystem_status: statfs: /mnt/user/Data Storage Transport endpoint is not connected
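For anyone comparing logs, here is a rough sketch that boils syslog lines like the ones above down to the list of user shares affected. The function name affected_shares and the syslog path in the usage comment are assumptions; the pattern is taken from the emhttp lines in this post and allows for share names containing spaces.

```shell
#!/bin/bash
# Read syslog text on stdin and print each user share named in a
# "Transport endpoint is not connected" line, once per share.
# affected_shares is a hypothetical helper name.
affected_shares() {
    sed -n 's|.*statfs: /mnt/user/\(.*\) Transport endpoint is not connected.*|\1|p' | sort -u
}

# Example (log path is an assumption): affected_shares < /var/log/syslog
```

If every user share shows up at the same timestamp, as it does here, that points at the layer serving /mnt/user as a whole rather than at one share or one disk.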
randall526 Posted March 25, 2013
SOLVED SOLVED SOLVED <==== Helps me cut to the chase on other forums

As promised, I would post if I nailed this. Transport endpoint connection resolved.

I should have figured this one out sooner since I do this for a living, granted I'm a level 1 unix admin and not 2nd level where I work, so I am far from all-knowing. My scripting and coding is too weak. Anyway, applications stop working out of the clear blue quite often for me. After checking the normal stuff (hardware errors, filesystems, resource usage, which all looked OK in my case, as well as logs), we tend to check the ulimits under the account the application runs on when an app just, poof, has an error out of nowhere, as mine did. ulimits restrict a process from forking or using too many resources.

To check ulimits, use
    ulimit -a
To return ulimits under another account,
    ulimit -a unraid-plex

The item of interest, which in my experience is the most common ulimit that gives people trouble:
    open files (-n) 20000
This is after I set it with ulimit -n 20000; the default was 1024, way too low. I added this to my go file.

However, when the Plex application is killed off, the problem remains, so I started looking at smb ulimits, as my shares went down for everything. I added this to the /boot/config/ident.cfg file, which is the config the SMB settings tab writes to in the unRAID menu. Add this to the end:
    ulimit -n 20000
Then run the following:
    cd /root
    samba restart
Launch the webgui for unRAID and go to Plex. It should be NOT RUNNING if you are in the middle of a transport endpoint error. Restart Plex. Plex came back up for me after receiving a transport endpoint error for the first time without needing a reboot, yay.

I had the issue resurface as I was tweaking the ulimits, so I may not have them right, but I would say I'm on the right track since I am now back in my Plex manager, which I used to be able to set my watch to for crashing my shares.

In this case the issue would only arise after your library size grew. For me, when I hit the main menu as Plex was loading all of my movie titles, it would crash. Before that, when my library was a little smaller, it was intermittent, and it got progressively worse as the library grew in size, which would make sense with this particular ulimit.

I believe accounts are created with default ulimit settings, if I'm not mistaken, and the default ulimit for all users and applications could be set. I would suggest tweaking this ulimit in a next release to resolve these kinds of issues.

Hopefully this resolves someone's problem.
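One caveat when checking this: ulimit is a shell builtin, so it reports the limits of the shell it is typed into, not those of another account or an already-running daemon. To see how close a running process is to its open-file limit, its descriptors can be counted under /proc. A minimal sketch, assuming a Linux /proc and root access; the name open_fd_count is made up for illustration, and shfs and smbd are the process names mentioned elsewhere in this thread.

```shell
#!/bin/bash
# Print the number of file descriptors a process currently holds open.
# Prints 0 if the PID does not exist or its fd directory is unreadable.
# open_fd_count is a hypothetical helper name.
open_fd_count() {
    local pid=$1
    echo $(( $(ls "/proc/$pid/fd" 2>/dev/null | wc -l) ))
}

# Example: watch the count climb while Plex scans the library.
# for pid in $(pgrep -x shfs) $(pgrep -x smbd); do
#     echo "$pid $(open_fd_count "$pid")"
# done
```

A count sitting just under 1024 when the shares drop would fit the explanation above; a count nowhere near it would suggest looking elsewhere.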
Joe L. Posted March 25, 2013
Hey... I solved a very similar issue back on 4.5 by using this to start emhttp in the config/go script:

    #!/bin/bash
    # Start the Management Utility
    ulimit -n 20000;/usr/local/sbin/emhttp &

Since emhttp starts the network services, it handles everything and does not get overwritten the next time the ident.svcs file is written by unRAID.

Actually, I first suggested your exact solution here:
http://lime-technology.com/forum/index.php?topic=5004.msg47502;topicseen#msg47502
but found that setting ulimit before invoking emhttp solved all, since it in turn invoked smbd and smbd inherited the ulimit. I wrote about that in this post:
http://lime-technology.com/forum/index.php?topic=5004.msg95269;topicseen#msg95269
and then finally commented that 5.X had not yet been fixed here:
http://lime-technology.com/forum/index.php?topic=5004.msg93808;topicseen#msg93808
At that point (three years ago) it appeared as if the shfs process had the ulimit too low. I guess it was never fixed in the 5.X series. Tom responded about needing a 4.6.1 release to fix it, but the fix probably was not carried forward into the 5.X series:
http://lime-technology.com/forum/index.php?topic=5004.msg94646#msg94646

Congratulations... you are very good at tier-1 support. (About 30 years ago I supported our system administrators when they got stuck. Some things never change, and ulimit was an issue back then too.)

As described back then, you can invoke the following "sed" command to add the ulimit command to your config/go script. Type it once on the command line in a telnet session (easiest is to cut and paste it in a telnet window):

    sed -i "sX^/usr/local/sbin/emhttpXulimit -n 20000;/usr/local/sbin/emhttpX" /boot/config/go

The above "sed" (stream edit) line will change the line in the /boot/config/go script invoking emhttp from
    /usr/local/sbin/emhttp &
to
    ulimit -n 20000;/usr/local/sbin/emhttp &

Remember: after adding the command to set the ulimit, you'll need to stop the array and reboot.

Joe L.
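If editing the go script in place feels risky, the same substitution can be previewed first. This is a small sketch that runs Joe L.'s sed expression without -i, so the result goes to stdout and the file is left alone; the function name preview_go_patch is made up for illustration.

```shell
#!/bin/bash
# Print what the go script would look like after the substitution,
# without modifying the file (no -i flag).
# preview_go_patch is a hypothetical helper name.
preview_go_patch() {
    local go_file=$1
    # X is used as the s-command delimiter so the slashes in the path
    # need no escaping. The ^ anchor means a line that already starts
    # with "ulimit" is not matched again.
    sed 'sX^/usr/local/sbin/emhttpXulimit -n 20000;/usr/local/sbin/emhttpX' "$go_file"
}

# Example: preview_go_patch /boot/config/go | diff /boot/config/go -
```

Because of the ^ anchor, running the real command a second time on an already patched file changes nothing, so pasting it twice by accident is harmless.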
randall526 Posted March 25, 2013
Ahh, I wish I had found that thread. I was searching for my syslog error, which I don't think turned up in that thread. I think I might add your fix to emhttp as well.

Yeah, I get to support our tier 2's once in a blue moon, and I earn my badge of honor when I actually have the answers over some of the guys that have been in the field for 30+. Had one of those moments when no one on my team understood TCP congestion protocols with the network capture we were presented. A broken clock should get one right twice a day.
bobbintb Posted March 26, 2013
Wow, thank you so much randall526 and Joe L.! This issue has been plaguing me for MONTHS! Everything seems to be working fine now. Hopefully it doesn't go south again. Thank you both so much.
htpcnewbie Posted March 26, 2013
Thanks Joe and Randall. I will incorporate this in my go script. This particular issue had made my unraid system very unstable over the last few months. Wish there was a way to thank you guys for the helpful posts, maybe donate a beer button?
bobbintb Posted March 27, 2013
Dammit, it's happening again, although it seems better. Could I try setting ulimit to unlimited, or are the repercussions too great?
Joe L. Posted March 27, 2013
bobbintb wrote:
    Dammit, it's happening again, although it seems better. Could I try setting ulimit to unlimited, or are the repercussions too great?

If you have tons of files, set it higher. Worst case is exactly the same as having it too low.

Not sure you mentioned it, but you must reboot after adding the line to set the ulimit for it to take effect.

You might ALSO set the ulimit in any script where you are starting an add-on process. It is possible they do not have the higher ulimit if they are not being started through emhttp.
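For add-ons started from their own scripts, one way to apply this advice is a small wrapper that sets the limit in a subshell and then starts the program, so only that program and its children get the new value. A minimal sketch; the function name run_with_nofile and the add-on path in the usage comment are made up for illustration.

```shell
#!/bin/bash
# Run a command with its open-file limit set to the given value,
# leaving the calling shell's own limit unchanged.
# run_with_nofile is a hypothetical helper name.
run_with_nofile() {
    local limit=$1
    shift
    # The subshell confines the ulimit change; exec replaces the
    # subshell with the command, which inherits the new limit.
    ( ulimit -n "$limit" && exec "$@" )
}

# Example (path is hypothetical): run_with_nofile 20000 /usr/local/some-addon/start.sh &
```

If the limit cannot be set (for example, a non-root user asking for more than the hard limit), ulimit fails and the command is not started, so the wrapper returns a non-zero status instead of silently running with the old limit.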
bobbintb Posted March 28, 2013
bobbintb wrote:
    Dammit, it's happening again, although it seems better. Could I try setting ulimit to unlimited, or are the repercussions too great?

Joe L. wrote:
    If you have tons of files, set it higher. Worst case is exactly the same as having it too low. Not sure you mentioned it, but you must reboot after adding the line to set the ulimit for it to take effect. You might ALSO set the ulimit in any script where you are starting an add-on process. It is possible they do not have the higher ulimit if they are not being started through emhttp.

Yeah, I forgot to mention I rebooted. So far I just had it happen again the one time (instead of frequently to almost always). For me though, it only happens on start-up. It never happens any other time, such as when accessing too many files or being on too long, like it does for others. Maybe I just don't tax it enough through normal usage, or I don't leave it on long enough (only a few hours a day and 0000 to 0800). I will try setting it higher. I am planning on getting Crashplan soon, but that may all depend on whether it will lose its shit again when adding another plugin.

Also, what do you mean by set it higher if I have a ton of files? Are you talking about files in my array or all the files (packages) my plugins load? If you mean the array, how does that contribute to the issue?

As for setting ulimit in other scripts, I only have a small Python script that runs in the background from my go file. I can try that as well. But as far as I understand it, all PLGs are started through emhttp, so those should all get set the same, correct?
fatal Posted March 30, 2013
Joe L. wrote:
    Hey... I solved a very similar issue back on 4.5 by using this to start emhttp in the config/go script.
    #!/bin/bash
    # Start the Management Utility
    ulimit -n 20000;/usr/local/sbin/emhttp &

Added that to my go file, but even after reboot ulimit -a shows 1024 for Open Files. Here is my go file:

    #!/bin/bash
    # Start the Management Utility
    ulimit -n 20000;/usr/local/sbin/emhttp &
    echo nameserver 192.168.10.80 >/etc/resolv.conf
    echo 192.168.10.10 tower >>/etc/hosts
    while [[ ${LOOP:=10} -gt 1 && ! -b /dev/md1 ]]
    do
        (( LOOP=LOOP-1 ))
        echo "Waiting for /dev/md1 to come online ($LOOP)"
        sleep 1
    done
    sleep 1
    for disk in /dev/md*
    do
        blockdev --setra 2048 $disk
    done
    blockdev --setra 2048 /dev/sde # Parity Drive.

Do I have it in the right place or do I need to move it after the last line?

Edit: Moving it after the last line had no effect after reboot either... I can manually type "ulimit -n 20000" through telnet and it does update and shows the new value of 20000 when I do "ulimit -a".

Take Care
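A likely explanation, offered as an assumption rather than a confirmed diagnosis: resource limits are per-process and inherited from parent to child, so a telnet login shell, which is not a child of the go script, would still show its own default of 1024 even when emhttp was started with 20000. The value that matters is the one held by the running daemons, which can be read from /proc. A minimal sketch, assuming a Linux /proc; the function name nofile_limits is made up for illustration.

```shell
#!/bin/bash
# Print the soft and hard "Max open files" limits of a running process,
# read from /proc/<pid>/limits (columns: name, soft, hard, units).
# nofile_limits is a hypothetical helper name.
nofile_limits() {
    local pid=$1
    awk '/^Max open files/ {print "soft=" $4, "hard=" $5}' "/proc/$pid/limits"
}

# Example: check the processes that actually serve the shares.
# for pid in $(pgrep -x emhttp) $(pgrep -x smbd) $(pgrep -x shfs); do
#     echo "$pid $(nofile_limits "$pid")"
# done
```

If emhttp and the processes it starts report soft=20000 here, the go script line is in the right place and working, regardless of what ulimit -a says in the telnet window.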