emhttp segfault


Sigh.  Problem has returned.  What else could it be?  Bad flash stick?

 

Feb 26 19:27:45 freenas emhttp: unRAID System Management Utility version 5.0-beta14
Feb 26 19:27:45 freenas emhttp: Copyright (C) 2005-2011, Lime Technology, LLC
Feb 26 19:27:45 freenas emhttp: Pro key detected, GUID: 0781-556B-3109-3213ECB1XXXX
Feb 26 19:27:45 freenas emhttp: rdevName.22 not found
Feb 26 19:27:45 freenas emhttp: diskFsStatus.1 not found
Feb 26 19:27:45 freenas kernel: emhttp[4944]: segfault at 0 ip b746e760 sp bff82dc0 error 4 in libc-2.11.1.so[b73f5000+15c000]

No, just a bug in emhttp where it is attempting to reference variables that are apparently not initialized.  Report it to tomm@lime-technology.
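For anyone trying to read the kernel line in that log, here is a small Python sketch that pulls it apart. The field layout is taken from the log lines posted in this thread; the meaning of the error bits (bit 0 = page was present, bit 1 = write, bit 2 = user mode) is the standard x86 page-fault error code. The function name `decode_segfault` is just something made up for this example.

```python
import re

# Layout of the kernel's segfault report, as it appears in the syslog excerpts here.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_segfault(line):
    """Return the decoded fields of a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    err = int(m.group("err"))
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "library": m.group("lib"),
        "null_dereference": addr == 0,          # "segfault at 0"
        # instruction pointer relative to where the library was mapped
        "offset_in_library": int(m.group("ip"), 16) - int(m.group("base"), 16),
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }

line = ("Feb 26 19:27:45 freenas kernel: emhttp[4944]: segfault at 0 ip b746e760 "
        "sp bff82dc0 error 4 in libc-2.11.1.so[b73f5000+15c000]")
info = decode_segfault(line)
print(hex(info["offset_in_library"]))  # 0x79760
```

So "segfault at 0 ... error 4" is a user-mode read of address 0, which fits the "uninitialized variable" explanation. Without symbols for that libc build the offset does not say which libc function it is, only where in the library the read happened.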
Link to comment

I have this same issue as well. I just built my unRAID server, and within an hour the web management page went down. The emhttp process still shows as running in ps aux, and I can telnet to port 80, but IE and Chrome both report that the page cannot be displayed. I killed the emhttp process and tried '/usr/local/sbin/emhttp &' to restart it, but I get the same segfault messages in syslog that Electric posted. I am in the middle of a preclear on a drive, so I can't just reboot the box to get the management page back.

Link to comment

Anyone that has this, do you have sabnzbd running? Check your memory using free -m to see how much is available and how much is cached...I wonder if the kernel used in the beta builds isn't releasing the memory from cache....

 

This is what I've done: moved a bunch of files over to the server and manually invoked the mover script. I watched the memory fill up, and before long emhttp stopped with a segfault. Anything that uses a lot of memory fills it up and it isn't released (which I know isn't abnormal). So I turned off cache_dirs and freed memory, and so far today I haven't gone above 2500 MB used. And NO segfault.

 

So I'm about to test by downloading a bunch of files and seeing if the memory fills up and if I get a segfault. After doing this tonight I plan on turning cache_dirs back on and doing it again trying to reproduce the segfault.

Link to comment
  • 2 months later...

I've had an issue with emhttp crashing since moving to 5bx and starting to use plugins (sabnzbd, sickbeard, couchpotato and unraid).  I see the same segfault errors as others in my log:

 

Tower kernel: emhttp[17371]: segfault at 0 ip b754d760 sp bfe00dd0 error 4 in libc-2.11.1.so[b74d4000+15c000]

 

I'm now running 5RC3 with only plex on a clean install and still having this issue:  here's my output with free -m

 

             total       used       free     shared    buffers     cached
Mem:          2015       1951         63          0         43       1747
-/+ buffers/cache:        160       1855
Swap:            0          0          0

 

I can't seem to restart emhttp; I receive a segfault error. So now a couple of questions:

1. Was this issue ever solved?
2. Is this memory output normal? Why is so much cached?
3. Influencer mentioned freeing memory and turning cache_dirs off. How do I do this?

Thanks for the help.

Link to comment

The memory output is normal. You are actually using very little right now; most of it is being used by cache. This is where I believe my problem comes in: I think cache_dirs was using so much memory, and it's designed to hold on to it, so the OS can't free it for other applications to use (in this case, emhttp). This normally isn't a problem, but if it uses too much it appears to cause the segfault.
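To put numbers on the "most of it is cache" point, here is a rough Python sketch that splits the Mem: row of `free -m` into what applications hold and what is buffers/cache. It assumes the older column layout shown in this thread (total, used, free, shared, buffers, cached); `summarize_free` is a made-up name, and the sample is the output posted above.

```python
def summarize_free(output):
    """Split the 'Mem:' row of old-style `free -m` output into app vs. cache use (MB)."""
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            total, used, free, shared, buffers, cached = map(int, fields[1:7])
            return {
                "total": total,
                "used_by_applications": used - buffers - cached,
                "buffers_and_cache": buffers + cached,
                "free_plus_cache": free + buffers + cached,
            }
    raise ValueError("no 'Mem:' line found")

SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:          2015       1951         63          0         43       1747
-/+ buffers/cache:        160       1855
Swap:            0          0          0
"""

print(summarize_free(SAMPLE))
```

For the sample this gives 161 MB used by applications and 1790 MB in buffers and cache, which is what the "-/+ buffers/cache" row already reports (160 and 1855 there; the small differences come from rounding to whole megabytes).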

 

Anyway, if you want to see if it is cache_dirs causing it, this would do it:

If you are running simple_features, cache_dirs settings can be located under Settings - Folder Caching

 

If you are not using cache_dirs from simple_features, then this line will stop any instance of cache_dirs running:

cache_dirs -q

 

Note that unRAID does not use cache_dirs by default; if it is running, you set it up. If you did not set it up, then cache_dirs is not what's causing your issues.

Run this line to clear the cache out of RAM:

sync; echo 3 > /proc/sys/vm/drop_caches
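For reference, that one-liner does two things: `sync` flushes dirty data to disk, and the number written to drop_caches tells the kernel what to release (1 = page cache, 2 = dentries and inodes, 3 = both). Below is the same thing as a Python sketch, assuming Python 3 is installed on the box (nothing in this thread says it is) and that it is run as root. The `path` parameter is only there so the function can be tried against a scratch file.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

# Values the kernel accepts in the drop_caches control file.
LEVELS = {
    1: "page cache only",
    2: "dentries and inodes only",
    3: "page cache, dentries and inodes",
}

def drop_caches(level=3, path=DROP_CACHES):
    """Flush dirty data, then ask the kernel to release clean caches."""
    if level not in LEVELS:
        raise ValueError("level must be 1, 2 or 3")
    os.sync()  # same job as the `sync` in the one-liner
    with open(path, "w") as f:
        f.write("%d\n" % level)
    return LEVELS[level]
```

This only releases clean cache once; the kernel starts filling it again straight away, so it is a test aid rather than a fix.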

 

After that, use your server as you normally do and try the things that usually caused the segfault. Run it like this for a few days, or however long it usually takes for you to see a segfault.

For me there were two instances of cache_dirs running: one was the script, started by a line added to the go file, and one was running from simple_features. Together they used too much RAM and caused the segfault.

Now only one runs, and it is set to cache only 3 levels; any deeper than that and I'm most likely going to be playing a movie, so the disk will spin up anyway.

Link to comment
  • 3 weeks later...

K, so I'm seeing web gui gone away - no response from web server but emhttp is "running".  By this I mean that the process is still running.

 

Killing it and trying to restart it returns Segmentation Fault to the screen and:

 

Jun  4 09:27:17 server emhttp: unRAID System Management Utility version 5.0-rc3
Jun  4 09:27:17 server emhttp: Copyright (C) 2005-2012, Lime Technology, LLC
Jun  4 09:27:17 server emhttp: Pro key detected, GUID: 058F-6387-0000-0000E801C488
Jun  4 09:27:18 server emhttp: rdevName.22 not found
Jun  4 09:27:18 server emhttp: diskFsStatus.1 not found
Jun  4 09:27:18 server kernel: emhttp[29547]: segfault at 0 ip b7499760 sp bfa30850 error 4 in libc-2.11.1.so[b7420000+15c000] 

in syslog

 

I am no longer running cache_dirs but do have Sab, Sick, Couch, mysql, Plex and Crashplan.  Memory status:

root@server:/var/log# free -m
             total       used       free     shared    buffers     cached
Mem:          3854       3736        118          0        168       2959
-/+ buffers/cache:        608       3245
Swap:            0          0          0

 

stopping unraid (and all my plugins) and running emhttp in the foreground:

root@server:/var/log# /usr/local/sbin/emhttp
mkdir: cannot create directory `/mnt/disk1': File exists
mkdir: mkdir: cannot create directory `/mnt/disk3'cannot create directory `/mnt/disk2': File exists: File exists

mkdir: cannot create directory `/mnt/disk9': File exists
mkdir: mkdir: cannot create directory `/mnt/disk4'cannot create directory `/mnt/disk7': File exists: File exists

mkdir: cannot create directory `/mnt/disk8': File exists
mkdir: cannot create directory `/mnt/cache': File exists
rmdir: failed to remove `/mnt/cache': Device or resource busy
Starting couchpotato:  sudo -u nobody python /usr/local/couchpotato/CouchPotato.py -d --datadir /mnt/cache/.couchpotato-data --pidfile /var/run/couchpotato/couchpotato.pid > /dev/null 2>&1
Saving Settings...Completed
Applying Settings...Completed
Starting MySQL......root password already set...Completed
Starting sabnzbd:  sudo -u nobody python /usr/local/sabnzbd/SABnzbd.py -d -s 0.0.0.0:8080 --config-file /mnt/cache/.sabnzbd-data --pid /var/run/sabnzbd > /dev/null 2>&1
Starting sickbeard:  sudo -u nobody python /usr/local/sickbeard/SickBeard.py -d -p 8081 --datadir /mnt/cache/.sickbeard-data --pidfile /var/run/sickbeard/sickbeard.pid > /dev/null 2>&1
please refresh the page

 

but the web server still doesn't respond, unraid is restarted and my plugins too (well, except mysql, will have to look at that!).  After killing it with ctrl-c I tried running in the background but:

 

Jun  4 09:39:35 server emhttp: unRAID System Management Utility version 5.0-rc3
Jun  4 09:39:35 server emhttp: Copyright (C) 2005-2012, Lime Technology, LLC
Jun  4 09:39:35 server emhttp: Pro key detected, GUID: 058F-6387-0000-0000E801C488
Jun  4 09:39:35 server emhttp: rdevName.22 not found
Jun  4 09:39:35 server emhttp: diskFsStatus.1 not found
Jun  4 09:39:35 server kernel: emhttp[30809]: segfault at 0 ip b7501760 sp bf919550 error 4 in libc-2.11.1.so[b7488000+15c000]
Jun  4 09:39:40 server emhttp: unRAID System Management Utility version 5.0-rc3
Jun  4 09:39:40 server emhttp: Copyright (C) 2005-2012, Lime Technology, LLC
Jun  4 09:39:40 server emhttp: Pro key detected, GUID: 058F-6387-0000-0000E801C488
Jun  4 09:39:40 server emhttp: rdevName.22 not found
Jun  4 09:39:40 server emhttp: diskFsStatus.1 not found
Jun  4 09:39:40 server kernel: emhttp[30849]: segfault at 0 ip b749a760 sp bfce1920 error 4 in libc-2.11.1.so[b7421000+15c000]

 

Nope, not playing.  Any clues anybody?

Link to comment

Disable all add-ons before posting in this forum.

In his defense, I moved the post here as it applied to rc3 AND it was described as an emhttp segfault on array startup. 

(very low odds of an add-on causing it)

Far more likely it is emhttp incorrectly handling unexpected (null) entries in /proc/mdcmd

 

Joe L.

Link to comment

To others-

You can not kill emhttp and then restart, it is not designed to do that and you will get a segfault every time.

 

That explains that part of the problem.  Thanks.  I haven't seen this since I upgraded from b14 to rc3, but on b14 it was constantly happening.

Link to comment

Far more likely it is emhttp incorrectly handling un-expected (null) entries in /proc/mdcmd

What makes you think that?

 

To others-

You can not kill emhttp and then restart, it is not designed to do that and you will get a segfault every time.

 

I had an issue like this on Saturday, and of course it's hard to check the wiki and docs when there is a site outage.

What do you do if emhttp has died or become unresponsive and can't be restarted, and you want to shut the server down?

I believe the powerdown script requires the web interface to be running. Is there another way to shut down safely?

Link to comment

The memory output is normal, you are actually using very little right now, most of it is being used by cache. This is where I believe my problem comes in, I think cache_dirs was using so much memory, and its designed to keep the memory, so the OS can't free it for other applications to use(in this case, emhttp). This normally isn't a problem but if it uses too much it appears to cause the segfault.

 

Anyway, if you want to see if it is cache_dir's causing it, this would do it:

If you are running simple_features, cache_dirs settings can be located under Settings - Folder Caching

 

If you are not using cache_dirs from simple_features, then this line will stop any instance of cache_dirs running:

cache_dirs -q

 

Note that unraid does not use cache_dirs by default, if it is running you set it up. If you did not set it up, then cache_dirs is not whats causing your issues.

Run this line to clear cache out of ram.

sync; echo 3 > /proc/sys/vm/drop_caches

 

After that use your server however you do, try to do stuff that normally caused the segfault. Run it like this for a few days or however long it usually takes for you to see a segfault.

 

For me there were two instances of cache_dirs running, one was the script and an added line to the go file, one was running from simple_features. Together they used too much ram and caused the segfault.

 

Now only one runs, and is set to only cache 3 levels, as any lower than this and I'm most likely going to be playing a movie and the disk will spin up anyway.

 

I believe this is on the money. I run no plugins, and when mover kicks in I get alert emails that memory utilization hits the mid-80% range during the move. Once it completes, it drops back to normal. Since I don't run anything (plugins/apps) on unRAID itself, it holds up. I have 4 GB allocated to unRAID. There are other devices reading and writing to the unRAID cache drive all the time.

So to his point, I could see others pushing the envelope and running out of memory during a mover operation. It would seem you could either add more RAM or offload an app or two to alleviate the issue.

 

Link to comment

Far more likely it is emhttp incorrectly handling un-expected (null) entries in /proc/mdcmd

What makes you think that?

 

To others-

You can not kill emhttp and then restart, it is not designed to do that and you will get a segfault every time.

 

Actually, you can do a powerdown with the script by opening a terminal session with the server and entering "powerdown". That's the only way I've been able to get the WebGUI back: a powerdown and reboot.

I had an issue like this on Saturday and of course hard to check wiki and doc when there is a site outage.

 

What do you do if emhttp has died/become unresponsive and can't be restared if you want to shut the server down.

 

I believe the powerdown script requires the web interface to be running, is there another way to safely shutdown?

Link to comment

Far more likely it is emhttp incorrectly handling un-expected (null) entries in /proc/mdcmd

What makes you think that?

 

To others-

You can not kill emhttp and then restart, it is not designed to do that and you will get a segfault every time.

 

Actually you can do a powerdown with the script by opening a terminal session with the server and entering "powerdown" that's the only way I've been able to get WebGUI back, is a powerdown and reboot.

I had an issue like this on Saturday and of course hard to check wiki and doc when there is a site outage.

 

What do you do if emhttp has died/become unresponsive and can't be restared if you want to shut the server down.

 

I believe the powerdown script requires the web interface to be running, is there another way to safely shutdown?

 

I tried just that, through telnet and on the server itself. The powerdown script does nothing, as it needs the web interface running, which it wasn't because it had bombed out. If the web interface can't be restarted due to the way it was written, then you're stuck.

I had to use poweroff, which of course then wanted to run a parity check. When the webgui died all disk operations had completed, so I stopped the parity check and just let it run a non-correcting check; everything was OK.

If there is really no way to shut down the server when the webgui dies and can't be restarted, then maybe an enhancement is needed to add this. Ideally the webgui should not have these issues, but as all the reports of it being unresponsive show, it's quite common.

Link to comment

I issue a "shutdown -h now" when I have faced nothing responding in the past. It never started a parity check on me afterwards. This shuts the server down completely; it is not a reboot. I then trigger a power-up via my remote management board. I don't know if this is a recommended procedure, so it would be better if others chime in.

 

Link to comment

I had this issue after installing 5RC4 and starting to add plugins. My emhttp would segfault, so I looked at top and saw that I had very little free memory left.

I checked the size of my root disk with du -sxh and found that it was 95% of the amount of memory I had installed.

Because / is stored in RAM, any logs or other system-related files written to it are also held there and use up memory.
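The comparison described here can be sketched in a few lines of Python. This version takes the text of `du -sxk /` (size in kB, staying on the root filesystem) and the text of /proc/meminfo and returns the root filesystem's size as a fraction of installed RAM. The function names are invented for the example, and the sample numbers are made up to mirror the 95% figure, not taken from a real box.

```python
def parse_du_kb(du_output):
    """Size in kB from the first field of `du -sxk /` output."""
    return int(du_output.split()[0])

def mem_total_kb(meminfo_text):
    """MemTotal in kB from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")

def rootfs_share_of_ram(du_output, meminfo_text):
    """Fraction of installed RAM taken up by files on the RAM-backed root."""
    return parse_du_kb(du_output) / mem_total_kb(meminfo_text)

# Hypothetical sample values, chosen only to reproduce a 95% result.
SAMPLE_DU = "1900000\t/\n"
SAMPLE_MEMINFO = "MemTotal:        2000000 kB\nMemFree:           60000 kB\n"

print("%.0f%%" % (100 * rootfs_share_of_ram(SAMPLE_DU, SAMPLE_MEMINFO)))
```

If that figure is anywhere near 100%, then log growth or plugin installs under / are competing directly with running processes for memory.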

 

So I installed another 2 GB of memory and the problem went away.

 

Just my contribution to this fault.

 

Jfp

Link to comment
  • 2 weeks later...

I have had the same issue, usually related to plugins. This seems to be especially a problem when the disks_mounted event is called. Some of the plugins may encounter errors and not fail gracefully, causing the emhttp process to be totally locked up, and sometimes to segfault.

 

Tom,

Is there a way to restart emhttp? I understand that it can't just be killed and relaunched, but is there a process that can be followed?

 

whiteatom

Link to comment

I have had the same issue.. usually related to plugins. This seems to be especially a problem when a the disks_mounted event is called. Some of the plugins may encounter errors and not fail gracefully, causing the emhttp process to be totally locked up - and sometimes segfault.

 

Tom,

is there a way to restart the emhttp? I understand that it can't be just killed and relaunched.. but is there a process that can be followed?

 

whiteatom

The plugin event system as currently implemented expects that every event called will complete.  It will wait "forever" for an event to complete and will tie up emhttp "forever" if they do not.
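To illustrate the difference between waiting "forever" and a bounded wait, here is a Python sketch of a caller that gives each handler a deadline. This is not how emhttp is written (its source is not shown anywhere in this thread); `run_event_handler` and the two demo commands are hypothetical and only show the general technique.

```python
import subprocess
import sys

def run_event_handler(argv, timeout_s):
    """Run one handler; return its exit code, or None if it exceeded timeout_s."""
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising, so the caller
        # is free to carry on; grandchildren of the handler are not covered.
        return None

# Stand-ins for a well-behaved handler and a hung one.
ok_handler = [sys.executable, "-c", "pass"]
hung_handler = [sys.executable, "-c", "import time; time.sleep(30)"]

print(run_event_handler(ok_handler, timeout_s=10))
print(run_event_handler(hung_handler, timeout_s=1))
```

With a deadline like this, a plugin that never returns costs the caller one timeout instead of tying it up indefinitely. The trade-off is that a slow but healthy handler can be cut off, so the limit has to be generous.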

 

As far as re-launching emhttp:

 

If it is the only thing you can do, I would kill any existing instance, and re-start it.  Then, stop the array and reboot the whole server so it again is in complete control as expected. 

 

Joe L.

Link to comment
  • 3 weeks later...

We need additional events as well. disks_mounted works for some, but that event doesn't tell the whole story. The disks can mount, but that doesn't mean the array is initialized yet. With an array_started event, we could be sure that the array is up and the shares are ready. This is especially a problem for users without cache drives running plug-ins that need persistent directories on disk, such as sab, sickbeard and couchpotato. The main cause of hanging with those three is a user having the directory specified on the array, with the plug-in starting before the array does.

That being said, I do need to rewrite the plug-in to allow for graceful fails so it doesn't hang like it currently does. We also need better documentation on how the plug-ins should interact with unRAID, which I believe will come once the plug-in system has matured some.
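As a sketch of what a "graceful fail" could look like on the plug-in side: poll for the data directory for a limited time, then give up with an error instead of hanging. This is only an illustration in Python, not the actual plug-in code; `wait_for_dir` and its default values are invented for the example.

```python
import os
import time

def wait_for_dir(path, timeout_s=60.0, poll_s=1.0):
    """Return True once `path` is an existing directory, False after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.isdir(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A start script could call `wait_for_dir("/mnt/cache/.sabnzbd-data")` (the data path from the startup log earlier in this thread) and log a message and exit non-zero if it returns False. Checking the data directory itself rather than /mnt/diskN matters, because the bare mount-point directory can exist before anything is mounted on it.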

 

I'm late to this party, but hey, what can I say, :)

Link to comment

We need additional events as well, disks_mounted works for some, but that event doesn't tell the whole story. The disks can mount, but that doesn't mean the array is initialized yet. With a array_started event, we can be sure that the array is up, and shares are ready. This is especially a problem for users without cache drives running plug-ins that require directories on disk to be persistent, such as sab, sickbeard and couchpotato. The main cause for hanging with those three are when a user has the directory specified on the array, and the plug-in starts before the array does.

 

That being said, I do need to rewrite the plug-in to allow for graceful fails so it doesn't hang like it currently does.

The event "disks_mounted" does get generated after all disks, and the user share file system, are mounted.  The only event that follows this one is "svcs_restarted", which means that the network protocol services, Samba/NFS/AFP (etc), are now "up".

 

We also need(and I believe once the plug-in system has matured some in the future) better documentation on how the plug-ins should interact with unraid.

 

Agreed.

Link to comment

I'll have to look into the problems with my plugins more, then. The only time it seemed to be an issue was when the directories were located on the array; it appeared as though the plugins fired after the event but before the user shares were ready. It now appears I was wrong. Thanks for the info!

 

Link to comment
  • 5 months later...

I am running 5.0-rc8a with lots of plugins. Everything worked fine with emhttp for about a month with no segfault...

Has there been any update on this? Is there a way to get emhttp back? I did a full reboot but still no luck:

 

Dec 16 19:48:37 Nokdim kernel: emhttp[18864]: segfault at 0 ip b7559760 sp bfbeb2d0 error 4 in libc-2.11.1.so[b74e0000+15c000]

Link to comment
