The whole server freezes


thany


I just had a freeze. The whole server. Not just one VM (well, obviously it freezes the VMs) but the entire server got frozen to absolute zero.

 

No idea what I did. Well, I was running updates inside a Windows 10 VM. Am I not allowed to do that?...

 

I also attempted to install a plugin, which worked fine. Only afterwards the web interface showed fuckall, i.e. zero bytes of HTML. It's what we call a WSOD.

After a reboot it worked again, but the plugin isn't listed, so I can't remove it. It's also not found in /boot/config/plugins and so I have no idea how to remove it. Because well, I suspect this might be the culprit.

 

What do I do?

 

/edit

It was this one:

https://github.com/Influencer/UNplugged/blob/master/sabnzbd_unplugged.plg

 

How do I undo its damage?

Link to comment

Don't know how you found that Plugins page, but those are all for v5!  That means 32 bit and they CANNOT work on v6.  By the way, how did you find that wiki page?  I need to remove the link.

 

As best I remember, those plugins created entries in /boot/plugins or /boot/config/plugins, and you should be able to manually delete their remnants from those.  It's possible that Dynamix recognized the plugin failed, and moved it to a /boot/config/plugins_error folder (or something like that).  Another reboot should start clean without it.
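
To make the manual cleanup a bit less of a guess, here is a minimal sketch (not an official tool) that only lists anything whose name contains a given string under the folders mentioned above. The directory list is an assumption taken from this post, in particular the `plugins_error` name, which may be spelled differently or not exist at all; `find_plugin_remnants` is a hypothetical helper name. Nothing is deleted, so you can review the output before removing anything by hand.

```python
from pathlib import Path

# Assumed locations, taken from the post above. The last entry is a guess at
# the "plugins_error" folder name and may not exist on a given system.
CANDIDATE_DIRS = (
    "/boot/plugins",
    "/boot/config/plugins",
    "/boot/config/plugins_error",
)


def find_plugin_remnants(needle, dirs=CANDIDATE_DIRS):
    """Return sorted paths under `dirs` whose name contains `needle` (case-insensitive)."""
    needle = needle.lower()
    hits = []
    for d in dirs:
        root = Path(d)
        if not root.is_dir():
            continue  # skip locations that do not exist on this system
        for entry in root.rglob("*"):
            if needle in entry.name.lower():
                hits.append(entry)
    return sorted(hits)


if __name__ == "__main__":
    # Only lists matches; nothing is deleted.
    for path in find_plugin_remnants("sabnzbd"):
        print(path)
```

If it prints nothing, there is probably nothing left on the flash drive for that plugin, and a reboot should come up clean.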

 

As to finding good plugins, use Community Applications only.  Squid is very good about blacklisting or adding moderator comments to anything there that is reporting serious issues.  And you won't find anything listed that isn't compatible.

Link to comment

Well, I may have found new information. The plugin may not have been the culprit.

 

It does cause WSOD on the web interface, but the whole server freeze is caused by me installing Avast Antivirus on my Windows 10 VM. This is completely reproducible.

 

How can that even be?? It's supposed to be virtual. Is Avast or Windows 10 somehow breaking the hypervisor barrier and knackering up the server?!

 

/edit

btw, the plugins can be found on the wiki, by searching for (who guessed it) "plugins". There are quite possibly other malicious ones if sabnzbd was there.

Please, have a staff member scrub through them ASAP. unRAID is a commercial product, but having 3 different plugin pages (yes, really) with different content, unclear for which version, is amateurish iyam.

 

/edit2

Also, just checked, the local console is frozen as well. Brilliant. Have to do a hard reset again :(

Link to comment

Please, have a staff member scrub through them ASAP. unRAID is a commercial product, but having 3 different plugin pages (yes, really) with different content, unclear for which version, is amateurish iyam.

Unless the user has red squares, they are not a staff member. Moderators, community developers, and all the rest of us are just unpaid volunteers. I suggest the best way for you to complain effectively would be to email support directly, https://lime-technology.com/contact/ as otherwise, actual staff probably won't see this.
Link to comment

Please, have a staff member scrub through them ASAP. unRAID is a commercial product, but having 3 different plugin pages (yes, really) with different content, unclear for which version, is amateurish iyam.

 

The wiki is largely community driven, so I've edited it.  Unraid is largely sold without ongoing support; it's more or less community supported unless you pay for a consult. However, the license key doesn't expire with version changes and many of us have been using the same one for a fair few years now, so I feel that's fair enough.

Link to comment

It does cause WSOD on the web interface, but the whole server freeze is caused by me installing Avast Antivirus on my Windows 10 VM. This is completely reproducible.

 

How can that even be?? It's supposed to be virtual. Is Avast or Windows 10 somehow breaking the hypervisor barrier and knackering up the server?!

 

Avast is a known incompatibility with KVM, possibly other hypervisors.  There are threads here about it.  I understand that the Avast devs have found ways to make it coexist with VMWare, VirtualBox, and Microsoft's product(?), but not everything.  Something about built in hypervisor usage for sandbox features.  Most other AV's are OK to use.

 

btw, the plugins can be found on the wiki, by searching for (who guessed it) "plugins". There are quite possibly other malicious ones if sabnzbd was there.

Please, have a staff member scrub through them ASAP. unRAID is a commercial product, but having 3 different plugin pages (yes, really) with different content, unclear for which version, is amateurish iyam.

 

unRAID is a commercial product but the wiki is not, it's almost entirely a user contributed effort.  Plus, those pages were written for older versions, and there are still a number of users with those versions.  We are slowly trying to bring the wiki up to date, while still supporting users who don't want to upgrade.  As it's an unpaid volunteer effort, it's slow to happen.  I'll probably undo your commenting, but clearly label the page for older versions only.  I'm sorry for the confusion though.

Link to comment

How do I undo its damage?

If you install the Fix Common Problems plugin (available within Community Applications), and then have it scan, it will identify any remnants that are incompatible with your system.

 

Please, have a staff member scrub through them ASAP. unRAID is a commercial product, but having 3 different plugin pages (yes, really) with different content, unclear for which version, is amateurish iyam.

While not strictly on topic, limetech in their latest manual for unRAID does tell you to install Community Applications to handle all docker and plugin installations.  Probably mainly because CA keeps the available lists up to date automatically and maintains compatibility lists without requiring any outside interference.  (IMHO the only valid lists of applications for unRAID are what's displayed by CA or the All unRAID Application Template Repositories / Support Threads.  All other wiki entries listing anything for v6.1+ should merely point to either the posted URL or to CA.)

 

So, should it freeze the whole hypervisor??

 

Iyam, that's still an issue. And a pretty bloody critical one at that.

Are you sure that you don't have an underlying hardware issue (or a BIOS that needs updating, etc.)?  While it's true that Avast is incompatible with KVM due to their sandboxing, I can't recall off the top of my head anyone posting here about it crashing the entire system.
Link to comment
  • 2 weeks later...

I just had another systemwide freeze (or crash - I can't tell as I'm not physically with the server right now). I was just browsing a website on my Win10 VM.

 

The hardware should be just fine. It's been running like a kitten for a couple of years, and there's no reason to assume anything is now magically broken.

Link to comment

The hardware should be just fine. It's been running like a kitten for a couple of years, and there's no reason to assume anything is now magically broken.

Many, many reasons.  Power Supplies age and can't supply their rated power

Memory chips can and do go bad

Capacitors age

Dust conducts electricity

VMs are highly dependent upon the BIOS implementation on the motherboard.  BIOS updates do occur generally to improve stability / compatibility

Link to comment

"It was working fine yesterday". Well, when a lightbulb burns out, it was working fine before it quit.
Link to comment

It's a bit too much of a coincidence that the switch to unRAID made my server go bad. It ran VMware absolutely perfectly (I switched for other reasons, irrelevant atm) and now suddenly my hardware is bad when I switched to unRAID?

 

Well, in theory it's possible. But the OS is what's changed, and now things have become unstable. No sir, that's not a hardware problem.

 

One lucky thing about this last crash, is that the server managed to reboot itself. So at least I didn't have to go to the server physically. I did anyway, but didn't have to in the end.

 

If anyone is interested in diagnostics logs, I can post them. I myself don't know how to read them, let alone what to look for, but I can still post them if other folks know where to look for problems...

Link to comment

It's a bit too much of a coincidence that the switch to unRAID made my server go bad. It ran VMware absolutely perfectly (I switched for other reasons, irrelevant atm) and now suddenly my hardware is bad when I switched to unRAID?

 

Well, in theory it's possible. But the OS is what's changed, and now things have become unstable. No sir, that's not a hardware problem.

Different OS's use various RAM addresses differently, among many other things. Not saying you definitely do have hardware issues, but changing OS's can expose faults that were hidden previously.
Link to comment

Just for kicks (and to prove that my hardware is fine) I ran a 4.5-hour memtest86 test. On the fancy-pants UEFI version no less. After 4 passes, 0 errors. Conclusion: memory is totally fine. And after a burn-in like that, it's very likely the rest of the system is stable as well, especially when idle.

 

So there :)

 

Now, what's next?

Link to comment

If anyone is interested in diagnostics logs, I can post them. I myself don't know how to read them, let alone what to look for, but I can still post them if other folks know where to look for problems...

 

There are a number of users here that can read them, and serious troubleshooting always requires a look at them.  I have to confess I'm almost happy when a user doesn't include them, saves us a lot of time!  (a selfish thought I know, but we're unpaid)  And while diagnostics don't guarantee a solution, there's a lot that can be detected.  Please see Need help? Read me first!, and attach the diagnostics zip.

Link to comment

We'll have to wait until the problem happens again though. It hasn't happened in the past couple of days, and I'd like to include a report that's as accurate to the configuration as possible. I'm still setting stuff up every now and then, after all.

 

So let's see what happens.

Link to comment

Okay, waiting for the next crash then :(

Along with the last set of diagnostics generated by Fix Common Problems, also upload the syslog.txt file that'll be in the logs folder on the flash drive, as there's a chance it'll have extra helpful information not contained within the diagnostics.
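
If you'd like a quick look at that syslog yourself before uploading it, a rough sketch along these lines pulls out the most recent lines that mention common trouble words. It assumes the flash drive is mounted at /boot, so the file would be /boot/logs/syslog.txt; that path and the keyword list are assumptions for illustration, and `last_suspicious_lines` is a hypothetical helper, not part of any unRAID tool. It is no substitute for attaching the full file.

```python
from collections import deque
from pathlib import Path

# Assumption: the flash drive is mounted at /boot, so the file would be here.
SYSLOG_PATH = "/boot/logs/syslog.txt"

# Hypothetical keyword list for illustration; adjust to taste.
KEYWORDS = ("error", "panic", "oops", "call trace", "machine check")


def last_suspicious_lines(lines, keywords=KEYWORDS, limit=50):
    """Return up to `limit` of the most recent lines containing any keyword."""
    recent = deque(maxlen=limit)  # older matches fall off the front
    for line in lines:
        lowered = line.lower()
        if any(k in lowered for k in keywords):
            recent.append(line.rstrip("\n"))
    return list(recent)


if __name__ == "__main__":
    log = Path(SYSLOG_PATH)
    if log.is_file():
        with log.open(errors="replace") as fh:
            for line in last_suspicious_lines(fh):
                print(line)
    else:
        print(f"{SYSLOG_PATH} not found")
```

The matching is a plain case-insensitive substring check, so expect some false positives; the point is only to see whether anything obvious was logged just before the freeze.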
Link to comment
  • 2 weeks later...
