Re: Unable to load web interface or Hit Shares - CyberMew


I'm hitting this issue often as well. At one point a few corrupted file entries caused the server to hang (I think) and flooded the console with errors. I've fixed those, but I'm not sure whether there's anything else to be concerned about. I've attached my logs from before the powerdown and after a forced restart. Should I also upgrade the filesystem to fix these issues?

tower-diagnostics-20160711-2338.zip

tower-diagnostics-20160712-0002.zip

Link to comment

You're having a hardware issue with your system, causing 'machine check events', each preceded by messages reporting momentary CPU overheating (on all 4 cores). I would install mcelog from the NerdPack and see what it says the next time they are reported. And of course you may want to examine the CPU cooling.

Link to comment

I might not have applied enough thermal paste to the CPU, hence the higher temps, especially when a couple of users are trying to stream/transcode from Plex. I will try replacing the thermal paste this weekend!

 

I've installed NerdPack, but how would I go about using mcelog?

 

Would the high CPU temps really cause the unRAID web interface to lock up and render the disks inactive? When it happens there is no disk activity, and we can't access the webGui or the shares via SMB. I can't do a powerdown when this happens either.

 

 

Just a side question, not sure if it's in the logs (and if I've mentioned this before), but my parity drive and disk4 often get errors. Are they ok or should I replace them?

Link to comment

Found the answer to my side question (finally!! the errors were making me uncomfortable)

 

Both drives were connected to https://www.amazon.com/gp/product/B00AZ9T3OU/, which I realise is problematic after looking at someone else's post http://lime-technology.com/forum/index.php?topic=50332.0, which linked me to http://lime-technology.com/forum/index.php?topic=40683.45

 

Thank the heavens! Going to try the solutions in that thread before it's time to switch out to another card!

Link to comment

I've installed NerdPack, but how would I go about using mcelog?

mcelog is a utility that's triggered when a Machine Check Event occurs, and is able to gather and log a fair amount of info about the MCE. Without it, all we know from the syslog is that an MCE occurred, but not the source of it (CPU, RAM, etc.) or what the error was. If an MCE occurs again, there will be more info logged for us to use in figuring out what needs to be fixed or replaced.
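In the meantime, a quick way to see how often the MCEs and the overheating messages show up together is to count them in a saved syslog. This is only a rough sketch: the two message patterns are my assumption of how the kernel words these lines (the wording varies between kernel versions), and `summarize_mce` is just a hypothetical helper name.

```python
import re

# Assumed message wording; adjust the patterns to match your own syslog.
MCE_PATTERN = re.compile(r"Machine check events logged", re.IGNORECASE)
THERMAL_PATTERN = re.compile(r"temperature above threshold", re.IGNORECASE)


def summarize_mce(lines):
    """Count MCE and thermal-throttle messages in an iterable of syslog lines."""
    counts = {"mce": 0, "thermal": 0}
    for line in lines:
        if MCE_PATTERN.search(line):
            counts["mce"] += 1
        if THERMAL_PATTERN.search(line):
            counts["thermal"] += 1
    return counts
```

To use it, open the syslog extracted from a diagnostics zip and pass the file object in, e.g. `summarize_mce(open("syslog.txt"))`, where the file name is whatever your extracted log is called.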

 

Would the high CPU temps really cause the unRAID web interface to lock up and render the disks inactive? When it happens there is no disk activity, and we can't access the webGui or the shares via SMB. I can't do a powerdown when this happens either.

I don't know.  But it pays to deal with what we *can* deal with first, then see what else turns up, hoping that it will then be clearer to us with the obvious issues gone.

Link to comment

Oh that's great, I thought I had to do something in order to get it to appear in the logs. Glad that it's automatic.

 

I just got home, and it seems to be happening again: I can't access Plex, there's no HDD light activity, etc. This time, however, I could access the webGui (a surprise!), so I tried to stop the array, but it got stuck at the first step, Stopping Docker. No surprises there. I then tried a powerdown, which didn't work, as usual. I will edit this post to attach the logs once I'm able to get them out.

 

edit: attached logs.

 

Also, I've restarted with the boot command line fix provided in the Marvell topic, but it looks like it isn't working:

Jul 15 00:19:36 Tower emhttp: shcmd (11): /usr/local/sbin/set_ncq sdh 1 &> /dev/null
Jul 15 00:53:37 Tower kernel: ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jul 15 00:53:37 Tower kernel: ata9.00: failed command: SMART
Jul 15 00:53:37 Tower kernel: ata9.00: cmd b0/d1:01:01:4f:c2/00:00:00:00:00/00 tag 28 pio 512 in
Jul 15 00:53:37 Tower kernel: ata9.00: status: { DRDY }
Jul 15 00:53:37 Tower kernel: ata9: hard resetting link
Jul 15 00:53:38 Tower kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 15 00:53:38 Tower kernel: ata9.00: configured for UDMA/133
Jul 15 00:53:38 Tower kernel: ata9: EH complete
Jul 15 00:54:22 Tower kernel: ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jul 15 00:54:22 Tower kernel: ata9.00: failed command: IDENTIFY DEVICE
Jul 15 00:54:22 Tower kernel: ata9.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 4 pio 512 in
Jul 15 00:54:22 Tower kernel: ata9.00: status: { DRDY }
Jul 15 00:54:22 Tower kernel: ata9: hard resetting link
Jul 15 00:54:23 Tower kernel: ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jul 15 00:54:23 Tower kernel: ata9.00: configured for UDMA/133
Jul 15 00:54:23 Tower kernel: ata9: EH complete

I'm not sure whether this is related to the main problem, but I'll also get a non-Marvell SATA card if possible, to try to eliminate it as a possible cause.
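To check whether the resets stay confined to the port(s) on that card, the errors above can be tallied per ATA port. Below is a minimal sketch, assuming the syslog lines keep the format shown in this post; `summarize_ata_errors` is a hypothetical helper name and the sample is just a few of the lines quoted above.

```python
import re
from collections import Counter

# Patterns follow the kernel log lines quoted in this post.
FAILED_CMD = re.compile(r"kernel: (ata\d+(?:\.\d+)?): failed command: (.+)$")
HARD_RESET = re.compile(r"kernel: (ata\d+): hard resetting link")


def summarize_ata_errors(lines):
    """Return (failed commands per device, hard resets per port) as Counters."""
    failed = Counter()
    resets = Counter()
    for line in lines:
        line = line.strip()
        match = FAILED_CMD.search(line)
        if match:
            failed[(match.group(1), match.group(2))] += 1
            continue
        match = HARD_RESET.search(line)
        if match:
            resets[match.group(1)] += 1
    return failed, resets


SAMPLE = """\
Jul 15 00:53:37 Tower kernel: ata9.00: failed command: SMART
Jul 15 00:53:37 Tower kernel: ata9: hard resetting link
Jul 15 00:54:22 Tower kernel: ata9.00: failed command: IDENTIFY DEVICE
Jul 15 00:54:22 Tower kernel: ata9: hard resetting link
""".splitlines()

failed, resets = summarize_ata_errors(SAMPLE)
for (device, command), count in failed.items():
    print(f"{device}: {command} failed {count} time(s)")
for port, count in resets.items():
    print(f"{port}: {count} hard reset(s)")
```

If every reset lands on the same port before and after swapping the card, that would point at the drive or cable rather than the controller.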

tower-diagnostics-20160715-0007.zip

Link to comment
