[SOLVED?] Server Down For Unknown Reasons...



So, first off: I'm not good at running this. I got it up and running and haven't touched it much since, so I apologize if I don't describe what's going on properly.

 

Anyway, the server has been online since June with no errors. It worked fine earlier this morning, and then suddenly I couldn't access it in Windows Explorer anymore (it hangs/crashes Explorer). So I started investigating.

 

Naturally, my system log is a behemoth after being up that long, but these are the things I'm seeing.

 

I'm running 5.0-rc12a since that's been stable for me.

 

- If I log on via telnet and send the poweroff command, the server sees it but does not turn off.

- It keeps repeating a "self-detected stall on CPU" error.

 

Apologies, but I don't know how to put it in a "text box" within the post, and my Notepad is now crashing too. I don't want to lose the log, so it's pasted below.

 

/usr/bin/tail -f /var/log/syslog
Dec 18 22:45:12 Tower kernel: [] vfs_rename_dir+0xc0/0x122
Dec 18 22:45:12 Tower kernel: [] vfs_rename+0xc5/0x1d5
Dec 18 22:45:12 Tower kernel: [] ? do_path_lookup+0x1e/0x50
Dec 18 22:45:12 Tower kernel: [] sys_renameat+0x17d/0x1ee
Dec 18 22:45:12 Tower kernel: [] ? user_path_create+0x40/0x49
Dec 18 22:45:12 Tower kernel: [] ? sys_mkdirat+0x21/0x93
Dec 18 22:45:12 Tower kernel: [] sys_rename+0x28/0x2a
Dec 18 22:45:12 Tower kernel: [] syscall_call+0x7/0xb
Dec 18 22:45:26 Tower in.telnetd[18276]: connect from 192.168.1.172 (192.168.1.172)
Dec 18 22:45:28 Tower login[18277]: ROOT LOGIN on '/dev/pts/2' from '192.168.1.172'
Dec 18 22:48:12 Tower kernel: INFO: rcu_sched self-detected stall on CPU { 0} (t=366060 jiffies)
Dec 18 22:48:12 Tower kernel: Pid: 13140, comm: shfs Tainted: G O 3.4.36-unRAID #1
Dec 18 22:48:12 Tower kernel: Call Trace:
Dec 18 22:48:12 Tower kernel: [] print_cpu_stall+0x6d/0xe5
Dec 18 22:48:12 Tower kernel: [] __rcu_pending+0x3b/0x138
Dec 18 22:48:12 Tower kernel: [] rcu_check_callbacks+0x76/0xa1
Dec 18 22:48:12 Tower kernel: [] update_process_times+0x2d/0x58
Dec 18 22:48:12 Tower kernel: [] tick_periodic+0x63/0x65
Dec 18 22:48:12 Tower kernel: [] tick_handle_periodic+0x19/0x6f
Dec 18 22:48:12 Tower kernel: [] smp_apic_timer_interrupt+0x6d/0x7f
Dec 18 22:48:12 Tower kernel: [] apic_timer_interrupt+0x2a/0x30
Dec 18 22:48:12 Tower kernel: [] ? reiserfs_parse_alloc_options+0x454/0x454
Dec 18 22:48:12 Tower kernel: [] ? reiserfs_discard_all_prealloc+0x2d/0x3c
Dec 18 22:48:12 Tower kernel: [] do_journal_end+0x1b4/0x968
Dec 18 22:48:12 Tower kernel: [] journal_end+0xb7/0xbf
Dec 18 22:48:12 Tower kernel: [] reiserfs_rename+0x859/0x879
Dec 18 22:48:12 Tower kernel: [] ? find_busiest_group+0x283/0x998
Dec 18 22:48:12 Tower kernel: [] ? sched_clock_local+0x12d/0x17c
Dec 18 22:48:12 Tower kernel: [] vfs_rename_dir+0xc0/0x122
Dec 18 22:48:12 Tower kernel: [] vfs_rename+0xc5/0x1d5
Dec 18 22:48:12 Tower kernel: [] ? do_path_lookup+0x1e/0x50
Dec 18 22:48:12 Tower kernel: [] sys_renameat+0x17d/0x1ee
Dec 18 22:48:12 Tower kernel: [] ? user_path_create+0x40/0x49
Dec 18 22:48:12 Tower kernel: [] ? sys_mkdirat+0x21/0x93
Dec 18 22:48:12 Tower kernel: [] sys_rename+0x28/0x2a
Dec 18 22:48:12 Tower kernel: [] syscall_call+0x7/0xb
Dec 18 22:48:42 Tower shutdown[18301]: shutting down for system halt
Dec 18 22:51:12 Tower kernel: INFO: rcu_sched self-detected stall on CPU { 0} (t=384063 jiffies)
Dec 18 22:51:12 Tower kernel: Pid: 13140, comm: shfs Tainted: G O 3.4.36-unRAID #1
Dec 18 22:51:12 Tower kernel: Call Trace:
Dec 18 22:51:12 Tower kernel: [] print_cpu_stall+0x6d/0xe5
Dec 18 22:51:12 Tower kernel: [] __rcu_pending+0x3b/0x138
Dec 18 22:51:12 Tower kernel: [] rcu_check_callbacks+0x76/0xa1
Dec 18 22:51:12 Tower kernel: [] update_process_times+0x2d/0x58
Dec 18 22:51:12 Tower kernel: [] tick_periodic+0x63/0x65
Dec 18 22:51:12 Tower kernel: [] tick_handle_periodic+0x19/0x6f
Dec 18 22:51:12 Tower kernel: [] smp_apic_timer_interrupt+0x6d/0x7f
Dec 18 22:51:12 Tower kernel: [] apic_timer_interrupt+0x2a/0x30
Dec 18 22:51:12 Tower kernel: [] ? __discard_prealloc+0x10/0xab
Dec 18 22:51:12 Tower kernel: [] reiserfs_discard_all_prealloc+0x2d/0x3c
Dec 18 22:51:12 Tower kernel: [] do_journal_end+0x1b4/0x968
Dec 18 22:51:12 Tower kernel: [] journal_end+0xb7/0xbf
Dec 18 22:51:12 Tower kernel: [] reiserfs_rename+0x859/0x879
Dec 18 22:51:12 Tower kernel: [] ? find_busiest_group+0x283/0x998
Dec 18 22:51:12 Tower kernel: [] ? sched_clock_local+0x12d/0x17c
Dec 18 22:51:12 Tower kernel: [] vfs_rename_dir+0xc0/0x122
Dec 18 22:51:12 Tower kernel: [] vfs_rename+0xc5/0x1d5
Dec 18 22:51:12 Tower kernel: [] ? do_path_lookup+0x1e/0x50
Dec 18 22:51:12 Tower kernel: [] sys_renameat+0x17d/0x1ee
Dec 18 22:51:12 Tower kernel: [] ? user_path_create+0x40/0x49
Dec 18 22:51:12 Tower kernel: [] ? sys_mkdirat+0x21/0x93
Dec 18 22:51:12 Tower kernel: [] sys_rename+0x28/0x2a
Dec 18 22:51:12 Tower kernel: [] syscall_call+0x7/0xb
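
A note on reading those numbers: each stall report carries a jiffies count, so the two reports pasted here are enough to estimate the kernel tick rate and roughly when the stall began. Below is a minimal Python sketch of that arithmetic. The regular expression, the function names, and the year (syslog timestamps carry none; 2014 is only a guess) are assumptions for illustration, not anything from unRAID itself, and it assumes the t value counts ticks since the stall started.

```python
import re
from datetime import datetime, timedelta

# Assumption: syslog timestamps have no year, so one is supplied here.
ASSUMED_YEAR = 2014

# Matches the "self-detected stall" lines and captures timestamp and jiffies.
STALL_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"INFO: rcu_sched self-detected stall.*\(t=(\d+) jiffies\)"
)

def parse_stalls(lines):
    """Return a list of (timestamp, jiffies) for every stall report found."""
    stalls = []
    for line in lines:
        m = STALL_RE.match(line)
        if m:
            when = datetime.strptime(
                f"{ASSUMED_YEAR} {m.group(1)}", "%Y %b %d %H:%M:%S"
            )
            stalls.append((when, int(m.group(2))))
    return stalls

def estimate_stall_start(stalls):
    """Estimate ticks per second and the stall start from first and last report."""
    (t0, j0), (t1, j1) = stalls[0], stalls[-1]
    hz = (j1 - j0) / (t1 - t0).total_seconds()
    start = t0 - timedelta(seconds=j0 / hz)
    return hz, start

# The two stall lines copied from the log above.
SAMPLE = [
    "Dec 18 22:48:12 Tower kernel: INFO: rcu_sched self-detected stall on CPU { 0} (t=366060 jiffies)",
    "Dec 18 22:51:12 Tower kernel: INFO: rcu_sched self-detected stall on CPU { 0} (t=384063 jiffies)",
]

hz, start = estimate_stall_start(parse_stalls(SAMPLE))
print(f"estimated tick rate: {hz:.1f} per second")
print(f"estimated stall start: {start:%H:%M:%S}")
```

On the two reports above this works out to about 100 ticks per second and a stall that began around 21:47, roughly an hour before the first report shown. To run it over a whole log, pass the lines of the saved syslog file to `parse_stalls` instead of `SAMPLE`.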


Alrighty. Rule one in tech support: restart the machine and see if it solves the problem.

 

Restarting seemed to solve the problem. I was mostly confused because the powerdown command wasn't actually turning it off. Anyway, I'm letting it run a parity check now to see if there are any issues; the startup log is attached in case someone wants to check it for red flags.

 

Thanks!

121814_syslog.txt
