SmallwoodDR82 Posted December 24, 2014

First, let me say thank you in advance. I've been chasing this issue for a while and I'm completely at a loss. I got lucky today and was able to pull the syslog before the crash was severe enough to kill telnet.

Hardware:
Case: Norco 4224
Mb: Supermicro X9SCL-F-O
CPU: Xeon 3.4GHz E3-1240v2
RAM: Kingston 32GB
Controller: IBM M1015 (IT mode)
Expander card: Intel RES2SV240

Running ESXi 5.5 with about 7 guests (Windows, Ubuntu, etc.). Running unRAID 6.0beta12 Pro via Plop.

I was having this issue with 5.0.5, and the only plugin I had was Plex, so I decided to move to unRAID 6.0beta12 and run Plex as a Docker container. I did the move yesterday (12/23/2014) and everything went smoothly. Less than 24 hours later, the same CPU stall error showed up.

A little history: I had this issue on ESXi 5.1, so I upgraded to 5.5 in hopes it would solve it, and I'm still having the same issue. So far this problem has survived ESXi 5.1, ESXi 5.5, unRAID 5.0.5, and unRAID 6.0beta12. I'm kind of leaning toward hardware; however, none of my other VMs have issues at all. Granted, they don't really have any passthrough.

Has anyone seen this, or have any ideas? Sometimes I can go over a month, other times less than 24 hours. Typically I assign all 4 CPUs to unRAID. Just today I went down to 2; however, I'd like to use 4 because of Plex.
Dec 24 14:33:30 S-M-C kernel: INFO: rcu_sched self-detected stall on CPU { 3} (t=6000 jiffies g=2121076 c=2121075 q=47469)
Dec 24 14:33:30 S-M-C kernel: Task dump for CPU 3:
Dec 24 14:33:30 S-M-C kernel: shfs R running task 0 14923 1 0x00000008
Dec 24 14:33:30 S-M-C kernel: 0000000000000000 ffff88013fd83de8 ffffffff8105cc09 0000000000000003
Dec 24 14:33:30 S-M-C kernel: 0000000000000003 ffff88013fd83e00 ffffffff8105f2c4 ffffffff81822d00
Dec 24 14:33:30 S-M-C kernel: ffff88013fd83e30 ffffffff810766a5 ffffffff81822d00 ffff88013fd8e0c0
Dec 24 14:33:30 S-M-C kernel: Call Trace:
Dec 24 14:33:30 S-M-C kernel: <IRQ> [<ffffffff8105cc09>] sched_show_task+0xbe/0xc3
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8105f2c4>] dump_cpu_task+0x34/0x38
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810766a5>] rcu_dump_cpu_stacks+0x6a/0x8c
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81078ead>] rcu_check_callbacks+0x1e1/0x4ff
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81086659>] ? tick_sched_handle+0x34/0x34
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8107ac1a>] update_process_times+0x38/0x60
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81086657>] tick_sched_handle+0x32/0x34
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8108668e>] tick_sched_timer+0x35/0x53
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8107b149>] __run_hrtimer.isra.29+0x57/0xb0
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8107b634>] hrtimer_interrupt+0xd9/0x1c0
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8102ea78>] local_apic_timer_interrupt+0x4f/0x52
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8102ee4a>] smp_apic_timer_interrupt+0x3a/0x4b
Dec 24 14:33:30 S-M-C kernel: [<ffffffff815ead9d>] apic_timer_interrupt+0x6d/0x80
Dec 24 14:33:30 S-M-C kernel: <EOI> [<ffffffff81154fd0>] ? unfix_nodes+0x13f/0x14b
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81147aff>] ? __discard_prealloc+0x71/0xb1
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81147ba2>] reiserfs_discard_all_prealloc+0x43/0x4c
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81163ed6>] do_journal_end+0x4e1/0xc57
Dec 24 14:33:30 S-M-C kernel: [<ffffffff81164ba6>] journal_end+0xad/0xb4
Dec 24 14:33:30 S-M-C kernel: [<ffffffff8114b8d9>] reiserfs_unlink+0x1bf/0x21f
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810fc287>] ? link_path_walk+0x67/0x70c
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810ff1ed>] vfs_unlink+0xa7/0x120
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810ff351>] do_unlinkat+0xeb/0x1ee
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810f7750>] ? SyS_newlstat+0x25/0x2e
Dec 24 14:33:30 S-M-C kernel: [<ffffffff810fffe8>] SyS_unlink+0x11/0x13
Dec 24 14:33:30 S-M-C kernel: [<ffffffff815e9fa9>] system_call_fastpath+0x16/0x1b

Syslog attached. Thanks all and happy holidays!

syslog.zip
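For anyone chasing similar symptoms: note that the stall fires inside the ReiserFS unlink path (reiserfs_discard_all_prealloc / do_journal_end in the trace above). A quick way to check how often the stall warning has been landing in your log is to grep for it; this is a minimal sketch, and the default log path is an assumption (pass your own log file as the first argument):

```shell
#!/bin/sh
# Minimal sketch: count RCU stall warnings in a syslog file and show
# the most recent occurrences. Default path is an assumption; adjust
# for your system or pass the log file as the first argument.
LOG="${1:-/var/log/syslog}"
COUNT=$(grep -c 'self-detected stall on CPU' "$LOG")
echo "RCU stall warnings found: $COUNT"
grep 'self-detected stall on CPU' "$LOG" | tail -n 3
```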
SmallwoodDR82 Posted December 25, 2014 (Author)

Looks like there is a fix/workaround. It's another ReiserFS issue: http://lime-technology.com/forum/index.php?topic=35788.0

I have a lot of moving and formatting in my future...
SmallwoodDR82 Posted January 9, 2015 (Author)

Update, 1/9/2015: After a few painfully slow weeks of transferring data around and formatting drives, I am now 100% XFS on my array disks. I am 48 hours in without a single CPU stall. I will keep this thread updated; I'm just not ready to mark it as solved yet. Thanks!
SmallwoodDR82 Posted January 26, 2015 (Author)

Update, 1/25/2015: 19 days without an error or crash!
SmallwoodDR82 Posted February 10, 2015 (Author)

Update, 2/9/2015: 34 days, 2 hours, and 1 minute without an error or crash! (Not that I'm counting...)