sureguy Posted September 27, 2014

Attempting to reboot manually:

- /root/samba stop: appears to work, command prompt returns
- umount /dev/md1: hangs ssh session
- umount /dev/md2: results in
  umount: /mnt/disk2: device is busy. (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
- umount /dev/md3: hangs ssh session
- umount /dev/md4: hangs ssh session
- umount /dev/md5: hangs ssh session
- umount /dev/md6: hangs ssh session
- umount /dev/sdl1: hangs ssh session (this is cache)

Running df -h results in:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1       3.9G  293M  3.6G   8% /boot
    /dev/md1         16G  432M   16G   3% /mnt/disk1
    /dev/md2        1.9T  1.5T  332G  83% /mnt/disk2
    shfs            1.9T  1.5T  407G  80% /mnt/user0
    shfs            2.0T  1.5T  422G  79% /mnt/user

So some drives unmounted. Now I try this:

    umount /dev/md1

which results in:

    umount: /mnt/disk1: not mounted

I re-run df -h:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1       3.9G  293M  3.6G   8% /boot
    /dev/md2        1.9T  1.5T  332G  83% /mnt/disk2
    /dev/md5         16G  432M   16G   3% /mnt/disk5
    shfs            1.9T  1.5T  407G  80% /mnt/user0
    shfs            2.0T  1.5T  422G  79% /mnt/user

So I try to umount /dev/md2, and it's still busy. Then I try to umount /dev/md5, and now I get:

    umount: /mnt/disk5: not mounted

I re-run df -h and get:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1       3.9G  293M  3.6G   8% /boot
    /dev/md2        1.9T  1.5T  332G  83% /mnt/disk2
    /dev/md6         16G  432M   16G   3% /mnt/disk6
    shfs            1.9T  1.5T  407G  80% /mnt/user0
    shfs            2.0T  1.5T  422G  79% /mnt/user

After re-running the umount commands, df -h shows:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1       3.9G  295M  3.6G   8% /boot
    /dev/md2        1.9T  1.5T  332G  83% /mnt/disk2
    shfs            1.9T  1.5T  407G  80% /mnt/user0
    shfs            2.0T  1.5T  422G  79% /mnt/user

So disk2 isn't going to unmount. Fine. I'll try to proceed, so I run:

    /root/mdcmd stop

I get:

    /root/mdcmd: line 11: echo: write error: Device or resource busy

So I issue powerdown, to no avail: the system just sits there.

So I hit the power button, and the system hangs on unmounting remote filesystems. There are a bunch of repeating lines on the console like the following:

    Cannot stat file /proc/25200/fd/32: No such file or directory
    Cannot stat file /proc/25202/fd/26: No such file or directory
    Cannot stat file /proc/25202/fd/26: No such file or directory
    Cannot stat file /proc/25206/fd/31: No such file or directory
    Cannot stat file /proc/25206/fd/31: No such file or directory
    Cannot stat file /proc/25209/fd/26: No such file or directory
    Cannot stat file /proc/25209/fd/26: No such file or directory
    etc.

I did a hard shutdown by holding the power button down. I closed all PuTTY sessions, but no logs were saved (it's entirely possible I missed a message about this, as I had more than 10 PuTTY sessions open from trying to unmount drives).

Attached is the last backup of the syslog, taken about 20 minutes before totally giving up (before the manual shutdown procedure).

syslog.sept272014.7.zip
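The "device is busy" message for /mnt/disk2 points at lsof or fuser for finding whatever still has files open on the disk. As a rough illustration of the same idea (not a replacement for either tool, and not something taken from this thread), the Python sketch below walks the per-process `cwd` and `fd` symlinks under /proc and reports which PIDs reference a path under a given mount point. The function names `is_under` and `find_holders` and the `proc_root` parameter are made up for this example. It assumes a Linux-style /proc, and it silently skips processes that exit mid-scan, which is the same race behind the "Cannot stat file /proc/…/fd/…" console lines.

```python
import os


def is_under(path, mount_point):
    """Return True if path is the mount point itself or lies beneath it."""
    mount_point = mount_point.rstrip("/") or "/"
    if mount_point == "/":
        return path.startswith("/")
    # Require a "/" boundary so /mnt/disk2 does not match /mnt/disk20.
    return path == mount_point or path.startswith(mount_point + "/")


def find_holders(mount_point, proc_root="/proc"):
    """Map PID -> sorted list of referenced paths under mount_point.

    Only the working directory and open file descriptors are checked.
    Memory-mapped files and in-kernel users are NOT covered, so an empty
    result does not prove the mount is idle.
    """
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished between listing and reading
            if is_under(target, mount_point):
                holders.setdefault(int(entry), set()).add(target)
    return {pid: sorted(paths) for pid, paths in holders.items()}
```

Run as root, `find_holders("/mnt/disk2")` would return a dictionary of PIDs and the paths they hold, which can then be cross-checked against `ps` before deciding what to stop. Whether this would have revealed the culprit on this particular server is unknown.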
jphipps Posted September 27, 2014
I have been having similar crashes, and for some reason I found that if I boot into Xen and run as normal with no VMs, it hasn't crashed. I'm trying to get a window to swap my I/O card to rule that out, but it might be worth checking whether yours still crashes under the Xen boot.
sureguy Posted September 27, 2014
Quote: "I have been having similar crashes, and for some reason I found that if I boot into Xen and run as normal with no VMs, it hasn't crashed. I'm trying to get a window to swap my I/O card to rule that out, but it might be worth checking whether yours still crashes under the Xen boot."
Funnily enough, I was the person who suggested swapping the cards between your servers. I'll see what happens on 6beta9 in Xen mode. I suggest you try a normal boot of 6beta6 on your problematic system.
jphipps Posted September 27, 2014
I thought that was you, but didn't go back and look. I am running one more parity check right now under non-Xen with the current card, just to make sure it crashes consistently, so that when I swap it I can do a good test. I don't have much hope, since that is already the second card I have tried. I previously had a SATA (non-SAS) card from a different manufacturer, also using a Marvell chip, and it was crashing the same way. RobJ had suggested replacing the card, so I replaced it with a model I knew was working.
sureguy Posted September 27, 2014
My card has never exhibited the disk errors Limetech pointed out in your syslog. Perhaps we can get more traction if my system doesn't exhibit any issues in Xen mode.
sureguy Posted September 27, 2014
Weirdly, under Xen mode I'm getting reiserfs issues. I've rebooted into 6beta9 and am checking all disks. If they come up fine, I'll reboot into Xen mode and rerun.
jphipps Posted September 27, 2014
That is one difference on my system: all my filesystems are XFS. I don't have any ReiserFS filesystems.
sureguy Posted September 27, 2014
I just ran reiserfsck against all 6 array devices. All finished with no corruptions found.
RobJ Posted October 16, 2014
Quote: "Attached is the last backup of the syslog taken about 20 minutes before totally giving up (before the manual shutdown procedure)."
I was going to respond here, but because of the discussion over there, I moved it there.
Dimtar (Author) Posted October 17, 2014
Server crashed again last night, the first time in weeks, but I didn't get a log. It feels like it hits a corrupt file of some type? There were errors on the screen, but I couldn't enlarge the view to read the text. The only thing I noticed was that there were a good eight errors and they all looked nearly identical.
Dimtar (Author) Posted November 26, 2014
Getting crashes two times a week now.
sureguy Posted November 26, 2014
My best suggestion is to roll back to 6beta6 at this point.
Dimtar (Author) Posted November 26, 2014
Quote: "My best suggestion is to roll back to 6beta6 at this point."
Thanks, but beta9 had the same issue. My current plan is to give the next beta a try; if the crashes continue, I'll start planning a move to something besides unRaid.
sureguy Posted November 26, 2014
Beta9 or Beta6?
Dimtar (Author) Posted November 26, 2014
Quote: "Beta9 or Beta6?"
I believe it all started happening once I hit beta9.
sureguy Posted November 26, 2014
In my experience beta 6 does not exhibit this problem.
razorslinky Posted December 2, 2014
In case someone searching this thread is having the same type of issue: there are a few of us experiencing it. Here's a thread that I started: http://lime-technology.com/forum/index.php?topic=35788.45