doubleohwhatever Posted December 24, 2014
I've had two hard crashes since running beta-12. After the first crash I set up PuTTY to generate a log. Unfortunately I had another crash today, but at least I have some helpful info this time. The last few hours of the log data are in the attached file. If more data is needed just let me know. Attachment: log.zip
BRiT Posted December 24, 2014
I haven't looked at the logs at all, but I wonder if you're suffering from some RFS (ReiserFS) corruption issues that had plagued at least 2 other folks. They were having seemingly random crashes or reboots and their issues went away completely once they migrated to XFS.
WeeboTech Posted December 24, 2014
There were 3 people that I remember; this would be number 4 if it's reiserfs corruption. I think it could be down to the metadata level. If I remember correctly, one user ran reiserfsck multiple times and was still getting corruption. So my question would be, @doubleohwhatever, did you ever run the betas that were known to have potential corruption? In prior investigations, one member said no, another said yes. I cannot remember the third situation all that well. In looking at these logs I see some reiserfs calls, and the messages about a stalled CPU.
RobJ Posted December 24, 2014
This log wasn't much help, as it is only a small piece of the syslog, actually a small piece that repeats over and over: a CPU stall with a Call Trace that repeats once every 3 minutes. It does look exactly like the logs from the previous users with CPU stalls, and the CPU involved is executing Reiser code every time. I don't think we were ever able to conclude that it was directly related to the Reiser file corruption issue, only that it IS clearly doing something Reiser related. At least one of the other users found that reiserfsck would find issues, but this problem could occur again right after reiserfsck had declared the file system clean. I believe that one or more users 'solved' the problem by converting the disks to XFS. In your case, the log has little info and does not indicate that any particular drive is involved. I would start the system in Maintenance mode and check every one of the data drives (see Check Disk File systems).
doubleohwhatever Posted December 24, 2014
WeeboTech wrote: "So my question would be, @doubleohwhatever, did you ever run the betas that were known to have potential corruption?"
Yep. I was running beta 7, but just for a few days before I upgraded to beta 9. I never came across any corrupted files though. Unfortunately with beta 9 I experienced these same crashes, so I downgraded to beta 6 and ran that until beta 12 was available. I never had any crashes while running beta 6. This is what confuses me. If corruption was causing the issue, wouldn't I have seen the same crashes with beta 6? Basically that beta is butter smooth for me but anything after it seems to cause these crashes.
RobJ wrote: "This log wasn't much help, as it is only a small piece of the syslog, actually a small piece that repeats over and over, a CPU stall with Call Trace that repeats once every 3 minutes."
It actually repeats like that in the log. I've attached a larger portion of the log. Basically I was just moving a crap ton of files around before the crash, getting all of the frequently accessed stuff into a single share so I can just have the directories cached on the one share. The server is an AVS-10/4 loaded with 4TB drives and is 71% full across all drives. What's the best way to convert to XFS with a system this full? Can I pull and replace a drive at a time and let unraid rebuild them as XFS? Attachment: log.zip
johnodon Posted December 24, 2014
FYI...I have had crashes since migrating to XFS...same CPU stall messages. But I have not seen the corrupt file system messages I was seeing before, which I think led to my other crashes. I have since disabled 2 dockers (nzbget and nzbdrone) and one plugin (SNAP) and have not had a crash in 3 days. If I can go a solid week without a crash, I think my culprit was either a docker container or the SNAP plugin. John
doubleohwhatever Posted December 25, 2014
Unfortunately I don't have any dockers that could be the problem on my system. The only plugins I have are the cached directories and btsync (neither of which were installed the first time I had a crash with beta 12).
JonathanM Posted December 25, 2014
doubleohwhatever wrote: "Can I pull and replace a drive at a time and let unraid rebuild them as XFS?"
No. Unraid can only rebuild an entire drive as it currently is; it can't convert filesystems on the fly. One way to accomplish a conversion like what you describe is to empty a drive by moving the contents onto other drives, change the filesystem type to XFS, and let unraid reformat the drive. Then you can use the new empty XFS drive to receive the contents of the next drive you want to convert. Lather, rinse, repeat until you are all done. There is no way I know of to convert a reiserfs drive to xfs directly, and even if there were, I'm not sure I would trust it if the reiserfs drive has suspected errors. Bottom line, you have to be able to come up with a totally blank drive with enough free space. Each copy operation should be done with a utility that supports checksum verification before the source files are removed.
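The "copy, verify by checksum, then remove the source" step described above could be sketched roughly as follows. This is only an illustration in Python, not a tool anyone in the thread used: the function names `sha256_of` and `verified_move` are made up for the example, and SHA-256 is an arbitrary choice of checksum.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large media files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src_root: Path, dst_root: Path) -> List[Path]:
    """Copy each file under src_root to dst_root and delete the source
    only when the copy's checksum matches. Returns the source paths whose
    copies did NOT verify (those sources are left in place)."""
    failed = []
    # Build the full file list first so deletions don't disturb the walk.
    for src in sorted(p for p in src_root.rglob("*") if p.is_file()):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data plus timestamps/permissions
        if sha256_of(src) == sha256_of(dst):
            src.unlink()
        else:
            failed.append(src)
    return failed
```

A call might look like `verified_move(Path("/mnt/disk1"), Path("/mnt/disk2"))`, where both paths are placeholders for the drive being emptied and the freshly formatted XFS drive. Two caveats: the sketch leaves the emptied directories behind, and reading the destination back immediately after writing may be served from the page cache rather than from the disk itself, so it is weaker than a verification pass done after a remount.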
johnodon Posted December 25, 2014
JonathanM wrote: "Bottom line, you have to be able to come up with a totally blank drive with enough free space. Each copy operation should be done with a utility that supports checksum verification before the source files are removed."
If you dare do so, you can use your parity drive as the intermediary drive if you do not have a spare drive. Be warned: your data will be unprotected during the migration process. Most here would advise against this. John
RobJ Posted December 25, 2014
Linking this to the other main thread: [SOLVED-WORK-AROUND] ReiserFS Kernel Panic during Mover. That thread includes links to other related threads.
doubleohwhatever Posted December 28, 2014
Just had another crash. It definitely seems to be triggered by the mover somehow. The mover finished and within a few seconds the CPU stall errors started flying. I guess my biggest curiosity at the moment is why these crashes never happened on beta 6 and earlier betas. I wasn't on beta 7/8 long enough to know if they triggered the crashes, but the crashes were definitely a problem with beta 9. Does anyone have any ideas on what changes might have been made that could cause this problem?
Chris Pollard Posted December 29, 2014
I also had a similar issue to the one posted in the first log: a stall related to shfs. I didn't bother taking logs, as I assumed it was something disk related. I'm running beta 10a, however, and had to power down as there were a ton of stuck processes I couldn't clear. If I get it again I'll grab some logs.
BRiT Posted December 29, 2014
Could be the new Linux kernel behaves differently (performs better or timings are just slightly different enough) so the trouble with RFS shows up a lot more.
jonp Posted December 30, 2014
Have we confirmed that data has or hasn't been corrupted here yet? Crashing and corruption can be related, but I want to get confirmation that corruption has truly occurred on data first. Have you attempted to open a file and witnessed corruption first hand? What about CRC checks? Reiserfsck?
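One way to approach the CRC check asked about above is to record a checksum for every file while the array is believed healthy, then compare against a fresh scan after a crash. The Python below is a minimal sketch under that assumption; `crc32_of`, `build_manifest`, and `changed_files` are names invented for this example, and a mismatch only shows that a file's bytes changed, not why.

```python
import zlib
from pathlib import Path
from typing import Dict, List


def crc32_of(path: Path, chunk_size: int = 1024 * 1024) -> int:
    """Compute a file's CRC-32 incrementally, chunk by chunk."""
    crc = 0
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def build_manifest(root: Path) -> Dict[str, int]:
    """Map each file's path (relative to root) to its CRC-32."""
    return {
        str(p.relative_to(root)): crc32_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: Dict[str, int], new: Dict[str, int]) -> List[str]:
    """List files from the old manifest whose CRC differs in the new one.
    Files missing from the new manifest are reported as changed too."""
    return sorted(name for name, crc in old.items() if new.get(name) != crc)
```

The idea would be to save the result of `build_manifest` for a share before the mover runs and compare it with a second scan afterwards. Files that were legitimately edited in between will also show up, so the comparison is most meaningful for static media.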
doubleohwhatever Posted December 30, 2014
I haven't come across any corrupted files. In my case, the crashes always seem to occur just after the mover has finished. Just happened again tonight. No log this time, unfortunately, as I forgot to have PuTTY write the session to a file.
jonp Posted December 30, 2014
Ok, what about reiserfsck? Did you run this to check for issues?
doubleohwhatever Posted December 30, 2014
Will check when I get back in town on the 1st. Heading out of town for a quick New Year's trip.
doubleohwhatever Posted January 2, 2015
Back in town and will run reiserfsck tomorrow. In the meantime I've attached another log from the last crash. It turns out I did have it logging after all. Attachment: crash-2.txt
SmallwoodDR82 Posted January 2, 2015
Same issue over here. My crashes have not been during the mover, however. http://lime-technology.com/forum/index.php?topic=37311.0 Hopefully we get to the bottom of this...my array is 50% converted to XFS. Hope to be done in the next week or so. I'll keep everyone updated!
doubleohwhatever Posted January 3, 2015
I'm assuming my crashes always happen after the mover runs because that's the only time files are moved around on my server. ReiserFS is the issue here. I'm converting to XFS and being done with it.
SmallwoodDR82 Posted January 20, 2015
Update: my array has been converted to XFS for 13 days now and I am 13 days without a crash/reboot!
jonp Posted January 20, 2015
A couple of issues in this thread, which I'm going to move to general support.
1) The OP doesn't provide enough information to go off of and didn't follow the defect report post guideline to give us something to test.
2) No corruption (as the subject indicates) has been proven to have occurred at this point.
3) The replies from others saying "they've had a similar issue" are not about the same issue (Smallwood's issue is not related here IMHO).
4) No feedback to requests for reiserfsck has been provided, so we have nothing to go off of.
As I mentioned above, I'm moving this to general support. When we can see some log that has actual corruption reported, or a reiserfsck report, we can help further, but this is definitely not a bug with beta 12.
doubleohwhatever Posted January 20, 2015
Sorry. I actually did run reiserfsck but had a crash almost immediately. I just said screw it and have since converted to XFS. No issues since. I really wouldn't be so certain that there's not a bug though. My setup ran 100% fine on Beta 6 and earlier but crashed once a week with ReiserFS errors on anything above Beta 6. So instead of just writing it off, you may want to call it a rare bug with the solution being to convert to XFS.
hackztor Posted January 20, 2015
XFS seems to be the better way to go. I had filesystem corruption one too many times with ReiserFS and 0 times with XFS. I am so glad they opened unraid to multiple filesystems.