Kernel BUG while stopping the array



Since beta6a, and until at least beta14 (testing rc2 now), I have had an intermittent issue while stopping the array.

 

The syslog spits out "BUG: unable to handle kernel NULL pointer dereference at (null)"

when unmounting sleeping disks.

 

I have a Supermicro X8SIL-V motherboard with 6 onboard SATA ports. If I only use these ports, I don't see the issue.

I had an LSI 3081e-r controller, which I thought was the issue, but I replaced it with a JMicron JMB362 based 2-port PCI-e 1x card and installed the 7th drive of my array.

One hour after power-on, the 7th disk (array disk6), which is connected to this JMB362 controller, went to sleep. Ten minutes later I tried to stop the array, but the interface returned without the array being stopped, and after that I had to reboot the server to regain control.

 

So I think the issue appears when there is more than one SATA controller.
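(For anyone who wants to confirm which controllers the kernel sees on their own box, lspci lists them; the grep pattern below is just one way to filter the listing, not my exact output.)

lspci | grep -iE 'sata|sas|scsi'    # list the storage controllers the kernel detected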

 

Attached is the syslog with the kernel BUG info.

 

syslog.zip

Link to comment

Running 5.0-rc2, and it happened again.

 

I let the array sleep for some time.

Then I spun up the disks first and checked that everything was accessible.

 

Then I hit STOP to stop the array, and the webGui returned with the array still running.

 

If I hit Stop again, the webGui freezes. From then on, everything that tries to access the disks freezes as well.

 

Any suggestions?

 

Again, this only happens when I add a secondary SATA controller. First it was the LSI 3081e-r; now I have removed it and am using a JMB362 based controller.

Without a secondary SATA controller this error doesn't occur.

 

Tried this:

 

root@Tower:/mnt# echo stop > /proc/mdcmd

-bash: echo: write error: Device or resource busy

 

Samba is running. lsof didn't return anything useful.

 

/mnt is empty.

 

df outputs:

 

root@Tower:/mnt# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/sdf1            1.9G  128M  1.8G  7% /boot

df: `/mnt/disk6': No such file or directory

 

 

I will leave the server in this state so I can test any suggestions.
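(If anyone wants specific output, these are the kinds of commands I can still run in this state; the device and mount names below are just examples from my box.)

fuser -vm /mnt/disk6        # who, if anyone, is still using the mount point (if it exists)
lsof /dev/md6               # open handles on the md device itself
grep md /proc/mounts        # what the kernel still thinks is mounted
dmesg | tail -40            # the most recent kernel messages around the BUG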

syslog.zip

Link to comment

Intermittent, unfortunately.

 

But it only happens when I have a secondary SATA controller.

 

I tried several times to reproduce it by spinning down all the disks and then stopping the array, but I couldn't.

I also tried spinning down only the disk attached to the secondary SATA controller, but after several attempts I couldn't reproduce it that way either.
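(By "sleeping" a disk I mean putting it into standby; from the console that can be done with hdparm. The device name below is only an example, check which letter the add-on card's disk was assigned on your system.)

hdparm -y /dev/sdg     # put the drive on the JMB362 card into standby immediately
hdparm -C /dev/sdg     # confirm the drive reports "standby" before stopping the array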

 

I suspect it has something to do with how long the disks stay asleep, in conjunction with the presence of a secondary controller.

 

If I remove the secondary controller and leave all disks on the onboard controller, the error doesn't happen (based on a few months of using the server with long sleep cycles).

 

After I attached a new disk to this JMB362 controller, the error happened the same day, once the disks had gone to sleep after 1 hour as configured. When I was using the LSI based controller, the same error also appeared occasionally.

 

If I log on to the server, /mnt is empty, which suggests the disks were unmounted. The SMB services are restarted, but the stop command to mdcmd fails with the device-is-busy message.

 

If I try to stop the array again, or try to shut down the server, a sync() is issued, which locks up everything.

 

The lsof output doesn't list any of the /dev/sdX devices or /mnt disks.

 

If I try to unload the md_cmd module, it says it's in use and can't be unloaded.
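(Something that might help the next time it hangs, though I have not tried it myself yet: the kernel's SysRq facility can dump the blocked tasks into the syslog, which should show exactly where the sync() is stuck. This is a standard kernel feature, nothing unRAID-specific.)

echo 1 > /proc/sys/kernel/sysrq     # make sure SysRq is enabled
echo w > /proc/sysrq-trigger        # write a list of blocked (uninterruptible) tasks to the syslog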

 

Why did you mention a Xeon processor? I think the i3 540 running on the Intel 3420 chipset behaves like a Xeon, being able to use ECC memory. Could this indicate something?

 

Link to comment

Why did you mention a Xeon processor? I think the i3 540 running on the Intel 3420 chipset behaves like a Xeon, being able to use ECC memory. Could this indicate something?

 

I was working with someone via email a few months ago who was using your motherboard with a Xeon processor and who also exhibited this problem fairly reliably.  Eventually I think one of the kernel updates made it "go away".  I think this is a race condition in the Linux kernel related to ReiserFS unmounting before the Linux buffer cache is fully flushed (exacerbated by a very fast processor).  We were trying to set up a definitive test case that I could post to the kernel development mailing list, but we just could not make a reliably repeatable test case.  I now have a similar m/b in house and I'll order a fast processor and try to go down this rabbit hole again  :P

Link to comment

An update

 

I did some testing, and this bug isn't related to disk sleeping.

 

I started the array and did a few stop/start cycles in a row. Doing this, I could reproduce the bug consistently.
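(For anyone who wants to repeat this from the console instead of the webGui: the sketch below is only my rough approximation of the "stop" half of a cycle; the real emhttp sequence may differ, and the disk numbers are from my array.)

for d in /mnt/disk[1-6]; do
    umount "$d"                # unmount each data disk first, like the webGui does
done
echo "stop" > /proc/mdcmd      # then stop the md array
# I hit Start in the webGui to bring the array back up for the next cycle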

 

It really does look like a race condition or a timing issue, but the bug always manifests on the same disk: the one attached to the offboard PCI-e SATA controller.

 

Since I already switched SATA controllers, I will try testing with different disks, because I suspect I am always testing with the same 1.5TB Seagate drive.

 

If this bug manifests with other HDs too, I guess I will have to limit my array to the 6 onboard SATA ports for now...

 

Does anybody else have the same hardware configuration as I do and run without issues? (Supermicro X8SIL-V + Core i3 540 + 4GB DDR3 ECC + secondary PCI-e SATA controller)

 

Searching the forums for the call trace output (queue_delayed_work, do_journal_end, journal_end_sync), I found a few other members with the same symptoms, like users madburg, nezil, and gfjardim.

 

It appears madburg ran reiserfsck on the disks and the problem was fixed? Is that correct?

 

How can I do this without invalidating the parity on my array?

 

syslog.zip

Link to comment


 

It's not clear if this only happens with a corrupted file system, but the procedure for running reiserfsck without invalidating parity is below:

 

With the array Stopped, check the "Maintenance mode" box and then Start the array.  Maintenance mode starts the array but does not mount any of the hard drives.  (The reason you want the array Started is so that any changes made by reiserfsck update parity, keeping parity consistent.)  You can then check the individual file systems via a telnet session like this:

 

reiserfsck /dev/md1    <-- corresponds to disk1

reiserfsck /dev/md2    <-- corresponds to disk2

etc.

 

To check the Cache disk, you need to look on the Main page and see which linux device identifier has been assigned to it.  This will be string inside parenthesis, e.g., (sde).  Then use this command:

 

reiserfsck /dev/sde1    <-- substitute the identifier from your system for "sde", and add a "1".

 

The reiserfsck utility will ask you to type "Yes" to continue.  Type it exactly like that (without the quotes).  If the utility finds a problem, it will ask you to re-run with a switch specified, typically "--fix-fixable", but don't do this unless the utility says to.

 

The reiserfsck utility can take a long time to run, depending on how large the file system is.
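If you want to check every array disk in one pass, a loop along these lines should also work; it only runs the read-only check, and the md1-md6 range is just an example you should match to your own disk count:

for dev in /dev/md[1-6]; do
    echo "=== checking $dev ==="
    reiserfsck --check "$dev"    # read-only check; answer "Yes" at each prompt
done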

Link to comment

I ran reiserfsck on all array disks. Only disk6, the one the unmount was failing on, had 5 transactions replayed, but no errors were detected on any disk.

 

After that, I couldn't reproduce the bug... I tried several start/stop array cycles in a row, and nothing.

 

Let's wait a few days and see...

 

 

Link to comment

The bug reappeared. After the initial testing, I let the disks sleep and waited a few hours. Then I hit the stop button, and the same kernel BUG appeared...

 

This time disk6 had 0 transactions replayed during the reiserfsck run...

 

I guess it's not related to ReiserFS corruption after all.

 

Is there a way to remove an empty disk from the array without rebuilding parity?

Link to comment

I did several mount/umount cycles on disk6, the disk attached to the PCI-e controller, and none of them failed.

 

I have a hunch:

 

Could this bug be triggered because the umount of disk5, on the onboard SATA controller, is not completely finished when, next in the sequence, the umount of disk6, attached to the PCI-e controller, is started?

 

What if a few seconds of delay were added between the umount commands?

 

Could someone create a test case with 2 disks, one attached to an onboard SATA controller and the other attached to an offboard SATA controller, and a script that mounts and unmounts them in sequence? A rough sketch of what I mean is further down.

 

What led me to this hunch is that in my syslog, 2 disks remain busy after the umount crashes the kernel. My conclusion is that the unmount of the second-to-last disk and the unmount of the last disk conflict with each other, perhaps due to a race condition between them.
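Here is a very rough, untested sketch of the kind of script I mean; the device names and mount points are placeholders, not my real assignments:

#!/bin/bash
# mount and unmount one disk from each controller back-to-back, many times
ONBOARD=/dev/sdb1       # partition on a disk behind the onboard SATA controller (placeholder)
OFFBOARD=/dev/sdg1      # partition on a disk behind the PCI-e add-on card (placeholder)

mkdir -p /mnt/test_a /mnt/test_b

for i in $(seq 1 100); do
    mount -t reiserfs "$ONBOARD"  /mnt/test_a
    mount -t reiserfs "$OFFBOARD" /mnt/test_b
    sync
    umount /mnt/test_a           # back-to-back, the way the stop sequence seems to do it
    umount /mnt/test_b
    # sleep 2                    # uncomment to test the "add a delay between umounts" idea
done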

 

I don't have the capability to debug this low-level kernel stuff... but perhaps someone else does.

Link to comment
