Nothing happening after replacing/upgrading to larger disc


Thanks in advance for your help. 

 

Yesterday I upgraded to 5.0 and had no issues; the server was behaving normally.  Today I powered down, removed a 1TB drive, replaced it with a 3TB drive, and powered back on.  The Web GUI recognized that the 1TB drive was missing, and I selected the 3TB drive to replace it.  I clicked "start array and rebuild/expand" (or whatever the exact label is), but then nothing happened.  The drives didn't all spin up to rebuild the 1TB onto the new 3TB.  I could no longer access the Web GUI, the shares all disappeared, and I could only access the flash drive over the network.

 

I have attached the syslog

 

Any ideas? 

syslog-2013-08-22.txt

I clicked "start array and rebuild/expand" (or whatever the exact label is), but then nothing happened.  The drives didn't all spin up to rebuild the 1TB onto the new 3TB.  I could no longer access the Web GUI, the shares all disappeared, and I could only access the flash drive over the network.

 

I have seen something similar to this too.  Just this morning one of my servers had a disabled 2TB disk.  I pulled it and replaced it with a 3TB one to match the rest of the server.  For at least 10 minutes, the server was unresponsive.  Telnet access still worked, and unMenu worked but was very slow to respond.  Finally it started the rebuild process and began behaving normally.  (Using rc16 on a TamSolutions Intel Xeon 5130, 4GB RAM, 24-bay server with ten 3TB disks installed.)

 

How long did you try to access it?

The problem seems likely to be related to corruption in the GPT of the replacement drive.  Here are the relevant parts of the syslog:

 

Aug 22 19:30:53 Tower kernel: ata1.00: ATA-9: WDC WD30EFRX-68AX9N0,      WD-WMC1T3169627, 80.00A80, max UDMA/133

Aug 22 19:30:53 Tower kernel: ata1.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 0/32)

...

Aug 22 19:30:53 Tower kernel: scsi 1:0:0:0: Direct-Access    ATA      WDC WD30EFRX-68A 80.0 PQ: 0 ANSI: 5

Aug 22 19:30:53 Tower kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0

Aug 22 19:30:53 Tower kernel: sd 1:0:0:0: [sdb] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)

...

Aug 22 19:30:53 Tower kernel: GPT:Primary header thinks Alt. header is not at the end of the disk.

Aug 22 19:30:53 Tower kernel: GPT:1565565871 != 5860533167

Aug 22 19:30:53 Tower kernel: GPT:Alternate GPT header not at the end of the disk.

Aug 22 19:30:53 Tower kernel: GPT:1565565871 != 5860533167

Aug 22 19:30:53 Tower kernel: GPT: Use GNU Parted to correct GPT errors.

Aug 22 19:30:53 Tower kernel:  sdb: sdb1 sdb2

Aug 22 19:30:53 Tower kernel: sd 1:0:0:0: [sdb] Attached SCSI disk

...

Aug 22 19:30:53 Tower emhttp: Device inventory:

Aug 22 19:30:53 Tower emhttp: WDC_WD30EFRX-68AX9N0_WD-WMC1T3169627 (sdb) 2930266584

...

Aug 22 19:32:57 Tower kernel: mdcmd (2): import 1 8,16 2930266532 WDC_WD30EFRX-68AX9N0_WD-WMC1T3169627

Aug 22 19:32:57 Tower kernel: md: import disk1: [8,16] (sdb) WDC_WD30EFRX-68AX9N0_WD-WMC1T3169627 size: 2930266532

Aug 22 19:32:57 Tower kernel: md: disk1 wrong      (this is normal, it's a replacement)

...

Aug 22 19:33:23 Tower emhttp: writing GPT on disk (sdb), with partition 1 offset 64, erased: 0

Aug 22 19:33:23 Tower emhttp: shcmd (36): sgdisk -Z /dev/sdb &> /dev/null

Aug 22 19:33:24 Tower emhttp: shcmd (37): sgdisk -o -a 64 -n 1:64:0 /dev/sdb |& logger

Aug 22 19:33:24 Tower kernel:  sdb: unknown partition table

Aug 22 19:33:24 Tower logger: ^GCaution: invalid main GPT header, but valid backup; regenerating main header

Aug 22 19:33:24 Tower logger: from backup!

Aug 22 19:33:24 Tower logger:

Aug 22 19:33:24 Tower logger: Caution! After loading partitions, the CRC doesn't check out!

Aug 22 19:33:24 Tower logger: ^GWarning! Main partition table CRC mismatch! Loaded backup partition table

Aug 22 19:33:24 Tower logger: instead of main partition table!

Aug 22 19:33:24 Tower logger:

Aug 22 19:33:24 Tower logger: Warning! One or more CRCs don't match. You should repair the disk!

Aug 22 19:33:24 Tower logger:

Aug 22 19:33:24 Tower logger: Invalid partition data!

Aug 22 19:33:25 Tower logger: Information: Creating fresh partition table; will override earlier problems!

Aug 22 19:33:25 Tower logger: The operation has completed successfully.

Aug 22 19:33:25 Tower emhttp: shcmd (38): udevadm settle

Aug 22 19:33:25 Tower kernel:  sdb: unknown partition table

Aug 22 19:33:25 Tower emhttp: Start array...

Aug 22 19:33:25 Tower kernel: mdcmd (56): start UPGRADE_DISK

Aug 22 19:33:25 Tower kernel: md: do_run: lock_rdev error: -6

Aug 22 19:33:25 Tower kernel: md1: stopping

Aug 22 19:33:25 Tower kernel: BUG: unable to handle kernel NULL pointer dereference at 000001d4

Aug 22 19:33:25 Tower kernel: IP: [<f867412c>] do_stop+0x54/0xd4 [md_mod]

Aug 22 19:33:25 Tower kernel: *pdpt = 0000000037490001 *pde = 0000000000000000

Aug 22 19:33:25 Tower kernel: Oops: 0000 [#1] SMP

...

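Your replacement drive had 2 existing partitions (sdb1 and sdb2), but also had errors in the GPT: the backup header was recorded at sector 1565565871 instead of at the last sector of the disk, 5860533167.  You can see the two sgdisk commands that attempt to wipe the old GPT and write a fresh one, but the kernel says "kernel:  sdb: unknown partition table" after each of them, so it apparently was not fixed.  The array start then begins without a good partition table, fails with "lock_rdev error: -6", and crashes immediately in do_stop.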
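
If you want to pull the same clues out of a syslog yourself, here is a minimal Python sketch of how that could be done.  The sample lines are copied from the excerpt above; the `summarize` function and its field names are my own invention for illustration, not part of unRAID or any of its tools.  It also checks one bit of arithmetic: the stale backup-header location plus one equals the drive's sector count modulo 2^32.  That is consistent with the old GPT having been written while the drive sat behind something that only exposed a 32-bit sector count, but that is only my guess; the log does not say so.

```python
import re

# Lines copied from the syslog excerpt quoted in this post.
SYSLOG_EXCERPT = """\
Aug 22 19:30:53 Tower kernel: sd 1:0:0:0: [sdb] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
Aug 22 19:30:53 Tower kernel: GPT:Primary header thinks Alt. header is not at the end of the disk.
Aug 22 19:30:53 Tower kernel: GPT:1565565871 != 5860533167
Aug 22 19:33:24 Tower kernel:  sdb: unknown partition table
Aug 22 19:33:25 Tower kernel:  sdb: unknown partition table
Aug 22 19:33:25 Tower kernel: md: do_run: lock_rdev error: -6
"""

BLOCKS_RE = re.compile(r"\[(\w+)\] (\d+) 512-byte logical blocks")
GPT_MISMATCH_RE = re.compile(r"GPT:(\d+) != (\d+)")
UNKNOWN_TABLE_RE = re.compile(r"kernel:\s+(\w+): unknown partition table")


def summarize(log_text):
    """Collect GPT-related facts from a syslog excerpt (hypothetical helper)."""
    summary = {
        "device": None,            # e.g. "sdb"
        "sectors": None,           # total 512-byte sectors reported by the kernel
        "recorded_alt_lba": None,  # where the old GPT says its backup header is
        "expected_alt_lba": None,  # where the kernel expects it (last sector)
        "unknown_table_count": 0,  # how often the kernel failed to read a table
    }
    for line in log_text.splitlines():
        m = BLOCKS_RE.search(line)
        if m:
            summary["device"] = m.group(1)
            summary["sectors"] = int(m.group(2))
        m = GPT_MISMATCH_RE.search(line)
        if m:
            summary["recorded_alt_lba"] = int(m.group(1))
            summary["expected_alt_lba"] = int(m.group(2))
        if UNKNOWN_TABLE_RE.search(line):
            summary["unknown_table_count"] += 1
    return summary


if __name__ == "__main__":
    s = summarize(SYSLOG_EXCERPT)
    # The backup GPT header belongs on the last sector, i.e. sectors - 1.
    print("backup header where expected:",
          s["recorded_alt_lba"] == s["sectors"] - 1)
    # Size implied by the stale GPT, in bytes (512-byte sectors).
    print("size implied by old GPT:", (s["recorded_alt_lba"] + 1) * 512)
    # Observation only: does the stale size equal the real size modulo 2**32?
    print("matches 32-bit wraparound:",
          s["sectors"] % 2**32 == s["recorded_alt_lba"] + 1)
    print("'unknown partition table' messages:", s["unknown_table_count"])
```

To run it against a real log, read the attached syslog file into a string and pass that to `summarize` instead of `SYSLOG_EXCERPT`.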

 

I do have one other recommendation.  I see ata_piix being used, which usually means that SATA mode is set to IDE emulation in your BIOS SATA settings.  I strongly urge you to change that to AHCI, or anything other than an IDE emulation mode.  It should be slightly faster and a little safer.
