Just Give It To Me Straight Doc


Well, I had to move my Unraid to a new box (long story). Unraid is now housed in a Dell R510 with an H310 flashed to IT mode. When I switched the drives over, I made sure before starting the array that they were all in the correct place in the dashboard. Everything seemed to go fine, then I started to get errors.

 

Diagnostics attached:

 

Main Problem: Disk 3 is disabled, no docker

 

I am hoping that I just have a bad drive.

 

"Fix Common Problems" plugin

 

Errors:

 

disk3 (ST3000DM001-1CH166_Z1F5358D) is disabled

 

disk3 (ST3000DM001-1CH166_Z1F5358D) has read errors

 

Unable to write to cache

 

Unable to write to Docker Image

 

Warnings:

 

Multiple registration keys found

 

CPU possibly will not throttle down frequency at idle

bbb-diagnostics-20161023-1158.zip

Link to comment

SMART for Z1F5358D looks OK, and since you mentioned you had changed other hardware, I would assume the disk is OK and that you need to troubleshoot the other hardware. The cache disk problems are probably also related to the other hardware. And since your docker.img is on cache, that explains docker not working.

 

After you get your other hardware straight you can rebuild Z1F5358D onto itself.

 

Check that the controller is seated, and check all disk cables, both power and SATA, at both ends, etc.

Link to comment

Well, I checked and reseated everything. Then I went back through the BIOS and didn't see anything that jumped out at me. Drive 2 now comes up as unmountable in the dashboard, drive 3 is still disabled, and I can't seem to read from or write to the SSD.

 

I have attached new diagnostics. Does anyone have something else to try?

bbb-diagnostics-20161023-1915.zip

Link to comment

I think you still have some issues with your other hardware, possibly the controller. Are all of your disks plugged into that same controller?
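
If it helps to answer that question from the console, here is a rough sketch that lists which SCSI host (controller) each sd device hangs off and whether an expander sits in between. It assumes the usual Linux sysfs layout, where /sys/block/sdX resolves to a path containing hostN and, for SAS devices behind an expander, an expander-N:M component. The helper names classify and list_disks are made up for this example.

```python
import os
import re

# Assumption: resolved sysfs paths contain ".../hostN/..." for the SCSI host
# and ".../expander-N:M/..." when the disk sits behind a SAS expander.
HOST_RE = re.compile(r"/host(\d+)/")


def classify(sysfs_path):
    """Return (host_number_or_None, behind_expander) for a resolved sysfs path."""
    match = HOST_RE.search(sysfs_path)
    host = int(match.group(1)) if match else None
    behind_expander = "/expander-" in sysfs_path
    return host, behind_expander


def list_disks(sys_block="/sys/block"):
    """Map each sdX device name to its (host, behind_expander) tuple."""
    result = {}
    if not os.path.isdir(sys_block):
        return result  # not a Linux box, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        real = os.path.realpath(os.path.join(sys_block, name))
        result[name] = classify(real)
    return result


for disk, (host, via_expander) in list_disks().items():
    note = "behind an expander" if via_expander else "direct"
    print(f"{disk}: host{host} ({note})")
```

Disks that report the same host number are on the same controller. The script only reads sysfs, so it does not touch the array.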

 

Disk2 shows a write error in the log. Is it disabled (red X) also?

 

unRAID disables a disk when a write to it fails. Possibly the disk2 write eventually succeeded so it didn't get disabled.

 

Looks like you have filesystem corruption on disk2, disk3, and cache.

 

I don't think there is any point in trying to rebuild the disabled disk or fix the filesystems until you get your controller sorted. Unfortunately I don't have any experience with that hardware. Maybe someone else will have some idea.

Link to comment

Thanks for all your help. I changed the thread topic to hopefully get someone who has experience with the H310. I spun down the array for now.

Does anyone with H310 experience have any idea what I am missing?

Link to comment

How are your drives connected?

All on the H310?

All on one port of the H310?

Is the H310 known to work OK?

How long has it "gone fine"? Days, weeks?

Do you have the possibility to connect the failed drives to the main board?

 

Don't try to repair the disks until you know that your other hardware (controller) is OK!

If you have some spare drives for testing, set up the array with those. If possible on another machine.

If your controller is faulty you can mess up all your data.

Back up your data!

 

Link to comment

How are your drives connected?

 

In Dell 3.5" caddies, SATA into the drive backplane. From the backplane, mini-SAS cables A/B run back to the H310.

Cache is the only one not run through the H310; however, it also goes through the backplane. It is in slot 11 or 12 internally.

 

All on the H310?

 

All data drives, yes. The cache is not; I believe it is connected directly (unless someone tells me different, since it's in an internal slot).

 

All on one port of the H310?

 

Drives are in slots 0, 1, 2, 3, so I assume they are all on SAS cable A, if it works like that (I can't find that info).

 

Is the H310 known to work OK?

 

It was pulled from a working environment... I mean, there is always a chance, I guess.

 

How long has it "gone fine"? Days, weeks?

 

This is a brand new build. I was forced to move it over.

 

Do you have the possibility to connect the failed drives to the main board?

 

No. On the R510 12-bay unit the onboard SATA is factory disabled, from what I have read.

 

Don't try to repair the disks until you know that your other hardware (controller) is OK!

If you have some spare drives for testing, set up the array with those. If possible on another machine.

If your controller is faulty you can mess up all your data.

Back up your data!

 

Roger that awaiting further ideas/questions

Link to comment

Actually this build never ran without issues.

In addition to a faulty controller (I never heard of one until now) the backplane can also cause trouble.

 

You have to rule out the backplane as a failure cause!

 

If you have 12 slots and the H310 can connect only 8, what are the remaining 4 drives connected to?

Maybe on board?

Can you connect the drives directly to the controller or main board and bypass the backplane?

 

 

 

Link to comment

Actually this build never ran without issues.

 

Correct, it's basically a new build with old drives and the old install USB.

 

In addition to a faulty controller (I never heard of one until now) the backplane can also cause trouble.

 

You have to rule out the backplane as a failure cause!

 

If you have 12 slots and the H310 can connect but 8, what are the remaining 4 drives connected to?

Maybe on board?

Can you connect the drives directly to the controller or main board and bypass the backplane?

 

I am running the Dell update ISO right now to make sure that no driver updates have been missed.

This R510 model is one of the ones with 12 hot-swappable bays and 2 internal, non-hot-swappable bays (I kind of misspoke earlier: the cache is in one of the internal bays, not a hot-swap bay).

The H310 uses SAS connectors, so I can't directly connect the drives. As I said before, the onboard SATA connectors are disabled from the factory (permanently, from what I understand).

 

I'll wait and see if any of the updates help things and report back.

Link to comment

Syslog looks OK, as far as I can tell.

Nevertheless, this is also a new build, and you can't tell whether it will run stable long enough to complete a rebuild.

The safest thing would be to rebuild onto another drive, or to back up disk3 and then start the rebuild as you planned.

 

https://lime-technology.com/wiki/index.php/Troubleshooting#Re-enable_the_drive

 

Changing the topic to the old subject since the H310 and backplane don't seem to be the issue.

I think this is not true.

 

This R510 model is one of the ones with 12 hot swappable bays and 2 internal bays non hot swappable

Understood so far, but the H310 has 2x4=8 ports to connect to your bays!

So what are the remaining 14-8=6 bays connected to?

Is there another controller? A second H310?

Is there a sort of expander built into that backplane?

Link to comment

I guess I found it:

Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: Protocol=(
Oct 23 19:07:25 BBB kernel: Initiator,Target
Oct 23 19:07:25 BBB kernel: ), Capabilities=(
Oct 23 19:07:25 BBB kernel: TLR,EEDP
Oct 23 19:07:25 BBB kernel: ,Snapshot Buffer,Diag Trace Buffer
Oct 23 19:07:25 BBB kernel: ,Task Set Full,NCQ
Oct 23 19:07:25 BBB kernel: )
Oct 23 19:07:25 BBB kernel: scsi host1: Fusion MPT SAS Host
Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: sending port enable !!
Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x5d4ae520b5804900), phys(
Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: expander_add: handle(0x0009), parent(0x0001), sas_addr(0x500065b36789abff), phys(26)

I compared this against a syslog from my server (which also uses an H310) and it does not mention an expander_add!

The backplane seems to have an integrated expander in order to provide the 14 ports.

I'm pretty sure that this is what is causing your trouble.
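
For anyone who wants to run the same check on their own syslog, here is a minimal Python sketch that looks for expander_add lines in the format shown in the excerpt above. The regular expression is modelled only on that one log line, so treat the exact field layout as an assumption; find_expanders and SAMPLE are names made up for this example.

```python
import re

# Pattern modelled on the single expander_add line quoted in this thread.
# Assumption: other mpt2sas/mpt3sas versions log the same field layout.
EXPANDER_RE = re.compile(
    r"(?P<driver>mpt[23]sas_cm\d+): expander_add: "
    r"handle\((?P<handle>0x[0-9a-fA-F]+)\), "
    r"parent\((?P<parent>0x[0-9a-fA-F]+)\), "
    r"sas_addr\((?P<sas_addr>0x[0-9a-fA-F]+)\), "
    r"phys\((?P<phys>\d+)\)"
)

# Two lines copied from the syslog excerpt in this post (the first one is
# truncated in the original and is kept that way).
SAMPLE = [
    "Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: host_add: handle(0x0001), "
    "sas_addr(0x5d4ae520b5804900), phys(",
    "Oct 23 19:07:25 BBB kernel: mpt2sas_cm0: expander_add: handle(0x0009), "
    "parent(0x0001), sas_addr(0x500065b36789abff), phys(26)",
]


def find_expanders(lines):
    """Return one dict per expander_add line found in an iterable of log lines."""
    found = []
    for line in lines:
        match = EXPANDER_RE.search(line)
        if match:
            info = match.groupdict()
            info["phys"] = int(info["phys"])  # number of phys the expander reports
            found.append(info)
    return found


for exp in find_expanders(SAMPLE):
    print(f"{exp['driver']}: expander at {exp['sas_addr']} with {exp['phys']} phys")
```

To scan a real log, pass an open file instead of SAMPLE, for example the syslog from the diagnostics zip. An empty result means the driver did not report an expander in that log.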

 

Edit:

http://serverfault.com/questions/462433/how-to-find-if-there-is-an-expander-in-my-controller

 

Changing the topic to the old subject since the H310 and backplane don't seem to be the issue.

Obviously the backplane IS the issue.

Link to comment

Got ya. Sorry for the misunderstanding. So are you thinking the backplane/expander is bad?

Link to comment

Not necessarily bad, but obviously not working with this controller in IT mode.

Maybe the backplane also has to be configured/flashed into another operating mode?

Could be a driver issue as well!?

You could try to get some info from Dell itself.

There is an IT firmware from Dell for the H310.

Ask them whether the controller/backplane combo will work in IT mode, or what you need to do with the backplane in order to get it working.

Link to comment
