Unformatted Disk [Closed]


timz

I ended up with an unformatted disk after a power outage. Luckily it is only one disk. reiserfsck reports the following:

 

root@Tower:/mnt# reiserfsck  --check /dev/md8

reiserfsck 3.6.24

 

Will read-only check consistency of the filesystem on /dev/md8

Will put log info to 'stdout'

 

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

###########

reiserfsck --check started at Sun Apr  5 18:52:14 2015

###########

Replaying journal: No transactions found

Checking internal tree.. \block 138318553: The level of the node (9186) is not correct, (4) expected

the problem in the internal node occured (138318553), whole subtree is skipped

finished

Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.

Bad nodes were found, Semantic pass skipped

1 found corruptions can be fixed only when running with --rebuild-tree

###########

reiserfsck finished at Sun Apr  5 18:55:03 2015

###########

Should I rebuild the tree, or just reformat and rebuild the array?

Link to comment

I would expect that running with --rebuild-tree to correct the problem is the right way forward to avoid losing all the data on the drive. 

 

Even if you are not that worried about the data (as your suggestion to reformat implies), going through the repair now could be good experience if you have not done it before. You would then know what to expect if you ever hit a case where recovering the data on the drive really matters.

 

If you do decide to reformat, that only takes a minute or so, but you lose the existing data on the drive and would need to restore it from your backups.
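For reference, here is a minimal sketch of how the summary line in the check output above could be mapped to a next step. It assumes the output was saved to a file, and `suggest_next_step` is a made-up helper name, not part of reiserfsprogs. The `--fix-fixable` branch assumes reiserfsck names that option in its summary for minor damage, which is not shown anywhere in this thread.

```shell
#!/bin/sh
# Sketch only: read a saved reiserfsck --check log and print which repair
# mode its summary mentions. suggest_next_step is a hypothetical helper.
suggest_next_step() {
    if grep -q -e '--rebuild-tree' "$1"; then
        echo "rebuild-tree"
    elif grep -q -e '--fix-fixable' "$1"; then   # assumed wording, not from this thread
        echo "fix-fixable"
    else
        echo "no-repair-suggested"
    fi
}

# Demo using the summary line quoted in the first post.
log=$(mktemp)
echo "1 found corruptions can be fixed only when running with --rebuild-tree" > "$log"
suggest_next_step "$log"
rm -f "$log"
```

This only reads text; it does not touch the disk. The decision to actually run a rebuild should still be made by a person looking at the full output.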

Link to comment

So you have 3 options to recover. I'd recommend in this order:

 

1. What I would do is to remove the disk from the array, and start the array with one disk short. unRAID will simulate the missing disk. Once you start the array it will be as though the missing disk is present, and you should be able to see if the data is present. If it looks good, you can rebuild the drive onto either the existing disk or onto a new disk.

 

2. If the simulated disk also looks bad, you can turn your attention to the physical disk which you would access outside the array. Run reiserfsck and follow its suggested path. It may be able to reconstruct the disk. If successful, you would have to do a new config and rebuild parity.

 

3. A third option would be to run reiserfsck on the simulated disk in step 1.

 

Good luck. Post back with any specific questions and I or someone else can try to help. Your chances of recovery are good - just don't do anything silly. For example, DO NOT RUN A PARITY CHECK!

Link to comment

I would suggest that steps 2 and 3 be swapped in priority. I think it is better to try to repair the simulated disk before the physical one. You then still have the physical disk as a backup, and the best way to proceed may depend on what happened when repairing the simulated disk.
Link to comment

I have a spare disk... what about rebuilding onto the spare disk first? Also, not sure if this is bad, but an automatic parity check started when I booted my machine. I stopped it. When I stopped it, 0 errors had been found.

Link to comment

The chances are that a rebuild will still end up 'unformatted'. This is because a rebuild recreates the disk exactly as unRAID thinks it currently should be, including any file system corruption. You would still need to run reiserfsck to fix the corruption. That is why I suggested running reiserfsck against the simulated disk. If it fixes the problems, you can then rebuild onto your spare disk, which will end up looking just like the simulated one.
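To illustrate why a rebuild reproduces corruption, here is a toy sketch that treats each disk as a single byte and computes single parity as an XOR across the data disks. The byte values and variable names are made up for the example. It shows only the arithmetic, not how unRAID is implemented.

```shell
#!/bin/sh
# Toy model: one byte per disk, single parity = XOR of all data disks.
d1=165          # healthy data disk (made-up value)
d2=60           # healthy data disk (made-up value)
d3_bad=238      # disk 3 as it looked after the corruption (made-up value)

# Parity as it stands once it reflects the corrupted disk 3.
parity=$(( d1 ^ d2 ^ d3_bad ))

# Simulating (or rebuilding) disk 3: XOR parity with the remaining disks.
rebuilt=$(( parity ^ d1 ^ d2 ))

echo "rebuilt disk 3 = $rebuilt, corrupted disk 3 = $d3_bad"
```

The rebuilt value equals the corrupted value, so rebuilding onto a spare alone cannot undo file system damage that parity already reflects.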
Link to comment

Your first screenshot shows the array being started with one disk missing. In this situation, the drive is being simulated by parity and the other disks. Notice that the simulated disk is showing unformatted. This is unfortunate - I had hoped that by removing the physical disk the simulated disk would appear formatted.

 

The second screenshot shows that you stopped the array and re-assigned the physical disk to the slot. This is not what you want to do! The physical disk represents a possibility of recovering data, as does the simulated disk. Although it is possible that both disks represent the same state, this is not necessarily true. Bottom line - DO NOT START THE ARRAY WITH THE PHYSICAL DISK ASSIGNED TO THE SLOT. If you do, unRAID will immediately begin to overwrite the physical disk with the contents of the simulated disk. And we know the result - the physical disk will still be unformatted.

 

What you should do is again remove the physical disk from the slot, start the array in maintenance mode, and run reiserfsck on the SIMULATED disk (/dev/md8). (This was option 3 in my earlier post.) This will attempt to recover the data from the corrupted simulated disk. It may take many hours to complete. You will still have the physical disk unaltered and may be able to try the same procedure on it if this fails. One or the other may succeed. Or they may both fail. We don't know. But if you rebuild the simulated disk on top of the physical disk, you know they will both have the identical result.
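As a rough sketch of that sequence, the script below only prints the reiserfsck commands and does not run them. It first checks that the device is not listed as mounted, which is the point of maintenance mode. `DEVICE`, `MOUNTS`, `is_mounted` and `print_repair_plan` are names invented for this example, not unRAID settings or tools.

```shell
#!/bin/sh
# Sketch only: prints the reiserfsck commands instead of running them.
DEVICE="${DEVICE:-/dev/md8}"      # simulated disk 8 from this thread
MOUNTS="${MOUNTS:-/proc/mounts}"  # kernel's table of mounted file systems

# Succeeds if the device appears in the first column of the mounts table.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$2"
}

# Prints the read-only check first, then the rebuild, unless the device is mounted.
print_repair_plan() {
    if is_mounted "$1" "$2"; then
        echo "refusing: $1 is mounted; start the array in maintenance mode"
        return 1
    fi
    echo "reiserfsck --check $1"
    echo "reiserfsck --rebuild-tree $1"
}

print_repair_plan "$DEVICE" "$MOUNTS" || true
```

Printing rather than executing keeps a person in the loop, which matters here because --rebuild-tree rewrites the file system and the check output should be read first.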

 

Good luck!

Link to comment

FYI, I did not start the array with the physical disk assigned.

 

I have removed the physical disk and will attempt to recover the simulated disk.

 

Just so I know, assuming the simulated disk is corrected by reiserfsck, is the next step to just reassign the physical disk and let it rebuild?

Link to comment

Yes.

 

I'd suggest that you research the command(s) you need to run to have reiserfsck repair the file system. Post the commands here and someone will confirm before you begin.

Link to comment

So I tried running reiserfsck on the simulated disk. Here are the results:

 

root@Tower:/dev# reiserfsck /dev/md8

reiserfsck 3.6.24

 

Will read-only check consistency of the filesystem on /dev/md8

Will put log info to 'stdout'

 

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

###########

reiserfsck --check started at Thu Apr  9 19:35:58 2015

###########

Replaying journal: No transactions found

Checking internal tree.. 

 

Bad root block 0. (--rebuild-tree did not complete)

 

Aborted

 

 

NOTE: The program aborted on its own (I did not stop it).

 

When I run this command:

 

reiserfsck --rebuild-tree /dev/md8

 

It is rebuilding fine.  I will wait for the results and let you know in the morning.

 

 

 

 

 

TZ

Link to comment

The reiserfsck --rebuild-tree run is complete with the following results:

 

Flushing..finished

Objects without names 124

Empty lost dirs removed 4

Dirs linked to /lost+found: 34

Dirs without stat data found 3

Files linked to /lost+found 90

Pass 4 - finished done 358651, 27 /sec

Deleted unreachable items 3390

Flushing..finished

Syncing..finished

###########

reiserfsck finished at Sat Apr 11 10:41:12 2015

###########

 

 

What is the next step?

Link to comment

It looks like I lost quite a lot of files. Recovered ~1.6 TB out of ~3.5 TB.

Have you also looked in the lost+found folder to see if there is content there you recognise? If you want to access that folder across the network, you will need to run the newperms command against it to get the permissions correct for network access.
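A quick way to see how much ended up in lost+found is sketched below. The `/mnt/disk8` default assumes the repaired disk is mounted at the usual unRAID location for disk 8. `summarize_lost_found` is a made-up helper name. The newperms invocation in the final comment assumes the command accepts a directory path as its argument.

```shell
#!/bin/sh
# Sketch only: count the top-level entries reiserfsck linked into lost+found.
summarize_lost_found() {
    lf="$1/lost+found"
    [ -d "$lf" ] || { echo "no lost+found under $1"; return 1; }
    dirs=$(find "$lf" -mindepth 1 -maxdepth 1 -type d | wc -l)
    files=$(find "$lf" -mindepth 1 -maxdepth 1 -type f | wc -l)
    # Arithmetic expansion strips any padding wc adds around the number.
    echo "dirs=$((dirs)) files=$((files))"
}

summarize_lost_found "${DISK:-/mnt/disk8}" || true

# For network access, run newperms on the folder afterwards, e.g.:
#   newperms /mnt/disk8/lost+found    # path argument is an assumption
```

Entries in lost+found lose their original names, so the counts are only a starting point for looking through the contents by hand.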

Link to comment

Thanks for the help. I was able to recover a lot of files (but not all). I am scratching my head wondering what happened, though. I am not sure why unRAID allowed one failed disk to lose my data!!

 

My guess is that one disk had an issue (after the power loss), and when unRAID booted, it ran an automatic parity check that updated the parity disk with the data from the bad disk.

 

Any ideas?

Link to comment

It is difficult to understand the exact series of events.

 

When power is lost, a journaled file system like reiserfs is typically not corrupted. Upon mounting the disk, the file system replays the last few transactions from the journal, and you often see this in the syslog. The parity drive has no journaling and is not even a file system. After a power outage, it is not uncommon for parity to be missing recent updates. The automatic parity check after a dirty shutdown is meant to synchronize parity with the data drives (the data on the drives always takes precedence).

 

That is the way it is supposed to work, and 99.9% of the time it works very well. The fact that the failure caused your drive to become unformatted is unique in my lengthy experience on the forum. And the fact that the simulated drive is simultaneously unformatted is doubly confusing. It seems as though parity was synced with the unformatted disk, but I am not sure. That would have taken many hours and would have been hard to miss. You can run reiserfsck against the physical disk. It may give a better or different recovery outcome if parity was not in sync with it.

 

I still think something happened that we are missing and may never fully understand. unRAID protection is not perfect, but it is very effective most of the time. I'm glad you recovered a good portion of your data and hope that trying the physical disk might yield more. But sorry it was not perfect. Hate to use the "b" word in this situation (no, not that "b" word), but backups are needed to protect data you cannot afford to lose. I hope the missing files can be recreated (e.g., by re-ripping source discs) and that you didn't lose irreplaceable memories or financial records.

 

One more point: your recovery occurred on the simulated disk. This is a tenuous recovery. Any new disk failure or dirty shutdown can cause the whole simulated drive's contents to be lost. You need to rebuild onto a physical disk very soon to better protect that data and reestablish parity protection.

 

Best of luck!

Link to comment

Archived

This topic is now archived and is closed to further replies.
