WEHA

BTRFS cache mounted read-only


Hello

 

So I had this issue where unRAID started throwing errors, I believe because of a cache drive disconnect (if that's even possible; doesn't it run from the USB drive?).

I set the array not to start on boot, but because of the errors it apparently didn't save this preference.

Anyway, after rebooting (still not detecting the second cache drive) it started the array without the second cache drive (it's a RAID 1 pool).

It was mounted read-only, I guess because of the earlier disconnect.

I rebooted to get the second drive detected again and tried to re-add the drive in unRAID.

It remained mounted read-only, and btrfs was still saying the drive was missing, even though everything looked correct in the unRAID GUI.

After searching the internet I found the command to replace the drive in the RAID 1 (adding the re-detected drive and removing the missing one).
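The usual form of those commands is something like this (assuming the pool is mounted at /mnt/cache and /dev/nvme1n1p1 is the re-detected drive; exact device names will differ):

btrfs device add /dev/nvme1n1p1 /mnt/cache   # add the re-detected drive back into the pool
btrfs device remove missing /mnt/cache       # drop the stale "missing" device entry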

But I'm still stuck at the problem where it says the drive is mounted read-only.

When I execute mount it says rw, so it's btrfs itself not allowing me to write.
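A quick way to see the mismatch (assuming the pool is mounted at /mnt/cache):

mount | grep /mnt/cache              # the mount flags can still say rw
dmesg | grep -i "forced readonly"    # while btrfs has flipped itself read-only internally
touch /mnt/cache/.writetest          # fails with "Read-only file system" if it has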

The only thing I could find for this situation is that a kernel patch is needed to get it working.

I'm not familiar with how to check for or install such a patch in unRAID.

 

Source: 

https://www.mail-archive.com/search?l=linux-btrfs%40vger.kernel.org&q=subject:"raid1\%3A+cannot+add+disk+to+replace+faulty+because+can+only+mount+fs+as+read\-only."&o=newest

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg60979.html

 

Any suggestions?

 

EDIT: added diagnostics

tower-diagnostics-20180414-0850.zip


There were ENOSPC errors during the balance of the pool:

 

Apr 14 06:24:19 Tower kernel: BTRFS: error (device nvme0n1p1) in btrfs_remove_chunk:2882: errno=-28 No space left
Apr 14 06:24:19 Tower kernel: BTRFS info (device nvme0n1p1): forced readonly
Apr 14 06:24:19 Tower kernel: BTRFS info (device nvme0n1p1): 92 enospc errors during balance

Your best option is to back up any data on the pool, then format and restore. You can use the link below for help with the backup:

https://lime-technology.com/forums/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490
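A minimal sketch of the backup step, assuming the pool contents get copied to a folder on one of the array disks (paths here are only examples):

rsync -avh /mnt/cache/ /mnt/disk1/cache_backup/   # copy everything off the pool
diff -rq /mnt/cache/ /mnt/disk1/cache_backup/     # optionally verify the copy before formatting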

 


Thank you for your quick reply.

I was able to copy the data earlier, I hope there is no corruption.

What are ENOSPC errors exactly? Can't seem to find a simple description :)

 

What would be the best way to format the drives:

- using the GUI: just delete the partition and re-add as cache?

- is a full wipe necessary?

- a btrfs command?

 

thanks!

 

 

43 minutes ago, WEHA said:

What are ENOSPC errors exactly?

Not enough space. ENOSPC is the standard kernel error code for "No space left on device" (errno 28), which is the errno=-28 in your log.

 

43 minutes ago, WEHA said:

What would be the best way to format the drivers:

Use blkdiscard so they are completely wiped:

 

blkdiscard /dev/nvme0n1

and

blkdiscard /dev/nvme1n1 
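(Note: blkdiscard issues a discard/TRIM across the entire device and irreversibly destroys all data on it, so only run it after the backup has been verified.)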

Then format pool


There was 100GB of free space? How is there no space left?

 

Pool has been formatted, restoring data currently.

Shares are empty, but appdata & system were on my cache.

Will this restore itself once I restart unRAID?

44 minutes ago, WEHA said:

There was 100GB of free space? How is there no space left?

There could be an allocation or corruption problem resulting in the errors.
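For some background: btrfs allocates device space in large chunks, and a balance needs unallocated device space to write new chunks into, so a pool can show plenty of free space inside existing chunks while having no unallocated space left, which produces ENOSPC. Assuming the pool is mounted at /mnt/cache, you can see the split with:

btrfs filesystem usage /mnt/cache   # check the "Device unallocated" line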

 

44 minutes ago, WEHA said:

Shares are emtpy but appdata & system was on my cache.

This will restore itself once I restart unraid?

Not quite clear on what you're asking; shares are just top-level folders, both on the data disks and the cache pool.


I mean the share configuration.

When I go to Shares now in the top menu, it's empty :(

15 minutes ago, johnnie.black said:

Post new diagnostics

Everything is now copied back, stopped and started array: no exportable shares.

New diag attached

tower-diagnostics-20180414-1538.zip

 

Something strange though when I "ls /mnt":


16K drwxrwxrwx  1 nobody users 106 Apr 14 15:32 cache/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk1/
  0 drwxrwxrwx  4 nobody users  43 Apr 14 10:00 disk2/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk3/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk4/
  0 drwxrwxrwx 11 nobody users 167 Apr 14 10:00 disk5/
  0 drwxrwxrwx  5 nobody users 100 Apr 14 15:35 disks/
  ? d?????????  ? ?      ?       ?            ? user/
  0 drwxrwxrwx  1 nobody users  19 Apr 14 10:00 user0/
 

/bin/ls: cannot access 'user': Transport endpoint is not connected
 

 

When I "ls user0", those contain the non-cache shares


So I did the "have you tried turning it off and on again" scenario, and the shares are back!

Now I'm getting to work on the Dockers and VMs, will update soon.

23 minutes ago, WEHA said:

have you tried turning it off and on again

That was what I was going to suggest, since everything looked normal in the log. You should also update to the latest release.

9 minutes ago, johnnie.black said:

you should also update to the latest release.

I wasn't aware of a new version until today, since the GUI didn't (and still doesn't) mention it like it did last time.

Updating is the last item on my to-do list :)

 

So, next problem: I rebooted and of course my second cache drive got "undetected" again.

Updated the SSD firmware this time (and the BIOS this morning), hoping this fixes it.

Rebooted, reassigned the second drive... and now it's showing as RAID 0 in terms of space.

 

Tried a rebalance with -dconvert=raid1 -mconvert=raid1 but it did nothing.
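For reference, the full invocation would be along these lines (assuming the pool is mounted at /mnt/cache):

btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache   # convert data and metadata profiles to raid1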

Do I have to convert it to single first?

 

Dashboard shows 1.5TB size, 785GB in use.

Status (these are 2 x 1TB NVMe SSDs, FYI):

btrfs filesystem df:
Data, RAID1: total=732.00GiB, used=731.15GiB
System, RAID1: total=32.00MiB, used=144.00KiB
Metadata, RAID1: total=2.00GiB, used=908.31MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
btrfs balance status:
No balance found on '/mnt/cache'

9 minutes ago, WEHA said:

Do I have to convert it to single first?

 

No, but data is shown as RAID 1. Upgrade to v6.5 and reboot; if it doesn't go back to normal, post new diags.


It's normalized now.

Too bad the GUI doesn't show the "in progress" state.

 

Anyway, thank you for your assistance!


The new release disables the Stop array button and shows "btrfs operation in progress".

