
Removing cache pool?


alexricher


Good day Unraid Community,

 

I've been running Unraid with a cache drive (1TB WD Black) for a long time. When the cache pool feature was introduced, I added a 500GB SSD for a total of 2 drives. Unraid shows the pool (1TB + 500GB) as 750GB.

 

Recently, the WD 1TB drive has been giving me tons of errors in syslog, and SMART is telling me it's slowly dying, with pending sectors and offline uncorrectable errors. I'd like to remove the drive from the cache pool and go back to only 1 drive: the 500GB SSD. I don't mind losing the redundancy, as I've made a backup for the moment.

 

Now the issue: I cannot for the life of me find how to remove the drive from the cache pool successfully. I've read multiple forum posts over the last week, and most people end up replacing their drive. There was a thread a while ago (https://lime-technology.com/forum/index.php?topic=39774.0) that was closer to my need. I replied to it, but no one answered, probably because it was too old.

 

This morning, my dockers and VM stopped working because of the faulty drive. I've tried the steps mentioned in the threads I found, but whenever I remove the first drive (1TB), I cannot reduce the cache pool to 1 drive. It remains at 2. It cannot be rebalanced unless both drives are present. If I start it without both drives, the cache won't mount. The syslog is giving me tons of errors about the drive, so for the last week I've been running without dockers/VMs, but I'd like to fix this now.

 

Would anyone be kind enough to point me in the right direction? I love Unraid, and it's only when things start acting up that you realize how dependent we are on our Unraid servers. ;) Don't hesitate to ask if you need more info to help me solve this issue. I'll be spending the day trying to fix this... :)

 

Thanks for your help and have a great day!

Link to comment

Thanks for your reply trurl! My cache was almost full; I was probably using >400GB. I've backed up and removed almost everything from the cache drive, and I'm now trying a rebalance to see if it helps. I'll do a search as suggested, hoping to find more useful information, and report back if it still isn't clear.

Link to comment

The pool will then think the drive has failed.. That is how the redundancy should work.. Then add another drive to get it back.. I am not sure how you would continue with only one drive..

 

If you remove one disk from the pool (don’t forget it has to be disconnected) and start the array, the cache will be rebalanced to the single disk. In the future you can add another disk and it will rebalance again.

Link to comment

ahaa.. so you will never be in a "degraded state".. with one disk less it will just "transform" to a single disk "pool" ?

 

It will be in a degraded state if one pool disk drops offline, but contrary to what happens with an array disk, it won’t redball. The easiest way to tell is that one disk stops showing temp info. But if you remove the failed disk and start the array without it, it will automatically rebalance to a single disk.

Link to comment

That's interesting info! :) Thanks johnnie.black and everyone else for this valuable info. I wasn't aware that I had to disconnect the drive from the server for it to rebalance to a single disk. I was under the impression that unassigning it would be enough for Unraid to understand the disk isn't available.

 

I must admit I find it a bit unusual that even with only 1 drive assigned as cache, it remains a pool of only 1 drive. Not that I mind; I actually like the idea, because whenever I get a new drive, I can assign it and restore redundancy. :)

 

I'll disconnect the drive and report back. Thanks guys!

Link to comment

Alright, so I've unplugged the 1TB WD HDD and started the array with only the 2nd disk present (the first one shows "Not Installed"). At first, nothing happened, so I went into the cache settings and ran a balance manually. After it completed, I was able to access my cache drive's content and start my dockers! :D

 

Now, the next questions:

  • Unraid states on the Main page that my cache size is 750GB. How's that possible if I currently only have 500GB installed?
  • When I check the used size for all content on the cache share drive, I seem to use only 243GB, yet it tells me I only have 97GB left on the HDD. 243GB+97GB=340GB... Do I truly have 500GB available?
  • Is there a way I can get back that missing HDD space?
  • Will my cache always remain in this state (where it's missing the first disk and says 750GB)? If so, is there a way to resolve this?
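
On the 750GB question, the arithmetic can be sketched as follows. This is only an illustration, assuming the pool was using the btrfs RAID1 profile (the `btrfs filesystem df` output further down in this thread does show RAID1). With two copies of every block on different devices, usable space is limited both by half the raw total and by what the other devices can mirror. The function name `raid1_usable` and the nominal sizes in GB are made up for this example.

```python
def raid1_usable(sizes_gb):
    """Rough usable space for a two-copy (RAID1) pool, in the same unit as the input.

    Assumption: every block is stored on two different devices, so the
    largest device can only be used as far as the others can mirror it.
    """
    total = sum(sizes_gb)
    largest = max(sizes_gb)
    return min(total / 2, total - largest)


pool = [1000, 500]  # nominal sizes in GB: 1TB HDD + 500GB SSD

print("usable with mirroring:", raid1_usable(pool))
print("naive half of raw total:", sum(pool) / 2)
```

The second figure (half of the raw total) happens to match the 750GB shown on the Main page, while the mirrored figure is bounded by the 500GB SSD. Whether the GUI actually computes its number that way is a guess.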

 

Thanks for your continuous support!

 

[EDIT]: Even if I seem to see 97GB left available, I cannot seem to get any more content on the drive. For instance, Sabnzbd tells me 97GB left but when I extract a 2GB Rar file, it says "disk full"! Now, I'm even more confused...

Link to comment

It should rebalance automatically,  but it takes a while.

 

When it's done space is reported correctly.

 

Thanks johnnie. :) If I run a "Balance" manually and it completes, should it then show the right amount of space? I've noticed this:

 

btrfs filesystem show:
Label: none  uuid: a6bc1b21-937a-40b1-a1d8-08ebcbb8f147
Total devices 2 FS bytes used 270.98GiB
devid    2 size 465.76GiB used 273.03GiB path /dev/sdn1
*** Some devices missing

btrfs-progs v4.1.2
btrfs filesystem df:
Data, RAID1: total=270.00GiB, used=269.35GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=3.00GiB, used=1.63GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

 

If I read correctly, it now states 270GB as my total space with 269GB used...
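
For anyone reading these numbers later: in `btrfs filesystem df`, `total` is the space already allocated to chunks of that type, not the size of the device, and `used` is how much of that allocation holds data. A small sketch that parses the output pasted above and prints the free space left inside the allocated chunks; the helper name `parse_df` and the regular expression are only illustrative, not part of any btrfs tooling.

```python
import re

# Output copied from the post above (btrfs filesystem df).
DF_OUTPUT = """\
Data, RAID1: total=270.00GiB, used=269.35GiB
System, RAID1: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=3.00GiB, used=1.63GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
LINE = re.compile(
    r"^(\w+), (\w+): total=([\d.]+)([A-Za-z]+), used=([\d.]+)([A-Za-z]+)$"
)


def parse_df(text):
    """Return (type, profile, total_bytes, used_bytes) for each recognised line."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            kind, profile, total, t_unit, used, u_unit = m.groups()
            rows.append(
                (kind, profile, float(total) * UNITS[t_unit], float(used) * UNITS[u_unit])
            )
    return rows


for kind, profile, total, used in parse_df(DF_OUTPUT):
    free_gib = (total - used) / 2**30
    print(f"{kind:13s} {profile:6s} free inside allocated chunks: {free_gib:.2f} GiB")
```

Read this way, the 270GiB is what has been allocated for data so far, out of the 465.76GiB device reported by `btrfs filesystem show`.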

Link to comment

Hmm. I wonder if it's trying to maintain RAID1 fault tolerance with a single device.

 

I'm no expert but I was under the same assumption... If so, this is still dangerous, no? I mean, if the drive fails, its fault tolerance will be on the same drive; therefore, no backup whatsoever.

 

What's my next step in order to maximize my 500GB? :)

Link to comment

Do you see any read/write activity in the cache disk?

 

This normally works like this:

 

-start array with one cache disk missing

-balance will begin; there will be some read/write activity, and it will take some time depending on ssd/hdd

-when done, read/write activity stops and btrfs filesystem show will show the new number of devices (in your case should be one) and will stop saying that a device is missing.
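
The completion check in the last step can be sketched as a small script. It only looks for the two strings mentioned above in text captured from `btrfs filesystem show`; the function name `balance_finished` is made up for this example, and the sample text is the output posted earlier in the thread.

```python
import re

# Output copied from earlier in this thread (btrfs filesystem show).
SHOW_OUTPUT = """\
Label: none  uuid: a6bc1b21-937a-40b1-a1d8-08ebcbb8f147
Total devices 2 FS bytes used 270.98GiB
devid    2 size 465.76GiB used 273.03GiB path /dev/sdn1
*** Some devices missing
"""


def balance_finished(show_output):
    """True once the pool reports a single device and no missing devices."""
    m = re.search(r"Total devices (\d+)", show_output)
    devices = int(m.group(1)) if m else None
    missing = "Some devices missing" in show_output
    return devices == 1 and not missing


print(balance_finished(SHOW_OUTPUT))
```

With the output above, the check fails on both counts: the pool still lists 2 devices and still reports one as missing.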

 

 

If there's no read/write activity something went wrong.

 

 

Link to comment

I see write activity in the "Stats" tab; however, I cannot tell which drive it is. But if I look at the cache details page, I see changes in this section:

 

Label: none  uuid: a6bc1b21-937a-40b1-a1d8-08ebcbb8f147
Total devices 2 FS bytes used 267.05GiB
devid    2 size 465.76GiB used 293.03GiB path /dev/sdm1
*** Some devices missing

btrfs-progs v4.1.2
btrfs filesystem df:
>>>>>> Data, RAID1: total=260.00GiB, used=254.27GiB <<<<<<<<
Data, single: total=30.00GiB, used=11.15GiB
System, single: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=3.00GiB, used=1.62GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

 

Every time I refresh the page, I get a new RAID1 total:

 

Data, RAID1: total=256.00GiB, used=250.26GiB
(...)
Data, RAID1: total=247.00GiB, used=241.07GiB

 

Is this what we're looking for?

Link to comment

It seems to have stalled:

 

Data, RAID1: total=238.00GiB, used=231.97GiB
Data, single: total=52.00GiB, used=33.46GiB
System, single: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=3.00GiB, used=1.62GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

 

No more [Data, single] changes, nor [Data, RAID1]. Is it safe to assume it's done balancing everything? I was expecting the Single total to be bigger... Any thoughts?
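
One way to read this output, assuming the automatic balance is converting to the single profile (which the growing `Data, single` line suggests): as long as any line still says RAID1, the conversion has not finished. A minimal sketch with an illustrative helper name, `raid1_groups`, run against the stalled output above:

```python
import re

# Stalled output copied from the post above (btrfs filesystem df).
STALLED = """\
Data, RAID1: total=238.00GiB, used=231.97GiB
Data, single: total=52.00GiB, used=33.46GiB
System, single: total=32.00MiB, used=80.00KiB
Metadata, RAID1: total=3.00GiB, used=1.62GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""


def raid1_groups(df_output):
    """Return the block group types that still have chunks in the RAID1 profile."""
    return [m.group(1) for m in re.finditer(r"^(\w+), RAID1:", df_output, re.MULTILINE)]


print(raid1_groups(STALLED))  # ['Data', 'Metadata']
```

Here both Data and Metadata still have RAID1 chunks, so by this reading the balance stopped partway rather than finishing.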

Link to comment

Balance will be complete when "btrfs filesystem show" displays "Total devices 1" with no mention of "Some devices missing". If it still shows that and there's no activity, something went wrong, possibly damage to the fs caused by the bad disk. You can try stopping and starting the array again; if there's no progress, it's probably best to back up the cache, reformat it and restore the data.

Link to comment

Thanks for your reply. Yeah, something must have gone wrong, because it stopped balancing. After a while, I stopped the array and restarted it, and now whatever I do, the cache shows as "Unmountable"... :( I guess the next step is reformatting the cache SSD and restoring it from a backup?

Link to comment

In case someone else wonders, I've rebuilt the cache from a backup and I'm now back in business. No more cache pooling for now; when I'm ready, I'll add another drive and recreate the pool.

 

Thanks everyone for your help, it's really appreciated! Gotta love this Unraid community. Well worth my license cost! :D Keep on rockin' with this nice piece of software!

Link to comment

Archived

This topic is now archived and is closed to further replies.
