Disk activity reported even though drive spun down

reggierat · April 18, 2015

The issue first popped up in b14 after doing some maintenance on my array. I purchased 2 x 4TB drives; one of these replaced my 3TB parity drive, and then my two oldest drives were replaced with the 3TB and the other 4TB. I also used this opportunity to migrate all array disks to XFS. It was only after moving to XFS that I started noticing the problem; up until then b14 had been very good. I have now moved to beta15 and the issue still persists.

As you can see from the attached screenshot, all array drives are spun down, but disk 3 is still reporting its temperature, and it is also reporting 'disk activity' in the syslog when I turn on debugging for s3 sleep.

All cabling has been double-checked, and I replaced the SATA cable on disk 3.

If I manually spin up all drives and then manually spin them down, the issue goes away. It is most often reported on disk 3, but occasionally on disk 1 and disk 2, so I don't really think the drive is at fault.

SMART report for the affected drive:

Disk 3 attached to port: sde
ID#	ATTRIBUTE NAME	FLAG	VALUE	WORST	THRESH	TYPE	UPDATED	FAILED	RAW VALUE
1	Raw Read Error Rate	0x000b	100	100	016	Pre-fail	Always	Never	0
2	Throughput Performance	0x0005	136	136	054	Pre-fail	Offline	Never	93
3	Spin Up Time	0x0007	135	135	024	Pre-fail	Always	Never	406 (Average 407)
4	Start Stop Count	0x0012	099	099	000	Old age	Always	Never	7950
5	Reallocated Sector Ct	0x0033	100	100	005	Pre-fail	Always	Never	0
7	Seek Error Rate	0x000b	100	100	067	Pre-fail	Always	Never	0
8	Seek Time Performance	0x0005	146	146	020	Pre-fail	Offline	Never	29
9	Power On Hours	0x0012	097	097	000	Old age	Always	Never	24437
10	Spin Retry Count	0x0013	100	100	060	Pre-fail	Always	Never	0
12	Power Cycle Count	0x0032	100	100	000	Old age	Always	Never	1380
192	Power-Off Retract Count	0x0032	094	094	000	Old age	Always	Never	7951
193	Load Cycle Count	0x0012	094	094	000	Old age	Always	Never	7951
194	Temperature Celsius	0x0002	200	200	000	Old age	Always	Never	30 (Min/Max 8/46)
196	Reallocated Event Count	0x0032	100	100	000	Old age	Always	Never	0
197	Current Pending Sector	0x0022	100	100	000	Old age	Always	Never	0
198	Offline Uncorrectable	0x0008	100	100	000	Old age	Offline	Never	0
199	UDMA CRC Error Count	0x000a	200	200	000	Old age	Always	Never	378

syslog.zip

reggierat · April 19, 2015

Just observing things again today. I don't understand the inner workings enough to know for sure, but it seems the disk is getting out of sync with the others with regard to its temp and spun up/down status.

When the server is woken up after a period of time, all the disks get spun up except disk 3 (because it's currently empty and the mover hasn't chosen to start writing to it yet). After the polling_attribute timer, all disk temps are read, including that of the disk that is spun down (disk 3). After some inactivity all disks will be spun down and temps will stop being reported by the GUI. The drive that was not spun up will continue to report its temp until I finally spin it up manually and either let it spin down on its own or spin it down manually.

reggierat · April 23, 2015

The issue still persists: all disks will be spun down in the GUI, but 1 disk (sde) will still be reporting 'disk activity' according to the debug for s3_sleep.

Does anyone have any idea how this is possible if the disk is spun down?

jonp · April 24, 2015

Please disable all plugins, run tests again, and report back with another syslog. You have a lot of plugins installed which may have something to do with this.

reggierat · April 24, 2015

Here is a syslog with docker disabled and all plugins removed except s3_sleep. I have kept this installed as it is what I'm using to check on the mystery disk activity.

For this syslog I rebooted the server, waited 5 minutes, manually spun down all disks, observed that the disks were spun down, manually put the server to sleep, manually woke it up using WOL, and checked that all disks were still showing as spun down. At 7.05am I enabled s3 debug logging; the disks are still spun down but disk activity is being reported. This is in fact before any temps have been taken, so the temps may have just been confusing the issue previously.

To summarize: the issue occurs if a disk has not been spun up after waking from sleep. Even though the disk is spun down in the GUI, it is incorrectly reporting disk activity for some reason. Manually spinning the drive(s) up always fixes the issue.

syslog.zip

reggierat · April 24, 2015

As a band-aid I'm thinking I just spin up all drives on wake. Could someone help me with the custom command to do this?

I tested this from the console, and it spins up all drives:

for disknum in 0 `ls /dev/md* | sed "sX/dev/mdXX"`; do /root/mdcmd spinup $disknum; done

But adding this to the area for commands after wake does nothing.
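For readers who want to adapt the one-liner, here is a minimal sketch of the same loop written out as a small script. The `/root/mdcmd spinup N` call and the explicit slot 0 come from the command above; the helper names `disk_numbers` and `spin_up_all` are hypothetical and only there for readability. It assumes, as the original does, that each array disk has a `/dev/mdN` device.

```shell
#!/bin/bash
# Sketch: spin up every unRAID array slot, one mdcmd call per slot.
# The mdcmd path is taken from the thread; override it via MDCMD if needed.
MDCMD="${MDCMD:-/root/mdcmd}"

# Hypothetical helper: print the slot number for each /dev/mdN path given.
# Paths that do not look like /dev/md<digit>... are ignored, which also
# covers an unmatched glob being passed through literally.
disk_numbers() {
    local dev
    for dev in "$@"; do
        case "$dev" in
            /dev/md[0-9]*) echo "${dev#/dev/md}" ;;
        esac
    done
}

# Hypothetical helper: spin up slot 0 (added explicitly, as in the original
# one-liner) plus every slot that has a /dev/mdN device.
spin_up_all() {
    local n
    for n in 0 $(disk_numbers /dev/md*); do
        "$MDCMD" spinup "$n"
    done
}

# Only act when mdcmd is actually present, so the file is safe to source.
if [ -x "$MDCMD" ]; then
    spin_up_all
fi
```

The behaviour is intended to match the one-liner; the only differences are that the `sed` call is replaced by shell parameter expansion and that nothing runs on a machine without `mdcmd`.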

RobJ · April 25, 2015

Well, I've looked at both syslogs, and I can't honestly say I see any problems. In the early days of user experimentation with S3 sleep and waking, there were so many errors, I have no idea how it ever worked! In yours, there are problems trying to recover the drives, but it does seem to work, somehow. I'm sure you know that LimeTech does not, cannot support S3 sleep.

I don't think I see anything to worry about. There has been a known issue with drive spin status and temps sometimes being out of sync with reality, a minor glitch.

reggierat · April 25, 2015

I realise it is not officially supported, but this was working perfectly up until b14. The 'disk activity' that occurs prevents the server from going back to sleep if the above conditions have been met, i.e. the server wakes up but a particular drive has not been spun up before the server is due to go back to sleep. If I could just get a post-wake command to spin all drives up as a workaround, this would at least get it behaving as it was before the bug started occurring.

reggierat · April 27, 2015

In the end I modified the s3 sleep script to spin up all my drives when the server comes out of sleep. This is at least a workaround for the issue.

Wally · May 8, 2015

reggierat,

How did you modify the S3 sleep script? I'm having a similar problem as my drives are spun up after coming out of S3 sleep but unRAID thinks they are spun down since that's how they were just before S3. Because of this, my server won't go to sleep again unless I either do a manual spinup or spindown to sync the status of the drives and unRAID. I tried adding post S3 commands to spinup with no success.

bonienl · May 9, 2015

I tested reggierat's spin-up command (the for loop over /dev/md* from his April 24 post) by putting it in the "Custom commands after wake-up" box, and it works fine for me. All my disks get spun up after waking up.

This is also the preferred method, rather than changing the code of the s3_sleep script itself.

What exactly did you do?

I enabled debug logging to the syslog, and this is the output:

May 9 07:15:31 vesta s3_sleep: Wake-up now
May 9 07:15:31 vesta s3_sleep: Execute custom commands after wake-up
May 9 07:15:31 vesta kernel: Restarting tasks ... done.
May 9 07:15:31 vesta kernel: mdcmd (154): spinup 0
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (155): spinup 1
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (156): spinup 10
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (157): spinup 2
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (158): spinup 3
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (159): spinup 4
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (160): spinup 5
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (161): spinup 6
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (162): spinup 7
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (163): spinup 8
May 9 07:15:31 vesta kernel:
May 9 07:15:31 vesta kernel: mdcmd (164): spinup 9
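To confirm on another system that the post-wake commands really ran, the same log lines can be filtered for the slots that `mdcmd` reported spinning up. The sketch below assumes the `kernel: mdcmd (N): spinup M` line format shown in this post; the helper name `spun_up_slots` is hypothetical, and the log file path is whatever you pass in (on many systems the syslog lives at `/var/log/syslog`, but that is an assumption here).

```shell
#!/bin/bash
# Sketch: list the array slots that mdcmd logged a spinup for.

# Hypothetical helper: read syslog text on stdin and print the slot numbers
# from lines like "kernel: mdcmd (154): spinup 0", sorted numerically.
spun_up_slots() {
    sed -n 's/.*kernel: mdcmd ([0-9]*): spinup \([0-9][0-9]*\)[[:space:]]*$/\1/p' | sort -n
}

# Only read a log when a readable file is given as the first argument.
if [ -r "${1:-}" ]; then
    spun_up_slots < "$1"
fi
```

If every array slot appears in the output after a wake-up, the custom command was executed; an empty result suggests the sleep was triggered in a way that bypassed the script.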

Wally · May 9, 2015

I just tried the spinup command posted by reggierat and it works if you've let the S3 sleep script do the sleeping but, as expected, not if the sleep is triggered manually thru the webpage. My server is now sleeping normally. Thanks all.

reggierat · May 9, 2015

bonienl: sorry for not seeing your previous reply. I realise I shouldn't be changing things I don't properly understand, but it did provide me with a workaround for this bug. I think perhaps I was manually sleeping the server to test it and thus not getting the code to execute. I'm trying again with it in the post commands section and letting it sleep properly.

Wally: were you experiencing the same bug, where the server would not sleep due to disk activity even though all drives were spun down?

Wally · May 9, 2015

reggierat,

My problem was exactly the same as you described, including the disk activity in the logs and the messed up temp display. Before S3 Sleep activates, the drives are spun down, but when the server is awoken, all drives are spun up as power is applied to them. The problem is that unRAID still thinks they are spun down, so it never starts its timer to spin them down. S3 Sleep checks the drives directly to see if any are spinning, hence the constant log entry of drive activity and the reset of its timer. In my case, drive sdb showed activity until I accessed a file on it and unRAID spun it down normally; then sdc began showing up as active in the logs.

Either spinning all the drives up or down manually will sync the drives' actual status to unRAID and allow S3 Sleep to work properly, as you have noticed. The command you posted works perfectly as long as you let S3 Sleep do the sleeping, as doing it manually is direct and bypasses everything.
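One way to see the mismatch described above is to ask the drives themselves for their power state and compare that with what the GUI shows. The sketch below uses `hdparm -C`, which reports a line such as `drive state is:  standby` or `drive state is:  active/idle`. The helper names are hypothetical, and it is an assumption that `hdparm` is available on the server and that this is comparable to whatever check S3 Sleep performs internally.

```shell
#!/bin/bash
# Sketch: print each drive's real power state for comparison with the GUI.

# Hypothetical helper: reduce `hdparm -C` output (read from stdin) to the
# state word, e.g. "standby" or "active/idle".
parse_power_state() {
    sed -n 's/^[[:space:]]*drive state is:[[:space:]]*//p'
}

# Hypothetical helper: print "<device> <state>" for every block device given.
report_power_states() {
    local dev
    for dev in "$@"; do
        [ -b "$dev" ] || continue
        echo "$dev $(hdparm -C "$dev" 2>/dev/null | parse_power_state)"
    done
}

# Only query drives that are named explicitly on the command line.
if [ "$#" -gt 0 ]; then
    report_power_states "$@"
fi
```

Run as root with the devices of interest, for example `./power_state.sh /dev/sdb /dev/sdc` (the script name is illustrative). A drive listed as `active/idle` while the GUI shows it spun down would match the out-of-sync state described in this post.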

bonienl · May 10, 2015

Thanks for explaining the issue and providing a solution (workaround).

I'm not sure if this can be solved in a proper way, since LT does not officially support s3 sleep. I will keep it in mind, however.

jonp · May 12, 2015

Just moved this to general support. This isn't a bug or a defect, but rather, a consequence of using an unsupported plugin. I appreciate the desire for us to support S3 sleep on unRAID, but it is not a feature on the roadmap for 6.0 at this time. A feature request could be posted (or if one exists, add your support for it), but it's not a priority for us to incorporate this into unRAID in the near-term.

reggierat · May 16, 2015

FWIW this bug is still present in RC-1. I understand your position on this, jonp, but given that the bug was introduced in the last 2 betas, I'm hoping that a future update might resolve it. At least we have a workaround to keep this plugin working.
