s3 Sleep - Realtek r8169 - Occasionally not waking from sleep



I have an ongoing issue with the S3 sleep plugin. Occasionally the server does not wake correctly from sleep. It appears hung, i.e. it is not accessible from the web GUI, SMB, PuTTY, etc., and plugging a monitor into the machine shows no output either. If I press the power button, however, the machine shuts down safely, so I can power it on again without triggering a parity check.

 

The other issue is that the server sometimes does not sleep when it should. This morning I got up to find it had been running all night. It appears that not all settings are loaded correctly on first boot. According to the syslog, the server stayed awake because of sdi, which is my cache drive. S3 sleep is set to exclude the cache drive, but I can see that when the server booted up this evening it did not apply this setting. At 5:29 am I toggled the exclude-cache setting off and on again, which fixed the issue, and now I can see the server is going to sleep soon.

syslog.zip

  • 2 weeks later...

Some people have reported that they need to run specific commands before and/or after S3 sleep.

With the Dynamix s3_sleep plugin you can enter these commands in the designated boxes on the settings page. Exactly which commands are needed depends on your system; a forum search may help.

 

  • 2 weeks later...

Thanks for the replies. I realise the hard drives are a long shot, but this is really annoying me, since this worked flawlessly on v5. I did notice the syslog complaining about an unclean shutdown and telling me to run checkdisk on the flash drive. I ran checkdisk and then managed 10 days without issue, but after those 10 days the issue recurred. Again the system hung when waking from sleep: no network shares, no telnet, and no access when connecting a monitor and keyboard.

 

I have now performed a long format on my flash drive, set it up from scratch again, and re-downloaded the latest Dynamix plugins, including S3 sleep.

 

Apart from the above, I am still having the issue where, on boot, S3 sleep does not set the included/excluded disk parameters correctly.

For example, after installing the plugin and configuring it for the first time, I see the following in the syslog. The drives are set correctly: it includes all array disks and excludes cache and flash.

 

Jan 31 08:15:57 Tower s3_sleep: command-args=-C 1 -a -c -m 30 -D 0
Jan 31 08:15:57 Tower s3_sleep: action mode=sleep
Jan 31 08:15:57 Tower s3_sleep: check disks status=yes
Jan 31 08:15:57 Tower s3_sleep: check network activity=no
Jan 31 08:15:57 Tower s3_sleep: check active devices=no
Jan 31 08:15:57 Tower s3_sleep: check local login=no
Jan 31 08:15:57 Tower s3_sleep: check remote login=no
Jan 31 08:15:57 Tower s3_sleep: version=3.0.0
Jan 31 08:15:57 Tower s3_sleep: ----------------------------------------------
Jan 31 08:15:57 Tower s3_sleep: included disks=sdb sdc sdd sde sdf sdg sdh
Jan 31 08:15:57 Tower s3_sleep: excluded disks=sda sdi
Jan 31 08:15:57 Tower s3_sleep: ----------------------------------------------
Jan 31 08:15:57 Tower s3_sleep: s3_sleep process ID 12960 started, To terminate it, type: s3_sleep -q

 

 

After shutting the system down and powering on again, I see the following when the plugin is loaded. No disks are included and all disks are set to excluded:

 

Jan 31 08:19:17 Tower s3_sleep: command-args=-C 1 -a -c -m 30 -D 0
Jan 31 08:19:17 Tower s3_sleep: action mode=sleep
Jan 31 08:19:17 Tower s3_sleep: check disks status=yes
Jan 31 08:19:17 Tower s3_sleep: check network activity=no
Jan 31 08:19:17 Tower s3_sleep: check active devices=no
Jan 31 08:19:17 Tower s3_sleep: check local login=no
Jan 31 08:19:17 Tower s3_sleep: check remote login=no
Jan 31 08:19:17 Tower s3_sleep: version=3.0.0
Jan 31 08:19:17 Tower s3_sleep: ----------------------------------------------
Jan 31 08:19:17 Tower s3_sleep: included disks=
Jan 31 08:19:17 Tower s3_sleep: excluded disks=sda sdb sdc sdd sde sdf sdg sdh sdi

 

When I go into the plugin's settings, change 'wait for array inactivity' from 'yes, exclude cache' to 'no', click Apply, and then switch back to 'yes, exclude cache', I see the correct settings in the log:

 

Jan 31 08:21:37 Tower s3_sleep: command-args=-C 1 -a -c -m 30 -D 0
Jan 31 08:21:37 Tower s3_sleep: action mode=sleep
Jan 31 08:21:37 Tower s3_sleep: check disks status=yes
Jan 31 08:21:37 Tower s3_sleep: check network activity=no
Jan 31 08:21:37 Tower s3_sleep: check active devices=no
Jan 31 08:21:37 Tower s3_sleep: check local login=no
Jan 31 08:21:37 Tower s3_sleep: check remote login=no
Jan 31 08:21:37 Tower s3_sleep: version=3.0.0
Jan 31 08:21:37 Tower s3_sleep: ----------------------------------------------
Jan 31 08:21:37 Tower s3_sleep: included disks=sdb sdc sdd sde sdf sdg sdh
Jan 31 08:21:37 Tower s3_sleep: excluded disks=sda sdi
Jan 31 08:21:37 Tower s3_sleep: -------------------------
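For anyone who wants to check their own syslog for the same symptom, here is a minimal Python sketch that flags every s3_sleep startup that logged an empty "included disks=" line. The function name and the regular expression are my own assumptions and not part of the plugin; only the line format is taken from the excerpts above.

```python
import re

# Matches s3_sleep disk-list lines in the format seen in the excerpts, e.g.:
#   Jan 31 08:19:17 Tower s3_sleep: included disks=
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+s3_sleep:\s+"
    r"(?P<key>included disks|excluded disks)=(?P<value>.*)$"
)


def find_empty_include_lists(lines):
    """Return the timestamps of s3_sleep startups that logged no included disks."""
    empty = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match["key"] != "included disks":
            continue
        # An empty (or whitespace-only) value means no disk is being monitored.
        if not match["value"].split():
            empty.append(match["stamp"])
    return empty


if __name__ == "__main__":
    sample = [
        "Jan 31 08:15:57 Tower s3_sleep: included disks=sdb sdc sdd sde sdf sdg sdh",
        "Jan 31 08:15:57 Tower s3_sleep: excluded disks=sda sdi",
        "Jan 31 08:19:17 Tower s3_sleep: included disks=",
        "Jan 31 08:19:17 Tower s3_sleep: excluded disks=sda sdb sdc sdd sde sdf sdg sdh sdi",
    ]
    print(find_empty_include_lists(sample))
```

To run it against a real log, pass the lines of the syslog file instead of the sample list, for example `find_empty_include_lists(open(path))`, where `path` is wherever your syslog is stored.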


I'm no expert but what I can see in the "bad log" is this message repeating very often:

"Tower kernel: r8169 0000:03:00.0 eth0: rtl_phyar_cond == 1 (loop: 20, delay: 25)."

 

In the "good boot" the NIC is reporting just this:

Feb  5 04:51:12 Tower kernel: r8169 0000:03:00.0 eth0: link down
Feb  5 04:51:12 Tower kernel: r8169 0000:03:00.0 eth0: link up

 

I can't say whether this is the only cause, but it is probably worth solving first.

Maybe a NIC restart/reset would help (if there is such a thing)?

...

Did some google-fu on that error and it seems it's an issue with the r8169 driver.

Was there a driver update in v6?

 

That's what I could find.

Maybe someone more knowledgeable can chime in.
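To compare a "bad" and a "good" log more systematically, a rough Python sketch like the one below could count how often the r8169 messages quoted above appear. The function name and the counter keys are my own invention; the matched substrings come straight from the log lines in this post, and I am only assuming that other r8169 lines can be ignored.

```python
from collections import Counter


def count_r8169_messages(lines):
    """Count selected r8169 kernel messages in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        if "r8169" not in line:
            continue  # not a message from the Realtek driver
        if "rtl_phyar_cond" in line:
            counts["rtl_phyar_cond"] += 1
        elif "link up" in line:
            counts["link up"] += 1
        elif "link down" in line:
            counts["link down"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Tower kernel: r8169 0000:03:00.0 eth0: rtl_phyar_cond == 1 (loop: 20, delay: 25).",
        "Feb  5 04:51:12 Tower kernel: r8169 0000:03:00.0 eth0: link down",
        "Feb  5 04:51:12 Tower kernel: r8169 0000:03:00.0 eth0: link up",
    ]
    for message, number in count_r8169_messages(sample).items():
        print(f"{message}: {number}")
```

A large "rtl_phyar_cond" count in a log from a failed wake, against none in a log from a successful one, would support the idea that the NIC driver is involved, though it would not prove it.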

  • 2 weeks later...

Thank you for taking the time to look at this. I've done a little googling myself and it does appear to be kernel related. I have also spoken to JonP, who said we are moving to a new kernel in the next beta, so fingers crossed.

 

It's still frustrating: I just made it to 7 days before the issue recurred.

 

If the updated kernel doesn't improve things, do you think something like

http://www.mwave.com.au/product/intel-pro1000-gt-network-adapter-pci-pwla8391gtblk-aa38128

 

would be better?


While it is desirable to have a fully functional rig, you should ask yourself one question:

What is the benefit of S3 over shutdown (S5)?

 

In my case the gain is in the range of seconds.

I decided to go with S5.

 

Just some food for thought.

 

You are right, most of the benefit is measured in seconds, plus the convenience of knowing that if I click a song in the bedroom, the server three rooms away will wake, spin up and start playing. But since I had this working flawlessly in v5, I'm reluctant to take a step backwards, so to speak. Today I purchased an Intel PRO/1000 GT for $45, so we'll see how we go.
