unRAID Server Release 6.1.3 Available


limetech


... By the way, the little green dot that indicates the drive is spun up works just fine => it's only the Temp column that's not working [it's filled with asterisks instead of temperatures  :)]

 

Temperature display is controlled independently from the other disk readings on the main page and may cause status to go out of sync.

 

To minimize sync issues, you can lower the Tunable (poll_attributes) time under disk settings (default is 30 minutes), at the expense of more disk reading (smartctl).

 

Ultimately an improved disk spin up/down detection may be needed for the temperature reading, but this is something LT has to look into.

 

What's changed from v5 (and for that matter v4 before it) that made this stop working? In the past, if a disk was spun up, you saw its temp; if not, you didn't. Simple as that. The spin up/down state is clearly recognized (hence the green dot) ... and the temps are shown just fine when the array first starts. It's only if a disk spins down -- and later spins back up from some array activity -- that the temps are missing.

 

Link to comment

I handle this with this code:

 

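// Returns TRUE when the disk is spinning, i.e. hdparm does not report "standby".
function is_disk_running($dev) {
  $state = trim(shell_exec("hdparm -C ".escapeshellarg($dev)." 2>/dev/null | grep -c standby"));
  return ($state == 0);
}

// Returns the temperature of a spinning disk, or "*" when the disk is spun down
// or no numeric reading is available. Readings are cached for 300 seconds.
function get_temp($dev) {
  $tc = "/tmp/.hdd_temp.json";
  $temps = is_file($tc) ? json_decode(file_get_contents($tc), TRUE) : array();
  if (is_disk_running($dev)) {
    if (isset($temps[$dev]) && (time() - $temps[$dev]['timestamp']) < 300) {
      // Cached reading is still fresh.
      return $temps[$dev]['temp'];
    } else {
      // Probe the disk; the raw value is the 10th column of the attribute row.
      $temp = trim(shell_exec("smartctl -A -d sat,12 ".escapeshellarg($dev)." 2>/dev/null | grep -m 1 -i Temperature_Celsius | awk '{print $10}'"));
      $temp = (is_numeric($temp)) ? $temp : "*";
      // Store the new reading together with the time it was taken.
      $temps[$dev] = array('timestamp' => time(),
                           'temp'      => $temp);
      file_put_contents($tc, json_encode($temps));
      return $temp;
    }
  }
  else {
    return "*";
  }
}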

 

I haven't measured the performance penalty, but it keeps the spin status in sync and probes for a new temperature once the cached reading is more than 5 minutes old.
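For anyone who wants to see what the grep/awk part of the smartctl pipeline is doing, here is a rough Python sketch of the same extraction. The names parse_temperature and SAMPLE are made up for illustration, and the sample text is hand-written in the usual 10-column attribute layout, not captured from a real disk.

```python
def parse_temperature(smartctl_output):
    """Return the raw Temperature_Celsius value as an int, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: the name is the 2nd, the raw value the 10th.
        if len(fields) >= 10 and fields[1].lower() == "temperature_celsius":
            raw = fields[9]
            return int(raw) if raw.isdigit() else None
    return None


# Illustrative sample only (assumption), not output from a real disk.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8123
194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       33
"""

print(parse_temperature(SAMPLE))
```

Unlike the grep in the PHP version, this matches the attribute name as a whole field rather than as a substring, which is a small design difference, not something the original code does.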

Link to comment

In the older unRAID versions, emhttp would query the disk SMART info each time a page change or page refresh was done in the webGUI. Hence the advice not to change or refresh webGUI pages too frequently when, for example, a parity operation is in progress, as doing so interferes with the ongoing disk activity.

 

In unRAID v6 the temperature readings (SMART info) are done independently from webGUI page changes, and hence using the webGUI should not (and does not) interfere with disk operation.

 

emhttp has some mechanism built-in to keep track of the disk spin up/down status to know whether temperatures need to be read or not, but as observed sometimes it can go out of sync.

 

Link to comment

Understand -- so which is a "better" fix ....

 

(1)  Add the code gfjardim posted.    [And, if so, HOW and WHERE do I add that?  I'm a total non-Linux guy  :) ]

 

or

 

(2)  Change the Tunable (poll_attributes) time to a lower value (e.g. 5 or 10 minutes).    Do I correctly understand that this should then make the temperatures "appear" after whatever value this is set to?    Also, does this mean that temps, when displayed, are only updated at this interval ??

 

Link to comment

For unassigned devices emhttp doesn't read the temperature and the solution of gfjardim takes care of that in his plugin.

 

For array devices the GUI depends on emhttp and lowering the timer value will shorten the reading interval AFTER emhttp received the trigger that a disk is in spin up state.

 

Link to comment

This has come up a number of times, and is one of the reasons the upgrade guide was written: to deal with the gotchas and behavioral quirks between v6 and earlier. There are a number of recommendations in the Configuring the Settings section, including 'Tunable (poll_attributes)' in the 'Disk Settings'. I've added a feature request to change the default setting for this from 30 minutes to 2 or 3 minutes. I like 2 minutes, but would like to see some testing to discover what impact different values have.

Link to comment

I changed it to 5 (300 seconds) and like that a lot better ... at least when a disk is spinning the temperature is now displayed (albeit with a delay of up to 5 min). I may drop it to 2, as I can't imagine that a few msec of activity every 120 seconds really matters. Suppose it takes a full msec (unlikely) to grab the SMART data from each disk, and that they're all done sequentially, so you waste a full msec/disk to get the temp. Even with a 20 disk array that would be 20 msec every 120000 msec ... or about 0.02% of the time that it would be "wasting".
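The back-of-the-envelope estimate above can be written as a tiny Python helper so other values are easy to try. The function name polling_overhead is made up for this sketch, and the 1 msec per disk is the supposition from the post, not a measurement; a real smartctl call may well take longer, so plugging in a measured figure is the point.

```python
def polling_overhead(disks, ms_per_disk, interval_s):
    """Fraction of each polling interval spent reading SMART data sequentially."""
    busy_ms = disks * ms_per_disk
    return busy_ms / (interval_s * 1000.0)


# 20 disks, 1 msec each (assumed), polled every 120 seconds.
overhead = polling_overhead(disks=20, ms_per_disk=1.0, interval_s=120)
print(f"{overhead:.4%}")  # prints 0.0167%
```

With 100 msec per disk instead of 1, the same call gives about 1.7%, which is still small but no longer negligible.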

 

 

Link to comment

 

I posted the code only to show the logic I'm using. The bottom line is that emhttp should probe for a new temperature if the device was in standby and is now spinning, and discard the old temperature record if it has spun down. "poll_attributes" should then only govern how often the temperature is refreshed.
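To make that logic concrete, here is a minimal Python sketch of it. This is not how emhttp works internally; TempTracker and its parameters are hypothetical names, and the spin check and temperature read are passed in as plain callables so the example runs without hdparm or smartctl.

```python
import time


class TempTracker:
    """Keeps spin state and cached temperature in sync for each device."""

    def __init__(self, is_spinning, read_temp, poll_interval=300, clock=time.time):
        self.is_spinning = is_spinning    # callable: dev -> bool
        self.read_temp = read_temp        # callable: dev -> int or None
        self.poll_interval = poll_interval
        self.clock = clock
        self.cache = {}                   # dev -> (timestamp, temp)

    def get_temp(self, dev):
        if not self.is_spinning(dev):
            # Spun down: discard the old record so a stale value is never shown.
            self.cache.pop(dev, None)
            return "*"
        now = self.clock()
        entry = self.cache.get(dev)
        # No record means the disk just spun up (or was never read): probe now.
        # Otherwise re-probe only once the polling interval has elapsed.
        if entry is None or now - entry[0] >= self.poll_interval:
            temp = self.read_temp(dev)
            if temp is None:
                return "*"
            self.cache[dev] = (now, temp)
            return temp
        return entry[1]


# Usage with fake callables standing in for hdparm and smartctl.
state = {"spinning": True, "temp": 30, "now": 0}
tracker = TempTracker(lambda dev: state["spinning"],
                      lambda dev: state["temp"],
                      poll_interval=300,
                      clock=lambda: state["now"])
print(tracker.get_temp("/dev/sdb"))
```

Because the record is dropped on spin-down, the first call after a spin-up always probes immediately, regardless of how much of the polling interval is left.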

 

 

Link to comment

 

One issue with that is that hdparm apparently doesn't work with certain controllers (Areca, others?).  However, UnMENU's MyMain has been able to get around that with some special detection and programming.

Link to comment

 

unRAID doesn't support it either. If anyone with an Areca card could send me some code for detecting spin status and probing temperature, I'll gladly add it.

 

But let's stay on topic: the question here is how to keep spin status and temperature probing in sync. IMHO, if the disk is spinning and there's no valid temperature reading, it should be read immediately, even if the poll_attributes interval hasn't elapsed.

Link to comment

Just upgraded, but now I'm unable to start my VM. I get the following error:

 

unsupported configuration: host doesn't support VFIO PCI passthrough

 

Under 6.1.2 (and earlier) I had a network card passed through, and it was functioning fine.

 

Here is the XML file I have for my VM:

 

<domain type='kvm'>
  <name>ubuntu-server</name>
  <uuid>dccdd050-89bc-6f12-491b-86f2872ca517</uuid>
  <description>Ubuntu Server</description>
  <metadata>
    <vmtemplate name="Custom" icon="ubuntu.png" os="ubuntu"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-2.3'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='2' threads='1'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/cache/virtual-machines/ubuntu-server/vdisk1.img'/>
      <target dev='hdb' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='dmi-to-pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pci-bridge'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:a2:f3:28'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/ubuntu-server.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='vmvga' vram='16384' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x19' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>

 

 

I followed the guide here: http://lime-technology.com/forum/index.php?topic=39638.0

 

Here is my syslinux.cfg file:

 

default /syslinux/menu.c32
menu title Lime Technology
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append pci-stub.ids=8086:1502 initrd=/bzroot
label unRAID OS Safe Mode (no plugins)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest
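
As a diagnostic aid for the VFIO error quoted above, here is a small Python sketch that checks two host-side conditions. As far as I know, libvirt reports that the host lacks VFIO passthrough support when /sys/kernel/iommu_groups has no entries or /dev/vfio/vfio is missing, but treat that as an assumption and not a confirmed cause for this particular system. The function name vfio_prerequisites is made up for the example.

```python
import os


def vfio_prerequisites(iommu_groups_dir="/sys/kernel/iommu_groups",
                       vfio_node="/dev/vfio/vfio"):
    """Report whether the two host-side prerequisites appear to be present."""
    try:
        has_groups = len(os.listdir(iommu_groups_dir)) > 0
    except OSError:
        # Directory missing or unreadable: treat as "no IOMMU groups".
        has_groups = False
    return {
        "iommu_groups_populated": has_groups,
        "vfio_node_present": os.path.exists(vfio_node),
    }


if __name__ == "__main__":
    for name, ok in vfio_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If either line prints "no" on the upgraded server, that at least narrows the problem to the host side instead of the VM XML.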

Link to comment

Just a comment; I will post in General Help once it happens again. Since I updated to this version, the system has twice lost the Pro key and been unable to read/write to the USB drive. Here is the weirdest part: this happens once the array has been up for a few days. The glitch happens and the server keeps running, but the main page turns to the default white theme (I use black), the VPN goes down, and nothing on the main page works except Stop Array. Once I hit Stop, I get the advertising page to buy a key, and the restart button is gone. I have the Dynamix power plugin installed, so I use the restart feature on the right side, just so I don't have to force a power cycle. Either way, it comes back up to a parity check with everything normal.

The only weird thing about my system is that I am using WD Blacks, and during parity checks they reach 50 C and I start getting emails about shutting down, but it never really shuts down. On previous versions this error did happen, but the USB never acted like this. I thought this was a glitch of the flash drive, so I ran several tests to make sure it was not. But since it restarted there is no log file, and since it loses the ability to write to the USB, I'm scratching my head over how I am going to get the log once it happens again. I will post in General Help at that time.

For now I thought this would be a good FYI

 

Thank you

 

Thornwood

Link to comment

unRAID version 6.1.3 doesn't have a function to shutdown upon disk overheating.

 

Do you have anything installed using unMenu, or are you perhaps still running the v5 version of Dynamix Disk Health?

 

It also sounds like your flash drive becomes unreadable at some point; in that case Dynamix reverts to its default settings, which include the white theme.

 

Link to comment

Look through your syslog for USB errors, especially USB disconnects. It seems like you may have the USB flash drive plugged into a USB3 port; try plugging it into a different port. Other users have had USB disconnect issues and "fixed" them by moving the flash drive to a different USB port.
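
If it helps, the syslog search can be sketched in Python like this. The kernel normally logs disconnects as lines such as "usb 1-1: USB disconnect, device number 2"; the function names and the default path /var/log/syslog are assumptions for this example, so adjust the path if your log lives elsewhere.

```python
import re

# Kernel lines look like "usb 1-1: USB disconnect, device number 2".
DISCONNECT = re.compile(r"usb (\S+): USB disconnect, device number (\d+)")


def find_usb_disconnects(lines):
    """Return (port, device_number, line) for every USB disconnect message."""
    hits = []
    for line in lines:
        match = DISCONNECT.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), line.rstrip("\n")))
    return hits


def scan_syslog(path="/var/log/syslog"):
    """Scan a syslog file (path is an assumption) for USB disconnects."""
    with open(path, errors="replace") as fh:
        return find_usb_disconnects(fh)
```

Calling scan_syslog() on the server and printing the result shows which port the disconnects come from, which makes it easier to tell whether moving the flash drive changed anything.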

Link to comment
