unRAID Server Release 6.2.0-beta21 Available



Hello everyone,

 

Today I updated my Unraid 6.1.9 server to 6.2.0-beta21. The update itself went fine, but then the problems started.

I mainly use Unraid for running my gaming VMs. Everything was fine in 6.1.9.

Since the update to the beta release my VMs are unusable. Once Windows 10 has loaded, it takes anywhere from 5 seconds to 1 minute before I get the blue spinning loading circle in Windows.

I can't click anything (though I can still move the mouse). At the same time the Unraid GUI becomes unresponsive and I have to hard-reset the machine.

I can't find anything useful in the logs (neither in the system log nor in the VM log).

The SSDs are running in the array, not as cache drives.

Anyone else experiencing such a strange issue?

 

MainVM XML:

<domain type='kvm' id='1'>

  <name>SkyFire</name>

  <uuid>2373ab0a-2456-e6fd-0809-b6d8901e7523</uuid>

  <metadata>

    <vmtemplate name="Custom" icon="windows.png" os="windows"/>

  </metadata>

  <memory unit='KiB'>12582912</memory>

  <currentMemory unit='KiB'>12582912</currentMemory>

  <memoryBacking>

    <nosharepages/>

    <locked/>

  </memoryBacking>

  <vcpu placement='static'>6</vcpu>

  <cputune>

    <vcpupin vcpu='0' cpuset='4'/>

    <vcpupin vcpu='1' cpuset='5'/>

    <vcpupin vcpu='2' cpuset='8'/>

    <vcpupin vcpu='3' cpuset='9'/>

    <vcpupin vcpu='4' cpuset='10'/>

    <vcpupin vcpu='5' cpuset='11'/>

  </cputune>

  <resource>

    <partition>/machine</partition>

  </resource>

  <os>

    <type arch='x86_64' machine='pc-i440fx-2.3'>hvm</type>

    <loader type='pflash'>/usr/share/qemu/ovmf-x64/OVMF-pure-efi.fd</loader>

  </os>

  <features>

    <acpi/>

    <apic/>

  </features>

  <cpu mode='host-passthrough'>

    <topology sockets='1' cores='6' threads='1'/>

  </cpu>

  <clock offset='localtime'>

    <timer name='rtc' tickpolicy='catchup'/>

    <timer name='pit' tickpolicy='delay'/>

    <timer name='hpet' present='no'/>

  </clock>

  <on_poweroff>destroy</on_poweroff>

  <on_reboot>restart</on_reboot>

  <on_crash>restart</on_crash>

  <devices>

    <emulator>/usr/bin/qemu-system-x86_64</emulator>

    <disk type='file' device='disk'>

      <driver name='qemu' type='raw' cache='writeback'/>

      <source file='/mnt/user/MainSSD/SkyFire/vdisk1.img'/>

      <backingStore/>

      <target dev='hdc' bus='virtio'/>

      <boot order='1'/>

      <alias name='virtio-disk2'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>

    </disk>

    <disk type='file' device='disk'>

      <driver name='qemu' type='raw' cache='writeback'/>

      <source file='/mnt/user/VDiskMain/SkyFire VM/vdisk2.img'/>

      <backingStore/>

      <target dev='hdd' bus='virtio'/>

      <alias name='virtio-disk3'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>

    </disk>

    <disk type='file' device='cdrom'>

      <driver name='qemu' type='raw'/>

      <source file='/mnt/user/ISOs/Win10_1511_German_x64.iso'/>

      <backingStore/>

      <target dev='hda' bus='ide'/>

      <readonly/>

      <boot order='2'/>

      <alias name='ide0-0-0'/>

      <address type='drive' controller='0' bus='0' target='0' unit='0'/>

    </disk>

    <disk type='file' device='cdrom'>

      <driver name='qemu' type='raw'/>

      <source file='/mnt/user/ISOs/virtio-win-0.1.113.iso'/>

      <backingStore/>

      <target dev='hdb' bus='ide'/>

      <readonly/>

      <alias name='ide0-0-1'/>

      <address type='drive' controller='0' bus='0' target='0' unit='1'/>

    </disk>

    <controller type='usb' index='0'>

      <alias name='usb'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>

    </controller>

    <controller type='pci' index='0' model='pci-root'>

      <alias name='pci.0'/>

    </controller>

    <controller type='ide' index='0'>

      <alias name='ide'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>

    </controller>

    <controller type='virtio-serial' index='0'>

      <alias name='virtio-serial0'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>

    </controller>

    <interface type='bridge'>

      <mac address='52:54:00:8c:23:c1'/>

      <source bridge='br0'/>

      <target dev='vnet0'/>

      <model type='virtio'/>

      <alias name='net0'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>

    </interface>

    <serial type='pty'>

      <source path='/dev/pts/0'/>

      <target port='0'/>

      <alias name='serial0'/>

    </serial>

    <console type='pty' tty='/dev/pts/0'>

      <source path='/dev/pts/0'/>

      <target type='serial' port='0'/>

      <alias name='serial0'/>

    </console>

    <channel type='unix'>

      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/SkyFire.org.qemu.guest_agent.0'/>

      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>

      <alias name='channel0'/>

      <address type='virtio-serial' controller='0' bus='0' port='1'/>

    </channel>

    <hostdev mode='subsystem' type='pci' managed='yes'>

      <driver name='vfio'/>

      <source>

        <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>

      </source>

      <alias name='hostdev0'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>

    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>

      <driver name='vfio'/>

      <source>

        <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>

      </source>

      <alias name='hostdev1'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>

    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>

      <driver name='vfio'/>

      <source>

        <address domain='0x0000' bus='0x00' slot='0x1d' function='0x0'/>

      </source>

      <alias name='hostdev2'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>

    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>

      <driver name='vfio'/>

      <source>

        <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>

      </source>

      <alias name='hostdev3'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>

    </hostdev>

    <memballoon model='virtio'>

      <alias name='balloon0'/>

      <address type='pci' domain='0x0000' bus='0x00' slot='0x10' function='0x0'/>

    </memballoon>

  </devices>

</domain>

 

Hardware:

-----------

CPU: Intel Core i7-3930K @ 4.3GHz

Motherboard: Asus P9X79-Pro

Graphics1: NVIDIA GeForce 210 (for Unraid)

Graphics2: NVIDIA GTX 980 Ti (main gaming VM)

Graphics3: NVIDIA GTX 750 Ti (guest gaming VM)

SSD1: Samsung_SSD_850_EVO_250GB_S21PNXAG319699H

SSD2: KINGSTON_SV300S37A120G (Main SSD Guest VM)

ParityHDD: Seagate_ST3000DM001_3TB

HDD1: WDC_WD20EZRX-00D8PB0_2TB

HDD2: Seagate_ST2000DM001-1CH164_2TB

HDD3: SAMSUNG_HD103SJ_1TB

tower-diagnostics-20160520-2047.zip

Link to comment

Go back to the stable version? Maybe wait for the next beta? It seems LT has been working on the next beta for a while and has been mostly quiet, so maybe they are still having trouble reproducing some of these issues in their lab.

 

 

 


Link to comment

 

 

Yeah... Memtest 5.01 is pretty old and buggy, tbh. :P

We use it at work; leaving it single-threaded works 100% and always catches errors (we leave it running for 72 hours per server).

 

Multithreaded / SMP mode, however... On our lowest-end servers it works alright, but move up to hex-core Xeons with DDR4 and so on, and Memtest doesn't get along with that hardware, haha.

 

Single-threaded mode should be more than adequate at catching memory errors, though. :)

 

Yeah, wasn't DDR4 support only added in Memtest 6.0? @limetech, why don't you upgrade the Memtest included in unRAID?

 

Memtest86 versions have almost always been confusing, and the current state isn't any better. I won't go into the history (you can look it up), but currently there are two sources, both based on the original source code. One is open source and fully distributable, and is the one included with unRAID, but it has unfortunately fallen behind, has only a few devs (perhaps only one), and its last version is 5.01, released in 2013, the exact version we ship. The other was taken commercial by PassMark and has been greatly updated, currently at 6.3.0, with comprehensive support for recent technologies. They do provide a free version with no restrictions on usage, but only as part of a bootable image that doesn't look like it could be bundled with other software. Perhaps there is a way, but I'd be wary of PassMark's lawyers breathing down your neck.

 

Given the current state, the version included is a good first step, but if you have more recent tech, such as DDR4 and modern motherboards and CPUs, you should probably download and create a bootable flash drive with the latest PassMark Memtest86 and run that instead.
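If you go that route, the rough procedure looks something like this (a sketch only; the exact file names depend on what PassMark's current download extracts to, and /dev/sdX has to be a spare flash drive, NOT one of your array disks, since dd overwrites it completely):

    # unpack the free Memtest86 USB image downloaded from PassMark
    unzip memtest86-usb.zip
    # write the image to the spare flash drive (this destroys its contents)
    dd if=memtest86-usb.img of=/dev/sdX bs=1M && sync

Then boot the server from that flash drive and let the test run.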

 

Ah, OK, that makes perfect sense. Thanks.

Link to comment

Hello everyone,

 

Today I updated my Unraid 6.1.9 server to 6.2.0-beta21. The update itself went fine, but then the problems started.

I mainly use Unraid for running my gaming VMs. Everything was fine in 6.1.9.

Since the update to the beta release my VMs are unusable. Once Windows 10 has loaded, it takes anywhere from 5 seconds to 1 minute before I get the blue spinning loading circle in Windows.

I can't click anything (though I can still move the mouse). At the same time the Unraid GUI becomes unresponsive and I have to hard-reset the machine.

I can't find anything useful in the logs (neither in the system log nor in the VM log).

The SSDs are running in the array, not as cache drives.

Anyone else experiencing such a strange issue?

 

Yes, running a VM with any disk on the array would result in a completely unresponsive array, so everything related to the array breaks.

But after countless tries and a ton of support from Eric, I was able to find a workaround that worked for me:

1) Set the tunable "md_num_stripes" to 8192 (or higher)

2) Set the "cache" mode of the vDisk that is on the array to "directsync"

3) Change the filesystem of the disk in the array to ReiserFS (XFS and BTRFS did not work)

Since I have been running these settings (~4 weeks) I have not had a single crash or other issue to report.

But it's a rather drastic workaround, and both 2) and 3) definitely reduce disk speed in my case.
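For reference, this is roughly what #1 and #2 look like in practice. The Disk Settings path is from memory, so treat it as an assumption, and the XML is just your existing vdisk1 definition, trimmed, with only the cache attribute changed:

    Settings -> Disk Settings -> Tunable (md_num_stripes): 8192
    (if I remember right, the array has to be stopped to change it)

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='directsync'/>  <!-- was cache='writeback' -->
      <source file='/mnt/user/MainSSD/SkyFire/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
    </disk>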

 

Apart from that, you could also go back to 6.1, or move your vDisks from the array to a disk that is not part of the array (like the cache disk).

Link to comment

Hello everyone,

 

Today I updated my Unraid 6.1.9 server to 6.2.0-beta21. The update itself went fine, but then the problems started.

I mainly use Unraid for running my gaming VMs. Everything was fine in 6.1.9.

Since the update to the beta release my VMs are unusable. Once Windows 10 has loaded, it takes anywhere from 5 seconds to 1 minute before I get the blue spinning loading circle in Windows.

I can't click anything (though I can still move the mouse). At the same time the Unraid GUI becomes unresponsive and I have to hard-reset the machine.

I can't find anything useful in the logs (neither in the system log nor in the VM log).

The SSDs are running in the array, not as cache drives.

Anyone else experiencing such a strange issue?

 

Yes, running a VM with any disk on the array would result in a completely unresponsive array, so everything related to the array breaks.

But after countless tries and a ton of support from Eric, I was able to find a workaround that worked for me:

1) Set the tunable "md_num_stripes" to 8192 (or higher)

2) Set the "cache" mode of the vDisk that is on the array to "directsync"

3) Change the filesystem of the disk in the array to ReiserFS (XFS and BTRFS did not work)

Since I have been running these settings (~4 weeks) I have not had a single crash or other issue to report.

But it's a rather drastic workaround, and both 2) and 3) definitely reduce disk speed in my case.

Apart from that, you could also go back to 6.1, or move your vDisks from the array to a disk that is not part of the array (like the cache disk).

dAigo, you can probably revert #2 and #3. The num_stripes change (#1) was ultimately the key workaround. I know #3 is a bit of a pain, changing from ReiserFS back to XFS, but I'm curious whether you gain your speed back (+170MB/s) by doing so.

Link to comment

dAigo, you can probably revert #2 and #3. The num_stripes change (#1) was ultimately the key workaround. I know #3 is a bit of a pain, changing from ReiserFS back to XFS, but I'm curious whether you gain your speed back (+170MB/s) by doing so.

Ah, I see where I was wrong. We tried other tunables during the first tests, and I just assumed you were asking me to retry one of them after getting better results with ReiserFS for the Linux VM.

But it seems we actually did not try num_stripes before...

I'll test it later today; I still have enough XFS disks in the array, so it's no big deal.

Link to comment

Does this version have an NFS problem? I upgraded from 5.0.5 to the latest beta, and Kodi doesn't see what's in the media folder, although Kodi does see the NFS server.

I need some help, or else I'll have to rescan 58TB of movies using SMB!

Thanks!

 

There have been configuration changes required for NFS when upgrading from 5.x to the 6.0, 6.1, or 6.2 series. Search the forum announcement threads for each release for the necessary information.

Link to comment

I just upgraded to 6.2.0-beta21 from 6.1.9 with Docker and KVM working.

Docker is working great, but the KVM engine will not start.

The log shows:

May 21 21:01:03 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:01:03 Tower emhttp: shcmd (36773): /etc/rc.d/rc.libvirt start |& logger

May 21 21:01:03 Tower root: no image mounted at /etc/libvirt

May 21 21:02:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:02:02 Tower emhttp: shcmd (36786): /etc/rc.d/rc.libvirt start |& logger

May 21 21:02:02 Tower root: no image mounted at /etc/libvirt

May 21 21:02:26 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:02:26 Tower emhttp: shcmd (36799): /etc/rc.d/rc.libvirt start |& logger

May 21 21:02:26 Tower root: no image mounted at /etc/libvirt

May 21 21:02:26 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:03 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:03 Tower emhttp: shcmd (36812): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:03 Tower root: no image mounted at /etc/libvirt

May 21 21:03:51 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:51 Tower emhttp: shcmd (36825): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:51 Tower root: no image mounted at /etc/libvirt

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:55 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:55 Tower emhttp: shcmd (36838): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:55 Tower root: no image mounted at /etc/libvirt

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/webGui/images/scheduler.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/preclear.disk/icons/userutilities.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/preclear.disk/images/preclear.disk.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/dynamix.system.stats/images/dynamix.system.stats.png: Broken pipe

May 21 21:04:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:04:02 Tower emhttp: shcmd (36851): /etc/rc.d/rc.libvirt start |& logger

May 21 21:04:02 Tower root: no image mounted at /etc/libvirt

May 21 21:04:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:04:02 Tower emhttp: shcmd (36864): /etc/rc.d/rc.libvirt start |& logger

May 21 21:04:02 Tower root: no image mounted at /etc/libvirt

May 21 21:04:03 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:05:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:05:02 Tower emhttp: shcmd (36877): /etc/rc.d/rc.libvirt start |& logger

May 21 21:05:02 Tower root: no image mounted at /etc/libvirt

 

 

What did I miss on the upgrade?

I copied only the files in the root directory of the download over to the flash drive.

 

john

 

Link to comment

I just upgraded to 6.2.0-beta21 from 6.1.9 with Docker and KVM working.

Docker is working great, but the KVM engine will not start.

The log shows:

May 21 21:01:03 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:01:03 Tower emhttp: shcmd (36773): /etc/rc.d/rc.libvirt start |& logger

May 21 21:01:03 Tower root: no image mounted at /etc/libvirt

May 21 21:02:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:02:02 Tower emhttp: shcmd (36786): /etc/rc.d/rc.libvirt start |& logger

May 21 21:02:02 Tower root: no image mounted at /etc/libvirt

May 21 21:02:26 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:02:26 Tower emhttp: shcmd (36799): /etc/rc.d/rc.libvirt start |& logger

May 21 21:02:26 Tower root: no image mounted at /etc/libvirt

May 21 21:02:26 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:03 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:03 Tower emhttp: shcmd (36812): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:03 Tower root: no image mounted at /etc/libvirt

May 21 21:03:51 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:51 Tower emhttp: shcmd (36825): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:51 Tower root: no image mounted at /etc/libvirt

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:52 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:03:55 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:03:55 Tower emhttp: shcmd (36838): /etc/rc.d/rc.libvirt start |& logger

May 21 21:03:55 Tower root: no image mounted at /etc/libvirt

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/webGui/images/scheduler.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/preclear.disk/icons/userutilities.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/preclear.disk/images/preclear.disk.png: Broken pipe

May 21 21:04:00 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:04:00 Tower emhttp: sendFile: sendfile /usr/local/emhttp/plugins/dynamix.system.stats/images/dynamix.system.stats.png: Broken pipe

May 21 21:04:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:04:02 Tower emhttp: shcmd (36851): /etc/rc.d/rc.libvirt start |& logger

May 21 21:04:02 Tower root: no image mounted at /etc/libvirt

May 21 21:04:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:04:02 Tower emhttp: shcmd (36864): /etc/rc.d/rc.libvirt start |& logger

May 21 21:04:02 Tower root: no image mounted at /etc/libvirt

May 21 21:04:03 Tower emhttp: need_authorization: getpeername: Transport endpoint is not connected

May 21 21:05:02 Tower root: /mnt/cache/docker2/ is not a file

May 21 21:05:02 Tower emhttp: shcmd (36877): /etc/rc.d/rc.libvirt start |& logger

May 21 21:05:02 Tower root: no image mounted at /etc/libvirt

 

 

What did I miss on the upgrade?

I copied only the files in the root directory of the download over to the flash drive.

 

john

You need to include the image file name in the path also.
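In other words, that setting has to point at the libvirt image file itself, not at the folder; something like this (libvirt.img is just a placeholder name here, use whatever your image file is actually called, or let the VM Manager create one for you):

    Wrong: /mnt/cache/docker2/
    Right: /mnt/cache/docker2/libvirt.img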

Link to comment

I'm happy to introduce a few more troubleshooting and tweaking tools; two are brand new and the third has been enhanced for troubleshooting purposes.

 

* Fix Common Problems has already had a suite of tests and detections to make sure your unRAID server is configured correctly and to check for common mistakes and defects. It keeps being improved, with more checks.

 

* Fix Common Problems has now added a Troubleshooting mode, for those times the system is crashing or locking up at random, with no way to save the diagnostics. It periodically runs diagnostics and other tests, and maintains a live tail of the syslog to the flash drive, until a crash or reboot. PLEASE use this if your system is crashing unexpectedly! Then grab the saved diagnostics and syslog.txt (from the live tail) and post them.

 

* The Tips and Tweaks wiki page is a collection of the various troubleshooting and performance tips, tweaks, and fixes scattered in multiple places around the forum. It's new and may change often, as new tips are found and old ones are no longer needed or are incorporated into new unRAID releases. Please check it, especially if you are having issues with your system, and provide feedback here about what works for you and what doesn't. We're hoping the results and your feedback may help LimeTech. Plus, we hope to add many more tips and tricks to the page as users suggest them.

 

* The Tips and Tweaks plugin is a companion tool for interactively implementing some of the tips and tweaks from the Tips and Tweaks wiki page, plus a few that haven't made it to the wiki yet. It should make it easier to try some of them out. Again, we want feedback on what helps and what doesn't.

 

The original thought in starting this was to try to help with the bugs and crashes some users have been having here.

Link to comment

So today I am doing my first parity check since upgrading to 6.2.0-beta21. I have two 8TB Seagate drives as parity drives, and the speed of the parity check is all over the place, anywhere from 6.5MB/s up to 58MB/s. Is this normal? Under 6.1.9 with a single 8TB parity drive I used to get speeds no lower than 55MB/s (and higher as time progressed), and parity checks took no longer than 27-29 hours.

Link to comment

dAigo, you can probably revert #2 and #3. The num_stripes change (#1) was ultimately the key workaround. I know #3 is a bit of a pain, changing from ReiserFS back to XFS, but I'm curious whether you gain your speed back (+170MB/s) by doing so.

Ah, I see where I was wrong. We tried other tunables during the first tests, and I just assumed you were asking me to retry one of them after getting better results with ReiserFS for the Linux VM.

But it seems we actually did not try num_stripes before...

I'll test it later today; I still have enough XFS disks in the array, so it's no big deal.

Indeed, #1 seems to do the job; #2 and #3 are not needed. Sorry for the confusion.

 

And yes, with XFS the transfer speed is back to normal.

ReiserFSvsXFS.png

Link to comment

Is there going to be a beta 22? There were a lot of beta updates at first, and it seems to have slowed to nothing. I wish the LT team could communicate this stuff with us a little better.

 

I'm sure we'll have something around the holiday.

 

 

 

Which holiday, I couldn't tell you.

Link to comment

So today I am doing my first parity check since upgrading to 6.2.0-beta21. I have two 8TB Seagate drives as parity drives, and the speed of the parity check is all over the place, anywhere from 6.5MB/s up to 58MB/s. Is this normal? Under 6.1.9 with a single 8TB parity drive I used to get speeds no lower than 55MB/s (and higher as time progressed), and parity checks took no longer than 27-29 hours.

 

It seems those Archive drives are what slow down the parity check. I'm running 6TB x8 + 3TB x3 with an LSI 9240 (IT mode); I get a top speed of 154MB/s, and a dual-parity check finishes in ~14 hrs.

Link to comment

So today I am doing my first parity check since upgrading to 6.2.0-beta21. I have two 8TB Seagate drives as parity drives, and the speed of the parity check is all over the place, anywhere from 6.5MB/s up to 58MB/s. Is this normal? Under 6.1.9 with a single 8TB parity drive I used to get speeds no lower than 55MB/s (and higher as time progressed), and parity checks took no longer than 27-29 hours.

 

Forget the reported speed for a second: what was the total time for the dual-parity check (6.2.0-beta21) using the 8TB drives?

Link to comment

Hi All,

 

I have a few warnings in my logs that I just wanted to put out there; they may help with the beta 21 issues. Diagnostics attached. Cheers.

 

May 23 09:45:58 Core kernel: ACPI: Early table checksum verification disabled

May 23 09:45:58 Core kernel: spurious 8259A interrupt: IRQ7.

May 23 09:45:58 Core kernel: ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20150930/dswload-210)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20150930/psobject-227)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp08) while loading table (20150930/tbxfload-193)

May 23 09:45:58 Core kernel: ACPI Error: 1 table load failures, 8 successful (20150930/tbxfload-214)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)

May 23 09:45:58 Core kernel: acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM

May 23 09:45:58 Core kernel: floppy0: no floppy controllers found

May 23 09:45:58 Core kernel: ACPI Warning: SystemIO range 0x000000000000F040-0x000000000000F05F conflicts with OpRegion 0x000000000000F040-0x000000000000F04F (\_SB_.PCI0.SBUS.SMBI) (20150930/utaddress-254)

May 23 09:46:01 Core rpc.statd[1732]: Failed to read /var/lib/nfs/state: Success

May 23 09:46:37 Core avahi-daemon[8661]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!

core-diagnostics-20160523-1010.zip

Link to comment

So, as a further update to my issue, things are now getting worse on the PCIe front, and I can now replicate this issue time after time on my system. My diagnostics file is attached.

 

Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0: AER: Corrected error received: id=0010
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0010(Receiver ID)
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0:   device [8086:2f04] error status/mask=00000080/00002000
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0:    [ 7] Bad DLLP              
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: id=0018
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0018(Requester ID)
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0:   device [8086:2f08] error status/mask=00004000/00000000
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0:    [14] Completion Timeout     (First)
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast error_detected message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast mmio_enabled message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast resume message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: AER: Device recovery successful

 

I have never seen the error on requester device 10 before, or one through that root bus. In these events both my VMs lock up (shown as paused in the GUI) and cannot be resumed. Unraid keeps working perfectly, but the VMs stop functioning. Things I have tried:

  • Replaced the GPU I have issues with (750 Ti)
  • Reflashed the BIOS multiple times
  • Ran a 24-hour Memtest on the latest version across all 16 cores with no faults
  • Changed the order of the PCIe slots on my motherboard
  • Removed the USB 3 controller passed to the VM that usually has trouble

 

To replicate this issue I launch a game on the VM with the 780, then load a movie on Amazon Prime in the VM with the 750 Ti. After around 10 minutes of video playback the VMs lock up and fail, every time; this is something I can repeat.

 

I have no idea whether this is related to the beta or to a hardware failure on the motherboard, but if I can get some help debugging what is wrong it would be appreciated, especially as it only seems to affect VMs (Dockers continue to run fine).

 

Edit

It seems there is an issue with the latest Nvidia driver and playing items such as Netflix and Amazon Prime on certain systems. This appears to be causing PCIe conflicts with other devices.

To test this I have replaced my 750 Ti with an older AMD 5750 and downgraded my 780 driver to version 362, which people are reporting as a fix. I never considered that a graphics issue on a guest would affect the host system so much (the 750 Ti over SeaBIOS is most likely the culprit). I am going to test this for a few days and see what happens; if all goes well with the AMD card I will put the 750 Ti back in with the older drivers and see if this fixes the issue.

 

Hey bigjme, since I hadn't seen any update about this, I was wondering whether downgrading the drivers fixed your issues? I'm having literally the same error myself. I only just now found it, though, as I hadn't ever really stress-tested my system until just the other day. I tried running two VMs (both in OVMF) with my 960s passed through to them. I wanted to see what she could do, so I opened up multiple games, movies and web videos, but instead found that I can reliably recreate the error you described in a similar manner. The VM locks up but shows as paused in Unraid and can't be resumed, just as you described. I'll check out the 362 drivers, but thought I would ask you first since you initiated it.

 

Also, similar to your other post (and a few others here), I also have the "samba lockup issue" when I transfer a larger number of files between the shares. Not knowing much of anything about how Samba works (only vaguely that it has to do with file sharing), I'll note that this only seems to happen with transfers from inside the Windows VMs to the Unraid shares (which I think is the function of Samba, if I'm not mistaken). However, if I SSH in and move stuff around the shares via mc, I've not had any issues.

 

EDIT: I found your post on the Nvidia forum, and the issue they are having is reported as still causing problems with the newest driver set, 365.19. Installing 362 now to see how she fares after that.

Link to comment

I have a few warnings in my logs that I just wanted to put out there; they may help with the beta 21 issues. Diagnostics attached. Cheers.

 

May 23 09:45:58 Core kernel: ACPI: Early table checksum verification disabled

May 23 09:45:58 Core kernel: spurious 8259A interrupt: IRQ7.

May 23 09:45:58 Core kernel: ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20150930/dswload-210)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20150930/psobject-227)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, (SSDT:xh_rvp08) while loading table (20150930/tbxfload-193)

May 23 09:45:58 Core kernel: ACPI Error: 1 table load failures, 8 successful (20150930/tbxfload-214)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)

May 23 09:45:58 Core kernel: ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)

May 23 09:45:58 Core kernel: acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM

May 23 09:45:58 Core kernel: floppy0: no floppy controllers found

May 23 09:45:58 Core kernel: ACPI Warning: SystemIO range 0x000000000000F040-0x000000000000F05F conflicts with OpRegion 0x000000000000F040-0x000000000000F04F (\_SB_.PCI0.SBUS.SMBI) (20150930/utaddress-254)

May 23 09:46:01 Core rpc.statd[1732]: Failed to read /var/lib/nfs/state: Success

May 23 09:46:37 Core avahi-daemon[8661]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!

 

ALL of the messages you have isolated are completely normal, completely typical, often found in almost all syslogs, and have been for many unRAID releases and kernel versions. Nothing to worry about. That "spurious 8259A interrupt: IRQ7" is how I identify a motherboard based on nVidia chipsets, and has been for a long time, whether it's identified as nVidia or nForce or not.

 

ACPI issues are very common. Most motherboard makers are primarily interested in cleaning up their BIOS for Windows, and once Windows is happy, cleaning anything else up for Linux or other OSes is generally a low priority. Linux has done an exceptional job of recognizing and working around these issues. The fact that you are seeing the messages means the kernel detected the issues and handled them as best it could, almost always making them harmless. You can try updating your BIOS, which sometimes removes a few of the messages. And sometimes a newer kernel will handle them better.

 

The other miscellaneous messages are equally common.

 

Here's another example of the kernel's ability to detect and work around the issues it finds -

Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0: AER: Corrected error received: id=0010
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0010(Receiver ID)
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0:   device [8086:2f04] error status/mask=00000080/00002000
Apr 27 04:11:06 Archangel kernel: pcieport 0000:00:02.0:    [ 7] Bad DLLP              
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: id=0018
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0018(Requester ID)
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0:   device [8086:2f08] error status/mask=00004000/00000000
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0:    [14] Completion Timeout     (First)
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast error_detected message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast mmio_enabled message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: broadcast resume message
Apr 27 21:06:31 Archangel kernel: pcieport 0000:00:03.0: AER: Device recovery successful

"AER: Corrected error received" shows it detected an issue, and fixed it.

"AER: Uncorrected (Non-Fatal) error received" shows it detected it, didn't fix it, but decided it wasn't important.

"AER: Device recovery successful" shows problems found, problems fixed (or worked around).

And these are just the ones it has decided to tell you about.

 

Our kernel is an amazing thing. It's probably significantly larger than it could have been, just to work around all the bugs and inconsistencies in so many motherboard BIOSes, device drivers, and hardware defects, and the ambiguities in most standards.
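If anyone wants to dig into which devices sit behind those root ports and whether AER is active on them, something along these lines from the unRAID console should do it (the 00:02.0 and 00:03.0 addresses come from the log above; adjust them for your own system):

    # show the PCIe topology, so you can see what hangs off each root port
    lspci -tv
    # dump a root port's capabilities, including Advanced Error Reporting
    lspci -vvv -s 00:03.0 | grep -B2 -A6 "Advanced Error Reporting"
    # watch the syslog for new AER / PCIe bus error messages as they happen
    tail -f /var/log/syslog | grep -i -E "aer|pcie bus error"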

Link to comment

Also, similar to your other post (and a few others here), I also have the "samba lockup issue" when I transfer a larger number of files between the shares. Not knowing much of anything about how Samba works (only vaguely that it has to do with file sharing), I'll note that this only seems to happen with transfers from inside the Windows VMs to the Unraid shares (which I think is the function of Samba, if I'm not mistaken). However, if I SSH in and move stuff around the shares via mc, I've not had any issues.

 

Have you tried the num_stripes fix (setting num_stripes to 8192)?

Link to comment
This topic is now closed to further replies.