limetech

unRAID OS version 6.5.3 available


On my Ryzen 1700X with an Asus X370 motherboard, Windows VMs with GPU passthrough are not functional at all. VMs work great on the first boot after installation, but after a couple of restarts they start lagging and hanging, taking over 2 minutes just to load the desktop, and from there it is very difficult to do any task. Once the GPU is removed I can log in remotely without an issue and everything works fine, so I think it is a GPU passthrough issue on the AMD side.

 

The system memory is at default speed, and other VMs are working without an issue.

 

The VM has 8 cores + 8GB RAM + 1 GPU (GTX 1060 6GB) passed through; it is not the system GPU.

 

I eventually solved it by moving unRAID to my older i7 6700K on a Gigabyte Z170X board. The VMs are running smoothly after over 10 reboots, and so far in testing nothing has gone wrong. I think it's an AMD-related issue.
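For context, GPU passthrough on unRAID ultimately comes down to a libvirt &lt;hostdev&gt; stanza in the VM's XML that binds the card to vfio. A minimal sketch looks like this (the PCI address below is hypothetical; you would use the GPU's actual bus/slot from your own system):

```xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <!-- hypothetical PCI address; substitute the GPU's real bus/slot/function -->
    <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```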


@PSYCHOPATHiO out of curiosity, did you update to the newest BIOS for the Asus board? I've found that when unRAID is doing really weird and unexplainable stuff, it's generally related to the BIOS.

2 hours ago, ryoko227 said:

@PSYCHOPATHiO out of curiosity, did you update to the newest BIOS

Almost always on the day of release; I keep a bookmark folder for daily BIOS-update checks for all my boards.

Once there is an update I check the changes & start applying my settings, among them disabling C-states & enabling the VM & IOMMU settings. I should mention that my memory is not on the compatibility list, so I keep it at the stock 2133MHz. I used the machine under Windows for a week with the memory set to 3000MHz with no issues.

Every kernel panic causes the system to restart a parity check, and with 8TB drives that takes forever.


The rc1 post for this specifically said this:
"However, we want to make this change in isolation and release to the Community in order to get testing on a wider range of hardware, especially to find out if parity operations are negatively affected."

 

I'm not sure if 6.5.3 is the culprit, since my parity check only runs every two months, but I am now getting repeated parity sync errors. The first manual scan fixed 53 errors; the next, run not even a day later, fixed 123.

 

I have not backed out of 6.5.3 yet to see if that's the issue, but I wanted to know if there is a way to revert the pre-empt setting in 6.5.3 so it behaves like 6.5.2 again; then I could see if the parity sync errors go away. I have already run a 24-hour memtest with no issues, and SMART is good too.
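For what it's worth, the preemption model is chosen at kernel build time (it is a CONFIG_PREEMPT* option), so there is no runtime toggle to flip back; reverting it would mean running a kernel built the old way. On a kernel built with CONFIG_IKCONFIG_PROC you can at least confirm which model is active; the sketch below filters a sample config excerpt the same way you would filter the live one:

```shell
# On a live box:  zcat /proc/config.gz | grep '^CONFIG_PREEMPT'
# Filtering a sample config excerpt shows the lines to look for:
printf 'CONFIG_PREEMPT_NONE=y\nCONFIG_HZ=250\n# CONFIG_PREEMPT_VOLUNTARY is not set\n' \
  | grep '^CONFIG_PREEMPT'
# prints: CONFIG_PREEMPT_NONE=y
```

Note that `/proc/config.gz` only exists when the kernel was built with CONFIG_IKCONFIG_PROC; whether a given unRAID release enables it is an assumption here.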

 

I have four 4TB WD Reds and two 3TB WD Reds. The two 3TB drives are plugged into a SATA PCIe x1 card that was listed on the forums as compatible:
IO Crest 4 Port SATA III PCI-e 2.0 x1 Controller Card, Marvell, Non-RAID, with Low Profile Bracket (SI-PEX40064)

https://smile.amazon.com/gp/product/B00AZ9T3OU

 

Built the system in December 2017, and all parity checks were good until now.
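If it helps anyone narrow this down, you can check whether the errored sectors map to the drives on the add-in card. On the live server `lspci | grep -i -E 'sata|raid'` will list the controllers; the sketch below filters a sample line the same way (the 88SE92xx chip model shown is an assumption based on similar IO Crest cards):

```shell
# A captured lspci-style line from a similar Marvell-based card;
# on the real server you would pipe the actual lspci output instead.
sample='02:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 (rev 11)'
echo "$sample" | grep -o 'Marvell'
# prints: Marvell
```

Mapping disks to controllers with `ls -l /dev/disk/by-path/` then tells you which drives sit behind the card.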


The entire 6.5.3 line should have no impact on actual parity functionality, only possibly on parity-check speed. No one had reported any issues up until yours.

 

What happens if you run a third manual parity scan today?

2 hours ago, nickp85 said:

I'm not sure if 6.5.3 is the culprit [...] but I am now getting repeated parity sync errors. [...] I wanted to know if there is a way to revert the pre-empt setting in 6.5.3 so it behaves like 6.5.2 again.

 

Might I suggest that you open a new thread in the 'General Support' subforum, and be sure to include a diagnostics file covering the time period during which you had the parity sync errors. I can tell you that SATA cards with Marvell chipsets have had issues in the past. (I am not saying at this point that this is your problem, but there is precedent.)

9 minutes ago, Frank1940 said:

 

Might I suggest that you open up a new thread in the 'General Support' subforum and be sure to include a diagnostics file [...]. SATA cards with Marvell chip sets have had issues in the past.

 

Thanks, I already have a support thread open. I wanted to say something here because of the comments made during the RC1 release.


@ryoko227 After moving the system to another USB stick, with a single HDD for VMs only, I copied the same Windows 10 VM from the first machine over the network, and it's actually running without an issue on an HDD. The only change is that I pushed the RAM to 3000MHz, and it's been stable for almost 24 hours now.

On an SSD with a GTX 1060 it becomes laggy; on an HDD with a GT 710 it's going great. I'm confused lol

 

EDIT: also running the same number of cores (8)

 

Edited by PSYCHOPATHiO

Been running for 4 days with no ill effects. The parity check speed is the same for me; it completed in 9 hours for 32TB.


Did the update a few days ago, and everything is fine... Thanks a lot...


I upgraded an older/smaller disk in my server. That went fine, but it showed this:

 

unRAID Parity sync / Data rebuild: 18-06-2018 18:46
Notice [Tower] - Parity sync / Data rebuild finished (0 errors)
Duration: 4 hours, 48 minutes, 55 seconds. Average speed: 230.8 MB/s

 

Is it combining the parity and disk speeds? Because there is no way it hit 230.8 MB/s writing.


I just noticed the green background theme.

5 hours ago, 1812 said:

Is it combining the parity and disk speed?

It's an old bug: the average speed is calculated from the parity size, not from the size of the rebuilt disk.
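Quick arithmetic bears that out. Assuming a 4TB parity disk (an assumption; the post doesn't state the parity size) and the reported duration of 4h 48m 55s, dividing parity size by duration lands right at the displayed figure:

```shell
# duration: 4 h 48 m 55 s -> seconds
secs=$((4*3600 + 48*60 + 55))
# 4 TB = 4,000,000,000,000 bytes; integer MB/s (10^6 bytes per MB)
mbps=$((4000000000000 / secs / 1000000))
echo "$secs $mbps"
# prints: 17335 230
```

~230 MB/s matches the reported 230.8 MB/s average, whereas a 3TB rebuild over the same window would only come out to about 173 MB/s.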

5 hours ago, johnnie.black said:

It's an old bug, average speed is calculated based on parity size, not disk rebuilt size.

 

thanks!


Just updated from 6.5.2 to 6.5.3 a couple days ago. Initially I thought everything went smoothly, as I haven't noticed any behavioral/performance symptoms. However when I logged in this morning I noticed in the system log that I have a call trace in my syslog. (Diagnostics attached). Obviously I can't be 100% certain that this is related to 6.5.3 specifically, but I have never seen any traces at all in the past. (I've run pretty much every stable release on this hardware since the unRAID v5 days). Just seems coincidental that the call trace happened shortly after the software upgrade. It seems noteworthy that the trace is concerning netfilter/macvlan and that I do have somewhat of a "non-standard" networking configuration, in that I'm using VLAN tagging and a Mellanox ConnectX-2 10G NIC.

 

As I said, I haven't noticed any behavioral symptoms or anything like that, so I'm not too panicked about this. Just thought someone might like to take a look to see if any corner cases or incompatibilities of some recent change weren't covered during testing. Let me know if more information beyond the diagnostics file would be helpful. I'll monitor the logs to see if it happens again, and whether I can correlate it to a specific event.

 

Cheers,

 

-A

nas-diagnostics-20180624-0755.zip

27 minutes ago, Ambrotos said:

It seems noteworthy that the trace is concerning netfilter/macvlan and that I do have somewhat of a "non-standard" networking configuration, in that I'm using VLAN tagging and a Mellanox ConnectX-2 10G NIC.

This thread might help.


I'm seeing this as well now. With UEFI boot enabled, a VM with iGPU passthrough doesn't boot; it maxes out a CPU thread and completely fills the syslog with:

kernel: vfio-pci 0000:00:02.0: BAR 2: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref]

Disabling UEFI boot stops the problem and allows the VM and passthrough to return to working as expected.

 

Rich

 

diagnostics-20180624-1740.zip

Edited by Rich

9 hours ago, johnnie.black said:

This thread might help.

 

That definitely seems to be what I'm encountering. Thanks for the tip. I'll follow up in that thread.

 

Cheers,

 

-A


I'd put this update off for a while, but I did it today and there are no issues so far.

 

I had one VM stutter a few times after fresh server reboots, but as it's just the one VM, I'm assuming it's a Windows problem, since a reboot fixed it.

 

Other than that, all good so far 

8 hours ago, Rich said:

With UEFI boot enabled, a VM with iGPU passthrough doesn't boot, maxes out a CPU thread and fills the syslog with "vfio-pci 0000:00:02.0: BAR 2: can't reserve [...]". Disabling UEFI boot stops the problem.

 

Most likely what is happening is that the EFI framebuffer is being loaded into the area of memory that the GPU is also trying to use when unRAID is booted in UEFI mode. This thread explains the who, how, what, and why, and how to fix it if your issue is the same as mine was.
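For anyone hitting the same symptom, the workaround discussed in that thread amounts to keeping the EFI framebuffer off the GPU's memory region, commonly by appending `video=efifb:off` to the kernel line in `/boot/syslinux/syslinux.cfg`. A sketch of the relevant stanza (your label name and initrd line may differ):

```
label unRAID OS
  menu default
  kernel /bzimage
  append video=efifb:off initrd=/bzroot
```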




Copyright © 2005-2018 Lime Technology, Inc.
unRAID® is a registered trademark of Lime Technology, Inc.