bigjme

[Resolved] Primary GPU passthrough


So, another topic on this but there are a few things I want to check with my systems as I think it may just be me

 

So I've followed Spaceinvader One's video on getting a vbios from TechPowerUp, modifying it, and using that

 

All went well, passed through the primary gpu from unraid to a vm, did a full windows 10 install and basic setup with the gpu passed through, perfect

 

So I installed the new nvidia drivers for my 750ti, and carried on tinkering. I then rebooted the vm so the video drivers could finish installing, and now it will no longer boot

 

I don't get error 43 like most others. Windows starts to load, showing the loading icon with the uefi splash screen and then the vm pauses

 

This is the vm log from boot up to shutdown

 

2018-05-02T20:54:30.216302Z qemu-system-x86_64: -device vfio-pci,host=04:00.0,id=hostdev0,bus=pci.4,addr=0x0,romfile=/mnt/cache/VMImages/GPURoms/msi-750ti.rom: Failed to mmap 0000:04:00.0 BAR 3. Performance may be slow
2018-05-02T20:57:26.776428Z qemu-system-x86_64: vfio_region_write(0000:04:00.0:region3+0x1088, 0x7ffe11,8) failed: Device or resource busy
KVM internal error. Suberror: 1
emulation failure
RAX=ffffab7e37c11000 RBX=ffffab7e37c11000 RCX=ffffab7e37c11000 RDX=0000000000000000
RSI=ffffac04d77445c0 RDI=ffffac04d6055000 RBP=ffffac04d7f12000 RSP=ffff80890cd8d8f8
R8 =0000000000001000 R9 =0101010101010101 R10=fffff80fc1dcc4ac R11=ffff80890cd8d6b0
R12=ffffac04d3a75910 R13=ffffac04d77441e0 R14=0000000000000000 R15=0000000000100000
RIP=fffff80fc204b038 RFL=00010216 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 00000000 00409300 DPL=0 DS [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =0053 0000000000000000 0000fc00 0040f300 DPL=3 DS [-WA]
GS =002b ffff9b8042cb9000 ffffffff 00c0f300 DPL=3 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffff9b8042cc8000 00000067 00008b00 DPL=0 TSS64-busy
GDT= ffff9b8042cc9fb0 00000057
IDT= ffff9b8042cc7000 00000fff
CR0=80050033 CR2=ffffd180f654a000 CR3=0000000268559000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=66 66 66 66 0f 1f 84 00 00 00 00 00 66 48 0f 6e c2 0f 16 c0 <0f> 11 01 4c 03 c1 48 83 c1 10 48 83 e1 f0 4c 2b c1 4d 8b c8 49 c1 e9 07 74 2f 0f 29 01 0f
2018-05-02T20:57:45.189655Z qemu-system-x86_64: terminating on signal 15 from pid 12367 (/usr/sbin/libvirtd)

 

Now I did have unraid booted in gui mode but I'm assuming that as it passed through for setup, this isn't the problem. 

 

So I'm a little stuck on exactly what would be causing this error, as the error says the device is in use, but this only occurs after installing the nvidia driver.
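
One way to see what on the host might still be holding part of the card's memory is to look for framebuffer entries in /proc/iomem. This is only a sketch: the labels in the pattern (efifb, vesafb, simplefb, BOOTFB) are assumptions about how the claim is named on a given kernel, and IOMEM_FILE is a hypothetical override so the function can be tried against a saved copy of the file.

```shell
#!/bin/bash
# Print any memory ranges that a host framebuffer driver has claimed.
# IOMEM_FILE is a hypothetical override for dry runs; the real source is /proc/iomem.
# The label list is an assumption and may not match every kernel.
framebuffer_claims() {
    grep -iE 'efifb|vesafb|simplefb|BOOTFB' "${IOMEM_FILE:-/proc/iomem}"
}
```

Run as root, framebuffer_claims prints lines with real address ranges (non-root users see zeroed addresses). If one of those ranges falls inside a BAR of the gpu, the host console is a likely suspect. No output only means none of those labels were found.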

 

After enough failed boots I am able to get into the windows recovery menus, which all function fine, so it seems to pause the second the drivers initialize

 

I'm on unraid 6.5.1 with the vbios passed through the gui. The vm is ovmf with Hyper-V off on Q35-2.11. I can post the full xml if it would help

 

Any ideas as I am a little stumped? 

 

Regards, 

Jamie 

 

 

----------- ANSWER --------

 

For easy reading, this was the answer needed. In the event that you get mmap errors when passing through the rom, or you install the nvidia drivers and get errors like the above, try running the following on the command line and then start the vm again

 

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

 

I added these to a user script that triggers on first array start up. I have now successfully removed a gpu I no longer need from my system and am able to reboot unraid and have the vm auto start on the primary gpu
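
For anyone copying this into a user script, here is a slightly more defensive sketch of the same three writes. It is not the exact script used in this thread: the function name release_primary_gpu and the SYSFS_ROOT override are made up for illustration, and the existence checks are there because not every system exposes all three paths.

```shell
#!/bin/bash
# Sketch: release the host console and EFI framebuffer so a vm can take the primary gpu.
# SYSFS_ROOT is a hypothetical override for dry runs against a scratch directory;
# it defaults to the real /sys.
release_primary_gpu() {
    local sysfs="${SYSFS_ROOT:-/sys}"
    local node

    # Unbind every virtual console that exposes a writable bind file.
    for node in "$sysfs"/class/vtconsole/vtcon*/bind; do
        if [ -w "$node" ]; then
            echo 0 > "$node"
        else
            echo "skipping $node (missing or not writable)" >&2
        fi
    done

    # Unbind the EFI framebuffer only if it is currently bound to the driver.
    local fb_driver="$sysfs/bus/platform/drivers/efi-framebuffer"
    if [ -e "$fb_driver/efi-framebuffer.0" ] && [ -w "$fb_driver/unbind" ]; then
        echo efi-framebuffer.0 > "$fb_driver/unbind"
    else
        echo "efi-framebuffer.0 not bound, nothing to unbind" >&2
    fi
}
```

Adding a line that calls release_primary_gpu at the end of the user script performs the writes. The messages on stderr make it easier to tell a missing path apart from a failed write.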

Edited by bigjme


You could try dumping your actual bios instead of a modified one? Just to see if it makes any difference? Think gridrunner shows it in one of his other videos



This video shows how





Ok so I tried the above on my main card, and on my other card which I currently pass to a vm. They are both 750ti's, the same make and model. Both had been passed through to a vm in a secondary slot at the time. The exported roms were a tiny 62KB both times

 

Booting the vm with that rom, I get no error saying the device is in use, but the vm has no video output at all

 

Having read through the export, I noticed my gpu was on an older bios than the one I fetched from TechPowerUp, so I went and fetched an older version, edited it to remove the jump, and booted the vm
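
A quick sanity check on any rom file before handing it to the vm is whether it starts with the standard PCI option ROM signature, the bytes 55 AA. This is a minimal sketch, not a full validation: rom_has_signature is a made-up helper name, and passing the check does not guarantee the rom will work for passthrough.

```shell
#!/bin/bash
# Return success if the file begins with the PCI option ROM signature 0x55 0xAA.
# rom_has_signature is a hypothetical helper name used only for this example.
rom_has_signature() {
    local sig
    # Read the first two bytes and render them as lowercase hex without spaces.
    sig=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

For example, using the rom path from the log earlier in the thread: rom_has_signature /mnt/cache/VMImages/GPURoms/msi-750ti.rom && echo "starts with 55 AA"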

 

I have video output and the windows startup recovery launched, so I restarted it to boot windows. Again like before, the windows loading screen comes up, and the second windows starts to initialise the nvidia drivers, I get the same error

 

2018-05-03T20:11:39.036200Z qemu-system-x86_64: vfio_region_write(0000:04:00.0:region3+0x1088, 0x7ffe11,8) failed: Device or resource busy
KVM internal error. Suberror: 1
emulation failure
RAX=ffffe3fca3011000 RBX=ffffe3fca3011000 RCX=ffffe3fca3011000 RDX=0000000000000000
RSI=ffff8f8b58f44830 RDI=ffff8f8b58fb1000 RBP=ffff8f8b58efc000 RSP=ffffa30c4d3868f8
R8 =0000000000001000 R9 =0101010101010101 R10=fffff80a6783c4ac R11=ffffa30c4d3866b0
R12=ffff8f8b56a72ab0 R13=ffff8f8b58f43010 R14=0000000000000000 R15=0000000000100000
RIP=fffff80a67abb038 RFL=00010216 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
CS =0010 0000000000000000 00000000 00209b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 00000000 00409300 DPL=0 DS [-WA]
DS =002b 0000000000000000 ffffffff 00c0f300 DPL=3 DS [-WA]
FS =0053 0000000000000000 00017c00 0040f300 DPL=3 DS [-WA]
GS =002b ffffd401e8712000 ffffffff 00c0f300 DPL=3 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 ffffd401e8721000 00000067 00008b00 DPL=0 TSS64-busy
GDT= ffffd401e8722fb0 00000057
IDT= ffffd401e8720000 00000fff
CR0=80050033 CR2=ffffe40646de7000 CR3=000000026416e000 CR4=001506f8
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=66 66 66 66 0f 1f 84 00 00 00 00 00 66 48 0f 6e c2 0f 16 c0 <0f> 11 01 4c 03 c1 48 83 c1 10 48 83 e1 f0 4c 2b c1 4d 8b c8 49 c1 e9 07 74 2f 0f 29 01 0f
2018-05-03T20:11:57.679635Z qemu-system-x86_64: terminating on signal 15 from pid 12367 (/usr/sbin/libvirtd)
2018-05-03 20:11:58.880+0000: shutting down, reason=destroyed

 

I've double checked that the device is in its own iommu group, and it is

I've also checked whether the gpu is bound to the vfio-pci driver, and it's not
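
For anyone wanting to repeat those two checks, both can be read from sysfs. The helper names and the SYSFS_ROOT override below are made up for illustration; 0000:04:00.0 is the address that appears in the log above.

```shell
#!/bin/bash
# Print the kernel driver currently bound to a PCI device, or "none".
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
pci_driver() {
    local link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink whose last path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# List every device that shares the IOMMU group of the given PCI device.
iommu_group_members() {
    ls "${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/iommu_group/devices"
}
```

pci_driver 0000:04:00.0 prints the driver name, and iommu_group_members 0000:04:00.0 lists everything in the same group as the card.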

 

My next thought is that it's because I'm booting unraid into gui mode and that's using something perhaps?

 

--Edit

 

Ok so i just did a fresh reboot with unraid in console mode and still, the exact same behaviour

 

-- Edit 2

 

Ok so i found this post elsewhere on the forum

 

So it says to run these 3 lines

 

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
 

I've run them and the vm has started up and the rom errors have vanished

 

I'm going to run some tests and have added it into my user scripts to run on array start up to see if that is fine on a fresh restart

 

Edited by bigjme


I get similar mmap errors to what you describe, and have the same issues on driver install.

When I run the following in the webterminal:

echo 0 > /sys/class/vtconsole/vtcon1/bind
I get 

-bash: echo: command not found

When I run:

sudo echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
  I get:

/sys/bus/platform/drivers/efi-framebuffer/unbind: Permission denied


Any thoughts?

 

I made a post here: 

but have not gotten any feedback thus far.

Edited by paperblankets


It may be something unsupported on the web terminal. Try it from an actual ssh connection, echo should always be available

 

For mine I ssh'd in as the root user (same details as the gui), you may find the web terminal user is different from root

 

Regards, 

Jamie



Thanks for the info Jamie!

I indeed get a different result when ssh'ed as root. I am still running into issues though.

Last login: Thu Aug  9 19:06:02 2018 from 192.168.1.125
Linux 4.14.49-unRAID.
root@HuananTheHydra:~# echo 0 > /sys/class/vtconsole/vtcon0/bind
-bash: /sys/class/vtconsole/vtcon0/bind: No such file or directory
root@HuananTheHydra:~# echo 0 > /sys/class/vtconsole/vtcon1/bind
-bash: /sys/class/vtconsole/vtcon1/bind: No such file or directory
root@HuananTheHydra:~# echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
-bash: /sys/bus/platform/drivers/efi-framebuffer/unbind: No such file or directory

 

My vtcon* directories do not seem to contain a bind file.

root@HuananTheHydra:~# /sys/class/vtconsole/vtcon0/
power/     subsystem/
root@HuananTheHydra:~# /sys/class/vtconsole/vtcon1/
power/     subsystem/
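
Tab completion in command position appears to offer only directories and executables, so it can hide plain files such as bind. Here is a sketch that lists each console with its name and bind state instead. list_vtconsoles and SYSFS_ROOT are hypothetical names, and the name and bind files are assumed to sit where the commands above expect them.

```shell
#!/bin/bash
# Print one line per virtual console: its directory name, driver name and bind state.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
list_vtconsoles() {
    local dir name bound
    for dir in "${SYSFS_ROOT:-/sys}"/class/vtconsole/vtcon*; do
        # Skip the literal pattern when no console directory matches.
        [ -d "$dir" ] || continue
        name=$(cat "$dir/name" 2>/dev/null || echo unknown)
        bound=$(cat "$dir/bind" 2>/dev/null || echo "no bind file")
        echo "$(basename "$dir"): $name, bind=$bound"
    done
}
```

A console reported with "no bind file" cannot be unbound with the echo command, which would match the "No such file or directory" errors above.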

 


Hmm, I'm not entirely sure then. I followed a fix someone else found, so I don't really know how they worked out what to do

 

Jamie


I am not sure if I needed to restart the server or have the array stopped, but one of those let me run two of the three commands:

 

echo 0 > /sys/class/vtconsole/vtcon0/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

That resulted in a boot without mmap errors, which allowed me to install the drivers.

 

@bigjme I added a user script to fix mmap. It sounds like this fix does not persist across a restart for you, right?

Thank you for the debugging steps and suggestions!

Edited by paperblankets


