X10SAT - SATA PCI Passthrough

LeoLinux

Junior Member
May 24, 2014
Hi

I have read a lot of zir_blazer's posts about the X10SAT. I want to thank you at this point, because that information was quite helpful to read before I bought the board ;)

Anyway. I've been working with Xen for quite some time now. For a couple of days I have been trying to get the SATA device passed through. Unfortunately I could not get it working as expected. I might be missing something due to my lack of knowledge of PCI passthrough, addressing, groups, etc. So out of frustration I decided to consult this forum in the hope of getting some useful information again ;)

Code:
root@XenServer-03 [~]$ uname -a
Linux XenServer-03.Mydomain.Local 2.6.32.43-0.4.1.xs1.8.0.853.170791xen #1 SMP Mon Mar 3 06:36:39 EST 2014 i686 i686 i386 GNU/Linux

root@XenServer-03 [~]$ lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller [8086:0c08] (rev 06)
00:02.0 VGA compatible controller [0300]: Intel Corporation Xeon E3-1200  v3 Processor Integrated Graphics Controller [8086:041a] (rev 06)
00:03.0 Audio device [0403]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller [8086:0c0c] (rev 06)
00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI [8086:8c31] (rev 04)
00:16.0 Communication controller [0780]: Intel Corporation 8 Series/C220  Series Chipset Family MEI Controller #1 [8086:8c3a] (rev 04)
00:16.3 Serial controller [0700]: Intel Corporation 8 Series/C220 Series Chipset Family KT Controller [8086:8c3d] (rev 04)
00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection I217-LM [8086:153a] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 [8086:8c2d] (rev 04)
00:1b.0 Audio device [0403]: Intel Corporation 8 Series/C220 Series  Chipset High Definition Audio Controller [8086:8c20] (rev 04)
00:1c.0 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 [8086:8c10] (rev d4)
00:1c.1 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #2 [8086:8c12] (rev d4)
00:1c.3 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #4 [8086:8c16] (rev d4)
00:1c.4 PCI bridge [0604]: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #5 [8086:8c18] (rev d4)
00:1d.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 [8086:8c26] (rev 04)
00:1f.0 ISA bridge [0601]: Intel Corporation C226 Series Chipset Family Server Advanced SKU LPC Controller [8086:8c56] (rev 04)
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series  Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller [8086:8c22] (rev 04)
00:1f.6 Signal processing controller [1180]: Intel Corporation 8 Series  Chipset Family Thermal Management Controller [8086:8c24] (rev 04)
01:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 01)
02:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
03:01.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
03:04.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
03:05.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
03:07.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
03:09.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch [10b5:8606] (rev ba)
07:00.0 USB controller [0c03]: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller [1912:0014] (rev 03)
08:00.0 PCI bridge [0604]: Texas Instruments XIO2213A/B/XIO2221 PCI Express to PCI Bridge [Cheetah Express] [104c:823e] (rev 01)
09:00.0 FireWire (IEEE 1394) [0c00]: Texas Instruments  XIO2213A/B/XIO2221 IEEE-1394b OHCI Controller [Cheetah Express]  [104c:823f] (rev 01)
0a:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
My aim is to pass through at least the following device:
Code:
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 04)
What I have done so far:
Code:
vi /boot/extlinux.conf
# In the boot entry, add xen-pciback.hide=(00:1f.2) to the append line:
#   append /boot/xen.gz [...] splash xen-pciback.hide=(00:1f.2) --- /boot/initrd-2.6-xen.img
extlinux -i /boot
shutdown -r now
But when I run lspci after the reboot, "00:1f.2" is still listed. I thought it should be hidden?! Is this normal behaviour? I ignored it and continued with the following commands:

Code:
xe vm-list    # get the UUID of the VM
UUID="88e0869b-4de6-5f6a-d438-fde555d40015"
xe vm-shutdown uuid="${UUID}"
xe vm-param-set other-config:pci=0/0000:00:1f.2 uuid="${UUID}"
xe vm-start uuid="${UUID}"
The VM started and I was already about to open a beer, but then a kernel panic occurred during the boot of the VM. Even after several more attempts, the VM always panics during boot. So I removed the PCI device with:
Code:
xe vm-param-remove param-name=other-config param-key=pci uuid="${UUID}"
So this is how far I have got. Here and there I read about PCI address groups etc., so I thought that maybe I cannot pass through just this single SATA device, but must pass through an entire group?! Can someone confirm this behaviour? If so, how do I find out what is in a group and what isn't? I've read about a script called 'lsgroup.sh', but I couldn't find it.

Any hints are gratefully appreciated.
Thanks ;)

Kind regards
 

zir_blazer

Golden Member
Jun 6, 2013
Looks like I am a sort of Internet celebrity. My VT-d Thread reached nearly 60K views and caught the eye of tons of people who wanted to use Xen with PCI/VGA Passthrough. Who knows how much market share Supermicro gained thanks to me. I'm blushing.


I can't really solve your problem, but can give you some advice. You have two SATA Controllers, the Chipset one, and the ASMedia. In your list, they are these:

00:1f.2 SATA controller [0106]: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] [8086:8c02] (rev 04)
01:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 01)

You pass through the ENTIRE controller. This means all 6 physical SATA ports of the Intel one, or the 2 of the ASMedia. You can't pass through an individual port in this fashion. I think I read somewhere about Hard Disk passthrough or virtualization (not sure if it was on Xen, but some other Hypervisor had something like that in the works), but I don't know the details.
You didn't say how many HDs you have or where they are connected, but chances are that you are trying to pass the controller your main HD is connected to into the VM, and then the Hypervisor (Dom0) complains that it suddenly doesn't see the HD any longer.
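
To see which whole controllers are candidates, the SATA controllers can be picked out of "lspci -nn" output by their PCI class code, which appears as [0106] in the listing posted above. This is only a minimal sketch; the function name list_sata_controllers is made up for illustration, and it simply filters whatever text it is given on stdin:

```shell
#!/bin/sh
# Print the bus address (first column) of every device whose class code
# is [0106] (SATA controller) in "lspci -nn" style input read from stdin.
# list_sata_controllers is a hypothetical helper name, not an existing tool.
list_sata_controllers() {
    awk '/\[0106\]/ { print $1 }'
}

# Usage on a live system:
#   lspci -nn | list_sata_controllers
```

Each address printed stands for one complete controller, so every port behind it moves to the VM together.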

Don't forget to check the BIOS to make sure that the ASMedia controller is enabled if you're going to use it, and that both controllers are set to IDE or AHCI mode according to your needs. I think the ASMedia controller was fully functional and you could even boot from drives attached to it. However, the ASMedia controller doesn't have UEFI Firmware support if I recall correctly, so if you disable support for Legacy Boot ROMs in the BIOS, it will not work (possibly it does from inside an OS after the Drivers are loaded).

If you have two HDs, you may have more success using the ASMedia controller for your main HD with Dom0 and Xen, and passing the Intel controller with the HD you want in the VM. This is because of Drivers: the ASMedia may need specific Driver support slipstreamed into your Windows installation ISOs, while the Intel controller is more likely to be supported out of the box.
If you have only one HD and you want to do passthrough, you may need a USB Pendrive with a working Xen installation; maybe you can do passthrough like that. Otherwise, if you want top performance running a single HD, read this Thread, or better, this Thread on the Xen Users Mailing List; the best responses were there. You can supposedly get near-native HD performance using LVM, as you pass it to Xen as "raw storage" for the VM. I haven't tried it yet (I'm too lazy to decide what to do about some bad blocks on my HD, and still pondering whether to RMA it or zero fill and keep using it), so I can't tell you results.


Also, I do VGA Passthrough. In the Syslinux bootloader config, the Sound Card and the discrete Video Card (a Radeon 5770) are supposed to be hidden, but they appear in the lspci output (even with the VM running and using these devices). So does the ASMedia controller, which is disabled in the BIOS because I have only one HD, attached to the Intel controller. It doesn't seem logical for any of these to appear there, but they do, so that may be standard behavior. Finally, what you are doing looks to me like hotplug (I can't really tell because I don't recognize any of those commands); I have a line in the VM config file so that the devices are added to the VM when it is created.

Relevant fragment of syslinux.cfg
LABEL xen
MENU LABEL Xen
KERNEL mboot.c32
APPEND ../xen-4.3.1.gz --- ../vmlinuz-linux console=tty0 root=/dev/sda5 rw xen-pciback.hide=(00:1b.0)(01:00.0)(01:00.1) earlyprintk=vga,keep --- ../initramfs-linux.img
00:1b.0 is the integrated Realtek Sound Card, 01:00.0 the Radeon 5770, and 01:00.1 the HDMI support.

Relevant fragment of lspci output
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller (rev 05)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Juniper XT [Radeon HD 5770]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Juniper HDMI Audio [Radeon HD 5700 Series]
02:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)

Relevant line from the VM CFG file
pci = ['00:1b.0','01:00.0','01:00.1']
 

LeoLinux

Thanks for your reply. Yes, I guess you made it up to the stars if you would like to call it that way ;)

  • The ASM controller with its 2x ports should stay with the XenServer dom0. It is meant for the XenServer root HDD as well as for the storage to place the VMs on.
  • The Intel controller with its 6x ports should be passed through entirely to the FreeBSD VM. I know I could just pass the raw device, but it would still be emulated by QEMU and S.M.A.R.T. wouldn't work directly from the VM. It would just be an ugly workaround with lots of additional configuration for my VM and its monitoring etc. I would rather call it the worst case scenario ;)
I've been doing some further investigation, and it turned out that "xen-pciback.hide" doesn't seem to work at all. Devices which should be hidden according to "xen-pciback.hide" still show up in lspci on dom0. I even checked with "fdisk -l" on dom0, and the disks connected to the Intel SATA controller were still showing up. The disks only stop showing up in "fdisk -l" on dom0 once I pass the controller through like this:
Code:
xe vm-param-set other-config:pci=0/0000:00:1f.2 uuid="${UUID}"
and started the vm with
Code:
xe vm-start uuid="${UUID}"
The PCI device seemed to disappear from dom0 into the VM immediately. I tested again with "fdisk -l" and, voilà, the drives had disappeared from dom0.

At this point my virtual FreeBSD was booting. Funny as it sounds, it turns out that my PCI passthrough ONLY works if the VM has fewer than 8 vCPUs attached. If 8 vCPUs are configured for the VM, the booting FreeBSD guest ends in a kernel panic.

Code:
[...]
Netvsc initializing... SMP: AP CPU #4 Launched!
SMP: AP CPU #6 Launched!
panic: can't schedule timer
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff808e7dd0 at kdb_backtrace+0x60
#1 0xffffffff808af8b5 at panic+0x155
#2 0xffffffff807a14dd at xentimer_et_start+0xed
#3 0xffffffff80d66d6d at loadtimer+0xfd
#4 0xffffffff80d657fd at handleevents+0x2dd
#5 0xffffffff80d65fc8 at timercb+0x308
#6 0xffffffff807a152d at xentimer_intr+0x4d
#7 0xffffffff80883e5b at intr_event_handle+0x9b
#8 0xffffffff80d8d1c8 at intr_execute_handlers+0x48
#9 0xffffffff80d96909 at xen_intr_handle_upcall+0x159
#10 0xffffffff80c760ac at Xxen_intr_upcall+0x8c
#11 0xffffffff80861238 at mi_startup+0x118
#12 0xffffffff802d3e0c at btext+0x2c
Uptime: 1m15s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
Rebooting...
SMP: AP CPU #7 Launched!
Also, with fewer than 8 vCPUs configured for the VM, the FreeBSD VM always throws the following errors:

Code:
[...]
SMP: AP CPU #1 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #6 Launched!
g_dev_taste: make_dev_p() failed (gp->name=ada0, error=17)
ugen0.2: <QEMU 0.10.2> at usbus0
g_dev_taste: make_dev_p() failed (gp->name=ada0p1, error=17)
Trying to mount root from ufs:/dev/ada0p2 [rw]...
[...]
But the system boots up and works properly according to my tests. So I still have to do some research on whether I can safely ignore those "g_dev_taste" errors or not.

Btw.: I use an "Intel(R) Xeon(R) CPU E3-1245 v3 @ 3.40GHz" on the SuperMicro X10SAT mainboard. The CPU has 4 cores and also offers Hyper-Threading, so all in all 8 logical cores are presented to the operating system, in this case XenServer.

  • Does anyone have a clue why I can only use fewer than 8 vCPUs when PCI passthrough of the Intel C226 SATA controller is activated?
  • Also, what is it with those lspci address groups I've read about a couple of times in connection with PCI passthrough? If I understood those threads correctly, then in some cases I have to pass through an entire PCI group instead of just a single PCI device?! This is not really clear to me yet.


EDIT: My friend Google helped me find something useful about the "g_dev_taste" error:
http://forums.freenas.org/index.php?threads/multiple-ada0-partitions-on-xen-pv-w-passthrough.16574/
I guess you've got naming conflict between normal disk names of passed through AHCI controller and fake disk with the same name created by XEN PV drivers (for "compatibility" reasons). FreeBSD GEOM does not deny having two providers with the same name, that is why you see both of them in `gpart show`, but that is definitely considered wrong practice. As a workaround you can make CAM subsystem to not use ada0 device name. Try to set via loader prompt or via GUI tunables interface something like: hint.ada.0.at="scbus100".
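
To make the hint from that quote survive reboots, it could be written to the guest's loader configuration instead of being typed at the loader prompt each time. The following is a minimal sketch, assuming a plain FreeBSD guest that reads loader tunables from /boot/loader.conf; the function name add_ada_hint and the optional path argument are made up for illustration, and I have not verified the hint itself beyond what the quote says:

```shell
#!/bin/sh
# Append the hint from the quoted post to a loader.conf style file, once.
# $1: path of the file (assumption: /boot/loader.conf on a stock FreeBSD guest)
# add_ada_hint is a hypothetical helper name.
add_ada_hint() {
    conf=${1:-/boot/loader.conf}
    line='hint.ada.0.at="scbus100"'
    # Make sure the file exists so grep does not fail on a missing file.
    touch "$conf" || return 1
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Usage inside the FreeBSD guest, as root:
#   add_ada_hint
```

Running it twice leaves a single copy of the line in the file; the guest has to be rebooted before the hint can have any effect.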


EDIT: Also, in case someone is interested: https://forums.freebsd.org/viewtopic.php?f=3&t=46582
 

zir_blazer

  • The ASM controller with its 2x ports should stay with the XenServer dom0. It is meant for the XenServer root HDD as well as for the storage to place the VMs on.
  • The Intel controller with its 6x ports should be passed through entirely to the FreeBSD VM. I know I could just pass the raw device, but it would still be emulated by QEMU and S.M.A.R.T. wouldn't work directly from the VM. It would just be an ugly workaround with lots of additional configuration for my VM and its monitoring etc. I would rather call it the worst case scenario ;)
How many Hard Disks do you have? You didn't say. If you want SMART data directly from the VM, then passing the controller makes sense. In case it is useful, as a Plan B check this article on the Arch Linux wiki about smartmontools and the automatic E-Mail daemon for SMART warnings.


I've been doing some further investigation, and it turned out that "xen-pciback.hide" doesn't seem to work at all. Devices which should be hidden according to "xen-pciback.hide" still show up in lspci on dom0
It seems to work like that; read the last part of my previous post (it was edited in and you probably missed it). Even when the VM is actively using the Video Card, so it is fully functional, it still appears in lspci on Dom0.
Because you use XenServer I can't really help you with most things, as I use Xen 4.3.1 with Arch Linux as Dom0, and some configuration details may be very different. On standalone Xen you have the xl toolstack, and when you use xen-pciback.hide in the bootloader, the devices automatically appear in the list you get from the xl pci-assignable-list command, which shows the devices that are available and ready to be attached to a VM. If I don't use pciback.hide in the bootloader, I have to run xl pci-assignable-add manually for each device address. Basically, chances are it is working. Check whether XenServer has a command that lists the devices available for attaching, and try with and without that line in the bootloader config file.
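
One toolstack-independent way to check whether the hide option took effect is to look at which kernel driver the device is bound to in sysfs, since a hidden device stays visible in lspci. The sketch below assumes that the Xen backend driver registers under the name "pciback" and that dom0 exposes the usual /sys/bus/pci layout; the function name pci_driver_of and its second argument are made up for illustration:

```shell
#!/bin/sh
# Print the name of the driver a PCI device is bound to, or "none".
# $1: full PCI address including the domain, e.g. 0000:00:1f.2
# $2: sysfs root (hypothetical parameter, defaults to /sys)
# pci_driver_of is a hypothetical helper name.
pci_driver_of() {
    bdf=$1
    sysfs=${2:-/sys}
    link="$sysfs/bus/pci/devices/$bdf/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink to the driver's own directory.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage in dom0:
#   pci_driver_of 0000:00:1f.2
```

If this prints the regular storage driver (for example ahci) rather than pciback, dom0 has most likely claimed the controller before pciback could.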


  • Does anyone have a clue why I can only use fewer than 8 vCPUs when PCI passthrough of the Intel C226 SATA controller is activated?
  • Also, what is it with those lspci address groups I've read about a couple of times in connection with PCI passthrough? If I understood those threads correctly, then in some cases I have to pass through an entire PCI group instead of just a single PCI device?! This is not really clear to me yet.
A year or so ago there was a limit of 3 GB or so on the RAM that could be assigned to a VM using VGA Passthrough, due to a bug that has since been fixed; maybe your vCPU issue is bug-related too. I don't know how far XenServer trails Xen, in case it's a Xen bug that needs to be fixed.
The lspci "groups" may be related to the tree (lspci -t). I suppose that for some PCI Passthrough to function, you may have to give the VM both the device and the bridge it depends on, or something along those lines. In your case I don't think you need it, because I recall people passing only the controller with no issues.
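
Another possibility is that those threads meant IOMMU groups, which Linux exposes under /sys/kernel/iommu_groups on hosts that use VFIO (typically KVM setups). That is an assumption on my part, and a Xen dom0 like the one in this thread may well not have that directory at all. A minimal sketch that prints the group of each device where the directory does exist; the function name list_iommu_groups and its argument are made up for illustration:

```shell
#!/bin/sh
# Print one "group N: ADDRESS" line per device found in the IOMMU group tree.
# $1: group directory (hypothetical parameter, defaults to /sys/kernel/iommu_groups)
# list_iommu_groups is a hypothetical helper name.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # If nothing matched, the pattern stays literal and does not exist.
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}   # strip the root prefix
        group=${group%%/*}      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}

# Usage:
#   list_iommu_groups
```

On a system without the directory the function prints nothing, which would itself suggest that group membership is not what is blocking the passthrough there.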
 

LeoLinux

How many Hard Disks do you have? You didn't say.
Sorry, I forgot. It's 28x HDDs. One of them will be for the OS root, and another SSD will act as ZFS cache / swap.

If you want SMART data directly from the VM, then passing the controller makes sense. In case it is useful, as a Plan B check this article on the Arch Linux wiki about smartmontools and the automatic E-Mail daemon for SMART warnings.
Thanks for the tip. My FreeBSD storage server setup is already trained for this awesome feature ;)

The lspci "groups" may be related to the tree (lspci -t). I suppose that for some PCI Passthrough to function, you may have to give the VM both the device and the bridge it depends on, or something along those lines. In your case I don't think you need it, because I recall people passing only the controller with no issues.
Thanks for this helpful info.

A year or so ago there was a limit of 3 GB or so on the RAM that could be assigned to a VM using VGA Passthrough, due to a bug that has since been fixed; maybe your vCPU issue is bug-related too. I don't know how far XenServer trails Xen, in case it's a Xen bug that needs to be fixed.
It turned out that this is not the issue. The problem is FreeBSD-related. FreeBSD does not launch the processor cores quickly enough after a PCI device has been passed through, so the timer retry limit is hit and the kernel panics.

Here are some links for people who suffer from the same issue:
http://freebsd.1045724.n5.nabble.com/Fr ... 86694.html
http://lists.freebsd.org/pipermail/free ... 02034.html
There seems to be a patch related to this issue, but I haven't tested it yet: http://xenbits.xen.org/people/royger/00 ... imer.patch
Code:
From 8ea40470d15de47aa3bd6004fc5783f94535e00d Mon Sep 17 00:00:00 2001
From: Roger Pau Monne <roger.pau@citrix.com>
Date: Mon, 17 Feb 2014 16:08:58 +0100
Subject: [PATCH] xen: debug Xen PV timer

---
 sys/dev/xen/timer/timer.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/sys/dev/xen/timer/timer.c b/sys/dev/xen/timer/timer.c
index 354085b..a31343b 100644
--- a/sys/dev/xen/timer/timer.c
+++ b/sys/dev/xen/timer/timer.c
@@ -76,6 +76,8 @@ static devclass_t xentimer_devclass;
 
 #define   XENTIMER_QUALITY   950
 
+#define NUM_RETRIES   60
+
 struct xentimer_pcpu_data {
    uint64_t timer;
    uint64_t last_processed;
@@ -413,8 +415,10 @@ xentimer_et_start(struct eventtimer *et,
     *     equipped to deal with start failures.
     */
    do {
-      if (++i == 60)
-         panic("can't schedule timer");
+      if (++i == NUM_RETRIES) {
+         panic("can't schedule timer on vCPU#%d, interval: %" PRIu64 "ns",
+             cpu, first_in_ns);
+      }
       next_time = xen_fetch_vcpu_time() + first_in_ns;
       error = xentimer_vcpu_start_timer(cpu, next_time);
    } while (error == -ETIME);
-- 
1.7.7.5 (Apple Git-26)
I also found that even 6 or 7 vCPUs may sometimes lead to this timeout. It's a matter of luck how quickly the vCPUs get initialized inside the VM by the guest OS, FreeBSD in this case. Yet it ONLY happens when a PCI device has been passed through. Otherwise, on some hosts I've attached more than 12 vCPUs to FreeBSD machines without any problems.

One workaround may be to increase "#define NUM_RETRIES 60" in "sys/dev/xen/timer/timer.c" to a higher number, so the vCPUs / cores have enough time to be initialized. But again, I've only read about it; I haven't tested it myself yet.
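
For anyone who wants to try that, here is a minimal sketch of changing the constant with sed before rebuilding the kernel. It assumes the patch above has already been applied, so that the NUM_RETRIES define exists in the file; the function name bump_num_retries and the value 600 in the usage line are made up for illustration, and the workaround itself remains as untested as described above:

```shell
#!/bin/sh
# Replace the numeric value of "#define NUM_RETRIES <n>" in a source file.
# $1: path to timer.c (with the patch above applied)
# $2: new retry count (digits only)
# bump_num_retries is a hypothetical helper name.
bump_num_retries() {
    src=$1
    new=$2
    # Refuse anything that is not a plain non-negative integer.
    case $new in
        ''|*[!0-9]*) return 1 ;;
    esac
    tmp="$src.tmp.$$"
    # Keep the original spacing of the define and swap only the number.
    sed -E "s/^(#define[[:space:]]+NUM_RETRIES[[:space:]]+)[0-9]+/\1${new}/" \
        "$src" > "$tmp" && mv "$tmp" "$src"
}

# Usage from the top of the FreeBSD source tree:
#   bump_num_retries sys/dev/xen/timer/timer.c 600
```

A temporary file is used instead of sed's in-place option because that option takes different arguments on FreeBSD and on GNU systems.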

Kind Regards