QEMU/KVM + libvirt unable to set cpu cache size

Gameborn

Junior Member
May 17, 2017
3
0
1
I'm running Windows 10 in a VM configured via virt-manager, with GPU passthrough set up using this guide: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Plain_QEMU_without_libvirt and my host distro is Arch.
Everything works fine, and CPU and GPU performance in benchmarks is very good, but performance in games isn't. I suspect this is because the guest sees a virtualized 16 MB L3 cache, while my host CPU (AMD A10-7850K) has no L3 cache at all. I tried to set the cache manually as described in this guide: https://libvirt.org/formatdomain.html but libvirt says "Extra element cpu in interleave", even though I copied all the cpu code from the guide. My full config XML:
Code:
<domain type='kvm'>
  <name>win10</name>
  <uuid>2473ae33-76a0-4e43-8121-2dba6f29ecb7</uuid>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.9'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/ovmf/ovmf_code_x64.bin</loader>
    <nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough' check='partial'>
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/sbin/qemu-system-x86_64</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/home/ilya/Downloads/Win10_1703_English_x64.iso'/>
      <target dev='sdb' bus='sata'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/win10.qcow2'/>
      <target dev='sdc' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <interface type='direct'>
      <mac address='52:54:00:a5:b1:07'/>
      <source dev='enp2s0' mode='bridge'/>
      <model type='rtl8139'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
      <image compression='off'/>
    </graphics>
    <sound model='ich9'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x062a'/>
        <product id='0x3633'/>
      </source>
      <address type='usb' bus='0' port='4'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='3'/>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </memballoon>
  </devices>
</domain>

I'm running the latest versions of all packages from the Arch repositories.
My system:
AMD A10-7850K
MSI A88XM-E45
8 GB (4x2) DDR3 2133 MHz
120 GB SSD + 2 TB HDD
350 W PSU
 

LurchFrinky

Senior member
Nov 12, 2003
309
64
101
I'm not at my home computer, so this post is just from my memory, and I didn't do a lot of tweaking with my xml file, so I am no expert.
How are you sure your gaming performance is due to an L3 cache limitation? I would think that would imply that the game wouldn't perform on that processor even if run natively.
What is your graphics card? I am no expert, but I can't find it in your file. You also didn't list it in your system specs. You mention GPU passthrough, which implies two graphics cards, one each for the host and guest.
In virt-manager you should be able to select what type of processor you have so that it knows which features are present. I know it isn't perfect since I have to select Sandybridge for my Haswell processor, but perhaps you can play with that?
 

Gameborn

First, thanks for taking the time to write such a big reply.
I already found the problem: libvirt was too old, so I installed a newer version from the AUR and was able to set the CPU cache size.
But performance still isn't that good. My GPU for the VM is the integrated R7 graphics (I got lucky that it worked; I read that it usually doesn't) and a GT 210 for the host, WHICH IS IMPORTANT, but my iGPU often throttles from 720 MHz down to 350 MHz and I don't know how to fix it.
The game is League of Legends, and it runs perfectly on the highest settings natively. I also ran Cinebench R15 and got 88 points in the VM (very low, maybe comparable to a Core 2 Duo), while natively I get 242, and currently 301 with an overclock (I haven't tested the VM with the overclock). Also note that QEMU doesn't officially support my CPU architecture (Kaveri for the whole APU, Steamroller for the cores); it supports the previous generation and selects it by default (Opteron_G5). However, I use host-passthrough mode, so that shouldn't be a problem. And as far as I know there is a Haswell option; I have it in the list, along with Broadwell and Skylake.
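For reference, here is a rough sketch of what the cache setting mentioned above can look like in the domain XML. It assumes libvirt 3.3.0 or newer (the release in which the `<cache>` element was introduced) and reuses the `<cpu>` block from the config posted earlier. As far as I can tell, the element selects a mode (`emulate`, `passthrough` or `disable`) rather than a literal size, and which combinations are accepted depends on the libvirt and QEMU versions, so treat this as a starting point rather than a verified fix.

```xml
<!-- Sketch only: needs libvirt 3.3.0 or newer; older versions reject the
     element with a schema error such as "Extra element cpu in interleave". -->
<cpu mode='host-passthrough' check='partial'>
  <topology sockets='1' cores='4' threads='1'/>
  <!-- Assumption: 'passthrough' exposes the host's real cache information
       to the guest, so an L3-less host CPU should not show an emulated L3. -->
  <cache mode='passthrough'/>
</cpu>
```

The fragment can be applied with `virsh edit win10`, which validates the XML against the schema of the installed libvirt before saving.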
 

LurchFrinky

I'm glad you made some progress. I realized immediately after posting that it was a week after your first post and you had probably moved on.
It does suck about your performance, though. It looks like you are right on the edge of compatibility and you can't brute-force your way through it without a stronger system.
The only other thing I can think of is maybe locking your iGPU in the BIOS. I don't even know if it's possible. Maybe disabling sleep states would help?
Anyway, good luck.
 

Gameborn

Unfortunately there is no option like that in the BIOS, only for the CPU; the only iGPU options are device initialization order, memory size and clock speed. Also, my CPU architecture is not on the list for IOMMU (AMD-Vi), but older APUs on almost the same socket (FM2 rather than FM2+) are there, so I'm really happy that it worked at all. Maybe I'll also try Xen instead of KVM and see if it runs faster.
There is one more thing: AMD CPUs (at least FX and APUs like mine) are modular, so each module has two x86 cores but a shared FPU and L2 cache. To work properly, the system sees it as two cores and four threads (in the case of quad-cores), but I don't know what to use for the VM. If I use two cores and four threads, it shows a warning that AMD CPUs do not support hyperthreading, but the VM still starts. With emulated hyperthreading it may use two processing threads from different modules, which will be slow because communication between modules is obviously slower than within one module. And none of the virtualization technologies support that kind of CPU topology configuration.
Sorry this looks messy, I'm doing my best with English.
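On the module topology question above, one possible approach is to pin each vCPU to a fixed host CPU so that the two guest "threads" of a core always land on the same module. The fragments below are only a sketch: the host CPU numbering (0 and 1 in one module, 2 and 3 in the other) is an assumption that should be checked on the actual machine, and whether this improves performance on Kaveri is untested.

```xml
<!-- Sketch only. Assumption: host CPUs 0,1 share one module and 2,3 the
     other; verify with
     /sys/devices/system/cpu/cpu*/topology/thread_siblings_list. -->
<vcpu placement='static'>4</vcpu>
<cputune>
  <!-- vCPUs 0 and 1 form guest core 0, pinned to the first host module -->
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <!-- vCPUs 2 and 3 form guest core 1, pinned to the second host module -->
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
</cputune>
<cpu mode='host-passthrough' check='partial'>
  <!-- Two guest cores with two threads each, mirroring two modules -->
  <topology sockets='1' cores='2' threads='2'/>
</cpu>
```

These elements would replace the existing `<vcpu>` and `<cpu>` entries in the domain; `<cputune>` is a direct child of `<domain>`.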