Question Here's how dog slow proper x86 CPU emulation is


eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
I will simply say that defining "proper x86 emulation" as "cycle-accurate" is ridiculous. 99.99% of the uses of real-world x86 emulation/translation have absolutely no benefit from near-cycle-accurate emulation of random 90s microarchitectures and their peripheral ICs.

Obviously it's useful for retrocomputing folks, people wanting to run certain older games, etc - but that isn't where most interest in emulation of x86 is, and I don't think that it's inherently more "proper" than anything else.
I mostly agree. It isn’t as relevant for x86. Modern high level emulation mostly works fine. (if you ignore compatibility)

Some lower level emulation is needed to fix certain apps, but cycle accurate emulation is likely only needed for a very small number of apps such as demos and such that explicitly depend on it, which isn’t the case for anything I am aware of.

Is anyone interested in seeing some actual benchmarks of WoA emulating x86 vs native ARM for the same benchmark? I might be able to make that happen (looking into it)

I also can do some compatibility testing, but my device is slow and there will be no 3D games and such.

I can also do power benchmarks from the wall. No idea about software support for measuring power.
 

Nothingness

Platinum Member
Jul 3, 2013
2,407
736
136
I mostly agree. It isn’t as relevant for x86. Modern high level emulation mostly works fine. (if you ignore compatibility)

Some lower level emulation is needed to fix certain apps, but cycle accurate emulation is likely only needed for a very small number of apps such as demos and such that explicitly depend on it, which isn’t the case for anything I am aware of.

Is anyone interested in seeing some actual benchmarks of WoA emulating x86 vs native ARM for the same benchmark? I might be able to make that happen (looking into it)

I also can do some compatibility testing, but my device is slow and there will be no 3D games and such.

I can also do power benchmarks from the wall. No idea about software support for measuring power.
Yeah that's what we need: benchmarks on obsolete Arm machines.
 
Jul 27, 2020
16,208
10,261
106
Is anyone interested in seeing some actual benchmarks of WoA emulating x86 vs native ARM for the same benchmark? I might be able to make that happen (looking into it)

I also can do some compatibility testing, but my device is slow and there will be no 3D games and such.

I can also do power benchmarks from the wall. No idea about software support for measuring power.
Yes please.

Bonus points for benchmarks that quantify the overhead of emulation using the worst case scenarios.
 

Nothingness

Platinum Member
Jul 3, 2013
2,407
736
136
Yes please.

Bonus points for benchmarks that quantify the overhead of emulation using the worst case scenarios.
The guy has no access to a modern Arm platform. Whatever he'll measure will be meaningless, just a lot of air and wasted time for everyone.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
The device is a Raspberry Pi 5, so yes you can call it “modern” in the sense it was released this year. UEFI drivers were recently made available for it, so Windows now runs.

There is a phrase: “If you can’t say something nice, don’t say anything at all” that applies here.

Some of us don’t have an agenda, we just love technology. I personally love playing with hardware, running benchmarks, etc. That is why I asked.
 

Nothingness

Platinum Member
Jul 3, 2013
2,407
736
136
Cortex-A76 is 6 years old. It's more than 3 times slower than M3.

I can certainly understand the need to play with it and benchmark it despite that 😀 It's just that I think the results won't tell us a lot about upcoming Arm devices such as Qualcomm based laptops.
 

Shivansps

Diamond Member
Sep 11, 2013
3,851
1,518
136
The problem is that the A76 is currently the best thing available in SBC format. I think Allwinner will be the first with an SBC part using 2xA78/6xA55, but I would not touch that with a stick. Not to mention it's not going to be faster than a quad A76.
But the 3588 is pretty decent actually; I've been using it for almost a year at this point for both Linux and Windows. It is the first ARM SoC I have ever used that I could use as my daily driver if I wanted. Hell, the 3588 with its x4 PCIe 3.0 could actually run some GPUs for gaming if the PCIe implementation was not a complete mess and we actually had drivers. Not even N100 motherboards have that for some reason.

The good thing about the RPi 5 is that it overclocks like crazy; you can make it faster than an RK3588 once you overclock past 2.6 GHz. Other than that I prefer the 3588/3588S.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
I am looking forward to playing with the Pi 5 and WoA. Hopefully this week! I might order an NVMe attachment first, since Windows doesn’t run well on SD cards. I may just put it on a fast USB drive, however. I will post updates, screenshots, numbers, and so on, barring any issues.
 

Nothingness

Platinum Member
Jul 3, 2013
2,407
736
136
The problem is that the A76 is currently the best thing available in SBC format. I think Allwinner will be the first with an SBC part using 2xA78/6xA55, but I would not touch that with a stick. Not to mention it's not going to be faster than a quad A76.
But the 3588 is pretty decent actually; I've been using it for almost a year at this point for both Linux and Windows. It is the first ARM SoC I have ever used that I could use as my daily driver if I wanted. Hell, the 3588 with its x4 PCIe 3.0 could actually run some GPUs for gaming if the PCIe implementation was not a complete mess and we actually had drivers. Not even N100 motherboards have that for some reason.

The good thing about the RPi 5 is that it overclocks like crazy; you can make it faster than an RK3588 once you overclock past 2.6 GHz. Other than that I prefer the 3588/3588S.
Even an overclocked Pi 5 will be slow: https://browser.geekbench.com/v6/cpu/compare/5375634?baseline=4371296

I agree 8cx gen 3 devices are not in the same category. But, again, benchmarks on an SBC based on an old CPU won’t tell you a lot about what a modern device can achieve.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Even an overclocked Pi 5 will be slow: https://browser.geekbench.com/v6/cpu/compare/5375634?baseline=4371296

I agree 8cx gen 3 devices are not in the same category. But, again, benchmarks on an SBC based on an old CPU won’t tell you a lot about what a modern device can achieve.
You don't need to measure absolute performance, only the difference between emulated and non-emulated code. The only way (that comes to mind) the ratio would be drastically different is if the Snapdragon Elite has instructions the emulator can take advantage of, which, while possible, isn't likely. Microsoft isn't just building WoA for Qualcomm; they didn't even start it because of Qualcomm. Previous versions of the Pi ran Windows, and older ARM laptops could run Windows. Compatibility should be similar. At any rate, it will be a fun little exercise and we can get a glimpse into just how compatible that emulator is.
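
For anyone who wants to see what a relative measurement would look like in practice, here is a minimal sketch. It assumes the same benchmarks have been timed twice on one machine, once as a native build and once under emulation. The function name `emulation_slowdown` and all timings are placeholders made up for illustration, not results from a Pi 5 or any other device.

```python
from statistics import geometric_mean


def emulation_slowdown(native_times, emulated_times):
    """Return per-benchmark slowdown ratios and their geometric mean.

    Both arguments map a benchmark name to wall-clock seconds measured on
    the same machine. Only benchmarks present in both runs are compared.
    """
    common = sorted(native_times.keys() & emulated_times.keys())
    ratios = {name: emulated_times[name] / native_times[name] for name in common}
    return ratios, geometric_mean(list(ratios.values()))


# Placeholder timings in seconds, NOT measurements from real hardware.
native = {"compress": 10.0, "render": 20.0}
emulated = {"compress": 20.0, "render": 80.0}

ratios, overall = emulation_slowdown(native, emulated)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x slower under emulation")
print(f"geometric mean slowdown: {overall:.2f}x")
```

The geometric mean is used here because the inputs are ratios; an arithmetic mean would let one badly behaved benchmark dominate the summary.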
 

Nothingness

Platinum Member
Jul 3, 2013
2,407
736
136
You don't need to measure absolute performance, only the difference between emulated and non-emulated code. The only way (that comes to mind) the ratio would be drastically different is if the Snapdragon Elite has instructions the emulator can take advantage of, which, while possible, isn't likely. Microsoft isn't just building WoA for Qualcomm; they didn't even start it because of Qualcomm. Previous versions of the Pi ran Windows, and older ARM laptops could run Windows. Compatibility should be similar. At any rate, it will be a fun little exercise and we can get a glimpse into just how compatible that emulator is.
Sorry, but again that's misleading. Emulated code has a very specific profile that will quite often hammer (among other things) branch prediction more than most other code (exception: JS-like benchmarks). And recent Arm chips have made a lot of improvements in that area. So your emulated vs native speed ratio won't carry over to a recent chip.
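
A toy illustration of the branch prediction point, assuming the simplest possible emulator design (a plain interpreter): every guest instruction goes through a lookup whose target depends on guest data, so the host has to predict an indirect call per instruction. The guest program and handler names below are invented for the example. Production emulators generally translate blocks of code rather than interpret them, so treat this only as a sketch of why the profile differs from native code, not as a description of the Windows emulator.

```python
def run(program, x):
    """Interpret a toy guest program; return (result, dispatch_count)."""
    handlers = {
        "add": lambda acc, n: acc + n,
        "mul": lambda acc, n: acc * n,
        "neg": lambda acc, _: -acc,
    }
    dispatches = 0
    acc = x
    for opcode, operand in program:
        handler = handlers[opcode]   # target depends on the guest opcode
        acc = handler(acc, operand)  # indirect call the host CPU must predict
        dispatches += 1
    return acc, dispatches


def native(x):
    """The same computation written directly: no dispatch at all."""
    return -((x + 3) * 2)


program = [("add", 3), ("mul", 2), ("neg", 0)]
result, dispatches = run(program, 4)
print(result, native(4), dispatches)
```

Both paths compute the same value, but the interpreted path pays one data-dependent dispatch per guest instruction, which is the kind of work a stronger branch predictor helps with.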

OTOH I definitely agree compatibility won't be impacted by using an old chip.

To be clear: that's not a rebuttal of what you want to do! As you previously wrote, you have fun doing this so you should go forward, and I'm interested in what you'll find. And if that's possible (and I'm not too lazy), I will try to reproduce your work on an MBP M1.
 

SarahKerrigan

Senior member
Oct 12, 2014
361
515
136
I pointed him to DynamoRIO some weeks ago.

I had not heard of Wiggins/Redstone and it seems to have died with Alpha. Is there anything left of it beyond the HC presentation?

Nope. As far as I can tell, neither Dynamo nor Wiggins/Redstone made any real movements toward productization after about 2001, which is surprising - Itanium would have benefited a bit from runtime instrumentation and optimization (instead of painfully long PGO cycles.)

The information that's out there is pretty much what exists, AFAIK.
 

yottabit

Golden Member
Jun 5, 2008
1,364
229
116
It took until about 3GHz C2D or better to emulate even the measly SNES (especially auxiliary chips) in a cycle-accurate way.
You’re bringing up repressed memories of the first time I tried to run ZSNES on whatever low spec Pentium or Celeron I had at the time. I had sold my SNES in favor of a PlayStation and wanted to revisit some games, and thought, “surely if I can run MechWarrior 2 and other 3D games, I can emulate SNES!” only to be greeted by a total (glitchy) slideshow. Forget about the SuperFX chip; even games that used Mode 7 were a struggle.

I haven’t had the chance to try these new PC emulators yet, but I’m excited to.
 
Jul 27, 2020
16,208
10,261
106
You’re bringing up repressed memories of the first time I tried to run ZSNES on whatever low spec Pentium or Celeron I had at the time.
What year was that? I was totally floored to see my Celeron 700 MHz and Voodoo3 3000 AGP card running Mario 64 at a smooth framerate with UltraHLE. I tried other emulators but they were a slideshow. Connectix Virtual Gamestation and Bleem! were two other emulators that impressed me at the time. CVGS for its mostly accurate emulation (I think it cost them about $150K to develop that) and Bleem! for being able to enhance the original image. I think this was between years 1998 and 2002.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
Wanted to provide an update on the testing I had hoped to do. My oldest child was hospitalized (they will be fine), so I have been dealing with that. Maybe once that is over…

What year was that? I was totally floored to see my Celeron 700 MHz and Voodoo3 3000 AGP card running Mario 64 at a smooth framerate with UltraHLE. I tried other emulators but they were a slideshow. Connectix Virtual Gamestation and Bleem! were two other emulators that impressed me at the time. CVGS for its mostly accurate emulation (I think it cost them about $150K to develop that) and Bleem! for being able to enhance the original image. I think this was between years 1998 and 2002.
That is due to high-level emulation. There are (were? I haven’t checked if they are still around) N64 emulators that would absolutely not run on a machine that slow. UltraHLE was groundbreaking at the time because of that.

Current generation emulators use techniques similar to UltraHLE. That is why you can emulate a Switch, for example.

You can read about it here: https://en.m.wikipedia.org/wiki/UltraHLE
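
Here is a rough sketch of the general idea, on a made-up toy machine (the instruction set, the routine at address 100, and all names are hypothetical, and this is not UltraHLE's actual design). The low-level path steps through a guest routine one instruction at a time; the high-level path recognises the routine's address and runs a host-native replacement instead.

```python
# Hypothetical toy guest: the routine at address 100 sums r1 words
# starting at memory[r0] and leaves the result in acc.
SUM_ROUTINE = 100

GUEST_CODE = {
    100: ("set", "acc", 0),
    101: ("jz", "r1", 106),      # done when the counter hits zero
    102: ("addm", "acc", "r0"),  # acc += memory[r0]
    103: ("inc", "r0"),
    104: ("dec", "r1"),
    105: ("jmp", 101),
    106: ("ret",),
}


def interpret(pc, regs, memory):
    """Low-level path: execute guest instructions one by one, counting them."""
    executed = 0
    while True:
        op = GUEST_CODE[pc]
        executed += 1
        kind = op[0]
        if kind == "set":
            regs[op[1]] = op[2]
        elif kind == "jz":
            if regs[op[1]] == 0:
                pc = op[2]
                continue
        elif kind == "addm":
            regs[op[1]] += memory[regs[op[2]]]
        elif kind == "inc":
            regs[op[1]] += 1
        elif kind == "dec":
            regs[op[1]] -= 1
        elif kind == "jmp":
            pc = op[1]
            continue
        elif kind == "ret":
            return regs["acc"], executed
        pc += 1


def hle_sum(regs, memory):
    """High-level path: a host-native stand-in for the whole guest routine."""
    start, count = regs["r0"], regs["r1"]
    return sum(memory[start:start + count])


HLE_TABLE = {SUM_ROUTINE: hle_sum}


def call(address, regs, memory, use_hle):
    """Call a guest routine, intercepting it when a native replacement is known."""
    if use_hle and address in HLE_TABLE:
        return HLE_TABLE[address](regs, memory), 0
    return interpret(address, dict(regs), memory)


memory = [5, 7, 11, 13]
regs = {"r0": 0, "r1": 4, "acc": 0}
lle_result, lle_steps = call(SUM_ROUTINE, regs, memory, use_hle=False)
hle_result, hle_steps = call(SUM_ROUTINE, regs, memory, use_hle=True)
print(lle_result, lle_steps, hle_result, hle_steps)
```

The trade-off is the one discussed in this thread: the intercepted path is far cheaper, but it only works for routines the emulator recognises, and anything that depends on the exact low-level behaviour can break.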


I always love to nerd out over this stuff, so forgive me. 🤣
 

TheELF

Diamond Member
Dec 22, 2012
3,973
730
126
And I will add that if you don't care about accuracy as much then QEMU can do it much more quickly. It is why many PCs can play Nintendo Switch games at full speed.
QEMU powers xemu, which runs original Xbox games at full speed on pretty low-end CPUs.
That was a custom Pentium III in that sucker.
I think the lowest I went was Win98 with VirtualBox.
I installed a few CD-ROM based games on Win 3.1 on DOSBox.
Watch a few YouTube videos; it's pretty straightforward if you understand what you need.

The great thing about PCem is that it emulates a lot of sound and graphics cards which are needed to run these old games.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
QEMU powers xemu, which runs original Xbox games at full speed on pretty low-end CPUs.
That was a custom Pentium III in that sucker.

I installed a few CD-ROM based games on Win 3.1 on DOSBox.
Watch a few YouTube videos; it's pretty straightforward if you understand what you need.

The great thing about PCem is that it emulates a lot of sound and graphics cards which are needed to run these old games.

Thanks for the PCem mention. I did not have that one on my radar, and it ticks a lot of the boxes for features I need.