HardOCP: DX11 vs DX12 AMD CPU Scaling and Gaming Framerate

csbin

Senior member
Feb 4, 2013
http://www.hardocp.com/article/2016/04/13/dx11_vs_dx12_amd_cpu_scaling_gaming_framerate#.VySmOORf2Uk






GPU Limited Performance Using GTX TITAN

[benchmark graphs]




CPU Limited Performance with AMD Radeon Fury X

[benchmark graphs]




Conclusions & Delusions



This testing has been somewhat harrowing for me over the last couple of weeks. I spent literally a week dealing with issues with the AotS benchmark. After I had put this article together, I felt it would benefit from some added GPUs, but when I went back to build on my benchmarks, the AotS benchmark started crashing. New driver loads, new OS images, new OS installs, and new game installs would NOT fix my issues. I changed all the hardware and still had issues. When I moved over to an Intel CPU based system, all my issues went away. I do not have an explanation for this; I am just explaining that I wanted to have more AMD CPU based data, but it ended up being impossible for me. I will have a follow-up article using a Haswell-E system with more GPUs as well.

I think some of the data we came away with today is quite exciting! What can be accomplished with DX12 is impressive. With an old NVIDIA GPU, things did not look too exciting for DX12, but if you are using an older multi-core CPU, there can surely be benefits even there.

When we look at today's high-end GPUs, like the AMD Radeon R9 Fury X, we saw the benefits of DX12 reach into situations where our AMD FX-8370 was at its top Turbo clock of 4.3GHz. That is tremendously impressive when comparing DX11 and DX12: you get a LOT more out of your current hardware. Our results were consistent, with big percentage increases.

Of course in those tests, according to the DX12 benchmark data, all of those frame rates are still 0% GPU limited, meaning that the CPU is still the bottleneck in our system. That is exactly why we used an AMD FX-8370 in this testing. If you want to see GPU limited testing, you can find that in our Day 1 Preview. I am already working on a follow-up showing results with more Intel CPU cores and what the results look like as we push CPU IPC and clocks up a bit higher.
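That CPU-vs-GPU bottleneck can be pictured with a toy model (a sketch with made-up numbers, not measurements from this article): a frame ships only as fast as the slower of the CPU submission work and the GPU render work.

```python
# Toy frame-pipeline model: the slower of the CPU and GPU stages sets the
# framerate. All numbers here are illustrative, not from the article.

def fps(cpu_ms_per_frame: float, gpu_ms_per_frame: float) -> float:
    """Frames per second, assuming the CPU and GPU stages overlap fully."""
    return 1000.0 / max(cpu_ms_per_frame, gpu_ms_per_frame)

# CPU-bound ("0% GPU bound"): a faster GPU changes nothing.
print(fps(cpu_ms_per_frame=25.0, gpu_ms_per_frame=15.0))  # 40.0
print(fps(cpu_ms_per_frame=25.0, gpu_ms_per_frame=10.0))  # 40.0

# Halving the CPU cost (the kind of cut DX12's submission path aims for)
# raises the framerate until the GPU becomes the limit.
print(fps(cpu_ms_per_frame=12.5, gpu_ms_per_frame=15.0))  # ~66.7
```

This is why CPU-limited results keep moving with CPU clocks while GPU-limited ones do not.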


The Bottom Line



Ashes of the Singularity is a game that should be the poster child for CPU multi-threading and DX12. There is no doubt in my mind that DX12 has the ability to create a better gaming experience in certain situations; most of those situations are going to be CPU limited, however. HardOCP has been preaching for more than a few years now that putting your money into a new GPU is likely the best buy you can make, after you have moved to an SSD of course. That said, more than a few of us are keeping our CPUs a little longer than we used to, and it seems that DX12 might very well be a contributing factor in keeping them a little longer...bad news for Intel and AMD. Worth keeping in mind is that we are just now starting to see real DX12 games, so there are a lot of unknowns at this point, but there is no doubt that DX12 has much better bones than DX11 ever did.
 
Aug 11, 2008
Lol, their gpu limited tests aren't gpu limited.

Graphs 1, 4, 5, and 6 are cpu limited. The rest are gpu limited. The test really shows very little except that nVidia still has work to do on dx 12. To truly evaluate "cpu scaling" they needed to test Intel cpus, different core counts, and hyperthreading.
 
Aug 11, 2008
Lol, their gpu limited tests aren't gpu limited.

Oops, double post.

Anyway, I guess the test does also show that DX12 is faster than DX11 in this game. But I just was never impressed with these "cpu scaling" tests where they take one cpu and downclock it. Captain Obvious: if you use cpu limited settings and downclock the cpu, you will get a lower framerate. If they were only going to show FX, they could at least have turned off some cores to test core scaling. This test alone does nothing to prove their statements about how well DX12 uses multithreading.
 
Aug 11, 2008

These tests make my head spin. No consistency at all between the settings of the two tests. Seems almost like they are deliberately avoiding testing FX and HW-E at the same settings, so no one can compare them.

The only semi-comparable CPU data I can find between AMD and Intel is:

Haswell 1080p Fury X crazy settings = 51.4 FPS
FX 1080p Fury X high settings = 40 FPS

So Haswell at crazy is faster than FX at high, but we don't know whether either or both settings are cpu or gpu limited.
 

lyssword

Diamond Member
Dec 15, 2005
Ashes of the Singularity actually has bad multi-core scaling; no diff between an i5 and an i7
 

itsmydamnation

Platinum Member
Feb 6, 2011
Ashes of the Singularity actually has bad multi-core scaling; no diff between an i5 and an i7

That doesn't mean it has bad scaling. If they have good memory management and a job-based system, there could just be no execution bubbles for HT to take advantage of.

Here is a solution guide describing how a piece of very expensive, high-end internet provider hardware has the exact same trait, and how they de-allocate the hyperthreads from the data/forwarding plane because of it.

https://support.f5.com/kb/en-us/solutions/public/15000/000/sol15003.html
 

lyssword

Diamond Member
Dec 15, 2005
That doesn't mean it has bad scaling. If they have good memory management and a job-based system, there could just be no execution bubbles for HT to take advantage of.

Here is a solution guide describing how a piece of very expensive, high-end internet provider hardware has the exact same trait, and how they de-allocate the hyperthreads from the data/forwarding plane because of it.

https://support.f5.com/kb/en-us/solutions/public/15000/000/sol15003.html

http://pclab.pl/art67995-11.html Something like a 2 FPS difference between the G3920 (dual core) and the i7-6700K with a 980 Ti (which can be explained by the higher clock speed/cache of the 6700K). I'd say pretty shitty scaling. I wouldn't be surprised if a single-core Skylake (1 disabled?) would be pretty close to that. Also zero difference between a 2-thread and an 8-thread cpu with a GTX 970.
 

thesmokingman

Platinum Member
May 6, 2010
These tests make my head spin. No consistency at all between settings of the two tests. Seems almost like they are deliberately avoiding testing FX and HW-E at the same settings so one can compare them.

Only semi-comparable data I can find about cpu between AMD and Intel is that:

Haswell 1080p FuryX crazy settings = 51.4
FX 1080p FuryX high settings = 40 FPS

So haswell at crazy is faster than FX at high, but we don't know if either or both settings are cpu or gpu limited.



It's typical H.
 

Erenhardt

Diamond Member
Dec 1, 2012
http://pclab.pl/art67995-11.html Something like 2fps difference between g3920 (dual core) and i7 6700k with 980ti (can be explained by higher clock speed/cache of 6700k).I'd say pretty shitty scaling. I wouldn't be surprised if a single core skylake (1 disabled?) would be pretty close to that. Also 0 difference between 2 thread and 8thread cpu with gtx 970.

You are looking at a beta benchmark run by pclol. Don't read much into it.
 

thesmokingman

Platinum Member
May 6, 2010
http://pclab.pl/art67995-11.html Something like 2fps difference between g3920 (dual core) and i7 6700k with 980ti (can be explained by higher clock speed/cache of 6700k).I'd say pretty shitty scaling. I wouldn't be surprised if a single core skylake (1 disabled?) would be pretty close to that. Also 0 difference between 2 thread and 8thread cpu with gtx 970.


Yea by pclol. You realize that is the shilliest of shill sites? Anyone posting links from there has an agenda.
 

TheELF

Diamond Member
Dec 22, 2012
http://pclab.pl/art67995-11.html Something like 2fps difference between g3920 (dual core) and i7 6700k with 980ti (can be explained by higher clock speed/cache of 6700k).I'd say pretty shitty scaling. I wouldn't be surprised if a single core skylake (1 disabled?) would be pretty close to that. Also 0 difference between 2 thread and 8thread cpu with gtx 970.

You can't tell if it has good scaling or not, since any current GPU is completely overburdened by this game, so even a weak CPU gets the same FPS as a much stronger one.
We will have to wait for the new gen of GPUs to see any meaningful results for scaling.
 

itsmydamnation

Platinum Member
Feb 6, 2011
http://pclab.pl/art67995-11.html Something like 2fps difference between g3920 (dual core) and i7 6700k with 980ti (can be explained by higher clock speed/cache of 6700k).I'd say pretty shitty scaling. I wouldn't be surprised if a single core skylake (1 disabled?) would be pretty close to that. Also 0 difference between 2 thread and 8thread cpu with gtx 970.

Remember the benchmark doesn't include any AI, make sure you understand the test that's being run.
 

airfathaaaaa

Senior member
Feb 12, 2016
You can't tell if it has good scaling or not, since any current GPU is completely overburdened by this game, so even a weak CPU gets the same FPS as a much stronger one.
We will have to wait for the new gen of GPUs to see any meaningful results for scaling.
So what you are saying (here and on the other thread)
Wait did you just make a case for staying in a single threaded world????

Because there currently is no GPU that would demand this.
Even the fastest GPU gets what, 50 FPS? A high Ghz current intel core can drive that in dx11.

I was quoting mahigan...

But what didn't you understand?
I7 has much faster single speed so it can run the dx11 driver thread much faster/fast enough,the celeron needs dx12 to do it.


is that you actually believe current CPUs aren't capable of any meaningful scaling? Do you actually believe this? That games under DX12 won't be multithreaded?
 

TheELF

Diamond Member
Dec 22, 2012
is that you actually believe current CPUs aren't capable of any meaningful scaling? Do you actually believe this? That games under DX12 won't be multithreaded?
The current GPUs are not fast enough to force a strong core to need dx12 to drive them, at least not in those dx12 games that only hammer the gpu with very low cpu usage.
Sure, on GCN you need dx12 because they just don't work quite as well with dx11.
On weak cores with strong gpus, dx12 works as promised on both vendors.
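That argument can be sketched with a toy submission-cost model (my own made-up numbers and an idealized parallel split, nothing measured in this thread): DX11 funnels draw-call submission through a single driver thread, so per-core speed decides whether the CPU keeps up, while DX12 allows command lists to be recorded on several cores.

```python
# Toy model of per-frame draw-call submission cost. Illustrative only.

def cpu_frame_ms(draw_calls: int, us_per_call: float, cores: int, api: str) -> float:
    """CPU milliseconds of submission work per frame.

    "dx11": one driver thread submits everything (per-core speed rules).
    "dx12": command lists recorded on all cores, idealized perfect split.
    """
    total_ms = draw_calls * us_per_call / 1000.0
    return total_ms if api == "dx11" else total_ms / cores

# Strong core (cheap per-call cost): DX11 already keeps up with ~60 FPS.
print(cpu_frame_ms(10_000, 1.5, cores=4, api="dx11"))  # 15.0 ms
# Weak core (expensive per-call cost): DX11 chokes...
print(cpu_frame_ms(10_000, 6.0, cores=4, api="dx11"))  # 60.0 ms
# ...and DX12's parallel recording brings it back down.
print(cpu_frame_ms(10_000, 6.0, cores=4, api="dx12"))  # 15.0 ms
```

Under this model a weak quad-core gains a lot from DX12 while a fast core gains little, which matches the "weak cores with strong gpus" observation.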
 

itsmydamnation

Platinum Member
Feb 6, 2011
http://pclab.pl/art67995-11.html Something like 2fps difference between g3920 (dual core) and i7 6700k with 980ti (can be explained by higher clock speed/cache of 6700k).I'd say pretty shitty scaling. I wouldn't be surprised if a single core skylake (1 disabled?) would be pretty close to that. Also 0 difference between 2 thread and 8thread cpu with gtx 970.
ok i did some actual testing.......

setup:
3770K @ 4.3ghz R290 @1000/1250
lowest possible settings in game, windowed 1264x985

1 core with HT
Average FPS: 28.1
Average CPU FPS: 28.1
GPU bound normal 0.0%
GPU bound medium 0.0%
GPU bound heavy 0.1%

2 core with HT
Average FPS: 57.0
Average CPU FPS: 57.4
GPU bound normal 5.4%
GPU bound medium 0.1%
GPU bound heavy 0.4%


3 core with HT
Average FPS: 69.5
Average CPU FPS: 73.1
GPU bound normal 12.2%
GPU bound medium 12.3%
GPU bound heavy 21.7%


4 core with HT
Average FPS: 74.4
Average CPU FPS: 85.8
GPU bound normal 16.7%
GPU bound medium 39.6%
GPU bound heavy 64.5%



game doesn't scale with cores my.............
Next up, I'll run with a bigger OC on the GPU. I expect CPU frame times to increase; it will be interesting to see just how big an effect (if at all) GPU frame time has on CPU frame time.


edit: tested @ 1100 and 1150MHz on the GPU; no difference to CPU FPS, but GPU bound on medium and heavy batches did go down almost linearly with GPU clock. Increasing the memory OC from 1866 to 2133 saw an increase from 85.4 to 89.2 with 4 cores. I'll have to go back and test lower core counts to see if it's an improvement in latency or if I'm memory-throughput bound, but given the low scaling (a 15% memory increase for a 5% perf increase) I'm guessing it's from improved latency.
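For what it's worth, the CPU FPS averages above plug into a quick scaling calculation (my own arithmetic on the posted numbers):

```python
# Core-scaling check on the CPU-side FPS posted above (3770K @ 4.3GHz, HT on).
cpu_fps = {1: 28.1, 2: 57.4, 3: 73.1, 4: 85.8}

base = cpu_fps[1]
for cores, avg in cpu_fps.items():
    speedup = avg / base
    print(f"{cores} cores: {speedup:.2f}x speedup, "
          f"{speedup / cores:.0%} per-core efficiency")
```

Roughly 2.04x at 2 cores and 3.05x at 4: close to linear until the run starts going GPU bound, as the rising "GPU bound heavy" percentages show.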
 

Dresdenboy

Golden Member
Jul 28, 2003
citavia.blog.de
He doesn't need proof for both claims. Once claim #1 has been proven, claim #2 is immediately assumed to be true for anyone continually using it thereafter/continuing to use it more than very, very occasionally.

It seems there is a need for a tech-site scoring system. This might also help to improve their quality.
 
Aug 11, 2008
He doesn't need proof for both claims. Once claim #1 has been proven, claim #2 is immediately assumed to be true for anyone continually using it thereafter/continuing to use it more than very, very occasionally.

Well, actually he didn't prove *either* claim. I am not defending that site; I don't really know its objectivity. But, at best, it seems like a rather inflammatory way to present one's viewpoint.
 

guskline

Diamond Member
Apr 17, 2006
A 2015 Fury X vs. a 2013 Titan, not a 2015 Titan X or even a comparably priced GTX 980 Ti.

Why this comparison???
 

LTC8K6

Lifer
Mar 10, 2004
ok i did some actual testing.......

setup:
3770K @ 4.3ghz R290 @1000/1250
lowest possible settings in game, windowed 1264x985

1 core with HT
Average FPS: 28.1
Average CPU FPS: 28.1
GPU bound normal 0.0%
GPU bound medium 0.0%
GPU bound heavy 0.1%

2 core with HT
Average FPS: 57.0
Average CPU FPS: 57.4
GPU bound normal 5.4%
GPU bound medium 0.1%
GPU bound heavy 0.4%


3 core with HT
Average FPS: 69.5
Average CPU FPS: 73.1
GPU bound normal 12.2%
GPU bound medium 12.3%
GPU bound heavy 21.7%


4 core with HT
Average FPS: 74.4
Average CPU FPS: 85.8
GPU bound normal 16.7%
GPU bound medium 39.6%
GPU bound heavy 64.5%



game doesn't scale with cores my.............
Next up, ill run with a bigger OC on the GPU, i expect CPU frame times to increase, will be interesting to so just how big ( if at all) an effect GPU frame time has on CPU frame time.


edit: tested @ 1100 and 1150mhz on GPU no difference to CPU FPS but GPU bound on medium and heavy batches did go down almost linear with GPU clock. Increasing memory OC from 1866 to 2133 saw an increase from 85.4 to 89.2 with 4 cores. I'll have to go back and test to see if its an improvement in latency or if im memory throughput bound by testing lower core counts, but given the low scaling ( 15% memory increase for a 5% perf increase) im guessing its from improved latency.

You needed to duplicate a dual core, though. 1 core with HT is not like a dual core. You needed to do 2 cores without HT, I think?