
[PCPER] 3DMark API Overhead Feature Test

Given that the differences are fairly small, it's more of a study of what DX12 is capable of doing. The end result is that draw call limitations are going to be of little issue when DX12 is used. It's not likely they'll be pushing those limits anytime soon; the GPUs still have to be capable of generating frames from all those draw calls.
 
Both synthetics so far (this & Starswarm) are specific to the case studies these benches aim to look at; they're NOT indicative of games, because we don't know what features game engines will push or focus on.

There's no way someone is going to make games with many millions of draw calls per second. Think about that a bit.

What you CAN extrapolate is that CPU usage will be lower across the board under DX12, which means less total system power for equivalent workloads.

PS: It is foolish to use synthetics to beat your hated vendor with, just as it was when Starswarm was showcased.

Starswarm is pretty close to being a real game. It was a tech showcase for the game they're actually making.
http://www.ashesofthesingularity.com/
 
Would a game like Eve use enough draw calls to bring out the difference between AMD and nVidia in draw call performance? There have been some huge battles in that game.

NOTE: This is not debating the like/dislike of Eve. Just using it as an example to ask the question.
 
Nope. Far from it. Eve is limited by the server rather than by draw calls. It has very aggressive LOD that culls small details on objects & ships beyond close range. As such you won't even see individual turrets or glows/particles from distant ships, just a blip basically.
 
Not to mention that to handle those large battles, the servers have to slow down time immensely; it takes hours to make a move.
 
Starswarm is far from being a game. It pretty much shows an ugly pile of blurred crap just to reach that number of draw calls.

If I had to guess, I don't think draw call usage will be more than 2-3x that of DX11 over the next couple of years, simply because there isn't enough GPU power to justify it without it becoming quantity over quality.
 
One simple example of a massive increase in draw calls: individual particles with full physics & lighting simulation via DX12 compute. Lots and lots of them. Plenty of draw calls to ensure developers can achieve their vision without the CPU being the bottleneck.

Have a look at the recent Star Citizen demo; they are doing crazy stuff with their damage simulation using physics & compute. Mantle/DX12 open up an entire new world of creativity to game devs imo.
 
The GPU also needs to be able to keep up, else you just exchange one bottleneck for another.
 
That's why there's Asynchronous Compute, the key part in DX12 that's going to make such creativity possible without crippling rendering performance.


Also, a lot of games now don't even fully utilize all the GPU's resources. When was the last time you saw a game max out GPU utilization?
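A toy scheduling model (the numbers are purely illustrative, not measurements) of why async compute can soak up that idle GPU time: when compute work runs on its own queue instead of serializing with graphics, the frame ideally costs only the longer of the two workloads rather than their sum.

```python
# Illustrative per-frame workloads in milliseconds (made-up numbers).
graphics_ms = 12.0  # rendering work per frame
compute_ms = 5.0    # e.g. particle physics done via DX12 compute

# Serial submission: the compute work blocks the graphics queue,
# so the frame pays for both workloads back to back.
serial_frame_ms = graphics_ms + compute_ms     # 17 ms per frame

# Async compute (ideal case): compute overlaps with graphics on a
# separate queue, filling otherwise-idle GPU units, so the frame
# only costs the longer of the two workloads.
async_frame_ms = max(graphics_ms, compute_ms)  # 12 ms per frame

fps_serial = 1000 / serial_frame_ms
fps_async = 1000 / async_frame_ms
```

In practice the overlap is never perfect, but the model shows why a GPU that isn't fully utilized has headroom for this kind of extra work.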
 
Unless the CPU and GPU are both hitting 100% utilization, there will always be one bottleneck or another. You are almost never going to have a situation where no piece of hardware is the bottleneck, even if it only limits the other device slightly.
 
I did a couple tests of my own, but I have other projects running right now, so I will resume later.

For now, I did a comparison between my 970, 570, and 7950, as well as the 2500K vs the Q9550 (only DX11 and Mantle for now, sorry).

Here is the 970 vs 7950.

Nvidia is almost four times faster in DirectX, but Mantle is faster than anything.

The GTX 570 vs 7950 comparison also produces similar differences, but less pronounced.


Also interesting, but expected, is the difference between the 2500K and Q9550 scores when both are running the GTX 970. The 2500K has double the draw call performance.

I also made these two vids for now, for anyone who wants to see how these specific systems run the test.

3dmark Api Overhead feature test GTX 970 @1.5Ghz Q9550 @4GHz

3dmark Api Overhead feature test 7950 @1.1Ghz CORE i7-860 @4GHz
 
Finally with 2 cores many of our configurations are CPU limited. The baseline changes a bit – DX11MT ceases to be effective since 1 core must be reserved for the display driver – and the fastest cards have lost quite a bit of performance here. None the less, the AMD cards can still hit 10M+ draw calls per second with just 2 cores, and the GTX 980/680 are close behind at 9.4M draw calls per second. Which is again a minimum 6.7x increase in draw call throughput versus DirectX 11, showing that even on relatively low performance CPUs the draw call gains from DirectX 12 are substantial.
Can you please explain how that can be? I thought the main advantage of the new APIs was spreading the workload across all CPU cores (instead of one in DX11). If so, shouldn't the performance double in 2-core mode? Why is there a 6.7x increase in draw calls instead of 2x?

I know Mantle and DX12 also allow addressing the GPU more directly, with less CPU involvement. But this test is about draw calls submitted from the CPU to the GPU. How can we boost the number of draw calls apart from using additional CPU cores?
 
Lower overhead per call.

DX11 vs DX12.

[Image: DX11 vs DX12 CPU draw call comparison chart]
 
The benefit isn't just better multithreading but also lower overhead.

If each draw call in DX12 takes less CPU time to submit than it does in DX11, then you get more draw calls per second even on a single-core CPU.
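To put illustrative numbers on "lower overhead per call" (the per-call costs below are made up, not measured), a toy throughput model shows why the gain can far exceed the core-count ratio:

```python
def draw_calls_per_second(cores, seconds_per_call):
    # Toy model: each core submits draw calls serially, so throughput
    # scales with core count and inversely with per-call CPU cost.
    return cores / seconds_per_call

# Hypothetical per-call CPU costs (illustrative only):
DX11_COST = 1 / 1_000_000   # ~1M calls/sec on one core under DX11
DX12_COST = 1 / 5_000_000   # ~5M calls/sec on one core under DX12

# DX11 submission is effectively single-threaded; DX12 uses both cores.
dx11_calls = draw_calls_per_second(1, DX11_COST)
dx12_calls = draw_calls_per_second(2, DX12_COST)

speedup = dx12_calls / dx11_calls  # 10x: 2x from cores * 5x from cheaper calls
```

So a 6.7x jump on a 2-core CPU just means the per-call cost dropped by roughly 3.4x on top of the roughly 2x from threading.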
 
Now you all know why we call the 6xx, 7xx and 9xx nV series Good Old Fermi ;-)
And why we have 6- or 8-core chips from AMD ;-) just below $180 😀
I have to see the new NFS ;-) and Star Wars BF! in Mantle + DX12.3
 
Thanks, it's getting clearer. Do you know what causes the lower overhead (less time to execute)? The use of low-level commands?

The application is in full control, and this includes hazard control. The DX11 driver currently does a lot of checks to make sure nothing goes wrong; however, modern engines can do those checks as well, and they know when checks are not needed (the driver doesn't always have the full picture of what the application does, but the application knows). So rather than the driver handling hazards, resource allocation, and multi-GPU, in DX12 the application does that. However, that does mean the developer has to spend more work making sure nothing goes wrong. There is no more hand-holding from the driver.

This is how I understand it from what I read. I am sure someone else can give a better and more detailed explanation.
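That mental model can be sketched in Python (class names and checks are hypothetical, purely to contrast the two approaches): a DX11-style driver conservatively re-validates state on every call, while a DX12-style application validates once at state-change time because it knows nothing will invalidate the state between draws.

```python
class Dx11StyleDriver:
    """Driver-side hazard control: every draw pays for validation."""

    def draw(self, state):
        self.validate(state)  # conservative check on EVERY call
        return "drawn"

    def validate(self, state):
        assert state["pipeline_bound"] and state["resources_resident"]


class Dx12StyleApp:
    """App-side hazard control: validate once when state changes."""

    def __init__(self):
        self.state_validated = False

    def set_state(self, state):
        # The app validates at state-change time, because it knows
        # nothing else will touch the state between draws.
        assert state["pipeline_bound"] and state["resources_resident"]
        self.state_validated = True

    def draw(self):
        # No per-call checks: hazard-freedom was guaranteed above.
        assert self.state_validated
        return "drawn"


state = {"pipeline_bound": True, "resources_resident": True}

driver = Dx11StyleDriver()
results_dx11 = [driver.draw(state) for _ in range(1000)]  # 1000 validations

app = Dx12StyleApp()
app.set_state(state)                                      # 1 validation
results_dx12 = [app.draw() for _ in range(1000)]
```

Same 1000 draws either way; the DX12-style path just does 1 validation instead of 1000, which is the "less time to execute per call" part.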
 
Noctifer616, thanks a lot, that really does make sense. I wonder what prevented MS and Khronos from developing such APIs for the PC before, as they did for game consoles.
 
This is also why we got both DX11.3 and DX12. DX12 is basically for those with the $$$$ to make sure it all works.

DX10 reduced the overhead as well, though not by the same amount.

[Image: draw call overhead comparison across DirectX versions]
 
Because it's incredibly risky/difficult. Only a limited number of people can write low-level code like that correctly and not shoot themselves in the foot in some manner. You basically need to be a guru-level programmer to handle D3D12 properly.
 