D3D12 is Coming! AMD Presentation


destrekor

Lifer
Nov 18, 2005
28,799
359
126
You cannot have more than one thread per core (no SMT etc.) per cycle.
What modern superscalar processors are able to do is split each thread into multiple sub-portions (instructions) and decode, execute, and retire them. But you always have a single thread per core per cycle.

Um, no. Not for Intel's HT implementation, at least. It is true simultaneous multithreading: two whole threads can run in parallel on the same core.

What you may be thinking of is temporal multithreading, which interleaves threads into a serial stream and thus only truly allows one thread per clock.

I'm not sure why you say you cannot have SMT when there is in fact SMT available to utilize in the class-leading CPUs.
Explain.


edit:

Now obviously, SMT != pure 2x performance, as in a doubling versus the baseline number of cores. That is not the intent of multithreading, but it does not explain away the concept of SMT. Not every situation will let every thread reach peak performance on an SMT processor, because the threads share the resources allocated to each individual core.

But in almost every instance, proper utilization of the CPU means that an 8-thread, 4-core CPU can outcompete an 8-thread, 8-core CPU if the former's cores achieve sufficiently higher IPC. It is very application dependent. Many applications will perform better on that 4c/8t Intel CPU than on a 6c/6t Intel CPU, even when the latter has greater IPC. In numerous cases, however, that is reversed.
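The cores-vs-IPC trade-off above can be sketched as a toy throughput model. All numbers here (the SMT yield, the IPC advantage) are illustrative assumptions, not benchmarks:

```python
# Toy model (made-up numbers) comparing a 4c/8t high-IPC CPU with a
# 6c/6t lower-IPC CPU on a workload that scales across threads.
# SMT_YIELD models how much extra throughput a second thread per core
# tends to add (far less than 2x, since threads share core resources).

SMT_YIELD = 1.25   # assumption: SMT adds ~25% throughput per core

def throughput(cores, per_core_perf, smt=False):
    """Aggregate throughput in arbitrary units."""
    return cores * per_core_perf * (SMT_YIELD if smt else 1.0)

quad_smt = throughput(cores=4, per_core_perf=1.3, smt=True)   # 4c/8t, assumed 30% higher IPC
hexa     = throughput(cores=6, per_core_perf=1.0, smt=False)  # 6c/6t, baseline IPC

print(f"4c/8t: {quad_smt:.1f}  6c/6t: {hexa:.1f}")
# With these made-up numbers the 4c/8t part edges out the 6c/6t one
# (6.5 vs 6.0), but shrink its IPC advantage and the result flips.
```

Which side wins is entirely down to the assumed per-core advantage and how well the workload feeds both threads of each core, which is exactly the "very application dependent" point.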
 
Last edited:

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Stop trying to read this as some sort of technical information; it's not. It's just a marketing slide to show 6 threads being used. It doesn't matter what he puts in those boxes, the result is the same. That isn't technical information either, but it is at least a somewhat better representation.

Marketing what? This presentation isn't really meant for gaming consumers, but for developers. He's talking about DirectX 12, which is something AMD supports of course but will be equally usable by Nvidia. It's a very basic illustration, not highly technical of course, but it communicates the correct concept.

Holy balls, Batman, I didn't realize they were so expensive on the CPU side.

Ever tried turning shadows all the way up in Rome 2/Attila Total War? Doesn't matter what graphics card you have or how many CPU cores you have, it will cause PCs with mid-level clock speeds to chug.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Um, no. Not for Intel's HT implementation,

I specifically left SMT out to simplify things. Without any kind of SMT or any other technology like that, you only have a single thread per core per cycle, simple as that.

Anyway, I was not talking about the console CPU cores; I should have explained that from the beginning. My bad.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
Marketing what? This presentation isn't really meant for gaming consumers, but for developers. He's talking about DirectX 12, which is something AMD supports of course but will be equally usable by Nvidia.
Dude!
http://forums.anandtech.com/showpost.php?p=31520674&postcount=28
Nvidia has used this technology, multithreaded rendering, for so long that people don't even remember that they do.

AMD is, like in so many cases, marketing themselves as the inventor of the wheel.

They're putting a spin on having had the worst drivers possible for so long by trying to convince people that they single-handedly created Mantle and forced the industry to move to DX12.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Dude!
http://forums.anandtech.com/showpost.php?p=31520674&postcount=28
Nvidia has used this technology, multithreaded rendering, for so long that people don't even remember that they do.

AMD is, like in so many cases, marketing themselves as the inventor of the wheel.

They're putting a spin on having had the worst drivers possible for so long by trying to convince people that they single-handedly created Mantle and forced the industry to move to DX12.

Deferred contexts in Direct3D 11, which not even every D3D11 game used, are not the same thing as a low-level API like Mantle, Vulkan, or D3D12. Similar ideas, maybe, but not the same, and not as effective. The Star Swarm demo actually enables deferred contexts in D3D11 when running on an Nvidia card; it's still not as good as running with AMD and Mantle. Richard Huddy of AMD said in an interview that they had explored using deferred contexts, but ultimately decided that low-level APIs were the better option to invest resources in.

Also, some of what the AMD developer is talking about here has to do with asynchronous compute, a feature completely new to D3D12. Mantle was intended to support it as well, but Mantle development was wound down in favor of Vulkan and D3D12 before it got that far. Asynchronous compute is unrelated to deferred contexts, and in fact Fermi, Kepler, and first-generation Maxwell cards do not support it, to my knowledge.
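The deferred-context model being argued about here can be sketched roughly like this. This is a Python analogy, not real D3D code; the D3D11 names in the comments are the real API entry points, but everything else (worker count, command strings) is made up for illustration:

```python
# Rough sketch of the D3D11 deferred-context model: several threads
# record command lists in parallel, but a single immediate context
# must still submit them to the GPU one after another. Recording
# parallelizes; submission stays serial.

import threading
from queue import Queue

NUM_WORKERS = 4  # assumed worker-thread count

def record_commands(worker_id, out):
    """Stand-in for recording on a deferred context and calling
    ID3D11DeviceContext::FinishCommandList."""
    commands = [f"draw(worker={worker_id}, call={i})" for i in range(3)]
    out.put((worker_id, commands))

recorded = Queue()
threads = [threading.Thread(target=record_commands, args=(i, recorded))
           for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The "immediate context": one thread drains everything serially,
# which is why deferred contexts alone don't remove the
# single-threaded submission bottleneck that low-level APIs attack.
submitted = []
while not recorded.empty():
    _, commands = recorded.get()
    submitted.extend(commands)

print(f"{len(submitted)} draw calls submitted on one thread")
```

D3D12-style command queues change the second half of this picture, which is one reason the two approaches are similar ideas but not equally effective.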
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
It's multithreaded rendering no matter how you look at it, and that's what AtenRa has been crying about for the last few pages.
He was talking about that point specifically, and not about Mantle in general, because he thought it would promote his multi-core concept.

Sure, the low-level APIs are the better option, but that did not stop Nvidia from releasing proper drivers four years ago that at least used what was available at the time, instead of saying "let's wait four years for the next API" before making hasty decisions...
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
It's multithreaded rendering no matter how you look at it, and that's what AtenRa has been crying about for the last few pages.
He was talking about that point specifically, and not about Mantle in general, because he thought it would promote his multi-core concept.

Sure, the low-level APIs are the better option, but that did not stop Nvidia from releasing proper drivers four years ago that at least used what was available at the time, instead of saying "let's wait four years for the next API" before making hasty decisions...

Don't confuse multi-threaded with multi-core. FWIU all "rendering" is done on one core with DX11. Johan Andersson said that the biggest problem with DX11 is that it's "painfully serial". The reason multi-threading DX11 gives such minimal improvement (which is AMD's complaint; it's just not worth it) is because of these issues.

You can go back to the first release of Kepler and fast-forward to today, and nVidia hasn't gained any performance advantage vs. AMD. If anything, they've lost performance. This is even true with Maxwell. The 5% clock increase of the 390X over the 290X isn't the reason it now trades blows with the 980, a card that was definitely faster (10%-15%) when it was released. nVidia's multi-threaded rendering isn't really making a difference.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
Don't confuse multi-threaded with multi-core. FWIU all "rendering" is done on one core with DX11. Johan Andersson said that the biggest problem with DX11 is that it's "painfully serial". The reason multi-threading DX11 gives such minimal improvement (which is AMD's complaint; it's just not worth it) is because of these issues.
Oh, is that why Nvidia released a driver update last year that provided similar gains to what AMD managed with Mantle?
http://www.maximumpc.com/nvidia-takes-mantle-enhanced-dx11-driver-2014/
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
It's multithreaded rendering no matter how you look at it and that's what aten-ra is crying about the last few pages.
He was talking about that point specifically and not about mantle in general because he thought that that would promote his multicore concept.

No, I was specifically talking about DX12, multi-core CPUs, and how much more work you can feed the GPU simultaneously.
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
No, I was specifically talking about DX12, multi-core CPUs, and how much more work you can feed the GPU simultaneously.
Yeah, but you still haven't shown us how many cycles a draw call needs before it is sent to the GPU.
Or how this will overcome the much slower execution of the game code itself.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Yeah, but you still haven't shown us how many cycles a draw call needs before it is sent to the GPU.
Or how this will overcome the much slower execution of the game code itself.

According to the Eurogamer 3DMark API review, the FX CPUs with slower single-thread performance (but higher core counts) are as fast as or faster than the Intel CPUs with higher single-thread performance when issuing draw calls in DX12.

For example, the triple-module, six-thread FX 6300 is faster than the dual-core, four-thread Core i3.
And the quad-module, eight-thread FX 8350 is close to or faster than the Haswell Core i5 4690K, a quad core with far faster single-thread performance.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
According to the Eurogamer 3DMark API review, the FX CPUs with slower single-thread performance (but higher core counts) are as fast as or faster than the Intel CPUs with higher single-thread performance when issuing draw calls in DX12.

For example, the triple-module, six-thread FX 6300 is faster than the dual-core, four-thread Core i3.
And the quad-module, eight-thread FX 8350 is close to or faster than the Haswell Core i5 4690K, a quad core with far faster single-thread performance.

Again you draw conclusions from something you don't understand. You are way too deep into the marketing trap.

That test is more GPU related than CPU due to the command processor limit.



And these draw calls are extremely simplified and don't show anything real-world. And it won't go beyond 6 cores, for that matter. The scaling from 2 to 4 cores is only 50%; from 4 to 6 cores it's 15% at best, meaning the faster quad wins again. But again, it's all moot, since we are not going to see a game using such an extreme number of draw calls, because then it would also look like Star Swarm.

 
Last edited:

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
That is true only if you have something running on the CPU alone. In games where the GPU also does a lot of work, if your DX12 game engine can scale to 6-8 threads in order to feed the GPU more work at any given time, 6-8 slower cores will give you higher fps than 4 faster cores.
That is because you manage to feed the GPU more work per unit of time.
So we went from definitely being faster on a 6-core than on only 4 cores, to being quite a bit slower with 6 and just barely the same speed with 8 cores...

According to the Eurogamer 3DMark API review you posted:
FX 6300 DX12 7.7m 12.6m 12.5m 12.7m
FX 8350 DX12 7.7m 14.1m 16.0m 14.8m
i5 4690K DX12 8.1m 14.1m 14.5m 14.7m
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
So we went from definitely being faster on a 6-core than on only 4 cores, to being quite a bit slower with 6 and just barely the same speed with 8 cores...

According to the Eurogamer 3DMark API review you posted:
FX 6300 DX12 7.7m 12.6m 12.5m 12.7m
FX 8350 DX12 7.7m 14.1m 16.0m 14.8m
i5 4690K DX12 8.1m 14.1m 14.5m 14.7m

Again, the Eurogamer review above is only about draw calls. In actual games your CPU will also have to perform other tasks simultaneously, like AI, physics, etc., on top of draw calls.
A way faster quad core could still beat a way slower 6- or even 8-core CPU, but it would need to be on the order of 2x faster to accomplish this in a DX12 game.
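The "2x faster" figure can be checked with back-of-the-envelope arithmetic. Perfect scaling across all cores is assumed here, which is generous to the high-core-count part; real games will scale worse:

```python
# For a quad core to match a slower 8-core under an engine that
# scales across every core, how much faster must each quad core be?

slow_core = 1.0                      # per-core speed of the 8-core, arbitrary units
eight_core_total = 8 * slow_core     # aggregate of 8 slow cores (perfect scaling assumed)

needed_speedup = eight_core_total / 4   # per-core speed a quad would need
print(f"quad cores must be {needed_speedup:.1f}x faster")  # 2.0x
```

Any sub-linear scaling on the 8-core side shrinks that requirement below 2x, which is why the argument hinges on how well DX12 game code actually spreads across cores.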
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Again, the Eurogamer review above is only about draw calls. In actual games your CPU will also have to perform other tasks simultaneously, like AI, physics, etc., on top of draw calls.
A way faster quad core could still beat a way slower 6- or even 8-core CPU, but it would need to be on the order of 2x faster to accomplish this.

This is also where the slow CPU will be hurt the most, because the game logic itself won't be multithreaded like there's no tomorrow. Sure, you can find a few exceptions, but that is about equal to claiming we all play chess games.

Also, those draw call numbers will never be hit in real life.

In short, it's unlikely that anything will change from how it looks today. If anything, you will be able to get away with less CPU performance. But more likely, no real change.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
This is also where the slow CPU will be hurt the most, because the game logic itself won't be multithreaded like there's no tomorrow. Sure, you can find a few exceptions, but that is about equal to claiming we all play chess games.

Also, those draw call numbers will never be hit in real life.

In short, it's unlikely that anything will change from how it looks today. If anything, you will be able to get away with less CPU performance. But more likely, no real change.

Heh, even today with DX11, the latest games that can take advantage of 4-6 threads show a significant performance increase with the 8-core FX CPUs against the Core i5. You really believe that DX12 games will not benefit more from higher core counts than DX11 ones??
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Heh, even today with DX11, the latest games that can take advantage of 4-6 threads show a significant performance increase with the 8-core FX CPUs against the Core i5. You really believe that DX12 games will not benefit more from higher core counts than DX11 ones??

You greatly overestimate the benefit. The short answer is no, I don't think DX12 games will benefit from higher core counts any more than DX11 ones, if we exclude slow CPUs. Sure, the draw calls will be distributed more and benefit slower CPUs due to lower overhead, but that is about it. Everything related to the game logic doesn't change. And even with that distribution, it seems the sweet spot is still 4 cores for DX12: scaling to 6 cores gave 0-15%, and anything above that, 0. And that is assuming you are draw-call limited.
 

Despoiler

Golden Member
Nov 10, 2007
1,968
773
136
You greatly overestimate the benefit. The short answer is no, I don't think DX12 games will benefit from higher core counts any more than DX11 ones, if we exclude slow CPUs. Sure, the draw calls will be distributed more and benefit slower CPUs due to lower overhead, but that is about it. Everything related to the game logic doesn't change. And even with that distribution, it seems the sweet spot is still 4 cores for DX12: scaling to 6 cores gave 0-15%, and anything above that, 0. And that is assuming you are draw-call limited.

It's funny that you keep pointing out that the API overhead tests don't really mean anything in the real world. I agree that they need to be understood as a throughput test only. However, you then use these same limited-scope tests to justify sweeping statements about core-scaling benefits in the real world. Furthermore, you talk about game logic not changing, but that is already untrue. Ashes of the Singularity is showing us what is now possible.

The last I read, the Oxide engine uses a thread to schedule all of the other threads. It will dynamically spawn as many threads as it needs to get the work done. The scaling, they said, was massive to infinite. The other side of that is how much work you actually have to do. With Ashes of the Singularity, the stated 16-core example was with "thousands of units". Keep in mind that their units have independent AI down to the turret: thousands of units, and multiple turrets per unit. Then add the fact that all of the turrets have their own lights for their shots. It's easy to see how they are able to exploit as much CPU horsepower as is available.
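The scheduler-plus-worker pattern described above can be sketched generically. This is a hedged Python analogy of that pattern, not Oxide's actual engine code; the unit count, per-unit work, and turret count are all assumptions for illustration:

```python
# One coordinating thread hands independent unit-AI updates to a
# worker pool sized to the available hardware, so more CPU cores
# means more units processed in parallel.

import os
from concurrent.futures import ThreadPoolExecutor

def update_unit(unit_id):
    """Stand-in for per-unit AI: each unit (and each of its assumed
    3 turrets) makes its own decision independently."""
    turrets = 3
    return sum(unit_id + t for t in range(turrets))

units = range(1000)  # "thousands of units" in the stated example

# Pool size follows core count: the work distributes across however
# many cores the machine has, which is where the scaling comes from.
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    results = list(pool.map(update_unit, units))

print(f"updated {len(results)} units on up to {os.cpu_count()} workers")
```

Because every unit update is independent, the only ceiling on scaling is how much work exists, which matches the "how much work do you actually have to do?" caveat.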
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
I would claim Ashes of the Singularity is just as possible on DX11. It may not be as pretty, due to draw calls.

All those units and their AI have nothing to do with DX12. Only the graphics do.