D3D12 articles - so many misunderstandings and miscommunications


sontin

Diamond Member
sontin: Marketing strikes again.
Ask yourself: would you buy new hardware if they said these "new features" were possible on existing GPUs with some workaround? :awe:

Don't get angry about this, they just want to sell the product. Most of us fully understand why they are doing it. But on the other hand, that doesn't mean we can't write the truth here, for example.

What does this have to do with marketing?! We are talking about features supported by dedicated hardware blocks. Nobody will use CR emulated by shaders. If that were practical, we wouldn't need a new feature level to support it, or to introduce it as a new feature at all...

And unlike T&L or vertex shaders, we are not talking about CPU emulation here.
 

zlatan

Senior member
Nobody will use CR emulated by shaders.
I'm using conservative rasterization for shadowing. It is a hybrid shadow-mapping technique with ray tracing. The performance is not a problem for me on PS4, and I don't see why it would be a problem on a PC, though I haven't tried it there.

As I said, the IHVs need to sell hardware, so they're now creating new buzzwords for D3D12. And as you can see, it's working. :awe:
Sure, you can jump on this hype train, but that doesn't change the truth.
 

ThatBuzzkiller

Golden Member
I feel like we should only have graphics programmers speak about highly technical and specialized topics like these ...

OT: AFAIK GCN is the only architecture to offer the Tier 3 binding model, because it can support unlimited resources, which allows for an unlimited number of CBVs, SRVs, UAVs, and samplers ...
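To put the tier talk in API terms, here is a minimal sketch of querying the binding tier at runtime; the helper name HasTier3Binding is hypothetical and the device is assumed to be already created, with error handling omitted:

```cpp
#include <d3d12.h>

// Minimal sketch: query the resource binding tier on an already-created device.
bool HasTier3Binding(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(
            D3D12_FEATURE_D3D12_OPTIONS, &options, sizeof(options))))
        return false;

    // Tier 3 is the "effectively unlimited CBVs/SRVs/UAVs/samplers" case
    // referred to above; Tier 1/2 hardware has lower binding limits.
    return options.ResourceBindingTier == D3D12_RESOURCE_BINDING_TIER_3;
}
```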
 

Enigmoid

Platinum Member
Sure the performance is a problem on the PS4. That's why, pun intended, they are using it conservatively. CR is only applied to static objects and is done in large chunks (1 m relative to game scale) in The Tomorrow Children (there really is not a ton of stuff on screen in that game).

Just because it's being used doesn't mean it doesn't incur large performance penalties.
 

sontin

Diamond Member
Intel published a paper about "Deep Shading Buffers" in which they used a software CR approach on Kepler, GCN and Intel GPUs: https://software.intel.com/en-us/articles/deep-shading-buffers-on-commodity-gpus

A few quotes from it:
Due to the lack of hardware support in current GPUs, we implement conservative rasterization in a geometry shader (GS)
The main drawback of a geometry shader-based approach, besides having to enable the GS stage, is that we cannot rely on built-in perspective-correct vertex attribute interpolation.
Instead, the vertex attributes of the original triangle have to be passed to the pixel shader for manual interpolation, which consumes a large number of input/output registers. For these reasons, we believe hardware support for conservative rasterization is highly desired.
Indeed, support for conservative rasterization in Direct3D 12 has already been announced.
We note that conservative rasterization (CR) implemented in the geometry shader consumes a disproportionately large portion of the total frame time for the complex ARENA scene. The reason for this is twofold: the scene contains a large number of primitives, which are nearly all visible, and uses a large number of vertex attributes. These all have to be passed from the geometry shader to the pixel shader, and manually interpolated. The savings would be very large if conservative rasterization instead was implemented in hardware[...]
So yes, DX11_3 and DX12 will require a "hardware" implementation of CR within the rasterizers.
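At the API level this surfaces as a cap bit plus a rasterizer-state flag. A minimal sketch, assuming a valid device and a graphics PSO description filled in elsewhere (the helper name EnableConservativeRaster is hypothetical):

```cpp
#include <d3d12.h>

// Minimal sketch: query the conservative rasterization cap and, if present,
// request CR on a graphics PSO. psoDesc is assumed to be filled in elsewhere
// (shaders, root signature, render-target formats, ...).
bool EnableConservativeRaster(ID3D12Device* device,
                              D3D12_GRAPHICS_PIPELINE_STATE_DESC& psoDesc)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS options = {};
    if (FAILED(device->CheckFeatureSupport(
            D3D12_FEATURE_D3D12_OPTIONS, &options, sizeof(options))))
        return false;

    if (options.ConservativeRasterizationTier ==
        D3D12_CONSERVATIVE_RASTERIZATION_TIER_NOT_SUPPORTED)
        return false;

    // The application only sets this flag; how the GPU/driver satisfies it
    // is not visible through the API.
    psoDesc.RasterizerState.ConservativeRaster =
        D3D12_CONSERVATIVE_RASTERIZATION_MODE_ON;
    return true;
}
```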
 

zlatan

Senior member
There are several implementation options for conservative rasterization. On GCN it doesn't really need to be a geometry shader implementation. I believe them that a GS approach has several drawbacks, so don't use one.

Fun fact: the vertex->pixel shader interpolation on GCN is already manual. The PS stage fetches the data and then interpolates it manually. The architecture is specifically designed to work this way, and it does the work efficiently. It is unrolled by the compiler, and LDS memory holds the vertex data for the rasterized triangle.

I believe that conservative rasterization with Intel's approach can be costly, but this is not the only way to do it.
Building hardware for conservative rasterization will probably be a huge help for Intel and NVIDIA, but not for AMD. They already use a robust solution, and GCN has far more registers than any other architecture.
The real question is what's possible in the APIs. GNM may use some features that are inaccessible through the present PC APIs.
 

sontin

Diamond Member
And here ends the discussion about DX12 and DX11_3:
There will be only one way - through the rasterizers.

Yes, there are different ways to implement CR. But only one with the DX flag.
 

dacostafilipe

Senior member
And here ends the discussion about DX12 and DX11_3:
There will be only one way - through the rasterizers.

Yes, there are different ways to implement CR. But only one with the DX flag.

I don't understand what you want to say with that.

But I tried ... really ... :sneaky:
 

zlatan

Senior member
And here ends the discussion about DX12 and DX11_3:
There will be only one way - through the rasterizers.

Yes, there are different ways to implement CR. But only one with the DX flag.
I don't understand this. The IHVs can choose how to implement certain features; MS just wants to standardize access to them. The hardware implementation can be different. If it's compatible with the standard, then MS doesn't care about the hardware.
 

alej

Junior Member
zlatan, can you already tell us about support for so-called "async compute" on Maxwell?
Is it "well" supported?
You've already mentioned that most GPUs support this, though not efficiently. So did Kepler already support it?

Thank you very much for the topic!
 

zlatan

Senior member
zlatan, can you already tell us about support for so-called "async compute" on Maxwell?
Is it "well" supported?
You've already mentioned that most GPUs support this, though not efficiently. So did Kepler already support it?

Thank you very much for the topic!
Maxwell v2 is fine. The first Maxwell is not that good, but better than a GK110 Kepler. The earlier Keplers are bad at this.
In theory the Broadwell iGPU is also very good, but async compute will force the hardware to run at its normal clock. It will still be better than turbo mode, but sometimes not by much.
The real king for this workload is GCN, with its many ACEs.
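For reference, "async compute" in D3D12 just means feeding a second, compute-type queue alongside the direct (graphics) queue. A minimal sketch, assuming a valid device; the helper name CreateAsyncComputeQueue is hypothetical and fences plus command-list recording are omitted:

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Minimal sketch: create a dedicated compute queue next to the usual direct
// (graphics) queue. Command lists of type COMPUTE submitted here may overlap
// with graphics work; how much actually runs concurrently is up to the
// hardware/driver, which is what the post above is comparing.
ComPtr<ID3D12CommandQueue> CreateAsyncComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

    ComPtr<ID3D12CommandQueue> queue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
    // Synchronization between this queue and the direct queue is done with
    // ID3D12Fence objects (omitted here).
    return queue;
}
```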
 

rainy

Senior member
Thank you very much zlatan - it was really interesting to read. :thumbsup:

Btw, I would like to see more stuff like that in the future.
 

sontin

Diamond Member
I don't understand this. The IHVs can choose how to implement certain features; MS just wants to standardize access to them. The hardware implementation can be different. If it's compatible with the standard, then MS doesn't care about the hardware.

Microsoft indeed specifies hardware features - look at DX11 Tessellation.

So far Microsoft hasn't announced a new shader stage or a new rendering order for CR, which makes it more and more clear that the rasterization stage will do the CR.
And there wouldn't be any standardization if you needed to implement three or more different CR paths...

BTW: Here are the new optional DX11.3 features:
https://msdn.microsoft.com/en-us/library/dn879499.aspx
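Those optional DX11.3 caps are queried the same way as any other feature data. A minimal sketch, assuming a valid ID3D11Device created on a D3D11.3-capable runtime; the helper name SupportsConservativeRaster11 is hypothetical:

```cpp
#include <d3d11_3.h>

// Minimal sketch: query the optional D3D11.3 caps listed on the MSDN page
// above from an existing ID3D11Device.
bool SupportsConservativeRaster11(ID3D11Device* device)
{
    D3D11_FEATURE_DATA_D3D11_OPTIONS2 opts2 = {};
    if (FAILED(device->CheckFeatureSupport(
            D3D11_FEATURE_D3D11_OPTIONS2, &opts2, sizeof(opts2))))
        return false;

    // CR is exposed as an optional, tiered cap; ROVs, typed UAV loads, etc.
    // live in the same structure (opts2.ROVsSupported and friends).
    return opts2.ConservativeRasterizationTier !=
           D3D11_CONSERVATIVE_RASTERIZATION_NOT_SUPPORTED;
}
```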
 

AtenRa

Lifer
Maxwell v2 is fine. The first Maxwell is not that good, but better than a GK110 Kepler. The earlier Keplers are bad at this.
In theory the Broadwell iGPU is also very good, but async compute will force the hardware to run at its normal clock. It will still be better than turbo mode, but sometimes not by much.
The real king for this workload is GCN, with its many ACEs.

Carrizo, Hawaii, Tonga, Rx 300
 

alej

Junior Member
Why not the HD 7000 generation? Are 2 ACEs - I think with 2 queues each - not enough?
 

ThatBuzzkiller

Golden Member
Intel published a paper about "Deep Shading Buffers" in which they used a software CR approach on Kepler, GCN and Intel GPUs: https://software.intel.com/en-us/articles/deep-shading-buffers-on-commodity-gpus

A few quotes from it:
So yes, DX11_3 and DX12 will require a "hardware" implementation of CR within the rasterizers.

What you showed just proves that GCN has the LEAST to gain from conservative rasterization, in comparison to Kepler, which has the MOST to gain from it ...

The Arena scene shows a 366% performance increase if the GTX 780 had a hardware implementation of conservative rasterization, whereas the R9 290 would only see a 67% boost ...

What you say about feature set 11.3 requiring IHVs to do a hardware implementation of conservative rasterization is rubbish, since they can choose to implement it however they wish ...

What's more, you also proved that a software approach to conservative rasterization is very viable, since the R9 290's performance in the Arena scene only falls behind the GTX 780's estimated hardware implementation by 20% ...
 

Gloomy

Golden Member
That's using a geometry shader too. Isn't the point of manual interpolation on GCN that you could conceivably create "geometry shader"-like pixel shaders, if I understand it correctly? But instead Intel does:

In this particular case, the GS expands the primitive and the PS performs attribute interpolation, before calling the original pixel shader.

But the pixel shader already does the interpolation on GCN, no? So why did Intel add a GS on top? Sorry if these are dumb questions.
 

ThatBuzzkiller

Golden Member
Microsoft indeed specifies hardware features - look at DX11 Tessellation.

So far Microsoft hasn't announced a new shader stage or a new rendering order for CR, which makes it more and more clear that the rasterization stage will do the CR.
And there wouldn't be any standardization if you needed to implement three or more different CR paths...

BTW: Here are the new optional DX11.3 features:
https://msdn.microsoft.com/en-us/library/dn879499.aspx

FYI, there's going to be a GPU out there compatible with the DX11 feature level that implements tessellation in the shaders instead of in a fixed-function unit like every desktop GPU, so it's clear that you don't know what you're talking about when it comes to how an IHV can implement features ...

BTW, game developers or ISVs won't have to implement paths specific to each GPU manufacturer, since it's clearly the driver's and the compiler's job to do that ...
 

RussianSensation

Elite Member
Thank you very much zlatan - it was really interesting to read. :thumbsup:

Btw, I would like to see more stuff like that in the future.

Seconded! It's way too technical for me since I am not a software programmer, but I always like to learn the basics of some of these next-generation features.

I would also like to know from the developers how long it will be before we see DX12 games. If Windows 10 only becomes available in Q3-Q4 2015, it's unlikely we'll see wide adoption of Windows 10 for another 1-2 years, and only then will developers start thinking about making DX12 games that use very specific features of the DX12 API. I don't actually believe we will see many DX12 games until late 2016 to early 2017, and whatever DX12 games do come out will have to support Fermi, Kepler, Maxwell and all GCN parts or they will not sell.

That's why I am not convinced that full DX12 and 11.3 functionality actually matters for today's gaming cards. In the past, small extensions such as DX8.1 vs. DX8 or DX10.1 vs. DX10 made no difference in 99% of games in the short term (2-3 years), and I don't see how it would be different this time. It usually takes 2-3 years before developers dive into a new API, because games take time to develop, and usually we need 2nd- or even 3rd-generation GPUs of that API to actually play next-gen games built on it. Every GPU we have today is just a mid-range product as far as the next generation goes (290X/780Ti/980).
 

ThatBuzzkiller

Golden Member
That's using a geometry shader too. Isn't the point of manual interpolation on GCN that you could conceivably create "geometry shader"-like pixel shaders, if I understand it correctly? But instead Intel does:



But the pixel shader already does the interpolation on GCN, no? So why did Intel add a GS on top? Sorry if these are dumb questions.

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html

If you read chapter 42 of GPU Gems 2, a geometry shader is needed to create a bounding polygon so that the fragment program can discard fragments that do not overlap the AABB (axis-aligned bounding box) of the original triangle.
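As a rough illustration of the idea only (not Intel's or the GPU Gems code), the overestimating conservative coverage test for one pixel can be sketched on the CPU like this: each edge test is relaxed by the pixel's half-extents, and the result is clamped by the original triangle's AABB to cut off the over-coverage near vertices. The names and structure here are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

struct Vec2 { float x, y; };

// Standard edge function: positive when p lies to the left of the directed
// edge a->b (the interior side for a counter-clockwise triangle).
static float edgeFunction(const Vec2& a, const Vec2& b, const Vec2& p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// 'center' is the pixel center, 'half' the pixel half-extent (0.5 in pixel units).
bool conservativelyCovered(const Vec2 tri[3], const Vec2& center, float half)
{
    // AABB of the original triangle, expanded by the pixel half-extent;
    // this is the clamp the pixel/fragment program applies.
    float minX = std::min({tri[0].x, tri[1].x, tri[2].x}) - half;
    float maxX = std::max({tri[0].x, tri[1].x, tri[2].x}) + half;
    float minY = std::min({tri[0].y, tri[1].y, tri[2].y}) - half;
    float maxY = std::max({tri[0].y, tri[1].y, tri[2].y}) + half;
    if (center.x < minX || center.x > maxX || center.y < minY || center.y > maxY)
        return false;

    for (int i = 0; i < 3; ++i)
    {
        const Vec2& a = tri[i];
        const Vec2& b = tri[(i + 1) % 3];
        // Largest amount the edge function can increase anywhere inside the
        // pixel square: equivalent to shifting the edge outward (what the
        // GS-expanded bounding polygon achieves).
        float slack = half * (std::fabs(b.x - a.x) + std::fabs(b.y - a.y));
        if (edgeFunction(a, b, center) + slack < 0.0f)
            return false; // the whole pixel square lies outside this edge
    }
    return true; // the pixel square overlaps the (slightly grown) triangle
}
```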
 