D3D12 articles - so much misunderstandings and miscommunications

PPB · Feb 24, 2015

I really like that you came back to this forum, even though you are still suffering the endless bickering from those people.

+1 to the question regarding DX12 adoption rate and a possible ETA for the very first DX12 games out there.

Your technical explanation only made it more obvious for me that I should dump this 760 for something more future proof, at least API-wise.

ThatBuzzkiller · Feb 24, 2015

sontin said:
The pixel shader comes after the rasterization stage. So to use CR without hardware support in the rasterizer, you need to use vertex and/or geometry shaders:
https://developer.nvidia.com/content/dont-be-conservative-conservative-rasterization

Just to be clear, part of conservative rasterization can also be done at the beginning of the pixel shader; the second algorithm requires this in order to discard fragments ...

Paul98 · Feb 24, 2015

RussianSensation said:
Seconded! It's way too technical for me since I am not a software programmer but I always like to learn the basics of some of these next generation features.

I would also like to know from the developers how long it will be before we see DX12 games? I mean if Windows 10 only becomes available Q3-Q4 2015, it's unlikely we'll actually see wide adoption of Windows 10 for another 1-2 years and only then developers will start thinking about making DX12 games that use very specific features of DX12 API. I don't actually believe we will see many DX12 games until late 2016 to early 2017 and whatever DX12 games will come out will have to support Fermi, Kepler, Maxwell and all GCN parts or the game will not sell. That's why I am not convinced that full DX12 and 11.3 functionality actually matters for today's gaming cards. In the past small extensions such as DX8.1 vs. DX8 or DX10.1 vs. DX10 made no difference in 99% of games in the short term (2-3 years). I don't see how it would be different this time in the short term. Usually it takes 2-3 years before developers dive into the new API because games take a bit of time to develop and usually we need 2nd or even 3rd generation GPUs of that API to actually be able to play next gen games of that next gen API. Every GPU we have today is just a mid-range product as far as next generation goes (290X/780Ti/980).

Many major engines will have DX12 support out around the time of release. Thus they will be able to release DX12 versions of their games at that point. So you can expect to see them by the end of this year.

ThatBuzzkiller · Feb 24, 2015

Enigmoid said:
Sure, the performance is a problem on the PS4. That's why, pun intended, they are using it conservatively. CR only applies to static objects and is done in large chunks (1m relative game size) in The Tomorrow Children (there really is not a ton of stuff on screen in that game).

Just because it's being used doesn't mean that it does not incur large performance penalties.

@Bold: Good lord, who told you that?! Or did you just think that up?

1. Rasterization in general does not care whether it's static or dynamic geometry ...

2. Conservative rasterization is just a modification to the coverage tests and nothing more ...
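To illustrate the "modification to the coverage tests" point, here is a minimal CPU-side sketch, not taken from any driver or from the linked article. It assumes a counter-clockwise triangle, one sample at each pixel center, and an overestimating variant that pushes each edge outward by half a pixel and clips against the triangle's bounding box grown by half a pixel. The names `edge_coeffs` and `covered_pixels` are made up for this example, and fill rules are ignored.

```python
def edge_coeffs(p0, p1):
    """Coefficients (a, b, c) of the edge function a*x + b*y + c.

    For a counter-clockwise triangle the function is >= 0 on the inner side.
    """
    a = p0[1] - p1[1]
    b = p1[0] - p0[0]
    c = p0[0] * p1[1] - p0[1] * p1[0]
    return a, b, c


def covered_pixels(tri, width, height, conservative=False):
    """Return the set of (i, j) pixels covered by a counter-clockwise triangle.

    Standard mode tests the pixel center only. Conservative mode marks every
    pixel whose unit square touches the triangle (overestimation).
    """
    edges = [edge_coeffs(tri[i], tri[(i + 1) % 3]) for i in range(3)]
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    covered = set()
    for j in range(height):
        for i in range(width):
            cx, cy = i + 0.5, j + 0.5
            if conservative:
                # The pixel square must overlap the triangle's bounding box.
                in_box = (min(xs) - 0.5 <= cx <= max(xs) + 0.5
                          and min(ys) - 0.5 <= cy <= max(ys) + 0.5)
                if not in_box:
                    continue
            inside = True
            for a, b, c in edges:
                value = a * cx + b * cy + c
                if conservative:
                    # Largest value the edge function takes over the pixel square.
                    value += 0.5 * (abs(a) + abs(b))
                if value < 0:
                    inside = False
                    break
            if inside:
                covered.add((i, j))
    return covered


# A thin sliver that passes between pixel centers on a 4x4 grid.
sliver = [(0.1, 0.1), (3.9, 0.1), (3.9, 0.3)]
print(sorted(covered_pixels(sliver, 4, 4)))
print(sorted(covered_pixels(sliver, 4, 4, conservative=True)))
```

With this sliver the center-sampled test reports no pixels at all, while the conservative test reports the whole bottom row. The only difference between the two modes is the per-edge offset and the bounding-box check; nothing in the test depends on whether the geometry is static or dynamic.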

Noctifer616 · Feb 24, 2015

Paul98 said:
Many major engines will have DX12 support out around the time of release. Thus they will be able to release DX12 versions of their games at that point. So you can expect to see them by the end of this year.

I think he means true DirectX 12 games, those designed with the new generation of APIs in mind. Most games that come out with DirectX 12 support this year will not be much different from Mantle games now: DirectX 11 ports to DirectX 12.

Spjut · Feb 24, 2015

Battlefront will probably support DX12; DICE has been very quick in adopting even minor DirectX revisions.

AFAIK, Crysis 3 was the first game to require actual DX11.0 hardware and it was released four years after DX11.0 capable GPUs were released. We're probably looking at a similar timeframe for DX12.

Rvenger · Feb 24, 2015

ShintaiDK said:
2 outdated documents with nothing to do with DX12 and what, find it yourself links?

It's you making the claims, so back it up. Else it's nothing but hot air.

The OP already stated that he can't provide certain information, so I don't know what you are complaining about. If you are here to cloud the discussion you can leave this thread.

This also goes for everyone else as a FYI - Keep it open minded and stop the arguing.

-Rvenger

PPB · Feb 24, 2015

Spjut said:
Battlefront will probably support DX12; DICE has been very quick in adopting even minor DirectX revisions.

AFAIK, Crysis 3 was the first game to require actual DX11.0 hardware and it was released four years after DX11.0 capable GPUs were released. We're probably looking at a similar timeframe for DX12.

That is good, it means Battlefront will probably be the borked game at release a la BF4, and hopefully BF5 launches on a very established platform (one can dream).

blastingcap · Feb 24, 2015

zlatan said:
- Will these new features require new hardware?
The best answer is yes and no. This is a complicated question, and it is hard to answer while the specs are not public. But let's say Typed UAV Load will require hardware support. The GCN-based Radeons can support it, as well as the Maxwell v2 (GM206/GM204) architecture. Maybe more NVIDIA hardware can access the feature, but I don't know because they don't disclose what is possible with Maxwell v1/Kepler/Fermi. Intel might support it, but I'm not familiar with these iGPUs.

Could you please elaborate?? I didn't think GCN v1.0 or even v.1.1 could support ALL features in DX12 with hardware-only (no emulation!!!). Are you saying even GCN v1.0 supports ALL features in DX12 with hardware-only, no need for emulation?

RussianSensation · Feb 24, 2015

PPB said:
Your technical explanation only made it more obvious for me that I should dump this 760 for something more future proof, at least API-wise.

Not related to the topic of this thread directly but imo if the 760 is good enough for the games you play, don't spend money upgrading for the sake of API support. Upgrade for the games if your 760 struggles in them. When DX12 games come out that actually show us tangible benefits in IQ and/or performance advantages over DX11 and DX12 cards, then you can upgrade. Within 10 months, $700 780 Ti performance became available in a $330 970. Think about how much more GPU horsepower and features you'll be able to purchase by December 2015. There is absolutely no point in upgrading for the sake of DX12 today. If Dying Light or Evolve or Project CARS or The Witcher 3 run poorly for your tastes, sure, upgrade. I would hesitate to upgrade for some future game until it hits the shelf, and that means I would definitely never upgrade for some unannounced DX12 games some time 6-12-18 months from now because of how quickly things change in GPU land.

PPB · Feb 24, 2015

RussianSensation said:
Not related to the topic of this thread directly but imo if the 760 is good enough for the games you play, don't spend money upgrading for the sake of API support. Upgrade for the games if your 760 struggles in them. When DX12 games come out that actually show us tangible benefits in IQ and/or performance advantages over DX11 and DX12 cards, then you can upgrade. Within 10 months, $700 780 Ti performance became available in a $330 970. Think about how much more GPU horsepower and features you'll be able to purchase by December 2015. There is absolutely no point in upgrading for the sake of DX12 today. If Dying Light or Evolve or Project CARS or The Witcher 3 run poorly for your tastes, sure, upgrade. I would hesitate to upgrade for some future game until it hits the shelf, and that means I would definitely never upgrade for some unannounced DX12 games some time 6-12-18 months from now because of how quickly things change in GPU land.

The API part is important for me as I am a 0% IQ, 100% FPS guy; the CPU is always the weakest link in the chain for those kinds of setups.

But what is really a downer for me is that Maxwell is horrible for the CUDA applications that I use, and OpenCL isn't quite there yet to get rid of the closed standard.

RussianSensation · Feb 24, 2015

PPB said:
The API part is important for me as I am a 0% IQ, 100% FPS guy; the CPU is always the weakest link in the chain for those kinds of setups.

That's like me buying a GPU today for the supposed 2016 launch of Star Citizen to "future-proof" for that game, when I have no clue as to the actual launch date of Star Citizen or how it will perform on any of the current cards with current drivers. I can guarantee 100% that by the time any DX12 game launches, there will either be cards faster than a $300-330 970/290X and $550 980 OR that level of performance will be available for much less. If any gamer today is upgrading for the sake of the DX12 API, they are wasting money.

In fact, until Windows 10 + a single DX12 game actually launches, upgrading primarily for DX12 is irrelevant. Part of the reason is that, as of today, no one has provided a good enough technical explanation of whether any of the current cards will support all DX12 features. I am betting that by the time a high quality DX12 game launches we'll be able to buy a card with 980's performance for $300 anyway. So it's really just a marketing bullet-point to get people to spend extra today. GPU companies love it when gamers upgrade for marketing gimmicks.

Enigmoid · Feb 24, 2015

ThatBuzzkiller said:
What you showed just proves that it is GCN that has the LEAST to gain from conservative rasterization in comparison to Kepler which has the MOST to gain from it ...

The Arena scene shows an increase of 366% in performance if the GTX 780 had a hardware implementation of conservative rasterization whereas the R9 290 would only achieve a boost of 67% ...

What you say about feature set 11.3 requiring IHVs to do a hardware implementation of conservative rasterization is rubbish since they can choose to implement it however they wish ...

What's more is that you also proved that a software approach to conservative rasterization is very viable when the R9 290's performance in the Arena only falls behind the GTX 780's estimated hardware implementation by 20% ...

Yes, because clearly a 67% gain is minimal.

The 290 handled it much better but we are still talking about HUGE gains using hardware rasterization. Even a 40% performance drop is unacceptably huge.

Mindtaker · Feb 26, 2015

zlatan said:
No need for these. Just check the return value in D3D12 when the API (and the drivers) are publicly available.
But if you want more technical detail, then here it is:
NV Fermi: Max UAV is limited to 8 -> TIER1
NV Kepler: Max UAV is limited to 8 -> TIER1
NV Maxwellv1: Max UAV is limited to 8 -> TIER1
NV Maxwellv2: SRVs/stage is limited to 2^20 -> TIER2
Intel Gen7dot5/Gen8: the universal hardware binding table is limited to 255 slots -> TIER1
AMD GCN v1/v2/v3...: GCN is designed around a simplified resource model, so this architecture works more like a CPU than a GPU. This will allow unlimited resource binding -> TIER3

Shouldn't the GCN Master Race be better?

And check this

What you describe sounds like GPGPU-related stuff. You ignore how the rasterizer fits into this story. The question was: can you force the order from threads generated by the rasterizer? If so, how?
Intel and nVidia do this by guaranteeing the triangle-order from the rasterizer, and then having a sort of critical section inside a pixel-shader to make sure that the per-pixel operations of each triangle are performed in-order as well.
If the rasterizer does not know about ROV, then it may try to be smart and triangles might overtake each other. For example, say triangles 0-4 are queued on one cluster while triangles 5-8 are queued on another; or triangles 0, 2, 4, etc. are queued on one cluster and triangles 1, 3, 5, etc. on another, and triangles 0, 2, 4 take longer to render than 1, 3, 5. There are many kinds of scenarios where triangle order cannot be solved by just a critical section inside the shader.

If this is possible with GCN/Mantle, I'd like to have some detailed code explaining how to set up both the rasterizer and the pixel shaders for that. And then we can see how efficient that will be. The most naive solution would just serialize all triangles, making it extremely slow. The critical section part is what makes it very efficient, since it only slows down when there is actual overlap of pixels.
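The ordering problem in that quote can be shown with a small CPU-side toy model. This is only a sketch under stated assumptions: it does not represent how any vendor's hardware or any API implements ordered access, and the function names and the fragment tuple layout (primitive id, pixel, value, alpha) are invented for the example. Fragments from two clusters arrive out of submission order, and a non-commutative blend gives a different result unless fragments that hit the same pixel are applied in primitive order.

```python
from collections import defaultdict


def blend_over(dst, src, alpha):
    """Classic 'over' blend; the result depends on the order of application."""
    return src * alpha + dst * (1.0 - alpha)


def shade_in_arrival_order(fragments, background=0.0):
    """Blend fragments in whatever order they arrive (no ordering guarantee)."""
    pixels = defaultdict(lambda: background)
    for _prim_id, pixel, value, alpha in fragments:
        pixels[pixel] = blend_over(pixels[pixel], value, alpha)
    return dict(pixels)


def shade_with_pixel_ordering(fragments, background=0.0):
    """Emulate ordered per-pixel access.

    Fragments that touch the same pixel are applied in primitive (submission)
    order; pixels touched by a single fragment need no reordering at all.
    """
    per_pixel = defaultdict(list)
    for frag in fragments:
        per_pixel[frag[1]].append(frag)
    result = {}
    for pixel, frags in per_pixel.items():
        colour = background
        for _prim_id, _pixel, value, alpha in sorted(frags, key=lambda f: f[0]):
            colour = blend_over(colour, value, alpha)
        result[pixel] = colour
    return result


# (primitive id, pixel, value, alpha) in submission order.
submitted = [
    (0, (0, 0), 1.0, 0.5),
    (1, (0, 0), 0.0, 0.5),
    (2, (1, 0), 1.0, 0.5),
    (2, (2, 0), 1.0, 0.5),
    (3, (2, 0), 0.0, 0.5),
]
# Hypothetical arrival order: the cluster handling triangles 1 and 3 finishes first.
arrival = [submitted[i] for i in (1, 4, 0, 2, 3)]

reference = shade_in_arrival_order(submitted)
unordered = shade_in_arrival_order(arrival)
ordered = shade_with_pixel_ordering(arrival)
print(reference, unordered, ordered)
```

In this model only the two pixels with overlapping triangles come out wrong without ordering; the pixel hit by a single fragment is unaffected. That matches the point about the critical section only costing something where pixels actually overlap.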

AnandThenMan · Feb 26, 2015

Are you going to give a source for that quote?

Mindtaker · Feb 26, 2015

AnandThenMan said:
Are you going to give a source for that quote?

Sure, check the comments.

AnandThenMan · Feb 26, 2015

That's from September last year. And the source...well....

Mindtaker · Feb 26, 2015

AnandThenMan said:
That's from September last year. And the source...well....

Tom said the same.

Typed UAV loads: While UAVs are not defined, Mantle uses a universal buffer type (image) that can support any read/write operations with full type conversion. This is a more natural path for GCN, whose architecture can create unlimited resources.
The other architectures only support creating a limited number of resources. For example, Intel gen7dot5/gen8 has 255 slots for UAV/SRV/CBV/sampler/descriptor table. Fermi can support 8 UAV, 128 SRV, 14 CBV, 16 sampler, 5 descriptor table. Kepler/1st Maxwell can support 8 UAV, 2^20 SRV and sampler, 14 CBV, 5 descriptor table. 2nd Maxwell can support 64 UAV, 2^20 SRV and sampler, 14 CBV, 5 descriptor table. GCN can support anything with the ability of unlimited resource creation. This is also how the console APIs work. D3D12 will have a TIER_3 class binding model for the Xbox One, and this class can be supported by GCN. The other microarchs will support TIER_1 (gen7dot5/gen8/Fermi/Kepler/1st Maxwell) and TIER_2 (2nd Maxwell).
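The grouping in that quote can be written down as a small lookup. This is a sketch only: the per-architecture limits are the numbers quoted above, and the tier thresholds in `binding_tier` are assumptions inferred from those numbers, not values taken from a published D3D12 specification. `BindingLimits`, `binding_tier`, and `QUOTED_LIMITS` are hypothetical names; Intel's shared 255-slot table is left out because it does not split into per-type limits.

```python
from dataclasses import dataclass

UNLIMITED = float("inf")


@dataclass(frozen=True)
class BindingLimits:
    """Resource limits as quoted in the thread (not verified)."""
    max_uavs: float
    max_srvs: float
    max_cbvs: float
    max_samplers: float


def binding_tier(limits):
    """Map limits to a tier, using thresholds assumed from the quoted numbers."""
    values = (limits.max_uavs, limits.max_srvs, limits.max_cbvs, limits.max_samplers)
    if all(v == UNLIMITED for v in values):
        return 3
    # Assumed Tier 2 bar: at least 64 UAVs and 2^20 SRVs and samplers.
    if (limits.max_uavs >= 64
            and limits.max_srvs >= 2 ** 20
            and limits.max_samplers >= 2 ** 20):
        return 2
    return 1


QUOTED_LIMITS = {
    "Fermi": BindingLimits(8, 128, 14, 16),
    "Kepler / Maxwell v1": BindingLimits(8, 2 ** 20, 14, 2 ** 20),
    "Maxwell v2": BindingLimits(64, 2 ** 20, 14, 2 ** 20),
    "GCN": BindingLimits(UNLIMITED, UNLIMITED, UNLIMITED, UNLIMITED),
}

for name, limits in QUOTED_LIMITS.items():
    print(name, "-> TIER", binding_tier(limits))
```

Under these assumed thresholds the result reproduces the grouping in the quote: Fermi, Kepler and Maxwell v1 land in tier 1 because of the 8 UAV limit, Maxwell v2 in tier 2, and GCN in tier 3.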

But that is only GPGPU data.

And? Sorry, but that isn't an argument.

ThatBuzzkiller · Feb 26, 2015

Mindtaker said:
But that is only GPGPU data.

And? Sorry, but that isn't an argument.

Uh oh ...

That guy doesn't seem to know about the resource binding model ...

sontin · Feb 26, 2015

ThatBuzzkiller said:
What you showed just proves that it is GCN that has the LEAST to gain from conservative rasterization in comparison to Kepler which has the MOST to gain from it ...

And the software approach will still be much slower than a hardware implementation within the rasterizer.

What you say about feature set 11.3 requiring IHVs to do a hardware implementation of conservative rasterization is rubbish since they can choose to implement it however they wish ...

So, then explain to us how a GPU supports CR when there doesn't exist any hardware block which is processing the API call.

What's more is that you also proved that a software approach to conservative rasterization is very viable when the R9 290's performance in the Arena only falls behind the GTX 780's estimated hardware implementation by 20% ...

The paper doesn't show any real numbers for a hardware implementation. They are estimated by looking at the numbers from the software approach.

ThatBuzzkiller · Feb 26, 2015

sontin said:
And the software approach will still be much slower than a hardware implementation within the rasterizer.

That won't be true for much longer once more powerful hardware arrives, along with other improvements geared towards making programmable rasterization a reality, just as T&L units were made obsolete by vertex shaders ...

sontin said:
So, then explain to us how a GPU supports CR when there doesn't exist any hardware block which is processing the API call.

Conservative rasterization is just an ALGORITHM, like all algorithms they can be implemented in software.

sontin said:
The paper doesn't show any real numbers for a hardware implementation. They are estimated by looking at the numbers from the software approach.

And their estimate is pretty accurate, since they have the rendering pipeline statistics to show for it. Oh, and look, they even listed the duration of the conservative rasterization process for each GPU they tested ...

Headfoot · Feb 27, 2015

blastingcap said:
Could you please elaborate?? I didn't think GCN v1.0 or even v.1.1 could support ALL features in DX12 with hardware-only (no emulation!!!). Are you saying even GCN v1.0 supports ALL features in DX12 with hardware-only, no need for emulation?

Please provide a link showing Dx12_0 compliance prohibits emulation/partial emulation

blastingcap · Feb 27, 2015

Headfoot said:
Please provide a link showing Dx12_0 compliance prohibits emulation/partial emulation

I asked OP for clarification because I want to know if existing hardware has hardware acceleration for all features of DX12, and you ... seem to want to pick a fight instead of helping clarify anything. I am not interested in merely DX12-compatible, I would prefer DX12 hardware accelerated.

If you have some sort of ax to grind, do it somewhere else.

ThatBuzzkiller · Feb 27, 2015

blastingcap said:
I asked OP for clarification because I want to know if existing hardware has hardware acceleration for all features of DX12, and you ... seem to want to pick a fight instead of helping clarify anything. I am not interested in merely DX12-compatible, I would prefer DX12 hardware accelerated.

If you have some sort of ax to grind, do it somewhere else.

That's not finalized ...

Headfoot · Feb 27, 2015

blastingcap said:
I asked OP for clarification because I want to know if existing hardware has hardware acceleration for all features of DX12, and you ... seem to want to pick a fight instead of helping clarify anything. I am not interested in merely DX12-compatible, I would prefer DX12 hardware accelerated.

If you have some sort of ax to grind, do it somewhere else.

I'm making a point which does in fact clarify.

Precisely like ThatBuzzkiller says, rendering and rendering techniques are algorithms. An algorithm can be implemented in hardware with close to 0 flexibility, like in many DSP processors for example. Or you can implement that algorithm 100% in software running on general hardware. Or you can fall somewhere in between, like on GPUs. With the rise of GPGPU, the GPU is more flexible and less rigid than it once was. Until we have verifiable details on whether DX12_0 compliance requires a 100% fully hardware accelerated implementation of some features, we can't call any architecture compliant or not. I suspect full hardware acceleration may not be needed given they're saying architectures all the way back to Fermi will "support DX12" (whatever that means).

It's very possible some architectures support some things partially in hardware and partially in software, like how QuickSync works. It'll probably be noticeably slower, but that doesn't mean it's not compliant...

The point I made is: Hardware acceleration has NOT been announced as a requirement for DX12 compliance. It may be, it may not be -- they haven't publicly committed one way or the other yet.
