Info Vulkan Ray Tracing talk


soresu

Platinum Member
I found this linked on the Khronos website: it's a talk about how ray tracing works in Vulkan, and about the expectation that the cross-vendor extension will be a superset of the Nvidia vendor extension:

 

soresu

Platinum Member
As far as machine learning is concerned, Vulkan will never be able to run a full-featured suite. Only inferencing will be an appropriate fit. If you need to train models then you're still effectively stuck with either CUDA or HIP ...
I don't think it's intended for training, or at least not for DNNs fed with enormous datasets - though don't quote me on that, the subgroup info seemed fairly early/sketchy.

Vulkan is an ideal fit for image/video/gfx/vision-related machine learning.

It will be interesting to see if they can get the path guiding and neural importance sampling (zero-variance sampling) code working on it.

From what I read in the published papers, they seemed to be running that code on a CPU and even then getting stellar improvements, so GPU acceleration should be a no-brainer.
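
For intuition on why "zero-variance" sampling matters, here's a toy sketch (my own illustration, not the papers' actual method): if you can draw samples with a density exactly proportional to the integrand, the estimator f(x)/p(x) is constant, so every single sample returns the exact answer.

```
// Toy "zero-variance" importance sampling: estimate I = integral of
// 3x^2 over [0,1], which is exactly 1. Host-only code, compiles with nvcc.
#include <cstdio>
#include <cmath>
#include <random>

int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const int N = 10000;

    double uniformSum = 0.0, importanceSum = 0.0;
    for (int i = 0; i < N; ++i) {
        // Uniform sampling: estimator is f(x)/p(x) with p(x) = 1.
        double x = u(rng);
        uniformSum += 3.0 * x * x;

        // Importance sampling with p(x) = 3x^2, i.e. p proportional to f:
        // draw x by inverting the CDF x^3, so f(x)/p(x) == 1 every time.
        double y = std::cbrt(u(rng));
        importanceSum += (3.0 * y * y) / (3.0 * y * y); // always exactly 1
    }
    // The uniform estimate wobbles around 1; the importance estimate is 1.
    std::printf("uniform:    %f\n", uniformSum / N);
    std::printf("importance: %f\n", importanceSum / N);
    return 0;
}
```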
 

soresu

Platinum Member
While subgroup operations are good, Vulkan is still missing some bindless extensions and other essentials for wider use cases ...
To be clear, I meant specialised working subgroups within the Vulkan standards development committee - i.e. one for ML, one for RT, one for video, and so on.
Meh, emulating old game systems with fixed-function hardware isn't all that complex a case compared to creating game engines, render engines, CAD, etc ...
That's not how the Dolphin guys described it. They basically implied the ubershader emulates the entire GC/Wii gfx pipeline within one huge shader, and it took ages to get it working in a performant manner. The point was to eliminate shader-compilation stutter, I think (which usually causes serious jank) - it's been a while and the specifics have gone fuzzy in my memory.

Link to the Dolphin ubershader blog post here.
 

ThatBuzzkiller

Golden Member
I don't think it's intended for training, or at least not for DNNs fed with enormous datasets - though don't quote me on that, the subgroup info seemed fairly early/sketchy.

Vulkan is an ideal fit for image/video/gfx/vision-related machine learning.

It will be interesting to see if they can get the path guiding and neural importance sampling (zero-variance sampling) code working on it.

From what I read in the published papers, they seemed to be running that code on a CPU and even then getting stellar improvements, so GPU acceleration should be a no-brainer.

Vulkan will never be good enough for training models in machine learning, and that applies even more so to Metal. For that you need a single-source programming model that enables C++-style templates, which only CUDA/HIP - or even modern console APIs like PS4 GNM - can offer. So there's a higher chance of getting the full-featured TensorFlow framework running on the PS4 than on any hardware running Vulkan, whose separate-source programming model is unmaintainable for large projects ...
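
For the curious, here's a minimal sketch of what "single-source" means in practice (my own toy example with a made-up axpy kernel, not from TensorFlow or any framework): in CUDA, the templated device kernel and the host code that launches it live in the same C++ file, so supporting a new element type is one template instantiation - no shader strings, no reflection layer.

```
#include <cstdio>
#include <cuda_runtime.h>

// One templated kernel covers float, double, etc. - the "single-source" win.
template <typename T>
__global__ void axpy(int n, T a, const T* x, T* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1024;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Instantiating axpy<double> would be one more line here; in a
    // separate-source model (e.g. Vulkan + GLSL) each variant is its own
    // shader, built and bound through a separate compilation pipeline.
    axpy<float><<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
    cudaDeviceSynchronize();

    std::printf("y[0] = %f\n", y[0]); // 3*1 + 2 = 5
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```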

To be clear, I meant specialised working subgroups within the Vulkan standards development committee - i.e. one for ML, one for RT, one for video, and so on.

That's not how the Dolphin guys described it. They basically implied the ubershader emulates the entire GC/Wii gfx pipeline within one huge shader, and it took ages to get it working in a performant manner. The point was to eliminate shader-compilation stutter, I think (which usually causes serious jank) - it's been a while and the specifics have gone fuzzy in my memory.

Link to the Dolphin ubershader blog post here.

The ATi Flipper/Hollywood are old fixed-function GPUs, so they aren't much of a challenge to emulate. Also, ubershaders are just shaders with many branches. There are far more systems out there with trickier GPUs to emulate ...
 

soresu

Platinum Member
The ATi Flipper/Hollywood are old fixed-function GPUs, so they aren't much of a challenge to emulate. Also, ubershaders are just shaders with many branches. There are far more systems out there with trickier GPUs to emulate ...
As they say, put your money where your mouth is.

Talk is cheap, but developing an accurate, fast emulator takes a lot of time - any idiot can claim it's easy. Not that I claim the capability for myself, but the emulator devs deserve respect for the time and effort they've put in.

The Dolphin devs are the experts here - they say it's hugely complex to implement (hundreds to thousands of man-hours invested just to prove it works), and I'm inclined to believe them; their results clearly speak for themselves.
Also, ubershaders are just shaders with many branches. There are far more systems out there with trickier GPUs to emulate ...
None of them have ubershaders yet (unless I missed it - feel free to correct me if so), so you kind of proved my point by mentioning them. I've seen a Cemu dev as good as say they won't even attempt to implement one.

Given the amount of time Dolphin quoted just to reach proof of concept, and their ubershaders' operating requirements, I'm inclined to think more complex GPUs will take an extremely long time to write ubershaders for - and the result may not even run well enough on modern GPUs to make such a project worthwhile.

Having said that, GPUs seem to be slowly getting better at branch-heavy code, so it may be a more viable path in the future.
 

ThatBuzzkiller

Golden Member
As they say, put your money where your mouth is.

Talk is cheap, but developing an accurate, fast emulator takes a lot of time - any idiot can claim it's easy. Not that I claim the capability for myself, but the emulator devs deserve respect for the time and effort they've put in.

The Dolphin devs are the experts here - they say it's hugely complex to implement (hundreds to thousands of man-hours invested just to prove it works), and I'm inclined to believe them; their results clearly speak for themselves.

I don't mean to devalue their work, but emulating old GPUs stopped being a torture test quite a while ago ...

None of them have ubershaders yet (unless I missed it - feel free to correct me if so), so you kind of proved my point by mentioning them. I've seen a Cemu dev as good as say they won't even attempt to implement one.

Given the amount of time Dolphin quoted just to reach proof of concept, and their ubershaders' operating requirements, I'm inclined to think more complex GPUs will take an extremely long time to write ubershaders for - and the result may not even run well enough on modern GPUs to make such a project worthwhile.

Having said that, GPUs seem to be slowly getting better at branch-heavy code, so it may be a more viable path in the future.

It's the other way around: Dolphin having 'ubershaders' (a misnomer in itself) means the GC/Wii is a very simple system. 'Ubershaders', as you would have it, are nothing special - they're used all the time in game engines ...
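
For what it's worth, here's the "many branches" pattern as a toy sketch (my own illustration in CUDA, with made-up names like ubershade and BlendMode - not Dolphin's actual shader, which covers the whole TEV pipeline): one precompiled kernel reads the emulated pipeline state per element and branches on it at runtime, instead of compiling a specialised shader for each state combination mid-game.

```
#include <cstdio>
#include <cuda_runtime.h>

// Emulated pipeline state, read per element at runtime.
enum BlendMode { BLEND_REPLACE = 0, BLEND_ADD = 1, BLEND_MODULATE = 2 };

__global__ void ubershade(int n, const int* mode, const float* src, float* dst) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    // A real ubershader's branching covers the whole fixed-function
    // pipeline (TEV stages, fog, alpha test, ...), which is where the
    // size and the engineering effort come from.
    switch (mode[i]) {
        case BLEND_REPLACE:  dst[i] = src[i];  break;
        case BLEND_ADD:      dst[i] += src[i]; break;
        case BLEND_MODULATE: dst[i] *= src[i]; break;
    }
}

int main() {
    const int n = 3;
    int* mode; float *src, *dst;
    cudaMallocManaged(&mode, n * sizeof(int));
    cudaMallocManaged(&src, n * sizeof(float));
    cudaMallocManaged(&dst, n * sizeof(float));
    mode[0] = BLEND_REPLACE; mode[1] = BLEND_ADD; mode[2] = BLEND_MODULATE;
    for (int i = 0; i < n; ++i) { src[i] = 2.0f; dst[i] = 3.0f; }

    // One precompiled kernel handles every state combination - no mid-game
    // compilation stutter, at the cost of runtime branching.
    ubershade<<<1, 32>>>(n, mode, src, dst);
    cudaDeviceSynchronize();
    std::printf("%f %f %f\n", dst[0], dst[1], dst[2]); // 2 5 6
    cudaFree(mode); cudaFree(src); cudaFree(dst);
    return 0;
}
```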

Ubershaders in system emulation get more complicated with modern systems, because the applications on those systems do offline shader compilation. Take, for example:

- Xbox 360: custom DX shaders -> Xenos/custom Adreno 2XX binaries
- PS3: GCM shaders -> RSX/custom NV47 binaries
- Wii U: GX2 shaders -> custom R600/R700 binaries
- Xbox One: custom DX shaders -> GCN2 binaries
- PS4: GNM shaders -> GCN2 binaries

and, for fun, the portable systems as well:

- 3DS: PICA shaders -> PICA binaries
- Vita: GXM shaders -> USSE binaries
- Switch: NVN shaders -> Maxwell 2 binaries

(As an added extra, a gfx API is not without its command pipeline either: DX/GNM/GX2 pipeline -> PM4 commands, GXM pipeline -> SGX commands, PICA pipeline -> PICA commands, NVN pipeline -> Maxwell 2 commands.)

Ubershaders are a bad idea for modern game console emulation, since those consoles' offline compilation model is not conducive to that strategy - unlike the ATi Flipper/Hollywood, which are totally fixed-function GPUs ...

Attempting to recompile those shader binaries into DXIL/SPIR-V bytecode, and then having the drivers compile that again into the host hardware's native ISA, will introduce tons of compilation overhead. If emulator developers really wanted to achieve the same effect as Dolphin's 'ubershaders' that badly, they could instead reimplement the console's gfx API on Mesa, or build a new gfx API matching the original console as closely as possible - but that causes tons of portability issues ...

Implementing GNM on Mesa knowing it'll only work on GCN2 GPUs doesn't seem very attractive. Same deal with implementing NVN on Mesa, which would only work on Maxwell 2 GPUs. On the bright side, in both cases you'd get close to zero compilation overhead, since a compatible host GPU will happily accept its own binaries!

You could make a new gfx API that's a looser match for the console's gfx API so it's compatible with more hardware configurations, but then you reintroduce compilation overhead for the instructions/shaders that aren't compatible with the other hardware. The worst part is that the vendor(s) you were targeting could simply obsolete your work if their next GPU designs aren't compatible with your API ... (hardware designs become obsolete over time, so tying the API to the hardware isn't a good idea)

Fun fact: both CUDA and HIP have an offline compilation model! (PTX is as 'offline' as it gets for Nvidia GPUs.)

I still wish the Mantle API hadn't been deprecated, since it could've been more useful for PS4 emulation than Vulkan. I can't remember off the top of my head, but with Mantle shaders I think you had the option to either write raw GCN assembly or use AMDIL (AMD's intermediate language for its GPUs); either way, both were a better match for GNM shaders/GCN2 binaries than SPIR-V is ...
 

beginner99

Diamond Member
I’ve yet to hear of a game that works better in Vulcan vs dx11.

The problem here is probably drivers. With vulcan/DX12, work that used to be handled by the driver needs to be done by the devs, and as we know from history, devs have been terrible at adhering to standards and coding things correctly - hence why things like game-ready drivers exist: they patch the broken/subpar code in games. Now the devs have to do that work themselves, and we get the micro-stutter and inconsistent performance issues we're seeing. Also, DX11 looks good in benchmarks because benchmarks are usually run on high-end CPUs; the real benefit of vulcan and co. is performance on weaker CPUs.
 

DeathReborn

Platinum Member
The problem here is probably drivers. With vulcan/DX12, work that used to be handled by the driver needs to be done by the devs, and as we know from history, devs have been terrible at adhering to standards and coding things correctly - hence why things like game-ready drivers exist: they patch the broken/subpar code in games. Now the devs have to do that work themselves, and we get the micro-stutter and inconsistent performance issues we're seeing. Also, DX11 looks good in benchmarks because benchmarks are usually run on high-end CPUs; the real benefit of vulcan and co. is performance on weaker CPUs.

When AMD announced Mantle (and again when DX12/Vulkan were announced), one of my first thoughts was "oh, so this is how AMD gets better optimisation in its drivers - shove the difficult coding onto the already overwhelmed developers". Probably all done with the best of intentions, but it hasn't exactly panned out so far.

I do wish people would leave Star Trek out of this though, it's a "k" not a "c".