DX12 Multi-GPU is a challenge


Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Ultimate perceived performance is not defined by frame rates. Frame rates are an objectively measurable stand-in for performance. Frame rate measures throughput, not ultimate performance; frame time measures latency. As we recently found out, the two don't always line up, and both are useful data. Higher-throughput GPUs usually have lower latency, but not always.
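
To put rough numbers on the throughput-vs-latency distinction (a toy illustration of mine, not data from any review):

```cpp
// Why average fps (throughput) can hide a latency problem: nine smooth
// frames plus one 50 ms hitch still averages out to a healthy-looking fps.
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    // Hypothetical per-frame render times in milliseconds.
    std::vector<double> frame_ms = {10, 10, 10, 10, 10, 10, 10, 10, 10, 50};

    double total = 0.0;
    for (double t : frame_ms) total += t;

    double avg_fps  = 1000.0 * frame_ms.size() / total;                    // throughput
    double worst_ms = *std::max_element(frame_ms.begin(), frame_ms.end()); // latency

    std::printf("average fps : %.1f\n", avg_fps);     // ~71 fps looks fine...
    std::printf("worst frame : %.1f ms\n", worst_ms); // ...but one frame took 50 ms (20 fps)
}
```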

Performance is how good the graphics look subjectively to the player. A G-Sync/FreeSync monitor plus a lower-throughput GPU may deliver better subjective performance than a faster-throughput GPU on a fixed-Hz monitor.

SFR is exactly the same way. Multi-GPU methods that increase throughput usually decrease latency too, but SFR decreases latency more while increasing throughput less. It may deliver less throughput, yet better perceived performance, which is ultimately what we're trying to quantify: the overall quality of the gaming experience. Don't confuse what's subjectively better (visual performance) with the best objectively measurable stand-in we use to quantify it (frames per second).

TL;DR: In graphics if it looks better, it is better.
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
Ultimate perceived performance is not defined by frame rates. Frame rates are an objectively measurable stand-in for performance. Frame rate measures throughput, not ultimate performance.

Performance is how good the graphics look subjectively to the player. A G-Sync/FreeSync monitor plus a lower-throughput GPU may deliver better subjective performance than a faster-throughput GPU on a fixed-Hz monitor.

SFR is exactly the same way. It may deliver less throughput, yet better perceived performance, which is ultimately what we're trying to quantify: the overall quality of the gaming experience.


I'd also add that not all games lend themselves to high fps, especially non-twitch games like RPGs. This is especially relevant in large games at high resolutions.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
DX12 multi-GPU is explicit and coded by the programmer.

It may be that AMD and nVidia can provide a fall-back after-the-fact SLI and CF mode as well.

I don't think the two are mutually exclusive.
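
Roughly, the "standard checks" for GPUs in the system look like this in DX12 (a minimal sketch of mine, error handling mostly omitted; not code from any shipping engine):

```cpp
// Enumerate adapters with DXGI, create a DX12 device on each hardware adapter,
// and ask the device how many GPU nodes (physical GPUs behind one device) it has.
#include <cstdio>
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<IDXGIFactory4> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) continue; // skip the WARP software adapter

        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device)))) {
            // GetNodeCount() > 1 means this one device spans multiple physical
            // GPUs (linked-adapter mode, i.e. an SLI/CF-style configuration).
            std::printf("adapter %u: %ls, nodes = %u\n",
                        i, desc.Description, device->GetNodeCount());
        }
    }
    return 0;
}
```

Separate dGPU + iGPU setups show up here as separate adapters instead, each with its own device.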
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
I'll also add that SFR is the broadest possible moniker.

I'm sure we'll see an absolute plethora of SFR methods. Just as there are many different anti-aliasing methods, there will be many different SFR methods, some better than others.
 

tential

Diamond Member
May 13, 2008
7,348
642
121
I'd also add that not all games lend themselves to high fps, especially non-twitch games like RPGs. This is especially relevant in large games at high resolutions.
Then in these games I'm even happier to get a boost from 40 fps minimums to 60 fps minimums than to go from 90 fps average to 180 fps average with CF/SLI.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
One thing is the optimization for different architectures, but will it be easier to make code that splits the workload between GPUs independently of the hardware? (Maybe not perfect scaling, but code that crudely splits the workload and gives a minimum of, say, 50% scaling, and 80%+ when optimized.)

Have we seen any multi-GPU setups with different vendors where it wasn't something entirely different from SLI/CF? For example, nVidia with an Intel iGPU doing post-processing?
 

tential

Diamond Member
May 13, 2008
7,348
642
121
I'm so glad I'm not in the mGPU camp anymore. Some devs can't even get v-sync right (really, v-sync on caps me to 30 FPS?).

Leaving it in their hands to create working profiles for x-product lines...naaaaah. Single card for me.

Good luck to you mGPU users.
I'm about to be in this camp as soon as my monitor is available stateside. I see it selling in Korea for 1,500 USD.

Tbh, if a game doesn't support CF/SLI, I won't purchase it until a single-GPU setup can run it at acceptable settings.

Really, games don't exist to me anymore until I can run them at 1800p+ resolution with acceptable cost and framerates; otherwise I'll just wait for a GPU that can do it. The Witcher 3 will probably take an Arctic Islands dual-GPU card, and if that can't handle it then 2017 is when I'll play it. I've got so many games to play that I couldn't care less whether CF/SLI is supported; it just means I'll be waiting. Between the Zelda games on the Dolphin emulator and the Wii U exclusives alone, I'm occupied until mid-2016.
 

biostud

Lifer
Feb 27, 2003
19,744
6,826
136
Have we seen any multi-GPU setups with different vendors where it wasn't something entirely different from SLI/CF? For example, nVidia with an Intel iGPU doing post-processing?

I was more wondering whether the developers' "crude" code for splitting the work between multiple similar graphics cards (normal SLI/CF) would be the same for nVidia and AMD. So if they made an SFR code path, it would work with both AMD and nVidia GPUs right from the start.

Then an optimized code path could be added for specific vendors or for iGPU+dGPU setups, where different GPUs handle different parts of the rendering/compute pipeline. This would of course require a lot more work.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
I was more wondering whether the developers' "crude" code for splitting the work between multiple similar graphics cards (normal SLI/CF) would be the same for nVidia and AMD. So if they made an SFR code path, it would work with both AMD and nVidia GPUs right from the start.

Then an optimized code path could be added for specific vendors or for iGPU+dGPU setups, where different GPUs handle different parts of the rendering/compute pipeline. This would of course require a lot more work.

If it's in the game engine code, it should be able to target whatever DX12-compliant GPU is present. The reason AMD and nVidia cards don't work together is mostly political, not technical. In a kumbaya kind of world they could share code and have their drivers work together, or even ship a single unified driver, but that's not how the real world works, of course. Moving multi-GPU to the game developer puts them in a more neutral position, where they can do these things if they think it's worth the time and effort.

Crude AFR-style code would certainly be possible. At the point in the program where you start a frame, you could just round-robin each frame between the GPUs without any additional intelligence, just like CF/SLI except at the engine level instead of the driver level. And I'm sure we'll see a lot of implementations that work this way.
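
Something like this minimal sketch (my own illustration, assuming a linked-node DX12 device; no synchronization or present handling shown):

```cpp
// Engine-level round-robin AFR: each new frame is handed to the next GPU node,
// using one direct command queue per node selected via the NodeMask.
#include <cstdint>
#include <vector>
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

class RoundRobinAfr {
public:
    explicit RoundRobinAfr(ID3D12Device* device)
        : nodeCount_(device->GetNodeCount()) {
        // One queue per GPU node; frame N is recorded and executed on node N % nodeCount_.
        for (UINT node = 0; node < nodeCount_; ++node) {
            D3D12_COMMAND_QUEUE_DESC desc = {};
            desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
            desc.NodeMask = 1u << node;              // bit mask picks the physical GPU
            ComPtr<ID3D12CommandQueue> queue;
            device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
            queues_.push_back(queue);
        }
    }

    // Call once per frame: returns the queue (and node mask) this frame should use.
    ID3D12CommandQueue* BeginFrame(UINT* nodeMaskOut) {
        UINT node = static_cast<UINT>(frameIndex_++ % nodeCount_);
        *nodeMaskOut = 1u << node;
        return queues_[node].Get();
    }

private:
    UINT nodeCount_;
    uint64_t frameIndex_ = 0;
    std::vector<ComPtr<ID3D12CommandQueue>> queues_;
};
```

The real work is everything this leaves out: per-node allocators and render targets, and the fences that keep the nodes from trampling each other.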

Very exciting stuff. Most developers won't do anything fancy, just like today, but a few of the big-name ones will do really cool things.
 

dogen1

Senior member
Oct 14, 2014
739
40
91
I'm so glad I'm not in the mGPU camp anymore. Some devs can't even get v-sync right (really, v-sync on caps me to 30 FPS?).

Leaving it in their hands to create working profiles for x-product lines...naaaaah. Single card for me.

Good luck to you mGPU users.

SLI/CrossFire support is handled by AMD and nVidia, not the game developers.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
I'm so glad I'm not in the mGPU camp anymore. Some devs can't even get v-sync right (really, v-sync on caps me to 30 FPS?).

Leaving it in their hands to create working profiles for x-product lines...naaaaah. Single card for me.

Good luck to you mGPU users.

According to Stardock, you just have to make multi-GPU work; it's not brand-specific anymore. We'll see, I guess. But it's not supposed to work the way you describe.
 

stuff_me_good

Senior member
Nov 2, 2013
206
35
91
What happened to the magical SLI bridge soldered to the motherboard that we had news about many years ago? I only remember it making SFR multi-GPU rendering scale over 90% or something like that, by rendering frames like tiled resources.

Does anyone remember?
 

belmonkey

Junior Member
Sep 30, 2015
2
0
0
In the case of multi-GPU being a dGPU + iGPU, there's probably going to be a limit to how much the iGPU is allowed to do, right? If the iGPU just handles something like post-processing, would something much stronger than a typical Intel HD iGPU make much of a difference? I'm really wondering if one of those cheap Kaveri APUs like the A8-7600, with 384 shaders and 8 ACEs, is going to do much in such a scenario.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
In the case of multi-GPU being a dGPU + iGPU, there's probably going to be a limit to how much the iGPU is allowed to do, right? If the iGPU just handles something like post-processing, would something much stronger than a typical Intel HD iGPU make much of a difference? I'm really wondering if one of those cheap Kaveri APUs like the A8-7600, with 384 shaders and 8 ACEs, is going to do much in such a scenario.

Note the frame lag as well.

[Image: microsoft-dx12-build15-ue4frame.png]
 

belmonkey

Junior Member
Sep 30, 2015
2
0
0
Note the frame lag as well.

[Image: microsoft-dx12-build15-ue4frame.png]

Which part am I supposed to be looking at? The way it's labeled, it looks like the iGPU is a frame behind, or also like the iGPU has downtime after its frame while the main GPU is still working on its frame. I don't know much about this stuff.
 

tential

Diamond Member
May 13, 2008
7,348
642
121
What happened to the magical SLI bridge soldered to the motherboard that we had news about many years ago? I only remember it making SFR multi-GPU rendering scale over 90% or something like that, by rendering frames like tiled resources.

Does anyone remember?
Lucid Hydra, or whatever; it's already been mentioned in the thread. As with every single mGPU dream, it died.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Which part am I supposed to be looking at? The way it's labeled, it looks like the iGPU is a frame behind, or also like the iGPU has downtime after its frame while the main GPU is still working on its frame. I don't know much about this stuff.

There is a slight benefit in raw FPS numbers in this case (~10%). But you also get input lag that is almost a frame behind (+20ms or so). That may make it unattractive for many FPS games.
 

Noctifer616

Senior member
Nov 5, 2013
380
0
76
There is a slight benefit in raw FPS numbers in this case (~10%). But you also get input lag that is almost a frame behind (+20ms or so). That may make it unattractive for many FPS games.

If it's a single frame then the latency depends on the FPS. At 60 FPS it's 16.6 ms; at higher FPS it would be lower.
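
To spell the arithmetic out (my own quick illustration):

```cpp
// One extra frame in flight adds roughly one frame time of latency,
// so the penalty shrinks as the frame rate rises.
#include <cstdio>

int main() {
    const double fps_values[] = {30.0, 60.0, 120.0};
    for (double fps : fps_values) {
        double added_ms = 1000.0 / fps;  // milliseconds per frame
        std::printf("%6.0f fps -> ~%.1f ms of added latency\n", fps, added_ms);
    }
}
```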
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
If it's a single frame then the latency depends on the FPS. At 60 FPS it's 16.6 ms; at higher FPS it would be lower.

Don't confuse it with screen FPS. This is rendering latency: the time from the start of rendering until the image reaches the screen is the added latency.

Essentially no different than playing on a 10 ms input-lag screen vs. a 30 ms input-lag screen.
 

biostud

Lifer
Feb 27, 2003
19,744
6,826
136
From nvidia:
https://developer.nvidia.com/dx12-dos-and-donts

Multi GPU

Do's:
- Use the DX12 standard checks to find out how many GPUs are in your system
  - No need to use vendor-specific APIs anymore
  - Make sure to check the CROSS_NODE_SHARING tier
- Take full control over which surface syncs need to happen and which don't
- Make full use of the explicit control over resources
  - Create resources that need to be synchronized on each node
    - Use the proper CreationNodeMask
    - Make them visible on other nodes that need access
    - Copy them to the current node when needed
- Minimize the number of necessary syncs
- If the device supports tier 2 cross-node sharing
  - Check to see if RTVs, DSVs and UAVs work as fast as expected
  - Always compare performance to a tier 1 type implementation

Don'ts:
- Don't try to benefit from implicit MGPU scaling
- Don't rely on any surface syncs to be done automatically (implicitly behind your back)
  - You should take full control over what syncs happen if you need them
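
To make the CreationNodeMask/visibility advice concrete, here's a rough sketch (my own, not NVIDIA sample code) that assumes a two-node linked adapter, with error handling omitted:

```cpp
// Create a texture whose memory lives on GPU node 0 (CreationNodeMask = 0x1)
// but which GPU node 1 is also allowed to access (VisibleNodeMask = 0x3).
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

ComPtr<ID3D12Resource> CreateCrossNodeTexture(ID3D12Device* device,
                                              UINT width, UINT height) {
    D3D12_HEAP_PROPERTIES heapProps = {};
    heapProps.Type             = D3D12_HEAP_TYPE_DEFAULT;
    heapProps.CreationNodeMask = 0x1;  // allocate on node 0
    heapProps.VisibleNodeMask  = 0x3;  // nodes 0 and 1 may access it

    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension        = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
    desc.Width            = width;
    desc.Height           = height;
    desc.DepthOrArraySize = 1;
    desc.MipLevels        = 1;
    desc.Format           = DXGI_FORMAT_R8G8B8A8_UNORM;
    desc.SampleDesc.Count = 1;

    ComPtr<ID3D12Resource> texture;
    device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                    D3D12_RESOURCE_STATE_COPY_DEST, nullptr,
                                    IID_PPV_ARGS(&texture));
    return texture;
}
```

Whether reading across nodes like this is fast enough, or whether you should copy to a node-local resource instead, is exactly the tier 1 vs. tier 2 comparison the list above recommends measuring.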
 

thesmokingman

Platinum Member
May 6, 2010
2,302
231
106
^^Exactly. That poster comes across like a contrarian, arguing just to argue. AFR or SFR, it doesn't really matter. What matters is that developers now have another way to run SLI/CFX, and they can choose the method that best suits their needs.