Microsoft Refines DirectX 12 Multi-GPU with Simple Abstraction Layer


Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
As long as they use AFR, latency is double. It's not an API thing, but a simple fact that every frame you see took 2 frames of time to render (every other frame is done by a different GPU, and each took 2 frames' worth of time to render).

If they use SFR, scaling will be much worse.

Neither results in a great mid-range CF/SLI experience. Multi-GPU is really only good when you want to go beyond what a single GPU can do at the high end.

Please read up on how it works in DX12. The Ashes of the Singularity devs have written about how they implemented it and removed latency / input delay.
 

RampantAndroid

Diamond Member
Jun 27, 2004
6,591
3
81
I think they just mean same-card scaling, similar to current SLI (CrossFire has been able to use different GPUs of the same arch for a while now).

Cross vendor / advanced functionality will take more work.

Why? If they're not relying on proprietary interconnects between the cards like an SLI bridge... Windows already supports multiple video cards running at the same time (I think that limitation was removed in Win8?)
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
Why? If they're not relying on proprietary interconnects between the cards like an SLI bridge... Windows already supports multiple video cards running at the same time (I think that limitation was removed in Win8?)

Explicit mGPU comes in two flavors: homogeneous, and heterogeneous.

Homogeneous mGPU refers to a hardware configuration in which you have multiple GPUs that are identical (and linked). Currently, this is what most people think of when ‘MultiGPU’ is mentioned. Right now, this is effectively direct DX12 control over Crossfire/SLI systems. This type of mGPU is also the main focus of this post.

Heterogeneous mGPU differs in that the GPUs in the system are different in some way, whether it be vendor, capabilities, etc. This is a more novel but exciting concept that game developers are still learning about. It opens the door to many more opportunities to use all of the silicon in your system. For more information on heterogeneous mGPU, you can read our blog posts here and here.

In both cases, MultiGPU in DX12 exposes the ability for a game developer to use 100% of the GPU silicon in the system as opposed to a more closed-box and bug-prone implicit implementation.

https://blogs.msdn.microsoft.com/di...rectx-12-multigpu-and-a-peek-into-the-future/
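To make the "explicit control" point concrete, here is a minimal C++ sketch of how a D3D12 engine addresses a linked (homogeneous) multi-GPU adapter through node masks. The node-mask mechanism is standard D3D12; the structure, adapter index, and values here are illustrative assumptions, not taken from the blog post.

```cpp
// Minimal sketch (assumed flow, not from the linked blog post): how a D3D12
// engine sees a linked "homogeneous" multi-GPU adapter and pins work to a
// specific GPU via node masks. Error handling omitted.
#include <cstdio>
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

int main()
{
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    factory->EnumAdapters1(0, &adapter);          // assume adapter 0 is the linked CF/SLI group

    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    // Each physical GPU behind a linked adapter is exposed as a "node".
    std::printf("GPU nodes on this adapter: %u\n", device->GetNodeCount());

    // Queues, command lists, and resources carry a node mask, so the engine
    // explicitly chooses which GPU executes which work (AFR, SFR, or something
    // custom), instead of the driver guessing behind its back.
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type     = D3D12_COMMAND_LIST_TYPE_DIRECT;
    desc.NodeMask = 0x1;                          // bit 0 = GPU 0, bit 1 = GPU 1, ...
    ComPtr<ID3D12CommandQueue> queueGpu0;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queueGpu0));
    return 0;
}
```

Heterogeneous mGPU drops the single-linked-adapter assumption: you would enumerate more than one adapter, create a device per adapter, and copy shared resources between them yourself.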
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
What?

You need to re-think that.

Even in DX11, there are some titles where CF/SLI has zero issues with latency. Higher frame rate and smoother.

Let's imagine a 60 FPS scenario, 16ms per frame.

1 GPU = 60 FPS = 16ms per frame.
2 GPU with perfect scaling (95% is possible) = 120 FPS = 8ms per frame.

The problem is when it's done poorly: GPU #1 and #2 are not well in sync, leading to big frame time variance.

All DX12/Vulkan mGPU does is give developers more control. If they are capable, the result should be better. If they are not, well, no mGPU support at all. :/

This is where you fail to understand how AFR works.

With AFR, the frames you see are displayed by alternating GPUs. If you get 120 FPS, your frame times are 8.33ms, the same as with a single GPU getting 120 FPS, BUT there is one major difference: each GPU is only creating 60 FPS, and each individual frame it creates takes 16.67ms.

Let me see if I can create a visual for you.

[GPU 1 frame][GPU 1 frame][GPU 1 frame]
..........[GPU 2 frame][GPU 2 frame][GPU 2 frame]

While the displayed frames are 8.33ms apart, the rendering time of every frame is 16.67ms.

With a single GPU at 120 FPS, the displayed and rendering times are both 8.33ms.
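If it helps, here is the same point as a throwaway calculation. The 120/60 FPS figures are the example numbers from the post above, not measurements:

```cpp
// Toy arithmetic for two-GPU AFR: the displayed frame interval halves,
// but the time any single frame spent rendering does not.
#include <cstdio>

int main()
{
    const double displayedFps = 120.0;                    // what the FPS counter shows
    const int    gpuCount     = 2;                        // AFR across two GPUs
    const double displayedMs  = 1000.0 / displayedFps;    // ~8.33 ms between frames on screen
    const double perGpuFps    = displayedFps / gpuCount;  // each GPU finishes only 60 frames/s
    const double renderMs     = 1000.0 / perGpuFps;       // ~16.67 ms to render any one frame

    std::printf("displayed interval:            %.2f ms\n", displayedMs);
    std::printf("render time per frame (AFR):   %.2f ms\n", renderMs);
    std::printf("render time per frame (1 GPU): %.2f ms\n", displayedMs);
    return 0;
}
```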
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
Please read up on how it works in DX12. The Ashes of the Singularity devs have written about how they implemented it and removed latency / input delay.

No matter how you slice it, every frame created in AFR takes twice as long to create as a frame from a single card at the same FPS. At 120 FPS, your frame times are 8.33ms, but each frame rendered took 16.67ms to create, and the frames are displayed at staggered intervals, alternating between the two GPUs.

They are talking about reducing other forms of latency created with SLI/CF, but you cannot get rid of that inherent limitation.
 
Feb 19, 2009
10,457
10
76
This is where you fail to understand how AFR works.

With AFR, the frames you see are displayed by alternating GPUs. If you get 120 FPS, your frame times are 8.33ms, the same as with a single GPU getting 120 FPS, BUT there is one major difference: each GPU is only creating 60 FPS, and each individual frame it creates takes 16.67ms.

Let me see if I can create a visual for you.

[GPU 1 frame][GPU 1 frame][GPU 1 frame]
..........[GPU 2 frame][GPU 2 frame][GPU 2 frame]

While the displayed frames are 8.33ms apart, the rendering time of every frame is 16.67ms.

With a single GPU at 120 FPS, the displayed and rendering times are both 8.33ms.

Wait, are you talking about frame time lag or input lag?

Because 120 fps at an 8.33ms interval will still appear as fluid as 120 FPS, unless GPU #1 and #2 fail to sync up their frames and miss an interval.
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
Wait, are you talking about frame time lag or input lag?

Because 120 fps at an 8.33ms interval will still appear as fluid as 120 FPS, unless GPU #1 and #2 fail to sync up their frames and miss an interval.

I said latency in general. Your latency from input to display is increased due to the increased time to render each frame. Being fluid is great for viewing, but latency has a big effect on gameplay too.

Edit: to be more clear, each frame displayed in AFR had double the rendering latency, which also increases total latency. And while Mantle implementations greatly improved frame time variance, they still aren't as good as a single GPU.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Edit: to be more clear, each frame displayed in AFR had double the rendering latency, which also increases total latency. And while Mantle implementations greatly improved frame time variance, they still aren't as good as a single GPU.

That's not necessarily the right way to look at it. If you are using AFR to increase the framerate, the latency would not increase. However, you will not see the expected reduction in latency as the framerate increases, compared with an (apparently faster) single GPU.
That having been said, I am not the biggest supporter of AFR, to say the least. However, I typically choose settings where the minimum framerate is above 60fps so that I can vsync. The input latency would then be comparable to running at 30fps (which means 33ms from input to buffer swap), which I consider acceptable.
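A back-of-the-envelope version of that vsync scenario. The 60Hz target and the two-GPU AFR setup come from the post; the pipeline is simplified to the assumption that input is sampled when a GPU starts rendering its frame:

```cpp
// Rough estimate of input-to-swap latency for two-GPU AFR under 60Hz vsync,
// assuming input is sampled when a GPU begins rendering its frame.
#include <cstdio>

int main()
{
    const double refreshHz = 60.0;                  // vsync target
    const int    gpuCount  = 2;                     // AFR pair
    const double swapMs    = 1000.0 / refreshHz;    // ~16.7 ms between buffer swaps
    const double inputToSwapMs = swapMs * gpuCount; // each GPU spans two swap intervals per frame

    std::printf("input -> buffer swap: ~%.0f ms (comparable to a single GPU at %.0f fps)\n",
                inputToSwapMs, refreshHz / gpuCount);
    return 0;
}
```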
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
It's a cost/benefit issue. What is better, e.g. 4 smaller dies on an interposer vs. one big die? There's the cost of development and production, but also performance, and performance with latency issues.

It's the consoles driving this async stuff, and it seems to me they need to get on board for this to become more mainstream tech. But how difficult/costly is it to implement a proper low-latency multi-GPU solution in an engine, if a similar total die size is used and made on an interposer?
 

Flapdrol1337

Golden Member
May 21, 2014
1,677
93
91
It's a cost/benefit issue. What is better, e.g. 4 smaller dies on an interposer vs. one big die? There's the cost of development and production, but also performance, and performance with latency issues.

It's the consoles driving this async stuff, and it seems to me they need to get on board for this to become more mainstream tech. But how difficult/costly is it to implement a proper low-latency multi-GPU solution in an engine, if a similar total die size is used and made on an interposer?
I thought the main advantage was that you can combine parts from different manufacturing processes. Like the Intel chips with 128MB of on-package memory: that cache is made on a process that yields slower but more power-efficient transistors.