Quantum Break: More like Quantum Broken.

Mahigan

Senior member
Aug 22, 2015
573
0
0
We know that Quantum Break uses a lot of Framebuffer. The developers had stated this prior to its release.

The Fiji cards' performance in this title is largely down to a driver update from AMD that added +35% performance. This was achieved by tweaking the game's memory usage, just as AMD did with Rise of the Tomb Raider.

The other heavy aspect of Quantum Break is its compute shaders, which are used for its post-processing and lighting effects.

The volumetric lighting is hard on hardware that doesn't support compute parallelism to the degree required by the compute shaders programmed into the game.

The combination of the heavy use of compute shaders and framebuffer explains most of the performance variances we're seeing in the benchmarks.

Quantum Break is pretty much the console effect in spades.

I'm not surprised at seeing Maxwell struggle compared to GCN under this title. It's pretty much what I had foreseen a while back (summer of 2015).

I think it is safe to say that Maxwell won't age well. Given the trend, Pascal may perform well once released but likely also won't age well.

For users who keep their vid cards for a year or so, this might not matter. For those who expect several years of use, best to go GCN.

I'm willing to wait on Total War: Warhammer but I doubt it will be any different.
 

mohit9206

Golden Member
Jul 2, 2013
1,381
511
136
Does Quantum Break run on cards without DX12 support? Or does the game not launch at all on those cards? You know, those old but still capable cards like the HD 6950 and GTX 560 Ti?
 

tential

Diamond Member
May 13, 2008
7,348
642
121
Does Quantum Break run on cards without DX12 support? Or does the game not launch at all on those cards? You know, those old but still capable cards like the HD 6950 and GTX 560 Ti?
What are those cards capable of exactly at this point? 720p low-res gaming?

Probably need to upgrade to Pascal/Polaris for the DX12 era.
 

mohit9206

Golden Member
Jul 2, 2013
1,381
511
136
What are those cards capable of exactly at this point? 720p low-res gaming?

Probably need to upgrade to Pascal/Polaris for the DX12 era.
Well, I have a GT 730 GDDR5, which supports DX12, but I'm on Windows 7. So people are screwed both ways: those without DX12 cards and also those without Win 10.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
Well, I have a GT 730 GDDR5, which supports DX12, but I'm on Windows 7. So people are screwed both ways: those without DX12 cards and also those without Win 10.

Upgrade to 10, it's free.

That said, your 730 isn't going to run well at all.
 

tential

Diamond Member
May 13, 2008
7,348
642
121
Well, I have a GT 730 GDDR5, which supports DX12, but I'm on Windows 7. So people are screwed both ways: those without DX12 cards and also those without Win 10.
There isn't a nice way to tell you how little I care about your plight, considering Windows 10 is free.

I don't remotely care about Windows 7 users, sorry.
 

SPBHM

Diamond Member
Sep 12, 2012
5,066
418
126
This game will most likely run at under 20 FPS on a 730 even with all settings lowered, so it's not a loss.

If you want to play this game, your options are basically: buy an Xbox One, buy an Xbox One, or run it badly on a high-end PC. Even on a Fury X the port runs with broken frame pacing and looks the same 95% of the time; the only advantage is mouse and keyboard support, I suppose.
 

mohit9206

Golden Member
Jul 2, 2013
1,381
511
136
There isn't a nice way to tell you how little I care about your plight, considering Windows 10 is free.

I don't remotely care about Windows 7 users, sorry.
But Windows 10 sucks, right? It's always spying, forces automatic updates, etc.
But the main reason I won't buy Quantum Break, even if I had a PC that could run it, is that I don't want to support MS and their Windows Store.
 

tential

Diamond Member
May 13, 2008
7,348
642
121
But Windows 10 sucks, right? It's always spying, forces automatic updates, etc.
But the main reason I won't buy Quantum Break, even if I had a PC that could run it, is that I don't want to support MS and their Windows Store.
I'm assuming you're joking about W10.

With QB... If you release games with massive issues at launch, it's clear to me you don't care about me remotely. You'd screw me at the first opportunity you could, so should I not do the same to you? If you don't care about me remotely, I don't care about you. And if I don't care about you.....

We'll leave it at that.
 

poofyhairguy

Lifer
Nov 20, 2005
14,612
318
126
AMD GPUs like Hawaii, Tahiti are impressive. But unfortunately all of this has not helped AMD financially.

The 390 is pretty much sold out right now on Newegg.

It is amazing seeing a 2015 $300 card (290X) beating a $650 card (980 Ti) at 1080p. DirectX 12 is the Maxwell killer.
 

MBrown

Diamond Member
Jul 5, 2001
5,726
35
91
This game is broken at 1440p on ultra settings for me, much worse than the benchmarks. I'm seriously getting less than one FPS. I don't know what is going on. I downloaded the latest Nvidia drivers. The game runs perfectly fine at 1080p.
 

Raising

Member
Mar 12, 2016
120
0
16
The 390 is pretty much sold out right now on Newegg.

It is amazing seeing a 2015 $300 card (290X) beating a $650 card (980 Ti) at 1080p. DirectX 12 is the Maxwell killer.

The shill is strong in this one..

insulting other members is not allowed
Markfw900
 
Feb 19, 2009
10,457
10
76
This game is broken at 1440p on ultra settings for me, much worse than the benchmarks. I'm seriously getting less than one FPS. I don't know what is going on. I downloaded the latest Nvidia drivers. The game runs perfectly fine at 1080p.

From what other users have posted, if you dial it down from Ultra it should run much better. The visuals still look quite good on Medium.
 

MBrown

Diamond Member
Jul 5, 2001
5,726
35
91
From what other users have posted, if you dial it down from Ultra it should run much better. The visuals still look quite good on Medium.

Tried this. It starts off almost playable and then exponentially slows down to a crawl in about 15 seconds.
 

poofyhairguy

Lifer
Nov 20, 2005
14,612
318
126
The shill is strong in this one..

Shill?! That was an obvious observation from the graph:

qb_ultra.png


Unless that graph is wrong, we are seeing the 290X jump an entire class of card higher in THE DirectX 12 game that many were looking to as a barometer for future DirectX 12 games.

I am sure Nvidia has something slick up their sleeve to sell this year, and Maxwell sure did well in the DirectX 11 era, but current Nvidia cards are falling behind in a way we haven't seen since the DirectX 9 shift. And the 390s are pretty much sold out on Newegg:

http://www.newegg.com/Product/Product.aspx?Item=N82E16814127874

http://www.newegg.com/Product/Product.aspx?Item=N82E16814125805

http://www.newegg.com/Product/Product.aspx?Item=N82E16814202164

http://www.newegg.com/Product/Product.aspx?Item=N82E16814121974

http://www.newegg.com/Product/Product.aspx?Item=N82E16814150728

Shill not found.
 
Feb 19, 2009
10,457
10
76
The lack of 390s might just signify preparation for Polaris.

Miners more than anything; they're buying it all up while $$ can be made with Radeons.

But yes, with next-gen coming, AMD and NV will have ceased production of these GPUs.

If the mining craze is still strong in a few months, expect a major supply shortage for Polaris.
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
There is no Asynchronous Compute in Quantum Break, as I had stated...

d4c109018ed4789baea86399ce471dd0.jpg


The game appears to hammer GM20x's L2 cache leading to stuttering. There is no stuttering on AMD GPUs. Source: http://www.tweaktown.com/guides/7655/quantum-break-pc-performance-analysis/index2.html

Judging by the massive amount of work in the render queue, the L2 cache is being hammered from two sides: by the SMs, with concurrent warps spilling into L2, and by the ROPs. Any L2 cache reserved for compute work takes away from the bandwidth and memory available to the ROPs. This forces the ROPs into hitting the memory controllers, which are themselves not too efficient.

This is what I think is happening..

GM20x is limited to 16 concurrent warps per SM before overflowing into the L2 cache and causing a pretty drastic performance drop.

That means performance falls once you go past 16 concurrent warps, or 512 threads, per SM. So while GM20x has great compute performance on paper, this doesn't translate well once you push the architecture:
f6fcd35e224b79fa7438d33b4aaa7879.jpg


GCN has enough L1 and local cache on tap, per CU, to push 40 concurrent wavefronts per CU. That's 2,560 threads executing at full speed.
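For anyone who wants to check the thread counts quoted above, here is a minimal sketch of the arithmetic. The warp and wavefront widths (32 and 64 threads) are standard hardware figures; the "16 warps before spilling" and "40 wavefronts" limits are taken from the post as stated, not from a vendor specification, and the function name is made up for illustration. Note that an SM and a CU are differently sized units, so the two totals are not a like-for-like comparison.

```python
# Thread-count arithmetic for the figures quoted in the post.
# The limits (16 warps, 40 wavefronts) are the poster's numbers.

NVIDIA_WARP_SIZE = 32    # threads per warp
GCN_WAVEFRONT_SIZE = 64  # threads per wavefront


def threads_in_flight(groups: int, group_width: int) -> int:
    """Return the number of threads covered by `groups` warps or wavefronts."""
    return groups * group_width


gm20x_threads_per_sm = threads_in_flight(16, NVIDIA_WARP_SIZE)
gcn_threads_per_cu = threads_in_flight(40, GCN_WAVEFRONT_SIZE)

print(gm20x_threads_per_sm)  # 512
print(gcn_threads_per_cu)    # 2560
```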

In Quantum Break, the Volumetric Lighting shader used really pushes GM20x hard. This is why we see GM20x struggle to match Hawaii, let alone Fiji, in this title.

On paper, GM20x has more ROPs, but the performance of those ROPs is directly tied to L2 cache availability and bandwidth as well as the available memory controller bandwidth:
64e149eeeface70d0025474b0f4a8c54.jpg

That's GM107, but with GM20x we have 16 ROPs, up from 8, sharing 512KB of L2 cache, down from 1MB, and tied to a 64-bit memory controller.
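To see how those per-controller figures scale to whole chips, here is a small sketch. The per-controller numbers are the ones quoted above; the 256-bit and 384-bit bus widths for GM204 and GM200 are not stated in this thread and are added here as an assumption from the public specs. The helper name is hypothetical.

```python
# Scale the per-controller GM20x figures from the post up to a full chip.
ROPS_PER_CONTROLLER = 16       # from the post
L2_KB_PER_CONTROLLER = 512     # from the post
CONTROLLER_WIDTH_BITS = 64     # from the post


def chip_totals(bus_width_bits: int) -> dict:
    """Derive controller count, ROP count and L2 size from the bus width."""
    controllers = bus_width_bits // CONTROLLER_WIDTH_BITS
    return {
        "controllers": controllers,
        "rops": controllers * ROPS_PER_CONTROLLER,
        "l2_kb": controllers * L2_KB_PER_CONTROLLER,
    }


# Bus widths below are assumptions, not figures from the thread.
gm204 = chip_totals(256)  # GTX 980 class
gm200 = chip_totals(384)  # GTX 980 Ti class

print(gm204)  # {'controllers': 4, 'rops': 64, 'l2_kb': 2048}
print(gm200)  # {'controllers': 6, 'rops': 96, 'l2_kb': 3072}
```

The point of the sketch is that ROP count and L2 capacity rise and fall together with the number of memory controllers, which is why pressure on one of the three shows up in the others.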

Therefore, if a game is compute heavy and spills into L2 cache, ROP throughput is also affected, which translates into more pressure on the memory controllers, which aren't very efficient to begin with:
f7d97a7a7ee2030baef4e949671b5cdb.jpg


Therefore, as predicted back in the summer of last year (though I received a lot of hate for it), Hawaii/Grenada will sometimes match a reference GTX 980 Ti in upcoming titles. Fiji will begin to often surpass the GM200 behemoth.

This is largely due to GCN's more highly redundant memory/cache hierarchy (and the power usage that comes with it).
 
Feb 19, 2009
10,457
10
76
@Mahigan

This is the GPUView of Nano for Fable, per WCCFTech.

ViewNano.png


There's some Async Compute usage, but notice the 2x Copy Queues (DMA) are empty.

Yet in the GPUView of QB, the DMA queue is heavily used.

d4c109018ed4789baea86399ce471dd0.jpg


Async Compute can be a compute task (accessing shaders or ROPs) or a copy/transfer task using the DMA engines. As this is running in parallel while graphics is rendering, it is operating in Async mode. :)

There is a LOT of it too.

I suspect they use it as part of their temporal reconstruction rendering system, whereby they take the data from 4 frames and rebuild the scene with it. That's going to take a lot of memory access, and I think they achieve it by using Async Compute Copy queues.

Unfortunately for NV's GPUs, they can't even do that while graphics is working, despite the uarch having at least 1 DMA engine.

For those who don't know the real power of Async Compute, here's a very short primer: https://youtu.be/H1L4iLIU9xU?t=15m29s

^ It is NOT just about shader utilization. It is actually about getting the ROPs and DMA engines to perform tasks in parallel, which is impossible under older APIs that don't operate in this multi-engine fashion. Shader utilization was not its primary goal when the spec for AC was designed in 2009.

http://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?print=1

"If you look at the portion of the GPU available to compute throughout the frame, it varies dramatically from instant to instant. For example, something like opaque shadow map rendering doesn't even use a pixel shader, it’s entirely done by vertex shaders and the rasterization hardware -- so graphics aren't using most of the 1.8 teraflops of ALU available in the CUs. Times like that during the game frame are an opportunity to say, 'Okay, all that compute you wanted to do, turn it up to 11 now.'"

It is a very common misunderstanding of the purpose of Async Compute.
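The idea in that quote can be sketched as a toy frame-time model: compute work either runs after graphics, or is slotted into whatever ALU capacity each graphics phase leaves idle. This is an idealized illustration only. The phase durations, utilization fractions and function names are all hypothetical, and the model ignores cache and bandwidth contention entirely.

```python
# Toy model of filling idle ALU time with compute work.
# Each phase is (duration_ms, fraction_of_ALU_used_by_graphics).
# All numbers are hypothetical.


def frame_time_serial(phases, compute_ms):
    """Graphics phases run first, then all compute work runs afterwards."""
    return sum(duration for duration, _ in phases) + compute_ms


def frame_time_overlapped(phases, compute_ms):
    """Compute work soaks up idle ALU time inside each graphics phase."""
    remaining = compute_ms
    total = 0.0
    for duration, alu_use in phases:
        total += duration
        idle_ms = duration * (1.0 - alu_use)
        remaining = max(0.0, remaining - idle_ms)
    # Whatever compute work did not fit runs after the last phase.
    return total + remaining


phases = [
    (2.0, 0.1),  # e.g. shadow maps: ALUs mostly idle
    (6.0, 0.9),  # main shading pass: ALUs mostly busy
    (2.0, 0.5),  # post-processing
]

serial_ms = frame_time_serial(phases, compute_ms=3.0)
overlapped_ms = frame_time_overlapped(phases, compute_ms=3.0)
print(serial_ms, overlapped_ms)  # 13.0 10.0
```

In this made-up frame the idle ALU time (1.8 + 0.6 + 1.0 ms) is enough to hide all 3 ms of compute, so the overlapped frame costs no more than graphics alone.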

From the video:

Copy Queues are a subset of Compute Queues.
Dyi6JAR.jpg


Shaders are what we often only think about in a GPU but there's other important sub-units that also do work.
CKQQrpa.jpg


Copy Queues~ using the DMAs
HMONRjS.jpg


All of this can be done without touching a single Shader/SP/CC.
OniJhso.jpg
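The "subset" relationship from the slides can be written down as a small capability table. This is only a sketch of the Direct3D 12 queue hierarchy (direct, compute, copy) in Python; the enum, dictionary and function names are hypothetical and do not correspond to any real API.

```python
from enum import Enum


class Work(Enum):
    GRAPHICS = "graphics"
    COMPUTE = "compute"
    COPY = "copy"


# Each queue type accepts a superset of the work the next one accepts.
QUEUE_CAPABILITIES = {
    "direct": {Work.GRAPHICS, Work.COMPUTE, Work.COPY},
    "compute": {Work.COMPUTE, Work.COPY},
    "copy": {Work.COPY},
}


def queues_accepting(work: Work) -> list:
    """List the queue types that can accept the given kind of work."""
    return [name for name, caps in QUEUE_CAPABILITIES.items() if work in caps]


print(queues_accepting(Work.COPY))      # ['direct', 'compute', 'copy']
print(queues_accepting(Work.GRAPHICS))  # ['direct']
```

Copy work can go to any queue, but only a dedicated copy queue lets it run on the DMA engines without occupying the graphics or compute front end, which is the point the slides are making.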
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
@Mahigan

This is the GPUView of Nano for Fable, per WCCFTech.

ViewNano.png


There's some Async Compute usage, but notice the 2x Copy Queues (DMA) are empty.

Yet in the GPUView of QB, the DMA queue is heavily used.

Async Compute can be a compute task (accessing shaders or ROPs) or a copy/transfer task using the DMA engines. As this is running in parallel while graphics is rendering, it is operating in Async mode. :)

There is a LOT of it too.

I suspect they use it as part of their temporal reconstruction rendering system, whereby they take the data from 4 frames and rebuild the scene with it. That's going to take a lot of memory access, and I think they achieve it by using Async Compute Copy queues.

Unfortunately for NV's GPUs, they can't even do that while graphics is working, despite the uarch having at least 1 DMA engine.
Afaik, GM20x does support parallel Copy and Graphics commands, though both are handled by the driver (static scheduler). For GM20x, everything is placed into a single queue (the 3D/Graphics queue) and then the static scheduler handles the hardware assignment of tasks.

The only feature missing from GM20x is Asynchronous compute + Graphics (parallel execution of Graphics and Compute commands).

All that being said, you've made an interesting observation, because hammering the Copy queue means hammering the memory controllers.

So QB hammers the SMs and the memory controllers. If you read my post above, you can pretty much see why that's a bad idea on GM20x.

If you hammer the SMs, you hammer the local caches, leading to a spill into L2. If you begin consuming L2 for compute tasks, you take away bandwidth and L2 resources from the ROPs. This forces the ROPs into hammering the memory controllers which, due to the heavy use of copy commands, are already hammered.

The end result is access latency, which results in stuttering. What do we see in QB? Stuttering.
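As a rough illustration of why stacking copy traffic on top of ROP traffic hurts, here is a toy bandwidth-sharing model. Every number in it is hypothetical, the function name is made up, and it only captures raw bandwidth, not the access latency described above, so treat it as a sketch of the argument rather than a measurement.

```python
# Toy model: several traffic sources share one pool of memory bandwidth.
# All figures are hypothetical and chosen to keep the arithmetic simple.


def memory_time_ms(traffic_gb: dict, bandwidth_gb_per_s: float) -> float:
    """Time per frame spent moving data if the frame is purely bandwidth bound."""
    total_gb = sum(traffic_gb.values())
    return total_gb * 1000.0 / bandwidth_gb_per_s


BANDWIDTH_GB_PER_S = 300.0  # hypothetical memory bandwidth

rop_only = {"rop": 1.5, "copy": 0.0}       # ROP traffic alone
rop_plus_copy = {"rop": 1.5, "copy": 1.5}  # copy queue traffic added on top

print(memory_time_ms(rop_only, BANDWIDTH_GB_PER_S))       # 5.0
print(memory_time_ms(rop_plus_copy, BANDWIDTH_GB_PER_S))  # 10.0
```

If L2 spills push even more ROP traffic out to memory, the "rop" entry grows as well, and the two effects compound on the same controllers.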