(Discussion) Futuremark 3DMark Time Spy DirectX 12 Benchmark


ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Hopefully those games come out plentiful and fast, because I get the feeling this benchmark is going to show up in GPU reviews soon enough.

It's going to create a lot of forum fights when 480 AIBs lose to GTX 1060 AIBs in an async compute benchmark.

No worries, there are at least 6 DX12 games due by the end of this year ...

I don't expect AMD to win in Gears of War 4 since it's built on Unreal Engine 4 ...
 

railven

Diamond Member
Mar 25, 2010
6,604
561
126
The next wave of big DX12 games is due in a few months; there will not be a major shift in time for the 1060's launch review, so it will mostly be DX11 tested.

No worries, there are at least 6 DX12 games due by the end of this year ...

I don't expect AMD to win in Gears of War 4 since it's built on Unreal Engine 4 ...

6 games compared to how many DX11 titles still in the pipeline?

Futuremark is right on the money with their prediction.
 
Feb 19, 2009
10,457
10
76
6 games compared to how many DX11 titles still in the pipeline?

Futuremark is right on the money with their prediction.

That's 6 new DX12 titles on top of the current ones. Not 6 in total.

As benchmark suites ditch old games and replace them with new ones, it only makes GCN look better; I'm sure you've noticed that at least.

Which games are going to decide it for most gamers?

Battlefield 1, Deus Ex MD, Watch Dogs 2, Halo Wars 2, Gears of War 4, Forza Horizon...

Or:

Crysis 3, ACU, Project Cars, Metro (yeah, lots of sites still use this ancient game lol) etc.

The benchmark landscape will look very different 3 months from now.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
6 games compared to how many DX11 titles still in the pipeline?

Futuremark is right on the money with their prediction.

If we only count AAA games, there are only about 6 or so games left for the rest of this year that are exclusively DX11.

That's not half bad, all things considered ...

It would be great if AMD got DX12 featured in the upcoming Call of Duty too, just to make the transition smoother ...
 
Last edited:
Feb 19, 2009
10,457
10
76

This is spot on; DX12/Vulkan needs architecture-specific paths to be fully optimized.

Using a single rendering path and hoping it runs best on all hardware doesn't work for these next-gen APIs.

As an example, Pascal gets a light async compute path so it can use it well with preemption. GCN gets a truly parallel multi-engine path so it can flex its power.
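
For reference, "multi-engine" at the API level just means creating more than one command queue. A minimal sketch in D3D12 (not Time Spy's actual code; `device` is assumed to be an already-created ID3D12Device*):

// Minimal D3D12 multi-engine sketch, not Time Spy's actual code.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& directQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    // DIRECT queue: accepts graphics, compute and copy command lists.
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    // COMPUTE queue: compute/copy only. Work submitted here *may* overlap
    // the DIRECT queue, but the API does not promise parallel execution;
    // that part is up to the hardware and driver.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
}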

-----------------------------------

This was a good move from NVIDIA. Get into Time Spy early on and make sure it looks good on Pascal.

https://www.youtube.com/watch?v=kOsxV4-oRNA

^ From December. NV logo on Time Spy.
 

AnandThenMan

Diamond Member
Nov 11, 2004
3,991
627
126
Here's a direct link to the PDF (it may not work unless you first click the link Bacon posted). It's very easy to see that one size fits all is basically impossible under DX12; you are going to have to compromise toward the hardware with lesser capabilities.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
6 games compared to how many DX11 titles still in the pipeline?

Futuremark is right on the money with their prediction.

Let's see, for released games we have:

Forza Apex (DX12 only)
Quantum Break (DX12 only)
Gears of War UE (DX12 only)
Hitman
Rise of the Tomb Raider
Total War: Warhammer
Ashes of the Singularity
Doom (Vulkan)

Upcoming AAA we have:

Gears of War 4 (DX12 only again?)
Forza Horizon (DX12 only again?)
Battlefield 1
Deus Ex Mankind Divided
Halo Wars 2
Watch Dogs 2

Plus anything else from DICE / EA, Square Enix and others who have released either Mantle, Vulkan or DX12 games by now, since they've done the heavy lifting with the engine work.
 

FM_Jarnis

Member
Jul 16, 2016
28
1
0
And that's the most damning thing to me actually, pretty much in exact contrast to what FM_Jarnis was saying. :\

Making games and making fair, unbiased benchmarks are two different things.

We have actually discussed this very subject with the graphics vendors, and they are against doing it in 3DMark. Such optimizations almost inevitably require altering the actual work being performed, and then it would no longer be a common reference point.

With a game it doesn't really matter if you optimize by subtly altering what is being rendered according to the strengths of each architecture, if doing so gives substantial gains in framerate, but in an unbiased benchmark that isn't a good idea.

(and before you drag async back into this: every single card performs the exact same work in Time Spy. Different architectures and drivers choose how exactly they arrange the work and manage the resources of the hardware, but the command queues going to the driver and the final rendered output are identical)
 

Zstream

Diamond Member
Oct 24, 2005
3,395
277
136
Making games and making fair, unbiased benchmarks are two different things.

We have actually discussed this very subject with the graphics vendors, and they are against doing it in 3DMark. Such optimizations almost inevitably require altering the actual work being performed, and then it would no longer be a common reference point.

With a game it doesn't really matter if you optimize by subtly altering what is being rendered according to the strengths of each architecture, if doing so gives substantial gains in framerate, but in an unbiased benchmark that isn't a good idea.

(and before you drag async back into this: every single card performs the exact same work in Time Spy. Different architectures and drivers choose how exactly they arrange the work and manage the resources of the hardware, but the command queues going to the driver and the final rendered output are identical)



How about a car analogy:

You test two cars on a quarter-mile track. That's unbiased, but it also does not show that one of the cars is terrible at going fast around corners, or bottoms out at a quarter mile.

Yes, it's unbiased, but it also fails to reflect the real world and what you will actually do on the road.

That's the best way I can put it. It's unfortunate that you won't test real-world usage, like driving around corners or going longer than a quarter mile.

We want you to do a FULL test and show the realities of each architecture, for good or bad.
 

selni

Senior member
Oct 24, 2013
249
0
41
And that's the most damning thing to me actually, pretty much in exact contrast to what FM_Jarnis was saying. :\

I mean, that's true, but it's also pretty damning of every DX12 implementation so far, isn't it? Who's doing distinct render paths for each architecture?
 

FM_Jarnis

Member
Jul 16, 2016
28
1
0
How about a car analogy:

You test two cars on a quarter-mile track. That's unbiased, but it also does not show that one of the cars is terrible at going fast around corners, or bottoms out at a quarter mile.

Yes, it's unbiased, but it also fails to reflect the real world and what you will actually do on the road.

That's the best way I can put it. It's unfortunate that you won't test real-world usage, like driving around corners or going longer than a quarter mile.

We want you to do a FULL test and show the realities of each architecture, for good or bad.

If you really want to go there (car analogies... uuuuh), what you are suggesting is that we should have two tracks.

One with lots of curves for the car that is damn good at curves, and another with just a few long straights for the thing that can't turn to save its life.

Both are showing their "best sides", but the track is not the same, so how is this a fair comparison?

(this is a pretty silly discussion...)

Both cars on the same track, which has all kinds of curves and straights, with both going through the exact same thing, no? I.e. like 3DMark does it. I mean, sure, you can argue all day about how exactly the track should be laid out, and which bit is better for one car and which bit for the other, but Futuremark has been doing this for almost 20 years and we have AMD, NVIDIA and Intel participating in the "track design", so how could we make it more fair?
 

DeathReborn

Platinum Member
Oct 11, 2005
2,786
789
136
If you really want to go there (car analogies... uuuuh), what you are suggesting is that we should have two tracks.

One with lots of curves for the car that is damn good at curves, and another with just a few long straights for the thing that can't turn to save its life.

Both are showing their "best sides", but the track is not the same, so how is this a fair comparison?

(this is a pretty silly discussion...)

Both cars on the same track, which has all kinds of curves and straights, with both going through the exact same thing, no? I.e. like 3DMark does it. I mean, sure, you can argue all day about how exactly the track should be laid out, and which bit is better for one car and which bit for the other, but Futuremark has been doing this for almost 20 years and we have AMD, NVIDIA and Intel participating in the "track design", so how could we make it more fair?

That's how I read it as well: fair workloads to compare different makes. I am fairly sure ALL parties were pushing their own brand of what DX12 is, but FM has to make it fair to all participants.

In "car speak": in Formula 1 you have Ferrari, Red Bull & Mercedes, who have large budgets (many others don't) and work within the same rules (the DX12 equivalent), yet all race on the same track; some work better on the straights, some in the fast corners, some in the slow corners. Some even like the rain (okay, not really, but you get the picture), but they don't get different routes to suit different cars.

I'm sure FM could make a "balls to the wall" benchmark that you can't really use to compare, but you just know people would anyway, and then the point of the benchmark is completely lost.

Thanks to FM_Jarnis for trying to explain.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
If you really want to go there (car analogies... uuuuh), what you are suggesting is that we should have two tracks.

One with lots of curves for the car that is damn good at curves, and another with just a few long straights for the thing that can't turn to save its life.

Both are showing their "best sides", but the track is not the same, so how is this a fair comparison?

(this is a pretty silly discussion...)

Both cars on the same track, which has all kinds of curves and straights, with both going through the exact same thing, no? I.e. like 3DMark does it. I mean, sure, you can argue all day about how exactly the track should be laid out, and which bit is better for one car and which bit for the other, but Futuremark has been doing this for almost 20 years and we have AMD, NVIDIA and Intel participating in the "track design", so how could we make it more fair?

So are you saying that there are parts of the test that show compute+graphics+copy all running at the same time with async compute?
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Making games and making fair, unbiased benchmarks are two different things.

We have actually discussed this very subject with the graphics vendors, and they are against doing it in 3DMark. Such optimizations almost inevitably require altering the actual work being performed, and then it would no longer be a common reference point.

With a game it doesn't really matter if you optimize by subtly altering what is being rendered according to the strengths of each architecture, if doing so gives substantial gains in framerate, but in an unbiased benchmark that isn't a good idea.

(and before you drag async back into this: every single card performs the exact same work in Time Spy. Different architectures and drivers choose how exactly they arrange the work and manage the resources of the hardware, but the command queues going to the driver and the final rendered output are identical)

OK... that does make more sense. I'm trying to give you the benefit of the doubt here. It does make sense that, for a benchmark, you would give each graphics card architecture the same workload to render.

Let's see, for released games we have:

Forza Apex (DX12 only)
Quantum Break (DX12 only)
Gears of War UE (DX12 only)
Hitman
Rise of the Tomb Raider
Total War: Warhammer
Ashes of the Singularity
Doom (Vulkan)

Upcoming AAA we have:

Gears of War 4 (DX12 only again?)
Forza Horizon (DX12 only again?)
Battlefield 1
Deus Ex Mankind Divided
Halo Wars 2
Watch Dogs 2

Plus anything else from DICE / EA, Square Enix and others who have released either Mantle, Vulkan or DX12 games by now, since they've done the heavy lifting with the engine work.

Don't forget Dota 2 and The Talos Principle, which have Vulkan renderers now.
 

FM_Jarnis

Member
Jul 16, 2016
28
1
0
So are you saying that there are parts of the test that show compute+graphics+copy all running at the same time with async compute?

Compute + graphics running simultaneously. That is what async compute is: the Compute queue and the Graphics (aka Direct) queue running at the same time. This happens throughout the Demo and Graphics Tests 1 & 2.

Copy can also run simultaneously, but Time Spy does not use Copy, to ensure it is an isolated graphics card benchmark (in the Graphics tests); all content is instead loaded into VRAM before the test starts. So as long as you meet the VRAM requirements, there is no traffic to main RAM (if you don't, shared RAM is used, with the usual performance penalty; the normal story with iGPUs etc., where RAM performance matters).
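
In code terms the overlap looks something like this (an illustrative sketch, not our actual source; assume the queues and recorded command lists already exist):

// Sketch only: graphics and compute submissions with no fence between them.
// Assumes directQueue/computeQueue (ID3D12CommandQueue*) and recorded
// command lists gfxList/compList (ID3D12CommandList*) already exist.
ID3D12CommandList* gfx[]  = { gfxList };
ID3D12CommandList* comp[] = { compList };

directQueue->ExecuteCommandLists(1, gfx);    // graphics (Direct) queue
computeQueue->ExecuteCommandLists(1, comp);  // compute queue, alongside

// With no fence forcing an order, the driver may run both at once (GCN's
// parallel engines) or interleave them (preemption). Either way the work
// submitted, and the image rendered, is the same on every card.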
 

Det0x

Golden Member
Sep 11, 2014
1,465
4,999
136
Compute + graphics running simultaneously. That is what async compute is: the Compute queue and the Graphics (aka Direct) queue running at the same time. This happens throughout the Demo and Graphics Tests 1 & 2.

Copy can also run simultaneously, but Time Spy does not use Copy, to ensure it is an isolated graphics card benchmark (in the Graphics tests); all content is instead loaded into VRAM before the test starts. So as long as you meet the VRAM requirements, there is no traffic to main RAM (if you don't, shared RAM is used, with the usual performance penalty; the normal story with iGPUs etc., where RAM performance matters).

Doesn't seem to be the general consensus, looking at this screenshot? :hmm:

[screenshot: s51q4IX.jpg]


Just to quote some of the responses:

That would be a major dilemma. The last thing a benchmark program should do is create a separate optimized path for each GPU.

This is exactly the dilemma I expected from them. There's no way to have a single render path in a DX12 benchmark without optimizing it for the lowest common denominator and punishing the silicon that has extra features.

"Impartial" benchmarking has become an oxymoron with DX12. You have to optimize for each vendor or you're unfairly punishing one of them. It just about makes the whole concept of "benchmark" meaningless.

They had no problem doing this with tessellation. Now suddenly they've got morals?

I'd say there's a difference between doing the same workload (serially vs in parallel) and actively reducing the amount of work with tessellation (geometry), is there not? Or am I not understanding this correctly?

I get you, but DX12 is not a one-size-fits-all API. Arguably DX11 was, but AMD suffered with high tessellation and had driver optimizations to keep such punishment within architectural limits. Those driver optimizations were invalid within 3DMark, so AMD was left competing one-for-one with Nvidia.

OK 3DMark, that's fine if you want to look neutral, but now with DX12 AMD isn't allowed to shine with its parallel hardware; it must remain on a level playing field with an NV-optimized render path. It's not an indication of game performance, unless that game is specifically NV-optimized and has very few, if any, AMD async shader optimizations.

See the theme here? The last 3DMark was NV-optimized in its tessellation levels. The limitation was on the AMD side, and the fix was ignored / bypassed. This 3DMark is NV-optimized in its avoidance of Async Compute + Graphics, aka Async Shaders. The limitation is on the Nvidia side, and the fix is honored.

It's a valid benchmark as long as AMD knows its place.

With the given evidence, we can say that the Time Spy benchmark, intentionally or not, by design fits perfectly with the capabilities of Pascal; other Nvidia architectures are not capable of async compute at all, and most of the AMD architectures are in theory left with spare room to take on much heavier async compute loads.

It's like tessellation loads being designed to fit the inferior AMD capabilities back in the day. There is a clear pattern to Futuremark controversies, regardless of who's in the right or wrong: they always favor Nvidia.

From what I understand based on Doothe's post, Time Spy is basically only doing that new feature that Pascal has: it preempts some 3D work, quickly switches context to the compute work, then switches back to the 3D.

So it seems to me that Time Spy has a very minimal amount of async compute work compared to Doom and AotS, *and the manner in which it does its "async" is friendly to Pascal hardware. I don't think it's necessarily "optimized" for Nvidia, as GCN seems to have no issue with context switching either. It's just not being allowed to take full advantage of GCN hardware.

* = read: pre-emption to suit the newest NV hardware, instead of truly asynchronous shaders

Compute queues as a % of total run time:

Doom: 43.70%
AOTS: 90.45%
Time Spy: 21.38%

It does look that way compared to AOTS and DOOM. I don't have ROTR, Hitman, or any other DX12/Vulkan titles to test this theory against. In the two other games, GPUView shows two rectangles (compute queues) stacked on top of each other. Time Spy never needs to process more than one at a time.

  • Minimize the use of barriers and fences
  • We have seen redundant barriers and associated wait for idle operations as a major performance problem for DX11 to DX12 ports
  • The DX11 driver is doing a great job of reducing barriers – now under DX12 you need to do it
  • Any barrier or fence can limit parallelism
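
To make the barrier point concrete: the usual advice is to batch transitions into a single ResourceBarrier call rather than issuing them one at a time, since every call is a potential wait-for-idle. A hedged sketch with assumed resources texA/texB and an assumed command list cmdList:

// Illustrative only: batching barriers in D3D12.
// Assumes ID3D12GraphicsCommandList* cmdList and ID3D12Resource* texA/texB.
D3D12_RESOURCE_BARRIER barriers[2] = {};

barriers[0].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barriers[0].Transition.pResource   = texA;
barriers[0].Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
barriers[0].Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
barriers[0].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;

barriers[1].Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barriers[1].Transition.pResource   = texB;
barriers[1].Transition.StateBefore = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
barriers[1].Transition.StateAfter  = D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE;
barriers[1].Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;

// One call instead of two: fewer potential wait-for-idle points.
cmdList->ResourceBarrier(2, barriers);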


If we are misinterpreting the data, please feel free to correct us.

And as you said in the other thread:

The goal is to have a benchmark that gives an accurate indication of how DX12 games, on average, perform on various graphics cards. To help people make educated purchasing decisions and to serve as an unbiased, neutral "yardstick" for the gaming performance of various systems.

There are plenty of games out there that have various colored teams doing super special optimizations that favor one or the other architecture, which may or may not show "what the hardware is truly capable of". With 3DMark you know that it gives you the real deal *without* those bits that may influence hardware comparisons considerably. This also means it stays valid when a new generation of hardware arrives, while those super-special-optimized games may suddenly perform much worse on the latest hardware when their optimizations no longer fit the new architecture.

"Educated purchasing decisions"... Thanks, Time Spy; based on your guidance I have come to the conclusion (based on the benchmark scores) that the 970 will essentially be equal to the 300 series cards in DX12/Vulkan titles.

If it turns out that Maxwell cards end up getting slaughtered in the future when real DX12 titles drop, I'm sure 3DMark will accept accountability?
 
Last edited:


dogen1

Senior member
Oct 14, 2014
739
40
91
So are you saying that there are parts of the test that show compute+graphics+copy all running at the same time with async compute?

Yes. The guide explicitly says they schedule asynchronous compute shaders (doing a variety of things) to run in parallel with shadow map rendering.

I've already copy-pasted parts of the guide before, but you should read it yourself. They even have some nice diagrams of how their queues are set up and whatnot.
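
Roughly, a setup like the guide describes could look like this; a hedged sketch with assumed names (queues, lists, fence), not the guide's actual code:

// Sketch: compute overlapping shadow map rendering, then a fence join.
// Assumes directQueue/computeQueue (ID3D12CommandQueue*), recorded lists
// shadowList/asyncComputeList/mainPassList (ID3D12CommandList*), and an
// ID3D12Fence* fence with a UINT64 counter fenceValue already exist.
ID3D12CommandList* shadowWork[]  = { shadowList };
ID3D12CommandList* computeWork[] = { asyncComputeList };
ID3D12CommandList* mainWork[]    = { mainPassList };

directQueue->ExecuteCommandLists(1, shadowWork);    // shadow maps on DIRECT
computeQueue->ExecuteCommandLists(1, computeWork);  // compute alongside them

// The compute queue signals when its work is done...
computeQueue->Signal(fence, ++fenceValue);
// ...and the DIRECT queue waits for that signal on the GPU timeline
// before the main pass consumes the compute results.
directQueue->Wait(fence, fenceValue);
directQueue->ExecuteCommandLists(1, mainWork);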
 

FM_Jarnis

Member
Jul 16, 2016
28
1
0
Yes, the technical guide already has an updated section on this (which is a good chunk of what our post on this will be). The full post will go up on Futuremark.com within the next two hours (they are putting it into the publishing system now).

http://www.futuremark.com/downloads/3DMark_Technical_Guide.pdf

(Page 27 and onward)

(And if you want to object to something or other, I recommend waiting for the post on futuremark.com; it will have more detail, as this is just the "tech" bit of it)
 
Last edited:
May 11, 2008
22,549
1,470
126
I am just a noob, but I do not get it. DX12 and Vulkan give a developer the chance to access the hardware at a lower level, being less abstract compared to OpenGL and the older DX versions. So to me it is inevitable that there will be different ways to optimize to get the best results on different hardware. Creating a general way of using the hardware (in a sense, the developer creates an abstraction layer) will either favor one architecture over the other or cripple (reduce the performance of) both.

I find it strange. :hmm:

There will always have to be architecture-specific optimizations. Yes?
 

dogen1

Senior member
Oct 14, 2014
739
40
91
This section wasn't in the technical guide the last few days. :thumbsup:

The description of the async compute shader tasks, and that they're run in parallel with shadow maps (provided the hardware and driver will do that), was in there from the beginning.