(Discussion) Futuremark 3DMark Time Spy DirectX 12 Benchmark


Azix

Golden Member
Apr 18, 2014
1,438
67
91
By a full implementation of DX12, do you mean using GCN-specific shaders in a low-level, vendor-specific code path? Intrinsic shaders are where most of the Doom Vulkan gains are coming from.

Where was this shown?

Who's going nuts? People have questions as some of the statements are a bit vague.

He's been posting in different places. Some folks are going overboard.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
That is indeed the long-term plan. Though first we'll release VRMark, which is designed for the "lowest common denominator in VR", so DX11 - but that is a separate product, not part of 3DMark.

Some background on Time Spy:


  • Initially Time Spy was targeted for a launch in late 2015 / early 2016. Back then the market share of graphics cards that could actually support FL12 was... umm... "limited". Unfortunately, both the maturity of DX12 drivers and some really complex issues in implementing a brand new DX12 engine delayed it considerably. Patching a benchmark workload after the fact is really, really, really bad, so we would rather take the time to do it right.

  • Average consumers take it REALLY badly if a new benchmark says "you can't run it on your brand new (well, 3-year-old) system because of X". This also directed us toward supporting FL11. On the CPU test we took the "bold step" of requiring SSSE3, and... uh... I've already apologized today to four customers that no, their Phenom II or Opteron can't run the test. :(

  • An FL12 benchmark with fallbacks to FL11 would not really be feasible - it would basically be two separate benchmarks. FL12 adds some interesting features, and fully exploiting them would take a dedicated approach.

  • Pretty much all games target DX12 FL11, and even there most current game engines are doing DX12 in an "oh, we just ported our DX11 code" way. Time Spy at least goes one step further, with an engine developed from the ground up 'the DX12 way' (which is one of the reasons it took a while - oh, the tales our engine team could tell of DX12 features where the spec says one thing and driver implementations do... other things. The phrase "What? Nobody is doing it like this" was said to us by driver developers more than once...).
So I'd say Time Spy is a legit tool that reflects how games could perform when a pure DX12 engine targets the most widely used hardware base (i.e., DX12 FL11). Yes, some of the newest cards could use FL12 code paths, and I'm sure some games will offer those, but the pool of compatible hardware is still very small. This is, after all, our first full-blown DX12 benchmark, so we have to start from the obvious first step.
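Roughly, for anyone wondering what the FL11 / FL12 distinction means at the API level: a D3D12 engine creates its device at a minimum feature level and can then ask the driver what the hardware actually supports. A minimal sketch of that probe (illustrative only, not Time Spy code; assumes the Windows 10 SDK headers):

Code:
// Sketch: probe the highest D3D12 feature level the default adapter supports
// before choosing a render path. Not Futuremark code, just an illustration.
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    // Creating the device at FL 11_0 covers the widest DX12 hardware base.
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device)))) {
        std::puts("No DX12-capable adapter at FL 11_0");
        return 1;
    }

    // Ask the driver for the highest feature level it actually supports.
    const D3D_FEATURE_LEVEL levels[] = {
        D3D_FEATURE_LEVEL_12_1, D3D_FEATURE_LEVEL_12_0,
        D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0,
    };
    D3D12_FEATURE_DATA_FEATURE_LEVELS query = {};
    query.NumFeatureLevels = _countof(levels);
    query.pFeatureLevelsRequested = levels;
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS,
                                              &query, sizeof(query)))) {
        std::printf("Max supported feature level: 0x%x\n",
                    query.MaxSupportedFeatureLevel);
    }
    return 0;
}

The point being: a benchmark built against FL 11_0 runs on all of that hardware, while FL 12_0-only features would need a separate path that most cards at the time couldn't take.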

I appreciate you guys reaching out. That makes a lot of sense. Lol @ the SSSE3 people on Phenoms too.
 

alexruiz

Platinum Member
Sep 21, 2001
2,836
556
126
I don't know if this is the right thread to post, but I'll start here.
For licensing, I noticed that existing 3DMark Advanced licenses can get Time Spy custom for $4.99. That obviously applies to already registered licenses.

What about licenses that are NOT registered?
I still have a few 3DMark Advanced licenses that I haven't used yet.
Would registering one of those also give me the upgrade to Time Spy custom? Or are the newer licenses somehow different, so that the old ones don't provide Time Spy custom?

Thanks
 

AnandThenMan

Diamond Member
Nov 11, 2004
3,991
627
126
You cannot make a fair benchmark if you start bolting on vendor-specific, or even generation-specific, architecture-centered optimizations.

It appears that is exactly what has happened. I'll wait for the official statement on the matter, but from what I've read this bench has been specifically catered to Nvidia with respect to async.
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
54
91
It appears that is exactly what has happened. I'll wait for the official statement on the matter, but from what I've read this bench has been specifically catered to Nvidia with respect to async.

Most folks complaining here seem to share this exact sentiment. Mostly from those wanting to see one company crush the other in DX12 benches, or to see very specific hardware-centric features favoring one GPU family, but who didn't see it happen. I guess I'd be miffed too.
 

AnandThenMan

Diamond Member
Nov 11, 2004
3,991
627
126
Most folks complaining here seem to share this exact sentiment. Mostly from those wanting to see one company crush the other in DX12 benches, or to see very specific hardware-centric features favoring one GPU family, but who didn't see it happen. I guess I'd be miffed too.
You completely missed the point. Also, this is a technical discussion, not a personal one, so leave that stuff out.
 

FM_Jarnis

Member
Jul 16, 2016
28
1
0
I don't know if this is the right thread to post, but I'll start here.
For licensing, I noticed that existing 3DMark Advanced licenses can get Time Spy custom for $4.99. That obviously applies to already registered licenses.

What about licenses that are NOT registered?
I still have a few 3DMark Advanced licenses that I haven't used yet.
Would registering one of those also give me the upgrade to Time Spy custom? Or are the newer licenses somehow different, so that the old ones don't provide Time Spy custom?

Thanks

I'm actually not 100% certain how Steam is set up, but my educated guess is "no". Also, the standalone version would not give you Time Spy from just a "3DM-ICF-" key.

If you now purchase 3DMark from the Futuremark Store (i.e. Digital River), it gives you both keys.

If you now purchase 3DMark from Steam, only a bundle that also contains Time Spy is available.

Note that the 3DMark package that also contains Time Spy is $5 more expensive than the old 3DMark Advanced that didn't contain it. The price goes up as soon as the launch sale ends (during the launch sale the package is $10 and the upgrade is $5).

Apologies for having to do a paid upgrade. 18 months of development is not free. We will continue adding more to 3DMark Advanced Edition, and to Time Spy Upgrade, for free.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
You cannot make a fair benchmark if you start bolting on vendor-specific, or even generation-specific, architecture-centered optimizations.

DX12 is a standard. We made a benchmark according to the spec; it's up to the graphics card vendors how their products implement the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

Except with DX12, you have to. That is the power you get from using a lower-level API, but with that power comes the responsibility to add vendor-specific paths. (Thanks, Uncle Ben!)

There have been multiple presentations from the big development studios where they've all stated that the driver layer is now thin and that developers are now required to do those hardware-specific optimizations themselves.
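To make the "vendor-specific paths" point concrete, this is roughly the kind of branching an engine can do at startup under DX12 - query the adapter's PCI vendor ID through DXGI and pick a tuned path per vendor. The RenderPath names here are made up for illustration; only the DXGI calls and vendor IDs are real:

Code:
// Sketch: choosing a vendor-specific render path from the DXGI adapter's
// PCI vendor ID. The RenderPath enum is hypothetical.
#include <dxgi1_1.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

enum class RenderPath { Generic, AmdGcn, NvidiaPath, IntelPath };

RenderPath PickRenderPath() {
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return RenderPath::Generic;

    ComPtr<IDXGIAdapter1> adapter;
    if (FAILED(factory->EnumAdapters1(0, &adapter)))  // default adapter
        return RenderPath::Generic;

    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);

    switch (desc.VendorId) {
        case 0x1002: return RenderPath::AmdGcn;     // AMD
        case 0x10DE: return RenderPath::NvidiaPath; // NVIDIA
        case 0x8086: return RenderPath::IntelPath;  // Intel
        default:     return RenderPath::Generic;
    }
}

Whether a neutral benchmark should do this is exactly what's being argued about; the point is only that the API makes it trivial.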
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
54
91
You completely missed the point. Also, this is a technical discussion, not a personal one, so leave that stuff out.

What, are you serious?
What I am seeing and what you are saying are highly contradictory.
It is only a technical discussion that sprang from something personal.
Even the Futuremark rep was called into question despite exemplary, detailed explanations.
But it isn't ever going to be good enough, because...?
Now go gang up and report my posts, even though this is more directly on topic than almost any other post in this thread.
I'm not attacking, insulting, or specifically calling anyone out, nor flaming anyone.
I'm simply calling things the way they are in a forum that doesn't permit it.
My humble apologies to the Futuremark rep.
 

AnandThenMan

Diamond Member
Nov 11, 2004
3,991
627
126
You cannot make a fair benchmark if you start bolting on vendor-specific, or even generation-specific, architecture-centered optimizations.
Then there is little point to Time Spy, or at best we can say it is not a proper DX12 benchmark.

But again, Time Spy actually does use vendor-specific code, or more accurately a method that allows Nvidia to benefit but is not what is optimal under DX12. That is how I understand it currently; if there is more info showing this is not the case, then great.
 
Last edited:

Det0x

Golden Member
Sep 11, 2014
1,465
4,999
136
You cannot make a fair benchmark if you start bolting on vendor-specific, or even generation-specific, architecture-centered optimizations.

DX12 is a standard. We made a benchmark according to the spec; it's up to the graphics card vendors how their products implement the spec (if they do not follow it, MS won't certify the drivers, so they do follow it).

Beyond that, we will be publishing an official clarification on this issue, probably later today or tomorrow. I fear it won't placate all the people who are going nuts over this with their claims, but we'll do our best.

Didn't Futuremark do exactly this in the past? "Bolting on" PhysX into the benchmark?

If you were using an Nvidia card, you had the option to choose whether you wanted PhysX to run on the CPU or on an NV graphics card.

Something which boosted scores for a specific vendor..

While if you were using an AMD card, your only option was to run it on the CPU (or use the AMD+Nvidia PhysX hack).
Intel, AMD and NVIDIA are all part of the Benchmark Development Program. They have read access to the source code, and they can suggest changes and give feedback (with the feedback visible within the BDP, so any changes they suggest have to be accepted by the other vendors as well, while Futuremark retains final say as to what goes into the benchmark).
AMD: We want an (enableable) mode which supports the true multi-engine approach, so we can get maximum utilization of our hardware.

Nvidia: No, out of the question. Our hardware doesn't support it. If we agreed to this, we would see the same results as we see in the Doom Vulkan benchmarks.

Futuremark: Nvidia's "AC lite" version it is, then.


If it simply was too much work to implement, then please come out and say so; otherwise this pretty much looks like a double standard to me.
 
Last edited:

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Guys - they were pretty up front that this is just the beginning. We will see a fleshed-out DX12 FL_12 engine in time. For a product you're seeing today, given their 18-month dev cycle, they started planning almost two years ago. There was no FL_12 engine out at that time; Ashes was still an early Mantle demo!

I personally am interested in seeing the results of both an FL11 and an FL12 benchmark, as game devs will likely be creating both for the foreseeable future. We probably won't be fully 100% FL12 for years.

See post 30 above http://forums.anandtech.com/showpost.php?p=38362082&postcount=30.
 

Bacon1

Diamond Member
Feb 14, 2016
3,430
1,018
91
https://developer.nvidia.com/sites/.../GDC16/GDC16_gthomas_adunn_Practical_DX12.pdf

Compute Queue #1

Use with great care!
● Seeing up to a 10% win currently, if done correctly

Always check this is a performance win
● Maintain a non-async compute path
● Poorly scheduled compute tasks can be a net loss

Remember hyperthreading? Similar rules apply

Compute Queue #4
Prefer explicit scheduling of async compute tasks through smart use of fences
● Benefits are:
- Frame-to-frame determinism
- App control over technique pairing!
● Downsides are:
- It takes a little longer to implement
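For anyone who hasn't touched D3D12, what those slides mean by "explicit scheduling through smart use of fences" looks roughly like this - a dedicated compute queue whose completion the graphics queue waits on. A bare-bones sketch of the mechanism, not code from any benchmark (error handling omitted, device assumed to exist):

Code:
// Sketch: submit work on a separate compute queue and fence the graphics
// queue against it. Illustrative only.
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

void SubmitAsyncCompute(ID3D12Device* device,
                        ID3D12CommandQueue* graphicsQueue,
                        ID3D12CommandList* computeWork) {
    // A COMPUTE-type queue maps to a compute engine, which the driver
    // may (or may not) run concurrently with graphics work.
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));

    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Kick off the compute work and signal the fence when it completes.
    ID3D12CommandList* lists[] = { computeWork };
    computeQueue->ExecuteCommandLists(1, lists);
    computeQueue->Signal(fence.Get(), 1);

    // The graphics queue only stalls at this explicit point, so anything
    // submitted to it before the Wait can overlap with the compute job.
    graphicsQueue->Wait(fence.Get(), 1);
}

How much actual overlap you get out of this is up to the hardware and driver, which is the whole Maxwell/Pascal/GCN argument in this thread.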
 

jj109

Senior member
Dec 17, 2013
391
59
91
Where was this shown?

https://www.reddit.com/r/Amd/comments/4snqdt/doom_is_the_first_game_to_use_shader_intrinsic/

The role of having async compute is testable too, since SMAA disables async compute while it remains available with TSAA and FXAA, and all three are fairly similar in resource requirements.

Gains from async compute are in the range of 5-10% for Doom Vulkan.

40% gains from multi-engine concurrency alone are just fanboy fantasy fueled by charlatans. It's a mix of transitioning away from terrible OpenGL drivers, intrinsic shaders, and higher utilization from multi-engine.
 

Hitman928

Diamond Member
Apr 15, 2012
6,695
12,370
136
https://www.reddit.com/r/Amd/comments/4snqdt/doom_is_the_first_game_to_use_shader_intrinsic/

The role of having async compute is testable too, since SMAA disables async compute while it remains available with TSAA and FXAA, and all three are fairly similar in resource requirements.

Gains from async compute are in the range of 5-10% for Doom Vulkan.

40% gains from multi-engine concurrency alone are just fanboy fantasy fueled by charlatans. It's a mix of transitioning away from terrible OpenGL drivers, intrinsic shaders, and higher utilization from multi-engine.

He's asking for a source showing that the majority of gain is coming from the intrinsic shaders, not that they're being used.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
jj109 said:
Intrinsic shaders are where most of the Doom Vulkan gains are coming from.

Could you please just answer precisely where your claim is coming from, namely that the majority of the gain is from intrinsic shaders?

jj109 said:
40% gains from multi-engine concurrency alone are just fanboy fantasy fueled by charlatans.

That's all nice, but no one here claimed this either, and it has nothing to do with the above question. But apparently you have a breakdown of the individual gains...
 
Last edited:

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
https://www.reddit.com/r/Amd/comments/4snqdt/doom_is_the_first_game_to_use_shader_intrinsic/

The role of having async compute is testable too, since SMAA disables async compute while it remains available with TSAA and FXAA, and all three are fairly similar in resource requirements.

Gains from async compute are in the range of 5-10% for Doom Vulkan.

40% gains from multi-engine concurrency alone are just fanboy fantasy fueled by charlatans. It's a mix of transitioning away from terrible OpenGL drivers, intrinsic shaders, and higher utilization from multi-engine.

The shader intrinsics have to be a huge chunk of the performance. From my understanding, they essentially allow the devs to drop down into assembly-level code for the GPU.

You can write assembly on CPUs today and go really, really fast, but it's very difficult and time consuming. Folks these days are likely pretty used to it, though, because it's been available on consoles for so long.

I'm guessing here, but extreme low-level optimization specific to each piece of hardware obviously has to be worth enough performance to justify spending the man-hours to get it done.
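A rough CPU-side analogy for what intrinsics buy you (hand-picking instructions instead of trusting the generic path) - this is just the analogy from the post above in plain C++, not GPU code:

Code:
// CPU analogy only: explicit SSE intrinsics vs. a generic scalar loop.
// Shader intrinsics give GPU code a similar kind of direct hardware access.
#include <xmmintrin.h>  // SSE

void add_scalar(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = a[i] + b[i];            // compiler decides how to map this
}

void add_sse(const float* a, const float* b, float* out, int n) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        // Explicitly request the 128-bit packed add (ADDPS).
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)                   // scalar tail
        out[i] = a[i] + b[i];
}

Same idea on the GPU: the vendor exposes operations the portable shader language can't express directly, and you trade portability for speed.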
 

jj109

Senior member
Dec 17, 2013
391
59
91
He's asking for a source showing that the majority of gain is coming from the intrinsic shaders, not that they're being used.

Do the math. If toggling between TSAA and SMAA gains 5-10% and only one of them has async compute enabled...

Proof that AA modes have similar FPS hit: http://www.tweaktown.com/guides/7757/doom-graphics-performance-tweak-guide/index2.html

Also, this: https://youtu.be/GlzKPBIjZPo?t=80

Excuse the WCCFTech, but they did do the comparison so no math needed.

P.S: Can you guys stop trying to crucify FM because they "aren't using true async"? The gains from async on/off are in line with Doom Vulkan, as shown here.
 
Last edited:

Det0x

Golden Member
Sep 11, 2014
1,465
4,999
136
Do the math. If toggling between TSAA and SMAA gains 5-10% and only one of them has async compute enabled...

Proof that AA modes have similar FPS hit: http://www.tweaktown.com/guides/7757/doom-graphics-performance-tweak-guide/index2.html

Also, this: https://youtu.be/GlzKPBIjZPo?t=80

Excuse the WCCFTech, but they did do the comparison so no math needed.

P.S: Can you guys stop trying to crucify FM because they "aren't using true async"? The gains from async on/off are in line with Doom Vulkan, as shown here.

SMAA = async disabled

[benchmark screenshot]


TSSAA = async enabled

[benchmark screenshot]


id Software has confirmed that async compute is only active with no AA or with TSSAA. They will add support for other AA modes soon.

Seems like a lot more than only 5-10% to me.
 
Last edited:

Hitman928

Diamond Member
Apr 15, 2012
6,695
12,370
136
Do the math. If toggling between TSAA and SMAA gains 5-10% and only one of them has async compute enabled...

Proof that AA modes have similar FPS hit: http://www.tweaktown.com/guides/7757/doom-graphics-performance-tweak-guide/index2.html

Also, this: https://youtu.be/GlzKPBIjZPo?t=80

Excuse the WCCFTech, but they did do the comparison so no math needed.

P.S: Can you guys stop trying to crucify FM because they "aren't using true async"? The gains from async on/off are in line with Doom Vulkan, as shown here.

OK, but you implied that you have knowledge of the breakdown of the gain, which you don't. You then use a comparison video which is flawed, as they don't actually track any key statistics (average framerate, minimums, etc.), and the async on/off comparison is actually comparing two different AA methods, only one of which allows async to be enabled, so it's not really valid for isolating the async gain. I agree that the whole gain in Doom is not from async, but I don't think anyone here was trying to argue that; you were the only one I saw arguing for any kind of specific gain breakdown.

Edit:

As far as Time Spy goes, someone at overclock.net did a GPUView breakdown of the queues during Doom, AotS, and Time Spy.

[GPUView queue capture screenshot]


I'm not an expert on this, but from what I see and understand from others, this shows that Time Spy does not in fact try to do parallel graphics+compute async. You can see that when a compute job comes in, the graphics pipeline receives a context switch and then work starts on the compute queue. Maybe someone with more experience in profiling graphics processing can chime in, but I think this shows what Mahigan was saying over there: that FM developed an "async lite" test that allows Pascal and GCN to better schedule the workloads through the pipeline, but it's not the same as what we're seeing in Doom, Hitman, and AotS, where graphics, copy, and compute can be executed in parallel. Personally, I'm fine with this; they just need to be a little more forthcoming about it in their documentation if this is indeed the case. Any benchmark is worthless unless you understand what you are testing.
 
Last edited:

jj109

Senior member
Dec 17, 2013
391
59
91
SMAA = async disabled

TSSAA = async enabled

On the current Vulkan patch, you may or may not be aware, but id Software has confirmed that async compute is only active with no AA or with TSSAA. They will add support for other AA modes soon.

I'm well aware; that's why I'm saying it's possible to calculate how much gain we can expect from async. See my previous post and watch the video.

You've pulled up two completely different benchmark sites, testing different scenes, one with results not replicated elsewhere. What exactly are you trying to prove?

OK, but you implied that you have knowledge of the breakdown of the gain, which you don't. You then use a comparison video which is flawed, as they don't actually track any key statistics (average framerate, minimums, etc.), and the async on/off comparison is actually comparing two different AA methods, only one of which allows async to be enabled, so it's not really valid for isolating the async gain. I agree that the whole gain in Doom is not from async, but I don't think anyone here was trying to argue that; you were the only one I saw arguing for any kind of specific gain breakdown.

I've proved that the anti-aliasing method has little effect on FPS. Not having full averages is not an excuse for ignoring the result completely.

At this point, everyone not caught up in a witch hunt should see that blasting Time Spy for not showing 40% gains on AMD GPUs with async compute turned on is just BS. Stop it.
 
Last edited:

Thala

Golden Member
Nov 12, 2014
1,355
653
136
jj109 said:
Do the math.

Your claim implies that you did the math. So please show us your breakdown, or can we assume you are just talking hot air?
 

jj109

Senior member
Dec 17, 2013
391
59
91
Your claim implies that you did the math.

Cripes. I can't believe I'm actually arguing with you.

First you demand sources.

I give sources.

And then another guy posts "nuh-uh, look at my two completely different benchmarks sourced from different sites" and starts nitpicking anti-aliasing, which I already proved has little effect.

And then you ignore everything except the first sentence.

The reality distortion field is real.
 
Last edited:

Thala

Golden Member
Nov 12, 2014
1,355
653
136
jj109 said:
Sure thing. Already did.

Thanks for proving my "hot air" assumption.

P.S. Everyone agrees that the gain cannot be attributed to async compute alone. However, you are the only one claiming to know that the majority of the gain is coming from shader intrinsics. So it is reasonable to ask for a breakdown.
 
Last edited:

Hitman928

Diamond Member
Apr 15, 2012
6,695
12,370
136
OK, then we'll make it straightforward: out of the 52% gain from OpenGL to Vulkan that ComputerBase.de showed with the Fury X, roughly how much of that gain is from async, how much from intrinsic shaders, and how much from multi-core rendering, etc.?