DirectX 12: Futuremark 3DMark Time Spy Benchmarks Thread


dogen1

Senior member
Oct 14, 2014
That's a terrible technical guide; it doesn't even go into the technical aspects.

They don't specify further, just that they use Async Compute to increase GPU utilization.

id Software uses Async Compute both to increase shader utilization with post effects and to actually run rasterizers & DMAs in parallel with shaders via shadow maps & MegaTexture streaming.

"Before the main illumination passes, asynchronous compute shaders are used to cull lights, evaluate illumination from prebaked environment reflections, compute screen-space ambient occlusion, and calculate unshadowed surface illumination. These tasks are started right after G-buffer rendering has finished and are executed alongside shadow rendering."

"Particles are simulated on the GPU using asynchronous compute queue. Simulation work is submitted to the asynchronous queue while G-buffer and shadow map rendering commands are submitted to the main command queue."
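The kind of overlap described above can be sketched with a toy Python timing model; plain threads stand in for the graphics and async compute queues, and the sleeps stand in for GPU work (the function names and durations are made up, and this is not real D3D12 code):

```python
import threading
import time

def shadow_rendering():
    time.sleep(0.1)   # stand-in for the rasterizer churning through shadow maps

def compute_prepass():
    time.sleep(0.1)   # stand-in for light culling, SSAO, etc. on compute

# Serial submission: compute work waits for the graphics work to finish.
t0 = time.perf_counter()
shadow_rendering()
compute_prepass()
serial = time.perf_counter() - t0

# Async submission: compute work goes to a second "queue" and overlaps.
t0 = time.perf_counter()
queue = threading.Thread(target=compute_prepass)
queue.start()
shadow_rendering()
queue.join()
overlapped = time.perf_counter() - t0

print(f"serial: {serial:.2f}s, overlapped: {overlapped:.2f}s")
```

On a real GPU the win comes from idle shader units picking up the async work rather than from CPU threads, but the scheduling picture is the same: independent work overlapped instead of run back to back.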
 

moonbogg

Lifer
Jan 8, 2011
"Before the main illumination passes, asynchronous compute shaders are used to cull lights, evaluate illumination from prebaked environment reflections, compute screen-space ambient occlusion, and calculate unshadowed surface illumination. These tasks are started right after G-buffer rendering has finished and are executed alongside shadow rendering."

"Particles are simulated on the GPU using asynchronous compute queue. Simulation work is submitted to the asynchronous queue while G-buffer and shadow map rendering commands are submitted to the main command queue."

That's exactly what I assumed was happening. What about you guys?
 

Kenmitch

Diamond Member
Oct 10, 1999
"Before the main illumination passes, asynchronous compute shaders are used to cull lights, evaluate illumination from prebaked environment reflections, compute screen-space ambient occlusion, and calculate unshadowed surface illumination. These tasks are started right after G-buffer rendering has finished and are executed alongside shadow rendering."

"Particles are simulated on the GPU using asynchronous compute queue. Simulation work is submitted to the asynchronous queue while G-buffer and shadow map rendering commands are submitted to the main command queue."

Sorry....You lost me at before. :)

Sounds reasonable nevertheless.
 

Red Hawk

Diamond Member
Jan 1, 2011
Ran the test using the 16.7.2 driver package.

Results on default settings:

Graphics: 3910
CPU: 2804

Results on default settings with asynchronous compute disabled:

Graphics: 3382
CPU: 2801

Yay for asynchronous compute! Getting better performance for free. It doesn't seem to make a difference in the CPU test; however, it could still be affecting CPU behavior indirectly. The monitoring graphs showed my CPU frequency fluctuating much more in the graphics tests on the non-async run than with async compute. I should note there was quite a bit of CPU frequency fluctuation with async compute anyway, especially in graphics test 1; it just fluctuated even more without it. Interestingly, both runs recorded near-constant 100% GPU usage, so whatever goes unused without async compute doesn't appear to show up in GPU usage monitoring.
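As a quick sanity check, the async gain implied by those default-preset graphics scores works out to roughly 15-16%:

```python
# Graphics scores from the run above: async compute on vs. off.
graphics_on, graphics_off = 3910, 3382
gain = graphics_on / graphics_off - 1
print(f"Async compute graphics gain: {gain:.1%}")  # ≈ 15.6%
```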

The test really made the three fans on my 290X work for their lunch. Woof. Peak GPU temperature was 76 degrees C by the end of the second graphics test; peak CPU temperature was 59 degrees during the GPU tests and 82 degrees by the end of the CPU test.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
Nice score! Especially on that cpu:thumbsup:

Well, I literally spent countless hours dialing in the overclock to make it as fast as possible in every aspect. I'm really surprised I scored 17% higher per core than AdamK47 at only 9% faster clocks.
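For what it's worth, a rough back-of-the-envelope check of that comparison, using only the 17% and 9% figures as stated (the underlying scores aren't in the post):

```python
# 17% more per-core score from only 9% higher clocks suggests the rest
# came from elsewhere (memory OC, tighter timings, etc.).
score_ratio, clock_ratio = 1.17, 1.09
beyond_clocks = score_ratio / clock_ratio - 1
print(f"Gain not explained by clock speed: {beyond_clocks:.1%}")  # ≈ 7.3%
```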
 

Red Hawk

Diamond Member
Jan 1, 2011
Hmm...tried running Time Spy on my brother's system, a Core 2 Quad Q6600 and Radeon 260X. But it won't run at all; it just goes straight to the results screen with the error message "No results produced". Yeah, no duh. It's on the 16.7.2 drivers; I tried running in windowed mode and uninstalling and reinstalling the application. It still won't run, but Fire Strike runs fine, so I'm not sure what the problem could be.
 
Feb 19, 2009
"Before the main illumination passes, asynchronous compute shaders are used to cull lights, evaluate illumination from prebaked environment reflections, compute screen-space ambient occlusion, and calculate unshadowed surface illumination. These tasks are started right after G-buffer rendering has finished and are executed alongside shadow rendering."

"Particles are simulated on the GPU using asynchronous compute queue. Simulation work is submitted to the asynchronous queue while G-buffer and shadow map rendering commands are submitted to the main command queue."

Is this in the technical guide? I didn't see it under the Time Spy section.
 

JustMe21

Senior member
Sep 8, 2011
Just for grins, I ran it on an i7-3770 and Radeon R7 250X aka Radeon 7770 and I got a whopping 596.
 

Red Hawk

Diamond Member
Jan 1, 2011
Tested at 1080p with 16x anisotropic filtering, as that's my monitor resolution and I prefer to bump up the AF in games. No idea why trilinear filtering is the default for Time Spy; AF is pretty basic and doesn't have a high performance cost.

Async Off:

Graphics Score: 4,844
Test 1: 33.17 FPS
Test 2: 26.65 FPS

Async On:

Graphics Score: 5,524
Test 1: 38.37 FPS
Test 2: 30.05 FPS
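From those 1080p graphics scores, the async gain works out to about 14%:

```python
# Graphics scores at 1080p/16x AF from the runs above.
graphics_on, graphics_off = 5524, 4844
gain = graphics_on / graphics_off - 1
print(f"Async compute graphics gain: {gain:.1%}")  # ≈ 14.0%
```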
 

YBS1

Golden Member
May 14, 2000
Jeez...I did something stupid. I loaded up my benchmark settings and ran it at 4.8GHz a couple of times and wondered why my scores had dropped to around 10,000; I finally realized I had G-Sync on and it was set to 60Hz. I'll get back around to a 4.8GHz bench later; for now here is my daily 4.5GHz.


http://www.3dmark.com/spy/50984
 

Hitman928

Diamond Member
Apr 15, 2012
Sure, here you go:

With async:
Graphics score
2,725
Graphics test 1
17.59 FPS
Graphics test 2
15.76 FPS

Without async:
Graphics score
2,633
Graphics test 1
16.97 FPS
Graphics test 2
15.25 FPS

Thanks, looks like the benefit of async on the 1070 drops to about 3.5% at 4k. I'll do the same on my 290 tomorrow to compare.
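Double-checking that ~3.5% figure against the 4K scores quoted above:

```python
# 1070 graphics scores at 4K: async on vs. off.
on, off = 2725, 2633
gain = on / off - 1
print(f"Async gain at 4K: {gain:.1%}")  # ≈ 3.5%, matching the estimate
```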
 

Hitman928

Diamond Member
Apr 15, 2012
Also, I wanted to try the explicit multi-adapter functionality, so I ran it with a 290 and a 280, with disappointing results: it was no faster than a single 280 even though the 290 was being loaded. The 280 was the primary card, so I'll switch them to see if I can get better results.
 

Red Hawk

Diamond Member
Jan 1, 2011
Just for grins, I ran it on an i7-3770 and Radeon R7 250X aka Radeon 7770 and I got a whopping 596.

That's interesting. I just ran the benchmark on my own PC with my brother's 260X swapped out for my 290X. Scored 1,428 total, 1,312 graphics. It's a total slideshow either way, but the 260X (Bonaire) having twice the geometry hardware of the 250X (Cape Verde) probably helped a bunch.

Async compute tests were interesting. I'd heard that the weaker/smaller the chip gets, the less of a difference asynchronous compute makes, because the whole chip is likely being used at any given moment and there are really no spare resources. It could even cause a loss in performance, like with Maxwell chips in Ashes of the Singularity, because async compute introduces latency if it goes unused. I tested at 1440p, 1080p, and 1440x900, with and without async compute. Graphics results were:

1440p, async off: 1,313
1080p, async on: 2,080
1080p, async off: 2,092
1440x900, async on: 2,895
1440x900, async off: 2,905

So yeah, results pretty consistent with what I heard. The 260X actually gets a few extra points without asynchronous compute. The benefit of asynchronous compute is definitely for high-end cards with compute units to spare, not low-end cards which are being used to their max as it is.
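The per-resolution deltas from those numbers bear that out; both come out slightly negative:

```python
# Async compute deltas per resolution from the 260X numbers above
# (negative = async compute cost a little performance).
scores = {                      # resolution: (async on, async off)
    "1080p":    (2080, 2092),
    "1440x900": (2895, 2905),
}
deltas = {res: on / off - 1 for res, (on, off) in scores.items()}
for res, d in deltas.items():
    print(f"{res}: {d:+.2%}")   # both slightly negative, well under 1%
```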

Also, I wanted to try the explicit multi-adapter functionality, so I ran it with a 290 and a 280, with disappointing results: it was no faster than a single 280 even though the 290 was being loaded. The 280 was the primary card, so I'll switch them to see if I can get better results.

...I didn't realize you could try that. I have my 290X, my brother's 260X, and a spare 270X I could try multiadapter with...though I doubt my PSU would appreciate both the 290X and 270X being hooked up to it. The 270X and 260X could be doable though...
 

richaron

Golden Member
Mar 27, 2012
Also, I wanted to try the explicit multi-adapter functionality, so I ran it with a 290 and a 280, with disappointing results: it was no faster than a single 280 even though the 290 was being loaded. The 280 was the primary card, so I'll switch them to see if I can get better results.

Pretty sure I read it's only for matching GPUs.
 

Hitman928

Diamond Member
Apr 15, 2012
Pretty sure I read it's only for matching GPUs.

Reading through it again, I think you might be right, but I'm still not sure what the limits are. In the tech guide and in the slides they talk about being able to combine discrete and integrated solutions, and how through explicit multi-adapter they can have control over any GPUs in the system. But then they drop this at the very end, which I hadn't read before:

"MDA configurations of heterogeneous adapters are not supported"

Edit: Never mind, found the answer. They do say they only support AFR with identical GPUs. It's funny that they have slides and a whole paragraph in the tech doc about how DX12 allows them to harness a heterogeneous GPU system, then pull back and are like, but we're not doing that. Fun while it lasted, lol. I am sure the 290 was being loaded, though; I heard the fans spin up and the temp increased throughout the benchmark. I wonder if it was running but the results were just thrown away or something.
 

Deders

Platinum Member
Oct 14, 2012
You can check with GPU-Z. Open 2 instances and make sure one is set to the secondary card.
 

Red Hawk

Diamond Member
Jan 1, 2011
Ok, I got Time Spy working on my brother's PC; I had to revert to driver 16.3.2 to do it, possibly something to do with AMD using SSE4 (which the Q6600 doesn't support) in DirectX 12 under drivers newer than 16.4. Anyways, I put his PC through some abuse to get numbers (edit: to clarify, this is the PC with a stock Q6600 and Radeon 260X):

Default:
Graphics 1340
CPU 1224
Total 1321

1440x900, async compute on:
Graphics 3043
CPU 1162

1440x900, async compute off:
Graphics 2918
CPU 1140

Over a 100-point improvement at 1440x900 with async on versus async off. So asynchronous compute may in fact have a (minor) benefit even on all-around low-end systems.
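Quick arithmetic check on those 1440x900 numbers:

```python
# Q6600 + 260X graphics scores at 1440x900: async on vs. off.
on, off = 3043, 2918
points = on - off              # the 100+ point improvement
gain = on / off - 1
print(f"+{points} points, async gain: {gain:.1%}")  # ≈ 4.3%
```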

Edit: Just tried running the demo for the first time on my brother's PC along with the tests. The demo is an absolute slideshow even at 1440x900. 1-2 FPS, whether or not async compute is on. Good lord.
 

Kenmitch

Diamond Member
Oct 10, 1999
Edit: Just tried running the demo for the first time on my brother's PC along with the tests. The demo is an absolute slideshow, 1-2 FPS, whether or not async compute is on. Good lord.

Too funny....When watching my rig I thought it was too choppy. :)

Dang Zotac and their aggressive core/boost clocks on the 1070 AMP Edition didn't really leave me much room to OC the core. Memory, on the other hand, I've tested up to 9408 so far. Looks like I get a decent little bump in overall score each time. So far I've only done a measly +20 on the core. I'm kind of thinking they used something similar to bin the chips for the AMP Extreme vs the AMP cards.

Is there some kind of sweet spot for the GDDR5 as far as speed/timings go? I'm getting tired of messing with the RAM speed. Would be nice if it were one of those "9469 nets the best results in the end" kind of things.