ComputerBase Ashes of the Singularity Beta 1 DirectX 12 Benchmarks


linkgoron

Platinum Member
Mar 9, 2005
2,588
1,234
136

Tapoer

Member
May 10, 2015
64
3
36
The 380x basically matches the 970 in those DX12 tests... I assume that something is wrong with the nVidia setup, drivers or whatever, as I can't imagine how that's possible.

It could be possible if Ashes of the Singularity is using even more asynchronous compute than before, which Nvidia doesn't support in hardware.
EDIT: The 380X has the same peak teraflop performance as the 970; async compute helps AMD achieve it.
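(Rough check, using approximate reference clocks: R9 380X = 2048 shaders × 2 FLOPs/clock × 0.97 GHz ≈ 3.97 TFLOPS; GTX 970 = 1664 shaders × 2 FLOPs/clock × ~1.18 GHz boost ≈ 3.9 TFLOPS. So the peaks really are close on paper.)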

At least Nvidia's performance in D3D12 is better than in D3D11.
 
Last edited:

Flapdrol1337

Golden Member
May 21, 2014
1,677
93
91
The 380x basically matches the 970 in those DX12 tests... I assume that something is wrong with the nVidia setup, drivers or whatever, as I can't imagine how that's possible.
A 380X has more memory bandwidth and is a bigger chip than a 970, if you count how much the 970 is cut down. It also uses more power.

Not impossible, imo.
 

zlatan

Senior member
Mar 15, 2011
580
291
136
The 380x basically matches the 970 in those DX12 tests... I assume that something is wrong with the nVidia setup, drivers or whatever, as I can't imagine how that's possible.
This is pretty normal in DX12 with well-optimized standard multi-engine code.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Yes, but the R9 380X is also not using all of its memory controllers, so effectively it's a little smaller than 366 mm², closer to 350 mm².

I believe that if the GTX 970 were a full die, it would still be a little bigger than 350 mm².

But let's agree that they are both almost the same die size. ;)
 

linkgoron

Platinum Member
Mar 9, 2005
2,588
1,234
136
This is pretty normal in DX12 with well-optimized standard multi-engine code.

That's quite a claim, that this is "pretty normal".

Are there any other DX12 engines that exhibit this kind of behavior? Usually I see the 290X/390 getting better performance than the 970, but I would assume that the 380X basically *matching* the 970 would make a bit more buzz.

BTW, is the link down? I get a 404 now.
 

Tapoer

Member
May 10, 2015
64
3
36
That's quite a claim, that this is "pretty normal".

Are there any other DX12 engines that exhibit this kind of behavior? Usually I see the 290X/390 getting better performance than the 970, but I would assume that the 380X basically *matching* the 970 would make a bit more buzz.

BTW, is the link down? I get a 404 now.

Yes, it's giving a 404 error for me too, but you can use Google's cache to see the original article.

Also weird is that the 980 Ti is only 25% faster than the 980.

[Charts: Ashes of the Singularity multi-GPU results at 1080p and 2160p; DX11 vs. DX12 comparisons at 1080p and 2160p]
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
...

Also weird is that the 980 Ti is only 25% faster than the 980.

...

Why in the world is that weird? In general a reference 980 Ti is only 20-25% faster than a reference 980:

18% faster at 1080P:
[Chart: relative GPU performance at 1920×1080]


25% faster at 4k:
[Chart: relative GPU performance at 3840×2160]
 

Azix

Golden Member
Apr 18, 2014
1,438
67
91
Edit: with the results above, we see the Fury X destroying the 980 Ti. Async is definitely in play. Definitely not just 5 fps but 10 in this case. Need more sites testing, though.

Hi there. Lead Designer here.
Beta 2 of Ashes will make heavy use of Async compute.
On GPU X: We get 30fps with Async off and 35 with it on. So it's a pretty big deal.
However, on GPU Y: We're not seeing any difference but the vendor is stating that they will have a driver soon that will show a difference.

https://www.reddit.com/r/Ashesofthe...devs_async_compute_usage_verification/d0044bg

Seems legit.

30 to 35 (a ~17% gain) is sort of a big deal, but I expected more. We'll see what Nvidia comes up with; the driver has been months in the making.
 
Last edited:

zlatan

Senior member
Mar 15, 2011
580
291
136
That's quite a claim, that this is "pretty normal".

Are there any other DX12 engines that exhibit this kind of behavior? Usually I see the 290X/390 getting better performance than the 970, but I would assume that the 380X basically *matching* the 970 would make a bit more buzz.

BTW, is the link down? I get a 404 now.

I don't know if there is another engine with well-optimized multi-engine code, but my own multi-engine sample code gives nearly the same results. The reason is simple: some architectures are better suited to the D3D12 multi-engine design, while other architectures are structured for single-engine API designs like D3D11.

Don't worry, Pascal will have better multi-engine support. Nearly as good as GCN.
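To picture what "multi-engine" means in practice: D3D12 exposes separate direct, compute and copy queues, one per hardware engine. A minimal C++ setup sketch (an illustration of the API, not the sample code mentioned above):

```cpp
// Minimal D3D12 multi-engine setup: one queue per engine type.
// Illustration only; error handling omitted, `device` assumed to be
// a valid ID3D12Device*.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateEngineQueues(ID3D12Device* device,
                        ComPtr<ID3D12CommandQueue>& graphicsQueue,
                        ComPtr<ID3D12CommandQueue>& computeQueue,
                        ComPtr<ID3D12CommandQueue>& copyQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    // 3D engine: accepts graphics, compute and copy work.
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));

    // Compute engine: can run concurrently with the 3D queue on
    // hardware that supports it (the whole point of async compute).
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));

    // Copy engine: DMA transfers alongside rendering.
    desc.Type = D3D12_COMMAND_LIST_TYPE_COPY;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&copyQueue));
}
```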
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
Thanks, I had the notion that the Ti was on average 30-35% faster...

The 980 Ti does seem to overclock a fair bit better than the 980, so OC vs. OC the 30-35% span might hold true (although this is of course always hard to say with any level of certainty given the variability of overclocking).
 

linkgoron

Platinum Member
Mar 9, 2005
2,588
1,234
136
I don't know if there is another engine with well-optimized multi-engine code, but my own multi-engine sample code gives nearly the same results. The reason is simple: some architectures are better suited to the D3D12 multi-engine design, while other architectures are structured for single-engine API designs like D3D11.

Don't worry, Pascal will have better multi-engine support. Nearly as good as GCN.

I'm not worried; I have an R9 390, and the improved AMD results are quite encouraging.

However, the 380X giving similar results to a 970 is still quite surprising, IMO. I expected the 970 to perform much better. Then again, the game is still in beta and we're still in the early days of DX12, so I assume we'll have to wait and see how drivers and further development change the performance of the cards.
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
No, "caring about" means that they do something for them.
They dont need a driver which is doing AS to improve performance over DX11. It is their job to optimize per hand for nVidia hardware. Otherwise a new driver wont do anything.

And without a NDA they can show the world how great AMD is - over and over and over again. This will be the third (or fourth) time that reviewers will use this "game" to showcase DX12. Even 9 months after the initial release nVidia cards dont take advantages of DX12.

I already explained it all to you in previous posts. I've been explaining this since last August.

GCN is under-utilized in DX11; Maxwell is also under-utilized.

The problem is that while GCN can achieve better utilization under DX12, Maxwell's lack of asynchronous compute + graphics support means it cannot achieve better compute utilization.

All in all, though, GCN is at least two years ahead of Maxwell. GCN is a better architecture. This isn't fanboyism talking; it's an objective truth.
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
Wow, this doesn't make any sense.
You don't optimize for DX12. You optimize for the hardware. That's the only reason why somebody would want to use a low-level API.

Otherwise you let the driver and a high-level API do the job. :\

If your hardware supports the API at the HARDWARE LEVEL, then there is no problem.

If your hardware is unable to execute API features at the hardware level, that's when you have to optimize for the hardware, e.g. asynchronous compute context switching on Nvidia hardware, and vendor-specific code paths in the application.

Simple as it can be.
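For illustration, the kind of vendor-specific branching being described boils down to something like this (the PCI vendor IDs are standard; the policy function is hypothetical, not any shipping engine's logic):

```cpp
// Sketch of a vendor-specific code path: read the adapter's PCI
// vendor ID and branch on it.
#include <dxgi.h>

enum class GpuVendor { Nvidia, Amd, Intel, Other };

GpuVendor DetectVendor(IDXGIAdapter1* adapter)
{
    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);
    switch (desc.VendorId) {
        case 0x10DE: return GpuVendor::Nvidia;  // NVIDIA
        case 0x1002: return GpuVendor::Amd;     // AMD/ATI
        case 0x8086: return GpuVendor::Intel;   // Intel
        default:     return GpuVendor::Other;
    }
}

// Hypothetical gate matching the discussion: use a separate compute
// queue only where concurrent execution is known to be a win.
bool UseSeparateComputeQueue(GpuVendor vendor)
{
    return vendor == GpuVendor::Amd;
}
```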
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
I already explained it all to you in previous posts. I've been explaining this since last August.

GCN is under-utilized in DX11; Maxwell is also under-utilized.

The problem is that while GCN can achieve better utilization under DX12, Maxwell's lack of asynchronous compute + graphics support means it cannot achieve better compute utilization.

Neither Vulkan nor DX12 is just async compute bolted onto the previous API. They are whole new APIs which give explicit control over the GPU (memory management, draw calls, etc.). Without optimization they are not better.

If your hardware supports the API at the HARDWARE LEVEL, then there is no problem.

If your hardware is unable to execute API features at the hardware level, that's when you have to optimize for the hardware, e.g. asynchronous compute context switching on Nvidia hardware, and vendor-specific code paths in the application.

Simple as it can be.

Right, Oxide has designed an engine which is not optimized for nVidia, and they are doing nothing but disabling certain code paths. Isn't this what I am writing? :D
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
I am sorry, but I'm done. I do not have the time, or the will, to explain everything again and again to you.

You can go back to every post that Mahigan and Zlatan wrote on this topic on this forum. That's the best way for you to understand it.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
You haven't explained anything. In fact, you wrote things which don't make sense in the context of these APIs, like saying "write something for the DX12 API".

You should go and read what Croteam has written about Vulkan. They have made it clear that the developer is responsible for nearly everything and has to know what to do.

The fact stands that this is the fourth version of the game which doesn't improve performance on nVidia hardware with DX12 over DX11. Obviously this developer doesn't care as much as they care about the paycheck from AMD supporting their development.
 

Mahigan

Senior member
Aug 22, 2015
573
0
0
Neither Vulkan nor DX12 is just async compute bolted onto the previous API. They are whole new APIs which give explicit control over the GPU (memory management, draw calls, etc.). Without optimization they are not better.



Right, Oxide has designed an engine which is not optimized for nVidia, and they are doing nothing but disabling certain code paths. Isn't this what I am writing? :D
DX12 adds support for multi-threaded rendering, multi-engine, multi-adapter, etc.

NVIDIA already made good use of multi-threaded command lists and deferred rendering under DX11. NVIDIA used hidden driver threads to boost CPU performance and thus lower their API overhead, which resulted in a higher draw-call rate.

This is achievable due to a few design perks. One is that since Kepler, NVIDIA has been making use of static scheduling, meaning a large part of the scheduling is done in software; NVIDIA multi-threaded their driver, so several threads were feeding NVIDIA's hardware. Secondly, the GigaThread engine allows larger batches of commands to be fed to NVIDIA GPUs; in essence, NVIDIA GPUs have a much larger command buffer.

This is why you're under the illusion that Maxwell is superior to GCN: you never factored in the additional API overhead GCN incurs under DX11 due to its use of hardware scheduling.
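The application-visible half of that DX11 pattern is deferred contexts and command lists. A simplified sketch of how it looks (the driver's hidden worker threads sit below this level):

```cpp
// Sketch of DX11 multi-threaded rendering with deferred contexts:
// worker threads record command lists, the immediate context replays
// them. Simplified; real code needs per-thread contexts and sync.
#include <d3d11.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Runs on a worker thread: record draws into a deferred context.
ComPtr<ID3D11CommandList> RecordWork(ID3D11Device* device)
{
    ComPtr<ID3D11DeviceContext> deferred;
    device->CreateDeferredContext(0, &deferred);

    // ... set state and issue draw calls on `deferred` here ...

    ComPtr<ID3D11CommandList> commandList;
    deferred->FinishCommandList(FALSE, &commandList);
    return commandList;
}

// Runs on the render thread: replay the pre-recorded commands.
void Submit(ID3D11DeviceContext* immediate, ID3D11CommandList* list)
{
    immediate->ExecuteCommandList(list, FALSE);
}
```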

Moving on to Vulkan and DX12, this API overhead has been lifted, giving us a glimpse of the true performance of GCN, Kepler and Maxwell.

On top of this, GCN is a much more highly threaded (more parallel) architecture, so GCN has a lot of untapped resources on hand. By using asynchronous compute, you can maximize your application's usage of GCN.
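At the API level, tapping those resources with async compute looks roughly like this (a simplified sketch, reusing the queue setup from the earlier snippet; a real renderer rotates fence values per frame):

```cpp
// Sketch of async compute in D3D12: submit compute work on its own
// queue and express the dependency with a fence so the wait happens
// on the GPU timeline rather than the CPU.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void OverlapComputeWithGraphics(ID3D12Device* device,
                                ID3D12CommandQueue* computeQueue,
                                ID3D12CommandQueue* graphicsQueue,
                                ID3D12CommandList* computeWork,
                                ID3D12CommandList* graphicsWork)
{
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Kick the compute work; on hardware with concurrent queues it
    // can overlap whatever the 3D engine is already doing.
    computeQueue->ExecuteCommandLists(1, &computeWork);
    computeQueue->Signal(fence.Get(), 1);

    // Graphics work that consumes the compute result waits on the
    // GPU, not the CPU, so both engines stay busy in the meantime.
    graphicsQueue->Wait(fence.Get(), 1);
    graphicsQueue->ExecuteCommandLists(1, &graphicsWork);
}
```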

When you couple both together, GCN overpowers Kepler and Maxwell. Why? Because GCN is a better architecture.


Maxwell may have the upper hand in terms of its front end; compute-wise, however... Maxwell isn't even close.
 

Paul98

Diamond Member
Jan 31, 2010
3,732
199
106
Results are as expected for removing the CPU overhead and using async compute. Also GCN has a good bit more compute power than Maxwell.
 

Hitman928

Diamond Member
Apr 15, 2012
6,637
12,218
136
I am sorry, but I'm done. I do not have the time, or the will, to explain everything again and again to you.

You can go back to every post that Mahigan and Zlatan wrote on this topic on this forum. That's the best way for you to understand it.

Willful ignorance, it exists.
 

Dygaza

Member
Oct 16, 2015
176
34
101
Where? This small gain at 1080p? :thumbsup:
The AMD card improves over 2x with DX12 over DX11. The nVidia hardware sits at 3.9% or so. Very realistic for a low-level API which is optimized for nVidia hardware.

There isn't a difference in 4K. AMD, on the other hand, jumps 60% in 4K.

Here are screenshots of AMD under DX11 vs. DX12 with the MSI Afterburner overlay attached.

DX11:
[Screenshot: Fury X GPU usage under DX11, Extreme preset]

DX12

[Screenshot: Fury X GPU usage under DX12, Extreme preset]


Notice, this isn't the patch that adds more async compute. Do you understand now where the majority of AMD's gain over DX11 comes from?

Nvidia doesn't get this performance uplift since they are already feeding their GPU to nearly 100% capacity under DX11.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
It is not about AMD. I'm talking about nVidia. This developer has stated that they need a low-level API to create this engine and game. So where is the performance improvement on nVidia hardware? What happened to Star Swarm, anyway?!

It is obvious that this is a marketing deal between them and AMD. Their beta versions have no NDA (unlike Fable Legends, for example) and they push them to reviewers every few months. Instead of optimizing their engine for nVidia users, they are just using their "work" to go after them. You don't even hear anything about the game; it is only about DX12 and how great AMD is.

I don't even know why anybody with nVidia hardware should support this developer.