computerbaseAshes of the Singularity Beta1 DirectX 12 Benchmarks

Mahigan · Feb 28, 2016

ThatBuzzkiller said:
The reason why Rise of the Tomb Raider doesn't feature DX12 is because they weren't finished implementing it ...

Not because Nvidia paid them to rip it out ...

You need to know that DX12 on the Xbox One is NOT the same as DX12 on PC! Both have different compilers and different exposed feature sets!

Partially true,

But the differences are minimal compared to porting it onto DX11.

This image assumes PS4 and DX11 console code hence low-medium. DX12 code takes a rather low effort.

This doesn't mean NVIDIA themselves forced the non-adoption of DX12 (though possible if you take a glance and their history with Ubisoft). Another possibility is that SquareEnix wanted the extra sales that DX11 would bring them (Windows 10 adoption rate comes to mind).

The end result is a game that could have run smoother and allowed for 4K gaming on a single GPU being abandoned for the game we got.

Oh and Purehair/TressFX can run over Asynchronous compute. That could have been a nice feature.

Mahigan · Feb 28, 2016

And for those who think that Asynchronous compute is something which you add to a game in order to intentionally cripple a GPU..

Basically, you mark a shader and if the GPU supports the feature, it executes concurrently and in parallel to the Graphics tasks.

If your GPU doesn't support that, it gets added into the main Graphics queue.

And since Oxide provided and open access to the source code to IHVs (AMD/NVIDIA) then both can optimize the shader code for their architectures.

So in the end, what hurts NVIDIA isn't intentional gimping by Oxide but rather their lack of Asynchronous compute + graphics support.

Bacon1 · Feb 28, 2016

Mahigan said:
Bacon1 was right.

Did you mean about deferred command lists or something else?

Nice write up though your posts are always very informative and have clear examples to back them up

TheELF · Feb 28, 2016

Mahigan said:
Yeah,

I mean Ashes of the Singularity is 80/20 (Graphics/Compute) as per Kollock. The next iteration of the Nitrous engine will be 50/50 as per Kollock.

Wouldn't that slow down the graphics considerably?

Mahigan said:
So in the end, what hurts NVIDIA isn't intentional gimping by Oxide but rather their lack of Asynchronous compute + graphics support.

In the end why don't they let the user decide?
With physicx for example you can let your CPU handle it if you like,why in the world do they try to compute everything on the GPU?I would love to see some CPU utilization % for this game(in-game not for the bench where we already saw GPU bound of ~50% and more) because in the end if they don't even try to utilize the CPU it is just a showcase for AMD graphics cards and nothing more.

ShintaiDK · Feb 28, 2016

Magee_MC said:
I wish that I was able to post benchmarks. So far every time I try to run the benchmark program the game crashes my entire system.

I've got a 2500K and a R9 290 and I can't get farther than 30 seconds or so into the bench before my whole system locks up. I've tried both the 16.2 drivers and the 15.12 ones, DX11 and DX12, both with my normal OC of 4.3 and at stock to no avail.

Hopefully I'll be able to figure out what the problem is because the game definitely looks interesting.

Try down clock your 290 and test again.

sontin · Feb 28, 2016

Mahigan said:
Basically, you mark a shader and if the GPU supports the feature, it executes concurrently and in parallel to the Graphics tasks.

You dont mark shaders you put commands into an additional queue to not go through the graphics pipeline. This is marked by the command queue type or in Vulkan by bits.

If your GPU doesn't support that, it gets added into the main Graphics queue.

Wrong. It gets executed by the graphics engine.

And since Oxide provided and open access to the source code to IHVs (AMD/NVIDIA) then both can optimize the shader code for their architectures.

Shader optimizing has nothing to do with low level API. Most performance issues come from memory management, multi thread rendering, synchronisation of queues, fences etc. The driver cant do anything to it. It is the job of the developer to optimize it for every architecture.

So in the end, what hurts NVIDIA isn't intentional gimping by Oxide but rather their lack of Asynchronous compute + graphics support.

What hurts nVidia is the intentional gimping by Oxide. Saying otherwise would mean that developers are not responsible for the performance with low level APIs. :\

ShintaiDK · Feb 28, 2016

Mahigan said:
And since Oxide provided and open access to the source code to IHVs (AMD/NVIDIA) then both can optimize the shader code for their architectures.

So in the end, what hurts NVIDIA isn't intentional gimping by Oxide but rather their lack of Asynchronous compute + graphics support.

Since you and Kollock seems to know one another. And you can get him into discussions on a forum on your invite. How about you come up with some sources that isn't AMD or Oxide.

All I read is based on AOTS. If AOTS isn't going to be the standard of DX12, but rather than outliner that wont show the norm. Then someone is going to look very foolish and biased.

Kollock outright defended the known CPU disaster known as FX series as well.

Silverforce11 · Feb 28, 2016

sontin said:
What hurts nVidia is the intentional gimping by Oxide. Saying otherwise would mean that developers are not responsible for the performance with low level APIs. :\

Technically this is true.

Because Oxide enabled AC on NV GPU by default, it hurts NV GPUs since they can't handle it.

What they should do is disable it or even disable DX12 on NV, then let NV optimize the game in DX11 mode.

Leadbox · Feb 28, 2016

ShintaiDK said:
Since you and Kollock seems to know one another. And you can get him into discussions on a forum on your invite. How about you come up with some sources that isn't AMD or Oxide.

I would suggest you do the same, bring sources that aren't just your skepticism.

ShintaiDK · Feb 28, 2016

Leadbox said:
I would suggest you do the same, bring sources that aren't just your skepticism.

Fable was already dismissed as not being async compute enough. Only the AMD sponsored Oxide is good enough as the sole source of information it seems.

We can try see how the game is set when DX12 releases comes out. Quantum Break will be the first full game. Then we can see if putting all the eggs in the same basket was a good idea or not.

Ext3h · Feb 28, 2016

Silverforce11 said:
Technically this is true.

Because Oxide enabled AC on NV GPU by default, it hurts NV GPUs since they can't handle it.

What they should do is disable it or even disable DX12 on NV, then let NV optimize the game in DX11 mode.

You are assuming that Oxide makes the decision which API and which features should be picked per default for each vendors hardware platform.

Hint: That is not how it works. That's why both vendors are sending engineers over to ensure that the game runs on their hardware as good as possible.

Further, from an PR standpoint, ask yourself what's worse: "We can't handle that feature" or "We support that feature, even if it gains us nothing"?

ShintaiDK said:
Fable was already dismissed as not being async compute enough.

Fable was mostly dismissed for being optimized by other means, achieving a much better utilization even without AC. Unless we had absolute numbers, accurately representing the actual hardware utilization, we can't say for sure if additional gains could have been possible or not. After all, the performance was already pretty impressive on hardware from both vendors.

Leadbox · Feb 28, 2016

ShintaiDK said:
Fable was already dismissed as not being async compute enough. Only the AMD sponsored Oxide is good enough as the sole source of information it seems.

We can try see how the game is set when DX12 releases comes out. Quantum Break will be the first full game. Then we can see if putting all the eggs in the same basket was a good idea or not.

Yes we wait for more games, but that shouldn't stop us from discussing or speculating on what we see now, it's what forums are for.
If there was a dx 11 path for Fable, I bet we would be seeing the same and that is little to no performance gain on nvidia going to dx12.

ShintaiDK · Feb 28, 2016

Leadbox said:
Yes we wait for more games, but that shouldn't stop us from discussing or speculating on what we see now, it's what forums are for.
If there was a dx 11 path for Fable, I bet we would be seeing the same and that is little to no performance gain on nvidia going to dx12.

The discussion seems to be very one sided tho doesn't it. Its quite clear what you and some others hope for.

sontin · Feb 28, 2016

Ext3h said:
Fable was mostly dismissed for being optimized by other means, achieving a much better utilization even without AC. Unless we had absolute numbers, accurately representing the actual hardware utilization, we can't say for sure if additional gains could have been possible or not. After all, the performance was already pretty impressive on hardware from both vendors.

AMD promoted Asynchronous Shaders in Fable Legends and Microsoft said that this game is "gpu bound":

Fable Legends pushes the envelope of what is possible in graphics rendering. It is also particularly representative of most modern AAA titles in that performance typically scales with the power of the GPU. The CPU overhead in these games is typically less of a factor, and, because the rendering in the benchmark is multithreaded, it should scale reasonably well with the number of cores available. On a decent CPU with 4-8 cores @ ~3.5GHz, we expect you to be GPU-bound even on a high-end GPU.

http://www.pcper.com/reviews/Graphi...-Benchmark-DX12-Performance-Testing-Continues

And here is the part about Async Compute:

Dynamic GI is the cost of our dynamic LPV-based global illumination (see http://www.lionhead.com/blog/2014/april/17/dynamic-global-illumination-in-fable-legends/). Much of this work runs with multi-engine, which reduces the cost.

Compute shader simulation and culling is the cost of our foliage physics sim, collision and also per-instance culling, all of which run on the GPU. Again, this work runs asynchronously on supporting hardware.

AMD published numbers and here is a comparision with the GTX980TI:

872879d1451935193-amd-polaris-erster-exklusiver-blick-auf-chip-und-architektur-fuer-2016er-radeons-fury-x-980.jpg

/edit: Asynchronous Shaders makes the Fury X 15% faster.

Zanovar · Feb 28, 2016

ShintaiDK said:
The discussion seems to be very one sided tho doesn't it. Its quite clear what you and some others hope for.

Indead.its very clear.

Leadbox · Feb 28, 2016

ShintaiDK said:
The discussion seems to be very one sided tho doesn't it. Its quite clear what you and some others hope for.

One side has verifiable data and the other has doubts, it's bound to be one sided and don't worry, our hopes(whatever they are) won't sway you one bit now will they?

prtskg · Feb 28, 2016

guskline said:
Early on it this thread I posted my Benchmarks for both of my rigs below. I have now updated to Beta 2 and updated my Radeon drivers with the latest hotfix.

Ashes now supports multiple gpus so now both R9 290s are running in CF (previously a single R9 290 produced 36.4 in DX11 and 39.3 in DX12).

Now the scores for the 4790K @4.7Ghz and 2 R9 290s in CF is:

DX12 only as DX11 does not support CF 58.6fps overall while the single GTX980TI produces 45.4fps in DX12 and 46.4 fps in DX11.

Is there any difference in picture quality like it was last time you benchmarked? I mean Nvidia has already said that they haven't implemented async compute yet and the game uses it to improve picture quality.

It is the job of the developer to optimize it for every architecture.

The job of developer is to code according to api and job of IHV is to provide gpu conforming to api.

sontin · Feb 28, 2016

prtskg said:
The job of developer is to code according to api and job of IHV is to provide gpu conforming to api.

Doesnt make sense with Vulkan and DX12.

https://developer.nvidia.com/transitioning-opengl-vulkan

raghu78 · Feb 28, 2016

sontin said:
Doesnt make sense with Vulkan and DX12.

https://developer.nvidia.com/transitioning-opengl-vulkan

the problem is nothing makes sense to you if its not to Nvidia's benefit. Async compute works better on AMD. Thats a fact as we have verifiable data from games. btw AoS is just the start. Some of the async compute stuff developers are doing on PS4 and Xbox One is far more aggressive. So if you want to keep whining suit yourself you will have lot of opportunity in future too. Maybe if Pascal handles async compute much better you will stop. Till then have fun.

Dygaza · Feb 28, 2016

TheELF said:
In the end why don't they let the user decide?
With physicx for example you can let your CPU handle it if you like,why in the world do they try to compute everything on the GPU?I would love to see some CPU utilization % for this game(in-game not for the bench where we already saw GPU bound of ~50% and more) because in the end if they don't even try to utilize the CPU it is just a showcase for AMD graphics cards and nothing more.

You actually can disable async in current build.

AsyncComputeOff=1 in settings file

Paul98 · Feb 28, 2016

Anyone run that dx12 asteroid demo recently? What sort of numbers do you get with what setup?

sontin · Feb 28, 2016

raghu78 said:
the problem is nothing makes sense to you if its not to Nvidia's benefit.

This picture shows the difference between API OpenGL and API Vulkan. If you would "develop" just to the API there wouldnt be any difference at all.

Async compute works better on AMD. Thats a fact as we have verifiable data from games. btw AoS is just the start. Some of the async compute stuff developers are doing on PS4 and Xbox One is far more aggressive. So if you want to keep whining suit yourself you will have lot of opportunity in future too. Maybe if Pascal handles async compute much better you will stop. Till then have fun.

And Fable Legends doesnt count?

You have only facts from a biased game developed by a biased company which will defend always their sponsor - look at the response to the Guru3D story about the display problem of frames.

Paul98 · Feb 28, 2016

sontin said:
This picture shows the difference between API OpenGL and API Vulkan. If you would "develop" just to the API there wouldnt be any difference at all.

What the heck do you think you are trying to say here?

USER8000 · Feb 28, 2016

Paul98 said:
What the heck do you think you are trying to say here?

God knows. I am running a GTX960 4GB as a stop gap card until Polaris and Pascal are out,and he seems really annoyed that AMD is doing better in this game than Nvidia. Heck,even in Fable Legends the R9 285 was doing better too than my GTX960 according to Anandtech and The Techreport.

Meh,so what - its part of being an enthusiast,otherwise you might as well buy a console and not worry about these kind of things.

My 8800GTS(and the 8800GT/9800GT) lasted longer for gaming than an HD3870,the 9500 PRO was better than a mates FX 5600XT longterm,the 6800GT was better than the equivalent X800 series longterm,X1950 PRO a bit better than the 7900GS,etc.

Anybody who has been an enthusiast for a while would realise,companies make bets on microarchitechures years ago,and sometimes longterm these bets work and sometimes they don't.

parvadomus · Feb 28, 2016

sontin said:
This picture shows the difference between API OpenGL and API Vulkan. If you would "develop" just to the API there wouldnt be any difference at all.

And Fable Legends doesnt count?
You have only facts from a biased game developed by a biased company which will defend always their sponsor - look at the response to the Guru3D story about the display problem of frames.

Just like with DX11, the developer must optimize for different architectures, at certain points (I remember old architectures having to unroll loops of shader programs cause of lack of hw flow control, also you always had to be careful the amount of branches a shader had. that kind of things could kill performance for certain architectures), but the API is the same for everyone, you wont never code at assembler level, neither for different instruction sets :\
Here we have things like crapworks..

In DX12 its the same, but you have control over even more things.. I suppose the API has functions like "give me the number of available queues", I get it and then use the amounts I want. Or if the hw supports async compute or not. You always have to think how the code will perform for all platforms. If one platform does not work OK for something (like async compute) then at least give the option to disable it, you cant do much more for inferior architectures.

computerbaseAshes of the Singularity Beta1 DirectX 12 Benchmarks

Senior member

Senior member

Diamond Member

Diamond Member

Lifer

Diamond Member

Lifer

Lifer

Senior member

Lifer

Junior Member

Senior member

Lifer

Diamond Member

Diamond Member

Senior member

Senior member

Diamond Member

Diamond Member

Member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Senior member