Ashes of the Singularity User Benchmarks Thread

Page 41

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
This is what is being circulated as the problem/fix with async compute on nVidia vs. AMD
SOURCE

I don't know why you persist in using Mahigan as a source, when he has been so discredited.

So "supports async compute" seems to be not so straightforward with nVidia, as many of the tasks are being done by the CPU. Tasks that should be done by the GPU, but that Maxwell 2 (or Maxwell, Kepler, Fermi) can't handle.

You've completely swallowed AMD's marketing propaganda hook, line, and sinker. I told you before, there is no actual specification for asynchronous compute in DX12, so saying that the scheduling should be handled by the GPU is nonsensical.

IHVs should implement whatever works best for their particular architecture. Apparently, neither Intel nor NVidia think that hardware schedulers are appropriate for their current DX12 GPUs.
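
To make that concrete: at the API level, D3D12 only lets an application create independent command queues and submit work to them; nothing in the API dictates where the scheduling behind those queues happens. A minimal C++ sketch (error handling omitted, names purely illustrative):

#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create one graphics ("direct") queue and one compute queue. D3D12 only
// requires that the two queues behave as logically independent streams of
// work; whether a hardware scheduler, the driver, or the CPU decides how they
// are interleaved on the GPU is left entirely to the implementation.
void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& gfxQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;      // graphics + compute + copy
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));

    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute + copy only
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));
}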

I'm curious to see how effective this is. The whole purpose of async compute is to use resources on the GPU more efficiently. Is this a truly efficient process, or will having the CPU involved cause slowdowns?

I can play BF4 at well over 100 FPS at 1440p, and the Frostbite 3 engine makes ample use of compute shaders for lighting and post-processing effects, yet relies on the CPU for scheduling.

Contrast that with the PS4, which does use asynchronous compute in BF4 and can barely maintain 60 FPS at 900p, even with substantially lowered settings.

So apparently there must not be much of a slowdown :biggrin:
 

railven

Diamond Member
Mar 25, 2010
6,604
561
126
Some may feel that way, but I assure you, those waiting in the wings are those in my position. I'm running an Intel 2400 and a 270.

Looking at the posters, if it were as simple as AMD gaining a boost - kudos. But the notion of Maxwell 2 not doing asynchronous compute was the cherry they wanted so badly. It was no longer about AMD having the advantage; now it was about Nvidia lying, etc.

Those constantly upgrading probably won't care. Just remember, the market percentage of those willing to shell out cash every year is dwindling and will be for quite some time.

As someone who upgrades every generation cycle, I didn't care. If things were "normal" I'd have gone through a 290X and be on a Fury X right now. If AMD gets that advantage and delivers on their expectations (I'm not even talking about forum hype anymore, but what I determine to be AMD's goals), you'll see me rocking an AMD card next upgrade cycle.

I was going to upgrade to a 970 but decided to wait. I'm glad I waited, as the 3.5GB issue showed itself. More recently, I almost upgraded to a 290, but decided against it since DX12 is coming and I should just wait to see some preliminary results.

In conclusion, not everyone is a partisan hack. Just remember that.

The 3.5GB issue was a doozy. And Nvidia hopefully caught enough flak for it. While I'm not trying to defend Nvidia, I feel companies like Amazon basically absorbed the hit that Nvidia should have taken. Everyone I knew who got a 970 from Amazon took them up on their 20% refund offer. That made their 3.5GB card basically cost ~$260.

(I'm always reminded of how we as consumers try to exploit every little mistake these companies make, but in fairness, eff em!)
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
The devs will use whatever is given to them by the IHVs and ISVs like Microsoft and the Khronos Group. The devs innovate on their own by coming up with new and more efficient algorithms for visual effects.

As an example, Lionhead Studios came up with a new solution for dynamic global illumination that will be used in Fable Legends and was actually integrated into Unreal Engine 4.




It was probably buggy, seeing as it was in alpha. But yes, you're right that the main issue with NVidia's performance had to do with NVidia themselves.

You might have missed my point: in order to innovate, there has to be hardware support first. If not, the industry would be stuck with only faster versions of old techniques.

E.g., how can one author HSA software without the requisite hardware? So while devs are researching novel approaches to problems, the hardware will be there, sitting dormant, probably for a few generations before it is ever deployed to end users.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
I don't know why you persist in using Mahigan as a source, when he has been so discredited.

You've completely swallowed AMD's marketing propaganda hook, line, and sinker. I told you before, there is no actual specification for asynchronous compute in DX12, so saying that the scheduling should be handled by the GPU is nonsensical.

IHVs should implement whatever works best for their particular architecture. Apparently, neither Intel nor NVidia think that hardware schedulers are appropriate for their current DX12 GPUs.

I can play BF4 at well over 100 FPS at 1440p, and the Frostbite 3 engine makes ample use of compute shaders for lighting and post-processing effects, yet relies on the CPU for scheduling.

Contrast that with the PS4, which does use asynchronous compute in BF4 and can barely maintain 60 FPS at 900p, even with substantially lowered settings.

So apparently there must not be much of a slowdown :biggrin:

I could understand your POV for a lot of your replies, but the bit about the PS4 is a little silly. What GPU are you using to compare with the PS4 APU?
 

tential

Diamond Member
May 13, 2008
7,355
642
121
I don't know why you persist in using Mahigan as a source, when he has been so discredited.

You've completely swallowed AMD's marketing propaganda hook, line, and sinker. I told you before, there is no actual specification for asynchronous compute in DX12, so saying that the scheduling should be handled by the GPU is nonsensical.

IHVs should implement whatever works best for their particular architecture. Apparently, neither Intel nor NVidia think that hardware schedulers are appropriate for their current DX12 GPUs.

I can play BF4 at well over 100 FPS at 1440p, and the Frostbite 3 engine makes ample use of compute shaders for lighting and post-processing effects, yet relies on the CPU for scheduling.

Contrast that with the PS4, which does use asynchronous compute in BF4 and can barely maintain 60 FPS at 900p, even with substantially lowered settings.

So apparently there must not be much of a slowdown :biggrin:

Did you just compare a PS4 to GTX 980 SLI?

Surely this can't be serious right?

2 x GTX 980 SLI for 1440p 100+ FPS
1 x HD7870 caliber GPU for 900p 60 fps?

You're just proving the point that the console is far more efficient, and that if you scaled that GPU up to even just a single higher-end card on the console, you'd be far outperforming those SLI GTX 980s...

You want to talk about discrediting people? You're doing it to yourself very fast right now.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
You might have missed my point: in order to innovate, there has to be hardware support first. If not, the industry would be stuck with only faster versions of old techniques.

Yes, what you say is true, but historically hardware has nearly always evolved faster than software.

How much of our hardware is actually used by developers? Very little, whether they are limited by the API, funding, or willpower.

DX11.2 had some nice features that were never used at all, like tiled resources, for instance. Let the IHVs continue to innovate when it comes to hardware, because much of it will likely never be used.
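
For what it's worth, exposing a feature and shipping content that actually uses it are very different things: an application can ask the driver about DX11.2 tiled resources support with a single query, and almost nothing ever did. A minimal C++ sketch (assuming an already-created device; the helper name is made up):

#include <windows.h>
#include <d3d11_2.h>

// Ask the driver which tiled resources tier, if any, the hardware exposes.
bool SupportsTiledResources(ID3D11Device* device)
{
    D3D11_FEATURE_DATA_D3D11_OPTIONS1 opts = {};
    if (FAILED(device->CheckFeatureSupport(D3D11_FEATURE_D3D11_OPTIONS1,
                                           &opts, sizeof(opts))))
        return false;
    return opts.TiledResourcesTier != D3D11_TILED_RESOURCES_NOT_SUPPORTED;
}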

E.g., how can one author HSA software without the requisite hardware? So while devs are researching novel approaches to problems, the hardware will be there, sitting dormant, probably for a few generations before it is ever deployed to end users.

I agree. But perhaps I misunderstood your OP, because it seemed like you were saying developers could only innovate if new hardware is available.

That's what I disagree with. The example with the dynamic global illumination is a good one. It's about developers finding new ways to combat an old problem.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Did you just compare a PS4 to GTX 980 SLI?

Surely this can't be serious right?

2 x GTX 980 SLI for 1440p 100+ FPS
1 x HD7870 caliber GPU for 900p 60 fps?

Of course I wasn't comparing them directly. My comment was mostly tongue in cheek, because 3DVagabond suggested that involving the CPU would cause slowdowns.

Well, BF4 on the PS4 uses asynchronous compute with GNM (a low-level, parallel API), but it can't maintain the 60 FPS threshold, even at 900p and with lowered settings. So asynchronous compute doesn't seem to be helping it much there, if you ask me.

Whereas on the PC, BF4 uses DX11.1, an API with high overhead, yet it can still put out blistering frame rates at much higher settings with no performance issues at all.

Why? Because the hardware is just so much more powerful, especially the CPU. Even with a serial, high-overhead API like DX11.1, a powerful desktop CPU like the Core i7 can bypass any handicap imposed on it by the API, the NUMA architecture, and a discrete graphics card.

So my point is, don't underestimate the power of desktop CPUs.

You're just proving the point that the console is far more efficient, and that if you scaled that GPU up to even just a single higher-end card on the console, you'd be far outperforming those SLI GTX 980s...

You missed my point entirely, apparently. I wasn't even talking about the GPUs; it was about the CPU.

You want to talk about discrediting people? You're doing it to yourself very fast right now.

It may seem that way to you, but only because you didn't understand the context of my post.
 
Last edited:

AnandThenMan

Diamond Member
Nov 11, 2004
3,949
504
126
Whereas on the PC, BF4 uses DX11.1, an API with high overhead, yet it can still put out blistering frame rates at much higher settings with no performance issues at all.

Why? Because the hardware is just so much more powerful, especially the CPU.
Why are you still comparing console hardware to a PC?
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Why are you still comparing console hardware to a PC?

For God's sake, man, look at the entire context of the discussion rather than focusing on a couple of sentences. You're worse than a journalist.

I'm not comparing console hardware to a PC. I'm making the point that despite the handicaps the PC platform has compared to the consoles (i.e. high-overhead APIs, NUMA, discrete components), the hardware is so powerful that it can overcome them with ease.

This realization should erase any doubts people might have that using the CPU is a good alternative to hardware schedulers.
 

coercitiv

Diamond Member
Jan 24, 2014
6,214
11,961
136
This realization should erase any doubts people might have that using the CPU is a good alternative to hardware schedulers.
Would you argue using the CPU in this case is the better alternative from a performance/power perspective?
 

Magee_MC

Senior member
Jan 18, 2010
217
13
81
For God's sake, man, look at the entire context of the discussion rather than focusing on a couple of sentences. You're worse than a journalist.

I'm not comparing console hardware to a PC. I'm making the point that despite the handicaps the PC platform has compared to the consoles (i.e. high-overhead APIs, NUMA, discrete components), the hardware is so powerful that it can overcome them with ease.

This realization should erase any doubts people might have that using the CPU is a good alternative to hardware schedulers.

I'd agree that the PC can overcome the additional handicaps compared to the consoles by essentially using a brute-force method, and that while it would take more resources to achieve the same result, it can be done.

However, the question is whether that is a good method when compared to another option that has similar brute force in addition to the benefits of a console-like API. NVIDIA seems to believe that it will be able to use software/drivers, at least in the case of AC, to achieve a satisfactory result.

However, I haven't seen any evidence that NV can match the efficiency of hardware schedulers with software running on the CPU. If they were able to do that, I wouldn't expect them to have the problems that they have with latency in VR, and I certainly would have expected some sort of statement from them on the subject.

NV's silence on the subject is what is allowing this entire question to gain traction. There hasn't been, that I have seen, any comment from them directly while their reputation is getting hammered in the forums and in the press. If they had a way to handle AC that was as good as AMD's hardware implementation, they should have made some statement to the effect of "Don't worry, we've got this."

Since they haven't said anything, that leads me to believe that they might have thought they had a good answer, but it isn't working out as they had thought, and they're trying to figure out a plan B. The fact that they had Oxide pull AC from the pathway instead of saying, "OK, here's the rest of the code to activate AC in DX12" reinforces the perception that they don't have an acceptable solution at the ready.

In either case, this really makes me wonder what is in Pascal. Did NV decide to change the strategy that they've had from Fermi through Kepler and Maxwell, or did they decide to change their architecture? Either they made changes in how they handle graphics and compute or they didn't. Pascal supposedly taped out a while back, so NV would have had to come to that realization well before then in order to make the needed changes.

I see a lot of people saying that it's OK, that even if this is a problem NV will have it fixed in Pascal, but I don't see any evidence that that happened, or even that NV was aware of the need to make it happen, especially in the time frame they would have had to do it in.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I don't think NV needs hardware async compute for DX12 on its desktop dGPUs; they may not even implement it in Pascal.
NV-sponsored DX12 games may not even implement AC, so no problem for NV here.

But because AC will have a tremendous effect in mobile (tablets and phones) and a smaller impact in laptops, they may try to implement hardware-based AC like the one in AMD's GCN.
But since NV is not pursuing the mobile market anymore, they may not need to implement AC after all for an x86-only market.
Hell, it's very difficult to predict at this point whether Pascal will have hardware AC; they may try something like GCN 1 with only 2 or 4 ACEs on Pascal.

On the other hand, AMD needs it for HSA, and they will continue to implement hardware-based AC throughout their product portfolio.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I can play BF4 at well over 100 FPS at 1440p, and the Frostbite 3 engine makes ample use of compute shaders for lighting and post-processing effects, yet relies on the CPU for scheduling.

Contrast that with the PS4, which does use asynchronous compute in BF4 and can barely maintain 60 FPS at 900p, even with substantially lowered settings.

So apparently there must not be much of a slowdown :biggrin:

I would suggest you downclock your CPU to 1.6GHz, try playing BF4 again, and see what happens ;)
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Would you argue using the CPU in this case is the better alternative from a performance/power perspective?

I don't know enough about this to make such a conclusion. Only an NVidia engineer would know at this point, since they are the ones implementing it in their drivers.

I would only argue that the CPU is a competent choice, because it's been handling this kind of workload for years and getting faster at it.

Now with DX12/Vulkan, CPUs will have even more power for developers/IHVs to tap into, courtesy of the lower overhead and parallel rendering.
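
As a rough illustration of the parallel rendering part (a hedged sketch, not taken from any particular engine; the function name and the worker split are made up): in D3D12 each thread can record into its own command list and allocator, and the main thread submits them in one batch, so the CPU-side cost is spread across cores instead of sitting on a single driver thread.

#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
#include <thread>
#include <vector>
using Microsoft::WRL::ComPtr;

void RecordInParallel(ID3D12Device* device, ID3D12CommandQueue* queue,
                      unsigned workerCount)
{
    std::vector<ComPtr<ID3D12CommandAllocator>> allocators(workerCount);
    std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(workerCount);
    std::vector<std::thread> workers;

    for (unsigned i = 0; i < workerCount; ++i) {
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                       IID_PPV_ARGS(&allocators[i]));
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                  allocators[i].Get(), nullptr,
                                  IID_PPV_ARGS(&lists[i]));
        workers.emplace_back([&lists, i] {
            // ...each worker records its own share of draw/dispatch calls...
            lists[i]->Close();
        });
    }
    for (auto& t : workers) t.join();

    // One submission on the main thread; the expensive recording already
    // happened in parallel above.
    std::vector<ID3D12CommandList*> raw;
    for (auto& l : lists) raw.push_back(l.Get());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
}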
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
However, I haven't seen any evidence that NV can match the efficiency of hardware schedulers with software running on the CPU. If they were able to do that, I wouldn't expect them to have the problems that they have with latency in VR, and I certainly would have expected some sort of statement from them on the subject.

I've heard a lot of talk about the supposed problems they are having with latency in VR, but I haven't seen any concrete numbers or heard anything from any official sources. Basically, all I've heard is forum jibber-jabber.

Anyway, asynchronous timewarp is something completely different from asynchronous compute, I believe. I think it's supposed to help bring VR to less powerful machines, but I could be wrong.

NV's silence on the subject is what is allowing this entire question to gain traction. There hasn't been, that I have seen, any comment from them directly while their reputation is getting hammered in the forums and in the press. If they had a way to handle AC that was as good as AMD's hardware implementation, they should have made some statement to the effect of "Don't worry, we've got this."

Perhaps the reason they didn't say anything is that it would be an admission that their DX12 drivers aren't fully ready. Fermi and Kepler still don't have DX12 functionality in the drivers yet, but that's supposed to happen with the 358.xx drivers.

I see a lot of people saying that it's OK, that even if this is a problem NV will have it fixed in Pascal, but I don't see any evidence that that happened, or even that NV was aware of the need to make it happen, especially in the time frame they would have had to do it in.

No use in debating about Pascal at this time, as there's just not enough information to go on.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I would suggest you downclock your CPU to 1.6GHz, try playing BF4 again, and see what happens ;)

I remember I did that once, but not to 1.6GHz. I tested my frame rate at stock clocks and when overclocked to 4.5GHz, and there was only about a 10% performance increase, if I remember correctly, which surprised me.

That was on a slower rig than what I'm on now, but Frostbite 3 still scales pretty well CPU-wise on DX11.1... up to six threads.

If I had BF4 installed right now, I would try it out.
 

Glo.

Diamond Member
Apr 25, 2015
5,711
4,559
136
I hope not. I hope they can make it work through the standard DX12 pipeline. If there has to be specific code, there will be shenanigans. Guaranteed!

Thanks to drivers and some level of abstraction, the Maxwell GPUs "knew" how to prioritize work in DirectX 11 games. That's how they got so much oomph in such a small power envelope. Right now you have only the API level, without that layer of abstraction: the driver does not have any "understanding" of how the application works. It is up to the application and the hardware itself. This is mostly why Maxwell GPUs will tank in DX12 performance compared to DX11, unless there is a really big number of CPU cores in the computer. Unfortunately, because in Maxwell GPUs there is only one asynchronous compute engine, I would not expect any miracles.
Context switching is limited here, the pipeline is in-order, and the work cannot overlap other "jobs" on the GPU. If there were more ACEs in the GPU, the work could be split between them, but that's not the case.
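
For reference, the submission code looks identical whether or not the GPU can actually run the two queues side by side; the difference described above lives below the API. A hedged D3D12 sketch, assuming the queues and recorded command lists already exist (names made up):

#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void SubmitGraphicsAndCompute(ID3D12Device* device,
                              ID3D12CommandQueue* gfxQueue,
                              ID3D12CommandQueue* computeQueue,
                              ID3D12CommandList* gfxWork,
                              ID3D12CommandList* computeWork)
{
    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Kick off both streams of work with no dependency between them yet.
    gfxQueue->ExecuteCommandLists(1, &gfxWork);
    computeQueue->ExecuteCommandLists(1, &computeWork);

    // Anything submitted to the graphics queue after this Wait starts only
    // once the compute queue has signalled. Until then, hardware with spare
    // compute engines may overlap the two; hardware without them serializes.
    computeQueue->Signal(fence.Get(), 1);
    gfxQueue->Wait(fence.Get(), 1);
}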
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
I can play BF4 at well over 100 FPS at 1440p, and the Frostbite 3 engine makes ample use of compute shaders for lighting and post-processing effects, yet relies on the CPU for scheduling.

Contrast that with the PS4, which does use asynchronous compute in BF4 and can barely maintain 60 FPS at 900p, even with substantially lowered settings.

So apparently there must not be much of a slowdown :biggrin:

I remember I did that once, but not to 1.6GHz. I tested my frame rate at stock clocks and when overclocked to 4.5GHz, and there was only about a 10% performance increase, if I remember correctly, which surprised me.

That was on a slower rig than what I'm on now, but Frostbite 3 still scales pretty well CPU-wise on DX11.1... up to six threads.

If I had BF4 installed right now, I would try it out.


Yea, go ahead and dig yourself deeper.
http://www.gamegpu.ru/images/stories/Test_GPU/Action/Battlefield_4/test/bf4_proz_2.jpg


https://www.youtube.com/watch?t=26&v=M4gaVvHXNC8

A 1.6GHz Jaguar faster than a 3.33GHz Phenom II. Care to explain how Jaguar's IPC deficit, halved clocks, and limited GPU power help it perform better than the 6-core Phenom II?
 

TheELF

Diamond Member
Dec 22, 2012
3,973
731
126
A 1.6GHz Jaguar faster than a 3.33GHz Phenom II. Care to explain how Jaguar's IPC deficit, halved clocks, and limited GPU power help it perform better than the 6-core Phenom II?

No VHQ settings on the PS4... not even 1080p.

(And these numbers seem very, very low for a single-player level.)
 
Last edited:

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Yea, go ahead and dig yourself deeper.
http://www.gamegpu.ru/images/stories/Test_GPU/Action/Battlefield_4/test/bf4_proz_2.jpg


https://www.youtube.com/watch?t=26&v=M4gaVvHXNC8

A 1.6GHz Jaguar faster than a 3.33GHz Phenom II. Care to explain how Jaguar's IPC deficit, halved clocks, and limited GPU power help it perform better than the 6-core Phenom II?

You make it so easy to nullify your arguments. First off, the PC version is using VHQ settings in that graph, which means the draw distance is longer and there are more particle effects. Secondly, the average frame rate is still much higher than it is in the PS4 version, which is locked at 60 FPS. Last but not least, you are using the campaign to illustrate your point, which is intellectually dishonest.

Multiplayer is way more intensive than the SP campaign, and that's what BF4 players actually play. Digital Foundry did a great YouTube video that shows how the PS4 performs during multiplayer:

Battlefield 4 Final Code: PS4 multiplayer frame rate tests

The frame rate plummets to the low 40s fairly often on some maps.
 

TheELF

Diamond Member
Dec 22, 2012
3,973
731
126
I'm pretty sure that demo was broken, or DX12 wasn't properly functional. That's why no professional review sites ever used it for benchmarking.
It was running DX12 though, right?
I also ran the 3DMark API overhead test, and that also worked on my 650.