Because their DX11 is worse than nVidia's, and nVidia's architecture isn't geometry bound like AMD's.
"quality sources"? You mean this insider guy from oxide? Haha.
If you think nVidia won't benefit from multi-threaded rendering, then explain why this demo (Vulkan Threaded Rendering) scales over all 8 threads of my CPU:
https://developer.nvidia.com/vulkan-android#samples
Yes, and with Async Compute you can send BOTH Graphics AND Compute AND Copy to the GPU at the same time. Without Async Compute you will only send one of them at a time.
But you need lots of CPU threads in order to do that. That is why you will not get that much benefit from multi-core CPUs (Multi-Rendering) on MAXWELL, because at the end of the day the GPU will only render one job at a time: either Graphics OR Compute OR Copy.
Because this was the problem all along with nvidia: they are offloading gpu resources to the cpu. Everyone knew this was going to happen on dx12/mantle/vulkan.
This is what happened on aos too: they hammered the cpu with additional workload while the cpu was already doing the normal graphics workload.
 
You can send as many queues as you want to a GPU at the same time.
:\
Sorry, but you should read more about the API and GPUs. GPUs are doing more than just "one job at a time". They need thousands of threads to hide the latency.
They aren't offloading anything.
I know what he meant. It still has nothing to do with multi-threaded rendering.
Well, it's a complicated topic. Few people understand half the things written in GPU white papers.
The next week will be interesting for the Ashes fans. I saw the new RC update and it is remarkable.
Looking forward to it.
You can't really tell what is going on on the cpu. If you find a way to distinguish normal cpu game load from what the gpu is throwing on the cpu, tell me.
oxide told us back on ocn that nvidia was offloading many of the processes onto the cpu as async became more and more involved in the game.
He means executing work items from different queues concurrently.
SM200 supports: compute + compute, copy + compute, graphics + copy.
GCN supports: compute + compute, copy + compute, graphics + copy, compute + graphics.
Compute + Graphics is the performance boosting option and what AMD mean when they say "Asynchronous Compute". SM200 doesn't support this feature.
True, GCN is wider than SM200 though. Fiji is up to 163,840 threads wide and SM200 (GTX 980 Ti) is up to 45,056 threads wide.
Evidently, GCN can deal (and must deal) with more compute work in order to shine.
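For anyone wondering where those two thread counts come from, here is a minimal sketch of the arithmetic. It assumes the commonly published occupancy limits: Fiji with 64 CUs, 4 SIMDs per CU, up to 10 wavefronts per SIMD and 64 threads per wavefront, and the GTX 980 Ti with 22 SMMs and up to 2,048 resident threads per SMM. The function and variable names are made up for illustration.

```python
def gcn_resident_threads(cus, simds_per_cu=4, waves_per_simd=10, lanes_per_wave=64):
    """Peak threads in flight on a GCN part (assumed occupancy limits)."""
    return cus * simds_per_cu * waves_per_simd * lanes_per_wave


def maxwell_resident_threads(smms, threads_per_smm=2048):
    """Peak threads in flight on a Maxwell part (assumed occupancy limit)."""
    return smms * threads_per_smm


fiji_threads = gcn_resident_threads(cus=64)        # Fury X
gm200_threads = maxwell_resident_threads(smms=22)  # GTX 980 Ti ("SM200" in this thread)

print(fiji_threads)                            # 163840
print(gm200_threads)                           # 45056
print(round(fiji_threads / gm200_threads, 2))  # 3.64
```

These are peak occupancy numbers, not a measure of how many threads are actually busy in a given frame.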
And that's the whole argument here: is "doesn't support" really the right wording?
SM200 can utilize all its 45,056 threads with only graphics or only compute, so doing async only adds complexity (context switching) and you get no boost or even lower performance.
GCN cannot utilize all its 163,840 threads with only graphics or only compute, so it is the only architecture that "supports" compute + graphics.
Same with the previous argument: 1600x900 pixels divided by the available threads gives you a very small number of pixels for every thread, so GCN is underutilized.
3840x2160 pixels divided by the available threads gives you a high number of pixels for every thread, so SM200 is overburdened (and so is GCN, just not as much).
= GCN is always bottlenecked/bound/limited/whatever unless you also do compute on the side.
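To put rough numbers on the pixels-per-thread argument, here is a small sketch using the thread counts quoted in this thread. It only divides pixel count by peak thread count; it assumes nothing about per-pixel cost, geometry work, or how many threads are really resident, so treat it as an illustration of the ratio rather than a utilization measurement. The helper name is hypothetical.

```python
FIJI_THREADS = 163_840   # figure quoted above for Fiji
SM200_THREADS = 45_056   # figure quoted above for the GTX 980 Ti


def pixels_per_thread(width, height, threads):
    """Pixels each thread would get if the frame were split evenly."""
    return (width * height) / threads


for width, height in [(1600, 900), (3840, 2160)]:
    fiji = pixels_per_thread(width, height, FIJI_THREADS)
    sm200 = pixels_per_thread(width, height, SM200_THREADS)
    print(f"{width}x{height}: Fiji {fiji:.1f} px/thread, SM200 {sm200:.1f} px/thread")
# 1600x900: Fiji 8.8 px/thread, SM200 32.0 px/thread
# 3840x2160: Fiji 50.6 px/thread, SM200 184.1 px/thread
```

Under this even-split assumption, the per-thread load on SM200 is about 3.6 times the load on Fiji at either resolution.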
DirectX 12 is showing that AMD was CPU bottlenecked at lower resolutions under DX11. In DX12 demos and game demos, GCN does better at lower resolutions than SM200: an exact reversal of what we see under DX11.
No, GCN is API bottlenecked under DX11. GCN could fare better if the system's CPU was faster, as GCN hammers the CPU's primary thread. You can clearly see this in the above 3DMark API overhead test. A 290X's D3D11 single-threaded results are the same as its multi-threaded results. The R9 290X also has lower single-threaded and multi-threaded figures when compared to a GTX 980.
Once running D3D12, the roles are reversed and the 290X is set free from the CPU bottleneck caused by the DX11 API overhead.
The Command Processor on GCN is faster than the one on SM200, but the lack of deferred rendering and multi-threaded command listing in GCN hampers its performance under D3D11.
(furyx 980ti)
We have no idea how big the CPU utilization is or how much async compute is going on. ~60 FPS is lower than the 70-80 FPS these cards are getting in Tomb Raider at 1080 (you have to slow down for GCN to be competitive?), while 52-53 FPS is ~exactly the same as the FPS they get in Tomb Raider at 1440...
That's the only thing we can say for sure just by looking at these benches.
Also, one FPS difference is no difference; it's within statistical error.
Maybe because AMD has no driver that would do multi-threading, so of course it's the same.
And yes, it has lower single-threaded and multi-threaded figures because it has a lot of shaders, but slow ones.
Again, we have no idea how much CPU time is spent on this.
Both are API bottlenecked under DX11, since SM200 also gains a lot under D3D12.
The Command Processor on GCN is faster than the one on SM200 just like an FX-8350 is faster than an i3-4170, but the lack of multi-threaded games hampers the FX's performance.
Speed of execution and amount of execution are not the same: the i3 is much faster at executing commands, while the FX can execute more commands in a given time.
Let's face it, async is going to make games slower for everyone, just like GameWorks has done until now...
Actually, not entirely true. It will be yet another boost injection for old GCN hardware (7970 FTW!). It means these old GPUs will serve their owners a little longer than their already retired competition. Which means AMD will miss another sale compared to nVidia.
Someone who bought a GTX 680 already upgraded to a 780 and later to a 970, and will upgrade again to Pascal for DX12.
Someone who bought a 7970 a couple of years ago will be running DX12 games on it through 2016.
I have a 7950, upgraded to the 290, and would have upgraded to Fiji if it was worth it, but will upgrade to Polaris.
Actually, we know how much Async Compute is going on because both Lionhead and Oxide have told us. The answer is "very little": 5% of compute work items each.
Ashes does more than Async. Kollock, an Ashes dev, stated that they wouldn't use Ashes of the Singularity as an example of Async Compute, as they make very mild usage of the feature.
What's hammering the GPUs is the lighting. Each unit produces its own lighting.
Both engines are vastly different, so you can't compare Tomb Raider to Ashes in terms of GPU-bound scenarios. You can for CPU-bound scenarios, however.
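As a rough sanity check on what "very little" could mean for frame rate, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the 5% of compute work items corresponds to 5% of frame time and that this time is hidden completely behind graphics work; neither assumption comes from Lionhead or Oxide, and the function name is hypothetical.

```python
def max_speedup_from_overlap(hidden_fraction):
    """Best-case speedup if a fraction of frame time is fully overlapped.

    hidden_fraction is the share of frame time assumed to run concurrently
    with other work, so the frame takes (1 - hidden_fraction) as long.
    """
    if not 0.0 <= hidden_fraction < 1.0:
        raise ValueError("hidden_fraction must be in [0, 1)")
    return 1.0 / (1.0 - hidden_fraction)


gain = max_speedup_from_overlap(0.05)
print(f"{(gain - 1) * 100:.1f}% higher FPS at best")  # 5.3% higher FPS at best
```

Under those assumptions the ceiling is a gain of a few percent, which fits the claim that async is not what is driving the results in these two titles.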
Stunned I tell you, absolutely stunned.
This has to be the most twisted, convoluted argument I've ever read. This is just the never-ending stream of excuses that we get because nVidia loses perf relative to AMD as resolution increases.
Well, go ahead and give me your twisted, convoluted argument on why AMD loses perf relative to nVidia as resolution decreases. Why is it harder to do less?
At least losing performance when there is more data to work on is logical; losing less when there is more to do is a special kind of weird.
Isn't that the same difference? Every lighting effect is a computation (or several?) that has to be done in parallel, at the same time as all the other graphics, right?
If you search for GPGPU/OpenGL, lighting is one of the most common things you'll find.
Yes, both engines are vastly different, but if they both manage the same amount of utilization on both cards, they will also provide the same FPS at the same resolution on both cards
(unless one engine uses an effect that one of the cards can't handle at all).
I meant for NVIDIA, what hammers the GPU is the lighting. Hence why they removed some of the post-processing effects in order to be competitive with AMD via a driver cheat (as confirmed by Kollock in not so many words). Kollock stated that both AMD and NVIDIA should look the same when rendering Ashes. He then asked, "Did this occur after an NVIDIA driver update?"
Holy crap guys, this is all a little much. Does anyone even like this game? I know the DX stuff is interesting, but how is the game? I'm starting to realize that most people here are obsessed with GPU technology and they don't even have much use for the GPU in the first place. When I say "most people here" I actually mean "my damn self and maybe a few of you guys".
It's a very, very basic strategy game that hammers your GPU without any reason at all, even with nothing at all going on on screen...
You're trying to turn it on its head, but you're missing because you don't understand what you're talking about. It's not harder for AMD to do less, it's actually easier, but it's the exact thing for NV: the resolution matters to them more. AMD loses performance more slowly than NV when the resolution is increased, it's that simple; they don't lose performance at lower resolutions except in comparison to NV.
You'd understand these things if you read Mahigan's posts till you comprehended them. They're very good posts.
That's exactly what I said earlier:
Same with the previous argument: 1600x900 pixels divided by the available threads gives you a very small number of pixels for every thread, so GCN is underutilized.
3840x2160 pixels divided by the available threads gives you a high number of pixels for every thread, so SM200 is overburdened (and so is GCN, just not as much).