bigboxes
Blaming gameworks is intellectually lazy to be honest.
It's a valid concern.
Blaming gameworks is intellectually lazy to be honest.
No, that article is about offloading vertex shader processing onto the CPU and managing the long pipelines in GPUs of that era. Irrelevant to this present discussion about NVIDIA's DX11 multi-threading.
Might be, but it's still relevant because it shows that NVidia has been pursuing multithreaded drivers for a long time. Who knows what form it has taken in modern times? Only the NVidia software engineers know.
TLP is inherent in GCN; NVIDIA drivers can extract TLP in DX11 from Kepler onward.
Despite the apparent gains offered by multithreading, de Waal expressed some skepticism about the prospects for thread-level parallelism for CPUs. He was concerned that multithreaded games could blunt the impact of multithreaded graphics drivers, among other things.
All their effort wasn't much. Mantle was abandoned soon after it was implemented.
Yeah Vulkan and DX12 aren't a thing right?
It's a valid concern.
No, that article is about offloading vertex shader processing on to the CPU and managing the long pipelines in GPUs of that era. Irrelevant to this present discussion about NVIDIA's DX11 multi-threading.
EDIT: You should read your own links more thoroughly before you post them; here is the last paragraph of that article:
De Waal cited several opportunities for driver performance gains with multithreading. Among them: vertex processing. He noted that NVIDIA's drivers currently do load balancing for vertex processing, offloading some work to the CPU when the GPU is busy. This sort of vertex processing load could be spun off into a separate thread and processed in parallel.
TLP is inherent in GCN; NVIDIA drivers can extract TLP in DX11 from Kepler onward.
so you can't just use deferred contexts and not driver command lists.
No. Deferred contexts require driver command lists to work properly. The two are complementary, and need to be used together for the draw call batch submission to work in parallel.
I think the really relevant question is, how many games today are actually draw call bottlenecked? Very few, and even then only in certain circumstances.
It scales all the way up to a deca-core CPU, which is incredible.
task based parallelism
Blaming gameworks is intellectually lazy to be honest.
As for DX11, it inherently has minimal ability to scale rendering across multicore/multithreaded CPUs
which is exactly why NVidia developed a way to partially circumvent that limitation, and why AMD hardware has been so affected by underutilization.
That's pretty simple. Many of the effects in the Gameworks SDK are based on compute shaders. GCN can handle compute very well; as long as there aren't other bottlenecks.
If you honestly believe that, then you don't really understand what Gameworks is. And then you'd have to explain to me why certain Gameworks titles ran faster on AMD hardware, and vice versa.
Let's be more specific.
I think you need to take your own advice and read the article more closely.
The article is about NVidia drivers exploiting multicore/multithreaded CPUs to increase performance. Offloading vertex processing was just ONE example of that. In fact, at the time that article was written, vertex processing was already being offloaded to the CPU when the GPU was busy:
So like I said, you're wrong. You're trying to minimize NVidia's longstanding efforts to exploit multicore/multithreaded CPUs and make it seem like something that only started with the advent of DX11, when it's clear this has been going on for much longer.
That's just regular pipelining.
Some of the driver's other functions don't lend themselves so readily to parallel threading, so NVIDIA will use a combination of fully parallel threads and linear pipelining. We've seen the benefits of linear pipelining in our LAME audio encoding tests; this technique uses a simple buffering scheme to split work between two threads without creating the synchronization headaches of more parallel threading techniques.
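For anyone unsure what "linear pipelining" with a "simple buffering scheme" looks like in practice, here is a minimal sketch in Python. It is only an illustration of the general technique the article describes, not NVIDIA's driver code or LAME's implementation. All names (`stage_one`, `stage_two`, `run_pipeline`) and the arithmetic standing in for real work are made up for the example.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def stage_one(items, buf):
    """First stage: prepare each item and pass it downstream."""
    for item in items:
        buf.put(item * 2)  # placeholder for real preparation work
    buf.put(SENTINEL)      # tell the second stage there is no more work


def stage_two(buf, results):
    """Second stage: consume prepared items in arrival order."""
    while True:
        item = buf.get()
        if item is SENTINEL:
            break
        results.append(item + 1)  # placeholder for real finishing work


def run_pipeline(items, buffer_size=4):
    # The bounded queue is the "simple buffer": it is the only point
    # where the two threads touch shared state.
    buf = queue.Queue(maxsize=buffer_size)
    results = []
    producer = threading.Thread(target=stage_one, args=(items, buf))
    consumer = threading.Thread(target=stage_two, args=(buf, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

The two threads never share anything except the queue, which is why this style avoids most of the synchronization headaches the article mentions. The sketch shows structure only; it is not meant to demonstrate a speedup.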
He was concerned that multithreaded games could blunt the impact of multithreaded graphics drivers, among other things.
As long as you don't load the primary thread with other stuff, it doesn't.
I have no idea what this means. TLP is inherent in GCN?
Well tell me this, if "TLP is inherent to GCN," then why does GCN have such a hard time coping with DX11 CPU performance?
As for DX11, it inherently has minimal ability to scale rendering across multicore/multithreaded CPUs, which is exactly why NVidia developed a way to partially circumvent that limitation, and why AMD hardware has been so affected by underutilization.
Yeah Vulkan and DX12 aren't a thing right?
Please point me to the part of my post where I mention Vulkan and/or DX12. While you're at it, go ahead and explain how Mantle wasn't abandoned shortly after its inception, which is what my post said.
Stop intentionally missing the point. It is widely known Vulkan is Mantle + Intel/Nvidia compatibility and DX12 is derivative of it as well in some capacity.
Please point me to the part of my post where I mention Vulkan and/or DX12. While you're at it, go ahead and explain how Mantle wasn't abandoned shortly after its inception, which is what my post said.
Uhh I did, did you not read the links? Vulkan is Mantle. They took what AMD had done and used that to create Vulkan. I mean even the name... Mantle = inside of the Earth. Vulkan for Volcano, erupting... Even DX12 is influenced by Mantle, which is why I mentioned it as well. It's clear because Vulkan and DX12 have many very similar features and capabilities, which is why it's easier to port between them than from either OpenGL or DX11 to them in the first place.
Also as @Despoiler pointed out Vulkan still exists and is used in LiquidVR and internal projects at AMD for testing new features.
It's not lazy when it's true. Gameworks titles are optimized for Nvidia first and foremost. Please show me any AAA GW title that doesn't run very poorly on AMD hardware at launch and that isn't at least 10% faster within a few weeks.
That is simply not true, go through this - straight from Ryan Smith himself.
As long as you don't load the primary thread with other stuff, it doesn't.
I said AMD abandoned Mantle, which they did.
Yes you can. AMD does not support driver command lists but does support deferred context lists. I've tested this myself.
Yeah been a thing for a decade now
It's not lazy when it's true. Gameworks titles are optimized for Nvidia first and foremost. Please show me any AAA GW title that doesn't run very poorly on AMD hardware at launch and that isn't at least 10% faster within a few weeks.
Didn't you just claim the opposite when pointing out how amazingly well Ghost Recon Wildlands scales??
I don't think a single person is claiming otherwise. We all know that Nvidia does more work in drivers instead of hardware. That's what the whole post is about.
That's pretty simple. Many of the effects in the Gameworks SDK are based on compute shaders. GCN can handle compute very well; as long as there aren't other bottlenecks.
Let's be more specific.
It talks about offloading vertex processing to a separate thread. Again, this is before unified shaders existed.
Lastly,
As long as you don't load the primary thread with other stuff, it doesn't.
That is simply not true, go through this - straight from Ryan Smith himself.
Well you set the bar low, so it wasn't very difficult for me to find a AAA GW title that ran well on AMD from the start. And this is with HBAO+ enabled as well.
It doesn't matter. Deferred contexts must work hand in hand with driver command lists or they're useless.
Again wrong, I've used them myself and they work without driver command lists (which AMD does not support!)
2) AMD never supported driver command lists because they couldn't get it to work in their drivers, and driver command lists are absolutely necessary for DX11 multithreading to work.
To claim otherwise is just foolishness. We can only postulate.
excellent CPU scaling has very little, if anything, to do with DX11, and more to do with task-based parallelism.