So Nvidia can lower the CPU overhead with more CPU overhead.

There is no "software scheduling". They just got rid of some part of the hardware scheduler. Most of the work happens on the GPU, otherwise it wouldn't work.
They can't feed the GPU from more threads. This concept doesn't exist under DX11, and it is only possible in certain ways with multithreaded drivers.
Umm...
The DirectX 11 API has the ability to use multiple CPU cores:
"DX11 adds multi-threading support that allows applications to simultaneously create resources or manage state and issue draw commands, all from an arbitrary number of threads. This may not significantly speed up the graphics subsystem (especially if we are already very GPU limited), but this does increase the ability to more easily explicitly massively thread a game and take advantage of the increasing number of CPU cores on the desktop."
http://www.anandtech.com/show/2716/3
DX11 adds multi-threaded capabilities to the pipeline when working with parallel loads:
"The major benefit I'm talking about here is multi-threading. Yes, eventually everything will need to be drawn, rasterized, and displayed (linearly and synchronously), but DX11 adds multi-threading support that allows applications to simultaneously create resources or manage state and issue draw commands, all from an arbitrary number of threads."
http://www.anandtech.com/show/2716/3
DirectX 11 adds the deferred context/command list feature, which allows for multi-workload management:
"A deferred context is a special ID3D11DeviceContext that can be called in parallel on a different thread than the main thread which is issuing commands to the immediate context. Unlike the immediate context, calls to a deferred context are not sent to the GPU at the time of call and must be marshalled into a command list which is then executed at a later date. It is also possible to execute a command list multiple times to replay a sequence of GPU work against different input data."
http://docs.nvidia.com/gameworks/c.../d3d_samples/d3d11deferredcontextssample.htm
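To make the record-then-execute idea concrete, here is a rough portable C++ sketch of the pattern. It is only a toy model, not Direct3D code: DeferredContext, ImmediateContext, Command, CommandList and RenderFrame are hypothetical stand-ins I made up, and the "GPU" is just a vector of strings. In real D3D11 the corresponding calls are ID3D11Device::CreateDeferredContext, ID3D11DeviceContext::FinishCommandList and ID3D11DeviceContext::ExecuteCommandList.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical model types, NOT the Direct3D 11 API.
// A command is a closure that appends to a fake "GPU" submission log.
using Command = std::function<void(std::vector<std::string>&)>;
using CommandList = std::vector<Command>;

// Records commands without submitting anything (one instance per worker thread).
class DeferredContext {
public:
    void Draw(int objectId) {
        recorded_.push_back([objectId](std::vector<std::string>& gpuLog) {
            gpuLog.push_back("draw " + std::to_string(objectId));
        });
    }
    // Hands the recorded commands over as a command list and resets the recorder.
    CommandList FinishCommandList() {
        CommandList out = std::move(recorded_);
        recorded_.clear();
        return out;
    }
private:
    CommandList recorded_;
};

// The only object that actually "submits" work; used from one thread.
class ImmediateContext {
public:
    void ExecuteCommandList(const CommandList& list) {
        for (const Command& cmd : list) cmd(submitted_);
    }
    const std::vector<std::string>& Submitted() const { return submitted_; }
private:
    std::vector<std::string> submitted_;
};

// Records draws on workerCount threads in parallel, then submits serially.
std::vector<std::string> RenderFrame(int workerCount, int drawsPerWorker) {
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    for (int w = 0; w < workerCount; ++w) {
        workers.emplace_back([w, drawsPerWorker, &lists] {
            DeferredContext ctx;  // private to this thread, so no locking needed
            for (int d = 0; d < drawsPerWorker; ++d) {
                ctx.Draw(w * drawsPerWorker + d);
            }
            lists[w] = ctx.FinishCommandList();  // each thread writes its own slot
        });
    }
    for (std::thread& t : workers) t.join();

    ImmediateContext immediate;
    for (const CommandList& list : lists) {
        immediate.ExecuteCommandList(list);  // submission stays on one thread
    }
    return immediate.Submitted();
}
```

Calling RenderFrame(2, 3) records three draws on each of two threads and then submits all six from the calling thread, in worker order. The point of the split is that the expensive part (building the commands) is spread over cores, while the final submission still happens in one place.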
Nvidia on the benefits of using deferred contexts under DX11:
"The entire reason for using or not using deferred contexts revolves around performance. There is a potential to parallelize CPU load onto idle CPU cores and improve performance.
You will be interested in using deferred context command lists if:
Your game is CPU bottlenecked.
You have a significant # of draw calls (>3000).
Your CPU bottleneck is from render thread load or Direct3D API calls.
You have a threaded renderer but serialize to a main render thread for mapping incurring sync point costs."
http://docs.nvidia.com/gameworks/c.../d3d_samples/d3d11deferredcontextssample.htm
So what are you talking about exactly?