DX11 introduced major features such as support for higher texture resolutions, HDR texture compression formats, GPU computing via DirectCompute, tessellation, and various other performance improvements, and none of those features are available under DX10.
Various features, including DirectCompute, are now available on DX10 hardware through DX11 (most DX10 hardware exceeded the DX10 feature list, and those extra capabilities can now be used through DX11's feature levels).
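To make that concrete, here is a minimal standalone sketch (not from any particular engine) of creating a D3D11 device at feature level 10_0 on DX10-class hardware, then asking the driver whether it exposes DirectCompute (CS 4.x) there:

```cpp
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

// Minimal sketch: create a D3D11 device restricted to feature level 10_0
// (i.e. DX10-class hardware) and check whether the driver exposes
// DirectCompute (compute shaders via CS 4.x).
bool HasComputeOnDX10Hardware()
{
    const D3D_FEATURE_LEVEL requested = D3D_FEATURE_LEVEL_10_0;
    D3D_FEATURE_LEVEL obtained;
    ID3D11Device* device = nullptr;

    HRESULT hr = D3D11CreateDevice(
        nullptr,                    // default adapter
        D3D_DRIVER_TYPE_HARDWARE,
        nullptr, 0,
        &requested, 1,              // only accept feature level 10_0
        D3D11_SDK_VERSION,
        &device, &obtained, nullptr);
    if (FAILED(hr))
        return false;

    // Compute shaders are optional on 10.x hardware, so query for them.
    D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS opts = {};
    device->CheckFeatureSupport(
        D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS, &opts, sizeof(opts));

    bool hasCompute =
        opts.ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x != FALSE;
    device->Release();
    return hasCompute;
}
```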
You can run the exact same code under DX9 and DX10 and of course both will run the same.
Uhh, my point was exactly that they do NOT run the same.
DX10/11 have a completely new driver/API design, which reduces CPU overhead compared to DX9. On the other hand, DX9 drivers and runtime are extremely optimized, while DX10/11 are not fully mature yet. For example, on Intel IGPs I notice that shaders are a lot less efficient when compiled against DX10/11 than against DX9.
There are just a lot of factors at work here. Is it the API? Is it the driver? Is it the code?
It isn't like code will magically get a performance boost from a simple change of API.
Yes, in some cases it is. Or a drop in performance, like the port of Source to OpenGL on Mac, for example.
Metro 2033 proved to run faster under DX11 than DX10, Far Cry 2 ran faster under DX10 than DX9, and Assassin's Creed ran faster under DX10.1 than DX10. And we all know that DX11 is a superset of DX10.1, just as DX9.0c is a superset of DX9.0b, yet both introduced optimizations to boost speed.
Thing is, you never know what you're comparing.
In Crysis the new DX10 functionality is mainly used for better image quality. In BioShock it is mainly used for better performance.
So comparisons of APIs based on games are useless.
As I already said before, it's not the API, it's what the developer does with the API that matters.
If you are gonna use the newer API, you will make sure to boost performance with optimizations and tweak your code. Only lazy developers port the exact same code to a different API and voila! That's why PC gaming has stalled.
Many early DX10 games were actually rather 'lazy' ports of DX9, where DX10 was mainly used to run some extra eye candy... at the cost of performance. This gave DX10 a bad name.
A proper DX10 engine needs to be redesigned from scratch, as the state management is completely different from DX9, and retrofitting DX9 code into a DX10 framework is going to be horribly inefficient... yet this is what various early 'DX10' games did.
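To illustrate the state-management difference with a minimal, standalone sketch (the particular states chosen here are arbitrary): DX9 mutates individual render states on the device, while DX10/11 bake groups of state into immutable objects that are created up front and bound with a single call.

```cpp
#include <d3d9.h>
#include <d3d11.h>

// DX9 style: mutate individual render states on the device each time.
void SetWireframeNoCull9(IDirect3DDevice9* dev)
{
    dev->SetRenderState(D3DRS_FILLMODE, D3DFILL_WIREFRAME);
    dev->SetRenderState(D3DRS_CULLMODE, D3DCULL_NONE);
}

// DX10/11 style: describe the whole rasterizer state once, create an
// immutable state object at load time, then bind it with a single call.
ID3D11RasterizerState* CreateWireframeNoCull11(ID3D11Device* dev)
{
    D3D11_RASTERIZER_DESC desc = {};
    desc.FillMode = D3D11_FILL_WIREFRAME;
    desc.CullMode = D3D11_CULL_NONE;
    desc.DepthClipEnable = TRUE;

    ID3D11RasterizerState* state = nullptr;
    dev->CreateRasterizerState(&desc, &state);
    return state;
}

void BindState11(ID3D11DeviceContext* ctx, ID3D11RasterizerState* state)
{
    ctx->RSSetState(state);
}
```

Retrofitting DX9-style per-call state mutation onto that model means creating or looking up state objects on the fly every time a single state changes, which is one reason such retrofits end up so inefficient.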
I approached it from the opposite direction: I started a new DX10 framework with a clean slate, then retrofitted DX9 into that, backporting some of the new DX10 interfaces to DX9. This means that DX10/11 performance is optimal, and the DX9 overhead is marginal. There's just some DX9 functionality that is 'cut off' because it doesn't exist in the newer APIs anymore, like fixed-function T&L (although I have put in a backdoor so you can still use it in DX9... but then you can't use the other APIs anymore).
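For illustration, a hypothetical sketch of what such a backport could look like (all names here are invented for the example, not taken from the engine above): the DX9 path wraps an immutable DX10-style description and replays it as render states when bound, so engine code can treat DX9 the same way it treats DX10/11.

```cpp
#include <d3d9.h>

// Hypothetical illustration only: a DX10-style immutable rasterizer
// state, backported to DX9. Class and member names are invented for
// this sketch.
struct RasterizerDesc9
{
    D3DFILLMODE fillMode;
    D3DCULL     cullMode;
};

class RasterizerState9
{
public:
    explicit RasterizerState9(const RasterizerDesc9& desc) : m_desc(desc) {}

    // Binding replays the immutable description as DX9 render states.
    void Bind(IDirect3DDevice9* dev) const
    {
        dev->SetRenderState(D3DRS_FILLMODE, m_desc.fillMode);
        dev->SetRenderState(D3DRS_CULLMODE, m_desc.cullMode);
    }

private:
    RasterizerDesc9 m_desc;
};
```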
But proper DX9 engines have been purely shader-based for years anyway.
It would be a nice idea if you tweaked your code with DX-specific optimizations to see how much of a gain you can get, especially with DX10 or DX11 compared to DX9.
But that's exactly what I do. I create an engine that is optimized to the extreme to get the most from any of the three APIs (if it weren't, there would be no way DX10/11 could be anywhere near as fast as DX9... DX10/11 are considerably more difficult to use efficiently, as part of the runtime is just 'missing').
But I do compare the same algorithms, because as soon as you compare different algorithms, it no longer says anything about the API. And I was talking about the API.
Everything else comes down to 'what you do with it', as I already said. And then you can go both ways... better performance, or better quality.