The Q6600 results seem odd. The E8500 is the same basic architecture, yet at 3.16 GHz it can only get 23 fps, which means at the same 2.4 GHz speed as the Q6600 it would get only 17 fps. It just doesn't make sense that the Q6600 is well over twice as fast clock for clock.
You could have simply looked at the Phenom II X2 3.3 GHz vs. the Phenom II X4 3.5 GHz: 24 fps vs. 61 fps. Scaling the X2 up to 3.5 GHz is a 6% increase in clockspeed. Assuming perfectly linear scaling, at the very best that would bring the X2 to 26 fps at 3.5 GHz, and the quad core would still be well over twice as fast (234% as fast). That is very odd considering the X4 only has two extra cores and the chips are otherwise identical, including the amount of L3 cache.
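To make that arithmetic explicit, here's a quick back-of-the-envelope check (plain Python, using only the fps figures quoted above from the review's chart):

```python
import math

# Back-of-the-envelope check of the X2 vs. X4 comparison,
# using the fps figures quoted above.
x2_fps, x2_clock = 24.0, 3.3   # Phenom II X2 @ 3.3 GHz -> 24 fps
x4_fps, x4_clock = 61.0, 3.5   # Phenom II X4 @ 3.5 GHz -> 61 fps

clock_gain = x4_clock / x2_clock - 1                 # ~6% more clockspeed
x2_projected = math.ceil(x2_fps * (1 + clock_gain))  # 26 fps, rounding generously

print(f"Clockspeed gain needed: {clock_gain:.0%}")
print(f"Best-case X2 @ 3.5 GHz: {x2_projected} fps")
print(f"X4 advantage at equal clock: {x4_fps / x2_projected:.2f}x")  # well over 2x
```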
Take a look here. The 2600K @ 2.5 GHz is only 10% slower than the 2600K @ 4.0 GHz in this game. Therefore, more likely than not this game doesn't care about clockspeed beyond a certain point. In other words, you can't just assume that this game even benefits from the E8500 being clocked at 3.16 GHz. Basically, it could be that beyond 2.8-3.0 GHz additional clockspeed is simply irrelevant: past a certain speed the two threads don't really run any faster, while the other two threads sit idle. So what you need is more cores.
Maybe your point is valid, but the data you provided does not really back it up. The game is obviously completely GPU-bottlenecked with the 2600K @ 4.0 GHz, which makes your clockspeed-scaling argument inconclusive at best.
It's really not possible to draw clockspeed-scaling conclusions when you're running into a GPU bottleneck. A game, and Crysis in particular, depends on both the CPU and the video card for its performance, and different scenes in a benchmark put different loads on each. You can be both CPU- and GPU-limited within the same benchmark run depending on what is being drawn on screen, and how often that happens, and what percentage of the time one is the bottleneck rather than the other, is very context-specific.
The benchmarks you used both discredit the point you make and illustrate the one I just made: it is clear as day that a 2600K @ 3.5 GHz is feeding the video card they are using as much as it can handle at all times, since the same processor @ 4.0 GHz offers no performance improvement at all.
So I really do not like how you used the 4.0 GHz result to back up your argument, because that is simply and purposefully ignoring context to make your argument seem better than it is. It's a pretty sneaky tactic, and IIRC you have used it before. Maybe it's simple unawareness, but I think it's a bit deliberate.
But then drop the speed down to 3.0 GHz, 2.5 GHz, and 2.0 GHz. Since we know the 2600K is capable of keeping the video card they used completely fed with data, the non-linear trend of performance vs. clockspeed makes sense. What's happening is what I previously described: as the clockspeed increases, the CPU spends more of the run with the GPU completely fed, until the clockspeed gets so high that it keeps the GPU fed all the time. And since we're looking at averages, you can think of the reported framerates as also showing that a faster processor spends less of its time being the bottleneck relative to the GPU.
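If it helps, here's a toy model of what I mean (the per-frame numbers are made up, not the review's data): treat each frame as limited by whichever of the CPU or GPU takes longer, then watch what the average fps does as you raise the CPU clock.

```python
# Toy model only: made-up numbers, purely to show the shape of the curve.
# Each frame takes max(CPU time, GPU time); GPU cost per frame is fixed.
cpu_work = [30.0, 45.0, 60.0, 75.0, 90.0]   # arbitrary CPU work per scene (GHz * ms)
gpu_ms = 20.0                                # fixed GPU cost per frame, in ms

def avg_fps(clock_ghz):
    # work / clock = CPU milliseconds for that frame at this clockspeed
    frame_times = [max(work / clock_ghz, gpu_ms) for work in cpu_work]
    return 1000.0 * len(frame_times) / sum(frame_times)

for clock in (2.0, 2.5, 3.0, 3.5, 4.0):
    print(f"{clock:.1f} GHz -> {avg_fps(clock):.1f} fps average")
```

Each 0.5 GHz step in that toy run buys a smaller and smaller fps gain, because more and more frames end up capped by the fixed GPU cost, which is exactly the non-linear trend in the chart.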
Here's the breakdown:
3.5 to 3.0 GHz: 14% less clockspeed, 3% less performance
3.0 to 2.5 GHz: 17% less clockspeed, 8% less performance
2.5 to 2.0 GHz: 20% less clockspeed, 15% less performance
Look at the ratio of clockspeed loss to performance loss: the lower you go, the closer that ratio gets to 1.
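Spelled out (trivial Python, using the clockspeeds and the performance drops listed above):

```python
# Clock loss vs. performance loss for each step down, numbers from the breakdown above.
steps = [
    (3.5, 3.0, 0.03),   # 3.5 -> 3.0 GHz, ~3% less performance
    (3.0, 2.5, 0.08),   # 3.0 -> 2.5 GHz, ~8% less performance
    (2.5, 2.0, 0.15),   # 2.5 -> 2.0 GHz, ~15% less performance
]

for hi, lo, perf_loss in steps:
    clock_loss = (hi - lo) / hi
    ratio = clock_loss / perf_loss
    print(f"{hi} -> {lo} GHz: {clock_loss:.0%} clock loss, "
          f"{perf_loss:.0%} perf loss, ratio {ratio:.1f}")
```

The printed ratios run roughly 4.8, 2.1, 1.3, heading toward 1 as the CPU becomes the bottleneck more of the time.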
Now back to the Q6600. The reason toyota had a hard time believing the Q6600 vs. E8500 result is that the Q6600 is nowhere near being able to spend enough time keeping the video card they used fed, since the Q6600 gives you half the performance of a 2500K. And then the E8500 gives you half the performance of a Q6600. That's a pretty huge discrepancy. I suppose it's possible, but I find it hard to believe.
I hope you've been following me so far, as most of my discussion has been about how the data you provided neither corroborates your argument nor refutes toyota's. Something interesting is indeed going on with Crysis 2 here. Maybe your explanation about idle threads is correct, but I have a hard time believing it. Actually, I don't buy your statement about clockspeed not making a difference above a certain threshold. It might be true, but I have a very difficult time believing that is the core of the issue.
If a CPU is the bottleneck 100% of the time, then increasing the clockspeed will help alleviate that bottleneck. So even if you have threads sitting idle, they shouldn't sit idle as long, because the faster clockspeed should let the busy threads finish their work sooner and hand new work to the idle ones. I can only see this not happening if something else is the bottleneck, like the memory subsystem or whatever.
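As a crude illustration of that last point (again invented numbers, just to show the shape of the argument): model a frame as CPU work that scales with clock plus a fixed cost that does not, say memory stalls.

```python
# Crude sketch: frame time = clock-scaled CPU work + a fixed cost (e.g. memory
# stalls) that does not speed up with clockspeed. All numbers are invented.
def frame_ms(clock_ghz, cpu_work=60.0, fixed_ms=0.0):
    return cpu_work / clock_ghz + fixed_ms

for clock in (2.5, 3.0, 3.5):
    pure_cpu = frame_ms(clock)                       # purely CPU-bound: clock always helps
    memory_bound = frame_ms(clock, fixed_ms=15.0)    # fixed cost blunts the gain
    print(f"{clock} GHz: {1000/pure_cpu:.0f} fps CPU-bound, "
          f"{1000/memory_bound:.0f} fps with a fixed 15 ms cost")
```

In the purely CPU-bound case every extra bit of clock shows up as fps; only when a clock-independent cost dominates does extra clockspeed stop mattering.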
I think there are probably multiple things happening here contributing to the unusual results we're seeing, since Crysis 2 is definitely the first game I've seen show a greater than 100% (and significantly so) performance increase with core scaling. One factor is probably the use of FRAPS to benchmark. Crysis on its own is probably pushing a dual core to 100% the entire time; FRAPS comes in and steals more performance than it should, while a quad core has a much easier time coping. So basically the insertion of other processes and threads besides the ones the game itself produces could be a major factor in what we're seeing.