Those specs are on the low end of the 2880 CUDA core, 240 TMU, 15 SMX GK110 die that many expected in K20. The 705 MHz clock speed is what I estimated a while back based on the 1.2 TFLOPS DP estimate in the white paper, and it looks to have been correct. At the same time, NV may be more conservative with the clocks of its Tesla cards to increase yields and keep power consumption in check in the server environment. K10 is only clocked at 745 MHz to stay at a 225 W TDP.
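For anyone who wants to check the DP math, here is a rough sketch. It assumes 64 double-precision units per SMX and one fused multiply-add (2 FLOPs) per unit per clock; neither number is stated above, so treat both as assumptions. The function and variable names are made up for illustration.

```python
def peak_dp_tflops(smx_count, dp_units_per_smx, clock_mhz):
    """Theoretical peak double-precision throughput in TFLOPS.

    Assumes every DP unit retires one fused multiply-add (2 FLOPs) per clock.
    """
    flops_per_second = smx_count * dp_units_per_smx * 2 * clock_mhz * 1e6
    return flops_per_second / 1e12


# 2880 cores across 15 SMX on the full die gives 192 cores per SMX.
CORES_PER_SMX = 2880 // 15

# A 2496-core part therefore has 13 of the 15 SMX enabled.
smx_enabled = 2496 // CORES_PER_SMX

# Assumption: 64 DP units per SMX (not stated in the post).
DP_UNITS_PER_SMX = 64

k20_dp = peak_dp_tflops(smx_enabled, DP_UNITS_PER_SMX, 705)
print(f"{smx_enabled} SMX @ 705 MHz: about {k20_dp:.2f} TFLOPS DP")
```

Under those assumptions the result lands a little under the 1.2 TFLOPS figure, which is about what you would expect from a rounded estimate.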
2496 CUDA cores @ 705 MHz is just 8% faster than a 1058 MHz GTX 680 in raw shader throughput (cores x clock). Further evidence that NV had no chance at all of launching a consumer GeForce GK100/110 in 2012 at clocks reasonable enough to make it worthwhile over the leaner 294 mm^2 GK104 chip.
Of course, 5-6 more months from today can make a lot of difference in the maturity of the 28 nm node. If NV can get those clocks to 1 GHz at 2496 CUDA cores, this chip will be fast.
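The 8% figure and the 1 GHz scenario can be sanity-checked with a simple cores x clock comparison. This is only a paper estimate that ignores memory bandwidth, boost behaviour and architectural differences. The GTX 680 core count of 1536 is not given above, so it is an assumption here, and the function name is just for illustration.

```python
def relative_throughput(cores_a, mhz_a, cores_b, mhz_b):
    """Ratio of raw shader throughput (cores x clock) of chip A over chip B."""
    return (cores_a * mhz_a) / (cores_b * mhz_b)


# Assumption: GTX 680 has 1536 CUDA cores (not stated in the post).
GTX680_CORES = 1536
GTX680_MHZ = 1058

# K20 as specced: 2496 cores at 705 MHz.
at_705 = relative_throughput(2496, 705, GTX680_CORES, GTX680_MHZ)

# Hypothetical consumer part: same 2496 cores pushed to 1 GHz.
at_1000 = relative_throughput(2496, 1000, GTX680_CORES, GTX680_MHZ)

print(f"705 MHz: {100 * (at_705 - 1):.0f}% faster than GTX 680")
print(f"1 GHz:   {100 * (at_1000 - 1):.0f}% faster than GTX 680")
```

On those numbers the 705 MHz part comes out roughly 8% ahead, and the same chip at 1 GHz would be a bit over 50% ahead.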
I wonder what the TDP is on that 2496 CUDA core, 705 MHz K20 chip?
My guess is Nvidia wanted to stay within the 225 W TDP restriction on the K20. Nvidia designs its high-end Tesla SKUs with HPC server design restrictions in mind. I expected a 384-bit memory controller with 2 SMX disabled, but this is even worse.
Even with a higher TDP of 250 W for a desktop GeForce, we can expect clocks of around 775-800 MHz. With watercooling this chip could be driven to 1 GHz, but the power consumption would definitely cross 300 W. Also, it's not known how badly leakage power affects overclocking headroom on such a massive chip.
In H2 2013 a fully enabled GK110 chip at 800-825 MHz could be possible. But currently it's not looking so good. If HD 8970 can launch in Jan 2013 with 25% higher performance, I think AMD would be in a better situation than Nvidia.