It's called "thermal" design power. Heat transfer is very slow compared to electrical transients, so (naturally) short peaks are irrelevant for cooling. The PSU is another matter, but it's very naive to think that any graphics card/PSU combination is running right at the limit of its power delivery.
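To make the "thermal is slow" point concrete, here is a minimal sketch of a first-order thermal model; the thermal resistance, time constant, and power numbers are all made up for illustration, not measured from any real card. It shows that a millisecond-scale power spike moves the die/heatsink temperature by only a few thousandths of a degree when the thermal time constant is on the order of seconds.

```python
# Minimal first-order thermal model (all numbers hypothetical, for illustration):
# temperature follows power with a time constant of seconds, so millisecond
# power spikes barely register at the heatsink.

def simulate_temp(power_trace, dt, r_th=0.15, tau=10.0, t_ambient=25.0):
    """Euler-integrate dT/dt = (P * R_th - (T - T_ambient)) / tau."""
    temp = t_ambient
    temps = []
    for p in power_trace:
        temp += dt * (p * r_th - (temp - t_ambient)) / tau
        temps.append(temp)
    return temps

dt = 1e-3  # 1 ms time step
# 30 s at a steady 200 W, then a 2 ms spike to 300 W, then 200 W again
trace = [200.0] * 30_000 + [300.0] * 2 + [200.0] * 30_000
temps = simulate_temp(trace, dt)
print(f"just before the spike: {temps[29_999]:.3f} C")
print(f"just after the spike:  {temps[30_001]:.3f} C")  # only ~0.003 C higher
```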
In conclusion:
Peaks are irrelevant. And if we are talking about efficiency, which we are, they are especially irrelevant, since efficiency is all about average consumption over a longer benchmark/gaming session, not some microsecond value. I play for longer than a couple of microseconds.
I completely disagree. Those peaks will occur in many games, and for longer and longer stretches as games get more demanding. The more demanding a game is, the more power a card can draw (up to a limit, of course), because the GPU will be loaded to 99% more often (the GPU-limited case). You're also ignoring that people use their GPUs outside of games, where 99% load for days at a time is common. If you only look at a video card's average power consumption in a game, you aren't seeing what it can actually draw when it is loaded to 99% in a particularly demanding section of a game, a very demanding game, or some other program.

If someone is running 2-3 GPUs and overclocking on top of that, peak consumption in games is far more important for assessing what PSU they need and whether it's worth getting a card with an aftermarket open-air cooler (see the sizing sketch below). We clearly saw this play out exactly this way with the GTX670/680 cards, which would often peak in games and exceed 70°C as a result. The consequence was that GPU Boost dropped in those cases, which lowered your performance.
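As a rough illustration of the PSU-sizing point, here is a back-of-the-envelope check; every number in it (per-card draw, overclock margin, platform power, PSU rating, headroom target) is a hypothetical assumption, not a recommendation. The point is simply that a multi-GPU build budgeted on average draw can look fine while peak draw pushes it past a sensible headroom target.

```python
# Back-of-the-envelope PSU sizing check (every number here is hypothetical):
# compare a 3-GPU, overclocked build budgeted on average draw vs. in-game peak.

def total_load(gpu_watts, num_gpus, oc_margin, rest_of_system):
    """Total DC load with an overclocking margin applied to the GPUs."""
    return num_gpus * gpu_watts * (1 + oc_margin) + rest_of_system

avg_gpu, peak_gpu = 170.0, 186.0   # assumed average vs. in-game peak per card
rest_of_system = 200.0             # CPU, motherboard, drives, fans (assumed)
psu_capacity = 1000.0
target_headroom = 0.80             # try to stay under ~80% of the PSU rating

for label, gpu_w in (("average", avg_gpu), ("peak", peak_gpu)):
    load = total_load(gpu_w, num_gpus=3, oc_margin=0.10, rest_of_system=rest_of_system)
    verdict = "OK" if load <= psu_capacity * target_headroom else "over budget"
    print(f"{label:7s}: {load:.0f} W -> {load / psu_capacity:.0%} of PSU ({verdict})")
```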
Average power usage is what matters for working out your electricity costs. What I never agreed with was Furmark/power-virus maximum power draw.
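For the electricity-cost side, the arithmetic really does only need the average draw; a minimal example with assumed numbers (draw, hours, price per kWh):

```python
# Yearly electricity cost depends only on average draw (rates/hours assumed):
avg_watts = 230.0        # average draw while gaming (assumed)
hours_per_day = 3.0
price_per_kwh = 0.15     # assumed price per kWh
days_per_year = 365

kwh = avg_watts / 1000.0 * hours_per_day * days_per_year
print(f"{kwh:.0f} kWh/year -> cost {kwh * price_per_kwh:.2f} per year")
```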
If the GTX680 peaks at 183-186W in games like Crysis 2, it will probably do so in many areas of Crysis 3 and Metro Last Light, and especially in more modern games slated for the future. Average power consumption figures generally include stretches where the GPU is not pegged at 99%. It stands to reason that next-generation games will peg it at 98-99% far more frequently, which implies that the peaks reported at h4t and TPU will be much closer to a GPU's average usage in more demanding modern games.
This even goes back to your desire to have a 950MHz card with a 280W TDP. At 950MHz-1GHz the Titan might have averaged 250W and peaked closer to 280W, and NV likely didn't want to go that route. At ~876MHz, it might average 225-230W and peak at 250W. Peak draw in games matters if that peak is repeatable and quantifiable on many occasions, because GPU makers take it into account. There is a big difference in noise levels and coolers between a 230W GTX580 and a 270W GTX480. That extra 40W is the difference between a jet-engine gaming experience and a decent one, and that 40W delta shows up at peak in games. This is why so many 480 owners were unhappy with their cards (specifically the early 6-8 month batches of 480s).
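Those clock-vs-power figures are consistent with the usual first-order scaling where dynamic power goes roughly as frequency times voltage squared. Here is a rough sketch of that relationship; the voltage values and the 250W baseline are assumptions for illustration, not Titan specifications, and static/leakage power is ignored.

```python
# Rough clock-vs-power sketch assuming dynamic power scales as f * V^2.
# The voltages and the 250 W baseline are made-up illustrative values,
# not Titan specs, and static/leakage power is ignored.

def scale_power(p_base, f_base, v_base, f_new, v_new):
    """Scale dynamic power by frequency and the square of the core voltage."""
    return p_base * (f_new / f_base) * (v_new / v_base) ** 2

p_950 = 250.0  # assumed average draw at 950 MHz and 1.05 V
p_876 = scale_power(p_950, f_base=950, v_base=1.05, f_new=876, v_new=1.00)
print(f"estimated average at 876 MHz: {p_876:.0f} W")
# ~209 W with these guessed voltages; a smaller voltage step lands closer
# to the 225-230 W figure above.
```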
I think the closer they can get average to peak, the better. I'm pretty sure that was the intent of boost clocks with Kepler; they still seem to need to do some work in that area.
Exactly. If NV tested the Titan in a CPU-dependent game like Skyrim, they might have had a lot more headroom for GPU Boost. If they ran the Titan for hours in Crysis 3 and saw that it draws 30-40W more power, they would have needed to back GPU Boost off from 1GHz, because games like C3 are going to be far more GPU limited. Average power consumption also includes areas of a game where the GPU is not fully loaded or the CPU is the limit. More importantly, when you are designing a GPU's heatsink, VRMs, and power circuitry, you have to make sure everything can run for days or weeks at a time at 99% GPU load, because people don't just use their GPUs for games. In those cases, average and peak power usage will be extremely close. Usage scenarios such as distributed computing (Folding@Home, etc.) can load the GPU to 99% for days at a time. NV has to account for all of this, which is why they care a great deal about those 99%-load cases, and that ultimately limits how high they can clock the GPU.