RS, I largely agree with what you said in that other thread: the most stressful new games are the ones that really matter, since older games get maxed out more easily anyway. (Exception: games that you plan to play in Eyefinity. That's why I still care about DX9 performance, because I like to play Source games at 5040x1050. So even though Source games are not typically thought of as new or stressful, I like it when people bench new cards with Left4Dead2.)
I would like to add that canned benchmarks matter less than actual playthroughs (especially in the past, when NV/ATI optimized for game benchmarks more than they do now), which is why I read AT for hardware analysis and HardOCP for performance comparisons in *games* rather than *game benchmarks*. In fact, your recommendation of testing actual performance in a few new, stressful games is basically what HardOCP already does:
http://www.hardocp.com/article/2010/10/21/amd_radeon_hd_6870_6850_video_card_review/
They just do it in a way that seems a bit weird to some people: they push details and resolutions up to the limit of each card, rather than running apples-to-apples comparisons at specific resolution/setting combinations. They do put a few apples-to-apples charts in each review, though, to appease people who want to see those kinds of things.
Something I would like to see is HardOCP-style reviews that compare oc vs. oc, but because overclocking results vary so much from card to card, I understand why those kinds of reviews are rarer. I mean, you might luck out and get a GTX460 that hits 880MHz@stock volts, or you could get a card that hits a wall at 825MHz. (Guru3D tested six cards and that was the approximate range.)
On the other hand, with enough data points we can get a sense of the likely range of maximum oc's at stock volts: most stock-clocked and moderately-oc'd GTX460s max out at about 840MHz with good cooling (see the Guru3D review), and most stock-clocked Radeon 6850s max out at ~900MHz with good cooling (I made a meta-analysis thread about this). So I'd like to see more reviews that pit, say, a GTX460-1GB@840MHz@stockvolts vs. a HD6850@900MHz@stockvolts, because those are "typical" maximum oc's without overvolting. The reviews should use actual gameplay, not canned benchmarks or synthetic tests like Vantage.
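For anyone who wants to do this kind of tally themselves, here is a rough sketch of how you could boil a pile of reported max overclocks down to a range and a "typical" value. The function name and the sample list are my own assumptions for illustration: only the 825MHz and 880MHz endpoints come from the range mentioned above, and the values in between are placeholders, not measured results.

```python
from statistics import median


def summarize_max_oc(clocks_mhz):
    """Return (lowest, median, highest) of a list of max stable core clocks in MHz."""
    if not clocks_mhz:
        raise ValueError("need at least one sample")
    ordered = sorted(clocks_mhz)
    return ordered[0], median(ordered), ordered[-1]


# Placeholder samples, NOT real review data: only the 825 and 880 endpoints
# appear in the post; the middle values are invented to show the idea.
samples = [825, 835, 840, 840, 850, 880]

low, typical, high = summarize_max_oc(samples)
print(f"range {low}-{high} MHz, typical ~{typical} MHz")
```

I used the median rather than the mean so that one golden sample doesn't drag the "typical" number up. You would swap in the clocks actually reported in reviews before drawing any conclusions.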
I'd also like to see similar studies with overvolting, but all we have so far are reviews like HWC and TR where they maxed out volts on specially-binned GTX460s but did not max out volts on 6850s.
Edited to add, for those who have no idea why HardOCP does things the way it does:
http://enthusiast.hardocp.com/article/2008/02/11/benchmarking_benchmarks