A new look at game benchmarking

Will Robinson · Sep 9, 2011

A very good article by Tech Report's Scott Wasson.:thumbsup:
http://techreport.com/articles.x/21516

BFG10K · Sep 9, 2011

I did a theoretical analysis of microstutter a while ago (never published). It’s actually quite simple once you sit down and understand the math.

I even managed to come up with a formula that could make predictions when it would happen. It correctly predicted that more GPUs in the system would reduce the problem, as was later confirmed by the recent Tom’s Hardware article.

Anyway, it’s nice to have more practical numbers backing my theories. :awe:

Mr. Pedantic · Sep 9, 2011

Interesting. Because of the way the performance of the cards vary, I think a more effective method to portray the 50th-99th percentile frame times is to use the 50th percentile as baseline and then just show how as a percentage how much longer the 99th percentile takes to render. The way it's presented in the article is misleading because of the obvious inherent performance difference between the cards.

Ben90 · Sep 9, 2011

I'm writing this with caution because I know how controversial such topics end up becoming. This thread has the potential to help a lot of people learn a few things. I really request responses to this post and subsequent like it to be objective and worthwhile. Not just name calling.

Individual frame time has been a big issue with ATi cards vs their Nvidia counterparts dating back to the release of unified shaders. With average frame rates being equal, a Nvidia card almost always provides more consistent frame rates. This held true through all VLIW5 architectures. Its very possible things have changed with VLIW4 as I haven't been as involved in GPUs as the past, I do doubt it however.

Again, I'm not saying AMD cards are rubbish. It really wasn't THAT big of a deal to most people anyways.

WMD · Sep 9, 2011

Great article. The multi gpu test with Bad company 2 part really goes to shows why 2x6870 isn't as good as one Gtx580. It renders every odd frame at 47fps while gtx580 maintains 58fps. The other 2 tests are using badly chosen game though. Bulletsorm is capped at 62fps internally which is why MS dissappears entirely when using more powerful cards. Starcraft 2 is cpu bottlenecked even at 2560x1600

Pia · Sep 9, 2011

That's a great article. Just posting to encourage others to read.

Lonbjerg · Sep 9, 2011

Finally some good data on microstutter.

renz20003 · Sep 9, 2011

Interesting read, I read an article not to long ago on microstuttering that if your avg frames are 10 -20 fps above your refresh rate you shouldnt notice it. The game that ive played that it really stands out would be witcher 2 when you hit a checkpoint/autosave. Atleast in the article AMD and Nvidia both acknowledged it as a serious problem, so hopefully they will make some progress in elminating it in the future.

Aristotelian · Sep 9, 2011

BFG10K said:
I did a theoretical analysis of microstutter a while ago (never published). Its actually quite simple once you sit down and understand the math.

I even managed to come up with a formula that could make predictions when it would happen. It correctly predicted that more GPUs in the system would reduce the problem, as was later confirmed by the recent Toms Hardware article.

Anyway, its nice to have more practical numbers backing my theories. :awe:

More GPUs in the system 'reduce' the problem? That's an interesting result. Can you please post or PM the article? I'd be happy to read it.

BFG10K · Sep 9, 2011

Aristotelian said:
More GPUs in the system 'reduce' the problem? That's an interesting result. Can you please post or PM the article? I'd be happy to read it.

Like I said, my results were never published.

If youre talking about the Toms article: http://www.tomshardware.com/reviews/radeon-geforce-stutter-crossfire,2995.html

Aristotelian · Sep 9, 2011

I was asking you to publish it on your site, or even to PM me the content.

BFG10K · Sep 9, 2011

Aristotelian said:
I was asking you to publish it on your site, or even to PM me the content.

I dont write for ABT anymore, and the content is not in a state to be PMd.

SirPauly · Sep 9, 2011

Aristotelian said:
More GPUs in the system 'reduce' the problem? That's an interesting result. Can you please post or PM the article? I'd be happy to read it.

Here is the first article, imho, that investigated and an outstanding article:

http://www.rage3d.com/reviews/video/ati4870x2cf/index.php?p=2

Silverforce11 · Sep 9, 2011

I dunno if u guys read the conclusion, fraps is totally inaccurate in measuring frame latency for NV cards.

Also, fraps reported high latency spikes for the 6870 CF pair but in game, high fps constant, no noticeable jitters.

They need to get a high speed camera before considering any benchmarks on frame latency. Fraps is not the be all end all.

notty22 · Sep 9, 2011

Silverforce11 said:
I dunno if u guys read the conclusion, fraps is totally inaccurate in measuring frame latency for NV cards.

Also, fraps reported high latency spikes for the 6870 CF pair but in game, high fps constant, no noticeable jitters.

They need to get a high speed camera before considering any benchmarks on frame latency. Fraps is not the be all end all.

I agree with your point. Fraps does certain functions. But they may not always be pertinent data, to what ends up on the screen.

Poof. Mind blown.
Now, take note of the implications here. Because the metering delay is presumably inserted between T_render and T_display, Fraps would miss it entirely. That means all of our SLI data on the preceding pages might not track with how frames are presented to the user. Rather than perceive an alternating series of long and short frame times, the user would see a more even flow of frames at an average latency between the two.

RussianSensation · Sep 9, 2011

Read the article and 2 things stood out:

"1) For graphical applications like games that involve interaction, I don't think we'd want frame times to go much higher than that. I'm mostly just winging it here, but my sense is that a frame time over 50 ms is probably worthy of note as a mark against a gaming system's performance. Stay above that for long, and your frame rate will drop to 20 FPS or lower—and most folks will probably start questioning whether they need to upgrade their systems."

Just like Kyle, he chose an arbitrary "smoothness level", in this case 20 fps minimums. Let's say you have a situation where a card drops 5x to 20 fps min (>50 ms response) but 99% of the time maintains minimums of 35 fps / avg of 60. In other case, you can have a card that never drops to 20 fps but most of the time maintains a far lower average of 50 fps and minimums of 30 fps. Which card would I take? I would take the first one, but it would be worse performing based on the criteria he set.

"2) Presumably, a jitter pattern alternating between five- and 15-millisecond frame times would be less of an annoyance than a 15- and 45-millisecond pattern. The worst example we saw in our testing alternated between roughly six and twenty milliseconds, but it didn't jump out at me as a problem during our original testing. Just now, I fired up Bad Company 2 on a pair of Radeon HD 6870s with the latest Catalyst 11.8 drivers. Fraps measures the same degree of jitter we saw initially, but try as I might, I can't see the problem."

So he can measure the microstutter/jittering but in the real world he can't perceive the difference, later sighting his IPS monitor as "too slow" to capture this latency issue. So we have another component into the mix = microstuttering may differ across TN / IPS panels.

Basically short of a person actually trying CF/SLI for themselves, it seems it's impossible to discount CF/SLI as microstuttering mess because:

1) Micro-stutter may exist but depending on the speed of the monitor, it's may be a non-issue altogether
2) Micro-stutter may exist, but the user may or may not notice it
3) Micro-stutter differs across AMD/NV brands and even across games (i.e., game engines and GPU load too) <way too many factors all of a sudden)
***4) As Tom's Hardware pointed, Micro-stuttering is alleviated somewhat if you run faster GPUs and/or if you add a 3rd GPU into the mix, creating yet another variable.

So we are back to square 1: it makes sense why some report issues with micro-stutter and others don't really have them.

WMD · Sep 9, 2011

RussianSensation said:
"2) Presumably, a jitter pattern alternating between five- and 15-millisecond frame times would be less of an annoyance than a 15- and 45-millisecond pattern. The worst example we saw in our testing alternated between roughly six and twenty milliseconds, but it didn't jump out at me as a problem during our original testing. Just now, I fired up Bad Company 2 on a pair of Radeon HD 6870s with the latest Catalyst 11.8 drivers. Fraps measures the same degree of jitter we saw initially, but try as I might, I can't see the problem."

So he can measure the microstutter/jittering but in the real world he can't perceive the difference, later sighting his IPS monitor as "too slow" to capture this latency issue. So we have another component into the mix = microstuttering may differ across TN / IPS panels.

Basically short of a person actually trying CF/SLI for themselves, it seems it's impossible to discount CF/SLI as microstuttering mess because:

1) Micro-stutter may exist but depending on the speed of the monitor, it's may be a non-issue altogether
2) Micro-stutter may exist, but the user may or may not notice it
3) Micro-stutter differs across AMD/NV brands and even across games (i.e., game engines and GPU load too) <way too many factors all of a sudden)
***4) As Tom's Hardware pointed, Micro-stuttering is alleviated somewhat if you run faster GPUs and/or if you add a 3rd GPU into the mix, creating yet another variable.

So we are back to square 1: it makes sense why some report issues with micro-stutter and others don't really have them.

In that case with 6870 CF setup, half the frames are rendered at 7ms the other half at 20ms with avg fps at 74fps. But keep in mind even at 20ms the perceived visible fps is 50fps. Still quite acceptable enough for him to not notice a problem. The more important question is: The 6870CF setup at 74fps avg clearly outperforms 6970 in benchmarks at 50fps. But visually does it perform any better? Most likely not if our eyes are going to notice those slower frames that performs at 20ms.

CF micro-stutter in this case did not cause a performance issue as the frame rates are too high for problem to manifest but one could have gone with a single card of the same cost and have similar visible performance with less driver/ game compatibility issues.

cusideabelincoln · Sep 9, 2011

I've seen microstutter on my crappy TN panel with a 560 Ti. It would come and go when I was playing Deus Ex. I'd monitor my framerates and they were above 75 fps, but the motion I was seeing was not smooth at all and looked, to my eyes, like it a consistently normal 30 fps would look. Of the entire playtime it only happened less than 15% of the time, but when it did I noticed it.

So I would very much like to see this testing methodology for more games.

Also a side note about the frame number graphs they used: Since they used the same frame number range, but every card didn't render the same amount of frames, the graphs may not be entirely comparable at least directly to each other. Ideally you would want to graph the same realtime chunk of the benchmark, as this should ensure each card is rendering the same exact scene of the benchmark. But I can see why they didn't do this, and in the end I don't think it mattered for what they were trying to show. I did like the percentile graphs as a way to compare all of the cards to each other:

http://techreport.com/r.x/inside-the-second/bc2-percentiles.gif

blackened23 · Sep 9, 2011

Lonbjerg said:
Finally some good data on microstutter.

Yes....thank god nvidia and amd are forthcoming about admitting to the problem. Hopefully the next generation of cards will finally have a better solution for it.

SirPauly · Sep 9, 2011

http://www.rage3d.com/board/showpost.php?p=1336677578&postcount=5

Here is another link, where this investigation did bring in how the CPU effects micro-stuttering, too.

Termie · Sep 9, 2011

Fantastic article, and something I've been interested in for a quite a while (I started the thread on the Tom's article a few weeks ago).

What I really like about this article is that it applies the theory to single cards - I've always suspected that it wasn't just dual card systems that suffered from microstutter (it's just more noticeable when frame rates drop on those systems).

Of course, I was disappointed to see the results for the Radeons, but I'm fairly certain I've witnessed it myself on my HD5850 while playing BC2.

JAG87 · Sep 9, 2011

WMD said:

This is what happens when you use vsync and triple buffering on a 60Hz monitor with the 6870 CFX setup.

Clearly outperforms the single 6970 statistically and visually.

Food for thought.

Silverforce11 · Sep 9, 2011

Wait.. so whats going on, vsync and triple buffering on removes microstuttering??

<- Always play with vsync on already.

WMD · Sep 10, 2011

JAG87 said:
This is what happens when you use vsync and triple buffering on a 60Hz monitor with the 6870 CFX setup.

Clearly outperforms the single 6970 statistically and visually.

Food for thought.

Except when it drops below 60fps the perceived stutter will be even worse

Notice the 6970CF have alot of >50ms lag spikes. Those are the moments it will not keep up above 60fps.

truth_benchmark · May 7, 2012

Is there any possibility that AnandTech will adopt frame time benchmarking for GPU's and games ?

Tech Report's article about frame time benchmarking was published last September 2011. I'm wondering why most review sites still stick with using frame rates only

A new look at game benchmarking

Golden Member

Lifer

Diamond Member

Platinum Member

Senior member

Golden Member

Diamond Member

Platinum Member

Golden Member

Lifer

Golden Member

Lifer

Diamond Member

Lifer

Diamond Member

Elite Member

Senior member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Lifer

Senior member

Junior Member