[TECH Report] As the second turns: the web digests our game testing methods


Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
If it has to be shown in slo-mo to be seen then what does it matter?

Nobody, not me, AMD, or anyone else is saying that the measured frame latencies don't exist. Of course AMD is going to be concerned. It's a benchmark that shows their card in a negative light compared to the competition. Please link to AMD saying there's a visible stuttering problem. They have said they are addressing the frame latencies, not stuttering. At least that's all I have seen. Maybe I missed it?

 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Look, I can see why you chose to take this line in the debate. If done well, it would provide a certain level of deniability of the issue if it can't be seen. Unfortunately for you, you're not doing it well.
You've just claimed that because AMD did not claim to see it, it isn't there?
Your position is quite weak my friend. I don't know why you continue with this line. All I know is that you have been one busy beaver the last week.

Once again, I'm not claiming anything. Nowhere do I say it does or doesn't visibly stutter during gameplay. I have no position, because I'm not making any claims one way or the other.

I didn't claim AMD saw or didn't see anything. I just stated what they actually said, rather than making something up.

I just want realworld blind testing.
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
55
91
Once again, I'm not claiming anything. Nowhere do I say it does or doesn't visibly stutter during gameplay. I have no position, because I'm not making any claims one way or the other.

I didn't claim AMD saw or didn't see anything. I just stated what they actually said, rather than making something up.

I just want realworld blind testing.

Yes we know. You didn't do, claim, say, suggest, or imply anything at all.
And you just made me laugh a little bit saying something was made up. Your position, although you have not physically typed it, is quite clear.
And if you want a blind test, why don't you coordinate one?
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Yes we know. You didn't do, claim, say, suggest, or imply anything at all.
And you just made me laugh a little bit saying something was made up. Your position, although you have not physically typed it, is quite clear.
And if you want a blind test, why don't you coordinate one?

There you go putting words in my mouth. You can't actually address what I'm saying and choose to ignore what I'm suggesting. I have not claimed, though, that there is or isn't a stuttering issue, and my position, just like yours, is irrelevant. As for me doing the tests, I wish I had the resources.
 

MrK6

Diamond Member
Aug 9, 2004
4,458
4
81
I'm not denying anything. I don't know what you're talking about.

Here's the scenario for you. If you did a double blind test like I described and can tell a difference then it proves there's a visible flaw in the reproduction. No problem, as I never said there wasn't one. Just that I'd like it tested under real world conditions without any chance of bias (which as humans we all have) dictating the outcome. Of course, if you couldn't tell which card was which with any statistical significance, then you are wrong. You are the only one between us who has predicted an outcome. It's no wonder you don't like the idea.
The problem is you're arguing real science against people who aren't knowledgeable enough to comprehend it or will simply deny it because they have an agenda to prove. I wouldn't waste my time, but good effort. :thumbsup:
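For anyone wondering what "with any statistical significance" would mean in practice for the blind test quoted above, here is a minimal sketch. It assumes a simple two-choice setup (the tester names which of two unlabeled rigs is smoother, so pure guessing is right half the time) and uses an exact one-sided binomial test. The function name and the 15-out-of-20 session are made up for illustration and are not taken from any actual test.

```python
from math import comb

def binomial_p_value(correct, trials, chance=0.5):
    """One-sided p-value: probability of getting at least `correct`
    right answers out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical session: 20 blind trials, tester picks the right card 15 times.
p = binomial_p_value(15, 20)
print(f"p = {p:.4f}")  # p = 0.0207
```

A small p-value would only say the tester is doing better than guessing. It says nothing about how much the difference matters during play.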
 

Rikard

Senior member
Apr 25, 2012
428
0
0
It is unscientific to claim either way until we have the results. To claim nothing is the appropriate stance in this case, for now.
 

Granseth

Senior member
May 6, 2009
258
0
71
It would be cool if a review site (like Anandtech) would do a series of blind tests, but with no more than 2 stations in each test and no more than one variable per test.
It could be GPU vs GPU, CPU vs CPU, high settings vs low settings, HDD vs SSD, etc.
It's always fun to read about subjective observations analyzed in an objective setting. I always like to know how much I'm fooling myself when I buy something based on a gut feeling.
 

Keysplayr

Elite Member
Jan 16, 2003
21,219
55
91
The problem is you're arguing real science against people who aren't knowledgeable enough to comprehend it or will simply deny it because they have an agenda to prove. I wouldn't waste my time, but good effort. :thumbsup:

Very well said.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
Do you have a problem with sitting people down at gaming rigs with 7950's and 660ti's who say that they can see a difference and compare the 2 for smoothness without knowing which card they are using?

Why don't you explain where my logic is faulty, rather than making accusations of my motives?

Yes, I have a problem with it. At least with part of it. And the part I don't agree with is not the blind test per se,
but the mind-boggling idea that measurements done with a fine Swiss-made micrometer should be checked with a goddamn stick.

Your persistent insistence on using a blunt, error-prone, extremely hard to control and pin down instrument like "blind-testing",
and on using such an instrument not only to represent finely grained variables like gameplay smoothness and responsiveness,
but to double-check and supersede statistical analysis of frame times, is ludicrous. (Looking at you, AMD, and your ridic "blind-testing".)

I am also curious about the thresholds of the human eye. But guess what - different systems perform differently.
That's why we have freaking benchmarks, and like...numbers.
Other than just 0 and 1, and blind-test subjects responses like "hmm...smoother?" and "uhh...more stuttery. I think..."

World-wide experiment on human eye thresholds, followed by another one in order to confirm the findings of the previous,
then another one on a different control group - they are just fine.
Just stop suggesting that frame-time analysis is worthless without 10 years of research on psycho-optics.

Different graphics cards will give you different FPS, different gameplay smoothness and responsiveness,
whether or not the difference gets noticed by a control group led by a guy called Fatal1ty, or by one made up of people "who say that they can see a difference".

Benchmarking has always been about quantifying. That means using things like rational numbers.
And there is a reason why rational numbers are superior to answers like YES or NO.
 

Jaydip

Diamond Member
Mar 29, 2010
3,691
21
81
Very well said Fish. Honestly, if blind testing is the new way of reviewing hardware now, we can safely say 680 and 7970 GHz are tied, as almost no one will be able to discern 4-5 fps differences. Even the premise behind the method is ridiculous.
 

Rikard

Senior member
Apr 25, 2012
428
0
0
Yes, I have a problem with it. At least with part of it. And the part I don't agree with is not the blind test per se,
but the mind-boggling idea that measurements done with a fine Swiss-made micrometer should be checked with a goddamn stick.
What if the Swiss-made micrometer answers a different question than the stick? We are not talking about replacing one with the other, mind you.

Honestly if blind testing is the new way of reviewing hardware now we can safely say 680 and 7970 GHz are tied as almost no one will be able to discern 4-5 fps differences.
Wouldn't that be good progress? If a customer can understand what is a meaningful difference and what is not, he/she will be able to make a better-informed decision. As a result, hardware manufacturers will not spend money on useless gimmicks but on actual improvements that matter to people. Nowadays we have almost universally agreed that playable is 30 FPS and good is 60 FPS, but we are not so stupid that we fail to realize this depends on the person and the situation. My hope is that we will get to that point for microstutter, tearing, artifacts and other graphical imperfections too. Why are you so against it?
Even the premise behind the method is ridiculous.
Why? You had the chance to present your groundbreaking alternative method, but for some reason you avoided doing so. You still have the chance to do it, you know...
 

Final8ty

Golden Member
Jun 13, 2007
1,172
13
81
I see a lot of twisting of what 3DVagabond is trying to say, and I understand his point.

There is nothing wrong with the blind testing suggestion to find out what % of latency or micro stuttering is perceivable to most people; then at least most will know, when looking at the graph, whether it's in or out of the perceivable range for most.

None of that will change which brand is statistically smoother, nor is it trying to.
 

KingFatty

Diamond Member
Dec 29, 2010
3,034
1
81
It seems to me there is confusion between the accepted facts and the interpretation of those facts.

Facts: I think we all agree that you can identify the facts here, see the plot of frame times and see that there is a difference between the AMD and Nvidia plots. Those are the facts. The stutter exists in both cards. I don't think this is controversial, and it's plain to see that the AMD plot differs from the Nvidia plot?

Interpretation: when you look at the facts above, why do some people interpret the Nvidia facts as smooth, and the AMD facts as microstutter? Why do some people not notice? This is subjective and can come down to physical traits like vision and sensitivity. Again, I think we all agree about this variation.

Seems like 3DVag. is saying use blind testing to figure out where to draw the line - at what point do you subjectively interpret stutter vs smooth? When you get that, it could help when you buy a card to say whether it will likely be subjectively smooth or stuttery. That could be useful, like knowing that most games look nice at 60 FPS but at 30 FPS you start to push the boundary of smoothness.

It's good to know the subjective average FPS that most gamers accept, and similarly, it would be good to know the subjective average amount of stutter that most gamers accept.
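To make the "where to draw the line" idea concrete, here is a rough sketch of how blind-test results could be boiled down to a single threshold. Everything in it is an assumption for illustration: the trial data is invented, the function name is hypothetical, and the 75% detection cutoff is an arbitrary choice. It also assumes detection gets easier as spikes get bigger, which real data may not respect.

```python
def detection_threshold(trials, min_rate=0.75):
    """Return the smallest spike size (ms) whose detection rate reaches
    min_rate, or None if no tested size does.

    trials: iterable of (spike_ms, detected) pairs from blind sessions.
    """
    counts = {}
    for spike_ms, detected in trials:
        hits, total = counts.get(spike_ms, (0, 0))
        counts[spike_ms] = (hits + int(detected), total + 1)
    for spike_ms in sorted(counts):
        hits, total = counts[spike_ms]
        if hits / total >= min_rate:
            return spike_ms
    return None

# Invented blind-trial results: (spike size in ms, did the tester report stutter?)
trials = (
    [(5, False)] * 4
    + [(10, True)] * 2 + [(10, False)] * 2
    + [(20, True)] * 3 + [(20, False)] * 1
    + [(40, True)] * 4
)
print(detection_threshold(trials))  # 20 with the invented data above
```

A number like that would be a rule of thumb for a group of testers, not a verdict for any one person.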
 

Leadbox

Senior member
Oct 25, 2010
744
63
91
It seems to me there is confusion between the accepted facts and the interpretation of those facts.

Facts: I think we all agree that you can identify the facts here, see the plot of frame times and see that there is a difference between the AMD and Nvidia plots. Those are the facts. The stutter exists in both cards. I don't think this is controversial, and it's plain to see that the AMD plot differs from the Nvidia plot?

Interpretation: when you look at the facts above, why do some people interpret the Nvidia facts as smooth, and the AMD facts as microstutter? Why do some people not notice? This is subjective and can come down to physical traits like vision and sensitivity. Again, I think we all agree about this variation.

Seems like 3DVag. is saying use blind testing to figure out where to draw the line - at what point do you subjectively interpret stutter vs smooth? When you get that, it could help when you buy a card to say whether it will likely be subjectively smooth or stuttery. That could be useful, like knowing that most games look nice at 60 FPS but at 30 FPS you start to push the boundary of smoothness.

It's good to know the subjective average FPS that most gamers accept, and similarly, it would be good to know the subjective average amount of stutter that most gamers accept.

Well put :thumbsup:
 

Lonbjerg

Diamond Member
Dec 6, 2009
4,419
0
0
It seems to me there is confusion between the accepted facts and the interpretation of those facts.

Facts: I think we all agree that you can identify the facts here, see the plot of frame times and see that there is a difference between the AMD and Nvidia plots. Those are the facts. The stutter exists in both cards. I don't think this is controversial, and it's plain to see that the AMD plot differs from the Nvidia plot?

Interpretation: when you look at the facts above, why do some people interpret the Nvidia facts as smooth, and the AMD facts as microstutter? Why do some people not notice? This is subjective and can come down to physical traits like vision and sensitivity. Again, I think we all agree about this variation.

Seems like 3DVag. is saying use blind testing to figure out where to draw the line - at what point do you subjectively interpret stutter vs smooth? When you get that, it could help when you buy a card to say whether it will likely be subjectively smooth or stuttery. That could be useful, like knowing that most games look nice at 60 FPS but at 30 FPS you start to push the boundary of smoothness.

It's good to know the subjective average FPS that most gamers accept, and similarly, it would be good to know the subjective average amount of stutter that most gamers accept.

That would be all fine and dandy... if not for his wording... those that "claim" to perceive it.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
What if the Swiss-made micrometer answers a different question than the stick? We are not talking about replacing one with the other, mind you.

There is nothing wrong with the blind testing suggestion to find out what % of latency or micro stuttering is perceivable to most people; then at least most will know, when looking at the graph, whether it's in or out of the perceivable range for most.

Well good luck to you.
Good luck with finding a suitable control group,
GL with conducting a controlled experiment with even semi-reproducible results (sorry, you kinda need that :D),
and from there developing a theory and algorithms that will pass 1/10 of the scrutiny that TR has been subjected to.

I am sure you'd want your findings to be reusable later, without the need to gather your control group again, amirite?
Then good luck with finding an algorithm and values that will hold their ground when tested against different games, with different screen dynamics, across a range of different gfx cards, FPS, details and resolutions.

In the meantime most of us will continue using the simple assumption that faster and steadier frame output is... better than a slow and jittery one.

Seems like 3DVag. is saying use blind testing to figure out where to draw the line - at what point do you subjectively interpret stutter vs smooth? When you get that, it could help when you buy a card to say whether it will likely be subjectively smooth or stuttery. That could be useful, like knowing that most games look nice at 60 FPS but at 30 FPS you start to push the boundary of smoothness.

See Zeno and his Grain of Millet paradox. There is no telling exactly how many grains are needed before you can hear them producing noise when hitting the ground.
Even when you can draw the line and say "stuttering begins here", the inherent uncertainty of any algorithm derived from blind-testing
makes it highly difficult to employ and recommend next to a concept as simple as "faster is better".

Did I mention that uncertainties need to be multiplied, and your already chunky one will only become fatter when applied across those who are supposed to benefit from your method, i.e. average users?

So yeah... nice idea. See ya in 10 years!
 

Granseth

Senior member
May 6, 2009
258
0
71
(...)

See Zeno and his Grain of Millet paradox. There is no telling exactly how many grains are needed before you can hear them producing noise when hitting the ground.
Even when you can draw the line and say "stuttering begins here", the inherent uncertainty of any algorithm derived from blind-testing
makes it highly difficult to employ and recommend next to a concept as simple as "faster is better".

Did I mention that uncertainties need to be multiplied, and your already chunky one will only become fatter when applied across those who are supposed to benefit from your method, i.e. average users?

So yeah... nice idea. See ya in 10 years!
The biggest problem you would have is that even though a lot of people might not hear the grains, it would still be an issue for those that do.

I would like to see some blind test studies, but not the discussions on the forums that will follow.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
The biggest problem you would have is that even though a lot of people might not hear the grains, it would still be an issue for those that do.

I have no doubt that same as there are people with mad skillz, there are those with over-the-chart perception/eyesight.
Blind-testing does not care about those with exceptional skillz.
So on top of everything, it assumes that their control group is "better" than you.

But that's the least of my issues with blind-testing...
In the time when reviewers and developers are beginning to fight latencies, we have some amongst us(*) asking the upside-down question:

With how much latency they can hit us, before we say Enough!

Well duh...quite some if properly trained and slowly desensitized!

(*)part which is a bit hard to swallow :hmm:
Don't they want latencies as small as possible, instead of as big as endurable??



I would like to see some blind test studies, but not the discussions on the forums that will follow.

I would not touch that thread with a 10-foot pole :ninja:
 

railven

Diamond Member
Mar 25, 2010
6,604
561
126
I would blind test the people who claim they see it first.

The whole point is to verify that it's visible during normal gameplay. Until it's verified it's not a fact.

Browsing this thread and found it odd that I'm being called a liar.

I saw the stutter. I went through the work to pull a latency log and from that data I was able to see the stutter.

I got a second person who saw the stutter.

But I guess we're just lying because 3DVagabond needs someone else (possibly himself) to verify it for it to be factual?

Not even sure what you're trying to defend at this point but, keep up the good work, :rolleyes:
 

Elfear

Diamond Member
May 30, 2004
7,165
824
126
In the meantime most of us will continue using the simple assumption that faster and steadier frame output is... better than a slow and jittery one.

Ahh. But the problem isn't who would say a faster and steadier frame output is better, because everyone would say it's better. The question is: if you see a graph of two competing cards and card A is noticeably faster than card B but card A has 2ms higher latency spikes (randomly chosen value), which card would you buy?

Not so simple a question then. If 2ms latency spikes bug you, then card B would be the better choice. If you don't notice the spikes, then card A would be better. Well how are you supposed to know if 2ms latency spikes would bug you or not? By establishing a general rule of thumb from either double-blind tests or capturing video somehow that shows the actual output the user would see on their monitor (or both).

Just like a FPS graph means nothing without some knowledge of what feels smooth and what doesn't so it is with frame latency graphs. I believe this is the point many posters are trying to make but others keep twisting it around.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
Ahh. But the problem isn't who would say a faster and steadier frame output is better, because everyone would say it's better. The question is: if you see a graph of two competing cards and card A is noticeably faster than card B but card A has 2ms higher latency spikes (randomly chosen value), which card would you buy?

Not so simple a question then. If 2ms latency spikes bug you, then card B would be the better choice. If you don't notice the spikes, then card A would be better. Well how are you supposed to know if 2ms latency spikes would bug you or not? By establishing a general rule of thumb from either double-blind tests or capturing video somehow that shows the actual output the user would see on their monitor (or both).

Not so simple questions have no easy answers. How about no amount of blind-testing by other people can answer which card is better for ME?
How about: there is no definite answer, because each one has its good and bad sides?

What you guys are saying is:
"No wait, we are not ready for all this new data we are getting.
We need to assemble our imaginary never-going-to-happen control group, and come up with a not-so-simple theory which will give us something as crude as 'general rule(s) of thumb'."
To which I say "Great. I'd love to hear more about it when you are done."
And then I would sift through whatever data is available, and make my judgment from there.

Just like a FPS graph means nothing without some knowledge of what feels smooth and what doesn't so it is with frame latency graphs. I believe this is the point many posters are trying to make but others keep twisting it around.

Good example.
Although no one has ever done any blind-testing work and come up with any established FPS theory,
and although 30fps can be perfectly playable, and 80fps can be a mess, and although there are no "general rules of thumb" with FPS,
other than the one I just created: "somewhere between 30 and 80, but essentially THE MORE - THE BETTER",
we still use FPS as a general performance indicator.

So following this rule of thumb of mine, we have acceptable range between highest and lowest of 50fps.
Do you see what kind of uncertainty that is? About 167% of the lowest acceptable fps.
Some rule of thumb. Great precision... And yet FPS is just fine, and no need for blind-testing there.

So do we REALLY need to wait for these imaginary rules of thumb, before attempting to digest new data?
Do you expect them to be any better, more precise than mine above, and hence actually usable?
 

KingFatty

Diamond Member
Dec 29, 2010
3,034
1
81
I'm thinking along the lines of how this would be exploring characteristics of people.

I mean, most gamers would agree that falling below 30 fps is going to affect your enjoyment.

We just need to know how to, ah, come up with a way to quantify stutter. Once you do that, I think gamers would come to some unwritten agreement about what is acceptable and what is not.

Sure you could do blind testing, but does anyone remember all the blind testing that was done for the industry to agree on 30 fps being the crossover between acceptable and unacceptable? I don't, it just happened. It's just universally agreed on for the most part (I think, maybe now it's bumped up with fancier cards?).

Anyway, I'm just saying that we are all exploring the frontier right here, it's like a game changing thing right now, we are seeing the birth of a new metric to rate video cards. It's happening now, and that's kind of cool. There will be a way to easily quantify it, and you'll see cards rated like that (X frames per second, Y stutter).
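As a rough idea of what an "X frames per second, Y stutter" rating could look like, here is a small sketch that turns a list of frame times into an average FPS plus two stutter-style numbers: the 99th-percentile frame time and the total time spent past a cutoff. The function names, the 50 ms cutoff and both traces are assumptions picked for illustration, not measurements from any card.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile (pct from 0 to 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def time_beyond(frame_times_ms, threshold_ms=50.0):
    """Total milliseconds spent past the threshold, summed over slow frames."""
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

def summarize(frame_times_ms):
    """Average FPS plus two stutter-style numbers for one trace."""
    total_seconds = sum(frame_times_ms) / 1000.0
    return {
        "avg_fps": len(frame_times_ms) / total_seconds,
        "p99_ms": percentile(frame_times_ms, 99),
        "beyond_50ms": time_beyond(frame_times_ms),
    }

# Two made-up traces of 100 frames, both lasting exactly 2 seconds.
steady = [20.0] * 100
spiky = [15.0] * 95 + [115.0] * 5

for name, trace in (("steady", steady), ("spiky", spiky)):
    print(name, summarize(trace))
```

Both traces average 50 FPS, but only the spiky one shows a high 99th-percentile frame time and any time past the cutoff, which is the kind of difference an FPS average alone hides.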