How far off are canned benchmarks?

taltamir

Lifer
Mar 21, 2004
13,576
6
76
I tried doing some benchmarking of Company of Heroes for a thread about the 9600GT and the value of shader power. I first benchmarked it with full shaders, then with reduced shader capabilities.

Min FPS: 13, average FPS 16.7, max FPS 42... Actual gameplay at those settings? A solid 4 fps; in action scenes it dips to 1, and on occasion it spikes to 7... both lasted less than a second.
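
To put a number on that gap, here is a minimal Python sketch that just divides the gameplay average by the benchmark average. The function name `gameplay_ratio` is made up for illustration; the inputs are the figures quoted above.

```python
def gameplay_ratio(benchmark_avg_fps: float, gameplay_avg_fps: float) -> float:
    """Return gameplay FPS as a fraction of the canned benchmark FPS."""
    if benchmark_avg_fps <= 0:
        raise ValueError("benchmark FPS must be positive")
    return gameplay_avg_fps / benchmark_avg_fps


# Figures from this post: benchmark average 16.7, gameplay roughly 4.
ratio = gameplay_ratio(16.7, 4.0)
print(f"gameplay runs at {ratio:.0%} of the benchmark average")
```

Anything well below 100% here means the canned number overstates what you get in play.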

Here is a review where they tested same video card (mine is XFX):
http://www.legitreviews.com/article/648/7/

Check out their 1920x1200 highest-quality score. I tested mine at 1920x1200 with the highest+ setting (set highest, go into advanced, and raise the 3 additional options that stay disabled even on highest) and 16x Q CSAA; they used 8x. I used a weaker CPU with background programs running (antivirus and such) and got 16.7... but the game is a slideshow at 4 fps in actual gameplay.

That resolution is completely unplayable unless I completely disable AA (which, according to reviews, should have given me a nice solid 60fps with vsync, not a barely playable FPS with stutter here and there).

If I had bought this $300 card to play CoH based on that review, I would have been pissed.
This reminded me of the HardOCP yammer about how canned benchies are bad (not that their benchmarks are any better, as they are ALWAYS 30fps +/- 3fps... with different settings for each card. Thanks a lot, this tells me absolutely nothing).

I was wondering: how bad ARE canned benchmarks? Or maybe, how good?
Does anyone have examples of games that are off by a lot, or games that are spot on?
HardOCP mentions that Crysis is several hundred percent higher on both nVidia and AMD in the timedemo.

I would like to basically get a list going of what games are accurate, and what games are completely off, and by how much.


Company of Heroes - Opposing Fronts v2.201:
8800GTS 512: Gameplay was less than 1/4th the FPS of the built-in benchmark at 1920x1200, max settings, 16x Q CSAA
Verified by: taltamir

Crysis
HardOCP claims the timedemo is several times the real FPS on some cards, both nVidia and AMD, most prominently on the 3870X2. Needs further investigation.
Verified by: HardOCP

More games to be added as posted.
 

ronnn

Diamond Member
May 22, 2003
3,918
0
71
Probably almost as far off as real life benchmarks done by people. People are notorious for adding bias - seems it is impossible not to.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Well, there is only so much bias one can add... at less than 1/4th the FPS, you are looking at a case where at no time in the game will the real FPS EVER reach the MIN FPS measured in the benchmark, much less the average or max. Playing, I get a min of 1, an average of 4, and a max of 7... my max was lower than the min in the benchmark.
 

ronnn

Diamond Member
May 22, 2003
3,918
0
71
Originally posted by: taltamir
Well, there is only so much bias one can add... at less than 1/4th the FPS, you are looking at a case where at no time in the game will the real FPS EVER reach the MIN FPS measured in the benchmark, much less the average or max. Playing, I get a min of 1, an average of 4, and a max of 7... my max was lower than the min in the benchmark.

Would think it would be easy to add much more than that by just modifying your playing style.
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
You can tweak the average fps, not the min and max.

In Half-Life 2, for example, there are different spots where the FPS tanks or soars. The timedemo passes them all. If I were trying to bias it, I could look at the spot where the FPS soars for the whole playthrough; that still wouldn't change the fact that the max FPS was lower than the min FPS in the timedemo...

Heck if trying to cheat, I could just run an antivirus scan in the background or flat out lie.

Anyways, I am not here to call people liars or cheaters (those leeches in advertising who make the decision to cheat on a benchmark don't count as people). I just want to see what different people get when they run a canned benchie and then a real playthrough of the same game at the same settings. Get several people giving results on each game and you start getting a good picture.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: ronnn
Originally posted by: taltamir
Well, there is only so much bias one can add... at less than 1/4th the FPS, you are looking at a case where at no time in the game will the real FPS EVER reach the MIN FPS measured in the benchmark, much less the average or max. Playing, I get a min of 1, an average of 4, and a max of 7... my max was lower than the min in the benchmark.

Would think it would be easy to add much more than that by just modifying your playing style.

change your FoV ... First look 'up' slightly and 'down' very slightly the next time ... then try to get it identical, time after time

let me know what results you get

Anyways, I am not here to call people liars or cheaters (those leeches in advertising who make the decision to cheat on a benchmark don't count as people). I just want to see what different people get when they run a canned benchie and then a real playthrough of the same game at the same settings. Get several people giving results on each game and you start getting a good picture.
You better not :p
it won't get you much cooperation ...


i believe it is *impossible* to actually do it with a very good degree of repeatability ... i actually asked that *we* test this before and have tried it on my own rig with very mixed results.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
i believe it is *impossible* to actually do it with a very good degree of repeatability ... i actually asked that *we* test this before and have tried it on my own rig with very mixed results.

i'd better add - there IS a way to do it ... but you need to be able to write scripts and macros that 'trigger' as you play thru a 'canned demo' in real time ... i am actually researching this with a little help from my friends
- it is still difficult to keep your view absolutely without any variance - as little as a single degree may have an effect
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
You are aware that Company of Heroes is an RTS, are you? There are no "playstyles". In a game like The Witcher, where you can choose first- or third-person view or bird's-eye view, you have playstyles. Here there is one angle, and you can look right and left and zoom in and out. And zoom should be as far out as possible so you can see what is going on; in fact, it should go twice as far as is allowed. I couldn't pan the camera up or down, only zoom in and out and pan right and left. Once the initial mess died down (the airplanes and attacks) and I was just looking at a single unit on the screen without ANYTHING performing ANY actions, the FPS doubled to 8. At 100% zoom in, where I was looking at a pixelated patch of grass with nothing on it, I managed to peak at 20fps.

http://forums.tweakguides.com/...read.php?t=1898&page=9
And it turns out panning makes the FPS go down, a lot. I was looking at the best-case FPS, from a camera that sits still...

How about instead of arguing you go and do it yourself? Install the game, update it, benchmark it, and then try playing it with Fraps and see what you are getting. I would gladly revise my note on CoH to say everyone else tested it to be exactly the same. But until you do some actual testing, I would appreciate it if you got off that high horse.
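
For anyone who wants to try this, here is a rough Python sketch of turning logged frame timestamps into min/average/max FPS. It assumes you already have a list of cumulative frame times in milliseconds (however your logging tool exports them; the file format is not covered here), and `fps_summary` is a hypothetical helper name, not part of any tool.

```python
from collections import Counter


def fps_summary(frame_times_ms):
    """Summarize cumulative frame timestamps (ms) as (min, avg, max) FPS.

    Frames are counted per whole second of the run. A trailing partial
    second is dropped so it cannot fake a low minimum.
    """
    if len(frame_times_ms) < 2:
        raise ValueError("need at least two frames")
    start = frame_times_ms[0]
    full_seconds = int((frame_times_ms[-1] - start) // 1000)
    if full_seconds < 1:
        raise ValueError("need at least one full second of data")
    # Bucket every frame by the second in which it was rendered.
    per_second = Counter(int((t - start) // 1000) for t in frame_times_ms)
    counts = [per_second.get(s, 0) for s in range(full_seconds)]
    return min(counts), sum(counts) / len(counts), max(counts)


# Synthetic 3-second log (made up): 4 frames, then 1 frame, then 7 frames.
log = [0, 250, 500, 750, 1000, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 3000]
print(fps_summary(log))
```

Run the same summary over the canned benchmark and over a stretch of real play at identical settings, then compare the two triples.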

This isn't a 10% difference I am talking about here. The only valid argument you could have made, but didn't, is about the MAP being used... I just loaded the first level (started a new game).
So yeah, I will GLADLY discard my benchmarks or take yours into account when looking at the overall picture. But how about you actually get some instead of repeatedly demanding I defend the validity of mine? I am not completely ignorant of the process of benchmarking, thank you very much.
 

AzN

Banned
Nov 26, 2001
4,112
2
0
Originally posted by: ronnn
Probably almost as far off as real life benchmarks done by people. People are notorious for adding bias - seems it is impossible not to.

Some may be biased, but people only see what they see. Everyone sees things a little differently but usually comes to the same conclusions.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: taltamir
You are aware that Company of Heroes is an RTS, are you? There are no "playstyles". In a game like The Witcher, where you can choose first- or third-person view or bird's-eye view, you have playstyles. Here there is one angle, and you can look right and left and zoom in and out. And zoom should be as far out as possible so you can see what is going on; in fact, it should go twice as far as is allowed. I couldn't pan the camera up or down, only zoom in and out and pan right and left. Once the initial mess died down (the airplanes and attacks) and I was just looking at a single unit on the screen without ANYTHING performing ANY actions, the FPS doubled to 8. At 100% zoom in, where I was looking at a pixelated patch of grass with nothing on it, I managed to peak at 20fps.

http://forums.tweakguides.com/...read.php?t=1898&page=9
And it turns out panning makes the FPS go down, a lot. I was looking at the best-case FPS, from a camera that sits still...

How about instead of arguing you go and do it yourself? Install the game, update it, benchmark it, and then try playing it with Fraps and see what you are getting. I would gladly revise my note on CoH to say everyone else tested it to be exactly the same. But until you do some actual testing, I would appreciate it if you got off that high horse.

This isn't a 10% difference I am talking about here. The only valid argument you could have made, but didn't, is about the MAP being used... I just loaded the first level (started a new game).
So yeah, I will GLADLY discard my benchmarks or take yours into account when looking at the overall picture. But how about you actually get some instead of repeatedly demanding I defend the validity of mine? I am not completely ignorant of the process of benchmarking, thank you very much.

i am not arguing with you .. just giving what is going on in my "real world" testing of FPS and RPG games

unfortunately for your only specific example, i don't play RTS and i don't have CoH :p

If i was able, i would like your "save" and also a FRAPS video of your "run" so i could attempt to duplicate it exactly.

Also, i would note that the AI calculations of RTSes especially depend on the CPU ... and benchmarking an RTS is a bit different than an FPS
--so now what were you saying?
:confused:
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
my save is "start a new game"
And there is no need for a Fraps video... I just didn't touch the mouse OR the keyboard... just started a new game and watched it... a solid 4 FPS, dipping to 1 fps when a plane is actually shooting. After the first two minutes or so, when the plane attacks end, it goes up to a solid 8. It bogs down whenever anything does anything.

Honestly, I don't think it's so much cheating as just a case of CoH not updating its benchmark since v1... they have since introduced DX10 support and completely new campaigns (with new maps).
It desperately needs a new benchmark. I remember back in the day the CoH benchmark was pretty accurate.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
if that is the way you are doing it, it is IMPOSSIBLE for us to get anything but "relative" results and it falls completely outside of scientific testing into Kyle's "voodoo" realm

But! ... and it is a big one

if we do *enough* runs and comparisons, however, we may have some valid numbers - assuming we can account for the differing HW and other testing variables
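
A minimal Python sketch of what "enough runs" could look like in practice: repeat the same manual run several times and look at the spread before trusting the average. The helper name `run_spread` and the sample values are invented for illustration; they are not measurements from this thread.

```python
import statistics


def run_spread(avg_fps_per_run):
    """Return (mean, sample stdev, coefficient of variation) for repeated runs."""
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(avg_fps_per_run)
    stdev = statistics.stdev(avg_fps_per_run)
    return mean, stdev, stdev / mean


# Made-up average FPS from five repeats of the same manual run.
runs = [4.1, 3.8, 4.3, 4.0, 3.9]
mean, stdev, cv = run_spread(runs)
print(f"mean {mean:.2f} fps, stdev {stdev:.2f}, spread {cv:.1%}")
```

If the spread between repeats is small compared with the gap to the canned number, the manual runs are telling you something real despite the imperfect repeatability.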

but i am a little unsure what you are saying ... you are just "watching" a battle? ... not a cutscene ... if so, then you have the CPU involved much more so than you would like to admit

What are we going to get from your "benchmarking"?
:confused:
i don't think i understand ... at all
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
I am getting ONE FOURTH of the FPS that's in the canned benchmark WHICH I RAN... I expect each of us to get relative and different results... maybe you will get one third... that's not exactly the point here.

If it were 10 or 20 or even 30% off, I wouldn't be making a fuss about it and would just assume it's normal variance.

What system variance IS there, exactly? I am running all the tests on the same system, with the same background processes, same versions of everything, same hardware, same clock speeds, same ambient temperatures, same possible DRM infestation...

The default camera gameplay max FPS never reaches the min FPS from benchmark done on the same system in the same settings.
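
That claim is easy to state as a check. A small Python sketch, using the min/average/max figures from the first post; `FpsRange` and `ranges_disjoint` are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class FpsRange(NamedTuple):
    min_fps: float
    avg_fps: float
    max_fps: float


def ranges_disjoint(benchmark: FpsRange, gameplay: FpsRange) -> bool:
    """True when gameplay never reaches even the benchmark's minimum."""
    return gameplay.max_fps < benchmark.min_fps


benchmark = FpsRange(13, 16.7, 42)  # canned benchmark, from the first post
gameplay = FpsRange(1, 4, 7)        # observed in play, same settings
print(ranges_disjoint(benchmark, gameplay))
```

When this comes out true, no amount of looking at the "right" spot in the game can close the gap, which is why play style alone does not explain it.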

And didn't I specifically ASK for more people to do their own runs here? This is kinda the whole point of the thread... not "look at my benchies, they prove this!", but "here are my benchies, what are yours?"

Honestly, if you still don't feel there is anything to get from it, then leave it alone. Other people who DO feel like there is something to see will contribute their own results...
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
hmm, i am not attacking you whatsoever .. you do not have to be defensive ... i am trying to UNDERSTAND the discrepancy between your run and the demo
--and please forgive me if i question you because i can't quite picture what you are doing - i do NOT have the game ... just my imagination ... i do know about RTS, i just don't care much for them.

Now ... what is in your run that is not in the canned demo, or vice versa?
-are they from the exact same map? You say you 'started' the game and just ran.

EDIT: reread your own first post and tell me if it is just me? --or is there not the slightest doubt that you might have left some room for confusion in what you are trying to get across?
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
The benchmark is found inside the video settings tab; it is used to test your settings before you play... To be blunt, it is a rendering of the first cut scene from the pre-expansion campaign. There are:
1. Pre-rendered cut scenes
2. Real time rendered cut scenes (think like warcraft 3, where it zooms in, scripts units movements, and plays audio recording)
3. Actual gameplay.

A year ago the benchmark was fairly accurate. Now it is completely off.

The FPS on the rendered cut scenes in the new campaign is also atrocious. I can only assume that it is because they actually have light sources and other DX10 things that didn't exist in the original rendered "cut scenes"; the new campaign was created after the engine was upgraded from DX9 to DX10.

My point is basically this: when you go online and read a review that gives the FPS for CoH, the numbers are DEAD ON for what you would get with that card running the benchmark (I checked, IT IS dead on). They are also way, WAYYY beyond what you would experience when trying to actually play; you will have to lower your quality, a lot, to make the game playable.

YouTube has lots of movies of people playing CoH.
Here is one of a guy demoing it on SLI 8800GTX... he is going really slowly so you get a good look at what the game is like (others zip around so fast that you would be lost as a non-RTS player).
He also zooms in and out all the way. Basically, zooming in serves no purpose other than to admire the graphics. The entire game is ALWAYS played at maximum zoom out. It's just how RTS games are played:
http://www.youtube.com/watch?v=abWtkAUASgQ

You asked what I was doing. I wasn't watching a battle, the first mission starts with some air attacks and then I was just sitting there looking at my units standing around in the base doing nothing. I didn't even try to engage the AI, zip around, attack, build units, or anything of the sort, I just looked at my base.

Sorry if I was being too defensive there...
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: taltamir
The benchmark is found inside the video settings tab; it is used to test your settings before you play... To be blunt, it is a rendering of the first cut scene from the pre-expansion campaign. There are:
1. Pre-rendered cut scenes
2. Real time rendered cut scenes (think like warcraft 3, where it zooms in, scripts units movements, and plays audio recording)
3. Actual gameplay.
A year ago the benchmark was fairly accurate. Now it is completely off.

The FPS on the rendered cut scenes in the new campaign is also atrocious. I can only assume that it is because they actually have light sources and other DX10 things that didn't exist in the original rendered "cut scenes"; the new campaign was created after the engine was upgraded from DX9 to DX10.

My point is basically this: when you go online and read a review that gives the FPS for CoH, the numbers are DEAD ON for what you would get with that card running the benchmark (I checked, IT IS dead on). They are also way, WAYYY beyond what you would experience when trying to actually play; you will have to lower your quality, a lot, to make the game playable.
First of all, you need to realize that while you are just "looking", the game's AI is active! AND it will make a huge difference over a canned demo with probably No AI whatsoever. :p

OK, and i THINK i NOW understand. CoH is an old benchmark that runs great on your card BUT in the actual game you get crap FPS ... and you might have found the reason yourself ... the benchmark depended too much on "cut scenes" that now are easy to run on modern HW

Well then, i agree with you .. benchmarks DO get outdated ... the game often is patched to include more features and "options", and filtering often gets increased, while the benchmark remains the same. IMO, the benchmarks should stay relevant and current .. IF you look at Far Cry there are at least a dozen different fairly representative ones

As far as Crysis goes, it is a different beastie entirely ... i have my own "theory", if i may present it more fully here with your indulgence:

i can't imagine both nvidia and amd intentionally "cheating" ... but i know human nature .. if you are optimizing a game, you might just *start* with the official time demo. In the course of that optimization, you will also get some other parts of the game automatically fixed - and break others. Crysis is not expected to be optimized for about another year by the CryTek devs - WtF do we expect "miracles" from nvidia and amd?

Besides, the "official benchmark" looks suspiciously like a "dev tool" to me. It looks like a tool made over into a pretty tech demo. What driver team could resist looking at 0 FPS at frame 2230? and then NOT fixing it? Are they supposed to ignore it and concentrate on the last chapter instead?

After all that, i would ALSO like to see what other posters get comparing the canned benchmark with the actual game play - keeping in MIND that in actual play the AI is active - usually unlike in most demos.

And i would like to compare more "canned demo" vs. "gameplay FPS" than just in CoH and Crysis ;)
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
Mmm, that is a very, very good notion. Also, the timedemo is a repeatable, measurable test... it's not like they are going to have technicians play through the first level with Fraps and say to each other "look, see, I think the FPS went down" "what are you talking about, you were totally looking at the water there, the FPS really went up while you were shooting at that guy"... etc... Instead they try a tweak, run a timedemo, voilà, improvement, that must mean we did something good... whereas in actual gameplay said improvement changes nothing.

But this isn't so much about cheating as just RELEVANCE. I want to know which benchmarks are accurate representations of a card's real power in a game, and which aren't. Not whether AMD or nVidia cheat to do so.
The CoH benchmark was dead on a year ago; it's completely wrong now. At least on an E8400 with an 8800GTS 512 running at 1920x1200 on max settings.

But what about other games? I can now look at websites showing 20 or 30 fps in CoH at max settings at 1920x1200 and know they are 4x what you would get in actual gameplay, since they are running the benchmark. I want to know the same thing about other games, like Crysis and BioShock, to better understand how powerful cards are.

Better yet, I want to be able to look at the FPS a 9800GTX gets when it's released and figure out if it's worth upgrading from my 8800GTS... is this really a 20 to 25 fps improvement in CoH, or is it a 4 to 5... Can it really max out game X that has just been released and that I have yet to get, and thus should be purchased first, or is it just gonna let me go from medium quality at 30fps to medium at 40... but die at high quality...
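
A back-of-the-envelope version of that translation, as a Python sketch. It assumes the gameplay-to-benchmark ratio measured on one card carries over unchanged to another card, which is a big assumption and exactly what this thread is trying to check. `estimate_gameplay_fps` is a hypothetical helper, and the 20 and 25 fps inputs are the illustrative figures from the paragraph above.

```python
def estimate_gameplay_fps(published_benchmark_fps: float, ratio: float) -> float:
    """Scale a published canned-benchmark FPS by an observed gameplay/benchmark ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return published_benchmark_fps * ratio


ratio = 0.25  # roughly the "one fourth" seen in CoH in this thread
old_card = estimate_gameplay_fps(20, ratio)
new_card = estimate_gameplay_fps(25, ratio)
print(f"benchmark 20 -> 25 fps becomes roughly {old_card:.2f} -> {new_card:.2f} fps in play")
```

The point of collecting a ratio per game is that a 5 fps jump in a review chart may only be about 1 fps where it matters.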

I want to know what I am really getting, and translating benchmarks into real value will help.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: taltamir
Mmm, that is a very, very good notion. Also, the timedemo is a repeatable, measurable test... it's not like they are going to have technicians play through the first level with Fraps and say to each other "look, see, I think the FPS went down" "what are you talking about, you were totally looking at the water there, the FPS really went up while you were shooting at that guy"... etc... Instead they try a tweak, run a timedemo, voilà, improvement, that must mean we did something good... whereas in actual gameplay said improvement changes nothing.

But this isn't so much about cheating as just RELEVANCE. I want to know which benchmarks are accurate representations of a card's real power in a game, and which aren't. Not whether AMD or nVidia cheat to do so.
The CoH benchmark was dead on a year ago; it's completely wrong now. At least on an E8400 with an 8800GTS 512 running at 1920x1200 on max settings.

But what about other games? I can now look at websites showing 20 or 30 fps in CoH at max settings at 1920x1200 and know they are 4x what you would get in actual gameplay, since they are running the benchmark. I want to know the same thing about other games, like Crysis and BioShock, to better understand how powerful cards are.

i think you are right ... we need to determine what is out of date and update the benchmarks ... another FIRST for ATF ... unreal ... we were really working for the same thing all this time ... just not communicating it to each other very well. :p ... i am pretty sleepy
:moon:

IF our preliminary results are interesting, i think we should let Derek Wilson know. Perhaps he can add it somehow to his own testing to make his own articles even more relevant!
 

taltamir

Lifer
Mar 21, 2004
13,576
6
76
So is approximation and multiple experiments...
Some things are not going to be the same: ambient gravitational forces change with the movement of the planets.
Temperature changes with the day, the season, and other factors.
The exact molecular structure of each component varies and causes each to perform uniquely, albeit similarly to the others.

So we do a set of experiments with what's given. Our tools are imperfect; you just make do.