HardOCP says anandtech has bad methods


Endgame124

Senior member
Feb 11, 2008
974
706
136
Originally posted by: myocardia
Originally posted by: Endgame124
Gee, card 2 cheats the timedemo because the driver is optimized for timedemo runs of game 1 and game 2.

No offense, but you obviously know nothing about video cards. If you did, you'd know that both nVidia and ATI cheat equally in these canned benchmarks. They have for many years now, and it's not about to stop anytime soon.

No offense taken -- I won't claim to be an expert about current video cards, but I'd like to think I'm competent. My point wasn't that one side cheats and the other doesn't; it was what happens if one side cheats better than the other but doesn't perform as well in actual gameplay. That's why I just said card 1 vs card 2, not 8800 vs X2.

If "everyone" agrees that the video card companies cheat the timedemos, why use them at all? That's what I don't get -- many posters say that the Hard|OCP method is wrong for not using timedemos, but then in the next breath acknowledge that the video card companies cheat the timedemos.

So does that mean that AT gives video card awards to whoever cheats the timedemo best, not whoever plays games best?

I've never recorded my own timedemo, but I understand that even custom timedemos don't render everything that actually playing the game does (such as weapons, some enemy effects, etc). If that's truly the case, should even custom timedemos be used?

Perhaps the only fair way to test video cards is to play an entire game through with both cards and record the frame rates experienced therein. 20-30 hours of gameplay through multiple levels ought to give a good average of performance per card. Only problem is, multiply that by 8 games, and you're looking at 4 or more business weeks per card for testing. It might be reasonable to split that testing among 4 testers to get a review done in about 1 week, but I imagine that most websites don't have the ability / money to do that kind of testing...
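To put rough numbers on that estimate, here is a minimal sketch. The 40-hour business week is an assumption that the post does not state; the hours per game, the number of games, and the number of testers come from the paragraph above.

```python
# Back-of-the-envelope estimate of full-playthrough testing time per card.
# The 40-hour work week is an assumption; the other figures come from the post.
HOURS_PER_GAME = (20, 30)   # low and high estimate for one full playthrough
GAMES = 8
HOURS_PER_WEEK = 40         # assumed length of one business week
TESTERS = 4


def weeks_per_card(hours_per_game, games=GAMES, testers=1):
    """Return business weeks needed to play every game once on one card."""
    return hours_per_game * games / (HOURS_PER_WEEK * testers)


solo = [weeks_per_card(h) for h in HOURS_PER_GAME]
team = [weeks_per_card(h, testers=TESTERS) for h in HOURS_PER_GAME]
print(f"one tester:   {solo[0]:.1f} to {solo[1]:.1f} weeks per card")
print(f"four testers: {team[0]:.2f} to {team[1]:.2f} weeks per card")
```

Under that assumption, one tester needs 4 to 6 weeks per card, and four testers working in parallel need 1 to 1.5 weeks.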

 

ShadowOfMyself

Diamond Member
Jun 22, 2006
4,227
2
0
Originally posted by: Endgame124

Perhaps the only fair way to test video cards is to play an entire game through with both cards and record the frame rates experienced therein. 20-30 hours of gameplay through multiple levels ought to give a good average of performance per card. Only problem is, multiply that by 8 games, and you're looking at 4 or more business weeks per card for testing. It might be reasonable to split that testing among 4 testers to get a review done in about 1 week, but I imagine that most websites don't have the ability / money to do that kind of testing...

Yep, that's what I was saying, except going through the whole game would introduce a lot of variables on its own, since no run-through is the same... You might do worse and get 20 guys firing at you with one card and only 5 with the next... The second card will obviously show much higher numbers... All in all, the ONLY way to do it 100% is to record the whole run-through and repeat it with the other card... That way no one can complain - it's real world, it's the whole freaking game, no variables :D
 

Paratus

Lifer
Jun 4, 2004
17,519
15,558
146
Originally posted by: Adam12176
Web traffic must be down over there.

Once in a while it's nice to just look at a bar graph of cards in the SAME settings and make conclusions. Not this apples and oranges crap. At least provide standard benchmarks for people to make their own conclusions.

And when was the last time hardocp even did anything 'extreme.' Place is more of a glorified IGN site than anything.

Damn 11 posts since 2001.........

I agree with your assessment however. There's nothing fundamentally wrong with either type of review method as long as you understand what you are getting. Bragging and calling out other reviewers just to get page hits is nothing more than attempted epeen ++
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: richwenzel
so there is no IQ difference between ATI and nVidia? I have read that some say there are...

i literally don't know, nor do i truly care....i just buy something that the masses say is pretty good value, look at my game and decide if it is good enough and go on with my life...

nope none ... at least not any more ... ATi used to have a possible - but disputable - IQ advantage. G80 and r600 are nearly identical and i am certain there is NO one who can tell by "looking" if the graphic is powered by AMD or nvidia on an otherwise identical rig and display

===================

Originally posted by: jaredpace
isn't the reason his "real time gameplay benchmarks" are lower than the "canned benches" because he is using Fraps? I remember using fraps and it knocking like 30% off my framerates while i had it running

i hope not. FRAPS does cause a performance hit, period. But i am guessing less than 5% on a decent system. And of course, it should be equally taxing on both video cards, so that pretty much takes away any disadvantage there is to benchmarking with it.


Heck, KyleB is no Galileo fighting ignorant Church Hierarchy ... in this case the roles are reversed ... he is asking us to *trust* him and suspend disbelief as he attempts to show us by many words that the world is not only flat but that nvidia is the center of the universe.

Nope, he will have to do better than that. Galileo was able to PROVE beyond a shadow of a doubt using mathematics and observations that the Church was Wrong ... that their stupid and outdated Aristotle concept was not even supported by their Bible. Of course everyone "knew" but it was Church politics that shut up the truth as being official for awhile.

Well, it's KyleB this time with his voodoo unscientific and unrepeatable "real world" testing - nvidia marketspeak, in case you don't recognize it - that has a self-created "Holy Scripture of Fud and Doubt" to assail science and credible benchmarking.

And i say, 'well and good' PERHAPS Kyle IS a *prophet* ... i want to believe ... please ... *show us a sign* ... give us evidence that cannot be reasonably disputed and i will worship at your green altar. :p

:roll:

until then, i will assume that it is PROFIT that motivated your 'revelation' - and more likely a call from a nvidia rep that turned you to your "real" fake-world benchmarking in exchange for any credibility you used to have.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,002
126
This makes me torn between GPUs. I like Nvidia GPUs because I can never get picture scaling to work how I want it to work on ATI GPUs (if an older game doesn't support widescreen, I WANT the black bands on the sides).
Truth be told I don't trust either vendor to scale LCDs properly which is why when I eventually get one I'll be sure to get one with a good hardware scaler.
 

Sunrise089

Senior member
Aug 30, 2005
882
0
71
Regardless of methodology (which gets shaky for hardOCP the moment they stop comparing cards at equal settings), their reviews are less pleasing to read and look at. Maybe it's the liberal-arts graduate in me, but so long as AT's site isn't providing me with false information, it's always going to be superior.
 

NiViK

Member
Feb 24, 2000
81
0
0
Holy crap apopin...you sure do have a chip on your shoulder when it comes to kyle...

I really LIKE how you capitalize some words for EFFECT...(j/k)


As someone posted in this thread earlier...if you only read one review site and base your opinions solely on that..you're a moron.

 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: NiViK
Holy crap apopin...you sure do have a chip on your shoulder when it comes to kyle...

I really LIKE how you capitalize some words for EFFECT...(j/k)


As someone posted in this thread earlier...if you only read one review site and base your opinions solely on that..you're a moron.

No chip on my shoulder ... since he was responding to "us" in his article, i thought i'd just let him know how i really feel about his "real world" testing.

It is not real. It is an illusion. If he wanted to and if he was "above board", he would do as i demanded - give us the video of his benchmark run with FRAPs running - for all i know it is ten seconds long; and give us his 'save' so we can attempt to replicate it. Perhaps he made an error.

It is only fair he show us. i already have the Crysis benchmark and have pretty much dismissed it as a 'pretty tech demo' created from a dev tool. No, it is not terribly representative of Crysis Demo gameplay. i always see AMD cards *choke* at the same frame and i imagine they DO optimize to get rid of the BUGS.
-no sh!t sherlock ... and while they are at it, this optimization also has the effect of making the entire game run better. It IS a "game" between nvidia and AMD as they race to write better drivers for the AAA 'benchmark games'. It is not inherent "dishonesty" ... it is the nature of the industry ... they have their *priorities* and limited resources and of course they will "fix" the demo issues ... and the AAA titles FIRST.

i have to admit ... i was curious and i DID take a peek at their now LOCKED thread. Kyle doesn't allow dissent PERIOD. All he has is a bunch of people that agree with him and compliment him over and over. i'd say "ass kissers" if i was being impolite.

No wonder THIS is the place to come to learn. i ALSO want to add real world testing to MY benchmarks ... but they will be OPEN and repeatable ... i just need to figure out how to replicate with a higher success rate.
:D

and yeah ... i TALK like this :p



 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
And i say, 'well and good' PERHAPS Kyle IS a *prophet* ... i want to believe ... please ... *show us a sign* ... give us evidence that cannot be reasonably disputed and i will worship at your green altar.

I took a dump today that looked kind of like kyle...does that count as a sign?
 

keflex

Junior Member
Jan 28, 2008
6
0
0
Originally posted by: ViRGE
It's basically a half-truth mixed with FUD. It's true, a timedemo isn't always going to produce results that are the same as playing a game (you try coming up with a timedemo that exactly replicates how every person will possibly play the game) but I also think it's the more scientific manner to do such testing. With timedemos you get exactly the same situation every single time, the only variable changed is the video card; with manual control in spite of your best efforts you're not getting the same test. And while it's acceptable given no other option (i.e. Oblivion), I find it more acceptable to stick to timedemos so that you are for sure only changing the video card as the variable.

This has got to be the worst logic I've ever heard in my life. Seriously.



 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
I just read the HardOCP article on Gears of War performance and actually enjoyed it.

It seems that they are capable of producing the odd good article. They just need to carry it over to their hardware reviews.

Their old-school reviews were good.

The "highest playable settings" stuff only seems to work well when they're reviewing performance for a particular game in detail.

I agree with Apoppin in that they should be willing to publish their timedemos. It's not like it's rocket science to create one. :light:
 

sskk

Junior Member
Dec 1, 2007
19
0
0
Originally posted by: UltimaBoB
Obviously, objective benchmarking trumps subjective benchmarking *if it is possible*.

But if the tools don't exist to objectively benchmark actual gameplay, and if that which can be scientifically benched is being manipulated, then inherently subjective benchmarks - done as "objectively" as you can - may be the best you can hope for.


I think this is the best reply, and one of the few "OBJECTIVE" ones in this thread.

Many of you may like to believe objective benchmarks are, well, objective, but more often than not, they are NOT. So to say Hardocp's way is meaningless is just as wrong as saying it's flawless.

It's a mixed world and the best way is to look at both types, and better yet at user reports of real gaming experience on forums.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: sskk
Originally posted by: UltimaBoB
Obviously, objective benchmarking trumps subjective benchmarking *if it is possible*.

But if the tools don't exist to objectively benchmark actual gameplay, and if that which can be scientifically benched is being manipulated, then inherently subjective benchmarks - done as "objectively" as you can - may be the best you can hope for.


I think this is the best reply, and one of the few "OBJECTIVE" ones in this thread.

Many of you may like to believe objective benchmarks are, well, objective, but more often than not, they are NOT. So to say Hardocp's way is meaningless is just as wrong as saying it's flawless.

It's a mixed world and the best way is to look at both types, and better yet at user reports of real gaming experience on forums.

wrong - their *way* is NOT "meaningless" ... absolutely not what i am trying to say.
--i want to ADD it to my own benchmarking. i would like to see our forum use it to confirm AT articles. :p

i am saying the WAY Kyle is doing it is wrong UNLESS he will "open up" and show us EXACTLY what he is doing by posting the run and save that he uses for Crysis.

He is the one that opened this can of worms and now he is just going to close the subject like he does on his own forum?
:confused:

i think not ... and i also think if i was an advertiser on his site i would want to know if his benchmarking is smoke and mirrors - or not.

 

ronnn

Diamond Member
May 22, 2003
3,918
0
71
To come in very late... Hardocp's way of benchmarking has good reasoning, but not being able to replicate their results leaves it open to unintentional bias. Anandtech generally does a nice job but could maybe use more AA at times. I also assume anandtech does play the games to make sure that they are playable with no micro stutter or whatever.


edit: Kyle attacking others in the industry is not really that new.
 

keflex

Junior Member
Jan 28, 2008
6
0
0
Originally posted by: ronnn
To come in very late... Hardocp's way of benchmarking has good reasoning, but not being able to replicate their results leaves it open to unintentional bias. Anandtech generally does a nice job but could maybe use more AA at times. I also assume anandtech does play the games to make sure that they are playable with no micro stutter or whatever.


edit: Kyle attacking others in the industry is not really that new.

Not bashing you specifically, but the concept of "Methodology > Real Application" that's so prevalent on this forum is ridiculous.

There's a very good reason why In Vivo studies trump contradictory In Vitro results...
 

ronnn

Diamond Member
May 22, 2003
3,918
0
71
Ok I know keflex is broad spectrum, but in vitro? Don't think computers are defined as life just yet. :D
 

keflex

Junior Member
Jan 28, 2008
6
0
0
Originally posted by: ronnn
Ok I know keflex is broad spectrum, but in vitro? Don't think computers are defined as life just yet. :D

Hah, nah, just making the point that real world results should always trump what occurs in a completely controlled environment. After all, none of us exists in a vacuum and outside variables always have to be kept in mind. ;)
 

ronnn

Diamond Member
May 22, 2003
3,918
0
71
All these results are hopefully in the real world. Just a case of which ones are most believable.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: keflex
Originally posted by: ronnn
Ok I know keflex is broad spectrum, but in vitro? Don't think computers are defined as life just yet. :D

Hah, nah, just making the point that real world results should always trump what occurs in a completely controlled environment. After all, none of us exists in a vacuum and outside variables always have to be kept in mind. ;)

so what is Kyle hiding?
:confused:

if *i* made the broad claims that HE makes - attempting to cast doubt on everyone else - i'd have the ability to prove it and i would show my exact methodology instead of saying "trust me". Maybe his little fans at his forum do trust him, but we see no compelling reason to put blind trust in someone with his track record.

So far Kyle is ALL TALK
 

narzy

Elite Member
Feb 26, 2000
7,006
1
81
Originally posted by: ViRGE
It's basically a half-truth mixed with FUD. It's true, a timedemo isn't always going to produce results that are the same as playing a game (you try coming up with a timedemo that exactly replicates how every person will possibly play the game) but I also think it's the more scientific manner to do such testing. With timedemos you get exactly the same situation every single time, the only variable changed is the video card; with manual control in spite of your best efforts you're not getting the same test. And while it's acceptable given no other option (i.e. Oblivion), I find it more acceptable to stick to timedemos so that you are for sure only changing the video card as the variable.

Picking a fight with AnandTech specifically probably isn't the smartest thing either. It's one thing to do an article about this kind of stuff to explain why you do something the way you do it, but dragging in someone else is at the very least petty.

Hard OCP has never exactly been one for journalistic integrity.
 

superbooga

Senior member
Jun 16, 2001
333
0
0
Regardless of how good or sucky Kyle's methods are, there's one thing that's true at this moment: the built-in Crysis GPU benchmark is a POS. Both companies heavily optimize to score well in that test, with no real, or even negative, performance gains in actual gameplay.
 
Dec 21, 2006
169
0
0
I agree wholeheartedly with Apoppin. Regardless of the canned vs real-world benchmarking argument, if Kyle is going to slam a competitor's benchmarking style, he should at least have the courage to post his own saves and allow us to reproduce his results. AT has no burden to answer this call. The burden of proof is on Kyle.

I personally prefer canned benchmarks because of reproducibility. I think that in 90% of the cases, even if a canned benchmark provides higher FPS than real world gaming, the relative performance of the cards should remain about equal, especially in cases where custom timedemos are used. After all, it's quite hard to optimize specifically for a timedemo that did not exist until testing. However, I do see the appeal of real world benchmarks at times, and never base my purchases off just one review site.
 

awesomedude

Junior Member
Feb 13, 2008
1
0
0
First off I would like to say that I am a frequenter of HardOCP. My post is likely biased in favor of HardOCP, but I am going to do my best to remain unbiased and look solely at the "facts".

Some of the posts here are absurd, especially in light of how everyone is claiming superiority for the scientific method and insisting that people be able to back up their claims with proof.

To all the people claiming that Crysis is just one game, and that HardOCP is wrong because it did not show the same results for other games: in science it only takes one case to prove an entire theory wrong. Even if there are a million cases that support the theory, all you need is one case where it fails to invalidate the entire theory. The scientific method doesn't work on an only-when-it's-convenient-for-us basis.

Then there are the people who claim they need 100% repeatable proof before they are convinced of anything, but then turn around and make absurd claims, without any sort of proof, that Kyle probably picks the best run for card a and the worst run for card b, or picks a scenario to favor card a over card b. If you guys are going to stick to your proof guns, at the very least don't make up stuff about Kyle because you don't like him calling AT out. I am not saying Kyle is right for calling AT out, but for you guys to say Kyle calling out AT was wrong and unprofessional, and then to turn around and say that Kyle picks and chooses data to fit the results he wants, without any sort of proof, is outright hypocritical.

To the guys discussing population and samples: the population would be all the frames one could create in the entire game, or at least that level, and Kyle's run-through would be a sample of that. Now granted, the two samples are likely different and it is impossible to recreate the exact same sample by hand. But two random samples from the same population should give about the same averages. Now I understand they aren't taking random samples, but we can conclude that two similar samples from the same population will give similar results.
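The sampling claim can be illustrated with a small simulation. This is only a sketch of the idealized case: the population is synthetic (an invented normal distribution around 30 fps, one reading per second of play), and the samples are truly random, which hand-played runs are not. The names `population` and `sample_average` are made up for the example.

```python
import random
import statistics

# Synthetic stand-in for "every second of play in a game": one FPS reading each.
# The distribution (mean 30, spread 6) is invented purely for illustration.
rng = random.Random(42)
population = [max(1.0, rng.gauss(30, 6)) for _ in range(100_000)]


def sample_average(readings, size, rng):
    """Average FPS over a random sample of `size` readings."""
    return statistics.fmean(rng.sample(readings, size))


# Two independent "runs" of 600 readings each (about 10 minutes at 1 per second).
run_a = sample_average(population, 600, rng)
run_b = sample_average(population, 600, rng)
print(f"run A: {run_a:.2f} fps, run B: {run_b:.2f} fps, "
      f"gap: {abs(run_a - run_b):.2f} fps")
```

With random sampling the two averages land close together but not exactly on the same value. Real run-throughs are consecutive, correlated stretches of gameplay, so the gap between two manual runs could be larger than this idealized sketch suggests.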

To the people saying that because HardOCP compares apples to oranges it is impossible to distinguish performance differences, since you could run card a at 800 x 600 and card b at 1600 x 1200, get the same frame rate, and conclude that the cards are equally fast: that is again absurd. First, HardOCP determines the "fastest" card as the one that can play at the highest settings, i.e. a card run at 1600 x 1200 would be pronounced faster than one run at 800 x 600. HardOCP doesn't rate the fastest card by the best average frame rates in their apples to oranges comparison like everyone seems to be going on about. Since HardOCP gives its readers all the setting information, readers can come to their own conclusions. For example, if card a has an average fps of 20 and card b has an average fps of 20 at the same resolution, but card b is running with 4xAA and 16xAF while card a is running no AA or AF, anyone can easily see that card b is the faster card. There is usually enough information present that you can tell which card is faster even though they are using apples to oranges comparisons.

http://www.hardocp.com/article...wzLCxoZW50aHVzaWFzdA==

We can see here that the geforce has an avg fps of 28 and the radeon has an avg fps of 26.2. We can also see that the shader quality on the geforce is set at high. While the 1.8 avg fps difference is negligible, especially in RW testing, we can conclude that the geforce is the faster card because it is running at a higher setting.

Also note on this page that they describe the exact map they chose to play, some of the various effects affecting the graphics card, and the length of time they played. This is for all the people claiming they chose specific effects for one video card over another, and for those saying they probably just played for 10 seconds. I understand that this isn't a save point or a video, but it does give a lot of data about their run through the game. Someone could easily go to the selected map and make a 10 or so minute run-through containing the listed effects, and if the data obtained in that scenario was drastically different than HardOCP's, we could conclude that something was wrong with someone's data.

And while the highest playable setting is subjective for each reader, for the most part the reviewer's criteria for a game being unplayable should be the same for either card. Also, Kyle many times addresses why the card wasn't capable of running at higher settings.

Also, making demands that someone do something or else all their data is false does not make it so. And if all data was held to the "meet my demands or your data is no good" standard, no data would be good, as everyone would have ever increasing demands. While I would love to see more openness from HardOCP reviews, including save points and videos of the run-through, the fact that they don't provide them doesn't make their data somehow false. Making demands on how things be done on someone else's forum, and making threats (if you don't do it, you're a bunch of pussies/liars/corporate whores/etc...) has never been a good way to get things done the way you want them. Although I don't frequent the HardForum, from what I have read Kyle seems pretty open about their testing procedure and wants to do what their audience would like to see from them. Perhaps if you had written Kyle a nice message on the HardForum saying you would like to see the save points and videos added to the gfx card reviews, you might have gotten a better response. But making demands from the AT forums is surely not going to get you what you want, and you'll continue to claim that HardOCP sucks because they didn't bow down to your every demand.

Again going back to the data HardOCP gives its users: if the fps are within 5 frames of each other or 10% at 50fps, that could be chalked up to human error, but usually the data presented shows either that card a is clearly faster than card b or that cards a and b perform similarly in this game. Again, their data is there for the user to come to their own conclusions. While I agree that without apples to apples it may be hard to directly compare how much faster card a is than card b, you can always tell which card is faster, and what kind of performance you can expect with card a in game c.

Some people said it already, and I'll repeat it: if card a performs better in game, but card b benches better, and AT says card b is the superior card, then when you go to play game c and card b isn't faster than the card a you upgraded from, you're going to feel cheated and misled.
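That rule of thumb could be written down as a simple check. This is a hedged sketch, not anything HardOCP publishes: the function name `within_noise` is hypothetical, and treating the gap as noise when it is within either 5 fps or 10% of the faster result is one possible reading of the thresholds mentioned above.

```python
def within_noise(fps_a, fps_b, abs_tol=5.0, rel_tol=0.10):
    """True if the gap is within 5 fps or within 10% of the faster result.

    Both tolerances are assumptions taken from the rule of thumb in the post.
    """
    gap = abs(fps_a - fps_b)
    return gap <= abs_tol or gap <= rel_tol * max(fps_a, fps_b)


print(within_noise(28.0, 26.2))   # the Crysis averages quoted earlier in the post
print(within_noise(50.0, 45.0))   # exactly 5 fps, which is 10% at 50 fps
print(within_noise(60.0, 40.0))   # a gap this large is a real difference
```

Under this reading the first two comparisons count as a tie and only the third shows a clear winner, which is why the settings each card ran at matter more than the 1.8 fps gap in the Crysis example.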

I realize that this is biased in favor of HardOCP. Also I know I gave Kyle the benefit of the doubt in this thread, but it only seems fair to me to give someone the benefit of the doubt unless there is proof stating otherwise.