[H] Battlefield 4 Video Card Performance and IQ Review


Gloomy

Golden Member
Oct 12, 2010
1,469
21
81
Yeah, I don't think any serious review site is going to go on reddit looking for willing participants for a review. Furthermore, what would they really do differently? On a 64-player server, they would just play normally and engage in battles... kind of like... I dunno... a public server?

I'm just saying. 64 video gamers isn't a lot. There are probably more than 64 BF4 players in /r/battlefield_4.
 

(sic)Klown12

Senior member
Nov 27, 2010
572
0
76
I'm just saying. 64 video gamers isn't a lot. There are probably more than 64 BF4 players in /r/battlefield_4.

Getting that many people who are willing to do the same thing over and over for hours on end is a completely different story. If it was even half as simple as that then it would have been put into wide practice by now.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Looking at the Mantle Q&A with the devs, it looks like debugging for Mantle is quite a bit faster than for DX, because DX is more of a black box where you don't know if the problem is in the driver or in your own code. E.g. with DX the devs don't even know if a texture can fit in memory. Crazy old DX, ah heck.

Hopefully in the future we can get Mantle versions faster than this buggy DX crap as seen in BF4. DX is way too old to handle complex games like the BF series. Right now loads of resources are spent on debugging and tons of driver updates. Hardly consumer friendly. In fact it's crazy that people are accepting it, but when the alternative is no gaming we have to accept it or wait :)

Yes, and one would assume that this debugging carries over to the DX version and can save them time with the DX version as well.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
You know that most games they benchmark will run at ~60 fps at 1080p with settings turned up. Blacklist, AC4, BF4 with any kind of scaling. Even at 0% scaling, BF4 runs at 90 fps for me with a 780 Ti. Anything that runs under 120 fps is very relevant to most people.

I'm OK with people showing 4K gaming; it's good for gauging future-proofing, but way, way more people use 1080p. It should definitely be included.

The same argument could be used that very few people run 120Hz monitors. It's also very few that buy $550 to $700 cards. You can't then try to use the mass market example for what resolution is more important.

There's nothing wrong with the way [H] does reviews. They offer a different perspective than other sites. What would be the point of 12 sites doing the same review and getting the same numbers? I also find TPU's reviews helpful and they have to be the antithesis of [H]'s techniques.
 

DiogoDX

Senior member
Oct 11, 2012
757
336
136
To review BF4 multiplayer:

- 2 PCs with the same hardware, software and network, only different video cards
- the 2 guys get in the same jeep/car and take a tour of the map

Other way:

- One of the PCs enters spectator mode and watches the first-person view of the first PC
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
I'm pretty sure that no one would be calling for [H] to change the entire way they review multiplayer performance if NVDA were dominating in BF4.
They've had heaps of experience over generations of cards to know the best way to achieve sound and verifiable results.
Just as HAWX results were annoying to AMD video card users a few years ago.....so too goes BF4 for NVDA fans.:)
 

5150Joker

Diamond Member
Feb 6, 2002
5,549
0
71
www.techinferno.com
I'm pretty sure that no one would be calling for [H] to change the entire way they review multiplayer performance if NVDA were dominating in BF4.
They've had heaps of experience over generations of cards to know the best way to achieve sound and verifiable results.
Just as HAWX results were annoying to AMD video card users a few years ago.....so too goes BF4 for NVDA fans.:)

"Heaps of experience" means nothing if the testing methodology is flawed. I'm not saying their particular method is flawed but pointing out that just because a site like Hard|OCP has been around a long time doesn't mean they don't make mistakes.
 

blackened23

Diamond Member
Jul 26, 2011
8,548
2
0
"Heaps of experience" means nothing if the testing methodology is flawed. I'm not saying their particular method is flawed but pointing out that just because a site like Hard|OCP has been around a long time doesn't mean they don't make mistakes.

Yeah. The MP testing method could definitely be improved. If there's variance in testing methods at all, it should be re-evaluated. IMO the best suggestion is what you're saying in terms of creating a small squad to mimic the same actions on each run-through on an empty server. Going to a high-population server with tons of things going on creates way too much variance on a run-by-run basis.
 

blackened23

Diamond Member
Jul 26, 2011
8,548
2
0
I'm pretty sure that no one would be calling for [H] to change the entire way they review multiplayer performance if NVDA were dominating in BF4.
They've had heaps of experience over generations of cards to know the best way to achieve sound and verifiable results.
Just as HAWX results were annoying to AMD video card users a few years ago.....so too goes BF4 for NVDA fans.:)

It's common sense. Going to a random 64-player server to benchmark GPUs creates far too much variance. I didn't catch earlier that the 760 was beating the 770 in their tests, but now that I'm aware of this, it is proof positive that variance skewed the results of their testing. Therefore their follow-up should look into improved testing methods, period.

You can remove that variance by using small squads on empty servers, but you cannot just go to random 64 player servers and expect consistency on every run-through. Common sense.
 

OCGuy

Lifer
Jul 12, 2000
27,224
37
91
Yes random Multiplayer testing is not a good idea..especially in games like BF4 with a cutting-edge interactive environment, even if they are staged events.

Test 1: Building comes crashing down in Shanghai, loading up everyone's VRAM and tanking FPS

Test 2: Running from spawn



Oops...
 

RaulF

Senior member
Jan 18, 2008
844
1
81
Yes but how realistic is it to find 64 willing participants for each review? That's why the above scenario I mentioned is the best way.

I agree with your first idea about this. Two identical rigs side by side, and have them join the same server and just follow each other.
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
So if the 780Ti had been included and was leading the field then those results would have been invalid as well? :confused:
I'm not sure blaming the umpire is beneficial to those looking to buy video cards for BF4.
 

OCGuy

Lifer
Jul 12, 2000
27,224
37
91
So if the 780Ti had been included and was leading the field then those results would have been invalid as well? :confused:
I'm not sure blaming the umpire is beneficial to those looking to buy video cards for BF4.

I'm pretty sure that no one would be calling for [H] to change the entire way they review multiplayer performance if NVDA were dominating in BF4.
They've had heaps of experience over generations of cards to know the best way to achieve sound and verifiable results.
Just as HAWX results were annoying to AMD video card users a few years ago.....so too goes BF4 for NVDA fans.:)


People are discussing variability in MP testing, including with 2 NV cards.
 

Haserath

Senior member
Sep 12, 2010
793
1
81
It would be great if they provided a tool that recorded the experience to a tee (reproducing CPU/GPU loads as well); then you could save it and replay it on any system.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
So if the 780Ti had been included and was leading the field then those results would have been invalid as well? :confused:
I'm not sure blaming the umpire is beneficial to those looking to buy video cards for BF4.

It has nothing to do with this. Simply put, any test where there appears to be a significant deviation from what is expected puts all the results of said test into doubt. It is a poor experimental design.

Did Hard OCP do multiple runs and average results?

You can look right away in their apples to oranges comparison (they are using the same settings) and see that the 760 and the R270X perform similarly with the 760 at a slight lead. Look at the apples to apples comparison and the 760 is well ahead of the R270X (54 vs 44 fps). The gtx 770 underperforms compared to the 760 in the apples to apples test by a significant amount when it should be getting significantly higher framerates.

They themselves even mention that there are occasional rainstorms that cause 10-15 fps dips.

Hard OCP has never been good with statistical analysis. Quite frankly some of their analysis is complete garbage. You do not say "takes the lead by 1.6%" when you get more than 1.6% variation in your test when you run it multiple times. 1.6% is nothing. For all intents and purposes the two are equal. (They seem to do this quite a bit.)
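To make the 1.6% point concrete, here is a minimal Python sketch of that sanity check: compare the percent lead between two cards against the run-to-run spread of each card's own results. The function names and the fps numbers are hypothetical and only for illustration; they are not taken from the [H] review. This is a crude rule of thumb, not a formal significance test.

```python
from statistics import mean, stdev

def lead_vs_noise(runs_a, runs_b):
    """Return (percent lead of A over B, largest run-to-run spread in percent)."""
    mean_a, mean_b = mean(runs_a), mean(runs_b)
    lead_pct = (mean_a - mean_b) / mean_b * 100.0
    # Relative standard deviation of each card's repeated runs.
    spread_a = stdev(runs_a) / mean_a * 100.0
    spread_b = stdev(runs_b) / mean_b * 100.0
    return lead_pct, max(spread_a, spread_b)

def is_meaningful(runs_a, runs_b):
    """Crude heuristic: a lead only counts if it exceeds the run-to-run spread."""
    lead_pct, noise_pct = lead_vs_noise(runs_a, runs_b)
    return abs(lead_pct) > noise_pct

# Hypothetical average-fps results from four repeated runs per card.
card_a = [61.0, 58.5, 63.2, 59.8]
card_b = [60.1, 57.9, 62.0, 58.6]

lead, noise = lead_vs_noise(card_a, card_b)
print(f"lead: {lead:.1f}%, run-to-run spread: {noise:.1f}%")
print("meaningful lead" if is_meaningful(card_a, card_b) else "effectively a tie")
```

With these made-up numbers the lead is smaller than the spread between runs of the same card, so the two would be reported as a tie rather than a win.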

They are also stupidly in love with 4K when it's pretty much the least useful metric of performance, with monitors not being available and terribly expensive. Does anyone on this forum even have a 4K monitor?

The quality of writing on that site is generally poor as well.
 

Owls

Senior member
Feb 22, 2006
735
0
76
Let's be honest with ourselves here: HardOCP has never been the go-to site for accurate reviews. They don't try to explain the technology behind the products or try to get to the bottom of anomalies or go the extra mile. As far as relevance goes, they lost that about 10 years ago. Their reviews are of the 'me too' variety.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Yeah. The MP testing method could definitely be improved. If there's variance in testing methods at all, it should be re-evaluated. IMO the best suggestion is what you're saying in terms of creating a small squad to mimic the same actions on each run-through on an empty server. Going to a high-population server with tons of things going on creates way too much variance on a run-by-run basis.

The other part is you have to somehow make sure that as much as possible all aspects of game play are covered. Indoors, outdoors, lighting, explosions, etc...
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
It has nothing to do with this. Simply put, any test where there appears to be a significant deviation from what is expected puts all the results of said test into doubt. It is a poor experimental design.

Did Hard OCP do multiple runs and average results?

You can look right away in their apples to oranges comparison (they are using the same settings) and see that the 760 and the R270X perform similarly with the 760 at a slight lead. Look at the apples to apples comparison and the 760 is well ahead of the R270X (54 vs 44 fps). The gtx 770 underperforms compared to the 760 in the apples to apples test by a significant amount when it should be getting significantly higher framerates.

They themselves even mention that there are occasional rainstorms that cause 10-15 fps dips.

Hard OCP has never been good with statistical analysis. Quite frankly some of their analysis is complete garbage. You do not say "takes the lead by 1.6%" when you get more than 1.6% variation in your test when you run it multiple times. 1.6% is nothing. For all intents and purposes the two are equal. (They seem to do this quite a bit.)

They are also stupidly in love with 4K when it's pretty much the least useful metric of performance, with monitors not being available and terribly expensive. Does anyone on this forum even have a 4K monitor?

The quality of writing on that site is generally poor as well.

Surely the ability to perform well at high resolutions is a measurement of the sheer "horsepower"/rendering performance of the cards in question?
With reference to the quality of writing... on what metric are you basing your opinion?
Against that of, say, a modern fiction writer, a historical tome of non-fiction, a random Internet blogger, or perhaps another tech web site whose conclusions are more broadly in line with your own?
Writing quality is rather subjective, and no metric that I have been able to find has been applied to articles of a technical nature designed to reach a broad readership ranging from teenage boys just entering the PC market to mature adults who may have experience that vastly exceeds that of the writer in question.
 

Will Robinson

Golden Member
Dec 19, 2009
1,408
0
0
People are discussing variability in MP testing, including with 2 NV cards.
People "may be" discussing variability of MP testing, but I would refer you to the thread topic, which is in fact "Video performance and IQ review"...
 

Obsoleet

Platinum Member
Oct 2, 2007
2,181
1
0
"Heaps of experience" means nothing if the testing methodology is flawed. I'm not saying their particular method is flawed but pointing out that just because a site like Hard|OCP has been around a long time doesn't mean they don't make mistakes.

This is true, but, between you and HardOCP.. I'm going to side with HardOCP on this one.
 

96Firebird

Diamond Member
Nov 8, 2010
5,738
334
126
People "may be" discussing variability of MP testing, but I would refer you to the thread topic, which is in fact "Video performance and IQ review"...

...and?

This thread is about HardOCP's BF4 multi-player testing, which many believe is flawed. We can discuss that here, it is very relevant to the conversation. Just because you don't agree with it doesn't mean it shouldn't be posted in the thread.

What 3D said is also correct, the time spent indoors vs. outdoors is critical. I get much better FPS on my card indoors, and it is something I notice because I am already at very low FPS. Sometimes when I am outside and running I stare at the ground so the game runs smoother, even that can make a huge difference.

Edit - After reading 3D's post again, I see that isn't really what he is saying. But I do now see what he is saying, and that holds merit too. Be sure to get all normal aspects of the game in the "benchmark".
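As a rough illustration of covering "all normal aspects of the game", here is a small Python sketch that splits a frame-time log into labeled segments and reports average FPS per segment. The segment labels, the log format, and the numbers are all hypothetical; a real capture tool will have its own format. Average FPS is computed as frames divided by total time, not as the mean of per-frame FPS values, so slow frames carry their full weight.

```python
from collections import defaultdict

def fps_by_segment(samples):
    """samples: iterable of (segment_label, frame_time_ms) pairs.

    Returns a dict mapping each label to its average FPS.
    """
    frames = defaultdict(int)
    total_ms = defaultdict(float)
    for label, frame_ms in samples:
        frames[label] += 1
        total_ms[label] += frame_ms
    # frames / seconds, with the time converted from milliseconds.
    return {label: frames[label] * 1000.0 / total_ms[label] for label in frames}

# Hypothetical log: each frame is tagged with what was happening on screen.
log = [
    ("indoors", 10.0), ("indoors", 10.0),
    ("outdoors", 20.0), ("outdoors", 30.0),
    ("explosion", 40.0),
]

for label, fps in fps_by_segment(log).items():
    print(f"{label}: {fps:.0f} fps")
```

Reporting segments separately makes it visible when one card's run simply spent more time indoors, or saw fewer explosions, than another's.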
 

blackened23

Diamond Member
Jul 26, 2011
8,548
2
0
I think it's pretty obvious that variances in MP testing are not acceptable. If one card has 27 explosions in a run-through and another has 3, or if one card has 30 players on screen while another has 5, what's the obvious conclusion here? I'm not sure what these guys are even arguing against; these tests NEED TO BE CONSISTENT. Period.

I really feel like it may partially be due to David having done these tests, as I've never seen him do the prior tests with BF3 and whatnot (which seemed to have consistent results). No disrespect to him, because I'm sure he put a ton of work into it, but prior tests from what I remember were done by Brent and Grady. I think that seeing the 760 ahead of the 770 in these tests should have prevented publishing. Maybe he was caught off guard or something. I don't really know. I don't think that any errors were intentional, and the guys that are saying that are basically trolling. HardOCP has tested like this forever, so... whatever. People see what they want to see and claim "bias" when their vendor doesn't win, which is really freaking stupid. IMO.

That said, the bottom line is, there is value in MP testing. I completely agree with HardOCP on this; this is how people play BF4. But I also think that MP testing should be highly controlled, perhaps with a squad of 5-10 people that do consistent actions throughout a run. Just joining a 64-player MP server and hoping for the best (and I'm not saying this is what happened, but the 760 being ahead of the 770 makes you wonder) clearly won't cut it.

Consistency and lack of variance are crucial to any type of benchmarking.
 

Pandora's Box

Senior member
Apr 26, 2011
428
151
116
Only way I could see multiplayer testing in BF4 being done fairly is if you have 2 systems set up. Have both systems join a server as spectators and bench the same areas. Obviously this would mean a limit of 2 cards being compared at the same time.
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Only way I could see multiplayer testing in BF4 being done fairly is if you have 2 systems set up. Have both systems join a server as spectators and bench the same areas. Obviously this would mean a limit of 2 cards being compared at the same time.

I bet they have more rigs than just two.

Oh... and they probably can afford another bf4 key.