Help Design The Next AnandTech GPU Benchmark Suite

Ryan Smith

The New Boss
Staff member
Oct 22, 2005
537
117
116
www.anandtech.com
Hi everyone, your friendly neighborhood GPU editor here;

I'm currently in the process of building our next GPU benchmark suite. Ostensibly this is for when 28nm GPUs hit the market over the next year, but this will also cascade down in parts into our other GPU and IGP reviews.

Anyhow, as some of you may know, before any major overhaul of the GPU benchmark suite I like to gather some feedback from you guys on what you'd like to see in our benchmark suite, so that we aren't overlooking something that you guys may find critical. Given that we now have our Bench comparison system, feedback is all the more important this year since the data will be available through Bench for years to come.

So if you guys have any games or GPGPU applications that you'd like to see on our next GPU benchmark suite, now's the chance to speak up; we're interested in what you guys think. With that said, if you could please provide a basic rationale/reason behind your request, it would help the process a lot. And on a side note, GPGPU benchmarks need to be cross-platform (e.g. F@H is out because they don't have a proper AMD client right now).

Finally, just to give you guys an idea of what we're roughly looking at so far, we're looking at 7-9 games (one of them will be Crysis). New this fall will be an SSAA benchmark on one of our DX9 titles, since we have a couple of different games that both NV and AMD SSAA work on. Our load testing methodology will also be changing slightly since both AMD and NV now use intricate throttling systems on their latest cards.

Anyhow, please let us know what you'd like to see. The more feedback we get, the better we can tune things to meet your needs.

-Thanks
Ryan Smith
 

mnewsham

Lifer
Oct 2, 2010
14,539
428
136
By Crysis do you mean Crysis, Crysis Warhead, or Crysis 2 (or some mixture of them)? If Crysis 2, I would like to see a multi-part test (for each review) in DX10 and DX11. Also, is there an ETA on multi-monitor reviews?
 

Mr. President

Member
Feb 6, 2011
124
2
81
Lost Planet 2 and Crysis 2 both seem like solid choices. Capcom's MT Framework engine has traditionally scaled well with both GPUs and increased CPU cores. I would also like to see some tests of AMD's tessellation 'optimizations', which completely wreaked havoc on the geometry in Crysis 2.

As a side note, if you're going to be using the original Crysis then I'd like to see you look into the oft-mentioned draw call issue at some point. From what I've noticed, the 'object detail' and shadow settings can send draw calls through the roof in that game and seem to be (by far) the biggest performance killers. They also seem to depend more on subsystems other than the GPU.
 

Ryan Smith

mnewsham: Crysis Warhead. Crysis 2 does some weird things that I'm not comfortable with, but I admit I need to do some more investigation to be sure.

Mr. President: AMD's tessellation optimizations will be disabled. In principle I think it's a useful tool for some users, but it violates the spirit of apples-to-apples testing.
 

mnewsham

Ryan Smith said: "mnewsham: Crysis Warhead. Crysis 2 does some weird things that I'm not comfortable with, but I admit I need to do some more investigation to be sure."

Thanks for clearing that up! Crysis is such a vague term these days :p
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I believe BF3 will be included. I will try to think of a few things for the GPU benchmark suite and let you know soon ;)
 

Gryz

Golden Member
Aug 28, 2010
1,551
204
106
I would like to explain the way I use benchmarks.

The way they are set up now, it looks like the target audience is people who have already decided that they will upgrade their videocard. The benchmarks help them to understand what performance they get when they buy Brand1-typeX versus Brand2-typeX or Brand1-typeY, so they can make an informed decision about what to buy.

But I am one step behind.
The first question I want to answer is: is it worth it to upgrade my current videocard?

And the way the current benchmarks are set up, the answer is not always easy to find. Reviews of new cards seem to compare mostly only latest generation cards. And they test mostly only the newest games (Crysis1 being the exception). That is often not enough information for me to decide whether to upgrade or not.

And the tricky thing is: the more we move forward, the more chance that I want to upgrade my current videocard. But also the more chance I will not be able to compare my current videocard against the new generations. Because reviews will not include my videocard any more.

So when Anandtech starts reviewing the new gtx600 series and the new AMD7000 series, I would like to see some old videocards for comparison. E.g. include a gtx560, a gtx460, a gtx260 and a gtx8800. That way, everybody can have a rough idea on how his current card compares to the new ones. Or how his generation compares to the new generation.

This might also mean that the newer cards should be tested with some older games. Because the latest games might not run (in DX1[01]) on the old videocards. E.g. I'd love to see Fallout3 benchmarks on new videocards. Or a similar older videogame that can be demanding with texturepacks and addons.

Hope this helps. I love the current benchmarks. With a little "history" they can be even better.
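
The "is it worth upgrading" question above can be pictured as a small calculation. The sketch below is only an illustration: it assumes you have average frame rates for an old and a new card in the same games, and it summarizes the speedup as a geometric mean of the per-game ratios. The function name, the game list, and every number are made up for the example and are not AnandTech results.

```python
import math


def relative_performance(old_fps, new_fps):
    """Geometric mean of new/old fps ratios over the games both cards ran."""
    shared = sorted(set(old_fps) & set(new_fps))
    if not shared:
        raise ValueError("the two cards have no benchmarked games in common")
    log_sum = sum(math.log(new_fps[game] / old_fps[game]) for game in shared)
    return math.exp(log_sum / len(shared))


# Hypothetical numbers, for illustration only.
old_card = {"Crysis Warhead": 22.0, "Fallout 3": 48.0}
new_card = {"Crysis Warhead": 55.0, "Fallout 3": 96.0, "Metro 2033": 40.0}

speedup = relative_performance(old_card, new_card)
print(f"New card is roughly {speedup:.2f}x the old one in shared games")
```

Games that only one card was tested in (Metro 2033 here) are simply ignored, which is why having a few older titles in the suite matters for this kind of comparison.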
 

Jacky60

Golden Member
Jan 3, 2010
1,123
0
0
Why not include an Arma 2 benchmark? Despite most Anand readers apparently never having heard of it, the game is the most demanding PC game out there. Also, it's DX9, but enabling SSAA on my humble rig can drop frame rates to alarmingly low levels.
 

Grooveriding

Diamond Member
Dec 25, 2008
9,147
1,329
126
Is it viable to have Tri-SLI and Tri-CF benches added to the suite of benchmarks? There are a lot of interesting results to be seen when comparing vendors' multi-GPU scaling at different resolutions. With that said, will we see Eyefinity/Surround benchmark comparisons as well?

I understand there may be hindrances due to hardware availability; we don't know whether you have three of each card from the two vendors when you are testing. P67/Z68 are not currently the best platforms for 3+ multi-GPU setups. But perhaps when LGA 2011 arrives, that platform will be the testbed for GPU testing?

I think the extreme end setups hold a lot of 'sex' appeal for video card enthusiasts to peruse and will get even more interest in the GPU benches here. It also would be great for some of us who purchase these setups to get the quality benchmarking here at Anand put towards some extreme setups.

Gamewise I agree with the choice of Warhead over C2, it still in my book is the gold standard of PC visuals. Hopefully we will see BF3 and perhaps Witcher 2 for some DX9 benches. Witcher 2 really pushes some great visuals and is a real system killer, as well as being an excellent example of a 'purist' PC game - much like Crysis Warhead is as they were both developed solely for the PC platform.
 

mnewsham

Grooveriding said: "Gamewise I agree with the choice of Warhead over C2, it still in my book is the gold standard of PC visuals. Hopefully we will see BF3 and perhaps Witcher 2 for some DX9 benches. Witcher 2 really pushes some great visuals and is a real system killer, as well as being an excellent example of a 'purist' PC game, much like Crysis Warhead is, in it being developed solely for the PC platform."

I was thinking Witcher 2 as well, seconded.
 

Madcatatlas

Golden Member
Feb 22, 2010
1,155
0
0
Crysis Warhead, BF3, something like Heroes of Might and Magic 6/Settlers 6+ (or a Diablo 3/DS3/DA2 type game), a Civ5 or Shogun type game, and Dirt 3 or another racing game.

Make more out of SOUND and heat characteristics as well as the physical layout.

Eyefinity/Surround is a must, at minimum 3x 1920x1080.


Please get rid of games where a single 3 year old card can get scores above 60fps. There are plenty of other sites that test bunches of irrelevant and at the same time oldie but goodie games with new hardware. Alienbabeltech comes to mind.

In my opinion stuff like F@H and Bitcoining COULD be interesting to test for and compare cards at, but it would just be the cherry on top.
 

mnewsham

I don't think Bitcoin mining can really be done, as Nvidia is getting really bad performance. And there is no standardized client, etc.
 

Termie

Diamond Member
Aug 17, 2005
7,949
48
91
www.techbuyersguru.com
One question I have is whether to include games that are known to function better on either nVidia or AMD. Examples might be Lost Planet 2, Starcraft II, Dragon Age II, etc. I think there are pros and cons to either approach.

Pros:
(1) People play these games, and they want to buy the graphics card that will play these games the best.
(2) The purpose of benchmarking is to get a broad view of the prowess of a card, not just to determine whether it will "play Crysis."

Cons:
(1) Some of these games are developed in conjunction with a single manufacturer, which means that optimizations will not be available early on for the other manufacturer. Reviewers really aren't doing the gaming community a service when they benchmark a card on day one and find it performs terribly in a game, only to have the drivers updated for that game a few days later.
(2) Some of these games are not very popular, so readers might just dismiss the scores as irrelevant. Obviously, unpopular games can still be great graphics benchmarks, but if they aren't getting equal attention from nVidia and AMD because they aren't popular, are they really valid measures of graphics card ability? They may actually be a better measure of the dedication of a particular manufacturer to driver development, which is also a relevant issue.

For games to include, I'd go for a cross-section of game types. Obviously BF3 and Crysis Warhead (I like that particularly for cross-generation comparison), Metro 2033, Dirt 3, Starcraft II, Civ 5, Mass Effect 2/3, Witcher 2, Mafia II, etc. I know many of these are already in the Anandtech lineup, so no big change there. Examples of less popular games in the current Anandtech bench that I sometimes wonder about are Hawx (I own it, so I know it's not particularly good, and it favors nVidia strongly) and Battleforge. I'm also wondering about Wolfenstein - was that in there for testing an older engine?

By the way, is Anandtech going to be looking at the issue of CPU-limitation in its benchmarking suite? What rig are you going to use, and would you consider having two separate rigs to represent two separate classes of users? At this point, the forum is full of people wanting to upgrade their dual-core machines to GTX570s and HD6950s. These same folks will want to do the same when the 28nm cards come out. I think it will help to really look at CPU performance a bit more in graphics card reviews, since even today's GPUs are far too powerful for a lot of the CPUs out there.
 

Madcatatlas

mnewsham said: "I don't think Bitcoin mining can really be done, as Nvidia is getting really bad performance. And there is no standardized client, etc."


I realise that for both F@H and Bitcoin, one of the two brands is at a disadvantage. But if this is meant for a future GPU bench, I think stuff like that should be considered as the cherry on top of all the other good stuff.

For 28nm, AMD's GCN architecture and Nvidia's Kepler might improve or balance the scores for applications such as these.
 

happy medium

Lifer
Jun 8, 2003
14,387
480
126
I know this is a video card benchmark, but somehow, some way, could you include performance numbers with different types of CPUs?

Running an i7 at 4.0 with 20 different cards, and 10 different games, only tells half the story.

I would use CPU categories. For instance, if I were to write up a benchmark today...

Make it with 4 pull-down boxes.
1. pick CPU
(Try to use CPUs that will give the reader a broader picture:
Athlon II Regor and/or E5400 dual-core category.
E8500/Core i3 and the Phenom II 555 category.
X4 955 and Q9550/Core i5 category.
X6 1100T and 2500K.)

2. pick Nvidia GPU (I would only use the GTX 2xx, 4xx and 5xx series.)
3. pick AMD GPU (use the 48xx, 5xxx and 6xxx series)
4. pick game

You need more like 12 games. And nothing older than 3 years, for God's sake, and keep it updated.
1. top 4 shooters/action
2. top 3 racers
3. top 3 RPG's
4. top 2 action adventure

This would be the one stop shopping review spot if you did it this way. :)
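
The four pull-down boxes described above amount to a lookup keyed by CPU, GPU, and game. The sketch below is one hypothetical way to hold such results; it is not how AnandTech's Bench is implemented, and the helper names and the frame rates are invented for illustration.

```python
from typing import Dict, Optional, Tuple

# Key is (cpu, gpu, game); value is average frames per second.
Results = Dict[Tuple[str, str, str], float]


def add_result(results: Results, cpu: str, gpu: str, game: str, fps: float) -> None:
    """Store one benchmark result for a CPU/GPU/game combination."""
    results[(cpu, gpu, game)] = fps


def compare(results: Results, cpu: str, nvidia_gpu: str, amd_gpu: str,
            game: str) -> Dict[str, Optional[float]]:
    """Return fps for both chosen GPUs; None marks a combination never tested."""
    return {gpu: results.get((cpu, gpu, game)) for gpu in (nvidia_gpu, amd_gpu)}


# Hypothetical numbers, for illustration only.
bench: Results = {}
add_result(bench, "Core i5 2500K", "GTX 570", "Dirt 3", 88.0)
add_result(bench, "Core i5 2500K", "HD 6950", "Dirt 3", 84.0)
add_result(bench, "Athlon II X2", "GTX 570", "Dirt 3", 51.0)

print(compare(bench, "Core i5 2500K", "GTX 570", "HD 6950", "Dirt 3"))
print(compare(bench, "Athlon II X2", "GTX 570", "HD 6950", "Dirt 3"))
```

The second lookup shows the practical cost of the idea: every extra CPU category multiplies the number of card and game combinations that have to be run before the table has no gaps.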
 

busydude

Diamond Member
Feb 5, 2010
8,793
5
76
I have just one request: no canned benchmarks please.

And one more game suggestion would be DiRT 3 or the soon-to-be-released F1 2011.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0
I have no problem with games' built-in benchmarks or timedemos.
These are often very accurate for gauging a GPU's performance. They allow repeatable results, and they make it possible to change quality settings and AA levels and see the impact on the GPU. They also allow for gauging whether drivers have improved performance or not.
 

Spikesoldier

Diamond Member
Oct 15, 2001
6,766
0
0
Madcatatlas said: "In my opinion stuff like F@H and Bitcoining COULD be interesting to test for and compare cards at, but it would just be the cherry on top."

I agree, adding GPGPU tasks such as bitcoin mining or F@H numbers would be an excellent supplement to 3D results.

I believe this would give our bench an edge over the competition, as these GPGPU applications have been rising significantly in popularity over the past year.
 

Ryan Smith

Gryz said: "So when Anandtech starts reviewing the new gtx600 series and the new AMD7000 series, I would like to see some old videocards for comparison. E.g. include a gtx560, a gtx460, a gtx260 and a gtx8800. That way, everybody can have a rough idea on how his current card compares to the new ones. Or how his generation compares to the new generation."
Bench helps out a great deal with this. We aren't always going to publish that data in our in-article charts (we have to balance the amount of data with keeping it readable), but results for some older cards will be in Bench. Realistically we're looking at a large number of current and past generation cards (e.g. 7000/6000 and 600/500 series will have good representation), while we'll have to limit older cards (5000/4000, 400/200) to a couple per generation because older (slower) cards take disproportionally longer to run on the more complex tests. Of course everyone has their own idea of what older cards should be in there, so I can't promise to please everyone. The catch will be that when there's a major launch we have a triage list of cards, so older cards may not be benchmarked in time for an article but will be added to Bench as time permits.

Jacky60 said: "Why not include an Arma 2 benchmark? Despite most Anand readers apparently never having heard of it, the game is the most demanding PC game out there. Also, it's DX9, but enabling SSAA on my humble rig can drop frame rates to alarmingly low levels."
The last I heard of Arma was that it was incredibly buggy. I don't play it myself, so I'm curious to hear as to whether this is still the case.

Grooveriding said: "Is it viable to have Tri-SLI and Tri-CF benches added to the suite of benchmarks? There are a lot of interesting results to be seen when comparing vendors' multi-GPU scaling at different resolutions. With that said, will we see Eyefinity/Surround benchmark comparisons as well?"
Tri-SLI/CF is something we're going to periodically look at, but it won't be in every article. We're rarely sampled 3 cards at launch, and it takes some time to set up those tests.

Madcatatlas said: "Make more out of SOUND and heat characteristics as well as the physical layout.

Eyefinity/Surround is a must, at minimum 3x 1920x1080."
Could you please go into more detail on what you mean with regard to sound and heat characteristics? As for surround gaming, you will be seeing more of it. It's a specialty setup though, so it won't be in every article.

Termie said: "One question I have is whether to include games that are known to function better on either nVidia or AMD. Examples might be Lost Planet 2, Starcraft II, Dragon Age II, etc. I think there are pros and cons to either approach."
Good question. Our game selection process is not a binary process - there are several factors. Popularity is a big factor, but also general performance (is it worthwhile as a high-end GPU test?), and whether it uses any interesting technologies.

Whether it performs better on one GPU or another is technically a factor, but it's a very low priority so long as the benchmark is technically sound and there aren't any significant bugs. There don't have to be any shenanigans going on for a game to perform better on one GPU architecture than another; AMD and NV's GPU architectures are very different in some important ways.

Whether the developer received co-marketing support is not a real factor, as we would simply have to drop a massive number of games if we removed everything with co-marketing. Such games will be evaluated based on whether they're technically sound benchmarks. One thing that helps us though is that we do not immediately adopt games; everything is vetted by us, a process that usually takes a month and avoids launch-day shenanigans.

And Wolfenstein is in the current benchmark suite because I believe it's important to have an OpenGL benchmark. OpenGL is not dead yet.
 

Ryan Smith

happy medium said: "I know this is a video card benchmark, but somehow, some way, could you include performance numbers with different types of CPUs?"
CPUs really aren't my beat. If you want CPU bottlenecking information, Anand is your best source; he usually has the parts on hand (i.e. CPUs), though not always the time.

busydude said: "I have just one request: no canned benchmarks please."
By "canned benchmarks" I assume you mean timedemos and the like? If so, that's not a request I'm going to be able to meet.

Our (AnandTech's) principal testing methodology is to stick to the scientific method as closely as humanly possible. That means running the same exact test on the same exact hardware and only varying the hardware being tested. And that means timedemos and other automated forms of testing when possible. Anything where I can influence the results of a test by my actions is used as a last resort.

There's a place for such testing and I'm not disparaging it, but it's not a methodology we consider suitable for our needs.
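
To make the repeatability argument concrete, here is a minimal sketch of how run-to-run spread could be checked. It is an illustration under stated assumptions, not AnandTech's actual tooling: the function names, the 2% threshold, and all of the frame rates are hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(fps_per_run):
    """Return (mean fps, coefficient of variation in percent) for repeated runs."""
    avg = mean(fps_per_run)
    # stdev needs at least two samples; a single run has no measurable spread.
    cv = stdev(fps_per_run) / avg * 100 if len(fps_per_run) > 1 else 0.0
    return avg, cv


def is_repeatable(fps_per_run, max_cv_percent=2.0):
    """Treat a test as repeatable when run-to-run spread stays under the threshold."""
    _, cv = summarize_runs(fps_per_run)
    return cv <= max_cv_percent


# Hypothetical numbers, for illustration only.
timedemo_runs = [61.8, 62.1, 61.9]   # same recorded sequence replayed each time
manual_runs = [58.0, 64.5, 61.2]     # a person playing through "the same" section

for label, runs in (("timedemo", timedemo_runs), ("manual", manual_runs)):
    avg, cv = summarize_runs(runs)
    print(f"{label}: {avg:.1f} fps, spread {cv:.1f}%, repeatable={is_repeatable(runs)}")
```

When the spread between runs on one card is larger than the gap between two cards, the test cannot tell those cards apart, which is the practical reason to prefer automated runs.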

Spikesoldier said: "I agree, adding GPGPU tasks such as bitcoin mining or F@H numbers would be an excellent supplement to 3D results."
Like I said in my first post, F@H is immediately DQ'd for the lack of a modern AMD client. The situation is such that results for AMD GPUs are not even meaningful. Bitcoin will also be passed on for some technical reasons (if I wanted to prove AMD GPUs were good at hashing we would have been using the Dnet client for years), and because the shady economics of it are not something I wish to promote. 6990s are for playing Crysis!:p
 

Termie

Ryan Smith said: "CPUs really aren't my beat. If you want CPU bottlenecking information, Anand is your best source; he usually has the parts on hand (i.e. CPUs), though not always the time."

Would you consider having two bench rigs, one for all games, and one for a small selection of games just to show the role of the CPU? It would be very useful for us readers. While that's covered in the CPU articles to some extent, it's often done with a GTX580 or an HD6990. Therefore it overcompensates in the other direction with graphics cards most people don't have. Also, while you cover GPUs, not CPUs, I think there is some evidence that certain cards require more CPU power (dual-GPU cards, and perhaps nVidia in general, to some extent). Therefore I'd argue that the consideration of CPUs is somewhat central to your analysis.

On the issue of game selection, it's great that you point out Anandtech doesn't add games on launch day. That's what can really skew results. I like that you have a set of games you use consistently, ones that have been out long enough to be optimized for any hardware.

Thanks for answering my other questions above. Just to state the obvious, this thread is an example of why Anandtech is the best enthusiast tech site out there, bar none.
 

gorobei

Diamond Member
Jan 7, 2007
3,959
1,444
136
Ryan Smith said: "Bitcoin will also be passed on for some technical reasons (if I wanted to prove AMD GPUs were good at hashing we would have been using the Dnet client for years), and because the shady economics of it are not something I wish to promote."
Is milkyway@home any more or less valid then? GPGPU is already the direction amd/nv are going, whether it is for deferred rendering performance or double precision distributed computing.

Since the point of a review is to weigh the factors associated with purchasing, hash performance is relevant. Bitcoin's "shadiness" aside, if I know that a new card can earn back even a small portion of the money I spend on it, then that may sway whether I buy a $250 hd7870 or a $400 hd7970 or even a pair of hd7970s.
[The majority of Bitcoin's issues right now stem from abuse of stolen bank and credit card numbers. Will you take a stand against Visa or Mastercard as well?]
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,002
126
Mr. President said: "From what I've noticed, the 'object detail' and shadow settings can send draw calls through the roof in that game and seem to be (by far) the biggest performance killers."
Not true, the biggest performance killer by far is shader quality, by a factor of 3-4 times. The two you mentioned aren’t even close.