6950 vs GTX 460 768MB - Why does Nvidia beat the Radeon in Civ5???

Discussion in 'Video Cards and Graphics' started by Black96ws6, Apr 5, 2011.

  1. A5

    A5 Diamond Member

    Joined:
    Jun 9, 2000
    Messages:
    4,873
    Likes Received:
    1
    I've never had that long IBT, but I've also never had that many civs on the map either. I do agree that the day-to-day user experience in the game is affected more by CPU speed and RAM than the GPU, though.
     
  2. Ryan Smith

    Ryan Smith The New Boss
    Staff Member

    Joined:
    Oct 22, 2005
    Messages:
    457
    Likes Received:
    6
    Hopefully one day NVIDIA will allow me to explain what they did to improve Civ V so much. I found out what they did, however I'm not allowed to talk about it (and boy I'm dying to). It makes all the CPU limitations make sense though, and it somewhat reshaped my view on DX11. Honestly I'm surprised the eggheads over at Beyond 3D haven't already figured this one out; it seemed kind of obvious in retrospect.
     
  3. Ryan Smith

    Ryan Smith The New Boss
    Staff Member

    Joined:
    Oct 22, 2005
    Messages:
    457
    Likes Received:
    6
    After talking things over with NVIDIA, they've agreed to allow me to discuss the precise changes they made to boost their Civ V performance by so much. So gather around children, crazy uncle Ryan has a story to tell.

    ---

    In our description of Civ V, I've mentioned that it uses "a slew of DirectX 11 technologies" but I've never gone into great detail on what those are. I'm not going to go into deep detail on that now - there's a good article over at PC Games Hardware that contains an interview with Firaxis about that - but I will quickly explain the ins and outs.

    Often from a gamer standpoint it's natural to look at the immediate visual benefits of a new API. With DX11, the big feature is tessellation, with a secondary feature of contact hardening shadows. However, there's also a great deal of stuff going on in the backend for developers to make things faster - making things faster allows developers to use new graphical effects that may not have been practical before. So for DX11, on top of tessellation and contact hardening shadows, there are also things like multi-threaded rendering, compute shaders, support for larger textures, and the implementation of a pull model for certain attribute evaluation.

    So why do I like Civ V? Because the LORE engine it's based on implements so many of these features. Sure, something like AvP will have tessellation added, or Bad Company 2 will implement contact hardening shadows, but most of the DX11 games today are adding one or two graphical features that improve the look of the game, but only begin to scratch the surface of the API. LORE goes much, much deeper. Firaxis uses multi-threaded rendering, they use compute shaders for texture compression, and they use tessellation. Today it's probably the most extensive AAA DX11 game that has been released so far. This makes it a great GPU benchmark, as it's a real game we can use to test features other games don't touch.

    So what then is going on that made Civ V so much faster for NVIDIA? Admittedly I had to press NVIDIA for this - performance practically doubled on high-end GPUs, which is unheard of. Until they told me what exactly they did, I wasn't convinced it was real or if they had come up with a really sweet cheat. It definitely wasn't a cheat.

    If you recall from our articles, I keep pointing to how we seem to be CPU limited at the time. Now if you go back to the list of DX11 features Civ V uses, a light bulb should light up: multi-threaded rendering. Civ V uses multi-threaded rendering; in fact it uses it quite extensively. Now why do we have multi-threaded rendering in the first place? Half of this is to better mesh with multi-threaded games by enabling additional threads to directly contribute without having to go through a master thread first. But a second purpose is that multi-threaded rendering helps the GPU just as much as it helps the CPU.

    [​IMG]

    Traditionally, rendering is a very serial process. The program needs to set up a bunch of objects and then pass them on to the video drivers and finally to the GPU. There's a high degree of submission overhead, meaning it's possible to choke the CPU while submitting a large number of objects to the GPU. In DirectX 11, multi-threaded rendering is achieved by turning the D3D pipeline into a 3-step process: the Device, the Immediate Context, and the Deferred Context. The important bit here is that the deferred context is full of things that have yet to be sent to the GPU, and that you can have a deferred context for each thread. When developers talk about multi-threaded rendering with DX11, this is what they're referring to. When you use DX11's multi-threaded rendering capabilities correctly, you can have several threads assemble their deferred contexts, and then combine them into a single command list once it comes time to render the scene.
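
    To make the record-then-replay idea concrete, here is a rough conceptual sketch in Python. It is not D3D11 code and not how Civ V or any driver does it: the class and function names are made up for illustration, loosely mirroring the real ID3D11Device::CreateDeferredContext, ID3D11DeviceContext::FinishCommandList and ID3D11DeviceContext::ExecuteCommandList calls. Each worker thread records into its own deferred context, and the main thread then plays the finished command lists back in a fixed order. Python threads won't give real parallel speedup here; only the structure is the point.

```python
import threading


class DeferredContext:
    """Hypothetical stand-in for a per-thread deferred context: it only records."""

    def __init__(self):
        self._commands = []

    def draw(self, obj):
        # Nothing reaches the "GPU" here; the command is just queued.
        self._commands.append(("draw", obj))

    def finish_command_list(self):
        # Hand back everything recorded so far and reset the context.
        commands, self._commands = self._commands, []
        return commands


class ImmediateContext:
    """Hypothetical stand-in for the single context that actually submits work."""

    def __init__(self):
        self.submitted = []

    def execute_command_list(self, command_list):
        self.submitted.extend(command_list)


def record_chunk(objects, results, slot):
    # Runs on a worker thread with its own private deferred context.
    ctx = DeferredContext()
    for obj in objects:
        ctx.draw(obj)
    results[slot] = ctx.finish_command_list()


def render_frame(objects, num_threads):
    # Split the scene across threads (round-robin, an arbitrary choice).
    chunks = [objects[i::num_threads] for i in range(num_threads)]
    command_lists = [None] * num_threads
    workers = [
        threading.Thread(target=record_chunk, args=(chunk, command_lists, i))
        for i, chunk in enumerate(chunks)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # Only the main thread talks to the immediate context, in a fixed order.
    immediate = ImmediateContext()
    for command_list in command_lists:
        immediate.execute_command_list(command_list)
    return immediate.submitted


if __name__ == "__main__":
    frame = render_frame(["tile%d" % i for i in range(10)], num_threads=4)
    print(len(frame), "commands submitted")
```

    The design point is that the expensive part (walking the scene and building commands) is spread over several threads, while the one step that has to stay serial (handing work to the GPU) shrinks to replaying lists that are already built.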

    So Civ V uses proper multi-threaded rendering, that's great! So why isn't this the end of the story?

    It turns out that you don't actually need to support all these nifty multi-threading features to be DX11 (or rather D3D11) compliant - those features are optional - and that's what happened. And this is what changed my perspective on DX11, as before now I had never realized that anything in the API/spec was optional. Previously we had all the pieces to understand what was going on, but without knowing that AMD and NVIDIA did not fully support multi-threaded rendering, it was never clear what the bottleneck was.

    But let's be clear here: multi-threaded rendering is a massive undertaking on the driver and hardware side. You're doing the GPU equivalent of inventing the multi-tasking operating system. Until this point NVIDIA and AMD have not supported multi-threaded rendering in their drivers, as they have needed time to implement this feature correctly. If you have the DX SDK installed, this is visible in the DX Caps Viewer, in the D3D11 section under the title "Driver Command Lists".

    [​IMG]

    So in a nutshell, 4 months ago Civ V supported multi-threaded rendering. AMD and NVIDIA did not.

    Can you guess then what changed?

    With the Release 265 series drivers, NVIDIA enabled partial support for DX11's multi-threaded rendering features. At the time this support was limited to just Civ V, and while it was beyond the experimental stage it was clearly limited to Civ V as that allowed NVIDIA to deploy it against a single known program while they collected feedback and finished the other aspects of multi-threaded rendering.

    With NVIDIA's drivers now allowing Civ V to use multiple deferred contexts, Civ V's performance shot way up. With high-end GPUs, performance damn near doubled at lower resolutions. Civ V was in fact CPU limited - it was CPU limited because it was only able to use a single thread to assemble its contexts, and that thread was maxing out the single CPU core it could use. This is why drivers played such a big part in Civ V's performance: how the drivers handled D3D11 contexts was the key to unlocking it.
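
    As a back-of-the-envelope illustration of why the gain shows up mostly at lower resolutions, here is a toy model in Python. Every number in it is invented for the example (none are measurements from Civ V or any driver), and it assumes recording splits perfectly across threads with no replay overhead, which real drivers won't achieve. The frame takes as long as the slower of the CPU submission work and the GPU work.

```python
def frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms, threads):
    """Toy model: CPU submission cost is divided evenly across threads,
    and the frame takes as long as the slower of CPU and GPU."""
    cpu_ms = draw_calls * cpu_us_per_call / 1000.0 / threads
    return max(cpu_ms, gpu_ms)


def fps(draw_calls, cpu_us_per_call, gpu_ms, threads):
    return 1000.0 / frame_time_ms(draw_calls, cpu_us_per_call, gpu_ms, threads)


if __name__ == "__main__":
    # Hypothetical scene: 10,000 draw calls at 4 microseconds of CPU time each.
    calls, cost_us = 10_000, 4

    # Hypothetical GPU times: 20 ms at a low resolution, 45 ms at a high one.
    for label, gpu_ms in (("low res", 20.0), ("high res", 45.0)):
        one = fps(calls, cost_us, gpu_ms, threads=1)
        four = fps(calls, cost_us, gpu_ms, threads=4)
        print(f"{label}: {one:.1f} fps on 1 thread, {four:.1f} fps on 4 threads")
```

    With these made-up inputs the low-resolution case is CPU bound on one thread (40 ms of submission against 20 ms of GPU work), so spreading the submission over four threads doubles the frame rate. The high-resolution case is GPU bound either way and doesn't move.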

    At this point in time we appear to be GPU limited, but we may also be CPU limited. Firaxis says Civ V can scale to 12 threads; this would be a hex-core CPU with hyperthreading. Our testbed is only a quad-core CPU with HT, meaning we probably aren't maxing out Civ V on the CPU side. And even with HT, it's likely that 12 real cores would improve on performance relative to 6 cores + HT. Firaxis believes they're GPU limited, but it's hard to definitively tell which it is.

    [​IMG]
    Image from Firaxis GDC11 presentation

    In any case, full support for multi-threaded rendering was finally enabled in NVIDIA's Release 270 drivers, which were released last week. At this point any game or application can take advantage of the feature, and not just Civ V. This is also why NVIDIA has finally allowed me to write about what they've previously told me, as they no longer consider it a secret. Finally, on a side note, the fact that Civ V had this feature enabled in NVIDIA's drivers early is why performance does not appear to have changed between Release 265 and Release 270.

    Anyhow, as far as I know, AMD does not currently offer full support for multi-threaded rendering (I don't have an AMD card plugged in right now to run the DX Caps Viewer against). I'm not sure where they are on it, though I doubt they're very far behind.

    So in conclusion, the reason NVIDIA beats AMD in Civ V is that NVIDIA currently offers full support for multi-threaded rendering/deferred contexts/command lists, while AMD does not. Civ V uses massive amounts of objects and complex terrain, and because it's multi-threaded rendering capable the introduction of multi-threaded rendering support in NVIDIA's drivers means that NVIDIA's GPUs can now rip through the game.

    This is the true power of DX11. When properly implemented in both drivers and games, DX11's multi-threaded rendering capabilities are going to allow developers to push a lot more stuff out to the GPU without immediately bottlenecking the CPU.

    On a future note, while Civ V is the first game to use DX11 multi-threaded rendering, it is not going to be the last. Battlefield 3 will most likely use it - DICE was lamenting the lack of driver support last month at GDC. The Capcom team responsible for Lost Planet 2 also mentioned how they would have liked to have this feature working before LP2, though I can't find the article at this time.

    Coincidentally, last month's interview with AMD's Richard Huddy at Bit-Tech also has a lot in common with this. AMD says DX11 multi-threaded rendering can double object/draw-call throughput, and they want to go well beyond that by bypassing the DX11 API.

    Further Reading: AnandTech, Revealing The Power of DirectX 11
     
    #28 Ryan Smith, Apr 8, 2011
    Last edited: Apr 8, 2011
  4. alcoholbob

    alcoholbob Diamond Member

    Joined:
    May 24, 2005
    Messages:
    5,243
    Likes Received:
    1
    I don't usually read a wall of text, but that was very informative. Good read :)
     
  5. jimbo75

    jimbo75 Senior member

    Joined:
    Mar 29, 2011
    Messages:
    223
    Likes Received:
    0
    Yeah, no doubt that will be repaid in the next GeForce review.

    The problem with Civ 5 is the benchmark is not indicative of ingame play and it should never have been used. If it had been AMD at the top it wouldn't have been, that's why there is no Dragon Age 2 either.
     
  6. notty22

    notty22 Diamond Member

    Joined:
    Jan 1, 2010
    Messages:
    3,376
    Likes Received:
    0
    Thanks Ryan, good stuff to digest and give some life back in to dx11 relevance. Now hopefully Crytek is working on some of this to implement in Crysis 2.
     
  7. Ryan Smith

    Ryan Smith The New Boss
    Staff Member

    Joined:
    Oct 22, 2005
    Messages:
    457
    Likes Received:
    6
    I'm assuming this is referring to our current test suite?

    The GPU test suite is refreshed roughly every 6 months. The last time it was refreshed was in late October of 2010, and as such there are not any games newer than that in the suite. It will be refreshed here in the next month or so.
     
  8. AtenRa

    AtenRa Lifer

    Joined:
    Feb 2, 2009
    Messages:
    11,258
    Likes Received:
    41
    Nice work Ryan, ;)

    Of all the PC games, Civ V was the last one I would have expected to see such features in, as it isn't an FPS, and yet it was the first. NICE WORK Firaxis ;)

    Edit: It really puts the big FPS boys to shame (Crytek ??)
     
    #33 AtenRa, Apr 8, 2011
    Last edited: Apr 8, 2011
  9. MustangSVT

    MustangSVT Lifer

    Joined:
    Oct 7, 2000
    Messages:
    11,533
    Likes Received:
    0
    great post. informative!
     
  10. Silverforce11

    Joined:
    Feb 19, 2009
    Messages:
    10,458
    Likes Received:
    1
  11. Lepton87

    Lepton87 Platinum Member

    Joined:
    Jul 28, 2009
    Messages:
    2,500
    Likes Received:
    0
  12. Lonyo

    Lonyo Lifer

    Joined:
    Aug 10, 2002
    Messages:
    21,939
    Likes Received:
    0
    I have to question a benchmark that puts the GTX480 ahead of the GTX580 by 10%.
     
  13. Chiropteran

    Chiropteran Diamond Member

    Joined:
    Nov 14, 2003
    Messages:
    8,841
    Likes Received:
    1
    That was a very interesting read. IMO this is a very important technology, as it seems to be a big key for making games that properly scale to more than 2 CPU cores. I hope AMD gets their act together soon.
     
  14. jimbo75

    jimbo75 Senior member

    Joined:
    Mar 29, 2011
    Messages:
    223
    Likes Received:
    0
    Maybe you should check the older Civ V benchmarks? More often than not the GTX 460 came out on top, ahead of the 480. That didn't matter because they were all ahead of the Radeons. :whiste:
     
    #39 jimbo75, Apr 8, 2011
    Last edited: Apr 8, 2011
  15. AtenRa

    AtenRa Lifer

    Joined:
    Feb 2, 2009
    Messages:
    11,258
    Likes Received:
    41
  16. Martimus

    Martimus Diamond Member

    Joined:
    Apr 24, 2007
    Messages:
    4,384
    Likes Received:
    1
    Civ 5 is the only game that uses a multithreaded graphics engine. I would guess that nVidia just released multithreaded DX11 rendering drivers first, and AMD has yet to release multithreaded DX11 rendering drivers. (http://www.pcgameshardware.com/aid,...h-Interview-What-DirectX-11-is-good-for/News/)

    EDIT: I just went through and read the rest of the posts, and Ryan actually verified what I had thought all along. It is nice to see that this feature really does increase performance that much. We'll probably start seeing it more in games in the future. (Although I read that 4 other games canceled development on multi-threaded rendering due to lack of support for it in current drivers.)
     
    #41 Martimus, Apr 8, 2011
    Last edited: Apr 8, 2011
  17. Lonyo

    Lonyo Lifer

    Joined:
    Aug 10, 2002
    Messages:
    21,939
    Likes Received:
    0
    But isn't that the whole point of this entire thread?
    NV has fixed its driver problems -> performance increases -> the odd capped situation where the GTX460 is faster than the 480 is solved -> performance should now reflect, well, performance -> GTX580 should be ahead of GTX480.

    [​IMG]
    470 and 460 faster than 480 on old (not fully functional) driver.

    [​IMG]
    580 > 570 > 480 > 560 > 460
    Properly functional driver. Properly reflective scores because they aren't capped by driver limitations.

    The benchmark quoted was suggesting that AMD outscores NV even with the fixed driver because the performance of the built in benchmark doesn't reflect game performance in terms of NV vs AMD difference.
    The implication of the other graph is not that NV has a lead, but that the fixed driver is in fact not fixed, or the benchmark is wrong, because with a fixed driver it should be that the GTX580 is faster than the 480, especially when the 580 and 570 have different performance to each other and it scales with the GTX590, meaning performance isn't capped.

    Therefore the high GTX480 is completely unusual and not what you would expect under any circumstance. If it was because the cards were capped, the GTX590 shouldn't scale and the 580 and 480 should be at about the same level, the 480 shouldn't have a 10% lead.

    They also have some odd results in other Civ 5 benchmarks, like the 6790 being soundly beaten by the 5770 and sometimes almost the 5750. Since they use the same architecture, and have the same functional units, and the 6790 is improved in many ways, it's odd that it would ever be slower, especially when you crank up AA and res (1920x1200/8xAA and the 5770 is faster than the 6790, despite the massive bandwidth advantage of the 6790 and the improved tess performance and the equal in every other aspect specs, on paper at least).
     
    #42 Lonyo, Apr 8, 2011
    Last edited: Apr 8, 2011
  18. jimbo75

    jimbo75 Senior member

    Joined:
    Mar 29, 2011
    Messages:
    223
    Likes Received:
    0
    Or maybe it's because AMD fixed the game drivers and not the benchmark drivers?

    Even SKYMTL doesn't use Civ V because "In addition, most sites use the Civ benchmarking tool which I find IS NOT representative of the in-game performance."

    http://www.xtremesystems.org/forums/showpost.php?p=4791582&postcount=81
     
  19. Lonyo

    Lonyo Lifer

    Joined:
    Aug 10, 2002
    Messages:
    21,939
    Likes Received:
    0
    I'm not talking about NV vs AMD, I'm talking about AMD vs AMD and NV vs NV.
    Within each product family, the performance shown in the benchmarks from that Polish site doesn't add up with what would be expected based on specs and performance in everything else.
    6790 > 5770 in specs and in every other game.
    580 > 480 in specs and every other game.
    In the Polish site's Civ 5 benchmarks, it's the other way round.
     
  20. jimbo75

    jimbo75 Senior member

    Joined:
    Mar 29, 2011
    Messages:
    223
    Likes Received:
    0
    Yes and they didn't match up for nVidia in the benchmark until a driver fix.

    My point is, they match up for AMD ingame and they don't for nVidia. To me that points to nVidia optimising for a benchmark while AMD optimised the actual game.

    Would you *really* be surprised to find out this is the case?
     
  21. Lepton87

    Lepton87 Platinum Member

    Joined:
    Jul 28, 2009
    Messages:
    2,500
    Likes Received:
    0
  22. Genx87

    Genx87 Lifer

    Joined:
    Apr 8, 2002
    Messages:
    39,451
    Likes Received:
    6
    Ryan, from what it sounds like, is this multi-threaded rendering like out-of-order execution on the CPU side?
     
  23. AtenRa

    AtenRa Lifer

    Joined:
    Feb 2, 2009
    Messages:
    11,258
    Likes Received:
    41
  24. jimbo75

    jimbo75 Senior member

    Joined:
    Mar 29, 2011
    Messages:
    223
    Likes Received:
    0
    That is it?

    http://translate.googleusercontent....le.com&usg=ALkJrhiHa5XXx-4A_fr8JeQGsJ5L1baVZg

    Click on any of the Civ 5 benchmarks. In fact, look at all of them - they are all exactly what you would expect to see.

    The only problem is with Civ 5, and only because the 480 beats the 580 in that. To me that's a clear case of nVidia not bothering to fix the actual game issues when it was easier and more profitable to fix the benchmark instead.
     
  25. notty22

    notty22 Diamond Member

    Joined:
    Jan 1, 2010
    Messages:
    3,376
    Likes Received:
    0
    IMO, that's not what is going on. The whole point of Ryan's explanation was multi-threading for better performance.
    The benchmark workload is very high. It's probably very similar to putting the Heaven benchmark on extreme tessellation.
    The resulting fps gap between AMD and Nvidia GPUs widens with that setting.
    Optimizing for a benchmark hints at cheating, and would be done by lessening the workload on the GPU.
    Some reviewers may feel the benchmark workload is never seen in the game. I don't know. Every reviewer is different. TPU used to only test Dirt2 in DX9, for his own reasons.