AVX2 and FMA3 in games

Discussion in 'CPUs and Overclocking' started by Carfax83, Jan 8, 2016.

  1. Carfax83

    Carfax83 Diamond Member

    Joined:
    Nov 1, 2010
    Messages:
    5,641
    Likes Received:
    379
    Simple question, but do any games use AVX2 and FMA3 yet? I know that some are using AVX, especially since the PS4 and Xbox One both support it. Several physics engines also use AVX.

    I did a Google search and the only thing I could find was that Serious Sam 3 might use AVX2, and this was speculative more than anything.
     
  3. NostaSeronx

    NostaSeronx Platinum Member

    Joined:
    Sep 18, 2011
    Messages:
    2,138
    Likes Received:
    191
    #2 NostaSeronx, Jan 8, 2016
    Last edited: Jan 8, 2016
  4. Carfax83

    Carfax83 Diamond Member

    Joined:
    Nov 1, 2010
    Messages:
    5,641
    Likes Received:
    379
    It's great to know that it will be mandated for DX12! :D
     
  5. TheELF

    TheELF Platinum Member

    Joined:
    Dec 22, 2012
    Messages:
    2,528
    Likes Received:
    174
    It being available is far from it being mandated...
     
  6. itsmydamnation

    itsmydamnation Golden Member

    Joined:
    Feb 6, 2011
    Messages:
    1,702
    Likes Received:
    601
    You have to remember that, generally speaking, both FMA and AVX are only going to be an incremental improvement over SSE for games. It would be interesting to see what PC devs actually target today. SSE4? SSSE3?
     
  7. zlatan

    zlatan Senior member

    Joined:
    Mar 15, 2011
    Messages:
    553
    Likes Received:
    227
    SSE4.2
    The problem with AVX is that the majority of the best-selling processors do not support it, for example every Celeron and Pentium. Even if it might be a useful way to improve application performance, publishers don't finance the research. They want an alternative optimization strategy that also improves performance for Celeron/Pentium users.
    The consoles might help AVX adoption, but AVX2 is still not a really useful option.
    I personally think that an HSA runtime is the best way to get AVX2 and even AVX-512 support into applications. That platform is cheap, and we can target a lot of extensions/accelerators with the same codebase. I really think that SYCL 2.1 will also be a revolutionary step for programmers.
     
    #6 zlatan, Jan 9, 2016
    Last edited: Jan 9, 2016
  8. superstition

    superstition Platinum Member

    Joined:
    Feb 2, 2008
    Messages:
    2,219
    Likes Received:
    216
    Game developers aren't going to want to lose sales to people who don't have AVX2 CPUs. This is one of the reasons why game engines aren't being made that fully take advantage of eight threads. They need the games to run well on an i3.

    How willing developers are to jettison non-AVX1 customers I don't know. Perhaps the solution is to offer code that uses non-AVX for processors that lack it but AVX for those that have it. That's more work but it's more reasonable than cutting out a substantial base.
     
  9. TheELF

    TheELF Platinum Member

    Joined:
    Dec 22, 2012
    Messages:
    2,528
    Likes Received:
    174
    Oh so that's the reason why all games of the last 2-3 years run so well on any CPU... (not)
     
  10. superstition

    superstition Platinum Member

    Joined:
    Feb 2, 2008
    Messages:
    2,219
    Likes Received:
    216
    Asking devs to make everything for the Anniversary Pentium is a bit of a stretch. As for 2-3 years of i5s and i7s, I fail to see the issue. Some of the slowest i3s may have difficulty but those with high enough clocks should still be viable even if they're two or three years old.

    An i3 gained a lot of FPS in DX12 in Ashes, so it seems that developers are even targeting the i3 for DX12 titles.
     
  11. NTMBK

    NTMBK Diamond Member

    Joined:
    Nov 14, 2011
    Messages:
    7,977
    Likes Received:
    892
    Even SSE4.2 is a bit risky. Phenom didn't have it, and there are a lot of those still out there.
     
  12. Carfax83

    Carfax83 Diamond Member

    Joined:
    Nov 1, 2010
    Messages:
    5,641
    Likes Received:
    379
    I'm no programmer, but I was under the impression that extensions such as AVX2 were backward compatible with older extensions. For example, a new CPU like Haswell or Skylake would run the fastest codepath with AVX2, while a CPU like Sandy Bridge would use the same codepath but with less throughput/performance because it lacks AVX2.

    I really have to wonder though at some of the massive performance gains on the CPU side seen in recent games, such as Dying Light for instance. They went from this at the game's launch:

    [IMG]

    To this 11 months later, a more than doubling of performance for many CPUs on that list:

    [IMG]

    The game was definitely CPU limited when it first shipped, but now it performs very well. So I wonder, did they get these gains by exploiting more vectorization, or was it all due to better multithreading?

    It seems more the latter, as CPUs with more threads/cores gained more performance.
     
  13. ShintaiDK

    ShintaiDK Lifer

    Joined:
    Apr 22, 2012
    Messages:
    20,395
    Likes Received:
    128
    Yes, they do. An "easy" way to tell is to watch for when Haswell/Broadwell/Skylake enters the AVX2/FMA mode with its higher voltage. But it's not something programmers are going to tell you in a list.
     
  14. zlatan

    zlatan Senior member

    Joined:
    Mar 15, 2011
    Messages:
    553
    Likes Received:
    227
    Phenom is not a target now. It's too old.
     
  15. NTMBK

    NTMBK Diamond Member

    Joined:
    Nov 14, 2011
    Messages:
    7,977
    Likes Received:
    892
    Phenom II is still listed as minimum for plenty of recent games. (Llano also lacked SSE4.2.) SSE4.1/4.2 isn't that essential, to be honest: the blend instructions are nice, but you can get the same effect with an (and)|(andnot) sequence.
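    A minimal sketch of that (and)|(andnot) select, assuming GCC, Clang, or MSVC on a CPU with SSE2. The names `select_ps` and `max4` are made up for illustration, and the scalar branch is only there so the sketch builds on other targets. With an all-ones/all-zeros lane mask this gives the same result as the SSE4.1 blend (`_mm_blendv_ps`), just in three instructions instead of one.

```c
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* SSE/SSE2 intrinsics, no SSE4.1 needed */
#define USE_SSE2 1
#endif

#ifdef USE_SSE2
/* Hypothetical helper: per-lane select without SSE4.1.
   Lanes where mask is all ones come from if_set, the rest from if_clear.
   _mm_andnot_ps(mask, x) computes (~mask) & x. */
static __m128 select_ps(__m128 mask, __m128 if_set, __m128 if_clear) {
    return _mm_or_ps(_mm_and_ps(mask, if_set),
                     _mm_andnot_ps(mask, if_clear));
}
#endif

/* Hypothetical example use: out[i] = (x[i] > y[i]) ? x[i] : y[i]
   for exactly four floats. */
void max4(const float *x, const float *y, float *out) {
#ifdef USE_SSE2
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);
    __m128 mask = _mm_cmpgt_ps(vx, vy);  /* all ones where x > y */
    _mm_storeu_ps(out, select_ps(mask, vx, vy));
#else
    /* Scalar fallback for non-x86 builds. */
    for (int i = 0; i < 4; ++i)
        out[i] = (x[i] > y[i]) ? x[i] : y[i];
#endif
}
```

    A real engine would use `_mm_max_ps` for a plain maximum; the compare-and-select form is shown only because it is the general pattern the blend instructions replace.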
     
  16. zlatan

    zlatan Senior member

    Joined:
    Mar 15, 2011
    Messages:
    553
    Likes Received:
    227
    SIMD is not implemented well in x86. It forces you to deal with multithreading, prefetching, and small registers. MMX, SSE and AVX are very inefficient compared to other SIMD models. Also, every new SIMD register size will just make your original code outdated. The best way to do SIMD is to generate specialized code for each CPU model, and this will be relatively fast, but most publishers just don't finance it. Today it is much more logical to use one or more IRs, and from there you can compile the code for a lot of hardware targets. The real question is which IRs and compilation paths give the best overall results.

    This is pretty much normal when you build a new engine. The reason why you won't see it in most games is that most publishers don't really finance the engine optimization backports to the released titles. :(
    The primary optimization strategy for a new engine should focus on reverse engineering the graphics kernel drivers. Knowing where the stalls are in the code/drivers is a huge success, and you can optimize accordingly.
     
    #15 zlatan, Jan 9, 2016
    Last edited: Jan 9, 2016
  17. zlatan

    zlatan Senior member

    Joined:
    Mar 15, 2011
    Messages:
    553
    Likes Received:
    227
    That's why SSE2 is mainstream now, but SSE4.2 is a much more logical choice for current projects.
     
  18. jhu

    jhu Lifer

    Joined:
    Oct 10, 1999
    Messages:
    11,915
    Likes Received:
    7
    That's not how it works. If a program is compiled to only use AVX2, it won't run on a processor without AVX2 support (e.g. on Linux it will be killed with an illegal instruction signal, SIGILL). If it's compiled to only use AVX, then processors that support AVX2 will also be able to run it.

    And then there's dispatching that can be done where the code detects CPU flags and runs different code paths depending on what's found.
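    A rough sketch of that kind of dispatching, assuming GCC or Clang on x86 (it relies on their `__builtin_cpu_supports` builtin and `target` function attribute; MSVC has neither and would need its own CPUID check, e.g. via `__cpuid`). The function names are hypothetical. Only `sum_avx` is compiled with AVX enabled, so the rest of the binary still runs on CPUs without it.

```c
#include <stddef.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#endif

/* Baseline path: plain C, runs on any CPU. */
static float sum_scalar(const float *data, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

#ifdef HAVE_X86_DISPATCH
/* AVX path: the target attribute enables AVX for this function only. */
__attribute__((target("avx")))
static float sum_avx(const float *data, size_t n) {
    __m256 acc = _mm256_setzero_ps();
    size_t i = 0;
    for (; i + 8 <= n; i += 8)               /* 8 floats per iteration */
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));

    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float total = 0.0f;
    for (int k = 0; k < 8; ++k)              /* horizontal sum of the lanes */
        total += lanes[k];
    for (; i < n; ++i)                       /* leftover elements */
        total += data[i];
    return total;
}
#endif

/* Hypothetical entry point: checks the CPU at run time and picks a path. */
float sum_dispatch(const float *data, size_t n) {
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx"))
        return sum_avx(data, n);
#endif
    return sum_scalar(data, n);
}
```

    Real code would usually do the check once at startup and store a function pointer rather than testing on every call. Note that the two paths add the floats in a different order, so results can differ in the last bits for general inputs, which is part of the extra validation work with multiple code paths.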
     
  19. Arachnotronic

    Joined:
    Mar 10, 2006
    Messages:
    11,543
    Likes Received:
    1,852
    I am truly and utterly amazed that Intel tries to segment its processors by disabling ISA features. This is the product of a marketing department that has no idea of the ramifications of its actions. They are literally hindering the already naturally slow pace of the adoption of new ISA features that could make their processors run a hell of a lot faster.
     
  20. Thala

    Thala Senior member

    Joined:
    Nov 12, 2014
    Messages:
    579
    Likes Received:
    107
    They are neither binary compatible, as the instruction encoding is different, nor logically/semantically compatible, as the instructions operate on different/wider data types.
     
  21. lamedude

    lamedude Golden Member

    Joined:
    Jan 14, 2011
    Messages:
    1,198
    Likes Received:
    3
    The Visual Studio 2013/2015 C runtime will use FMA3 for some math functions.
    Jaguar executes AVX on 128-bit units (256-bit operations are split in two), so there is no use in hand-writing 256-bit vector code for BoneStation.
    MS's compiler only offers SSE/SSE2/AVX/AVX2 as /arch targets, and PhysX/Skyrim have shown us you have to be a wizard to change that setting (thankfully the default changed to SSE2 in VS2012).
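    For anyone wondering what FMA3 actually buys a math routine: a fused multiply-add computes a*b+c with a single rounding instead of two. The sketch below is only an illustration of that property, not how any particular C runtime uses it. It assumes GCC or Clang on x86, and the names `fma3_available`, `fused_mul_sub` and `product_rounding_error` are made up.

```c
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_FMA3_PATH 1
#endif

/* Returns 1 when the FMA3 path below can run on this CPU, else 0. */
int fma3_available(void) {
#ifdef HAVE_FMA3_PATH
    return __builtin_cpu_supports("fma") ? 1 : 0;
#else
    return 0;
#endif
}

#ifdef HAVE_FMA3_PATH
/* a*b - c with one rounding, using the scalar FMA3 instruction.
   The target attribute enables FMA for this function only. */
__attribute__((target("fma")))
static double fused_mul_sub(double a, double b, double c) {
    __m128d r = _mm_fmsub_sd(_mm_set_sd(a), _mm_set_sd(b), _mm_set_sd(c));
    return _mm_cvtsd_f64(r);
}
#endif

/* Hypothetical demo: the rounding error of the ordinary product a*b.
   p is a*b rounded to double; the fused a*b - p recovers what the
   rounding threw away. Without FMA3 this sketch cannot recover it
   and simply returns 0.0. */
double product_rounding_error(double a, double b) {
    double p = a * b;
#ifdef HAVE_FMA3_PATH
    if (fma3_available())
        return fused_mul_sub(a, b, p);
#endif
    (void)p;
    return 0.0;
}
```

    For example, with a = 1 + 2^-30 the exact square is 1 + 2^-29 + 2^-60, and the last term does not fit in a double, so the plain product drops it. On a CPU with FMA3, `product_rounding_error(a, a)` returns exactly that dropped 2^-60.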
     
  22. Tuna-Fish

    Tuna-Fish Senior member

    Joined:
    Mar 4, 2011
    Messages:
    860
    Likes Received:
    91
    This is not true at all. If you put an AVX2 instruction into a program and run it on an older CPU, the program crashes with an illegal instruction exception.
     
  23. ShintaiDK

    ShintaiDK Lifer

    Joined:
    Apr 22, 2012
    Messages:
    20,395
    Likes Received:
    128
    It's not an issue to have AVX2 support and still run on CPUs without AVX2. This is essentially what all the "Intel compiler cheats" were about in the old days.
     
  24. Nothingness

    Nothingness Golden Member

    Joined:
    Jul 3, 2013
    Messages:
    1,788
    Likes Received:
    153
    That still means an increase in validation effort, as you have to ensure all paths are correct. And not everyone uses icc; the vast majority of the industry relies on the MS compiler.
     
  25. ShintaiDK

    ShintaiDK Lifer

    Joined:
    Apr 22, 2012
    Messages:
    20,395
    Likes Received:
    128
    Everything performance oriented uses ICC.
     
  26. jhu

    jhu Lifer

    Joined:
    Oct 10, 1999
    Messages:
    11,915
    Likes Received:
    7
    Not everything. Most Windows PC games use MSVC (and certainly console games don't use icc at all).