AVX and FMA

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
With the pending release of AVX (SB and BD) and the future release of FMA (BD and Haswell), I want to better understand what benefits we can expect for various programs.

I have read that Win7 SP1 will include support for AVX, but the beta out now does not. Does this mean that any application developed for AVX use will not work unless you have OS support? And does OS support mean it will be used by OS layers and not just 3rd party applications?

Does the same apply to FMA instructions? Does it need OS support? I know most video cards use it, but that comes from the driver to my understanding.

Will FMA and AVX help gaming? video editing? productivity applications? graphic design? general OS features?
 

mutz

Senior member
Jun 5, 2009
343
0
0
"I know most video cards use it, but that comes from the driver to my understanding."
third party drivers for any software usage and future Win 7 support..?

so what's the pressure..?
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
I guess my main question is: if drivers do support such instructions, why does Windows need to add it to a SP? I am a little confused there. If I wrote an application that uses AVX, does that mean I cannot run it until I get Win 7 SP1 (assuming I have a capable CPU)?
 

jones377

Senior member
May 2, 2004
450
47
91
AVX uses a new set of registers that the OS needs to support properly. Every time there is a context switch between threads, the processor state (registers, flags, etc.) is saved for the thread that is switched out and loaded for the thread that is switched in. So the OS needs to know to save/reload the new AVX registers.
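
To illustrate the save/reload point, here is a toy Python model, not real kernel code. It assumes a single 256-bit register and two hypothetical save modes: "legacy" stores only the low 128 bits (all an SSE-era OS knows about) and "full" stores all 256 bits. The names `ToyCPU`, `save_state`, `restore_state` and `run_two_threads` are invented for this sketch.

```python
# Toy model only: one 256-bit register, held as a Python int.
MASK128 = (1 << 128) - 1
MASK256 = (1 << 256) - 1


class ToyCPU:
    def __init__(self):
        self.ymm0 = 0  # stand-in for one 256-bit AVX register


def save_state(cpu, mode):
    """Return the register image the OS keeps for the outgoing thread."""
    if mode == "legacy":
        return cpu.ymm0 & MASK128   # upper 128 bits are silently dropped
    return cpu.ymm0 & MASK256


def restore_state(cpu, image, mode):
    """Load the incoming thread's image back into the register."""
    if mode == "legacy":
        # Only the low half is rewritten; the upper half keeps whatever
        # the previously running thread left behind.
        upper = cpu.ymm0 & ~MASK128 & MASK256
        cpu.ymm0 = upper | (image & MASK128)
    else:
        cpu.ymm0 = image & MASK256


def run_two_threads(mode):
    """Thread A sets the register, thread B runs, then A resumes.

    Returns True if A finds its original value after the switch back.
    """
    cpu = ToyCPU()
    a_value = (0xAAAA << 128) | 0x1111
    b_value = (0xBBBB << 128) | 0x2222

    cpu.ymm0 = a_value                 # thread A computes something
    a_image = save_state(cpu, mode)    # switch A -> B
    cpu.ymm0 = b_value                 # thread B uses the same register
    restore_state(cpu, a_image, mode)  # switch B -> A
    return cpu.ymm0 == a_value


print(run_two_threads("full"))    # True: A gets its value back
print(run_two_threads("legacy"))  # False: A sees B's upper half
```

In the "legacy" case thread A resumes with thread B's upper 128 bits in its register, which is the kind of corruption an OS without AVX support would cause if the hardware let applications use the wider registers anyway.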
 

Cogman

Lifer
Sep 19, 2000
10,277
125
106
"AVX uses a new set of registers that the OS needs to support properly. Every time there is a context switch between threads, the processor state (registers, flags, etc.) is saved for the thread that is switched out and loaded for the thread that is switched in. So the OS needs to know to save/reload the new AVX registers."
It goes a bit beyond that, but in general, yes that is correct.

With new extensions, Intel has specified that the instructions and registers have to be "activated" and allowed by the OS at ring 0. If you try to execute an instruction that has not been activated (the same was true of SSE/SSE2), you'll get an invalid-opcode exception (#UD), which the application sees as an illegal-instruction error.

As jones said, the OS needs to know
A. how to handle the new registers when a thread switch/interrupt occurs.
B. what effects the instruction has on things like the flags register, etc.

That is why you can't just start using new instructions when they are released. Generally, MS just has to do a small update to support them. It usually isn't too big of an issue.
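
As a rough sketch of what that "activation" looks like from the application side: before using AVX, a program is expected to check both that the CPU implements it and that the OS has enabled saving of the wider register state. The bit positions below are the ones documented in Intel's manuals (CPUID leaf 1 ECX bits 27 and 28, XCR0 bits 1 and 2). The function name `avx_usable` and the sample input values are made up for illustration; real code would obtain the inputs from the CPUID and XGETBV instructions rather than take them as arguments.

```python
# Bit positions as documented by Intel; the sample inputs are hypothetical.
CPUID1_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE/XGETBV
CPUID1_ECX_AVX = 1 << 28      # CPU implements AVX
XCR0_SSE = 1 << 1             # OS saves/restores XMM state
XCR0_AVX = 1 << 2             # OS saves/restores the upper YMM halves


def avx_usable(cpuid1_ecx, xcr0):
    """Return True only if both the CPU and the OS support AVX.

    cpuid1_ecx: ECX value returned by CPUID leaf 1.
    xcr0: value returned by XGETBV with ECX=0 (only meaningful when
          the OSXSAVE bit is set).
    """
    needed_cpu = CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX
    if (cpuid1_ecx & needed_cpu) != needed_cpu:
        return False
    needed_os = XCR0_SSE | XCR0_AVX
    return (xcr0 & needed_os) == needed_os


# AVX-capable CPU, OS saves x87 + SSE + AVX state.
print(avx_usable(CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX, 0b111))  # True
# AVX-capable CPU, but the OS only saves x87 + SSE state.
print(avx_usable(CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX, 0b011))  # False
# AVX-capable CPU, OS has not enabled XSAVE at all.
print(avx_usable(CPUID1_ECX_AVX, 0))                           # False
```

The second and third cases are the "capable CPU, old OS" situation asked about above: the hardware has the instructions, but an application that checks properly will fall back to its non-AVX code path.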
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Thank you for the response. So is it safe to assume that MS will need to put out an update to support FMA for BD and Haswell? (Similar to AVX)
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
"Thank you for the response. So is it safe to assume that MS will need to put out an update to support FMA for BD and Haswell? (Similar to AVX)"

Yep. Just as the BIOS needs to be updated even if the cpu is socket compatible.
 

jondeker

Junior Member
May 30, 2010
18
0
0
What will be the benefits of AVX and FMA?

Just a performance boost in specific applications like video encoding?
 

aphorism

Member
Jun 26, 2010
41
0
0
A slightly more in-depth question: will MS support FMA3, FMA4, or both? Haswell will have FMA3 and BD will have FMA4, but will FMA4 be backwards compatible with FMA3?
 

Cogman

Lifer
Sep 19, 2000
10,277
125
106
"What will be the benefits of AVX and FMA? Just a performance boost in specific applications like video encoding?"

FMA stands for fused multiply-add, so a = a + (b * c). I don't know what FMA3 and FMA4 are, nor do I know MS's support plan (support is likely; they usually support new instruction sets). Honestly, it doesn't seem all that breathtaking to me, but I guess there must be some demand for it.

AVX is basically an extension to the SSE architecture that adds 256-bit registers (SSE supports 128-bit registers). Most likely anything that uses SSE instructions will benefit.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
"Honestly, it doesn't seem all that breathtaking to me, but I guess there must be some demand for it."

Unless I am misunderstanding it, FMA can provide close to double the peak floating-point throughput if the code is optimized for it, since a single instruction performs both a multiply and an add. I find that rather interesting.

FMA4 = AMD's version
FMA3 = Intel's version
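
Besides throughput, a fused multiply-add rounds only once, after the add, whereas a separate multiply and add round twice. The sketch below models that difference in Python using exact rational arithmetic. It does not execute a hardware FMA instruction; `separate_mul_add` and `fused_mul_add` are names invented for this example, and the inputs are chosen so that the two results differ.

```python
from fractions import Fraction


def separate_mul_add(a, b, c):
    """a + b*c with two roundings, like separate multiply and add."""
    product = b * c      # rounded to the nearest double
    return a + product   # rounded again


def fused_mul_add(a, b, c):
    """a + b*c rounded once, modelled with exact rationals."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    return float(exact)  # single, correctly rounded conversion


a = -1.0
b = 1.0 + 2.0 ** -30
c = 1.0 - 2.0 ** -30

# b*c is exactly 1 - 2**-60, which is too close to 1.0 for a double to
# represent, so the separate version rounds the product to 1.0 and loses
# the small term entirely.
print(separate_mul_add(a, b, c))  # 0.0
print(fused_mul_add(a, b, c))     # -8.673617379884035e-19, i.e. -2**-60
```

So code that switches to FMA can get slightly different (generally more accurate) results, which matters to some numerical software.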
 

jones377

Senior member
May 2, 2004
450
47
91
FMA3 and FMA4 refer to the number of operands the instructions take. FMA3 is destructive for one of the input registers (a = a + (b * c)), while FMA4 uses a 4th register (d = a + (b * c)) to avoid writing over an input register. FMA4 reduces the number of register copies and load/store operations needed in the code, but I wonder if it will work that well with only 16 architectural registers in total?

BTW, x86 has historically used only 2-operand instructions, so it always writes over one of the input registers. I am not sure about AVX (sans FMA); does it use 3 operands?
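
The destructive vs. non-destructive difference can be sketched with a small register-file model in Python. This is only an illustration: the register names `r0` to `r3` and the helpers `mov`, `fma3` and `fma4` are hypothetical, and real FMA3 comes in several forms that differ in which operands are multiplied. The goal in both sequences is to compute r0 + r1*r2 while keeping r0 intact.

```python
def mov(regs, dst, src):
    """Register-to-register copy."""
    regs[dst] = regs[src]


def fma3(regs, dst, src1, src2):
    """Three-operand form: dst is both the addend and the output."""
    regs[dst] = regs[dst] + regs[src1] * regs[src2]


def fma4(regs, dst, acc, src1, src2):
    """Four-operand form: the result goes to a separate register."""
    regs[dst] = regs[acc] + regs[src1] * regs[src2]


def with_fma3(regs):
    """Preserving r0 costs an extra copy. Returns instruction count."""
    mov(regs, "r3", "r0")         # copy the addend first
    fma3(regs, "r3", "r1", "r2")  # r3 = r3 + r1*r2
    return 2


def with_fma4(regs):
    """No copy needed. Returns instruction count."""
    fma4(regs, "r3", "r0", "r1", "r2")  # r3 = r0 + r1*r2
    return 1


regs_a = {"r0": 10.0, "r1": 2.0, "r2": 3.0, "r3": 0.0}
regs_b = dict(regs_a)
print(with_fma3(regs_a), regs_a["r3"], regs_a["r0"])  # 2 16.0 10.0
print(with_fma4(regs_b), regs_b["r3"], regs_b["r0"])  # 1 16.0 10.0
```

Both sequences give the same result; the four-operand form just saves the copy when the input is still needed afterwards. When the input is dead after the operation, the three-operand form costs nothing extra.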
 

jondeker

Junior Member
May 30, 2010
18
0
0
So does the average customer benefit from AVX and FMA aside from video encoding?

From a quick search it seems like mostly HPC apps (finance, oil, etc) would benefit.
 

Cogman

Lifer
Sep 19, 2000
10,277
125
106
"So does the average customer benefit from AVX and FMA aside from video encoding? From a quick search it seems like mostly HPC apps (finance, oil, etc) would benefit."

It all depends on whether the application takes advantage of it... So it's a firm maybe. You never know when an application will use a new instruction.
 

aphorism

Member
Jun 26, 2010
41
0
0
"BTW, x86 has historically used only 2-operand instructions, so it always writes over one of the input registers. I am not sure about AVX (sans FMA); does it use 3 operands?"
The AVX paper is sort of vague on differentiating what the CPUs actually support in AVX, but apparently with the VEX prefix you can explicitly encode the registers, so an instruction can be either destructive or not. You can choose register allocation with SSE intrinsics too, but it is not encoded like AVX.
"So does the average customer benefit from AVX and FMA aside from video encoding? From a quick search it seems like mostly HPC apps (finance, oil, etc) would benefit."
Multimedia benefits from AVX, as does scientific and engineering code.