
MMX, SSE(1,2,3), 3DNow!

I'm not sure, but this seemed a bit advanced for the processor forums, yet not really advanced enough for HT.

Well, ever since these things came out I have never known what they actually do.

Do they physically add parts to the silicon, or is it purely code?

What does it do?

-Kevin
 
They are nothing more than instructions for processing multimedia. There is nothing really added per se, just a different way that video, etc., is processed by the processor.
 
They are vector instructions. They're added to the instruction set because many multimedia applications deal with vectors, not scalars. A scalar would be a single value like 15; a vector would be multiple values like [5, 2, 6, 10]. These instructions allow operations on various types of vectors. So for instance, in your video codec, each pixel has 4 values: red, green, blue and alpha. You want to add 2 of these pixel values together. For the sake of example, let's say you wanted to add one pixel, [4, 26, 3, 10], to another pixel, [5, 7, 11, 13]. Using scalar instructions, you'd need 4 instructions to add the pixel values:

4 + 5;
26 + 7;
3 + 11;
10 + 13;

Using vector instructions, it would simply be one instruction:
[4, 26, 3, 10] + [5, 7, 11, 13];

Vector instructions use larger data sets. So if your normal scalar value (say 4) was a 32-bit number, a vector would be 128 bits (containing 4 32-bit values). With one instruction, you accomplish what would otherwise have taken 4 instructions. The advantage here is since those 4 instructions would've been the same (they're all add) and only the data is different, you don't need to issue the instruction 4 times, just issue it once and have it work on multiple pieces of data. Hence, Single Instruction Multiple Data, or SIMD.

Vector instructions can be incredibly useful and they're not limited to arithmetic. Vector data types can also be very good for performance (good locality) but I won't explain the details here. Suffice it to say, all those instruction set extensions add more and more versatile vector instructions (MMX offered support for integer vectors, SSE offered support for 32-bit Floating Point numbers, SSE2 offered 64-bit Floating Point and SSE3 offered a few horizontal instructions).
 
Ok, so are programs actually using this? Also, without it, it would obviously do multiple instructions instead of one faster, larger one, but would we see a huge performance drop?

So SSE is basically MMX with 64-bit support?

What about 3DNow!? What is that doing?

-Kevin
 
Originally posted by: Gamingphreek
Ok, so are programs actually using this? Also, without it, it would obviously do multiple instructions instead of one faster, larger one, but would we see a huge performance drop?

Yes. Look at Lightwave, 3D Studio Max, the DivX codec, the Windows Media codec, etc. I would expect more to use SSE, considering newer compilers (ICC 8, and I think GCC 3 too) contain autovectorization options, so programmers won't have to hand-write assembly to use these instructions.

So SSE is basically MMX with 64-bit support?

MMX - added support for 64-bit integer vectors (8x8-bit, 4x16-bit or 2x32-bit integer values)
SSE - added support for 128-bit Single Precision FP vectors (4x32-bit FP values)
SSE2 - added support for 128-bit Double Precision FP vectors (2x64-bit FP values) and also 128-bit integer vectors (up to 2x64-bit integer values).
SSE3 - added horizontal instructions and multithread synchronization instructions.

What about 3DNow!? What is that doing?

-Kevin

3DNow! was AMD's SIMD extension. It does pretty much what SSE does (a bit more, even), but it never caught on. Instead, the industry went on to use SSE.

 
So why don't they make all applications optimized for all 4 of these? Too much work, or does it just not work for all apps?

-Kevin
 
Originally posted by: Gamingphreek
So why don't they make all applications optimized for all 4 of these? Too much work, or does it just not work for all apps?

-Kevin

Too much work. It's difficult to extract such parallelism using conventional high-level languages, so programmers have to resort to assembly to get the most out of SIMD.
 
Originally posted by: Gamingphreek
So why don't they make all applications optimized for all 4 of these? Too much work, or does it just not work for all apps?

-Kevin

Because they're not always needed. You'll rarely find applications that do intensive work on all integer vectors, SP FP vectors, DP FP vectors and, especially, 64-bit Integer Vectors. It's usually one or the other.
 
So if they did optimize this (I'm asking a lot of theoretical questions, sorry if I'm pestering), how much of a speed boost would we get?

-Kevin
 
Originally posted by: Gamingphreek
So if they did optimize this (I'm asking a lot of theoretical questions, sorry if I'm pestering), how much of a speed boost would we get?

-Kevin

Depends on the program. Again, it's all about whether you need it or not. But think about it. For 32-bit SP floating point calculations, how much faster is:

[a,b,c,d] + [e,f,g,h];

vs

a + e;
b + f;
c + g;
d + h;

Not to mention more subtle (and much more speed-advantageous) uses of vectors. Easily a 4x speedup, if not more.
 
Originally posted by: Gamingphreek
So, in a nutshell, the longer the program, the more benefit these would offer?

-Kevin

No, the more parallelism you can extract from your algorithm, the more it will benefit. Of course, this is assuming that the way the processor implements these SIMD instructions makes it worthwhile to use them. I.e., I don't think SIMD works as well on VIA or Transmeta chips versus AMD and Intel chips. Come to think of it, anyone with a VIA or Transmeta chip want to do some SIMD vs. no-SIMD FPU benchmarks?
 