Originally posted by: jaredpace
I know it's used in a few encoding programs like Gordian Knot and certain codecs. Maybe the x264 video libs? In two years, maybe these instructions will be used by more programs. You won't see the performance gains unless your CPU supports these instructions.
Not a huge dealio
Where it is supposed to be used is in encoding programs. However, in the very place it was designed for, many encoding programs have already optimized their algorithms mathematically to the point where their code is faster than using the SSE4 instruction set.
Ultimately, it is used almost nowhere. SSE2 was a very useful release. SSE3 was a meh release, and SSE4 is just extra gates with no real-world speed increase.
x264 does not use it at all. DivX was the only codec to implement it, and they did a demonstration where a straight, non-mathematically-optimized full motion search was about twice as slow as the one implemented with SSE4. But again, once you mathematically optimize your code, this is no longer the case.
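For anyone wondering what "mathematically optimized" can look like here, below is a rough Python sketch. It is my own illustration, not code from DivX or x264, and the function names (make_frame, sad_rows, motion_search) are made up for the example. It compares a plain full motion search, which computes the full sum of absolute differences (SAD) for every candidate position, against the same search with an early exit that abandons a candidate as soon as its running SAD can no longer beat the best match so far. Early exit is just one assumed example of an algorithmic shortcut; real encoders use other tricks as well.

```python
def make_frame(w, h):
    # Deterministic synthetic "frame" of pixel values (hypothetical test data).
    return [[(x * 7 + y * 13 + (x * y) % 5) % 256 for x in range(w)]
            for y in range(h)]


def sad_rows(cur, ref, rx, ry, n, limit=None):
    # SAD between the n x n block `cur` and the block of `ref` at (rx, ry).
    # If `limit` is given, stop after any row where the running total has
    # already reached it. Returns (sad_so_far, rows_processed).
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[y][x] - ref[ry + y][rx + x])
        if limit is not None and total >= limit:
            return total, y + 1
    return total, n


def motion_search(cur, ref, n, early_exit=False):
    # Exhaustive search over every position in `ref`.
    # Returns (best_position, best_sad, total_rows_processed).
    h, w = len(ref), len(ref[0])
    best_cost, best_pos, rows_done = None, None, 0
    for ry in range(h - n + 1):
        for rx in range(w - n + 1):
            limit = best_cost if early_exit else None
            cost, rows = sad_rows(cur, ref, rx, ry, n, limit)
            rows_done += rows
            if best_cost is None or cost < best_cost:
                best_cost, best_pos = cost, (rx, ry)
    return best_pos, best_cost, rows_done


ref = make_frame(16, 16)
cur = [row[5:9] for row in ref[9:13]]  # 4x4 block cut out of the frame

brute = motion_search(cur, ref, 4, early_exit=False)
fast = motion_search(cur, ref, 4, early_exit=True)
print("brute force:", brute)
print("early exit: ", fast)
```

Both variants find the same best match; the early-exit one just processes fewer rows to get there. That is the gist of the argument above: an instruction that speeds up the raw SAD only helps in proportion to how many SADs you still compute after the algorithm has been trimmed down.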