zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
Yeah, alright, sorry for the delay but I am teh lazy.

So here is intel's page on SSE4: sse4

SSE4 has about 50 new instructions, which is a lot more than the rumors going around were talking about. It is set to appear in Penryn, which is the 45nm followup to Merom (Core 2 Duo), I believe.

Anyway, a brief explanation of SSE first to get everyone up to speed. SSE is the successor to MMX. It's also supposed to be used as a replacement for the living dinosaur that is the x87 FPU. The basic idea behind SSE and things like SSE is simple: 1 instruction operates on multiple pieces of data rather than just 1 piece of data. (Data must be contiguous in memory.) For example, the SSE instruction MULPS multiplies four pairs of 32-bit floating point numbers at once, with each operand packed as 4x32-bit into a single 128-bit SSE register. For comparison, fmul multiplies one pair of floating point numbers together. So you can see the obvious performance advantage.
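To make the packed-vs-scalar idea concrete, here is a rough Python model. It only imitates the semantics, not the speed: the function names `fmul` and `mulps` are borrowed from the instruction mnemonics for illustration, a 4-element list stands in for a 128-bit register, and Python floats are 64-bit rather than 32-bit.

```python
def fmul(a, b):
    # Scalar multiply: one pair of operands per "instruction", like x87 fmul.
    return a * b

def mulps(xmm_a, xmm_b):
    # Packed multiply modelled on MULPS: each list stands in for a 128-bit
    # register holding four floats, and all four lanes are multiplied at once.
    assert len(xmm_a) == 4 and len(xmm_b) == 4
    return [a * b for a, b in zip(xmm_a, xmm_b)]

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 2.0, 4.0, 8.0]

packed = mulps(a, b)                         # one packed "instruction"
scalar = [fmul(x, y) for x, y in zip(a, b)]  # four scalar "instructions"
```

Both paths give the same four products; the packed version just gets there with a quarter of the instructions.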

So looking at the "new-instructions-paper.pdf", I see that SSE4 consists mostly of SSE instructions plus 6 non-SSE instructions that are just bundled under the SSE4 moniker for convenience.

Here are the highlights in my opinion:
> 4x32bit integer multiply instruction. Finally. Previously, only a crappy 2x32bit integer multiply was available.
> Floating point dot product. Remember math class? dot(a,b) = a.1*b.1+a.2*b.2+a.3*b.3+a.4*b.4. Useful for physics and other things.
> Register insertion/extraction. Wow. I guess Intel is serious about improving data movement between the SSE registers and the general purpose registers, which is currently slow.
> Packed format conversion. Thank god. I will never have to use those crappy shuffle instructions from sse1 again.
> Other boring stuff: packed blending, packed integer min/max, floating point rounding, set and test, compare for equal, unsigned dword to signed dword conversion. Some of these instructions are useful to avoid branching in some situations.
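A few of the highlights above can be sketched in Python to show what they compute. This is only a semantic model under some assumptions: the function names `dpps`, `blendps` and `pminsd` are borrowed from the instruction mnemonics, lists stand in for 128-bit registers, and the immediate-mask handling reflects my reading of the dot product and blend definitions (upper nibble picks which products are summed, lower nibble picks which lanes receive the sum; for blend, a set bit takes the lane from the second operand).

```python
def dpps(a, b, imm8=0xFF):
    # Dot product: sum a[i]*b[i] for lanes enabled in the upper 4 bits of
    # imm8, then write the sum to lanes enabled in the lower 4 bits.
    total = sum(a[i] * b[i] for i in range(4) if imm8 & (1 << (4 + i)))
    return [total if imm8 & (1 << i) else 0.0 for i in range(4)]

def blendps(a, b, imm8):
    # Packed blend: per-lane select with no branch in the real instruction.
    return [b[i] if imm8 & (1 << i) else a[i] for i in range(4)]

def pminsd(a, b):
    # Packed signed 32-bit integer minimum, lane by lane.
    return [min(x, y) for x, y in zip(a, b)]

full_dot = dpps([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
xyz_dot = dpps([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 0x71)
blended = blendps([1, 2, 3, 4], [10, 20, 30, 40], 0b0101)
minimum = pminsd([3, -1, 7, 0], [2, 5, 7, -4])
```

The min and blend cases are where the branch avoidance comes from: instead of an if/else per element, the selection happens for all four lanes in one instruction.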

And now for the sse4 instructions unrelated to the sse registers
> String handling instruction. Hmmm. There haven't been any new string handling instructions in the x86 instruction set in a long time. Should be useful. Not sure if this involves the SSE registers or not.
> CRC instruction. I'm surprised that they put such a specialized instruction in. I've taken a look at adler crc code in the past and as I recall, it has horrible IPC potential and generates pipeline bubbles in just about every line of code, so a specialized instruction should be able to speed up CRC generation a lot.
> count 1's. Counts the number of bits set to 1.
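For a feel of what the CRC and bit-count instructions replace, here is a bit-at-a-time Python sketch. It assumes the CRC instruction accumulates the CRC-32C (Castagnoli) polynomial, written here in its reflected form 0x82F63B78, and that software supplies the usual initial value and final inversion around it. The names `crc32_u8`, `crc32c` and `popcnt` are illustrative, not an API.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected

def crc32_u8(crc, byte):
    # One accumulate step over a single byte; the loop below is the serial,
    # dependency-heavy work that a dedicated instruction would do in hardware.
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc

def crc32c(data):
    # Conventional CRC-32C wrapper: start from all ones, invert at the end.
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_u8(crc, byte)
    return crc ^ 0xFFFFFFFF

def popcnt(x):
    # Count the bits set to 1 in a non-negative integer.
    return bin(x).count("1")

check = crc32c(b"123456789")
ones = popcnt(0b1011)
```

Every iteration of the inner loop depends on the previous one, which is the pipeline-bubble problem mentioned above.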

All in all, I'm very pleased with SSE4. It adds some important instructions that fill in the gaps in SSE. For the first time, I feel that sse is essentially complete. Yeah you could add more instructions, but they wouldn't be essential instructions.
 

Hyperlite

Diamond Member
May 25, 2004
5,664
2
76
has anyone noticed that in the conroe/merom descriptions on newegg it says SSE4 under the instruction sets?
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Originally posted by: Hyperlite
has anyone noticed that in the conroe/merom descriptions on newegg it says SSE4 under the instruction sets?


I did, but as far as I know there are no apps to take advantage of it yet.
 

theteamaqua

Senior member
Jul 12, 2005
314
0
0
i dont think he knows ... or if u know and dont wanna share with people dont start a thread

and yeah conroe and merom have SSE4
 

Cooler

Diamond Member
Mar 31, 2005
3,835
0
0
Originally posted by: cmdrdredd
Originally posted by: Hyperlite
has anyone noticed that in the conroe/merom descriptions on newegg it says SSE4 under the instruction sets?


I did, but as far as I know there are no apps to take advantage of it yet.

Intel may not have enabled them on the chips.
 

dexvx

Diamond Member
Feb 2, 2000
3,899
0
0
Originally posted by: cmdrdredd
Originally posted by: Hyperlite
has anyone noticed that in the conroe/merom descriptions on newegg it says SSE4 under the instruction sets?


I did, but as far as I know there are no apps to take advantage of it yet.

That's a totally moot point.

Willamette had SSE2, which was useless back then. Now? x86-64 basically *requires* you to use SSE2 in place of the traditional x87 FPU. What does that mean? Longevity. If you have SSE2 based FP code, the Willamette will spank the AthlonXP, which used to spank it. Later down the line when SSE3/4 gets implemented, your computer will have a longer relative life-span.

It won't matter much to enthusiasts, but it's definitely one thing to consider if you're more casual and buy a computer to last you 4-5 years.
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Originally posted by: dexvx
Originally posted by: cmdrdredd
Originally posted by: Hyperlite
has anyone noticed that in the conroe/merom descriptions on newegg it says SSE4 under the instruction sets?


I did, but as far as I know there are no apps to take advantage of it yet.

That's a totally moot point.

Willamette had SSE2, which was useless back then. Now? x86-64 basically *requires* you to use SSE2 in place of the traditional x87 FPU. What does that mean? Longevity. If you have SSE2 based FP code, the Willamette will spank the AthlonXP, which used to spank it. Later down the line when SSE3/4 gets implemented, your computer will have a longer relative life-span.

It won't matter much to enthusiasts, but it's definitely one thing to consider if you're more casual and buy a computer to last you 4-5 years.


I think my computer will last 4-5 years as it is now. Maybe I can get a new video card to keep up with games, or a new soundcard in the future. Otherwise I don't see the CPU being a big factor these days. I guess it depends on what happens later on.
 

Aluvus

Platinum Member
Apr 27, 2006
2,913
1
0
Originally posted by: theteamaqua
i dont think he knows ... or if u know and dont wanna share with people dont start a thread

and yeah conroe and merom have SSE4

No, Conroe and Merom (and Woodcrest) have SSSE3 (Supplemental SSE3), which is not SSE4.
 

theteamaqua

Senior member
Jul 12, 2005
314
0
0
Originally posted by: Aluvus


No, Conroe and Merom (and Woodcrest) have SSSE3 (Supplemental SSE3), which is not SSE4.

"which is not SSE4."

http://www.hkepc.com/bbs/itnews.php?tid=676345

The current Intel Core architecture supports part of it, but not all, and that's why some utilities said Core 2 Duo has SSE 4.

so technically yes, conroe has SSE4: not all, but part of it. But it still has SSE4.
 

Aluvus

Platinum Member
Apr 27, 2006
2,913
1
0
Originally posted by: theteamaqua
Originally posted by: Aluvus


No, Conroe and Merom (and Woodcrest) have SSSE3 (Supplemental SSE3), which is not SSE4.

"which is not SSE4."

http://www.hkepc.com/bbs/itnews.php?tid=676345

The current Intel Core architecture supports part of it, but not all, and that's why some utilities said Core 2 Duo has SSE 4.

so technically yes, conroe has SSE4: not all, but part of it. But it still has SSE4.

So, um, which part? They just sort of throw this out there and don't explain it. I can't find any other source that really corroborates this with anything. AFAICT, those utilities report that because they think SSSE3 is SSE4, despite what Intel has already stated.

But as to SSSE3 and SSE4 not being the same thing... they really, seriously, are not. Intel's SSE4 whitepaper PDF (remove the erroneous http:// from the link) goes out of its way to note SSSE3 as a separate entity. I don't have a definitive list of instructions introduced in SSSE3, but none of those in the almost-certainly-incomplete-or-wrong list on Wikipedia appear in Intel's draft list for SSE4.
 

Hyperlite

Diamond Member
May 25, 2004
5,664
2
76
Originally posted by: Aluvus
Originally posted by: theteamaqua
Originally posted by: Aluvus


No, Conroe and Merom (and Woodcrest) have SSSE3 (Supplemental SSE3), which is not SSE4.

"which is not SSE4."

http://www.hkepc.com/bbs/itnews.php?tid=676345

The current Intel Core architecture supports part of it, but not all, and that's why some utilities said Core 2 Duo has SSE 4.

so technically yes, conroe has SSE4: not all, but part of it. But it still has SSE4.

So, um, which part? They just sort of throw this out there and don't explain it. I can't find any other source that really corroborates this with anything. AFAICT, those utilities report that because they think SSSE3 is SSE4, despite what Intel has already stated.

But as to SSSE3 and SSE4 not being the same thing... they really, seriously, are not. Intel's SSE4 whitepaper PDF (ftp://download.intel.com/technology/architecture/new-instructions-paper.pdf) (remove the erroneous http:// from the link) goes out of its way to note SSSE3 as a separate entity. I don't have a definitive list of instructions introduced in SSSE3, but none of those in the almost-certainly-incomplete-or-wrong list on Wikipedia appear in Intel's draft list for SSE4.


they don't have any part of it according to anand's idf article...that was my point, all news from idf said SSE4 would be enabled on 45nm. additional instructions on conroe/merom are just a modification of SSE3.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: zephyrprime
Well Intel has let the cat out of the bag regarding SSE4. Does anyone want to hear about it? I'll explain it here if people care.

*still waiting*
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: zephyrprime

It is set to appear in PENRYN which is the 45nm followup to the MEROM (core 2 duo) I believe.

Thanks for following up.
 

gobucks

Golden Member
Oct 22, 2004
1,166
0
0
i think the SSSE3 is just the improvement to existing SSE instructions that makes them full 128-bit, allowing them to be executed in 1 clock cycle instead of 2. That's what is implemented on Conroe and Allendale. I think originally some programs like CPU-Z called the new instructions SSE4 because they didn't know any better.

SSE4 on the other hand is like 50 new multimedia and web development instructions, which means it will be a pretty substantial improvement. I'm not surprised they're waiting for the 45nm process to implement SSE4 though - the transistor count on a quad-core CPU with 50 new instructions is likely to be astronomical.
 

BrownTown

Diamond Member
Dec 1, 2005
5,314
1
0
Originally posted by: gobucks
SSE4 on the other hand is like 50 new multimedia and web development instructions, which means it will be a pretty substantial improvement. I'm not surprised they're waiting for the 45nm process to implement SSE4 though - the transistor count on a quad-core CPU with 50 new instructions is likely to be astronomical.

The execution parts of the cores aren't all that big, so adding 50 new instructions won't increase the overall transistor count as much as you might think. Also, Penryn is a dual core and not a quad. As was mentioned before, some of the SSE4 instructions are in Conroe, and I wouldn't be surprised if the rest are there in some form but disabled. It's no secret that Intel rushed Core 2 into production, so it's plausible that they pushed a lot of the instructions back a cycle in order to get the chip out sooner.
 

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
Originally posted by: gobucks
i think the SSSE3 is just the improvement to existing SSE instructions that makes them full 128-bit, allowing them to be executed in 1 clock cycle instead of 2. That's what is implemented on Conroe and Allendale. I think originally some programs like CPU-Z called the new instructions SSE4 because they didn't know any better.

SSE4 on the other hand is like 50 new multimedia and web development instructions, which means it will be a pretty substantial improvement. I'm not surprised they're waiting for the 45nm process to implement SSE4 though - the transistor count on a quad-core CPU with 50 new instructions is likely to be astronomical.

SSE3 is not an improvement to the execution speed of existing SSE instructions. It is a set of new instructions. After SSE3 came Supplemental SSE3, which actually includes more instructions than the original SSE3. Supplemental SSE3 was rumored to be SSE4 and that is why it is reported as such in CPU-Z. It's really just a matter of semantics. Conroe includes Supplemental SSE3 but not SSE4.
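As a rough illustration of how a utility can tell these extensions apart, the sketch below decodes the feature flags that CPUID leaf 1 returns in ECX. The bit positions are the ones I understand Intel to document (SSE3 = bit 0, SSSE3 = bit 9, SSE4.1 = bit 19, SSE4.2 = bit 20, POPCNT = bit 23). The two register values are made up for the example, not read from any real chip, and `decode_ecx` is a hypothetical helper name.

```python
# Assumed CPUID leaf 1 ECX bit positions for the extensions discussed here.
CPUID1_ECX_BITS = {
    "SSE3": 0,
    "SSSE3": 9,
    "SSE4.1": 19,
    "SSE4.2": 20,
    "POPCNT": 23,
}

def decode_ecx(ecx):
    # Map each feature name to True/False depending on its flag bit.
    return {name: bool((ecx >> bit) & 1) for name, bit in CPUID1_ECX_BITS.items()}

# Hypothetical register values, constructed by hand for illustration only.
ssse3_only = (1 << 0) | (1 << 9)    # SSE3 + SSSE3, no SSE4 bits set
with_sse41 = ssse3_only | (1 << 19)  # the same plus the SSE4.1 flag

conroe_like = decode_ecx(ssse3_only)
later_chip = decode_ecx(with_sse41)
```

A tool that checks the separate flags would report SSSE3 for the first value and never SSE4, since each extension has its own bit.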