Roughly, how much faster will Sandy Bridge be over i7?

gevorg · Apr 29, 2010

I realize that this is just speculation, but how big do you think the difference will be on a per-core, equal-GHz basis? 10%? 20%? 30%?

Let's say the benchmark is PCMark productivity.

MagickMan · Apr 29, 2010

I expect the difference will be about the same as C2Q>i7.

formulav8 · Apr 29, 2010

No one but Intel and some 'special' partners know for sure. And even Intel won't know 100% until they start getting production silicon. I'm sure they have a good estimate right now though...

Jason

MJinZ · Apr 29, 2010

20%, since Sandy is purely evolutionary.

No need to upgrade until the next revolutionary architectural change (like Pentium 4 to Pentium M/Core).

Idontcare · Apr 29, 2010

silicon is out there, enough of it to the point that the sanctioned leaks are already doing their thang: http://www.xtremesystems.org/forums/showthread.php?t=250145

the IPC improvement expectations appear to range from 2% to 20% depending on the application involved (more specifically depending on the instruction mix represented in the application as not all instructions are getting improvements, but for example the improvements done in reducing L1$ latency will pretty much help everything)
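The dependence on instruction mix described above can be sketched with a simple weighted model. This is only an illustration: the categories, runtime fractions, and per-clock speedups below are hypothetical numbers chosen for the example, not measurements or leaked figures.

```python
def overall_speedup(mix):
    """Combine per-category speedups into one per-clock speedup.

    `mix` maps a label to (fraction_of_runtime, per_clock_speedup).
    Each slice of the old runtime shrinks by its own speedup factor,
    so the new runtime is the sum of fraction / speedup.
    """
    total = sum(frac for frac, _ in mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("runtime fractions must sum to 1")
    new_time = sum(frac / speedup for frac, speedup in mix.values())
    return 1.0 / new_time


# Hypothetical application profile (assumed values, for illustration only).
example_mix = {
    "unchanged_instructions": (0.50, 1.00),  # no improvement
    "l1_latency_bound": (0.30, 1.05),        # helped by lower L1$ latency
    "improved_instructions": (0.20, 1.20),   # instructions that got faster
}

gain = overall_speedup(example_mix) - 1.0
print(f"overall per-clock gain: {gain:.1%}")
```

With this made-up mix the overall gain comes out near 5%, even though one category improves by 20%. That is the sense in which a 2% to 20% range can hold for the same chip across different applications.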

IntelUser2000 · Apr 29, 2010

When Nehalem first came out and reviews were flooding everywhere, one site did instruction latency measurements. It mentioned that although latency improved from Merom, it increased from Penryn. I'm guessing they might/can remedy that with Sandy Bridge.

The Nehalem chief architect said there was a 2-3% performance penalty from the extra cycle of latency on the L1 data cache alone.

If Nehalem was the version that sacrificed a bit of client performance to gain a lot on servers, Sandy Bridge should put the focus back on client. They don't even seem to be planning a true Westmere-EX successor until Ivy Bridge!

SHAQ · Apr 29, 2010

I wouldn't anticipate much of any improvement over i7. I haven't heard of any new instructions or efficiency improvements. Maybe the quad-channel memory would offer 2-3% better performance. Not until games/programs utilize 6 threads would it be worth an upgrade over an i7. Hopefully I'm wrong though, because upgrading is always fun. It doesn't even offer USB 3/SATA 3 natively. Also, there may be some SLI improvements with the extra lanes, but that is only 2-3% faster too.

Vette73 · Apr 29, 2010

SHAQ said:
I wouldn't anticipate much of any improvement over i7. I haven't heard of any new instructions or efficiency improvements. Maybe the quad-channel memory would offer 2-3% better performance. Not until games/programs utilize 6 threads would it be worth an upgrade over an i7. Hopefully I'm wrong though, because upgrading is always fun. It doesn't even offer USB 3/SATA 3 natively. Also, there may be some SLI improvements with the extra lanes, but that is only 2-3% faster too.

That, and it seems the CPUs are being made more for OEMs, with the video integrated and other items. And by that I mean cheaper, and harder for anybody else to make chipsets for Intel.

So maybe 10%, give or take. Yeah, some things will be optimized and run faster, but overall I don't think it will be huge. Now AMD's Bulldozer seems like it might be a good leap over the CPUs they have now.

yh125d · Apr 29, 2010

10-15% at most, imo. Not too excited for SB (yet, at least)

Nemesis 1 · Apr 30, 2010

This thread is better than the threads after Intel demoed Conroe. On average, for apps that don't use AVX, expect a 20% gain. For apps that run AVX, expect a 30 to 110% increase. This is a bigger move than P4 to Conroe. Don't let people school you on how no apps use AVX. As we speak, thousands of developers have SB and are recompiling code. Also, SB has a JIT compiler, and when allowed some programs can be run on the fly; all that's required for a recompile is the VEX prefix.

VirtualLarry · Apr 30, 2010

Nemesis 1 said:
This thread is better than the threads after Intel demoed Conroe. On average, for apps that don't use AVX, expect a 20% gain. For apps that run AVX, expect a 30 to 110% increase. This is a bigger move than P4 to Conroe. Don't let people school you on how no apps use AVX. As we speak, thousands of developers have SB and are recompiling code. Also, SB has a JIT compiler, and when allowed some programs can be run on the fly; all that's required for a recompile is the VEX prefix.

On average? I recall reading that someone from Intel was quoted as saying "significant IPC improvements", but 20%, as an average, seems a bit high. I would be willing to believe 20% as a best case, but not as an average.

I don't buy this JIT compiler mumbo-jumbo. The CPU itself doesn't contain a JIT compiler, that's for certain. So where does it reside? Is it a driver that you have to load on an OS, that recompiles programs when you load them? I think such an approach is too messy, and doesn't really work in the general case.

Nemesis 1 · Apr 30, 2010

Ya, JIT compiler may be a little (a lot) badly worded. VEX compiler would be more correct, I guess. You don't have to believe the 20% on average; it doesn't matter to me one way or another.

There have been both ISA and microarchitecture improvements.

extra · Apr 30, 2010

Nemesis 1 said:
This thread is better than the threads after Intel demoed Conroe. On average, for apps that don't use AVX, expect a 20% gain. For apps that run AVX, expect a 30 to 110% increase. This is a bigger move than P4 to Conroe. Don't let people school you on how no apps use AVX. As we speak, thousands of developers have SB and are recompiling code. Also, SB has a JIT compiler, and when allowed some programs can be run on the fly; all that's required for a recompile is the VEX prefix.

Hmmm, I really have my doubts about this. Bigger move than P4 to Conroe? I really, really doubt it. I remember all your posts predicting how awesome Larrabee was going to be and that it would crush everything, etc., etc., and that didn't turn out (in fact, the reality was completely opposite to all your posts about it), so I'm very skeptical of your claims lol. (No offense, I always like your posts lol.)

I also doubt programs will suddenly be optimized for a just-released chip. It's never worked that way in the past, so why would it suddenly be any different?

Sandy Bridge will probably be 0-20% faster per clock than the i5/i7 stuff we have now.

Whenever there have been predictions of huge performance improvements in the past, they have never come true. It's always a slow, gradual improvement. This release isn't going to change that. IMHO! (Also, just the same with AMD. Bulldozer isn't going to blow Intel out of the water and zomg destroy them. It'll be competitive, though.)

Nemesis 1 · Apr 30, 2010

We're not talking about Larrabee right now. Larrabee is a lot better than you want to believe, but Intel needed more development time for both drivers and tools. You won't hear about Larrabee again until Intel just releases it on 32nm. Could be a lot sooner than you think. As I said before, when Intel moved it they already knew that Fermi was going to be late, hot, and a power hog. So they decided to go to 32nm and not talk about it.

I already said I don't care if you believe 20% average or not on SB. But Andy did say SB was worth waiting for, and he knows. He just can't talk about it. We have had tons of these threads, and it's always SB isn't much, this, that. We talk about BD and it's like it's going to revolutionize the computer industry.

SB is the baby of the team that brought us Dothan. Dothan was AMD's conqueror.

This team didn't get Merom, but they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.

Nemesis 1 · Apr 30, 2010

I really like what Intel is doing with the Sandy release. First the low end and later the high end, and Intel has finally made the high end actually the high end; with this move Intel separates the high from the low even further. You may not like it. But as long as Intel's low end (Midrange high couch) beats what AMD has out core for core, what's the difference if they're priced within reason? The guy that pays deserves to have more than you can get with the low-buck system. You see it from a selfish perspective.

VirtualLarry · Apr 30, 2010

Nemesis 1 said:
But they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.

20% IPC improvement would be an INCREDIBLE amount. I have no idea why you would be disappointed in that. The pipeline of today's CPUs is so incredibly optimized, there are really very few avenues left for improvement, speaking from a computer-science perspective.

That's why we're getting more and more cores, rather than improvements in single-threaded IPC. If Intel was able to tweak, and extract 20% more performance, especially if that's an average (which I don't believe for a second), then they are practically miracle workers.

VirtualLarry · Apr 30, 2010

Nemesis 1 said:
The guy that pays deserves to have more than you can get with the low buck system. You see it from a selfish perspective.

How do you feel about someone who gets more performance than they "deserve"? E.g., overclockers.

IntelUser2000 · Apr 30, 2010

Nemesis 1 said:
This team didn't get Merom, but they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.

Is it just me or does he sound like a spoiled kid? If you look at Merom's architectural predecessor, Yonah, Merom was a 20% improvement over it. Considering these CPUs are basically ahead of everyone else, 20% is an amazing improvement.

When I went back and looked at mobile Core 2 benchmarks, it was only 15% faster than Core Duo. On desktops it was able to do 20% because additional FSB bandwidth reduced the bottleneck more than it did on Yonah.

Idontcare · Apr 30, 2010

IntelUser2000 said:
Is it just me or does he sound like a spoiled kid? If you look at Merom's architectural predecessor, Yonah, Merom was a 20% improvement over it. Considering these CPUs are basically ahead of everyone else, 20% is an amazing improvement.

When I went back and looked at mobile Core 2 benchmarks, it was only 15% faster than Core Duo. On desktops it was able to do 20% because additional FSB bandwidth reduced the bottleneck more than it did on Yonah.

What were the high-level microarchitectural differences between P4 and Core2 that resulted in the substantial boost in IPC at the time?

Being that this occurred nearly 5yrs ago I have surprisingly (or not) little recollection of "why" P4 was such a cluster-f whereas Core2 turned out to be the bee's knees.

And given that info, what would it take (microarchitecture-wise) to provide a similar boost in IPC over Westmere? 1-cycle L1$?

IntelUser2000 · Apr 30, 2010

Idontcare said:
What were the high-level microarchitectural differences between P4 and Core2 that resulted in the substantial boost in IPC at the time?

Being that this occurred nearly 5yrs ago I have surprisingly (or not) little recollection of "why" P4 was such a cluster-f whereas Core2 turned out to be the bee's knees.

And given that info, what would it take (microarchitecture-wise) to provide a similar boost in IPC over Westmere? 1-cycle L1$?

A lot.

Pipeline stages-

The original Willamette (and Northwood too) CPU had 20 pipeline stages when there was a hit on the Trace Cache. From what I read, there were 8 more stages when it was a "miss". The number of pipeline stages alone didn't account for all the difference in performance, but indirectly I guess you can say it did.*

Trace Cache-

Usually when performance features are added, it isn't done to replace whatever already exists, but in the Pentium 4's case, it was. When there's a miss on the Trace Cache, it has only 1 decoder to fall back to, and it essentially turns into a 1-issue CPU. I know there are jokes regarding how the ILP era ended and such, but 1-issue is a problem. The Trace Cache also worked in place of the L1 I-cache, because the Pentium 4 did not have one.

Execution Units-

There was a lot that Intel did to save die space on the Pentium 4. There were 3 ALUs on the Pentium 4: 1 full "slow" ALU that ran at clock speed, and 2 simple "fast" ALUs which ran at 2x the clock speed. The Trace Cache can only feed 6 instructions every 2 cycles, which means it might not be able to keep both fast ALUs fed when they work at once. Although that should be a small thing.

Misc-

*And then there's replay. Because the pipeline was so long, they needed very aggressive speculation to keep it fed with data. Replay is essentially a "clone" pipeline that was brought in to work in case that speculation failed. I read that although the idea itself was nice, it brought various problems; for example, it could effectively increase the number of pipeline stages, and thus the misprediction penalty.

Replay was also thought to be the reason SMT on Pentium 4 did not work as well as it should have.

Core 2 managed to increase performance by ~90% over Pentium 4 at the same clock. To do it again compared to Westmere would be quite a sight.
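The execution-unit point in the post above comes down to simple arithmetic, sketched here using only the figures quoted in the post (6 instructions per 2 cycles from the Trace Cache, 2 fast ALUs at 2x clock). The variable names are made up for illustration, and treating every instruction as a simple ALU op is an assumption that gives the worst case.

```python
from fractions import Fraction

# Figures taken from the post above.
TRACE_CACHE_INSTRUCTIONS = 6   # instructions delivered...
TRACE_CACHE_CYCLES = 2         # ...every this many core cycles
FAST_ALUS = 2                  # simple "fast" ALUs
FAST_ALU_CLOCK_MULTIPLIER = 2  # each runs at 2x the core clock

# Supply: instructions the Trace Cache can deliver per core cycle.
supply_per_cycle = Fraction(TRACE_CACHE_INSTRUCTIONS, TRACE_CACHE_CYCLES)

# Peak demand: simple ops both fast ALUs could retire per core cycle
# (assumes every instruction is a simple ALU op, i.e. the worst case).
peak_demand_per_cycle = FAST_ALUS * FAST_ALU_CLOCK_MULTIPLIER

# Best-case utilization of the fast ALUs if the Trace Cache is the limit.
max_fast_alu_utilization = supply_per_cycle / peak_demand_per_cycle

print(f"supply: {supply_per_cycle} per cycle")
print(f"peak fast-ALU demand: {peak_demand_per_cycle} per cycle")
print(f"max fast-ALU utilization: {float(max_fast_alu_utilization):.0%}")
```

Under these assumptions the front end supplies 3 instructions per cycle against a peak demand of 4, so the fast ALUs could be at most 75% busy. Real code rarely consists only of simple ALU ops, which fits the post's remark that this should be a small effect.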

techforums · May 1, 2010

I kinda figure some people are expecting a P4 to Core 2 leap from each new CPU generation.

Nemesis 1 · May 1, 2010

VirtualLarry said:
How do you feel about someone that gets more performance than they "deserve"? Eg. Overclockers.

Exactly what I am referring to. An overclocker might get equal performance in some apps, but not all. That is the separator. It's about time.

Nemesis 1 · May 1, 2010

IntelUser2000 said:
Is it just me or does he sound like a spoiled kid? If you look at Merom's architectural predecessor, Yonah, Merom was a 20% improvement over it. Considering these CPUs are basically ahead of everyone else, 20% is an amazing improvement.

When I went back and looked at mobile Core 2 benchmarks, it was only 15% faster than Core Duo. On desktops it was able to do 20% because additional FSB bandwidth reduced the bottleneck more than it did on Yonah.

I didn't start at Yonah. The first CPU I had that kicked AMD to the curb was Dothan, so that's my starting point. Everything started with Dothan and the Israel team choosing the Pro core to start with.

tweakboy · May 1, 2010

You won't notice a difference, so there you go. OK, you'll save 10 seconds off a render, for example. Have you ever gone to 100 percent? When I render audio it takes 68 percent.

dac7nco · May 2, 2010

Can someone explain (briefly) what the AVX instructions are? I have an understanding of vector systems in older big iron, and somewhat in the IBM POWER systems, but not much else.

Daimon
