Roughly, how much faster will Sandy Bridge be over i7?

gevorg

Diamond Member
Nov 3, 2004
5,070
1
0
I realize that this is just speculation, but how big do you think the difference will be on a per-core, equal-GHz basis? 10%? 20%? 30%?

Let's say the benchmark is PCmark productivity.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
523
126
No one but Intel and some 'special' partners know for sure. And even Intel won't know 100% until they start getting production silicon. I'm sure they have a good estimate right now though...


Jason
 

MJinZ

Diamond Member
Nov 4, 2009
8,192
0
0
20%, since Sandy is purely evolutionary.

No need to upgrade until the next revolutionary architectural change (like Pentium 4 to Pentium M/Core)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
silicon is out there, enough of it to the point that the sanctioned leaks are already doing their thang: http://www.xtremesystems.org/forums/showthread.php?t=250145

the IPC improvement expectations appear to range from 2% to 20% depending on the application involved (more specifically depending on the instruction mix represented in the application as not all instructions are getting improvements, but for example the improvements done in reducing L1$ latency will pretty much help everything)
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
When Nehalem first came out and reviews were flooding everywhere, one site did instruction latency measurements. It mentioned that although latency improved from Merom, it increased from Penryn. I'm guessing they might/can remedy that with Sandy Bridge.

The Nehalem chief architect said there was a 2-3% performance penalty from the extra cycle of latency on the L1 data cache alone.

If Nehalem was the version that sacrificed a bit of client performance to gain a lot on servers, Sandy Bridge should put the focus back on the client. They don't even seem to be planning a true Westmere-EX successor until Ivy Bridge!
 

SHAQ

Senior member
Aug 5, 2002
738
0
76
I wouldn't anticipate much of any improvement over i7. I haven't heard of any new instructions or efficiency improvements. Maybe the quad-channel memory would offer 2-3% better performance. Not until games/programs utilize 6 threads would it be worth an upgrade over an i7. Hopefully I'm wrong though, because upgrading is always fun. It doesn't even offer USB 3/SATA 3 natively. Also, there may be some SLI improvements with the extra lanes, but that is only 2-3% faster too.
 

Vette73

Lifer
Jul 5, 2000
21,503
9
0
That, and it seems the CPUs are being made more for OEMs, with the video integrated and other items. And by that I mean cheaper, and harder for anybody else to make chipsets for Intel.

So maybe 10%, give or take. Yeah, some things will be optimized and run faster, but overall I don't think it will be huge. Now AMD's Bulldozer seems like it might be a good leap over the CPUs they have now.
 

yh125d

Diamond Member
Dec 23, 2006
6,886
0
76
10-15% at most, imo. Not too excited for SB (yet, at least)
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
This thread is better than the threads after Intel demoed Conroe. On average, for apps that don't use AVX, expect a 20% gain. For apps that run AVX, expect a 30 to 110% increase. This is a bigger move than P4 to Conroe. Don't let people school you on how no apps use AVX. As we speak, thousands of developers have SB and are recompiling code. Also, SB has a JIT compiler, and when allowed, some programs can be run on the fly; all that's required for a recompile is the VEX prefix.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,227
126
On average? I recall reading that someone from Intel was quoted as saying "significant IPC improvements", but 20%, as an average, seems a bit high. I would be willing to believe 20% as a best case, but not as an average.

I don't buy this JIT compiler mumbo-jumbo. The CPU itself doesn't contain a JIT compiler, that's for certain. So where does it reside? Is it a driver that you have to load on an OS, and that recompiles programs when you load them? I think such an approach is too messy, and doesn't really work in the general case.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Ya, "JIT compiler" may be a little (a lot) badly worded. "VEX compiler" would be more correct, I guess. You don't have to believe the 20% on average; it doesn't matter to me one way or another.

There have been both ISA and microarchitecture improvements.
 

extra

Golden Member
Dec 18, 1999
1,947
7
81
Nemesis 1 said:
This thread is better than the threads after Intel demoed Conroe. On average, for apps that don't use AVX, expect a 20% gain. For apps that run AVX, expect a 30 to 110% increase. This is a bigger move than P4 to Conroe. Don't let people school you on how no apps use AVX. As we speak, thousands of developers have SB and are recompiling code. Also, SB has a JIT compiler, and when allowed, some programs can be run on the fly; all that's required for a recompile is the VEX prefix.

Hmmm, I really have my doubts about this. Bigger move than P4 to Conroe? I really, really doubt it. I remember all your posts predicting how awesome Larrabee was going to be and that it would crush everything, etc., etc., and that didn't turn out (in fact, the reality was completely opposite to all your posts about it), so I'm very skeptical of your claims lol. (No offense, I always like your posts lol.)

I also doubt programs will suddenly be optimized for a just-released chip. It's never worked that way in the past, so why would it suddenly be any different?

Sandy Bridge will probably be 0-20% faster per clock than the i5/i7 stuff we have now.

Whenever there have been predictions of huge performance improvements in the past, they have never come true. It's always a slow, gradual improvement. This release isn't going to change that. IMHO! (Also, just the same with AMD. Bulldozer isn't going to blow Intel out of the water and zomg destroy them. It'll be competitive, though.)
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
We're not talking about Larrabee right now. Larrabee is a lot better than you want to believe. But Intel needed more development time for both drivers and tools. You won't hear about Larrabee again until Intel just releases it on 32nm. Could be a lot sooner than you think. As I said before, when Intel moved it, they already knew that Fermi was going to be late, hot, and a power hog. So they decided to go 32nm and not talk about it.

I already said I don't care if you believe 20% average or not on SB. But Andy did say SB was worth waiting for, and he knows. He just can't talk about it. We have had tons of these threads, and it's always "SB isn't much", this, that. We talk about BD and it's like it's going to revolutionize the computer industry.

SB is the baby of the team that brought us Dothan. Dothan was AMD's conqueror.

This team didn't get Merom, but they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
I really like what Intel is doing with the Sandy release. First the low end and later the high end, and Intel has finally made the high end actually the high end. With this move Intel separates the high from the low even further. You may not like it. But as long as Intel's low end (Midrange high couch) beats what AMD has out core for core, what's the difference if they're priced within reason? The guy that pays deserves to have more than you can get with the low-buck system. You see it from a selfish perspective.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,227
126
Nemesis 1 said:
But they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.

20% IPC improvement would be an INCREDIBLE amount. I have no idea why you would be disappointed in that. The pipeline of today's CPUs is so incredibly optimized, there are really very few avenues left for improvement, speaking from a computer-science perspective.

That's why we're getting more and more cores, rather than improvements in single-threaded IPC. If Intel was able to tweak, and extract 20% more performance, especially if that's an average (which I don't believe for a second), then they are practically miracle workers.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Nemesis 1 said:
This team didn't get Merom, but they got SB, and for me 20% is a disappointment. But AVX apps might make it live up to what I expected.

Is it just me or does he sound like a spoiled kid? Compared with its architectural predecessor, Yonah, Merom was a 20% improvement. Considering these CPUs are basically ahead of everyone else's, 20% is an amazing improvement.

When I went back and looked at mobile Core 2 benchmarks, it was only 15% faster than Core Duo. On desktops it was able to do 20% because additional FSB bandwidth reduced the bottleneck more than it did on Yonah.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
What were the high-level microarchitectural differences between P4 and Core2 that resulted in the substantial boost in IPC at the time?

Being that this occurred nearly 5yrs ago I have surprisingly (or not) little recollection of "why" P4 was such a cluster-f whereas Core2 turned out to be the bee's knees.

And given that info, what would it take (microarchitecture-wise) to provide a similar boost in IPC over westmere? 1-cycle L1$?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
A lot.

Pipeline stages-

The original Willamette (and Northwood too) had a 20-stage pipeline when there was a hit in the Trace Cache. From what I read, there were 8 more stages on a "miss". The number of pipeline stages alone didn't account for all of the performance difference, but indirectly I guess you can say it did.*

Trace Cache-

Usually when performance features are added, they don't replace what already exists, but in the Pentium 4's case the Trace Cache did. When there's a miss in the Trace Cache, the CPU has only 1 decoder to fall back on and essentially turns into a 1-issue CPU. I know there are jokes about how the ILP era has ended and such, but 1-issue is a problem. The Trace Cache also took the place of the L1 instruction cache, because the Pentium 4 did not have one.

Execution Units-

Intel did a lot to save die space on the Pentium 4. There were 3 ALUs: 1 full "slow" ALU that ran at clock speed, and 2 simple "fast" ALUs that ran at 2x the clock speed. The Trace Cache can only feed 6 instructions every 2 cycles, which means it might not be able to keep both fast ALUs fed when they work at once. Although that should be a small thing.
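
To put rough numbers on the feed-rate point, here is a minimal sketch in Python that uses only the figures from this post (2 fast ALUs at 2x clock, 6 instructions per 2 cycles from the Trace Cache). The constant and function names are made up for illustration, and it assumes every fast-ALU slot could be filled, which real code rarely manages.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Everything is expressed per core clock cycle; names are illustrative only.

FAST_ALUS = 2              # simple "fast" ALUs
FAST_ALU_CLOCK_MULT = 2    # each runs at 2x the core clock
TRACE_CACHE_INSTRS = 6     # instructions delivered by the Trace Cache...
TRACE_CACHE_CYCLES = 2     # ...every 2 core cycles


def trace_cache_supply_per_cycle() -> float:
    """Average instructions the Trace Cache can deliver per core cycle."""
    return TRACE_CACHE_INSTRS / TRACE_CACHE_CYCLES


def fast_alu_demand_per_cycle() -> int:
    """Peak operations the two fast ALUs could start per core cycle."""
    return FAST_ALUS * FAST_ALU_CLOCK_MULT


supply = trace_cache_supply_per_cycle()
demand = fast_alu_demand_per_cycle()
shortfall = demand - supply

print(f"Trace Cache supply:   {supply:.1f} per cycle")
print(f"Fast ALU peak demand: {demand} per cycle")
print(f"Shortfall at peak:    {shortfall:.1f} per cycle")
```

The gap only shows up when both fast ALUs are busy on every half-cycle, which fits the "small thing" caveat.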

Misc-

*And then there's replay. Because the pipeline was so long, the CPU needed very aggressive speculation to keep the pipeline fed with data. Replay is essentially a "clone" pipeline that was brought in to work when that speculation failed. I read that although the idea itself was nice, it brought various problems; for example, it could effectively increase the number of pipeline stages, and thus the misprediction penalty.

Replay was also thought to be the reason SMT on Pentium 4 did not work as well as it should have.

Core 2 managed to increase performance by ~90% over Pentium 4 at the same clock. To do it again compared to Westmere would be quite a sight. :)
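
As a rough way to read that ~90% figure, here is a minimal sketch that models performance as IPC times clock and ignores memory, turbo, and everything else. The 1.90 ratio is the figure from this post; the 2.0 GHz example clock and the function names are hypothetical.

```python
def relative_performance(ipc_ratio: float, clock_ratio: float = 1.0) -> float:
    """Speedup of the new chip over the old one, modeled as IPC ratio x clock ratio."""
    return ipc_ratio * clock_ratio


def old_chip_clock_to_match(new_clock_ghz: float, ipc_ratio: float) -> float:
    """Clock (GHz) the old chip would need to tie the new chip under this model."""
    return new_clock_ghz * ipc_ratio


CORE2_OVER_P4_IPC = 1.90   # "~90% ... at the same clock", from the post
EXAMPLE_CLOCK_GHZ = 2.0    # hypothetical clock, for illustration only

# Speedup at equal clocks, then the clock a Pentium 4 would need to keep up.
print(relative_performance(CORE2_OVER_P4_IPC))
print(old_chip_clock_to_match(EXAMPLE_CLOCK_GHZ, CORE2_OVER_P4_IPC))
```

Under the same model, a 20% per-clock gain for Sandy Bridge would simply be `relative_performance(1.20)` at equal clocks.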
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
IntelUser2000 said:
Is it just me or does he sound like a spoiled kid? Compared with its architectural predecessor, Yonah, Merom was a 20% improvement. Considering these CPUs are basically ahead of everyone else's, 20% is an amazing improvement.

When I went back and looked at mobile Core 2 benchmarks, it was only 15% faster than Core Duo. On desktops it was able to do 20% because additional FSB bandwidth reduced the bottleneck more than it did on Yonah.

I didn't start at Yonah. The first CPU I had that kicked AMD to the curb was Dothan, so that's my starting point. Everything started with Dothan and the Israel team choosing the Pentium Pro core to start with.
 

tweakboy

Diamond Member
Jan 3, 2010
9,517
2
81
www.hammiestudios.com
You won't notice a difference, so there you go. OK, you'll save 10 seconds off a render, for example. Have you ever gone to 100 percent? When I render audio it takes 68 percent.
 

dac7nco

Senior member
Jun 7, 2009
756
0
0
Can someone explain (briefly) what the AVX instructions are? I have an understanding of vector systems in older big iron, and somewhat in the IBM POWER systems, but not much else.

Daimon