
Why Is Ivy Bridge Still Quad-Core?


Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Intel just released information that IVY will increase IPC. Certainly not 10% or anything, but there are internal enhancements in IB that will yield some minor enhancements.

If this information is from X-bit, why not say what that leak said rather than saying IPC will not be 10%? If you know the information, why don't you say exactly what the article said Intel said? This is so silly. Just like so many still say that on average SB was less than a 10% IPC increase over the last generation. In the real world it was closer to 25% than 10%.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Ivy Bridge is a Tick, which I take to mean fixing weaknesses in the changes they brought to Sandy Bridge (and I don't mean just performance).

There will be some IPC gain (an average of maybe 3-4%), but it's most relevant to media workloads like rendering/video editing, and in those applications the gains will be better. The rest of the gains in Ivy Bridge are due to better Turbo residency (and higher Turbo).

Interesting, I have formed a different view. Is it all just cache? ;-p

The 3770 has higher max Turbo clock than the 2600. The better comparison is 2500 vs 3550, as they both have same max Turbo clocks.

Ivy Bridge has same amount of cache as Sandy Bridge. So the part about larger caches is either a mistake or possibly something we never knew about Sandy Bridge(perhaps something prevents full utilization of the L3 caches?).

Nemesis said:
Just like so many still say that on average SB was less than 10% IPC increase over last generation. Not in the real world closer to 25% than 10%.

Exaggeration. Against Lynnfield the gains were ~15% at the same frequency. Of course it's different when talking about the same price point.

I just calculated 25% gain for i7-880 to i7-2600K from Anandtech CPU bench. Comparing Turbo frequencies:

2600K vs i7-880 Turbo frequency:

Base: 11.1% higher
4 cores: +5.1%
3 cores: +8.1%
2 cores: +2.8%
1 core: +1.9%
 

denev2004

Member
Dec 3, 2011
105
1
0
Actually, this IPC gain is not really a theoretical gain.
Nehalem added 64-bit macro-fusion; Sandy Bridge introduced the uop cache, which lets the front end skip the decoders when it hits. Those are more like theoretical gains.
 

jpiniero

Lifer
Oct 1, 2010
16,840
7,284
136
Because the era of high-TDP processors is coming to an end soon. It's only a matter of time until it's difficult/$$$ to get a 20+ W TDP desktop processor, let alone a laptop one.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Because the era of high-TDP processors is coming to an end soon. It's only a matter of time until it's difficult/$$$ to get a 20+ W TDP desktop processor, let alone a laptop one.

Just wondering....

Who makes the highest TDP processor core at the moment?

On what process technology? (to help me estimate the heat density)

(After a quick glance through the internet) I noticed IBM has some expensive processors, but I don't know how fast they are and how they stack up to Intel's leading edge x86 designs.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
Well, aren't you a clever one with your linking.

Wow!

A fake account or robot that is somehow able to respond with a quasi-intelligent, quasi-pertinent post.....accompanied by a fake link that superficially looks like it goes to another Anandtech thread but instead leads to a commercial gadget-selling website.

(Moderators, sorry for going off topic.)

No problem, thanks. Banned for spam.
Markfw900
 
Last edited by a moderator:

toyota

Lifer
Apr 15, 2001
12,957
1
0
I have said this many times and I will say it again...the first few posts from a new member need to be mod approved here.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
On top of that, I'd expect a theoretical 24 core of some cpu to be clocked lower than a 16 core counterpart, so again if you can't run more than 16 threads in the majority of your work, you're actually slowing yourself down.

That is true, but then there is always "Turbo".

Unfortunately, with a 24-core CPU on a single die of moderate size, we would be looking at pretty little silicon for each core.

This makes me wonder how much Turbo could be applied, and for how long? :(

P.S. 24 CPU cores on a single die may not be an outrageous idea to entertain. A die shrink of Sandy Bridge-E to 16nm should be able to yield something like that (assuming cache size doesn't become an issue).
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
IntelUser2000 said:
There will be some IPC gain(average of maybe 3-4%) ...

This isn't a fair wager, but what would you like to wager on that 3-4% IPC increase? Or is this your idea of waggery?
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
So I see by your avatar you made Elite rather quickly. Who are you, and where do you work in the industry? I have said it many times: by being selective I can make any averages I choose. Throw out the high, keep the low. Leave AVX out of the equation. Leave the IGP out of the equation for Intel only, even though it is on the same die. The same people who ate all the AMD BD hype are the same ones who don't believe what Intel said. I understand Intel was talking price points when they said 20% performance increase. At the same time, AVX on SB and IB is a moving target. It's never over till the fat lady sings. To keep AMD in the game, a lot of review sites are bending over backwards to sort of hype Llano. Then it's all right to count the IGP as part of the die, but when we discuss Intel performance gains from one tick-tock to another, we ignore those gains. Amusing.
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
We don't need faster CPUs. We need faster data storage devices. Even 500 MB/sec from NAND SSDs is crap compared to the 30+ GB/sec the main buses in the PC can move.

I want a zero wait state computing experience, not more cores at higher clock speeds so they can idle 99% of the time while still waiting on disk IO. I want to be able to back up 1 TB of data and see "13 seconds remaining..."; then maybe the overpowered CPUs we already have might get to actually do something that matters in the real world.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
We don't need faster CPUs. We need faster data storage devices. Even 500 MB/sec from NAND SSDs is crap compared to the 30+ GB/sec the main buses in the PC can move.

I want a zero wait state computing experience, not more cores at higher clock speeds so they can idle more while still waiting on IO.

The dog is biting its own tail there. With an SSD, many processes on your PC are now CPU bound rather than I/O bound. Linear transfer rate is of little importance for everyday use of a computer. Even lower access times (by at least two orders of magnitude), accompanied by the appropriate CPU power, matter more.
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
The dog is biting its tail there. With the use of an SSD, many many processes on your PC are now rather CPU bound than I/O bound. Linear transfer rate is of little importance for everyday use of a computer. Even lower access times (by at least two orders of magnitude), accompanied by the appropriate CPU power are more important.

Except for when you have to replace HDDs and recover 100 GB of data at 5 MB/sec on your 16 core 8 GHz super computer.

The real world is driven by data, -lots and lots and lots of data- not frames per second or 3DMarks or penislengths-per-nanosecond.

And when you have 15,000+ employees who want 100+ GB of storage in their laptop, even the most well funded company isn't going to be handing out personal SSDs like candy in their current state.

I'm not talking about linear transfer of single files, I'm talking about moving millions of files at speeds not measured in kilobytes per second in the year 2011 in the 21st century. You know, people in the real world who have accumulated 20 years worth of work in marketing, engineering, research, etc, and they've carried over that data with each new computer.

As time goes on, the amount of data being moved becomes insurmountable, and storage speeds haven't changed in decades. Oh wait, it took us a decade to go from 80 MB/sec to 100 MB/sec... but still 0.1 MB/sec random. In that same amount of time data collections grew from 1 GB to 1,000 GB.

More work needs to be done on SSDs and their future successors, and HDDs need to stop being manufactured altogether. Nobody really cares about CPU speeds and "lulz I have .1 moar fps than u n00b".

Hell even engineers using programs like Matlab that you would think would be CPU bound complain that it's disk access slowing them down and taking all weekend to run simulations, and CPU power isn't even remotely a concern.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
This isn't a fair wager, but what would you like to wager on that 3-4% IPC increase? Or is this your idea of waggery?

The absolute value isn't important. But one thing I'll say is that it's quite likely Ivy Bridge's specific gains will be far less than the ~15% Sandy Bridge had. Westmere didn't even bring any IPC improvements as a Tick!
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
The absolute value isn't important. But one thing I'll say is that it's quite likely Ivy Bridge's specific gains will be far less than the ~15% Sandy Bridge had. Westmere didn't even bring any IPC improvements as a Tick!

OK, now tell the people whose knuckles are turning white what it did bring.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
We don't need faster CPUs. We need faster data storage devices. Even 500 MB/sec from NAND SSDs is crap compared to the 30+ GB/sec the main buses in the PC can move.

I want a zero wait state computing experience, not more cores at higher clock speeds so they can idle 99% of the time while still waiting on disk IO. I want to be able to back up 1 TB of data and see "13 seconds remaining..."; then maybe the overpowered CPUs we already have might get to actually do something that matters in the real world.

Check out the Ramsdale specs.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
That is true, but then there is always "Turbo".

Unfortunately, with a 24 core cpu on single die of moderate size we would be looking at a pretty small silicon space for each core.

This makes me wonder how much Turbo could be applied and for how long? :(

It doesn't seem like it would gain too much. Here is Turbo Mode for the Xeon E7-8870, the top SKU with 10 cores:

Base: 2.4GHz
7-10C: +133MHz
5-6C: +266MHz
1-4C: +400MHz
Max: +400MHz
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
Perhaps you are too young to remember this, but AMD used to charge $999+ for their top end CPUs, back when they were beating Intel in performance.

Ah yes I fondly remember the Athlon 64 FX family. ^_^
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
I would pay $1200 for an 8 core LGA2011 EE :)

Some perspective from a programmer.

I want Intel because it's better right now, but I also want more cores because I write code that scales perfectly with cores provided there are enough independent objects to process... something I worked on in anticipation of Cell way before the PS3 release.

Replace:

for (i = 0; i < num_objs; ++i)
    objs[i].pfnTick();

with:

for (i = 0; i < num_objs; ++i)
    QueueJob(objs[i].pfnTick);

WaitForMultipleObjects(objs);

All cores will be at 100% regardless of how many cores, except for the brief instant at the end when the remaining objects in the queue are fewer than the number of cores. I call it 'scatter-gather multithreading': an application that is primarily constrained to sequential single-thread execution but can exploit heavy parallelism within each individual sequential step (eg games). It's basically just hybrid serial-parallel threading, but it reminds me of the Protoss carriers with the expanding and contracting swarms. The main sequentially dependent single thread is like the carrier, while each sequential component such as physics, AI, etc explodes into a million sub-threads processed in parallel before collapsing back to the main thread and proceeding to the next sequential step, rinse, repeat. Collision detection: spawn off a million collision pairs, process them in parallel, done, proceed to the next step. Collision response: spawn off the 1000 responses for the objects that hit each other, done, next step. AI: spawn off a million 'think' functions, etc.

Double-buffered inter-object dependent state data guarantees a consistent snapshot for the duration of the frame and eliminates the need for thread locks and synchronization between simultaneously updating objects, so it just runs balls to the wall and scales with cores just as well as IPC, without the usual thread-synchronization nightmares. In fact, the only place you have classical thread locking is when adding or removing jobs from the global job queue.

The tradeoff is memory consumption from double buffering the state data, but having things like position, heading, 'what am I doing' state, etc stored twice per object pales in comparison to the memory consumed by presentation resources (textures, vertices, etc). And hey, 16 GB RAM costs $80 and you're complaining there is no need for 16 GB RAM, so there you go. And in the case of the PS3 you have per-core scratchpad RAM and powerful DMA that pretty much make the memory duplication cost transparent, all for the sake of maximizing parallelism. When most people were looking at Cell going "WTF?", I thought it was beautiful once I applied this design. Double buffering dependent critical data is the most effective answer to leveraging mass parallelism by eliminating thread dependencies.

Objects can be anything from birds in the sky to the 100 monsters in front of you to falling leaves to collision pairs to bot AI all at the same time... or in my test case millions of balls bouncing off the edges of the screen with time consuming 'dummy' code to waste time and simulate a CPU load. The idea is proving that a) all else held constant, performance scales linear with number of cores, and b) watching all cores in Task Manager max out to 100% regardless of how many.

And to clarify what I mean by dependency with an example: the collision thread for ball 1934 never has to lock/unlock or "wait" on X and Y of ball 3452 to make sure the new Y isn't being updated while the old X has already been read (parallel processes), because X and Y for all balls are double buffered and the public state of X and Y is guaranteed to be constant for each object for the duration of the frame.

;)

Of course I don't *need* 8+ cores, I just *want* more cores to continue my research and test the limits of seamless thread scaling in consumer apps. But back on topic, yes, more cores are good. But obviously not to the point that performance doesn't improve because of decreased IPC within a given core. All else held constant, eg same IPC same family CPU, 4 cores at 4 GHz will execute my simulation with identical performance to 8 cores at 2 GHz, and twice as fast as either with 8 cores at 4 GHz.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
exdeath said:
Except for when you have to replace HDDs and recover 100 GB of data at 5 MB/sec on your 16 core 8 GHz super computer. ...

Where was I talking about fps or games? You said "zero wait state computing experience". I interpreted that statement from the point of view of a normal consumer, not a specialized data center or whatever else you meant. What does a normal person do with their computer? They browse the web, work with Office, create and edit HD videos/pictures/audio, and play games.

For the system to be even snappier (zero wait state, aka "instant"), we need a more efficient filesystem, less latency (access time and transfer time from the memory system to the CPU/RAM and back), and more processing power. If you analyze everyday activities on a modern PC with a fast SSD, you are mostly CPU bound. I think I have seen some interesting results over at the XS forums.

But maybe you would like to clarify what you meant by "zero wait state computing experience" and from what perspective you look at the "need for speed" :)
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
All software that was easy to thread has already been threaded. Almost all software that is medium or hard to thread has already been threaded.
That is incorrect. There is still a lot of opportunity left for more extensive threading. Note that preparing software to benefit from dual- or quad-core processors doesn't mean it has a scalable architecture that can easily take advantage of additional cores. Today's applications can best be described as coarse-grain multi-threaded. Fine-grain multi-threading has better scaling characteristics but comes with inherent overhead that impacts performance on processors with few cores.

So we're still somewhat in limbo as far as multi-threaded software goes. Also keep in mind that colleges and universities have only recently started teaching multi-core programming. The vast majority of developers out there don't have a firm grasp of what it takes to make a multi-threaded application with good scaling behavior. And adding multi-core support to existing code bases can be a minefield and leads to suboptimal results.

Fortunately they'll get some help from future hardware in the form of Hardware Transactional Memory (HTM). It is rumored to be supported by Intel's Haswell architecture (but might be delayed till Broadwell for marketing reasons and/or because the process shrink creates room for more cores and thus creates the need for HTM). AMD has also proposed extensions to help automate multi-threaded development and improve performance.
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
I can agree with you on the first part, but you do seem to be overlooking the part where Ivy Bridge is just a die shrink of Sandy Bridge, which is a native quad core part. If Haswell comes out with just a quad core, then I can maybe see where you're coming from, but that's still over a year away.
Mainstream Haswell chips will most likely still be quad-core (enthusiast parts could be six or eight core of course).

But that doesn't mean performance won't improve. Instead of spending transistors on more cores, they'll invest them in widening the SIMD units and making them suitable for a very wide range of applications. AVX2 features 256-bit vector instructions for integer operations, fused multiply-add floating-point instructions, and, most importantly, memory gather instructions. The latter makes it possible to parallelize almost any code loop with independent iterations, up to a factor of eight.

This is far more power efficient than trying to achieve the same level of parallelism through multi-threading. Hence Haswell is likely to remain quad-core, but each core is capable of much higher throughput.
 

tweakboy

Diamond Member
Jan 3, 2010
9,517
2
81
www.hammiestudios.com
I've read in many places that Ivy Bridge will have a 6-core/12-thread variant. This is pretty much fact, set in stone. It's later on, at the end of 2012, but I saw the Intel graph myself. Ivy will have 6-core/12-thread variants... I heard an echo in here...
 

jpiniero

Lifer
Oct 1, 2010
16,840
7,284
136
I've read in many places that Ivy Bridge will have a 6-core/12-thread variant. This is pretty much fact, set in stone. It's later on, at the end of 2012, but I saw the Intel graph myself. Ivy will have 6-core/12-thread variants... I heard an echo in here...

Yeah, Ivy Bridge Extreme. It'll probably be the same clock speeds and cores as Sandy E, with a lower TDP, and still no GPU.

Mainstream Haswell will def be max quad core.