4th Generation Intel Core, Haswell summarized

mikk

Diamond Member
May 15, 2012
4,308
2,395
136
Give me a specific number (not a range) to use for SB over Nehalem, as well as a specific number for Nehalem over Conroe. I will gladly recompute the value for Haswell based on Intel's published results.


Let's say 12.5% for SB over Nehalem. I'm not fully sure about Nehalem over Conroe, but I would say more than your 10%; I would say 15%. But these numbers can't be accurate, since we don't know Intel's numbers or internal measurements.
 

happysmiles

Senior member
May 1, 2012
340
0
0
Knowing Intel, they will try to charge $478 apiece for the minimum SKU, thus keeping tablets priced well over $1000 and thus continuing to bleed massive profits to Apple and Google.

that would leave ARM to eat their souls
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Let's say 12.5% for SB over Nehalem. I'm not fully sure about Nehalem over Conroe, but I would say more than your 10%; I would say 15%. But these numbers can't be accurate, since we don't know Intel's numbers or internal measurements.

Sure. Then according to Intel's graph, if we take SB IPC to be 12.5% higher than Nehalem's IPC that requires Nehalem's IPC to be 18.7% higher than Conroe and Haswell's IPC to be 19.3% higher than Sandy's IPC for whatever "broad workload mixture" Intel used as their basis for assessing average IPC of their recent microarchitectures.

(For anyone wondering what this is based on: the Y-offset I used for the graph is 141 pixels.)
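
For anyone who wants to redo the arithmetic, here is a rough Python sketch of how one assumed ratio can pin down the others when a bar chart's axis does not start at zero. It is only one plausible reading of the method: the bar heights below are made-up pixel values, not the measurements taken from Intel's slide, and `solve_offset` and `ipc_gain` are hypothetical helper names.

```python
def solve_offset(h_low, h_high, known_ratio):
    """Return the hidden part of the axis (in pixels) implied by one known ratio.

    Solves (h_high + b) / (h_low + b) = known_ratio for b.
    """
    return (h_high - known_ratio * h_low) / (known_ratio - 1.0)


def ipc_gain(h_old, h_new, offset):
    """Fractional IPC gain between two bars once the hidden offset is added back."""
    return (h_new + offset) / (h_old + offset) - 1.0


# HYPOTHETICAL visible bar heights in pixels (illustration only, not real data).
bars = {"Conroe": 100, "Nehalem": 160, "Sandy Bridge": 200, "Haswell": 260}

# Assumed input from the thread: Sandy Bridge IPC is 12.5% above Nehalem.
offset = solve_offset(bars["Nehalem"], bars["Sandy Bridge"], 1.125)

gains = {
    "Nehalem over Conroe": ipc_gain(bars["Conroe"], bars["Nehalem"], offset),
    "Sandy Bridge over Nehalem": ipc_gain(bars["Nehalem"], bars["Sandy Bridge"], offset),
    "Haswell over Sandy Bridge": ipc_gain(bars["Sandy Bridge"], bars["Haswell"], offset),
}

for label, gain in gains.items():
    print(f"{label}: {gain * 100:.1f}%")
```

Changing the assumed Sandy Bridge over Nehalem figure changes the solved offset, which is why the other two percentages move along with it.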
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Sure. Then according to Intel's graph, if we take SB IPC to be 12.5% higher than Nehalem's IPC that requires Nehalem's IPC to be 18.7% higher than Conroe and Haswell's IPC to be 19.3% higher than Sandy's IPC for whatever "broad workload mixture" Intel used as their basis for assessing average IPC of their recent microarchitectures.

If it's 19.3%, then I imagine that "broad workload mixture" actually has a lot of AVX2 code in it. If it doesn't, that would be freaking awesome, though tough to imagine on the same node.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
If it's 19.3%, then I imagine that "broad workload mixture" actually has a lot of AVX2 code in it. If it doesn't, that would be freaking awesome, though tough to imagine on the same node.

Presumably Intel used the same workload mixture across all architectures when generating the data that went into the bar chart, otherwise the IPC results themselves would have been pointless to generate from the outset. (Outside of marketing purposes, that is, which goes to 11 for both companies, so I'll concede that point in advance: +1 to Ajay.)

Regardless of my cynicism, your statement is quite spot on. Achieving a nearly 20% increase in IPC is just phenomenal if true. In fact it is truly in the realm of "seems too good to be true", and my inner cynic is saying it probably is too good to be true.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Intel isn't competing with AMD anymore. Intel is competing with ARM CPUs and China's CPUs. Yes, China has CPUs and they are growing in usage. AMD has nothing for Intel. I would very much like to see the link where you saw Intel's clocks for Haswell desktop. You're insane if you believe IB is getting the cold shoulder from anyone. SB is about the same, and most that are upgrading did so already with SB. But if you build a system now, choosing SB over IB would be a retard move.

Haswell brings so much more than SB/IB it's not even funny until one reads posts like yours. You do understand a 15% increase for a HW processor would require about a 22% increase in IPC on the next AMD release to stay in step with Intel ADVANCES. IT'S ACTUALLY HIGHER BUT I DON'T WANT THE BABIES CRYING.

Sure they are. They're clearly beating them, but AMD isn't out of it. AMD isn't focusing on desktops as much anymore because the market isn't what it once was. Bulldozer seems to be a pure server/workstation design; Intel is doing the same, but they're doing it with IPC while AMD is doing it with cores. AMD isn't doing quite as badly in servers as they are in desktops. There are also the mobile and low-power markets, which they've been trying really hard to break into. They haven't yet, but they're still there trying, and that's a threat and competition if I ever knew the definition.

I would very much like to see the post where I claim to know Haswell's clock rate.

I must be insane.

If I was building a new system I'd get whichever i5 k was cheaper as I personally don't see any advantage for my personal needs, but then again I'm insane and retarded.

Until what is brought is commonly used, it won't make a bit of difference. AVX was a dud for most of us. AVX2 is considerably more promising, but we'll cross that bridge when we get to it (it's not doing AMD any favors in the desktop world). The IGP is "meh" for most of us, though it does have value and it's good to see it coming up.

You do understand my point was that IPC is only one part of the equation, right? A 15% increase with a lower clock speed threshold would be pretty "meh" to me; I dunno about you though. If you say so. I think your caps lock got stuck though.
 

CHADBOGA

Platinum Member
Mar 31, 2009
2,135
833
136
Sure they are, they're clearly beating them but AMD is out of it. AMD isn't focusing on desktops anymore because the market isn't what it once was.
Nonsense. AMD isn't focusing on desktops anymore because the brainiacs at AMD produced a dud CPU with horrible single core performance, so they can't compete there.

Bulldozer seems to be a pure server/workstation design, while Intel is doing the same they're doing it with IPC while AMD is doing it with cores.

With 8+ core CPUs in the server/workstation market, Intel is using both IPC and more cores, the way it should be done.

AMD isn't doing quite as bad in servers as they are desktops.

That's not true at all. Their server market share is pathetic and getting worse every month.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
AMD is doing a lot worse in the server segment than in any other segment. It's what, a sub-5% share for them?

Their only current hope is the APU designs. Their last chance.

I am looking forward to seeing how mobo makers will implement the new VRM system in Haswell. Will they find a way to add 32 phases for nothing? :D
 

NTMBK

Lifer
Nov 14, 2011
10,455
5,842
136
Using gaming benchmarks is kinda phony anyway, as the GPU is the biggest factor in those benchmarks unless you're willing to use low-res results to remove the GPU as the bottleneck.

So, what you're saying is that gaming with a realistic graphics card (i.e. a single card, not some insane SLI rig), at realistic resolutions (i.e. 1080p and up), the main bottleneck is the GPU and there's no reliably discernible difference between results from an AMD or Intel APU? And yet you're still raging at people who buy a cheaper APU for an equivalent gaming experience? :)

When the present agreement between NV and Intel runs out, I believe in 3 more years, Intel will no longer offer PCI-E slots. If you want NV you will have to buy an NV CPU. For Phi, Intel will bring that on-die or use another socket. You will not see NV accelerators on Intel server products.

Well that's just not going to happen. :rolleyes: There are plenty of companies who have invested very, very large amounts of money into CUDA code running on Tesla cards which cost literally thousands of pounds each, and they aren't going to throw all of that away. They'd just stick with older Intel machines, or buy in AMD servers. It'd be utter suicide in the high performance computing market. (Not to mention that the enthusiast gaming market would murder them for dropping support for graphics cards.)
 

BenchPress

Senior member
Nov 8, 2011
392
0
0
Achieving nearly 20% increase in IPC is just phenomenal if true, in fact it is truly in the realm of "seems too good to be true" and my inner-cynicism is saying it probably is too good to be true.
Haswell adds a fourth integer execution port, which results in two nearly symmetrical pairs. That means having two threads per core is going to FLY. No more port contention! And it's equally useful for vector workloads. Freeing execution port 0 will make a huge difference. On top of that they can now sustain two loads and one store each cycle, and all the buffers for out-of-order execution grew by more than 10%.

So a 20% increase in IPC isn't so hard to imagine. We've seen nearly 15% gains in the past with far less new hardware.

However, it may very well come at a ~10% decrease in clock speed. But the end result would still be faster, and much more power efficient. It's either this, or they've made another compromise like increasing some instruction latencies, which lowers the IPC gain but keeps clock frequency high. Given that clock frequency is going to increase again anyway, I'm starting to lean toward the first option...
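
To put rough numbers on that trade-off, here is a back-of-the-envelope Python sketch. It assumes performance scales simply as IPC times clock frequency, which ignores memory effects and turbo behavior, and the function names are hypothetical.

```python
def relative_performance(ipc_gain, clock_change):
    """Net speed relative to the old chip, with performance modeled as IPC x clock."""
    return (1.0 + ipc_gain) * (1.0 + clock_change)


def break_even_clock_change(ipc_gain):
    """Clock change at which the IPC gain is exactly cancelled out."""
    return 1.0 / (1.0 + ipc_gain) - 1.0


# Figures floated in this thread: +20% IPC, about -10% clock.
net = relative_performance(0.20, -0.10)
limit = break_even_clock_change(0.20)

print(f"net change: {(net - 1.0) * 100:+.1f}%")
print(f"break-even clock change: {limit * 100:.1f}%")
```

Under that simple model, a 20% IPC gain paired with a 10% clock drop still nets out to roughly 8% faster, and the clock would have to fall by about a sixth before the gain disappeared.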
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
If it's 19.3%, then I imagine that "broad workload mixture" actually has a lot of AVX2 code in it. If it doesn't, that would be freaking awesome, though tough to imagine on the same node.
It's the same process node, but they've added 33% more execution ports! So is it really that tough to imagine?

Also, the IPC gain from AVX2 is... wait for it... nada. AVX2 isn't about Instructions Per Clock, it's all about doing twice the amount of work per instruction.
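
A toy Python sketch of that distinction, assuming 32-bit integer elements and a made-up IPC figure that is held constant: the instruction count per clock stays the same, while the number of elements processed per instruction doubles when going from 128-bit to 256-bit vectors. The helper names are hypothetical.

```python
def elements_per_instruction(vector_bits, element_bits=32):
    """How many elements one vector instruction operates on."""
    return vector_bits // element_bits


def elements_per_clock(ipc, vector_bits, element_bits=32):
    """Work per clock = instructions per clock x elements per instruction."""
    return ipc * elements_per_instruction(vector_bits, element_bits)


ASSUMED_IPC = 1.0  # hypothetical, identical for both cases on purpose

sse_rate = elements_per_clock(ASSUMED_IPC, 128)   # 128-bit integer vectors
avx2_rate = elements_per_clock(ASSUMED_IPC, 256)  # 256-bit integer vectors

print(f"128-bit: {sse_rate} elements/clock")
print(f"256-bit: {avx2_rate} elements/clock")
print(f"work ratio at equal IPC: {avx2_rate / sse_rate}")
```

The IPC input never changes here, which is the point: a benchmark that counts instructions per clock would not register the wider vectors at all.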

That said, Haswell's new arithmetic execution port frees up the execution ports used by AVX2, which are also used by AVX and all the SSE instructions. That is, vector workloads in general. So no need for AVX2 code to benefit from the new execution port!

My expectation is that this "broad workload mixture" included a lot of multi-threaded workloads and legacy vector workloads. Having two threads per core helps maximize the use of the 33% more execution ports to easily gain 20% in IPC.

The real question is, did it come at a price in clock frequency, or did they make a different compromise?
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
So, what you're saying is that gaming with a realistic graphics card (i.e. a single card, not some insane SLI rig), at realistic resolutions (i.e. 1080p and up), the main bottleneck is the GPU and there's no reliably discernible difference between results from an AMD or Intel APU? And yet you're still raging at people who buy a cheaper APU for an equivalent gaming experience? :)
My first reply here.
I haven't a clue what you're talking about. Intel has NO APU on die and never will. AMDers will try to tag Intel's iGPU as an APU, but until Intel says differently they have no APU. "Accelerator" is just another word for compute. Intel has compute programming units; if you look at Phi, that's really not an accelerator either, it's a coprocessor using compute programming units. APU is an incorrect term when referring to Intel. Intel knows better than AMD. I mean, look who leads the market. As far as caring about who buys what, you're way off base. I don't care what anyone buys. I don't even see AMD as any sort of threat to Intel at all. NV, now that's a different story: a smart man who believes we have all the time in the world would be buying up NV stock. Then China has the Dragon CPU, and it's ramping up. AMD would have been better off leaving the x86 market and becoming a fab for hire. Intel will always have the best fabs in the world and companies



Well that's just not going to happen. :rolleyes: There are plenty of companies who have invested very, very large amounts of money into CUDA code running on Tesla cards which cost literally thousands of pounds each, and they aren't going to throw all of that away. They'd just stick with older Intel machines, or buy in AMD servers. It'd be utter suicide in the high performance computing market. (Not to mention that the enthusiast gaming market would murder them for dropping support for graphics cards.)

Ya, you're right, but NV wants to play in the CPU market, and that changes the rules. Intel had intended this to be the year that PCI-E disappeared, but because of infighting NV managed to get 5 more years of PCI-E on Intel. The rules changed when NV created its own processor. Intel does not have to supply access to Intel's tech to a company like NV in the near future, as NV has its own products it can piggyback on. Intel has Phi; they couldn't care less about CUDA cores.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Haswell adds a fourth integer execution port, which results in two nearly symmetrical pairs. That means having two threads per core is going to FLY. No more port contention! And it's equally useful for vector workloads. Freeing execution port 0 will make a huge difference. On top of that they can now sustain two loads and one store each cycle, and all the buffers for out-of-order execution grew by more than 10%.

So a 20% increase in IPC isn't so hard to imagine. We've seen nearly 15% gains in the past with far less new hardware.

However, it may very well come at a ~10% decrease in clock speed. But the end result would still be faster, and much more power efficient. It's either this, or they've made another compromise like increasing some instruction latencies, which lowers the IPC gain but keeps clock frequency high. Given that clock frequency is going to increase again anyway, I'm starting to lean toward the first option...

It's the same process node, but they've added 33% more execution ports! So is it really that tough to imagine?

Also, the IPC gain from AVX2 is... wait for it... nada. AVX2 isn't about Instructions Per Clock, it's all about doing twice the amount of work per instruction.

That said, Haswell's new arithmetic execution port frees up the execution ports used by AVX2, which are also used by AVX and all the SSE instructions. That is, vector workloads in general. So no need for AVX2 code to benefit from the new execution port!

My expectation is that this "broad workload mixture" included a lot of multi-threaded workloads and legacy vector workloads. Having two threads per core helps maximize the use of the 33% more execution ports to easily gain 20% in IPC.

The real question is, did it come at a price in clock frequency, or did they make a different compromise?

If I read this right, basically you both are saying that the dominant microarchitectural improvement for IPC in Haswell comes down to improving the performance of hyperthreading?

If this is the case, then what kind of IPC improvement are we talking about for single-threaded apps? (and processors which won't have HT, like the 2500k and 3570k, etc)
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
With Haswell you will see a big shift to all-in-one PCs, much more than with IB. The handwriting is on the wall. Intel has given us Thunderbolt, so Intel is not locking anyone out. You're simply not going to be able to attach a GPU to an internal socket on Intel products; it's all Thunderbolt for the fleas of the processing market. GPU over Thunderbolt is the only way after the present agreement runs out.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
If I read this right, basically you both are saying that the dominant microarchitectural improvement for IPC in Haswell comes down to improving the performance of hyperthreading?



If this is the case, then what kind of IPC improvement are we talking about for single-threaded apps? (and processors which won't have HT, like the 2500k and 3570k, etc)

Well, you have been in the industry a long time. Looking at what we know, this does seem to be the case. What about the non-HT CPUs? That's what used to make America great: choice. You have a choice that Intel offers. HT or not to HT, that is the question.
You're a smart guy. Give it some thought, then give us the kind of reply we have come to expect from you. Unlike you, I have never worked at creating hardware, but I have built wonderful cabinets for DEC and IBM. You know about the core; I know about containing large compute in well-designed cabinets. But I have been trying to learn as much as possible about your end. Without a formal education, I can say your end is changing rapidly now. The form factor that houses this hardware is changing rapidly too. I know that my present watercooling setups are almost extinct now. Small form factor is the future. I want the Dick Tracy wristwatch; that will be a day in history.
 

MisterMac

Senior member
Sep 16, 2011
777
0
0
As a lurker, it's increasingly funny to watch some "senior" members completely and utterly implode, devoid of any logical sense.


No ST improvement does sound scary, Idontcare. I'd be scared here.
 

piasabird

Lifer
Feb 6, 2002
17,168
60
91
Show me the benchmarks.

I kind of wonder if they already have working prototypes or are they just working off of design specifications. Lots of things work on paper a lot better than they do in real-world application. All kinds of claims can be made, but how much will things actually improve based on benchmarks and actual hands on use?
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
If I read this right, basically you both are saying that the dominant microarchitectural improvement for IPC in Haswell comes down to improving the performance of hyperthreading?
Don't forget vector workloads, and any scalar floating-point workload for that matter too. All of these benefit from having execution ports 0 and 1 available for vector or floating-point operations, while the new port 6 takes over the ALU, shift and branch operations from port 0 (with port 5 offloading port 1):

[Image: DSC_8160_575px.JPG]


If this is the case, then what kind of IPC improvement are we talking about for single-threaded apps? (and processors which won't have HT, like the 2500k and 3570k, etc)
If I had to guess I'd say 10% for single-threaded scalar integer workloads seems conservative. 15% for single-threaded vector or floating-point workloads. And 20% for multi-threaded ones (any kind). It's also safe to assume they reduced clock frequency a little to meet the timing requirements and lower the power consumption.
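
As a rough illustration of how those three guesses could roll up into one headline figure, here is a Python sketch that takes a weighted average over a workload mix. The per-category gains are the guesses above; the mix weights are purely hypothetical, and a plain weighted mean is a crude stand-in for however Intel actually averages its "broad workload mixture".

```python
# Guessed IPC gains per workload category, taken from the post above.
guessed_gains = {
    "single_threaded_integer": 0.10,
    "single_threaded_vector_fp": 0.15,
    "multi_threaded": 0.20,
}

# HYPOTHETICAL share of each category in the benchmark mix (must sum to 1).
ASSUMED_MIX = {
    "single_threaded_integer": 0.5,
    "single_threaded_vector_fp": 0.3,
    "multi_threaded": 0.2,
}


def blended_gain(gains, mix):
    """Weighted arithmetic mean of per-category gains."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mix weights must sum to 1")
    return sum(gains[name] * weight for name, weight in mix.items())


blend = blended_gain(guessed_gains, ASSUMED_MIX)
print(f"blended IPC gain: {blend * 100:.1f}%")
```

Shifting weight toward the multi-threaded category pulls the blended number toward 20%, which is one way a chart-wide average could land well above the single-threaded figure.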

Note that single-threaded pure integer workloads, which could use higher performance but can't obtain it through multi-threading or vectorization, are getting quite scarce. Also note that lowering the clock frequency while increasing IPC means much higher performance/Watt. In particular, that means they'll finally be able to increase the core count, which in turn will be aided by the addition of TSX.

It's a good question what will happen to the i5 models. It doesn't make a lot of sense to have these very wide cores not support Hyper-Threading. Unless they really want to have people buy such expensive crippled CPUs, because they can.
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
APU is CPU and GPU on a single die. Hence the entire Ivy/Sandy Bridge line are APUs. :)
Not really. An APU is an "Accelerated" Processing Unit, meaning a CPU and GPU on a single die, with the explicit intention of using the GPU to perform generic high throughput workloads instead of the CPU. This is heterogeneous computing.

Haswell has a GPU too, but its CPU cores are more powerful, thanks to having two 256-bit AVX2 FMA units per core. The GPU is primarily optimized for power efficient graphics instead. This homogeneous solution should prove to be more powerful because it doesn't suffer from bandwidth and latency issues that heterogeneous computing has to deal with. It also doesn't require developers to tune their algorithms for two completely different architectures (and many configurations of it), and they can use familiar programming languages and tools.
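
To show what two 256-bit FMA units per core add up to, here is a small Python sketch of the usual peak-throughput arithmetic, counting each fused multiply-add as two floating-point operations. The core count and clock speed are hypothetical placeholders, not announced specifications, and the function names are made up for the example.

```python
def flops_per_cycle(fma_units, vector_bits, element_bits):
    """Peak FLOPs per cycle per core; one FMA counts as a multiply plus an add."""
    lanes = vector_bits // element_bits
    return fma_units * lanes * 2


def peak_gflops(cores, clock_ghz, per_core_flops_per_cycle):
    """Theoretical peak in GFLOPS for a whole chip."""
    return cores * clock_ghz * per_core_flops_per_cycle


sp = flops_per_cycle(fma_units=2, vector_bits=256, element_bits=32)  # single precision
dp = flops_per_cycle(fma_units=2, vector_bits=256, element_bits=64)  # double precision

ASSUMED_CORES = 4        # hypothetical
ASSUMED_CLOCK_GHZ = 3.0  # hypothetical

peak = peak_gflops(ASSUMED_CORES, ASSUMED_CLOCK_GHZ, sp)
print(f"per core: {sp} SP or {dp} DP FLOPs per cycle")
print(f"chip peak at assumed clock: {peak:.0f} SP GFLOPS")
```

This is a theoretical ceiling only; sustained throughput depends on whether the loads and stores can keep both units fed.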

So it's a different paradigm. Only AMD uses the term APU.
 

Charles Kozierok

Elite Member
May 14, 2012
6,762
1
0
AMD's been using the term for a long time, though. Is there in fact any current integration of the processors, or just a plan for the future? My impression was that right now it really is just a GPU and a CPU on one chip.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
APU is CPU and GPU on a single die. Hence the entire Ivy/Sandy Bridge line are APUs. :)

Intel does not have APUs. They have iGPUs; the I is for Intel. APU is a term coined by AMD. You're welcome to use the term, but not with Intel products. You can't debate me on this; you need to debate Intel on this. GPU, VPU: two names, same meaning, but ATI's VPU is seldom used. GPUs are compute units with very small cores. Phi is a coprocessor with many x86 baby cores along with vector compute. Show me one link where Intel says SB/IB are APUs. You can't, because Intel will never use APU as a term for their tech. Ever. So no, Intel does not have APUs; they have lots of compute processing cores. Explain to us all the difference between compute processing cores and accelerated processing cores. Cores are cores; it's just how much punch each core has. Nice try, but you're lying. Fact is, these so-called accelerator cores are the weakest of the weak. That's why it takes thousands of them to do about the same work as one Intel Haswell core, and even then there is much these small AMD accelerator cores can't do on their own in x86.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
AMD's been using the term for a long time, though. Is there in fact any current integration of the processors, or just a plan for the future? My impression was that right now it really is just a GPU and a CPU on one chip.

Unless they integrate the ISA into a monolithic ISA the same way they did with the FPU and x87, the microarchitecture can't really be "integrated" any further than it is now beyond simple integration of the memory subsystem (cache, IMC, and RAM; no real logic cohabitation or sharing can happen until ISA integration occurs).

My gut tells me the future of APUs is very much going to continue to look like bolted-together SoCs. The floor layout for the die itself may get more and more "muddled" as synthetic layout tools are relied upon to optimize thermal balance, clocks, etc., but the actual logic integration and synergy that was to come from "fusion" is probably always going to be a pipe dream.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Unless they integrate the ISA into a monolithic ISA the same way they did with the FPU and x87, the microarchitecture can't really be "integrated" any further than it is now beyond simple integration of the memory subsystem (cache, IMC, and RAM; no real logic cohabitation or sharing can happen until ISA integration occurs).

My gut tells me the future of APUs is very much going to continue to look like bolted-together SoCs. The floor layout for the die itself may get more and more "muddled" as synthetic layout tools are relied upon to optimize thermal balance, clocks, etc., but the actual logic integration and synergy that was to come from "fusion" is probably always going to be a pipe dream.

The way I see APUs, it's more the CPU helping the GPU do computing...
than the GPU helping the CPU with parallel code... :p

IIRC there was some research where CPUs would fetch data for the GPUs...

but meh...kinda off-topic right?
 

CPUarchitect

Senior member
Jun 7, 2011
223
0
0
AMD's been using the term for a long time, though. Is there in fact any current integration of the processors, or just a plan for the future? My impression was that right now it really is just a GPU and a CPU on one chip.
The HSA roadmap (the architecture used by AMD's APUs) runs till at least 2014. So yes, it's definitely a longer-term plan.

But it's really not just a hardware problem. They have to try and convince developers to adopt a quirky heterogeneous way of computing to access an integrated GPU that really isn't that powerful. HSA puts a JIT-compiled software layer on top of it to homogenize things, but that will come at a performance cost even in the best case. So basically they're stuck between a rock and a hard place, while homogeneous throughput computing using AVX2 is being applauded left and right.