Intel "Haswell" Speculation thread


Revolution 11

Senior member
Jun 2, 2011
952
79
91
" Core size can't be increased much more without hitting diminishing returns."
But luckily transistors are still shrinking, so we can have more powerful execution units with each die shrink? Except that all the components being integrated onto the die, or as separate dies in the package (NB, memory controller, GPU, VRM, SB), are slowing this growth.

Maybe you could help me understand something. Why can't they take all the execution units from a quad-core die and put them into 2 larger cores? Yes, you might have problems keeping all the execution units busy, but then each core would be much more powerful.

Am I right that they have increased core counts in order to keep the execution units more fully utilized when CPU usage is at 100%, along with hyperthreading to help?

Is the reason they don't make dual cores as powerful as quad cores that it would be inefficient?

Because I wouldn't mind giving up a few cores for much better single-threaded performance.

Besides what VirtualLarry pointed out, there is the matter of economics. Currently, users who want a budget CPU get a dual-core chip. This saves them money, but it also saves Intel money, since the smaller die area lowers the production cost. I am sure Intel could make a gigantic dual-core die, but why would they spend all that R&D and production money? Users who want less performance get fewer cores; users who need more performance get more cores (and HT as you continue to move up). Unless you are offering several hundred million dollars for such a die, no one will make it.

Transistor shrinks do allow more density per mm². But all the semiconductor companies use these shrinks to lower production cost (by increasing the number of chips per wafer) as a way to compensate for the ever-increasing R&D costs associated with die shrinks. We seem to be approaching a point where decreased die sizes may not be enough to pay for the next process node. Hence the decreased value from Intel CPUs: no solder, no TSX on overclocking chips, very expensive GT3e, stagnation of clocks/IPC/cores.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
Haswell's weakness is the single core performance over the generations.

I read the same crap posted all over these forums. You people want a 20% IPC increase year over year, and it is just not possible. The x86 uArch is pretty much maxed out, and there is not much more juice that can be squeezed out of it.

New instructions, a better memory subsystem, more cache, and, to a lesser extent, more cores are the ways around this. If you do not realize this by now, prepare to be disappointed for a long time.
 

TuxDave

Lifer
Oct 8, 2002
10,572
3
71
Making the cores "wider" is limited by the amount of ILP (instruction-level parallelism). Haswell is already 6-wide (or is that 8-wide?) as far as execution pipelines go.

So what you are asking, is already somewhat true in Haswell. They did make the cores "wider".

There are two types of "width" to talk about. One is instruction-level parallelism: how many instructions you can execute simultaneously in a single cycle. Serially dependent instructions cannot go at the same time, BUT an out-of-order execution engine will help bypass this. So the effectiveness of just shoving more execution units onto one core depends on how serial the instruction flow is, the out-of-order window, and also the allocation/retirement width (basically, how many instructions get shoved into the execution engine per cycle). Haswell improves this, and that can improve legacy performance. For example, with heavy FP multiply instructions you now get two per cycle instead of one.
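To make the ILP point concrete, here is a toy sketch in C (hypothetical code, not from any real benchmark): the first loop is one long dependency chain, so extra multiply units sit idle; the second splits the work into two independent chains that two FP multiply pipes can overlap.

```c
#include <stddef.h>

/* Serial chain: every multiply depends on the previous result, so the
 * out-of-order engine cannot overlap them; throughput is bounded by
 * multiply latency no matter how many FP units the core has. */
double product_serial(const double *x, size_t n) {
    double p = 1.0;
    for (size_t i = 0; i < n; i++)
        p *= x[i];
    return p;
}

/* Two independent chains: p0 and p1 have no data dependence on each
 * other, so a core with two FP multipliers can issue one multiply from
 * each chain per cycle. */
double product_unrolled(const double *x, size_t n) {
    double p0 = 1.0, p1 = 1.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        p0 *= x[i];
        p1 *= x[i + 1];
    }
    if (i < n)
        p0 *= x[i];
    return p0 * p1;
}
```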

Then there's data width, which Haswell improves by introducing AVX2. This dictates how many elements of data you can compute at the same time within a single instruction. Unfortunately this will not improve legacy performance, and not all traces benefit. Some workloads have an infinite pile of data they need to do the same operation on and can easily expand to the full data width. However, there are still a large number of high-performance traces that use scalar data and don't get any benefit from AVX2 or even AVX1.
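A rough illustration of the data-width difference, in C with the standard immintrin.h intrinsics (a minimal sketch, assuming an AVX2 build with -mavx2):

```c
#include <immintrin.h>
#include <stddef.h>

/* Scalar: one 32-bit add per instruction. */
void add_scalar(int *dst, const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* AVX2: eight 32-bit adds per instruction. 256-bit integer ops are new
 * in AVX2; AVX1 only went to 256 bits for floating point. */
void add_avx2(int *dst, const int *a, const int *b, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_add_epi32(va, vb));
    }
    for (; i < n; i++)  /* scalar tail */
        dst[i] = a[i] + b[i];
}
```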

So Haswell does improve the design in both these regards. Personally, I think CPU improvements have to come in two parts. You need new instructions to get really large improvements on emerging workloads. But CPU reviews and all the hype mostly show legacy performance, so on the side you also have to put in some hardware to improve legacy traces that don't benefit from any new instructions. I think it's rare for people to continue benchmarking as new code comes out, the exception being that AIDA bandwidth trace. What you get on day #1 is what sticks in people's heads.
 
Last edited:

SiliconWars

Platinum Member
Dec 29, 2012
2,346
0
0
Intel Ivy Bridge still isn't over 50% faster clock for clock than a 3MB/core Penryn CPU.

For instance, my X9100 in my G50VT laptop overclocked to 3.5GHz, 1.325V (100% stable):
Super Pi 1M: 14.7s, 2M: 34s

Ivy Bridge i5-3210 @ 3.1GHz stock turbo:
Super Pi 1M: 12.7s, 2M: 29s

So my 2009 laptop is still decent in real-world responsiveness. Rough estimate: ~33-50% faster clock for clock in floating point than Penryn. Of course with multitasking/SMP it would be pwned.

Super Pi is no longer a legit benchmark since the Haswell regression.
 

Pheesh

Member
May 31, 2012
138
0
0
Transistor shrinks do allow more density per mm². But all the semiconductor companies use these shrinks to lower production cost (by increasing the number of chips per wafer) as a way to compensate for the ever-increasing R&D costs associated with die shrinks. We seem to be approaching a point where decreased die sizes may not be enough to pay for the next process node. Hence the decreased value from Intel CPUs: no solder, no TSX on overclocking chips, very expensive GT3e, stagnation of clocks/IPC/cores.

"no solder"- From a raw material cost perspective solder/flux is cheaper than TIM, why assume the switch is a cost reason and not related to mechanical stress/die crack/failure reasons for smaller 22nm die?

"no TSX on overclocking chips"- Features are fused off all the time and this isn't anything new; it's market segmentation. Companies don't like to cannibalize their other product lines.

"very expensive GTe3"- custom die + harder to yield + limited market = $$$
If they are overpriced then people won't buy it. They are priced what the market will pay for them.

"Clocks stagnating"- haven't we been down that road w/ Netburst?

"IPC stagnation"- In the physical world exponential improvements eventually run into a wall, I think we are pretty close here under current materials. A given process flow can only be optimized so much before all the constraints are removed. IPC improvements are not magically and always attainable. Physics eventually comes into play and we're clearly at the point of diminishing returns. If IPC is so easily improved Intel would not be standing alone towering above others. The others aren't there because it's not easy.

"stagnation of cores"- I doubt Intel has seen huge spikes of demand on their 6c/12t CPU to justify their whole line moving there. The Mass market is agreeable with 2c/4c so that's where the meat will go.
 

Revolution 11

Senior member
Jun 2, 2011
952
79
91
I am not blaming Intel for the IPC or core-count stagnation. They are making what the market demands and needs. I didn't know that solder was the cheaper option, so my bad there.

But TSX not being present on all chips is silly. If you want to increase adoption of a new standard/instruction, include it on all your products. Otherwise software will always target the lowest common denominator (SSE2, for example). Excluding TSX from certain chips just delays the adoption of a feature unique to Intel.
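To spell out why that hurts adoption (a hypothetical C sketch using GCC's cpuid.h helper and the RTM intrinsics; compile with -mrtm): every shipped binary has to probe for TSX at runtime and carry a non-TSX fallback anyway, so the fewer chips that have the feature, the less the new path ever executes.

```c
#include <cpuid.h>       /* __get_cpuid_count (GCC helper) */
#include <immintrin.h>   /* _xbegin/_xend RTM intrinsics */

/* CPUID.(EAX=7,ECX=0):EBX bit 11 reports RTM support. */
static int has_rtm(void) {
    unsigned a, b, c, d;
    if (!__get_cpuid_count(7, 0, &a, &b, &c, &d))
        return 0;
    return (b >> 11) & 1;
}

/* Hypothetical hot-path update: use a hardware transaction when the
 * chip has TSX, otherwise fall back to the plain atomic baseline that
 * every shipped x86 binary must support anyway. */
void add_sample(long *counter, long value) {
    if (has_rtm()) {
        if (_xbegin() == _XBEGIN_STARTED) {
            *counter += value;
            _xend();
            return;
        }
        /* transaction aborted: fall through to the fallback */
    }
    __sync_fetch_and_add(counter, value);
}
```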

GT3e is probably very expensive for Intel, but it doesn't make sense for the price to be so high. Of course the markets will adapt to whatever the price is, but if Intel is serious about integrated graphics, they can't have their best integrated solution priced out of the larger markets. If Intel is scared of their weak iGPUs versus competitors like AMD/ARM, then they can't price them so high. If they are not scared, why waste the die space at all? Doing this half-heartedly is the problem.
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Ahh, so nice to see other forums having something positive to say for a change; I thought I was going to have to refuse shipment!

Sounds like Haswell is too hot, right? Not necessarily. It totally depends on what you want to do with it. If you just use your computer for Prime95, then you're going to live in the 4.0-4.3GHz range due to heat. But if you just play games and browse the web, your temps are going to be fine:

Max core temp after 2 hours of Team Fortress 2 (32-man custom maps) @ 4.5GHz and DDR @ 2133: 62C.
Max core temp after 3 hours of Planetside 2 @ 4.7GHz and DDR @ 1600: 64C.

I think this is great. Planetside 2 hasn't dipped below 50 fps yet, whereas my i7-930 @ 4GHz dipped to 31 fps at times. And the lowest I saw in TF2 was 90 fps for a brief moment, otherwise staying around 120 fps during battle (299 fps otherwise), where my i7-930 used to dip into the 50s. Even more interesting, my i7-930 would hit peak temps of 79C. So as far as I'm concerned, Haswell gives me ~1.5x the performance AND runs cooler than my old CPU in the games I play. Win-win. If I happen to run into a Prime95 scenario, then it'll just get very hot and throttle down to 4.4GHz. But it won't crash. Heck, I'm considering going for 4.8GHz or more now since my games have so much thermal headroom.

I won't be playing Prime95, and if games pick up AVX2 I won't need an overclock anyways. :thumbsup:
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
Really, I have no idea. I just remember that when we were discussing Sandy Bridge so long ago, it was stated and then repeated that games generally do not use much floating point since it's much slower than integer.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,343
10,046
126
Really, I have no idea. I just remember that when we were discussing Sandy Bridge so long ago, it was stated and then repeated that games generally do not use much floating point since it's much slower than integer.

Quite the contrary: most 3D engines use floats pretty heavily. That, and cache.

Edit: Generally SSE/SSE2 single-precision floats, not FPU x87 floats. At least, I think so. I didn't do the core engine design, but I worked at a game company for almost a year.
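For a flavor of what that looks like (a made-up minimal example in C with SSE single-precision intrinsics, not code from any actual engine):

```c
#include <xmmintrin.h>  /* SSE: packed single-precision floats */

typedef struct { float v[4]; } vec4;

/* Typical engine-style math: out = a + b*s computed four floats at a
 * time with SSE packed single-precision ops instead of x87. */
vec4 vec4_madd(vec4 a, vec4 b, float s) {
    __m128 va = _mm_loadu_ps(a.v);
    __m128 vb = _mm_loadu_ps(b.v);
    __m128 vs = _mm_set1_ps(s);
    vec4 out;
    _mm_storeu_ps(out.v, _mm_add_ps(va, _mm_mul_ps(vb, vs)));
    return out;
}
```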
 
Last edited:

inf64

Diamond Member
Mar 11, 2011
3,698
4,018
136
In order to utilize AVX(2), games need to be specially optimized. This takes time and money and extends development schedules. Game devs code to some baseline, and I doubt they will screw over all those C2Q, Westmere, Nehalem, and pre-BD/PD AMD users. So I think AVX-enabled/optimized games won't arrive in the next couple of years (unless console devs start to use it, since Jaguar supports AVX).
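One way around the baseline problem, sketched with GCC's function multi-versioning (an illustrative assumption on my part, requiring GCC 6+ on x86 with glibc, not something any game dev has announced): the compiler emits both a baseline clone and an AVX2 clone and picks one at load time via CPUID, so a single binary still runs on those older machines.

```c
/* GCC emits a baseline version and an AVX2 version of this function
 * and resolves the right one at program load, so the same binary
 * covers C2Q/Nehalem-era machines and Haswell alike. */
__attribute__((target_clones("avx2", "default")))
void scale_positions(float *x, int n, float s) {
    for (int i = 0; i < n; i++)
        x[i] *= s;
}
```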