Switching speeds and leakage down but frequency ~ constant... why?

Hulk

Diamond Member
Oct 9, 1999
5,145
3,746
136
Looking at various Intel slides which show transistor switching speed and leakage it appears as though leakage has decreased with every process generation from 65>45>32>22>14nm and transistor switching speed has increased. Why have we pretty much been stalled at 4GHz or even 5GHz if you include overclocking through all of these process generations?

The only thing I can think of (and it's not my field) is that the smaller dimensions of the parts have made it more difficult to remove heat: the heat generated by the part is concentrated in a very small area, so if heat transfer coefficients remain the same, then even though there is less heat with each generation there is also less area to transfer it away from the die, and the end result is that the parts are thermally constrained?

If this was the case then super cooled parts should be getting faster with each process shrink right? I'm thinking this because the temperature delta in this case allows for very high heat transfer.
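That heat-flux argument can be made concrete with a back-of-the-envelope sketch. Every wattage and die area below is invented purely to illustrate the trend; these are not Intel figures.

```python
# Illustrative heat-flux comparison across a hypothetical die shrink.
# All wattages and areas are made up; only the direction of the ratio matters.

def heat_flux(power_w, die_area_mm2):
    """Heat flux in W/mm^2: total power spread over die area."""
    return power_w / die_area_mm2

# Hypothetical older part: more power, but much more die area.
old = heat_flux(power_w=95.0, die_area_mm2=216.0)
# Hypothetical shrunk part: somewhat less power, far less area.
new = heat_flux(power_w=84.0, die_area_mm2=122.0)

print(f"old: {old:.2f} W/mm^2, new: {new:.2f} W/mm^2")
print(f"heat flux rose by {(new / old - 1):.0%} even though total power fell")
```

With the same heat-transfer coefficient, higher flux means a larger temperature delta at the hotspot, which is exactly the thermally-constrained intuition above.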

Also, if Intel were to make a financially and power unconstrained processor, which process size would lead to the fastest part? Of the recent ones they've gone through?

For example, would a 45nm Haswell part be faster or slower than a 22nm Haswell part? Assuming best yields for each process?
 

Lepton87

Platinum Member
Jul 28, 2009
2,544
9
81
Heat density certainly matters but there's still room to increase it even further.
 

III-V

Senior member
Oct 12, 2014
678
1
41
Looking at various Intel slides which show transistor switching speed and leakage it appears as though leakage has decreased with every process generation from 65>45>32>22>14nm and transistor switching speed has increased. Why have we pretty much been stalled at 4GHz or even 5GHz if you include overclocking through all of these process generations?
We stalled only at 22nm, where Intel's data also showed delay times hadn't budged much at the voltage levels used for overclocking. Every subsequent generation did bring an improvement.

The regression from fluxless solder to polymer TIM is another substantial factor.
The only thing I can think of (and it's not my field) is that the smaller dimensions of the parts have made it more difficult to remove heat: the heat generated by the part is concentrated in a very small area, so if heat transfer coefficients remain the same, then even though there is less heat with each generation there is also less area to transfer it away from the die, and the end result is that the parts are thermally constrained?
A lot of tech journalists have incorrectly pinned the blame on higher thermal density. Thermal density has been increasing for ages now... if it were truly an issue at 22nm where it hadn't been before, semicos would be panicking, because subsequent generations would exhibit this regression at stock frequencies. Seeing as they're not, the thermal density hypothesis is clearly wrong.
Also, if Intel were to make a financially and power unconstrained processor, which process size would lead to the fastest part? Of the recent ones they've gone through?
32nm, and 14nm would probably be even better (data isn't available on that yet). 22nm regressed slightly. This is largely due to its use of tapered fins, which are generally regarded as achieving lower drive current at a given voltage than rectangular fins.
For example, would a 45nm Haswell part be faster or slower than a 22nm Haswell part? Assuming best yields for each process?
Slower. 22nm is only a slight regression from 32nm, but it is undeniably an improvement over 45nm for overclocking.

I have an article on the subject that explains the regression at 22nm in detail, however my site is under construction and is not ready to go public at this time.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
There was a regression in speeds at 22nm because

[image: schmoo_transistor.png]


the FinFET curve is different. Other reasons could be architectural and node-development decisions, although I don't think that's the reason. I just think silicon hit some wall, and going above it results in a steep, roughly cubic increase in heat. 14nm is indeed superior to 22nm, so there should be a minor improvement in clock speed, and if we go beyond silicon at 10nm, I expect another decent improvement.
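The "roughly cubic" heat claim can be sketched with the standard dynamic-power relation P = C·V²·f, under the illustrative assumption that the voltage needed to close timing rises linearly with target frequency (the capacitance and V-f numbers below are invented):

```python
# Sketch of why power grows ~cubically with frequency near the wall.
# Dynamic power: P = C * V^2 * f. Assume (purely for illustration) that
# the voltage required to hit timing rises linearly with frequency.

C = 1e-9  # effective switched capacitance in farads (made-up value)

def required_voltage(f_ghz):
    # Hypothetical linear V-f curve: 0.4 V floor plus 0.2 V per GHz.
    return 0.4 + 0.2 * f_ghz

def dynamic_power(f_ghz):
    v = required_voltage(f_ghz)
    return C * v**2 * (f_ghz * 1e9)  # watts

p4 = dynamic_power(4.0)
p5 = dynamic_power(5.0)
print(f"4 GHz: {p4:.2f} W, 5 GHz: {p5:.2f} W, ratio {p5 / p4:.2f}x")
```

A 25% frequency bump costs about 70% more power in this toy model, because both the f term and the V² term grow together.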

Another reason (maybe there isn't just one silver bullet but multiple factors) could be the interconnect: wires actually get worse when you scale them down. Another thing to consider is that although density continues to go up at Moore's Law's pace, the transistor itself has stayed remarkably similar: the gate isn't that much smaller than it was at 90nm:

[image: Wrk82Ip.jpg]


And if you don't scale the gate down, clock speeds won't increase. There are also other components of the transistor, like the high-k dielectric, that haven't shrunk, although I do not know if that impacts transistor speed. But some people are optimistic that FinFETs bring back Dennard scaling because they allow the gate length to become smaller again (does anyone know the gate length at 22 and 14nm?):

[attachment: 1951338]

[image: lg_scaling.jpg]



In any case, the gate must become smaller if Intel wants to scale for another 10 years like Mark Bohr claimed they would!
 

know of fence

Senior member
May 28, 2009
555
2
71
There are two things to differentiate clearly here: the large majority of CPUs got slower (lower clocked) by choice, for many reasons such as mobile, yields, parallelism, efficiency, and appearance over function.

Got to be careful not to confuse this with the question of why top-of-the-line CPUs don't clock any higher.

But ultimately we know both of these things are related; it's possible to design circuitry either for speed or for density. (Richland > Kaveri, Sandy > Ivy)
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Wire delay isn't really improving and could even be getting worse, and it's a large component of overall propagation delay, which limits clock speed. Although the physical distance between transistors has decreased, the wire width has also decreased, which increases wire resistance and works against delay.
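A first-order sketch of that point, with made-up dimensions: if a wire's length, width, and thickness all shrink by the same factor, resistance rises by exactly as much as capacitance falls, so RC delay doesn't improve at all even as the transistors around it get faster (and real thin wires are worse than this bulk-resistivity model suggests).

```python
# First-order RC wire-delay model: delay ~ R * C.
# R = rho * length / (width * thickness); C is roughly proportional to length.
# Scale every dimension by s < 1 and see what happens to the delay.

RHO = 1.7e-8  # copper resistivity, ohm*m (bulk value; thin wires are worse)

def wire_delay(length, width, thickness, cap_per_m=2e-10):
    r = RHO * length / (width * thickness)   # total wire resistance, ohms
    c = cap_per_m * length                   # total wire capacitance, farads
    return r * c  # seconds (distributed-RC prefactor omitted for simplicity)

base = wire_delay(length=100e-6, width=100e-9, thickness=200e-9)
s = 0.7  # one "shrink": all three dimensions scaled by 0.7
shrunk = wire_delay(length=100e-6 * s, width=100e-9 * s, thickness=200e-9 * s)
print(f"delay ratio after shrink: {shrunk / base:.2f}")  # -> 1.00, no gain
```

Length falls by s, but the cross-section falls by s², so resistance goes up by 1/s; the two effects cancel in the RC product.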
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,696
4,658
75

I think this, and in particular this:

[image: cpu-frequency-image5.jpg]


are terrible examples. Intel knows well that there are several methods for getting around this issue. Indeed, they already have instructions like this, such as multiply.

First, large instructions can sometimes be pipelined. While some circuits are working on the previous large instruction, others can be starting on the next one. Of course, this doesn't always work, but it can be worth investigating.
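That overlap can be shown with a toy cycle count (the latency and instruction counts are invented for the sketch):

```python
# Toy model of pipelining a "large" 3-cycle operation.
# Unpipelined: each op occupies the unit for all 3 cycles.
# Pipelined: 3-cycle latency, but a new op can start every cycle.

def total_cycles(n_ops, latency, pipelined):
    if pipelined:
        return latency + (n_ops - 1)  # fill the pipe once, then one per cycle
    return latency * n_ops

n = 100
print(total_cycles(n, latency=3, pipelined=False))  # 300 cycles
print(total_cycles(n, latency=3, pipelined=True))   # 102 cycles
```

Latency per instruction is unchanged, but throughput approaches one per cycle, which is why pipelined multipliers are standard.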

Second, a second execution unit can be added so that two of these large instructions can be done at once. There are eight execution ports in Haswell, and at least four of them can do useful math, as opposed to loads, stores, and branches.

Third, if there isn't chip space to add a second EU, the first EU can be clocked twice as fast as the rest of the chip. The Pentium 4 did this with its ALUs. (I know that isn't a selling point, but it isn't the main reason the chip was widely regarded as a failure.) Pentium 4s have been pushed to 3, 4, 6, even 8GHz, so those little ALUs got up to 16GHz! Using more space rather than higher clocks is probably preferable in most cases, but this is a workable option.

Fourth, even if none of the above works, the chip doesn't have to be working on the same 4 instructions every cycle. Some software might have an independent loop nearby that the chip could work on. Plus, with hyper-threading, the chip can even work on a completely different program while the first program is stalled on that large instruction.

As to the OP's question, I'd say it's because high-power chips don't sell well these days. They eat too much battery and get too hot in the laptops, tablets, and phones that users and companies seem to be targeting. They don't work well for servers either: racking up two lower-power chips tends to get more overall work done. Only a few enthusiasts seem to want high-power chips, and that's not enough for Intel to steer their manufacturing process in that direction.
 

bronxzv

Senior member
Jun 13, 2011
460
0
71
I think this, and in particular this:
First, large instructions can sometimes be pipelined. While some circuits

note that all pictures at your link illustrate pipelined execution

the picture in your post isn't about a "large instruction" but about a large pipeline stage relative to the other stages (i.e. the stage with the largest FO4 delay, see [1]); the only solution to raise frequency without bubbles in this case is to split the stage in two (or more), and I suppose we don't want a Prescott-like design again

the fact that you can put several pipelines in parallel (with SIMD and/or replicated execution units) is a completely orthogonal issue

[1] http://www.realworldtech.com/fo4-metric/
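The stage-splitting point can be sketched with a toy timing model: cycle time is the longest stage's logic delay plus a fixed latch/clock overhead paid per stage, so frequency gains diminish as the pipeline deepens. All delay numbers below are arbitrary FO4-ish units chosen purely for illustration.

```python
# Toy model of splitting the critical pipeline stage.
# Cycle time = (logic delay of longest stage) + (fixed latch/clock overhead),
# so doubling the stage count less-than-doubles frequency.

LOGIC = 24.0    # total logic depth of the critical path (arbitrary units)
OVERHEAD = 3.0  # latch/clock overhead paid once per stage (arbitrary units)

def freq(stages):
    cycle = LOGIC / stages + OVERHEAD
    return 1.0 / cycle  # relative frequency units

for s in (1, 2, 4, 8):
    print(f"{s} stage(s): relative frequency {freq(s) / freq(1):.2f}x")
```

Eight-way splitting buys only about 4.5x here because the per-stage overhead never splits, which is the deep-pipeline (Prescott-style) trap.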
 

videogames101

Diamond Member
Aug 24, 2005
6,783
27
91
Modern transistors are really REALLY fast. If you have the power and area budget, you can easily make significantly faster CPUs than Intel sells. Look at IBM's server chips: 5.5GHz is their standard for 6-core CPUs. It takes quite a bit more hand-layout work to make that happen. These chips could be even faster, but it turns out heat/power is the number one constraint, so no one designs 10GHz CPUs despite the fact that you almost certainly could. It's just that it would take a ton of people to do the design, and it would need LN2.

The manual layout work is mostly because of wire delay, as someone mentioned. It takes a lot of work to make lines function at speeds like 5GHz+, especially when you are trying to use minimum pitch to pack X wires into Y area, but it just won't work and you have to add fancy shielding and extra width on your wires, and buffer every signal going pretty much any distance. It all makes designing fast CPUs hard.
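The buffering point can be sketched with a distributed-RC model (all constants below are invented): an unbuffered line's delay grows with the square of its length, while inserting repeaters makes it roughly linear, at the cost of area and design effort.

```python
# Why long lines need repeaters: unbuffered distributed-RC delay grows as
# length^2, while breaking the wire into buffered segments makes it ~linear.
# All per-unit constants are made up for the sketch.

R_PER_M = 1e5      # wire resistance per meter, ohms (made-up)
C_PER_M = 2e-10    # wire capacitance per meter, farads (made-up)
BUF_DELAY = 1e-12  # fixed delay of each inserted buffer, seconds (made-up)

def unbuffered(length):
    # Distributed-RC delay: 0.5 * R * C, quadratic in length.
    return 0.5 * R_PER_M * C_PER_M * length**2

def buffered(length, segments):
    # Split into equal segments, each driven by its own buffer.
    return segments * (unbuffered(length / segments) + BUF_DELAY)

L = 2e-3  # a 2 mm line
print(f"unbuffered: {unbuffered(L) * 1e12:.0f} ps")   # 40 ps
print(f"10 buffers: {buffered(L, 10) * 1e12:.0f} ps") # 14 ps
```

Squaring vs. linear is why "every signal going pretty much any distance" ends up with buffers in a high-clock design.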
 

jpiniero

Lifer
Oct 1, 2010
16,830
7,279
136
I don't see clock speeds getting better, and possibly getting worse, at least for Intel's stuff, until one of these happens:
- A new material replaces silicon.
- They replace Core with a radically altered design that enables higher speeds.

Considering Intel is all about perf/watt right now, I don't think they would hesitate for a second to reduce the theoretical top clock speed if it meant a nice gain in perf/watt or density. Just throw in a couple of extra cores to make the server customers happy.

The curious thing is what the "95 W" Skylake's clock speeds will be, unless they are already starting to have heat-density problems at 14nm at 4GHz+. I'm presuming that the 65W Broadwell-K i7 will be 3.6/4.