THANK YOU for that link! I read this article shortly after it was published, and somehow in the years since then I forgot enough of the details that I could not track down the article when I went looking for it a year or two ago. I could have sworn the originating authors were Intel employees; now I see why all my re-searches for that article always turned out so fruitless.
The article itself is just an exquisite tour de force of applied physics and information theory. What's not to like or appreciate about that?
The fundamental limits entertained and discussed in the article are correct, of course, but there are some assumptions made in their application to CMOS scaling. The authors rightly touch on a number of these assumptions in their "reality check" section.
IV. REALITY CHECK: COMPARISON WITH THE 2001 ITRS

Present day and projected silicon integrated circuits differ from the above model in several respects. First, the packing density is less than n_max, since the effective size of a FET switch is larger than the channel length (in practice, 10-15x). Second, in integrated circuits, there exist many layers of interconnects that dissipate energy and also require some floorspace. It is well known that the minimum energy dissipated by interconnects for successful signal transmission is also k_B*T*ln2. Therefore, taking into account that a part of chip area is occupied by interconnect does not substantially change the estimate for minimum power. Third, not all switches in the circuit change their state simultaneously; in other words, the activity factor is less than 100%.
Note that both the packing density and the switching rate are assumed to be at their theoretical limits throughout the calculations in the article.
There are logic circuits, e.g. SRAM, which certainly are pushed to have packing densities at the limit of the process technology design rules and capability for a given node. But many other logic circuits are set up to maximize performance, not necessarily to minimize die area as the first priority.
And then there is the matter of the number of transistors that are actually switching... I think CPUs follow along the lines of human brains, where only something like 10% of the circuits are active at any given moment in time. The number might be higher for microprocessors, but the point is that this gives you another order of magnitude of play in your transistor counts before reaching the same limits.
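To put rough numbers on that slack, here is a back-of-the-envelope sketch in the spirit of the article's model: energy per switching event at the Landauer limit (k_B*T*ln2), switching time from the Heisenberg relation, and a minimum device size of ~1.5 nm. The specific figures (1.5 nm, a 12x effective-footprint factor, 10% activity) are my illustrative assumptions, not values taken from the article, but they show how packing density and activity factor buy back orders of magnitude in power density.

```python
import math

# Physical constants (SI units)
k_B = 1.380649e-23       # Boltzmann constant, J/K
hbar = 1.054571817e-34   # reduced Planck constant, J*s
T = 300.0                # room temperature, K

# Landauer limit: minimum energy per binary switching event (~2.9e-21 J)
E_bit = k_B * T * math.log(2)

# Assumed minimum device size and Heisenberg-limited switching time
x_min = 1.5e-9           # m, illustrative minimum feature size
t_min = hbar / E_bit     # s, minimum switching time (~4e-14 s)
n_max = 1.0 / x_min**2   # devices per m^2 with everything packed at the limit

# Power density with packing density AND switching rate at their limits
p_limit = n_max * E_bit / t_min   # W/m^2

# Back off toward a real chip: effective device footprint ~12x the
# channel length (so 1/144 the limiting density), ~10% activity factor
packing_fraction = 1.0 / 12.0**2
activity_factor = 0.10
p_realistic = p_limit * packing_fraction * activity_factor

print(f"at the limit: {p_limit / 1e4:.2e} W/cm^2")
print(f"adjusted:     {p_realistic / 1e4:.2e} W/cm^2")
print(f"relief:       {p_limit / p_realistic:.0f}x")
```

With these assumed inputs the fully-limited case lands in the MW/cm^2 range, and the two correction factors together relax it by more than three orders of magnitude, which is the kind of headroom being argued for above.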
Another thing you probably noticed is that the 22nm node and 9nm gate numbers are used simply because they were targets printed in the ITRS. HKMG alleviates the necessity of scaling Lg more aggressively than the node label (once again, as it was pre-0.35um); for most 22nm process nodes, if they are HKMG-based, their minimum Lg will be right around 20nm, maybe 15nm for your more aggressive guys. An Lg of 9nm like they worried themselves about in the article probably won't be used until the 11nm node.
Engineering of the electron and hole mobilities by stress manipulation and materials choices (gate stack as well as channel) has a LOT of gas and room to go from a materials science engineering viewpoint. The economics of scaling will dictate the pace of scaling, not the physics of the length-scales involved.
Power consumption requirements are driving real changes in the approaches to process technology as well as device integration. Future nodes may focus on cost reduction at the expense of performance enhancement, or on performance enhancement at the expense of cost reduction, but not both.
Are you familiar with the history of the "tyranny of numbers" that was used to characterize the computing era prior to the invention of the IC? I think we ultimately will end up back at that point. We will once again be dealing with the tyranny of numbers: integrating thousands and tens of thousands of discrete ICs (each themselves operating at the limits of physics outlined in your linked article) at a macro-scale into a computing infrastructure that harnesses the combined computing power while simultaneously distributing the power dissipation over an area orders of magnitude larger than the chip sizes themselves.
Much like the distributed computing, internet, and supercomputer architectures of today.
Thanks again for that link; I sorely missed having the liberty of reviewing that article to refresh my memories over these past few years.
