I'm confused about Intel's 14nm process lead

Page 2 - AnandTech Forums

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Ok, so we've all heard the Intel crowd's opinion now.


Was that really necessary? Did that add anything to the discussion besides stirring the partisan pot?


But what is the reason Samsung decided to call their process tech 14 nm in the first place? Surely they must have some logical explanation for that which they convey to the public (whether you agree with it or not)? Does anyone know how they justify calling it 14 nm? And similar for TSMC 16FF/16FF+?

Node labels are just that - labels. They are completely arbitrary. They could call their next node "Ralph" and it would mean just as much as 14nm.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Maybe? It is a marketing term, after all. Absent a definitive statement from Samsung or TSMC, there is no real way to know how they arrived at their naming conventions. But the available evidence points to naming driven more by marketing convenience than by process performance/density. Hell, if I remember correctly, the processes (at least for TSMC) were renamed midway through development.
I think this is what BK meant when he said this infamous line:

“We felt like we went on a little early with 14nm as far as timing and performance and features and we saw actually competitors adjust to that. So we're gonna be a little bit more prudent, a little smarter about signaling to the industry exactly when, what and where. And you'll have to trust a little bit the 50 year history we have with Moore's Law and that we should be able to keep it going for 51 or 52 years. So we're gonna be a little careful there about that signaling exactly when, what and where.” --Brian Krzanich, CEO Intel, IM’14
I also like:

“It's a true [Intel] 14nm technology. There's lots of 14nm technologies around, and they're not all created equal.” --William Holt, Intel, IM’14
 

SOFTengCOMPelec

Platinum Member
May 9, 2013
2,417
75
91
Maybe they should have a more measurable metric, such as maximum transistor density per square cm. Maximum, because I believe that transistor density (per square cm) varies, even for the same process, depending on what is being designed. E.g. I think that SRAM (cache) has the highest packing density, compared to CPUs (minus their cache).

EDIT:
Or get a trusted, independent third party to measure/determine/decide the true nm dimensions - like IEEE standards and similar. Or at least create a framework for strictly defining how the nm size is determined.

EDIT2:
There was a time that I remember, probably in the 1980s/1990s, when people used to have/like/want stereo ("ghetto blaster") hi-fi systems.



What happened is that marketing/advertising would come up with the most ridiculous schemes and figures for the music power (watts) of the system.

E.g. tiny 5W (true RMS) systems would be classed as 200-watt mega blasters by dodgy marketing companies. At some point, it got really silly.
 
Last edited:

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
10nm was already effectively done by IDF12, but so what? That's what it means to be 4 years ahead. They're discussed as such because Intel totally wanted to give away its secret roadmap 2 years before they actually intended to do that (irony)!

Intel deeply cares about high transistor performance, certainly because 10nm was developed under P. Otellini's tenure, and since they've been researching III-V for a decade, I will take any other claim with a grain of salt until I see the evidence.

Saying that 10nm is effectively done means that they'd know at that point if it included III-V and/or Germanium, and wouldn't include those on a list of possible post-10nm technologies. I think the argument is pretty clear.
 
Mar 10, 2006
11,715
2,012
126
Maybe they should have a more measurable metric, such as maximum transistor density per square cm. Maximum, because I believe that transistor density (per square cm) varies, even for the same process, depending on what is being designed. E.g. I think that SRAM (cache) has the highest packing density, compared to CPUs (minus their cache).

EDIT:
Or get a trusted, independent third party to measure/determine/decide the true nm dimensions - like IEEE standards and similar. Or at least create a framework for strictly defining how the nm size is determined.

EDIT2:
There was a time that I remember, probably in the 1980s/1990s, when people used to have/like/want stereo ("ghetto blaster") hi-fi systems.



What happened is that marketing/advertising would come up with the most ridiculous schemes and figures for the music power (watts) of the system.

E.g. tiny 5W (true RMS) systems would be classed as 200-watt mega blasters by dodgy marketing companies. At some point, it got really silly.

There's a reason people publish SRAM cell sizes...

Intel's 14nm high-density SRAM cell sizes are far smaller than those of Samsung/TSMC 14/16nm.
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
I thought that Intel's 14nm process was many years ahead of the competition - that it would be at least 5 to 7 years before the competitors could put (approximately) 14nm onto the market.

Intel's main 14nm desktop CPU seems to be Skylake, which has not even been released yet (as far as I know; I do know Broadwell is coming out a bit sooner, but Skylake is probably the CPU to aim for if performance is your primary objective).

So how come the potential competitors (such as Samsung and TSMC) are so close to 14nm large-scale production?

If you are curious, THIS THREAD initiated my question. I did not put it into that thread, because someone was objecting to mentioning INTEL in that thread; i.e. I thought it was better to ask the question in a new thread.

Second part to the question: has Intel possibly had their 14nm process IP "leaked out", possibly in a similar way to the Samsung/TSMC issue?

TSMC's 20nm process has an effectively 50% higher transistor density
than Intel's 14nm process, if you compare Apple SoCs against Intel SoCs.

TSMC's 16nm FF+ process will increase that to approximately 70%.

Code:
TSMC 20nm process
-------------------------------------------------------------
Apple A8       89mm2  2B   ---> 22.5 Million transistors/mm2
Apple A8x     128mm2  3B   ---> 23.5 Million transistors/mm2

Intel 14nm process
-------------------------------------------------------------
Broadwell Y    82mm2  1.3B ---> 15.9 Million transistors/mm2
Broadwell U   133mm2  1.9B ---> 14.3 Million transistors/mm2
There are a number of reasons:

Intel uses a simplified type of interconnect with only horizontal
or vertical lines on a layer. The rest of the industry still manages
to use "2D" interconnect, which can have '+' crossings or 'T'
crossings.

The latter is much harder to achieve. When Intel introduced its
simplified interconnect structure at SPIE in March 2006, nobody
believed that it would even be possible to do '2D' interconnect at
the current nodes with 193nm lithography.

Nevertheless, a combined effort of the industry has resulted in
64nm 2D interconnect, and it seems that's not the end yet, since
TSMC is now talking about 40nm 2D interconnect at its 10nm
node.

Code:
TSMC 20nm      Intel 14nm 
---------      ----------
64 nm  2D      56 nm  1D      layer 1
64 nm  2D      70 nm  1D      layer 2
64 nm  2D      52 nm  1D      layer 3


The interesting question is whether Intel will return to doing 2D interconnect
in the future.
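Side note for readers: the transistors/mm2 figures in the table at the top of this post are just the reported transistor count divided by the die area. A quick sketch reproducing them (counts and areas as quoted above; keep in mind the vendor-reported counts are themselves rounded figures):

```python
# Transistor density = reported transistor count / die area.
# Counts are in millions of transistors, die areas in mm^2, as quoted above.
chips = {
    "Apple A8 (TSMC 20nm)":     (2000, 89),
    "Apple A8X (TSMC 20nm)":    (3000, 128),
    "Broadwell Y (Intel 14nm)": (1300, 82),
    "Broadwell U (Intel 14nm)": (1900, 133),
}

for name, (mtr, area_mm2) in chips.items():
    print(f"{name}: {mtr / area_mm2:.1f} Mtr/mm^2")
```

(The A8X works out to 23.4 rather than the 23.5 in the table; the small difference is just rounding in the reported 3B count.)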
 
Last edited:
Mar 10, 2006
11,715
2,012
126
TSMC's 20nm process has an effectively 50% higher transistor density
than Intel's 14nm process, if you compare Apple SoCs against Intel SoCs.

TSMC's 16nm FF+ process will increase that to approximately 70%.

Code:
TSMC 20nm process
-------------------------------------------------------------
Apple A8       89mm2  2B   ---> 22.5 Million transistors/mm2
Apple A8x     128mm2  3B   ---> 23.5 Million transistors/mm2

Intel 14nm process
-------------------------------------------------------------
Broadwell Y    82mm2  1.3B ---> 15.9 Million transistors/mm2
Broadwell U   133mm2  1.9B ---> 14.3 Million transistors/mm2

There are a number of reasons

Code:
TSMC 20nm      Intel 14nm 
---------      ----------
64 nm  2D      56 nm  1D      layer 1
64 nm  2D      70 nm  1D      layer 2
64 nm  2D      52 nm  1D      layer 3

Comparing a CPU design intended to hit >3GHz on the CPU and ~1GHz on the GPU with a design jam-packed with blocks that run at very low clock speeds, the highest of which is around 1.4GHz (Cyclone CPU cores)?

Also, that's just the particular metal stack choice for Broadwell; chip designers can choose to use more of the tighter pitch layers in lower frequency SoC designs than they do in the high clock speed CPU designs.

 
Last edited:

lopri

Elite Member
Jul 27, 2002
13,329
709
126
Considering that 22nm Atoms barely compete with 28nm Snapdragons, I would say Intel is like 2 years behind. :biggrin:

Finally, why should the name matter? Don't we know enough of the density/performance characteristics of each process to know which one is generally "better"? Or at least better in a particular category? The name is just a distraction at this point.
Agreed 110%. Granted, this is a tech forum and I like to discuss technical merits, but many experts here seem to conclude that the terms are arbitrary and marketing-driven, and cannot seem to agree even on how to measure and compare different processes. (Understandable, really ^^)
 
Last edited:

imported_ats

Senior member
Mar 21, 2008
422
64
86
TSMC's 20nm process has an effectively 50% higher transistor density
than Intel's 14nm process, if you compare Apple SoCs against Intel SoCs.

TSMC's 16nm FF+ process will increase that to approximately 70%.

Code:
TSMC 20nm process
-------------------------------------------------------------
Apple A8       89mm2  2B   ---> 22.5 Million transistors/mm2
Apple A8x     128mm2  3B   ---> 23.5 Million transistors/mm2

Intel 14nm process
-------------------------------------------------------------
Broadwell Y    82mm2  1.3B ---> 15.9 Million transistors/mm2
Broadwell U   133mm2  1.9B ---> 14.3 Million transistors/mm2

It should probably also be pointed out that comparing transistors/mm2 is a rather poor metric, as there are multiple different definitions of transistors/mm2! Transistors/mm2 also varies greatly depending on many other metrics within a design. Just as a simple example of the issues: do you consider a double-drive transistor to be one transistor or two? Sometimes two NMOS transistors in double drive are more efficient (for some metric of efficient) than a single double-strength transistor. These get baked into standard cells, and then by one metric of transistors/mm2 you end up with 2x the transistors but no real difference in practice.
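To make the counting ambiguity concrete, here is a toy illustration (all numbers are hypothetical, not taken from any real design): the same block counted under a "drawn transistors" convention versus a "logical devices" convention gives noticeably different densities, even though the silicon is identical.

```python
# Hypothetical block: 100M logical devices in 10 mm^2, where 30% of the
# devices are standard cells implemented as double-drive pairs (two drawn
# transistors standing in for one double-strength transistor).
logical_devices_m = 100.0   # millions of logical devices
double_drive_frac = 0.30
area_mm2 = 10.0

# "Drawn" count: each double-drive device contributes two transistors.
drawn_m = (logical_devices_m * (1 - double_drive_frac)
           + logical_devices_m * double_drive_frac * 2)

print(f"logical density: {logical_devices_m / area_mm2:.1f} Mtr/mm^2")  # 10.0
print(f"drawn density:   {drawn_m / area_mm2:.1f} Mtr/mm^2")            # 13.0
```

Same silicon, 30% "more" transistors on paper - which is why bare transistors/mm2 comparisons across vendors say as much about counting conventions as about the process itself.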


Intel uses a simplified type of interconnect with only horizontal
or vertical lines on a layer. The rest of the industry still uses "2D"
interconnect, which can have '+' crossings or 'T' crossings.

The latter is much harder to achieve. When Intel introduced its
simplified interconnect structure at SPIE in March 2006, nobody
believed that it would be possible to do '2D' interconnect at
the current nodes with 193nm lithography.

The latter isn't really harder to achieve, but it does have a significant impact on yield. It should also be pointed out that almost all metal on any process is already unidirectional, and enforcing unidirectional metal has minimal impact on area while greatly improving yield. It's one of the reasons that TSMC's yields have always been inferior to Intel's.
 

SOFTengCOMPelec

Platinum Member
May 9, 2013
2,417
75
91
There are a number of reasons:

Intel uses a simplified type of interconnect with only horizontal
or vertical lines on a layer. The rest of the industry still uses "2D"
interconnect, which can have '+' crossings or 'T' crossings.

The latter is much harder to achieve. When Intel introduced its
simplified interconnect structure at SPIE in March 2006, nobody
believed that it would be possible to do '2D' interconnect at
the current nodes with 193nm lithography.

Nevertheless, a combined effort of the industry has resulted in
64nm 2D interconnect, and it seems that's not the end yet, since
TSMC is now talking about 40nm 2D interconnect at its 10nm
node.

Code:
TSMC 20nm      Intel 14nm 
---------      ----------
64 nm  2D      56 nm  1D      layer 1
64 nm  2D      70 nm  1D      layer 2
64 nm  2D      52 nm  1D      layer 3

I think I have seen a similar effect in other technical areas.
Although it is good that Intel (arguably) are way in front in process nodes, they may be suffering a bit because they are the first (outside of, sometimes, IBM and/or pure research places such as universities) to do these things, so it is, by definition, the Mk1 version of it.
Some time later (maybe 5 years or so), other companies (such as TSMC/Samsung etc.) bring out their own version of such advanced technology. But because it is so much later, technology has moved on, so limitations that were believed to exist when Intel first brought it out have either been found NOT to be limitations, or have been resolved by clever innovation, experimentation, accident, etc.

E.g. valves/tubes (a very old form of electronics) were not especially reliable or good when they originally came out (1930s onwards). But much later, when the Russians made them, manufacturing tolerances/quality/capabilities had improved, so that they were potentially better. (I'm NOT 100% sure about the accuracy of this paragraph. It has been put in to show a principle, and may be wrong.)



Comparing a CPU design intended to hit >3GHz on the CPU and ~1GHz on the GPU with a design jam-packed with blocks that run at very low clock speeds, the highest of which is around 1.4GHz (Cyclone CPU cores)?

Also, that's just the particular metal stack choice for Broadwell; chip designers can choose to use more of the tighter pitch layers in lower frequency SoC designs than they do in the high clock speed CPU designs.


I had not really thought about the switching speed of the transistors (FETs), since I presume that lowering the Rds(on) resistance (i.e. making the FET a bit bigger on-chip) will "speed it up", as it will have a better ability to fight the undesired stray capacitance. So high-frequency CPUs (>3GHz) needing slightly bigger FETs (if I understood what you were saying correctly) makes sense to me.
Presumably Intel works out the best compromise between increasing the FET size to increase its speed, versus the (maybe) increased gate capacitance from doing this, and the other disadvantages of increasing its size.
I'm NOT too knowledgeable about low-level IC FET design, so maybe I'm completely wrong/misunderstanding things here.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
TSMC 20nm process
-------------------------------------------------------------
Apple A8       89mm2  2B   ---> 22.5 Million transistors/mm2
Apple A8x     128mm2  3B   ---> 23.5 Million transistors/mm2

Intel 14nm process
-------------------------------------------------------------
Broadwell Y    82mm2  1.3B ---> 15.9 Million transistors/mm2
Broadwell U   133mm2  1.9B ---> 14.3 Million transistors/mm2

Using # of transistors / area to judge process can be skewed by RF/SRAM numbers, frequency targets, design methodology, "how we count" and as you mentioned, interconnect limitations.
 
Last edited:

SOFTengCOMPelec

Platinum Member
May 9, 2013
2,417
75
91
Using # of transistors / area to judge process can be skewed by RF/SRAM numbers, frequency targets, design methodology, "how we count" and as you mentioned, interconnect limitations.

It can even vary depending on who ends up selling it, I think. If I remember correctly, there was a time when IBM and Cyrix had a joint venture, and they both sold CPUs which were made on EXACTLY the same production line and, in principle (ignoring binning), were 100% identical.

Despite this fact, IBM and Cyrix came up with significantly different specifications for the (otherwise identical) CPU.

In practice, IBM's were probably correctly specified, at slightly lower frequencies, but were almost 100% reliable.

But the Cyrix ones were overrated (effectively being sold as pre-overclocked chips), and tended to be a bit unreliable as a result.

N.B. My analogy is a bit different to what you were talking about, but I thought the basic principle was similar enough to mention it.
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
Comparing a CPU design intended to hit >3GHz on the CPU and ~1GHz on the GPU with a design jam packed with blocks that run at very low clock speeds, the highest of which is around 1.4GHz (Cyclone CPU cores)?

Also, that's just the particular metal stack choice for Broadwell; chip designers can choose to use more of the tighter pitch layers in lower frequency SoC designs than they do in the high clock speed CPU designs.

Cvdj6Sv.png

Well, only 20% of Broadwell U is CPU+L3 running at high frequencies.
The other 80%, mostly iGPU, runs at much lower frequencies and still
the average transistor density is lower.

[die map: Intel 5th Gen Core (Broadwell-U)]
 

SOFTengCOMPelec

Platinum Member
May 9, 2013
2,417
75
91
Well, only 20% of Broadwell U is CPU+L3 running at high frequencies.
The other 80%, mostly iGPU, runs at much lower frequencies and still
the average transistor density is lower.

Isn't that because they have to use the same process for ALL the transistors on the chip (otherwise it would be very expensive to make, because of the huge increase in process steps needed)? So they have to choose to use either all slower, higher-density transistors, or fewer but faster transistors?

I read a good article about this, to do with AMD's APU design decisions and the reasons why it does not clock up to such high frequencies these days (because the higher-density process allowed more GPU transistors).
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
Isn't that because they have to use the same process for ALL the transistors on the chip (otherwise it would be very expensive to make, because of the huge increase in process steps needed)? So they have to choose to use either all slower, higher-density transistors, or fewer but faster transistors?

I read a good article about this, to do with AMD's APU design decisions and the reasons why it does not clock up to such high frequencies these days (because the higher-density process allowed more GPU transistors).

The highest density is achieved with automated layout routing,
typically used for almost everything except high frequency cores.

The latter are partly routed 'by hand' for the highest performance
at the expense of density. The largest part of Broadwell U is
synthesized with automated routing, however, so that doesn't explain
the density either.
 

SOFTengCOMPelec

Platinum Member
May 9, 2013
2,417
75
91
The highest density is achieved with automated layout routing,
typically used for almost everything except high frequency cores.

The latter are partly routed 'by hand' for the highest performance
at the expense of density. The largest part of Broadwell U is
synthesized with automated routing, however, so that doesn't explain
the density either.

That is weird.
Could it be the tick-tock mechanism (i.e. because Broadwell is a process-node improvement, but otherwise basically based on Haswell)? I.e. next generation they will risk bigger architectural changes.
Or maybe some other factor(s), such as yield or something.

When I read about stuff like this, they seem to talk in either/or terms,
i.e. you can have high density (for the whole chip) OR high speed (for the whole chip) OR low power consumption (for the entire chip), implying that there are reasons why it tends to apply across the entire chip, rather than just a section of it.

EXAMPLE:

TSMC's 20nm process technology can provide 30 percent higher speed, 1.9 times the density, or 25 percent less power than its 28nm technology
 
Last edited:

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Maybe they should have a more measurable metric, such as maximum transistor density per square cm. Maximum, because I believe that transistor density (per square cm) varies, even for the same process, depending on what is being designed. E.g. I think that SRAM (cache) has the highest packing density, compared to CPUs (minus their cache).

EDIT:
Or get a trusted, independent third party to measure/determine/decide the true nm dimensions - like IEEE standards and similar. Or at least create a framework for strictly defining how the nm size is determined.

Please don't confuse engineering with marketing.

Papers presented at IEDM do not contain marketing stuff. They're reports of achievements. They can easily be verified, and will be verified, by folks like Chipworks and probably competitors.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Saying that 10nm is effectively done means that they'd know at that point if it included III-V and/or Germanium, and wouldn't include those on a list of possible post-10nm technologies. I think the argument is pretty clear.

Intel does not want to reveal even the slightest information about 10nm until November '15 (except that the cost-reduction trend can be continued/accelerated). That is 2 years after your article. Do you think they think nobody will notice when they don't put III-V on such a slide?

I also don't think such a slide actually exists. All slides are for 10nm and beyond.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Considering that 22nm Atoms barely compete with 28nm Snapdragons, I would say Intel is like 2 years behind. :biggrin:
The performance of a chip is (roughly speaking) determined by its architecture, while its power consumption is determined by the process node. Silvermont is about on par in terms of IPC, but from the little data I've got, I'd say that Silvermont crushes Krait in terms of power consumption.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
It should probably also be pointed out that comparing transistors/mm2 is a rather poor metric, as there are multiple different definitions of transistors/mm2! Transistors/mm2 also varies greatly depending on many other metrics within a design. Just as a simple example of the issues: do you consider a double-drive transistor to be one transistor or two? Sometimes two NMOS transistors in double drive are more efficient (for some metric of efficient) than a single double-strength transistor. These get baked into standard cells, and then by one metric of transistors/mm2 you end up with 2x the transistors but no real difference in practice.


The latter isn't really harder to achieve, but it does have a significant impact on yield. It should also be pointed out that almost all metal on any process is already unidirectional, and enforcing unidirectional metal has minimal impact on area while greatly improving yield. It's one of the reasons that TSMC's yields have always been inferior to Intel's.

How do you know all of this / source?

Thanks for your contribution.
 
Last edited:
Mar 10, 2006
11,715
2,012
126
The performance of a chip is (roughly speaking) determined by its architecture, while its power consumption is determined by the process node. Silvermont is about on par in terms of IPC, but from the little data I've got, I'd say that Silvermont crushes Krait in terms of power consumption.

Micro-architecture has a significant impact on power consumption.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Using # of transistors / area to judge process can be skewed by RF/SRAM numbers, frequency targets, design methodology, "how we count" and as you mentioned, interconnect limitations.

Here's my idea:

When Intel releases SoFIA, the folks from Chipworks should measure the SVM die area, compare it to 22nm Bay Trail-T, and plug in the 1.9x and 2.2x density improvements to get a fairly apples-to-apples comparison of density (a same-logic-silicon comparison).
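The arithmetic behind that check would be simple. A sketch with a placeholder number (the real input would be the Chipworks-measured Bay Trail-T block area; 1.9x and 2.2x are the claimed density improvements mentioned above):

```python
# Hypothetical: area of a logic block as measured on a 22nm Bay Trail-T die
# (10 mm^2 is a placeholder, not a real measurement). Dividing by a claimed
# 14nm density-scaling factor predicts the area the same block "should"
# occupy on SoFIA; comparing against the actual measured area tests the claim.
bay_trail_block_mm2 = 10.0

for claimed_scaling in (1.9, 2.2):
    predicted_mm2 = bay_trail_block_mm2 / claimed_scaling
    print(f"{claimed_scaling}x claim -> predicted 14nm area: {predicted_mm2:.2f} mm^2")
```

If the measured SoFIA block came in well above the predicted area, the density claim would not be holding up in shipping silicon.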
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Well, only 20% of Broadwell U is CPU+L3 running at high frequencies.
The other 80%, mostly iGPU, runs at much lower frequencies and still
the average transistor density is lower.

Are you sure that Apple uses the layout measurement instead of the actual number of transistors? Otherwise, the comparison becomes meaningless.

I actually would like to know the fin count comparison, but we'll never get that of course.