GlobalFoundries 7LP 7nm Leading Performance FinFET process and FX-7 ASIC platform


CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Something also important to remember is rack density. Even if your perf/watt is 10% lower, if you provide double the performance in the same amount of space, you still end up with a favorable TCO.
To put it into Epyc terms, even if it draws 250W for 32 cores at 5GHz, if you can cool that efficiently, you're looking at an amazing amount of performance per socket. Such a solution could be better than 4GHz at 180W in TCO terms due to the rack density it affords.

This makes picking the optimal spot on the frequency/voltage curve extra difficult for servers, and it's why a high-performance node is a boon not just for desktop users, but for servers too.
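As a rough back-of-the-envelope sketch of that trade-off (the 250W/5GHz and 180W/4GHz figures come from the post above; the 42U rack, the 2U nodes, and the assumption that performance scales with cores × clock are all made up purely for illustration):

```c
#include <stdio.h>

/* Naive rack-density comparison. "Performance" is crudely modelled as
 * cores * clock; the wattage/clock numbers are the hypothetical ones from
 * the post, and the 42U rack with 2U nodes is an assumed configuration. */
int main(void) {
    const double nodes_per_rack = 42.0 / 2.0;   /* 42U rack, 2U per node */

    /* Socket A: 32 cores @ 5 GHz, 250 W.  Socket B: 32 cores @ 4 GHz, 180 W. */
    const double perf_a = 32 * 5.0, watts_a = 250.0;
    const double perf_b = 32 * 4.0, watts_b = 180.0;

    printf("A: perf/W = %.2f, perf/rack = %.0f, power/rack = %.0f W\n",
           perf_a / watts_a, perf_a * nodes_per_rack, watts_a * nodes_per_rack);
    printf("B: perf/W = %.2f, perf/rack = %.0f, power/rack = %.0f W\n",
           perf_b / watts_b, perf_b * nodes_per_rack, watts_b * nodes_per_rack);
    return 0;
}
```

With these made-up numbers, A gives up about 10% in perf/W but delivers 25% more performance per rack, which is exactly the kind of trade-off that can still win on TCO when floor space is the scarce resource.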
 

Ajay

Lifer
Jan 8, 2001
16,094
8,116
136
Something also important to remember is rack density. Even if your perf/watt is 10% lower, if you provide double the performance in the same amount of space, you still end up with a favorable TCO.

Well, obviously we won't be seeing double the performance/sq meter of floor space here - otherwise it would be an *easy* decision, especially for urban server ops.
_________________________________________________________________________________________________________

I'd be impressed as all get out if Zen2 could turbo up to 5 GHz. It's early and this is a PR effort by GloFo (obviously, since they left out critical metrics). I do think that a 50% boost in cores/CCX is a straightforward ask - hopefully AMD agrees. As far as IPC, I'm in wait and see mode. Seems like AMD wrung an awful lot of IPC out of Zen already.
 
  • Like
Reactions: Aenra and Sweepr

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
Seems like AMD wrung an awful lot of IPC out of Zen already.

If it's anything like the engineering I do, then they'll have learned an awful lot in the process of bringing it to market.

Things that worked as expected, things that didn't go so well and even things that exceeded expectations somewhat. There will be a load of rough edges to polish.
 

maddie

Diamond Member
Jul 18, 2010
5,204
5,613
136
Well, obviously we won't be seeing double the performance/sq meter of floor space here - otherwise it would be an *easy* decision, especially for urban server ops.
_________________________________________________________________________________________________________

I'd be impressed as all get out if Zen2 could turbo up to 5 GHz. It's early and this is a PR effort by GloFo (obviously, since they left out critical metrics). I do think that a 50% boost in cores/CCX is a straightforward ask - hopefully AMD agrees. As far as IPC, I'm in wait and see mode. Seems like AMD wrung an awful lot of IPC out of Zen already.
That statement reminds me of a question [unanswered] I once asked here.

Is there some theoretical limit to single thread IPC? AFAIK, all we have to go on for high performance X86 is Intel's products and ALL assume that if this is what Intel gets, then that is the best possible at the time.

This reasoning was behind the low expectations for Zen in both IPC and HT. Well, they beat Intel in HT. The unquestioned general assumption is that the most any competitor can do is get close to Intel, not surpass them.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
That statement reminds me of a question [unanswered] I once asked here.

Is there some theoretical limit to single thread IPC? AFAIK, all we have to go on for high performance X86 is Intel's products and ALL assume that if this is what Intel gets, then that is the best possible at the time.

This reasoning was behind the low expectations for Zen in both IPC and HT. Well, they beat Intel in HT. The unquestioned general assumption is that the most any competitor can do is get close to Intel, not surpass them.

The fact is, improving IPC on such large cores is a difficult task and is costly in terms of the power and transistors spent. There is a law of diminishing returns in trying to increase IPC, so AMD and Intel have to be very judicious about where they spend the extra transistors a node jump provides.
 
  • Like
Reactions: Drazick and Aenra

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
Here we have multiple sources confirming that GF is launching 7LP as a high-performance process first.

http://www.anandtech.com/show/11558...nm-plans-three-generations-700-mm-hvm-in-2018

The 7 nm platform of GlobalFoundries is called 7LP for a reason — the company is targeting primarily high-performance applications, not just SoCs for smartphones, which contrasts to TSMC’s approach to 7 nm. GlobalFoundries intends to produce a variety of chips using the tech, including CPUs for high-performance computing, GPUs, mobile SoCs, chips for aerospace and defense, as well as automotive applications. That said, in addition to improved transistor density (up to 17 million gates per mm² for mainstream designs) and frequency potential, GlobalFoundries also expects to increase the maximum die size of 7LP chips to approximately 700 mm², up from the roughly 650 mm² limit for ICs the company is producing today. In fact, when it comes to the maximum die sizes of chips, there are certain tools-related limitations.

For their newest node, the company is focusing on two ways to reduce power consumption of the chips: implementing superior gate control, and reducing voltages. To that end, chips made using GlobalFoundries' 7LP technology will support 0.65 – 1 V, which is lower than ICs produced using the company’s 14LPP fabrication process today. In addition, 7LP semiconductors will feature numerous work-functions for gate control.


https://www.semiwiki.com/forum/content/6837-globalfoundries-7nm-euv-update.html

GF however is leading with a high performance (LP equals Lead Performance in IBM speak) version of 7nm for AMD while TSMC is first with a low power version of 7nm for Apple, Qualcomm, MediaTek, and the other SoC vendors.


https://www.globalfoundries.com/sites/default/files/product-briefs/7lp-product-brief.pdf
http://semimd.com/chipworks/2017/01/18/iedm-2016-setting-the-stage-for-75-nm/

Four device Vt options are available in the TSMC 7-nm technology [1], covering a range of ~200 mV.

GF 7LP provides a choice of five core device Vt options, compared to the four that TSMC provides. Here again, the requirement to support both very high performance (5 GHz) and low-power mobile is driving the need for a broader range of Vt, including higher Vt options.

TSMC is launching first with a low-power mobile version of N7 for its primary customer Apple, and GF is launching with a high-performance version of 7LP for AMD. It's going to be interesting to see the comparison of products manufactured on TSMC N7 HPC for Nvidia vs GF 7LP for AMD in 2019.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,116
136
I hope GloFo pulls this off. Now that they have a deeper relationship with IBM, we'll really get to see what the two can do together.
 
  • Like
Reactions: Aenra and ajc9988

stuff_me_good

Senior member
Nov 2, 2013
206
35
91
While this is all very interesting to read, knowing GF's track record of over-promising and under-delivering, I'm not jumping around and clapping my hands just yet.

IBM engineering mixed into this equation is raising my hopes to a new level though, so there seems to be a real chance of hitting the H2 2018 volume production target rather than Q4 2019, which is where it would probably land if GF were doing everything alone.
 

NTMBK

Lifer
Nov 14, 2011
10,522
6,046
136
That statement reminds me of a question [unanswered] I once asked here.

Is there some theoretical limit to single thread IPC? AFAIK, all we have to go on for high performance X86 is Intel's products and ALL assume that if this is what Intel gets, then that is the best possible at the time.

This reasoning was behind the low expectations for Zen in both IPC and HT. Well, they beat Intel in HT. The unquestioned general assumption is that the most any competitor can do is get close to Intel, not surpass them.

You're limited by the number of parallel operations present in your code. For instance, in the expression A = (B + C) / D, you can't run the addition and the division in parallel; the second is dependent on the result of the first. This level of parallelism is different for every piece of code, but it is the ultimate limit on instruction-level parallelism.
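A minimal sketch of that dependence in C (variable names and values are made up; this just spells out which operations can and cannot overlap):

```c
#include <stdio.h>

int main(void) {
    double B = 6.0, C = 2.0, D = 4.0;

    /* A = (B + C) / D: the divide consumes the result of the add, so the
     * two operations form a dependency chain and cannot issue in the same
     * cycle, no matter how wide the core is. */
    double t = B + C;   /* step 1 */
    double A = t / D;   /* step 2: must wait for t */

    /* These two operations are independent of each other, so a superscalar
     * core is free to execute them side by side. */
    double X = B + C;
    double Y = B * D;

    printf("A = %f, X = %f, Y = %f\n", A, X, Y);
    return 0;
}
```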
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
You're limited by the number of parallel operations present in your code. For instance, in the expression A = (B + C) / D, you can't run the addition and the division in parallel; the second is dependent on the result of the first. This level of parallelism is different for every piece of code, but it is the ultimate limit on instruction-level parallelism.

What you just described is a relationship between non/commutative and non/associative operations with respect to parallelism ... :)

If every program could be reduced to either commutative (like addition) or associative (like multiplication) operations, then programmers wouldn't be worth much, since one could parallelize every program optimally, with code that is nearly maintenance-free to boot, and we'd all be rocking GPUs ... :eek:
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
What you just described is a relationship between non/commutative and non/associative operations with respect to parallelism ... :)

If every program could be reduced to either commutative (like addition) or associative (like multiplication) operations, then programmers wouldn't be worth much, since one could parallelize every program optimally, with code that is nearly maintenance-free to boot, and we'd all be rocking GPUs ... :eek:
That particular example of A = (B+C)/D has nothing to do with commutativity. Assuming A, B and C are vectors, D must be a scalar (in the mathematical sense), in which case the division reduces to a multiplication by 1/D that can be done outside the loop in one operation.
 
  • Like
Reactions: Aenra and Drazick

NTMBK

Lifer
Nov 14, 2011
10,522
6,046
136
That particular example of A = (B+C)/D has nothing to do with commutativity. Assuming A, B and C are vectors, D must be a scalar (in the mathematical sense), in which case the division reduces to a multiplication by 1/D that can be done outside the loop in one operation.

Actually I was intending for A, B and C to all be scalars as well :) I was talking about instruction level parallelism, or ILP, which is what superscalar CPUs exploit.
 

khon

Golden Member
Jun 8, 2010
1,318
124
106
Honestly, I have a hard time believing this is true.

Intel was first to 14nm back in 2014, while GF only reached 14nm in 2016, so GF was 2 years behind. And now they claim 7nm in 2018, while Intel will only be doing 10nm at that time? So we are supposed to believe that they have suddenly jumped from being 2 years behind to being one node ahead? 14nm to 7nm in 2 years, while Intel has taken 4 years to go from 14nm to 10nm? That would be quite miraculous if they pull it off, but I have my doubts.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
That particular example of A = (B+C)/D has nothing to do with commutativity. Assuming A, B and C are vectors, D must be a scalar (in the mathematical sense), in which case the division reduces to a multiplication by 1/D that can be done outside the loop in one operation.

Not what I had in mind. My point was that if you read a little bit about algebraic fields, commutative and associative operations over certain number fields are order-independent, which has huge ramifications for parallelization ...

For example ... (assuming that the set of numbers only consists of integers)

(A = A+B+C+D) contains only commutative operations, which makes it easily amenable to parallelization ... (In fact, if one had a 4-operand addition unit it would take only 1 instruction to execute!)

(A = A-B-C-D) does not have any commutative or associative operations, so it can't be parallelized. It would need 3 instructions to resolve the problem ... (A-B, (A-B)-C, ((A-B)-C)-D, in that order)
 

NTMBK

Lifer
Nov 14, 2011
10,522
6,046
136
Not what I had in mind. My point was that if you read a little bit about algebraic fields, commutative and associative operations over certain number fields are order-independent, which has huge ramifications for parallelization ...

For example ... (assuming that the set of numbers only consists of integers)

(A = A+B+C+D) contains only commutative operations, which makes it easily amenable to parallelization ... (In fact, if one had a 4-operand addition unit it would take only 1 instruction to execute!)

(A = A-B-C-D) does not have any commutative or associative operations, so it can't be parallelized. It would need 3 instructions to resolve the problem ... (A-B, (A-B)-C, ((A-B)-C)-D, in that order)

And then of course we need to consider that we have limited precision in all of our instructions, so the amount of re-ordering that the CPU can do at runtime is limited :) Mathematically speaking, an infinite chain of A = B + C + D + E + ... is associative and can be massively parallelised (with some kind of tree reduction at the end), but in reality changing the execution order can massively change the end result (especially for floating point datatypes). Of course at compile time the user can set compiler flags to indicate that he is more relaxed about accuracy (usually some sort of --fast_math flag) and let the compiler make similar optimizations, but the CPU does not have that information.
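A minimal example of the kind of rounding sensitivity being described (values are chosen to force the discrepancy; compiled without any fast-math options, the compiler must preserve the written order):

```c
#include <stdio.h>

int main(void) {
    float x = 1e8f, y = -1e8f, z = 1.0f;

    /* Mathematically (x + y) + z == x + (y + z), but in float arithmetic
     * y + z rounds back to -1e8f, so the two orderings disagree. This is
     * why neither the CPU nor the compiler (without an opt-in flag such as
     * GCC/Clang's -ffast-math) is allowed to reassociate the sum. */
    printf("(x + y) + z = %f\n", (x + y) + z);   /* prints 1.000000 */
    printf("x + (y + z) = %f\n", x + (y + z));   /* prints 0.000000 */
    return 0;
}
```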
 
  • Like
Reactions: Aenra and tamz_msc

exquisitechar

Senior member
Apr 18, 2017
727
1,033
136
Honestly, I have a hard time believing this is true.

Intel was first to 14nm back in 2014, while GF only reached 14nm in 2016, so GF was 2 years behind. And now they claim 7nm in 2018, while Intel will only be doing 10nm at that time? So we are supposed to believe that they have suddenly jumped from being 2 years behind to being one node ahead? 14nm to 7nm in 2 years, while Intel has taken 4 years to go from 14nm to 10nm? That would be quite miraculous if they pull it off, but I have my doubts.

From what I know, Intel's 10nm and GF's 7nm will be very similar in most respects. GF will not be a full node ahead.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,596
136
From what I know, Intel's 10nm and GF's 7nm will be very similar in most respects. GF will not be a full node ahead.
Yep, and add that 14nm arrived from Samsung and 7nm LP is an IBM technology. The timeframes for them arriving are more about deals and about when IBM/Samsung is ready, not GF.
If GF had to develop 14nm by itself, we would still be waiting for it.

GF dropped the stupid idea that it had to do the basic process research itself. A very simple decision, but it saved them, and AMD as a consequence. Now they can shop for the best research.

And AMD can shop for the best process. Products get better. Customers win. It's good for all.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Not what I had in mind. My point was that if you read a little bit about algebraic fields, commutative and associative operations over certain number fields are order-independent, which has huge ramifications for parallelization
Well, speaking of linear spaces, of which vectors are an example: they form a group under addition, so addition is guaranteed to be associative. As an added bonus, they are commutative under addition as well (and hence they form an abelian group).
(A = A+B+C+D) contains only commutative operations, which makes it easily amenable to parallelization ... (In fact, if one had a 4-operand addition unit it would take only 1 instruction to execute!)

(A = A-B-C-D) does not have any commutative or associative operations, so it can't be parallelized. It would need 3 instructions to resolve the problem ... (A-B, (A-B)-C, ((A-B)-C)-D, in that order)
Or you could flip the sign bit in B, C and D and do the same 4-operand addition, with one additional step to check the correctness of the result, of course.
Just use a 3-operand addition on B, C, D and one more step would be needed. Less parallel, but not impossible.
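A small sketch of that rewrite for integers, where reassociation is exact (the values are arbitrary; the point is only the depth of the dependency chain, and this uses plain two-operand instructions rather than a literal multi-operand adder):

```c
#include <stdio.h>

int main(void) {
    int A = 100, B = 7, C = 11, D = 13;

    /* Naive form: each subtraction depends on the previous result,
     * so the chain is three operations deep. */
    int serial = ((A - B) - C) - D;

    /* Reassociated form: (A - B) and (C + D) are independent and can
     * execute in the same cycle, leaving a chain only two deep.
     * For integers this rewrite is exact; for floats it is not. */
    int tree = (A - B) - (C + D);

    printf("serial = %d, tree = %d\n", serial, tree);   /* both print 69 */
    return 0;
}
```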
 
Last edited:
  • Like
Reactions: Aenra and Drazick

maddie

Diamond Member
Jul 18, 2010
5,204
5,613
136
Well, speaking of linear spaces, of which vectors are an example: they form a group under addition, so addition is guaranteed to be associative. As an added bonus, they are commutative under addition as well (and hence they form an abelian group).

Or you could flip the sign bit in B, C and D and do the same 4-operand addition, with one additional step to check the correctness of the result, of course.
Just use a 3-operand addition on B, C, D and one more step would be needed. Less parallel, but not impossible.
This.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Honestly, I have a hard time believing this is true.

Intel was first to 14nm back in 2014, while GF only reached 14nm in 2016, so GF was 2 years behind. And now they claim 7nm in 2018, while Intel will only be doing 10nm at that time? So we are supposed to believe that they have suddenly jumped from being 2 years behind to being one node ahead? 14nm to 7nm in 2 years, while Intel has taken 4 years to go from 14nm to 10nm? That would be quite miraculous if they pull it off, but I have my doubts.

Node names are meaningless. They could call it 7nm, 2nm or 100mm. It would all be the same thing.
 

Mopetar

Diamond Member
Jan 31, 2011
8,529
7,795
136
Node names are meaningless. They could call it 7nm, 2nm or 100mm. It would all be the same thing.

While this is true, I think that this is the closest other fabs have been to Intel in a long while. From what's been published, the gate and interconnect pitches for the 7 nm processes are pretty similar, with Intel only having a slight edge. Historically Intel enjoyed both a much longer time-to-market advantage and better characteristics for a given process. Intel has had a lot of problems with the move to 10 nm, so I think it's more a case of them falling behind their usual pace than anything else. It remains to be seen how well TSMC and GlobalFoundries can execute on their 7 nm processes, though, as they're no strangers to delays and setbacks either.
 
  • Like
Reactions: moinmoin

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
While this is true, I think that this is the closest other fabs have been to Intel in a long while. From what's been published, the gate and interconnect pitches for the 7 nm processes are pretty similar, with Intel only having a slight edge. Historically Intel enjoyed both a much longer time-to-market advantage and better characteristics for a given process. Intel has had a lot of problems with the move to 10 nm, so I think it's more a case of them falling behind their usual pace than anything else. It remains to be seen how well TSMC and GlobalFoundries can execute on their 7 nm processes, though, as they're no strangers to delays and setbacks either.

Yes. This is the closest that the foundries have ever got to Intel in terms of transistor density. Intel 10nm is slightly denser than TSMC N7 and GF 7LP.

Intel 10nm: CPP = 54 nm, MMP = 36 nm, cell area = CPP × MMP = 54 × 36 = 1944 nm²
TSMC 7nm: CPP = 54 nm, MMP = 40 nm, cell area = CPP × MMP = 54 × 40 = 2160 nm² (CPP is estimated from the area shrink; MMP is the actual disclosed value)
GF 7nm: CPP = 56 nm, MMP = 40 nm, cell area = CPP × MMP = 56 × 40 = 2240 nm² (CPP and MMP are estimated from the disclosed area shrink)
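The same arithmetic as a throwaway script, purely for convenience (the CPP/MMP values are the estimates quoted above, not official figures):

```c
#include <stdio.h>

/* Recompute the CPP x MMP "cell area" comparison from the figures above.
 * The pitch values (in nanometres) are the estimates quoted in the post. */
int main(void) {
    struct { const char *node; double cpp_nm, mmp_nm; } nodes[] = {
        { "Intel 10nm", 54.0, 36.0 },
        { "TSMC 7nm",   54.0, 40.0 },
        { "GF 7nm",     56.0, 40.0 },
    };
    const double intel_area = nodes[0].cpp_nm * nodes[0].mmp_nm;

    for (int i = 0; i < 3; i++) {
        double area = nodes[i].cpp_nm * nodes[i].mmp_nm;   /* nm^2 */
        printf("%-10s CPP*MMP = %4.0f nm^2 (%.0f%% of Intel 10nm)\n",
               nodes[i].node, area, 100.0 * area / intel_area);
    }
    return 0;
}
```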
 
Last edited:

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
While this is true, I think that this is the closest other fabs have been to Intel in a long while. From what's been published, the gate and interconnect pitches for the 7 nm processes are pretty similar, with Intel only having a slight edge. Historically Intel enjoyed both a much longer time-to-market advantage and better characteristics for a given process. Intel has had a lot of problems with the move to 10 nm, so I think it's more a case of them falling behind their usual pace than anything else. It remains to be seen how well TSMC and GlobalFoundries can execute on their 7 nm processes, though, as they're no strangers to delays and setbacks either.

Let's wait and see. GF has a long history of over-promising on their node technology.
 
  • Like
Reactions: Drazick

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
Let's wait and see. GF has a long history of over-promising on their node technology.
If it were just GF, I'd be skeptical as well. But at the end of the day, this is an IBM node. Does that make it certain? Nothing is a 100% certainty when it comes to nodes. Intel has had more than a few fumbles as well, I'd point out. All the foundries use the same litho gear from ASML, and they all have the same challenges. Intel has been very good at using new things. Not perfect, but very good is about as good as you're going to get. But IBM is very good too.
 
  • Like
Reactions: Space Tyrant