CPCHardware: 2nd gen AMD EPYC will have 64 cores, 256 MB (!) L3, 8x DDR4-3200 and 128 PCIe 4.0 lanes


raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
You got surprisingly close to GloFo's own density claims. They claim a 60% density increase, which is in one dimension. When you account for two dimensions, it's 0.6 x 0.6 = 0.36.

So 7nm is 0.36x of 14nm in size.

And what you said about two nodes worth of density improvement is, well, exactly what it is. GloFo skipped 10nm to focus on 7nm, and their 7nm is competitive with other 7nm solutions. So they effectively jumped two nodes.

The key here is the choice of track count will determine the actual shrink.

14LPP - CPP = 78nm MMP = 64nm. CPP X MMP = 78 x 64 = 4992
7LP - CPP = 56nm MMP = 40nm . CPP x MMP = 56 x 40 = 2240.

So in terms of CPP x MMP the shrink is 2240/4992 = 0.45, i.e. a 55% reduction. But when you bring in track count, which determines the actual cell size, the shrink can go up to 70%. 7SoC is enough to provide the perf required for servers.

14LPP 9T to 7SoC 6T = 70% shrink
14LPP 7.5T to 7SoC 6T = 65% shrink
14LPP 9T to 7HPC 9T = 55% shrink.
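The three shrink figures above can be checked with a rough sketch. It assumes that cell width scales with CPP and cell height with track count x MMP, and that 7SoC and 7HPC share the 7LP pitches quoted in the post; the function names are made up for illustration.

```python
# Pitches in nm, taken from the post above.
NODES = {
    "14LPP": {"cpp": 78, "mmp": 64},
    "7LP": {"cpp": 56, "mmp": 40},
}

def cell_area(node, tracks):
    """Relative cell area in nm^2: CPP x (tracks x MMP). Assumed model."""
    p = NODES[node]
    return p["cpp"] * tracks * p["mmp"]

def shrink(old_node, old_tracks, new_node, new_tracks):
    """Fractional area reduction going from the old cell to the new cell."""
    return 1 - cell_area(new_node, new_tracks) / cell_area(old_node, old_tracks)

# Same three library combinations as listed in the post.
for old_t, new_t in [(9, 6), (7.5, 6), (9, 9)]:
    s = shrink("14LPP", old_t, "7LP", new_t)
    print(f"14LPP {old_t}T -> 7LP {new_t}T: {s:.0%} shrink")
```

Under these assumptions the three cases come out at about 70%, 64% and 55%, so the 65% figure is a slight round-up.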

7SoC 6T is aimed at a real sweet spot of power, perf and maximum density. AMD needs 7HPC only for Zen 2 desktop. I think servers and notebooks are better served by 7SoC.

Likely it won't need it. The graph itself says that 7LP SoC has 30% more Fmax than 14LPP. That gives Ryzen a 5.2 GHz clock, at 80% of the power. It just couldn't be better.

http://btbmarketing.com/iedm/docs/29-5 Narasimha_Fig 2.jpg

imo the graph plots the most efficient regions on the freq/power curve for the respective processes. It does not cover the entire freq range up to fmax.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
8 Zen dies on an EPYC MCM would be both a tight packaging problem and a major interconnect issue. The MCM now has each die with 4 connections: one to each of the other three dies and a fourth to the second socket. With 8 dies, you now need 7 inter-die connections and one external for each die. This is a massive jump in pin count on the package and in the complexity of the uncore and the MCM itself.

Either you make a fiendishly complicated MCM and go single processor with it, or you stick with 4 dies with new floor plans.

It does create an interesting possibility though. Say AMD made a three-CCX, large-cache Zen die on 7nm. Then they took the space savings and shrank the physical die a bit, say enough to fit 8 on an EPYC MCM. It's possible to do all 7 interconnections on the MCM, have a single channel of RAM for each die, and have higher clocks for each core as well. This would give a 1P system 96 cores and 192 threads. It would be a bit short on total bandwidth, but still have huge performance.
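To put numbers on the interconnect argument, here is a small sketch. It assumes a fully connected (all-to-all) arrangement of dies plus one off-package link per die in a two-socket system, and 4 cores per CCX with 2 threads per core as on current Zen; the function names are hypothetical.

```python
def full_mesh(dies, sockets=1):
    """Return (links per die, die-to-die links on the package) for an all-to-all MCM."""
    # Each die talks to every other die, plus one link to the other socket in 2P.
    links_per_die = (dies - 1) + (1 if sockets > 1 else 0)
    # Every pair of dies shares exactly one on-package link.
    on_package_links = dies * (dies - 1) // 2
    return links_per_die, on_package_links

def core_count(dies, ccx_per_die, cores_per_ccx=4, smt=2):
    """Return (cores, threads) for a package built from identical dies."""
    cores = dies * ccx_per_die * cores_per_ccx
    return cores, cores * smt

print(full_mesh(4, sockets=2))  # today's 4-die, 2-socket layout
print(full_mesh(8, sockets=2))  # 8 dies, still 2-socket
print(full_mesh(8, sockets=1))  # 8 dies, single socket only
print(core_count(8, 3))         # 8 dies with three CCXs each
```

Going from 4 to 8 dies raises the on-package link count from 6 to 28, which is the routing problem described above, while 8 dies of three CCXs give the 96 cores / 192 threads mentioned.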
 

maddie

Diamond Member
Jul 18, 2010
4,740
4,674
136
8 Zen dies on an EPYC MCM would be both a tight packaging problem and a major interconnect issue. The MCM now has each die with 4 connections: one to each of the other three dies and a fourth to the second socket. With 8 dies, you now need 7 inter-die connections and one external for each die. This is a massive jump in pin count on the package and in the complexity of the uncore and the MCM itself.

Either you make a fiendishly complicated MCM and go single processor with it, or you stick with 4 dies with new floor plans.

It does create an interesting possibility though. Say AMD made a three-CCX, large-cache Zen die on 7nm. Then they took the space savings and shrank the physical die a bit, say enough to fit 8 on an EPYC MCM. It's possible to do all 7 interconnections on the MCM, have a single channel of RAM for each die, and have higher clocks for each core as well. This would give a 1P system 96 cores and 192 threads. It would be a bit short on total bandwidth, but still have huge performance.
Are there any fundamental problems in Zen 2 having 2 cores with a shared L3 cache as a basic unit to replace the present 1 core + L3 unit? If possible, this would bypass all routing complications.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Yield is going to suck that early in GloFo's 7 nm process. Not Intel 10 nm bad, but bad. That's why the 12 core die made sense because they would be able to keep it at a reasonable size. So I'm skeptical but I can see it would be tempting to throw in 4 CCXs per die.
Meh, that's what you have desktop for. All the cust gets here. Heck, or even servers.
If anything, the way AMD does this, it makes sense to go exactly there, unlike the Intel big-die approach. It's one of the benefits.
 

rainy

Senior member
Jul 17, 2013
505
424
136
So is next year Zen 2, or Zen+ running on GloFo 12 nm LP FinFET? Or is it just the existing design running at higher speeds?

No, Zen 2 would arrive in 2019; next year (most probably late Q1) we should see Zen+ (aka Pinnacle Ridge), which is a tweaked/optimized version of Summit Ridge at higher clocks.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,149
136
AMD is likely to have separate dies for server and desktop starting at 7nm. The 7SoC power/freq curve is much better than 7HPC. 7SoC is optimized for designs running at 3.5 GHz. Server chips do not require 4+ GHz frequencies like desktop chips and are optimized for throughput / multithread performance.

http://btbmarketing.com/iedm/docs/29-5 Narasimha_Fig 2.jpg

7HPC will be needed if AMD want Zen 2 to hit 5 Ghz. AMD will need those high clocks if they want to compete with Intel for ST performance.
I rather doubt 7nm HPC will be used for desktop Ryzen either. Unless the sockets and circuits were massively overengineered for the sake of it, a 7nm HPC chip would stretch even the mid-tier VRMs' power limits, never mind whether the socket is capable of handling it.

At no point is 7nm HPC (aside from the absolute bottom of the curve) more efficient than 14nm, so an increase in throughput will increase power consumption by a fair bit.
 

ksec

Senior member
Mar 5, 2010
420
117
116
Sorry, I can't find who to quote.

But it doesn't take 3-4 years to get this done. It takes that amount of time when you have a new uArch and design. Adding extra cores is really a much smaller change that takes months rather than years.

Also, it is likely they have had both 12- and 16-core designs from the start.

It is likely AMD sees EPYC isn't doing as well as they expected, as most web and cloud vendors have already locked in whatever they planned for the next 6-8 months. They will need a much better roadmap to win that very lucrative DC business.

I wonder if it is PCI-E 4.0? And at this rate they will need DDR5 or HBM2 as memory becomes the bottleneck.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
With a smaller cell size of 0.0029um2 for GF 7LP, AMD has the liberty to put a lot of cache into Ryzen/Mobile/TR/Epyc.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136

el etro

Golden Member
Jul 21, 2013
1,581
14
81
I guess that's a typo. High density 6T SRAM cell size for 7LP is 0.0269 µm² according to SemiWiki.

https://www.semiwiki.com/forum/cont...alfoundries-discloses-7nm-process-detail.html

SRAM cell size from 14LPP to 7LP is thus (1 - 0.0269/0.065) = 59% smaller, or put another way, a 7LP cell is 41% of a 14LPP cell, and density is (0.065/0.0269) = 2.42x. According to GlobalFoundries, logic density is also better than 2x.
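The SRAM arithmetic can be reproduced in a few lines. This is only a sketch using the two cell sizes quoted in the post (0.065 µm² for 14LPP, 0.0269 µm² for 7LP); the variable names are made up.

```python
# High-density 6T SRAM cell sizes in um^2, as quoted in the post.
CELL_14LPP_UM2 = 0.065
CELL_7LP_UM2 = 0.0269

area_ratio = CELL_7LP_UM2 / CELL_14LPP_UM2  # 7LP cell as a fraction of a 14LPP cell
shrink = 1 - area_ratio                     # how much smaller the 7LP cell is
density_gain = 1 / area_ratio               # cells per unit area, 7LP vs 14LPP

print(f"area ratio:   {area_ratio:.2f}")
print(f"shrink:       {shrink:.0%}")
print(f"density gain: {density_gain:.2f}x")
```

With these inputs the 7LP cell is about 41% of the 14LPP cell, i.e. about 59% smaller and roughly 2.4x denser.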

That's why Zen2 can have 8 cores and 16MB L3 per CCX. And it can include an iGPU, too. The increase in transistors per mm2 is just mind-blowing. A 10 to 15% IPC increase would be the cherry on the cake for Zen2 competitiveness with IceLake.
 

Excessi0n

Member
Jul 25, 2014
140
36
101
I rather doubt 7nm HPC will be used for desktop Ryzen either. Unless the sockets and circuits were massively overengineered for the sake of it, a 7nm HPC chip would stretch even the mid-tier VRMs' power limits, never mind whether the socket is capable of handling it.

At no point is 7nm HPC (aside from the absolute bottom of the curve) more efficient than 14nm, so an increase in throughput will increase power consumption by a fair bit.

That depends on what the actual frequency range is that they're showing for the 14nm process. If the final point on the shown 14nm curve is the 4 GHz wall that Ryzen hits then yeah, only the first couple of points on the HPC process are more efficient than maxed-out Ryzen. However, that seems unlikely because it would imply that the HPC process scales to nearly 7 GHz! And if that were the case I would expect that GloFo would be shouting from the rooftops that they have a process which can go past 6 GHz.

But... they aren't. They have talked about scaling past 5 GHz, however. That leads me to believe that the last point on the 14nm curve is not 4 GHz, but is in fact something lower. If that last 14nm point is at 3.5 GHz then the HPC scales to a little short of 6 GHz. And if the 14nm curve continues to curve upwards, then you could get ~33% higher frequencies for the same power as 4 GHz 14nm. This feels somewhat optimistic to me, but it also seems a lot more plausible than GloFo/IBM coming out with a process that scales to 6.5-7 GHz.

This is based on a series of assumptions, vague recollections, and a few minutes of fiddling around with the graph in an image editor, so take this post with a few mine trucks' worth of salt. And I'm self-aware enough to admit that I'm biased because I really want AMD to hit this one out of the park.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
To me, it's Ryzen for the SoC curve demo and Power10 for the HPC demo. They are GF's key clients.
 

jpiniero

Lifer
Oct 1, 2010
14,599
5,218
136
One option, I suppose, if it really is that big, is that AMD initially sells the top EPYC model at only 48 cores, with one core per CCX and some of the L3 disabled, and then sells the 64-core part later as yield improves.

Intel's not going to be able to release any 10 nm server products without EMIB; so until we get a better idea of when they will release their first Core product with it, AMD will be all alone with that high of a core count, even at 48.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
That depends on what the actual frequency range is that they're showing for the 14nm process. If the final point on the shown 14nm curve is the 4 GHz wall that Ryzen hits then yeah, only the first couple of points on the HPC process are more efficient than maxed-out Ryzen. However, that seems unlikely because it would imply that the HPC process scales to nearly 7 GHz! And if that were the case I would expect that GloFo would be shouting from the rooftops that they have a process which can go past 6 GHz.

But... they aren't. They have talked about scaling past 5 GHz, however. That leads me to believe that the last point on the 14nm curve is not 4 GHz, but is in fact something lower. If that last 14nm point is at 3.5 GHz then the HPC scales to a little short of 6 GHz. And if the 14nm curve continues to curve upwards, then you could get ~33% higher frequencies for the same power as 4 GHz 14nm. This feels somewhat optimistic to me, but it also seems a lot more plausible than GloFo/IBM coming out with a process that scales to 6.5-7 GHz.

This is based on a series of assumptions, vague recollections, and a few minutes of fiddling around with the graph in an image editor, so take this post with a few mine trucks' worth of salt. And I'm self-aware enough to admit that I'm biased because I really want AMD to hit this one out of the park.

Good reasoning. I think the graph plots freq/power in the most efficient range for 14LPP vs 7SoC and 7HPC. GF has also stated that 7LP is optimized for ARM cores running at 3.5 GHz.

http://www.eenewseurope.com/news/gf-debuts-7nm/page/0/1

"An ARM Cortex-A72 core could run at more than 3.5 GHz in the process, the company estimates."

imo the freq scale used is 1x = 2.5 GHz. This would put the 7SoC graph hitting 3.5-3.6 GHz at the same power as 14LPP at 2.5 GHz. This means the 7HPC freq range is 4.25-5.25 GHz (1x = 2.5 GHz).
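As a quick way to see how the assumed base frequency drives these numbers, here is a sketch that converts normalized graph readings to GHz. The multipliers below are not read off the GloFo graph; they are back-derived from the GHz figures in this post (e.g. 4.25 / 2.5 = 1.7), so treat them as assumptions.

```python
def to_ghz(multiplier, base_ghz=2.5):
    """Convert a normalized frequency reading to GHz for an assumed 1x base."""
    return multiplier * base_ghz

# (low, high) normalized readings, back-derived from the post's GHz figures.
READINGS = {
    "7SoC at 14LPP 1x power": (1.40, 1.44),
    "7HPC range": (1.70, 2.10),
}

for label, (low, high) in READINGS.items():
    print(f"{label}: {to_ghz(low):.2f}-{to_ghz(high):.2f} GHz")
```

Passing a different `base_ghz` shows how sensitive the 7HPC range is to the guess about what 1x means on the graph.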
 

DrMrLordX

Lifer
Apr 27, 2000
21,633
10,845
136
I rather doubt 7nm HPC will be used for desktop Ryzen either. Unless the sockets and circuits were massively overengineered for the sake of it, a 7nm HPC chip would stretch even the mid-tier VRMs' power limits, never mind whether the socket is capable of handling it.

At no point is 7nm HPC (aside from the absolute bottom of the curve) more efficient than 14nm, so an increase in throughput will increase power consumption by a fair bit.

AMD could always do an FX version of Ryzen 2 on 7nm HPC. Water cooling required. For people who want that 5.5 GHz powah.