CPCHardware: 2nd gen AMD EPYC will have 64 cores, 256 MB (!) L3, 8x DDR4-3200 and 128 PCIe 4.0 lanes


CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
While talking with my friend, I came up with a reason and a solution that reconcile both the 48 core and 64 core rumors, and that make more sense for the mainstream and HEDT markets.

About a year or two ago, AMD realized they had struck gold and wanted to capitalize on it. The 48 core design that was rumored and already in the pipeline would be based on consumer dies with 12 cores, which was their original "safe" design. Now they have decided to aggressively capitalize on their MCM design, and created a massive die for massive packages that would promise complete dominance for that generation.

Threadrippers would also be able to grow so much in core count by using these dies that, unless next gen HEDT Intel CPUs use the massive HCC dies, which would rock their boat on margins, AMD would have undisputed core count dominance.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
While talking with my friend, I came up with a reason and a solution that reconcile both the 48 core and 64 core rumors, and that make more sense for the mainstream and HEDT markets.
Adding to that, the 48 core designs would have 12 cores and 24MB of L3 cache per die as you would expect, which on 7nm would be around 120mm^2, making them perfect for consumers and for reasonably priced EPYCs like we have right now.

The 16 cores with 64MB cache per die would at minimum be 200mm^2, though more likely 300-400mm^2. When four are put together, that's 1200 to 1600mm^2 of silicon. Considering the massive amount of computational power and cache, they can easily price them at Intel top-end-like prices and still have them be both an amazing deal and high margin.

There are quite a few use cases that scale almost linearly with cache, so a 256MB EPYC would make all Intel products irrelevant for those markets unless Intel responds in kind. That would mean a highly profitable niche until Intel responds.
 
Reactions: DarthKyrie

Jan Olšan

Senior member
Jan 12, 2017
572
1,129
136
Starship was a relatively new addon from late 2016. Not that ancient and honestly lines up with an early to mid 2019 release.

I didn't say there was. I just doubt that AMD in 6 months has gone from one design to another for a CPU that is only just over a year away no matter how important it is for AMD to push for more cores.

No, we know about Starship from a roadmap that was dated February 2016 (so three years before launch). Things could have changed since.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Not quite sure why this is not possible or even likely? Actually it makes a lot of sense to attack the intercore communication deficit head on by doubling the cache and cores per CCX. If AMD were to build such a chip today, it would be ~340mm^2. Scaled to 7nm? Who knows, but they have a strong business case to go above Ryzen's 192mm^2. Even 250 is not out of the question? That would propel AMD into a GREAT place in both desktop and server markets as long as they can clock each core north of 4GHz.
 

Jan Olšan

Senior member
Jan 12, 2017
572
1,129
136
The earliest I saw it was September of last year.

http://www.fudzilla.com/news/processors/40945-amd-working-on-7nm-48-core-processor reported it in June 2016. But this spring, the roadmap documents their report was likely based on were themselves leaked by VideoCardz, and the PDF shows a "February 2016" date. See the link in my post above. So when Fudzilla published it, the information was already several months old; in any case the PDF date counts as the "last reference observed". BTW the same PDF is also the source for the information on Great Horned Owl (Raven Ridge) and Banded Kestrel (no non-embedded "Ridge" codename known yet).
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
Like all rumors, with minimal backup or documentation it should be taken with a very large grain of salt. But it would be very cool if it actually works out to be true. Speculation is always fun.
 

Jan Olšan

Senior member
Jan 12, 2017
572
1,129
136
[Image: t3h6lPZ.png]


Actually, see that red line? At the time of the document (which is the last time we have information from), the specs were still subject to change and not closed. So that Feb 2016 document does leave some space for adding cores.
 

DeeJayBump

Member
Oct 9, 2008
60
63
91
And the IDF is losing its collective ever-FUD-ing + spinning minds in the Sisyphean mission with which they are tasked. :D

Great job, AMD.
 
Reactions: Space Tyrant

jpiniero

Lifer
Oct 1, 2010
16,799
7,249
136
If AMD were to build such a chip today, it would be ~340mm^2. Scaled to 7nm? Who knows, but they have a strong business case to go above Ryzen's 192mm^2.

Yield is going to suck that early in GloFo's 7 nm process. Not Intel 10 nm bad, but bad. That's why the 12 core die made sense because they would be able to keep it at a reasonable size. So I'm skeptical but I can see it would be tempting to throw in 4 CCXs per die.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
Folks, bear in mind that the area shrink is going to be massive. 7SoC with 6T is optimized for designs running at 3.5 GHz. At 14LPP, to hit > 3 GHz you needed 9T libraries. 7.5T was only for mobile CPUs running in the 2 - 2.4 GHz range.

https://www.globalfoundries.com/sites/default/files/product-briefs/product-brief-14lpp.pdf
https://www.semiwiki.com/forum/cont...alfoundries-discloses-7nm-process-detail.html

Cell Height = Minimum Metal Pitch x Track count
Contacted Poly Pitch x Cell Height is the new measure for transistor density

14LPP = 78nm x 64 nm x 9 tracks = 44928
7SoC = 56nm x 40nm x 6 tracks = 13440

13440/44928 = 0.299, or roughly 0.3. That's a 70% area shrink from 14LPP 9T. A single full node shrink takes you from 1 to 0.5, and another full node takes you to 0.25. 7SoC with 6T is literally bringing close to 2 generations of density increase. GF's 7SoC 6T vs 14LPP 9T comparison shows a 60% power reduction at iso perf or a 40% perf increase at iso power.
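
For anyone who wants to replay the arithmetic above, here is a minimal sketch. It only multiplies the pitches and track counts quoted in this post; the function name is made up for illustration, and the result is a rough density proxy, not a measured die area.

```python
# Density proxy from the post: contacted poly pitch x cell height,
# where cell height = minimum metal pitch x track count.
# All pitch and track values are the ones quoted in the post above.

def cell_area_proxy(cpp_nm, mmp_nm, tracks):
    """Return CPP x (MMP x tracks) in nm^2 (hypothetical helper)."""
    cell_height_nm = mmp_nm * tracks
    return cpp_nm * cell_height_nm

area_14lpp_9t = cell_area_proxy(78, 64, 9)  # 14LPP with 9T libraries
area_7soc_6t = cell_area_proxy(56, 40, 6)   # 7SoC with 6T libraries
ratio = area_7soc_6t / area_14lpp_9t

print(area_14lpp_9t, area_7soc_6t)                      # 44928 13440
print(f"ratio = {ratio:.3f}, shrink = {1 - ratio:.0%}")  # ratio = 0.299, shrink = 70%
```

Swapping in a different track count (say 7.5T for the 14LPP side) shows how much of the claimed shrink comes from the library choice rather than the pitches alone.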

https://m.eet.com/content/images/eetimes/1 7 12 14 copared x 800_1505972923.jpg

AMD should be able to pack 64 Zen 2 cores while doubling L3 cache per core and still should be able to keep die size <= 200 sq mm. I think AMD knows they have an opportunity to take a decisive lead in servers and are going for the kill. Intel EMIB and 10++ will arrive with server first in 2020 (most probably H2) and Icelake-SP is not going to be able to bring 64 cores to market in 2019. If AMD can launch Rome with 64 cores in Q1 2019 they will catch Intel totally off guard.
 

.vodka

Golden Member
Dec 5, 2014
1,203
1,538
136
So, considering it's still supposed to have 8 memory channels, that means there are still 4 dies in an Epyc MCM...

Am I right to think that AM4 gets 16 cores, Threadripper gets 32 cores, Epyc gets 64 and enables 128 cores on a 2P motherboard? Holy crap.

Looking at AM4 using one of those dies... Zen 2 cores (+ IPC), Raven's refined per core turbo, 7nm frequency improvements and... 16 cores? If so, AM4 will get such an insane amount of power in 2019 for both ST and MT workloads it's almost ridiculous.
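
The core counts above can be sketched as follows. This assumes every platform reuses the same rumored 16 core die, with the die counts per package (1, 2, 4) being the ones implied by the 16/32/64 figures in this post; none of it is confirmed.

```python
# Assumption: one rumored 16 core die shared across all platforms.
CORES_PER_DIE = 16

# Dies per package as implied by the post (Epyc keeps 4 dies for 8 memory channels).
DIES_PER_PACKAGE = {"AM4": 1, "Threadripper": 2, "Epyc": 4}

cores = {name: dies * CORES_PER_DIE for name, dies in DIES_PER_PACKAGE.items()}
cores["Epyc 2P"] = 2 * cores["Epyc"]  # two sockets on one motherboard

for name, count in cores.items():
    print(f"{name}: {count} cores")
```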
 
Reactions: Space Tyrant

raghu78

Diamond Member
Aug 23, 2012
4,093
1,476
136
So, considering it's still supposed to have 8 memory channels, that means there are still 4 dies in an Epyc MCM...

Am I right to think that AM4 gets 16 cores, Threadripper gets 32 cores, Epyc gets 64 and enables 128 cores on a 2P motherboard? Holy crap.

Looking at AM4 using one of those dies... Zen 2 cores (+ IPC), Raven's refined per core turbo, 7nm frequency improvements and... 16 cores? If so, AM4 will get such an insane amount of power in 2019 for both ST and MT workloads it's almost ridiculous.

AMD is likely to have separate dies for server and desktop starting at 7nm. The 7SoC power/freq curve is much better than 7HPC. 7SoC is optimized for designs running at 3.5 GHz. Server chips do not require 4+ GHz frequencies like desktop chips and are optimized for throughput / multithread performance.

http://btbmarketing.com/iedm/docs/29-5 Narasimha_Fig 2.jpg

7HPC will be needed if AMD wants Zen 2 to hit 5 GHz. AMD will need those high clocks if they want to compete with Intel on ST performance.
 

.vodka

Golden Member
Dec 5, 2014
1,203
1,538
136
Good points.

Well, considering AMD now has more money to spend and is increasing their R&D budget, it wouldn't surprise me if they started doing separate dies for both desktop and server, especially seeing the two 7nm nodes available to them. They can now do better than a one size fits all solution.

Zen sure has a bright future.
 
Apr 20, 2008
10,067
990
126
I will never understand why there are people being remotely negative about a product release. Even if this isn't your cup of tea or you do not see yourself ever owning this, more competition is better for consumers. Even for those of us who work in semiconductors, it's keeping everyone on their toes. Things got pretty damn boring for a while. Now, along with many others, we're interested in finally upgrading. That's a win for everyone, right?
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,072
3,897
136
Adding to that, the 48 core designs would have 12 cores and 24MB of L3 cache per die as you would expect, which on 7nm would be around 120mm^2, making them perfect for consumers and for reasonably priced EPYCs like we have right now.

The 16 cores with 64MB cache per die would at minimum be 200mm^2, though more likely 300-400mm^2. When four are put together, that's 1200 to 1600mm^2 of silicon. Considering the massive amount of computational power and cache, they can easily price them at Intel top-end-like prices and still have them be both an amazing deal and high margin.
Cache is dense. Look at Zeppelin: 8MB of L3 is about 1/3 of a CCX. CCX takes about 45% of the die. Basic math looks something like 4x CCX, CCX size = 45 * 1.33 = 60mm^2. So on 14nm it would look something like:

uncore = 110 + an amount of more CCX connection
CCX * 4 = 240

So on 14nm you're looking at around 350mm^2 to 400mm^2.

Now assume something like an average 40% reduction across the SoC going to 7nm = 210 to 240mm^2. I don't know how you did your maths...
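
The estimate above can be replayed with a short sketch. It reads the post's "45" as a CCX area in mm^2 (an assumption, since the post also mentions 45% of the die), takes the 110mm^2 uncore and 40% shrink at face value, and ignores the extra CCX interconnect area the post mentions. All names are made up for illustration.

```python
# Figures taken from the post; units assumed to be mm^2 on 14nm.
CCX_AREA_14NM = 45.0      # one current CCX (assumed reading of "45")
L3_GROWTH = 1.33          # doubling the L3, which is about 1/3 of a CCX
UNCORE_AREA_14NM = 110.0  # uncore, before any extra CCX interconnect
CCX_COUNT = 4
SHRINK_TO_7NM = 0.40      # assumed average area reduction across the SoC

big_ccx = CCX_AREA_14NM * L3_GROWTH                 # roughly 60 mm^2
die_14nm = UNCORE_AREA_14NM + CCX_COUNT * big_ccx   # roughly 350 mm^2
die_7nm = die_14nm * (1 - SHRINK_TO_7NM)            # roughly 210 mm^2

print(f"CCX with doubled L3: {big_ccx:.0f} mm^2")
print(f"16 core die on 14nm: {die_14nm:.0f} mm^2")
print(f"16 core die on 7nm:  {die_7nm:.0f} mm^2")
```

The upper end of the post's range (400mm^2 on 14nm, 240mm^2 on 7nm) corresponds to adding about 50mm^2 of extra uncore for the CCX connections.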
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Folks, bear in mind that the area shrink is going to be massive. 7SoC with 6T is optimized for designs running at 3.5 GHz. At 14LPP, to hit > 3 GHz you needed 9T libraries. 7.5T was only for mobile CPUs running in the 2 - 2.4 GHz range.

https://www.globalfoundries.com/sites/default/files/product-briefs/product-brief-14lpp.pdf
https://www.semiwiki.com/forum/cont...alfoundries-discloses-7nm-process-detail.html

Cell Height = Minimum Metal Pitch x Track count
Contacted Poly Pitch x Cell Height is the new measure for transistor density

14LPP = 78nm x 64 nm x 9 tracks = 44928
7SoC = 56nm x 40nm x 6 tracks = 13440

13440/44928 = 0.299, or roughly 0.3. That's a 70% area shrink from 14LPP 9T. A single full node shrink takes you from 1 to 0.5, and another full node takes you to 0.25. 7SoC with 6T is literally bringing close to 2 generations of density increase. GF's 7SoC 6T vs 14LPP 9T comparison shows a 60% power reduction at iso perf or a 40% perf increase at iso power.

https://m.eet.com/content/images/eetimes/1 7 12 14 copared x 800_1505972923.jpg

AMD should be able to pack 64 Zen 2 cores while doubling L3 cache per core and still should be able to keep die size <= 200 sq mm. I think AMD knows they have an opportunity to take a decisive lead in servers and are going for the kill. Intel EMIB and 10++ will arrive with server first in 2020 (most probably H2) and Icelake-SP is not going to be able to bring 64 cores to market in 2019. If AMD can launch Rome with 64 cores in Q1 2019 they will catch Intel totally off guard.
You got surprisingly close to GloFo's own density claims. They claim a 60% density increase, which is in one dimension. When you account for two dimensions, it's 0.6x0.6=0.36.

So 7nm is 0.36x of 14nm in size.

And what you said about two nodes worth of density improvement is, well, exactly what it is. GloFo skipped 10nm to focus on 7nm, and their 7nm is competitive with other 7nm solutions. So they effectively jumped two nodes.
 

beginner99

Diamond Member
Jun 2, 2009
5,318
1,763
136
So, considering it's still supposed to have 8 memory channels, that means there are still 4 dies in an Epyc MCM...

Damn, true. I wanted to say that a 64 core with 256MB L3 would be easy to achieve. Just double the L3 in the Zeppelin die and put 8 of them on the MCM.

More complicated:
How flexible is Infinity Fabric? Could you just put in 8 Zeppelin dies and a cache-only die with 128MB of cache? And use 1 of the memory channels to access this shared L3? That would explain 8 dies but only 8 memory channels. But yeah, seems a bit unrealistic.
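
Both layouts floated here land on the same totals, which a quick sketch makes easy to check. It assumes today's Zeppelin die has 8 cores and 16MB of L3 (two CCXs with 8MB each); the cache-only die is purely hypothetical.

```python
# Assumed baseline: current Zeppelin die, 2 CCXs x 8 MB L3, 8 cores.
ZEPPELIN_CORES = 8
ZEPPELIN_L3_MB = 16
DIES = 8

# Option A: double the L3 on each die, 8 dies on the MCM.
option_a_cores = DIES * ZEPPELIN_CORES
option_a_l3_mb = DIES * (2 * ZEPPELIN_L3_MB)

# Option B: 8 unmodified dies plus a hypothetical 128 MB cache-only die.
CACHE_DIE_L3_MB = 128
option_b_cores = DIES * ZEPPELIN_CORES
option_b_l3_mb = DIES * ZEPPELIN_L3_MB + CACHE_DIE_L3_MB

print(option_a_cores, option_a_l3_mb)  # 64 256
print(option_b_cores, option_b_l3_mb)  # 64 256
```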
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
Desktop Zen 2 (Ryzen 7/5/3/TR) will need 7HPC to hit 5 GHz.

It likely won't need it. The graph itself says that 7LP SoC has 30% more Fmax than 14LPP. That gives Ryzen a 5.2GHz clock at 80% of the power. It just couldn't be better.