AMD Zen 2 Based ‘Starship’ CPU to Bring 48 Cores, 96 Threads in 2018


dnavas

Senior member
Feb 25, 2017
355
190
116
However, there is also the rumour (from Canard PC Hardware) that next-gen EPYC will have 64 cores, requiring a building block with 16 cores (4 CCXs; see this thread).

Hasn't glofo said that their 7nm has 55% power improvements? Twice the cores at nominal power, or 50% more cores with frequency boosted by (20%?) both seem like reasonable moves (former for Epyc, latter for TR). Of course, it's unclear how the process works at the limit/freq.wall. If 7nm is barely into risk production, could it be that AMD is hedging its bets and trying both?
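A rough sanity check on the "twice the cores at nominal power" idea, as a minimal sketch. It assumes the 55% figure means per-core power drops by 55% at the same clock, that power scales linearly with core count, and that uncore and I/O power are ignored. The function name and the 32-core baseline (today's top EPYC) are illustrative choices, not anything from the foundry.

```python
def cores_at_same_power(base_cores: int, power_reduction: float) -> float:
    """Cores that fit in the original power budget if each core
    draws (1 - power_reduction) of its former power at the same clock.
    Assumption: power scales linearly with core count, uncore ignored."""
    return base_cores / (1.0 - power_reduction)


if __name__ == "__main__":
    budget_cores = cores_at_same_power(32, 0.55)
    # Doubling 32 cores to 64 fits if the budget allows at least 64.
    print(f"cores within the same power budget: {budget_cores:.1f}")
    print("64 cores fit:", budget_cores >= 64)
```

Under these simplifying assumptions a 55% power reduction leaves headroom beyond a straight doubling, which is consistent with spending the remainder on uncore or clocks.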
 

Vattila

Senior member
Oct 22, 2004
817
1,450
136
could it be that AMD is hedging its bets and trying both?

Could be. The InfinityFabric strategy, based on using a single multi-purpose die for many products, is great for keeping cost and risk low. But as AMD grows, it is likely they will design more dies targeted at various markets. I guess it comes down to an analysis of competition, market needs, opportunity and cost.

What is the end-goal though: the ultimate HSA exascale server APU with HBM, as alluded to in earlier AMD roadmaps and research. This is why I am so drawn to my GCX hypothesis — at some point AMD will have scalable GPU units on die or package, i.e. similar to the CCX in Zeppelin.
 
  • Like
Reactions: Drazick

NostaSeronx

Diamond Member
Sep 18, 2011
3,803
1,286
136
There should be a change in the CCX with 12LP.

//Skip below, if you don't want most of the info.
It should look more like what was done with Broadcom's Vulcan and Cavium's ThunderX2, since the new higher-ups at AMD who are doing everything are from Broadcom's CPU and SoC groups.

Broadcom employees at AMD; (Joined 2015~2017, but all of this has been worked upon for longer than 5 years)
-> Volume core w/ CMT 66~85% of Zen 2 perf on 22FDX.
-> New CCX supports all four; Mobile Graphic(GPU for Mobile(Mediatek)), Volume core(For THATIC), Next Graphics(New architecture(Navi? or actually Next-gen 7nm+)), Premium core(Zen+ and beyond).
-> Implement RISC-V to add AI (xNN) functionality, and etc. RISC-V will also replace the Tensilica UVD/VCE/True Audio units.

//Skip end.
The changes in the CCX should allow for more cores in a single cluster of cores, while not impacting the worst-case travel latency or overall bandwidth. In fact, the opposite is true... more cores w/ more bandwidth and less latency. The change in the CCX should be more power efficient as well, allowing for more TDP to be given to the CPUs, for higher clocked XFRs.

48c/96t should be fully feasible with 12LP with the new design. 64c/???t should also be fully feasible with 7LP. The ??? for the threads is in case 4-threaded SMT is a thing. "Improving Zen in multiple dimensions" is a hugely broad and vague statement.
 
  • Like
Reactions: Drazick

Exist50

Platinum Member
Aug 18, 2016
2,452
3,105
136
Broadcom employees at AMD; (Joined 2015~2017, but all of this has been worked upon for longer than 5 years)
-> Volume core w/ CMT 66~85% of Zen 2 perf on 22FDX.
-> New CCX supports all four; Mobile Graphic(GPU for Mobile(Mediatek)), Volume core(For THATIC), Next Graphics(New architecture(Navi? or actually Next-gen 7nm+)), Premium core(Zen+ and beyond).
-> Implement RISC-V to add AI (xNN) functionality, and etc. RISC-V will also replace the Tensilica UVD/VCE/True Audio units.

Taking bets on how many of these things come true. 50 bucks says it's none of the above. Any takers?
 

Vattila

Senior member
Oct 22, 2004
817
1,450
136
Twice the cores at nominal power, or 50% more cores with frequency boosted by (20%?) both seem like reasonable moves

Latest rumour from WCCFTech says the same thing — two dies:
  • Die 1: Single CCX 6 core, each die 12 core, single CPU maximum 48 core
  • Die 2: Single CCX 8 core, each die 16 core, single CPU maximum 64 core
However, 6-core and 8-core CCXs sound implausible. I still think it is likely the CCX stays quad-core. You could of course have two dies with these core counts using quad-core CCXs: 3 CCXs = 12 cores (48-core EPYC), and 4 CCXs = 16 cores (64-core EPYC).
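The core-count arithmetic behind both readings of the rumour can be sketched as follows. This is only an illustration: the function name is made up, and the four-dies-per-package figure is assumed from the rumour's own numbers (12 cores per die, 48 per CPU), not from any confirmed design.

```python
def package_cores(cores_per_ccx: int, ccx_per_die: int, dies_per_package: int = 4) -> int:
    """Total cores in a multi-die package (assumes identical dies)."""
    return cores_per_ccx * ccx_per_die * dies_per_package


# (cores per CCX, CCXs per die) for each reading of the rumour
configs = {
    "rumour die 1: 2 x 6-core CCX": (6, 2),
    "rumour die 2: 2 x 8-core CCX": (8, 2),
    "alternative:  3 x quad-core CCX": (4, 3),
    "alternative:  4 x quad-core CCX": (4, 4),
}

if __name__ == "__main__":
    for name, (per_ccx, per_die) in configs.items():
        die = per_ccx * per_die
        print(f"{name}: {die} cores/die, {package_cores(per_ccx, per_die)} cores/package")
```

Both the larger-CCX layout and the quad-core-CCX layout land on the same 12/48 and 16/64 totals, so the rumoured core counts alone cannot tell the two apart.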

It could also be that Die 1 is based on Zen 2 and Die 2 is the next generation based on Zen 3. We know AMD has leapfrogging teams at work.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,028
3,800
136
More than 4 cores per CCX only makes sense if they move away from a full mesh between the core/L3 slices within the CCX to something like a bi-directional ring bus. It's possible, but I also think it is unlikely. There are lots of things they could do outside the CCX to reduce general latency in a multi-CCX, multi-die product.
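To put numbers on why a full mesh gets expensive past four cores, here is a minimal sketch that treats each core/L3 slice as one node with point-to-point links. That node model and the function names are assumptions for illustration; the real CCX wiring is not described in this thread.

```python
def full_mesh_links(nodes: int) -> int:
    """Links needed so every node connects directly to every other node."""
    return nodes * (nodes - 1) // 2


def ring_links(nodes: int) -> int:
    """Links in a closed ring where each node has two neighbours (nodes >= 3)."""
    return nodes


def ring_worst_case_hops(nodes: int) -> int:
    """Longest shortest path on a bidirectional ring."""
    return nodes // 2


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(
            f"{n} nodes: full mesh {full_mesh_links(n)} links (1 hop), "
            f"ring {ring_links(n)} links (up to {ring_worst_case_hops(n)} hops)"
        )
```

Mesh link count grows quadratically (6, 15, 28 links for 4, 6, 8 nodes) while a ring grows linearly, at the cost of extra hops. That is the trade-off the post is pointing at.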


Also, I don't think R&D has increased anywhere near enough to do 2 Zeppelin-style dies on 7nm at/near the same time. 7nm is expensive.
 

Vattila

Senior member
Oct 22, 2004
817
1,450
136
I don't think R&D has increased anywhere near enough to do 2 Zeppelin-style dies on 7nm at/near the same time.

I also doubt they'll do two dies and product lines simultaneously at this stage. If the rumour has any truth in it at all, I think it is more likely that Die 1 is based on 7LP Zen 2 (for EPYC "Rome") and Die 2 is the next generation based on 7LP+ Zen 3 (for EPYC "Milan"). If that is the case, and assuming my codename interpretation is correct, Die 1 is "Starship 1" and Die 2 is "Starship 2".
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,028
3,800
136
I also doubt they'll do two dies and product lines simultaneously at this stage. If the rumour has any truth in it at all, I think it is more likely that Die 1 is based on 7LP Zen 2 (for EPYC "Rome") and Die 2 is the next generation based on 7LP+ Zen 3 (for EPYC "Milan"). If that is the case, and assuming my codename interpretation is correct, Die 1 is "Starship 1" and Die 2 is "Starship 2".
I believe the initial plan was 3 CCXs / 48 cores. Now the question is: did the plan change? There was enough time for it to change, and adding an extra CCX would only add mid-20s mm² to the die size. But either way, just 1 SoC.
 

Vattila

Senior member
Oct 22, 2004
817
1,450
136
either way just 1 SOC

Definitely. Bringing up two dies at the same time would be a validation nightmare. And validating 6-core and 8-core CCXs, and at the same time, that doesn't even bear thinking about. It seems obvious to me that they'll build on what they have learned about the quad-core CCX, its characteristics etc., to maximise reuse and minimise disruption from generation to generation. My bet is 3 quad-core CCXs per die for "Rome", 4 CCXs per die for "Milan".
 
  • Like
Reactions: Drazick

bsp2020

Member
Dec 29, 2015
106
122
116
I also doubt they'll do two dies and product lines simultaneously at this stage. If the rumour has any truth in it at all, I think it is more likely that Die 1 is based on 7LP Zen 2 (for EPYC "Rome") and Die 2 is the next generation based on 7LP+ Zen 3 (for EPYC "Milan"). If that is the case, and assuming my codename interpretation is correct, Die 1 is "Starship 1" and Die 2 is "Starship 2".

If AMD would make 12 core chips, it makes sense for them to do it in 12nm (Pinnacle Ridge). If it was up to me, I'd put 2 6-core CCXs on 12nm process with 1MB L3 cache per core, as done in RR. It should be smaller than the current Zeppelin die and should fit nicely between their current 8 core 1800X and 12/16 core dual-die ThreadRipper. Since they will skip 12nm EPYC, lack of L3 cache should not matter much and ThreadRipper 2 would demolish Intel's HEDT offering.
AMD introduced 6 core Phenom II about a year after they brought out 4 core Phenom II. So, there is a precedent...
 

Vattila

Senior member
Oct 22, 2004
817
1,450
136
If AMD would make 12 core chips, it makes sense for them to do it in 12nm (Pinnacle Ridge). If it was up to me, I'd put 2 6-core CCXs on 12nm process

Too expensive. Creating a 6-core CCX is a major undertaking, unlikely to be seen on any die, in my view. My hunch is that they'll build the next generation dies using their quad-core CCX lego piece, and I think they have laid out a straight-forward roadmap for their "Zeppelin" successors; with 7LP "Starship 1" having 3 CCXs and 7LP+ "Starship 2" having 4 CCXs. If so, you'll get your 12-core Ryzen and 24-core Threadripper based on "Starship 1", hopefully next year.
 

dnavas

Senior member
Feb 25, 2017
355
190
116
Latest rumour from WCCFTech says the same thing — two dies:
  • Die 1: Single CCX 6 core, each die 12 core, single CPU maximum 48 core
  • Die 2: Single CCX 8 core, each die 16 core, single CPU maximum 64 core

Well, a clearer refutation of my query is likely not in the offing :>
If AMD uses 12nm's 10% performance improvement to increase cores by 50%, I'll eat the apparel item of your choice. AMD badly needs higher clocks, not more cores.
It would make sense that the 16-core/die rumor is for 7nm+, except that we've heard crazy rumors before, and I'm still waiting for my 5GHz Ryzen. More likely, one or both of these are pure fiction. I'd love to see a 32-core TR that boosted well north of 4GHz. Take my money. And so, because it sounds too much like what I'd want to hear, I think I'll stick with 48 as best guess for late 2019.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,803
1,286
136
If AMD uses 12nm's 10% performance improvement to increase cores by 50%, I'll eat the apparel item of your choice. AMD badly needs higher clocks, not more cores.
12LP only has a 7.5 track height and uses a BEOL similar to 12FDX for the first two layers. M0->M2 = 12nm(56p/70p/56p), and M3->Mx = 14nm(64p+). The transistor is a retrograde pull-up from 7LP.

More clocks come from more idle cores.

https://patents.google.com/patent/US9634003
"continuous RX construct allows an optimal device performance without loss of area."

Numbers for 22FDX;
https://i.imgur.com/jqfQG3Z.jpg
Thus, 12LP's 7.5T will be faster than 14LPP's 9T for less leakage.
 

TempAcc99

Member
Aug 30, 2017
60
13
51
Definitely. Bringing up two dies at the same time would be a validation nightmare. And validating 6-core and 8-core CCXs, and at the same time, that doesn't even bear thinking about. It seems obvious to me that they'll build on what they have learned about the quad-core CCX, its characteristics etc., to maximise reuse and minimise disruption from generation to generation. My bet is 3 quad-core CCXs per die for "Rome", 4 CCXs per die for "Milan".

Agree. The simple reason for them not going with 4x4 CCX is that such a die could for sure only be used in the server space. Even 4x3 is stretching it for consumer chips. A hypothetical Ryzen 3800X would then be a 12-core, the R5 1600 an 8-core and the R3 a 6-core? And what happens to the APUs? Still 4 cores? Possible. At least the R3 then has its clear place above the APU, in contrast to now.

Still pricing these correctly will get tough. The 12-core versions should in this case almost certainly get a price increase.
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,496
1,341
136
I think the Zen 2 1700/1800 equivalent will feature 16 cores vs. 8 cores on Zen. If they stick with the same lineup, the 1600 version of Zen 2 would have 12 cores vs. 6 cores with Zen. When you get to 32 and 48 cores, I think that will be the Zen 2 version of the Threadripper parts.
 
  • Like
Reactions: Drazick

dooon

Member
Jul 3, 2015
89
53
61
Latest rumour from WCCFTech says the same thing — two dies:
  • Die 1: Single CCX 6 core, each die 12 core, single CPU maximum 48 core
  • Die 2: Single CCX 8 core, each die 16 core, single CPU maximum 64 core
However, 6-core and 8-core CCXs sound implausible. I still think it is likely the CCX stays quad-core. You could of course have two dies with these core counts using quad-core CCXs: 3 CCXs = 12 cores (48-core EPYC), and 4 CCXs = 16 cores (64-core EPYC).

It could also be that Die 1 is based on Zen 2 and Die 2 is the next generation based on Zen 3. We know AMD has leapfrogging teams at work.

Source (Chiphell)
 

Gideon

Platinum Member
Nov 27, 2007
2,003
4,960
136
Agree. The simple reason for them not going with 4x4 CCX is that such a die could for sure only be used in the server space. Even 4x3 is stretching it for consumer chips. A hypothetical Ryzen 3800X would then be a 12-core, the R5 1600 an 8-core and the R3 a 6-core? And what happens to the APUs? Still 4 cores? Possible. At least the R3 then has its clear place above the APU, in contrast to now.

Still pricing these correctly will get tough. The 12-core versions should in this case almost certainly get a price increase.
There are other possibilities. AMD might make 2 chips on 7nm:

1. 8 core (2 CCX) + GPU chip for Ryzen (and possibly high-end >=35W Ryzen Mobile).
2. 16 core (4 CCX) chip for Threadripper and 32 - 64 cores (2-4 of those chips) for EPYC (and possibly highest end Threadripper).

I don't find it all that likely, but also wouldn't rule it out completely.

Anyway, if there is indeed a 16-core chip in the works, it makes the appearance of an integrated GPU on the 8-core chip considerably more likely, as that same chip wouldn't make any sense in EPYC or Threadripper anyway (that market would belong to the 16-core chip).
And on the other hand, the 16 core chip would be very much needed in EPYC 2019 and onwards to reach higher core-counts, as Intel will not be standing still, when their main revenue source is under attack.
 
  • Like
Reactions: Drazick

Topweasel

Diamond Member
Oct 19, 2000
5,437
1,659
136
If AMD would make 12 core chips, it makes sense for them to do it in 12nm (Pinnacle Ridge). If it was up to me, I'd put 2 6-core CCXs on 12nm process with 1MB L3 cache per core, as done in RR. It should be smaller than the current Zeppelin die and should fit nicely between their current 8 core 1800X and 12/16 core dual-die ThreadRipper. Since they will skip 12nm EPYC, lack of L3 cache should not matter much and ThreadRipper 2 would demolish Intel's HEDT offering.
AMD introduced 6 core Phenom II about a year after they brought out 4 core Phenom II. So, there is a precedent...

Pinnacle Ridge was always going to be a low cost process change. In fact AMD didn't even originally plan for it to be on 12nm which isn't even a real die shrink anyways. It was going to be on 14nm+ which is what the 12nm process really is. Even then AMD isn't going to increase die size by 50% on what is in theory less than a 20% increase in density.
 
  • Like
Reactions: CatMerc

bsp2020

Member
Dec 29, 2015
106
122
116
Even then AMD isn't going to increase die size by 50% on what is in theory less than a 20% increase in density.
Adding 4 more cores won't increase the die size by 50%. Look at the die shot of Zeppelin. Cores are only about 60% of the area and the rest is uncore/IO/etc. Adding 4 more cores would only increase the die size by less than 20% (1.3*.9 = 1.17). If they cut the L3 cache size, I think the die size will be about the same as the current Zeppelin. So, it is very possible to do it with 12nm.
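The 1.3*.9 = 1.17 estimate can be reproduced with a short sketch. The inputs are the post's own rough figures (cores at about 60% of Zeppelin's area, 8 to 12 cores, and a 0.9 area scale assumed for 12nm across the whole die); none of them are measured values, and the function name is illustrative.

```python
def relative_die_area(core_fraction: float, core_growth: float, area_scale: float) -> float:
    """Die area relative to the original die.

    core_fraction: share of the original die taken by cores (rough guess)
    core_growth:   fractional increase in core count (0.5 = 8 -> 12 cores)
    area_scale:    assumed whole-die area multiplier from the process change
    """
    grown = 1.0 + core_fraction * core_growth  # only the core area grows
    return grown * area_scale


if __name__ == "__main__":
    area = relative_die_area(core_fraction=0.60, core_growth=0.50, area_scale=0.90)
    print(f"relative die area: {area:.2f}")
```

With these inputs the die comes out roughly 17% larger than Zeppelin. The estimate is sensitive to the 0.9 area scale, which optimistically applies to uncore and I/O as well as the cores.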

I'm not saying AMD has done it. However, if they do want to increase the number of cores in a CCX, which I think they do, it just seems logical to try it with 12nm first rather than with 7nm. The 7nm process will be too different and will present many challenges of its own, and moving to a brand new process node is not the best time to experiment with big architectural changes.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,803
1,286
136
It was going to be on 14nm+ which is what the 12nm process really is.
14nm+ is the Raven Ridge process.

12LP is not 14LPP or 14LP Plus Plus. GlobalFoundries' 14nm+ is the CTE of 14LPP (CTE is continuous transistor enhancement). Examples of CTEs: Carrizo is on the base GF28A node; Bristol Ridge and Stoney Ridge are on the CTE GF28HPA node.

12LP is the FinFET part of the 12nm nodes, as GlobalFoundries wants the two nodes (TeraHertz and FinFETs) to be inseparable (together forever) in the PDK. Thus, 12LP goes with the 12FDX node.
 
  • Like
Reactions: Drazick