
Speculation: The CCX in Zen 2

How many cores per CCX in 7nm Zen 2?

  • 4 cores per CCX (3 or more CCXs per die)

    Votes: 50 43.9%
  • 6 cores per CCX (2 or more CCXs per die)

    Votes: 45 39.5%
  • 8 cores per CCX (1 or more CCXs per die)

    Votes: 19 16.7%

  • Total voters
    114

Gideon

Senior member
Nov 27, 2007
I'm sticking with a 6-core CCX for desktop and an 8-core CCX for servers. We've got very strong rumors pointing towards this.
That would mean doubling the engineering effort (IMO unnecessarily). Considering how much AMD has recycled Zeppelin (essentially not changing anything for 12 nm Ryzen and Threadripper), I don't see them suddenly doing two totally unrelated designs for server and desktop.
 

Vattila

Senior member
Oct 22, 2004
a 4-core ccx would not be 50mm2 on 7nm but more likely 125mm2
Typo?

The size of a CCX is 45.5 mm² on 14LPP, and with over 2x density on the 7LP process, a straight shrink should come in at less than half the size. 25-50 mm² allows for some additional transistor budget for core improvements and larger caches.
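The shrink arithmetic works out like this (a back-of-the-envelope sketch; the density gain and growth factor are assumptions from the discussion, not measured values):

```python
# Back-of-the-envelope CCX area estimate for a 14LPP -> 7LP shrink.
# density_gain and extra_budget are speculative inputs, not measurements.

ccx_14lpp_mm2 = 45.5   # Zen CCX area on GloFo 14LPP
density_gain = 2.0     # assumed 7LP density improvement (">2x" claimed)
extra_budget = 1.5     # assumed headroom for wider cores / larger caches

shrunk = ccx_14lpp_mm2 / density_gain   # straight optical shrink
with_growth = shrunk * extra_budget     # shrink plus transistor growth

print("straight shrink:", shrunk, "mm^2")        # 22.75 mm^2
print("with extra budget:", with_growth, "mm^2") # 34.125 mm^2
```

Even with 50% extra transistor budget, the result lands comfortably inside the 25-50 mm² window above.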
 

Glo.

Platinum Member
Apr 25, 2015
Perfectly straightforward:
Dual 4-core CCX in the Matisse design.
8-core CPU.
16 MB of L3 cache.
Around 120 mm² die size.

The 48-core EPYC2 is made from 6 dies.
The 64-core is made from 8 dies.

AMD went the 8-core-die route for perfectly simple scaling of the CPU.
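The core and cache counts are easy to sanity-check (these configurations are the thread's speculation, not confirmed specs):

```python
# Sanity-check the speculated Matisse / EPYC2 chiplet math.
# All configurations here are speculation from this thread, not confirmed specs.

cores_per_die = 8
ccx_per_die = 2
l3_per_ccx_mb = 8  # assumes the same 8 MB per CCX as Zen 1

print("L3 per die:", ccx_per_die * l3_per_ccx_mb, "MB")  # 16 MB
for dies in (6, 8):
    print(dies, "dies x", cores_per_die, "cores =", dies * cores_per_die, "cores")
# 6 dies x 8 cores = 48 cores
# 8 dies x 8 cores = 64 cores
```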
 

Vattila

Senior member
Oct 22, 2004
48 core EPYC2 is made from 6 dies.
64 Core is made from 8 dies.
What is the topology between dies? Again, once we go beyond 4, direct-connect is out of the question. Hence my doubt.

And what about memory controllers?
 

jpiniero

Diamond Member
Oct 1, 2010
I still think it's 4x4, and that the GloFo version was going to be 4x3.
 

Glo.

Platinum Member
Apr 25, 2015
What is the topology between dies? Again, since we go beyond 4, direct-connect is out of the question. Hence why I doubt this.
Ask the IO die on the package, with an interposer connecting all of the dies ;).
 

Glo.

Platinum Member
Apr 25, 2015
That's too much of a change with Zen 2.
Why would it have to be the same? ;)

Maddie already posted some of the reasons: TSMC makes the server CPUs because of the efficiency of their process and their experience with large interposers (GV100), while GF's process clocks higher than TSMC's, which is why that process gets Matisse, the AM4 CPUs.
 

french toast

Senior member
Feb 22, 2017
Perfectly straight forward:
Dual 4 core CCX in Matisse design.
8 core CPU.
16 MB's L3 cache.
Around 120 mm2 die size.

48 core EPYC2 is made from 6 dies.
64 Core is made from 8 dies.

AMD decided to go 8 core design route for perfectly simple scaling of the CPU.
I don't agree with the core numbers, but this seems reasonable IMO. I agree we will be looking at a smaller ~150 mm² die for desktop, but that does not play into the core-wars strategy... it is possible we get 12 cores, in a 3x4 CCX layout, which offers greater flexibility for server.
 

Glo.

Platinum Member
Apr 25, 2015
I don't agree with the core numbers, but this seems reasonable imo, I agree we will be looking at a smaller ~150mm2 die for desktop, but does not play into the core wars strategy...it is possible we get 12 cores, but in 3x4 CCX...that offers up greater flexibility with server.
You do understand that a dual-Matisse package will fit perfectly in an AM4 socket? ;)

If AMD needs more cores to offer a better product than Intel, they will make a dual-CPU package for AM4. If they don't, we will simply get an 8-core design.
 

JoeRambo

Senior member
Jun 13, 2013
while a 6-core ccx would be 150mm2
That does not pass any common-sense check. Intel's Coffee Lake is ~150 mm² with a GPU on board. AMD's 4 cores and 8 MB of L3 are estimated at ~44 mm². How can 6 cores be that large on a denser process?
 

jpiniero

Diamond Member
Oct 1, 2010
Why would it have to be the same? ;)
R&D costs, ensuring they meet project schedules, you know, that sort of thing. They already made a rather large change in essentially switching from GloFo to TSMC when they were well into development.
 

Glo.

Platinum Member
Apr 25, 2015
R&D costs, ensuring they meet project schedules, you know, that sort of thing. They already made a rather large change in essentially switching from GloFo to TSMC when they were well into development.
Do you think the whole Rome design is out of budget for AMD, especially when that R&D cost gives AMD the opportunity for a much higher asking price and margin on EPYC2 CPUs?

Imagine that manufacturing cost goes up from the current ~$100 per EPYC CPU to ~$200, but allows them to charge not $4,999 but $9,999 for the highest SKU.
 

jpiniero

Diamond Member
Oct 1, 2010
Do you think all of the design of Rome is out of budget for AMD, especially when that R&D cost will give AMD opportunity for much higher asking price and margin, from EPYC2 CPUs?
If anything, that's exactly why they also changed from 12-core dies to 16 when they switched to TSMC. My guess is that the 16-core-die products will be off-label, so you won't see them unless you are Amazon or Google.
 

Glo.

Platinum Member
Apr 25, 2015
If anything, that's exactly why they also changed from 12 core dies to 16 when they switched to TSMC. My guess is that the 16 core die products will be offlabel so you won't see them unless you are Amazon or Google.
What 12 and 16 core dies?

There are only 8 core CPUs, and 8 core dies.
 

jpiniero

Diamond Member
Oct 1, 2010
What 12 and 16 core dies?

There are only 8 core CPUs, and 8 core dies.
That was what I was getting at: what you are suggesting is way too big of a change for them at this point. An IO die, etc. is something more for Milan, or maybe even later when they switch sockets.
 

Abwx

Diamond Member
Apr 2, 2011
Why not 4 x 4-core CCX chiplets on an active interposer? See my earlier posts.
This would require yet another interconnect between CCXs. If they did things well, IF is easily scalable; doubling its path widths and the cache sizes is the most logical approach perf/watt-wise.
 

eek2121

Senior member
Aug 2, 2005
A lot of you guys aren't thinking this through. Zen 2 has to be compatible with AM4, and AM4 has dual-channel RAM. Adding more cores will add bandwidth and latency constraints as cores become starved for memory. Zen 2 will be 2x 4-core CCXs, just like previous designs. You won't see a core increase until a new socket.
 

Vattila

Senior member
Oct 22, 2004
Dual 4 core CCX in Matisse design.
It seems pretty obvious that the two direct-connected CCXs in Zeppelin scale perfectly to 4 CCXs using direct-connect. Why not take advantage of that?

This forms a hierarchical two-layered topology of direct-connected quads (what I, probably somewhat incorrectly, call a quad-tree topology in my OP). Then optimise this topology by adding further connections as far as the metal layers allow, creating a more complex and optimised topology that brings down the average latency between any two cores.

Then connect up to 4 of these 4-CCX dies together using direct-connect on the package, as they currently do. This avoids yet another sub-optimal interconnect scheme between the 6 to 8 dies in your approach, which also requires the packaging to change to a large interposer underpinning all the dies.

The simplest options I see:
  1. If we assume AMD does not move to a chiplet design, then just add two more direct-connected 4-core CCXs to the die.
  2. If AMD moves to a chiplet design, then implement the uncore on an active interposer with 4 x 4-core CCX chiplets mounted on top.
Both approaches can reuse the current MCM packaging scheme.
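The direct-connect argument comes down to how fast the link count grows: fully connecting n nodes takes n*(n-1)/2 links. A quick sketch, with node counts taken from the posts above:

```python
from math import comb

# Links required to fully direct-connect n nodes: n*(n-1)/2, i.e. C(n, 2).
def direct_connect_links(n: int) -> int:
    return comb(n, 2)

print(direct_connect_links(2))  # 1  link  (Zeppelin: 2 CCXs per die)
print(direct_connect_links(4))  # 6  links (4 CCXs per die, or 4 dies per package)
print(direct_connect_links(8))  # 28 links (why 8 dies can't cheaply be direct-connected)
```

The jump from 6 to 28 links is why going past 4 dies forces a different interconnect scheme.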
 

Vattila

Senior member
Oct 22, 2004
You won't see a core increase until a new socket.
So we all thought (myself included) about Threadripper's Socket TR4. Yet here we are with the 32-core Threadripper WX.

I think 16 cores on AM4 is now a given for Ryzen 3000.
 

Vattila

Senior member
Oct 22, 2004
Woohoo! The poll now has the 4-core CCX in the lead!
 

Glo.

Platinum Member
Apr 25, 2015
A lot of you guys aren't thinking this through. Zen2 has to be compatible with AM4. AM4 has dual channel RAM. Adding more cores will add bandwidth and latency constraints as cores become starved of RAM. Zen2 will be 2x4 core CCXes, just like previous designs. You won't see a core increase until a new socket.
I mostly agree, apart from the last bit ;).

Nothing stops AMD from offering a 16-core SKU on an AM4 board with Zen 2, built from two Matisse CPUs.
 

JoeRambo

Senior member
Jun 13, 2013
It seems pretty obvious the two CCXs in Zeppelin scales perfectly to 4 CCXs using direct-connect. Why not take advantage of that?
How is this "perfect" scaling defined? By having 80 ns of latency? Please don't drink too much of the AMD Kool-Aid. The fact that one needs to mention 4 cores and an interconnect in the same sentence is not "perfect"; the opposite is true.
 

Vattila

Senior member
Oct 22, 2004
How is this "perfect" scaling defined?
4 CCXs will have no worse latency than 2 CCXs, since they will all be directly connected (6 links between 4 CCXs). See my OP.
 

JoeRambo

Senior member
Jun 13, 2013
4 CCXs will have no worse latency than 2 CCXs, since they all will be directly connected (6 links between 4 CCXs). See my OP.
That is simply not true. Instead of checking just 1 CCX, requests will need to be sent to 3 entities, and the same proliferation of targets will happen at the socket level (or, god forbid, in dual socket). Coherency is a nice feature, but it does not come for free.

2-socket systems are much, much easier than 4S and scale better, even when the basic QPI interconnect has the same speeds and latencies.
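The probe-traffic point can be illustrated with a toy model (a deliberate simplification for illustration; real protocols use snoop filters and directories, so this is not AMD's or Intel's actual scheme):

```python
# Toy coherency model: on a miss to a potentially shared line, the
# requesting CCX may have to probe every other coherent entity.
# Illustrative only -- real protocols filter probes to cut this down.

def probe_targets(ccx_per_socket: int, sockets: int) -> int:
    total_ccx = ccx_per_socket * sockets
    return total_ccx - 1  # everyone except the requester

print(probe_targets(2, 1))  # 1 target  (Zeppelin-style die: 2 CCXs)
print(probe_targets(4, 1))  # 3 targets (proposed 4-CCX die)
print(probe_targets(4, 2))  # 7 targets (dual socket)
```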
 
