Question Zen 6 Speculation Thread

Page 332

marees

Platinum Member
Apr 28, 2024
2,175
2,810
96
Olympic Ridge: IOD + CCD
Medusa Point: SoC + (optional) CCD
Medusa Premium: SoC + AT4 GMD
Medusa Halo: SoC + AT3 GMD

Yes

1 IOD, 1 CCD, 3 SoCs, 2 GMDs (which are also used for dGPUs)

Doubt that will become an actual product

Just a MID for PCIe/Display

Medusa point 1 alone has an optional 12p ccd. This is known as medusa point-1 high

Medusa point 3 is bumblebee. It doesn't have optional ccd

Medusa point 2 sits between 1 & 3 but has no optional config. This is not on AMD's leaked roadmaps (unlike bumblebee), hence the speculation that it is 2028 or later, depending on TSMC node availability

So on FP10 socket we have
  1. Medusa premium — 4p+8c ccd + mid with 2lp + AT4 GCD with 12wgp/24cu
  2. Medusa point 1 high — 12p ccd + soc with 4p + 4c + 2lp + unknown rdna 3.5 CU
  3. Medusa point 1 — soc with 4p + 4c + 2lp + unknown rdna 3.5 CU
  4. Medusa point 2 — soc with less than 4/4/2 cpu & rdna 3.5 cu
  5. Medusa point 3 (bumblebee) — soc with 2/2/2 & rdna 3.5 cu

There is a disagreement between the two posts.
  • 1 CCD only, with 12 classic cores
  • 2 CCDs
    1. with 12 classic cores
    2. with 4 classic cores and 8 dense cores
If the hybrid CCD is real, then it would IMO be a better fit for Medusa point 1 high. 8p + 12c is IMO better suited for mobile than 16p+4c

going back thru all the leaks, it appears

  1. there is only 1 cpu ccd = 12 zen 6p cores
  2. 6 to 3 soc cum iod dies
    1. medusa point 3 — bumblebee 2p + 2c +2lp + RDNA 3.5 gpu cores
    2. medusa point 2 — possibly scrapped/delayed
    3. medusa point 1 — 4p + 4c + 2lp + 8CU RDNA 3.5
    4. medusa point 1 high — medusa point 1 + 12p ccd
    5. medusa premium — 4p + 8c + 2lp apu cum iod + AT4 (24 CU / 12 wgp RDNA 5) GCD
    6. medusa halo — 4p + 8c + 2lp apu cum iod + AT3 (48 CU / 24 wgp RDNA 5) GCD + optional 12p ccd
    7. magnus — 3p + 8c apu cum iod + AT2 (72? CU / 36? wgp RDNA 5)
Above suggests that medusa halo (MDS-H) & medusa mini halo (MDS-P medusa premium) share the same base iod/apu with 4p + 8c + 2lp with MDSH having the option to add 12p core ccd on top of this config
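To make the core counts in the list above easier to check, here is a minimal Python sketch that adds the optional 12p CCD to the base dies. All figures are the rumoured ones from this post, not confirmed specs, and the names `CoreMix`, `BASE` and `with_ccd` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreMix:
    p: int = 0   # classic ("p") cores
    c: int = 0   # dense ("c") cores
    lp: int = 0  # low-power cores

    def total(self) -> int:
        return self.p + self.c + self.lp

    def __add__(self, other: "CoreMix") -> "CoreMix":
        return CoreMix(self.p + other.p, self.c + other.c, self.lp + other.lp)


# The single CPU chiplet from the leaks: 12 classic Zen 6 cores.
CCD_12P = CoreMix(p=12)

# Base SoC/APU die per product, as listed in the post (rumoured, unconfirmed).
BASE = {
    "medusa point 3 (bumblebee)": CoreMix(p=2, c=2, lp=2),
    "medusa point 1": CoreMix(p=4, c=4, lp=2),
    "medusa premium": CoreMix(p=4, c=8, lp=2),
    "medusa halo": CoreMix(p=4, c=8, lp=2),
    "magnus": CoreMix(p=3, c=8),  # no lp cores per the post
}


def with_ccd(name: str) -> CoreMix:
    """Base die plus the optional 12p CCD."""
    return BASE[name] + CCD_12P


mp1_high = with_ccd("medusa point 1")  # "medusa point 1 high"
halo_max = with_ccd("medusa halo")     # medusa halo with the optional CCD

for label, mix in [("medusa point 1 high", mp1_high), ("medusa halo + ccd", halo_max)]:
    print(f"{label}: {mix.p}p + {mix.c}c + {mix.lp}lp = {mix.total()} cores")
```

With these inputs medusa point 1 high comes out as 16p + 4c + 2lp, which matches the 16p+4c figure mentioned earlier in the post.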

magnus remains separate. doesn't have lp cores (unlike PS6)

to add:
medusa point-1 is on FP8 socket (same as strix point)
medusa point-1 high is on FP10 socket (same as MDSP)

 
Last edited:

fastandfurious6

Senior member
Jun 1, 2024
895
1,033
96
7+ SKUs??? this is confusing even for us....

though the base parts are simply:

CPU
a) 12p full zen6 CCD (+ low-power interconnect previously only on STXHalo)
b) 4p + 8c frankenstein "point" hybrid
c) 2LP cores (new IOD everywhere)

GPU
d) 2/8CU rdna3.5 (same as zen5)
e) 24CU rdna5 increments for Halo (24 half halo / 48 full halo / 72 magnus)
 
Last edited:

fastandfurious6

Senior member
Jun 1, 2024
895
1,033
96
the 4p + 8c frankenstein is likely due to failure to contain full 12c ccd in ultramobile temps/V. But what is its true benefit to justify it?

Beyond that I don't see a market for medusa point.... looks like a really bad marketing plan. I doubt Strix Point offerings did well in the market either. Too expensive, better options for same price.


TBH unless there's serious logistics reasons to keep Point series, AMD should scrap it... all of it


Just keep the equivalents of:

9700x = 1CCD (12p)
9950x = 2CCD (12p + 12p)
Halo Half = 1CCD + 24CU
Halo Full = 1/2CCD + 48CU

that's it. same for both desktop + laptop:

- new zen6 CCD has low idle W + better temps/V by using by default the better low-power interconnect (used in STXHalo)
- 2LP cores will do wonders for OS and ultramobile

the rest should be just logistical offerings / yield failures i.e. 6 cores half CCD disabled and use these for ultramobile offerings
 

marees

Platinum Member
Apr 28, 2024
2,175
2,810
96
I don't see a market for medusa point
medusa point 1 (without the added 12p ccd) sits in the FP8 socket of strix point. that matters a lot to AIBs

medusa point 3 (bumblebee) is the low power variant that replaces Krackan (as per road map) — mendocino will remain until zen 7 grimlock point 4?

medusa premium will sit in the same socket, FP10, as medusa point 1 high with the added 12p ccd

that leaves medusa halo & magnus — problem solved if both of them have the same soc (4p + 8c + 2lp) as medusa premium — this is my suggestion. not sure how AMD will do this

(& medusa point 2 is either later or scrapped — will only release if the stars align)
 
Last edited:

fastandfurious6

Senior member
Jun 1, 2024
895
1,033
96
MP1 sits on FP8:
same expensive laptops without might of full zen6 12/24 cores
I speculate it will fail

MP3/MP2/bumblepoint/krakan/phoenix/mendocino/cappuccino/assassino:
nobody cares TBH
price : perf wise, 2nd-hand previous gens are better
the only benefit is for AMD to cut costs, does it achieve that?

"medusa premium":
should be just called half-halo or similar


Halo definitely does have a market on its own

It surprises me to see the 4p+8c hybrid everywhere in so many offerings. Does it really offer such good thermal/V benefits?
 

marees

Platinum Member
Apr 28, 2024
2,175
2,810
96
It surprises me to see the 4p+8c hybrid everywhere in so many offerings. Does it really offer such good thermal/V benefits?
is there any workload that benefits from more than 4 zen 6 classic cores (& not outsourced to the zen 6c cores)
 

fastandfurious6

Senior member
Jun 1, 2024
895
1,033
96
you're kinda right for pure gaming / steamdeck (1 thing running) but STX didn't work

STX (ZEN5) = 4p + 8c
PHX (ZEN4) = 8p

upgrade from PHX to STX is really not worth it, not much benefit and high STX prices

PHX = KRK = "lower end" zen6 etc

zen6c "dense" likely same perf as zen4 full with better thermals

cross-CCD latency etc


my speculation is that ramming the hybrid 4p8c everywhere won't work

full 12p CCD will work better in all scenarios except pure gaming
 
  • Like
Reactions: Tlh97 and marees

itsmydamnation

Diamond Member
Feb 6, 2011
3,113
3,964
136
When you consider total cores in a given area and a limited TDP per core, there will come a crossover point where more efficient, lower-clocked cores will be better than fewer, higher-clocked cores that take more area.

So for desktop we only pretend to care about power, so give us fat cores and turn the voltage up to 11. But outside of that and specific server SKUs, within a given TDP the dense cores are going to be cheaper to make and more efficient.

On the flip side, there will be a point where lightly threaded workloads generally don't scale to more cores/threads. So if AMD have picked 4+8, I would assume that's been profiled/modelled.
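A toy model of that crossover, as a rough sketch only. Everything numeric here is an assumption picked for illustration: the same silicon area fits 8 classic or 12 dense cores, both core types have the same IPC, sustained clock grows with the cube root of per-core power (constant `k`), and the clock caps are 5.5 GHz and 3.5 GHz. None of these are AMD figures.

```python
def sustained_clock_ghz(watts_per_core: float, fmax_ghz: float, k: float = 2.0) -> float:
    """Assumed power model: clock ~ k * P^(1/3), capped at the core's max clock."""
    return min(fmax_ghz, k * watts_per_core ** (1 / 3))


def throughput(cores: int, fmax_ghz: float, tdp_watts: float) -> float:
    """All-core throughput in 'core-GHz', with the TDP shared evenly across cores."""
    return cores * sustained_clock_ghz(tdp_watts / cores, fmax_ghz)


# Hypothetical chiplets occupying the same area.
CLASSIC = dict(cores=8, fmax_ghz=5.5)
DENSE = dict(cores=12, fmax_ghz=3.5)


def crossover_tdp(max_tdp: int = 300):
    """First whole-watt TDP at which the classic config out-runs the dense one."""
    for tdp in range(1, max_tdp + 1):
        if throughput(tdp_watts=tdp, **CLASSIC) > throughput(tdp_watts=tdp, **DENSE):
            return tdp
    return None


for tdp in (15, 45, 200):
    c = throughput(tdp_watts=tdp, **CLASSIC)
    d = throughput(tdp_watts=tdp, **DENSE)
    print(f"{tdp:>3} W: classic {c:5.1f}  dense {d:5.1f}")
print("crossover TDP (W):", crossover_tdp())
```

Under these assumptions the dense config wins at laptop-class power budgets and the classic config only pulls ahead once the dense cores have hit their clock cap. Where exactly the crossover lands depends entirely on the made-up constants.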
 

fastandfurious6

Senior member
Jun 1, 2024
895
1,033
96
you're theorizing, in theory we 100% agree

I'm also only talking mobile


however the reality is that STX completely failed.

STX (4p8c hybrid) vs 9955HX (full 16p) = STX loses literally everywhere from 5% to 70% or so

even gaming: both FHD and QHD, 9955HX squeezes from +15 to +60 fps... same GPU + TDP
STX laptops are also so expensive that it's hard to justify vs 9955HX unless you need a specific model for whatever reason unrelated to CPU (i.e. 4k screen unavailable to 9955HX models)


the only remaining theory is 4p+8c doing better than full CCD in ultramobile/lowTDP situations, but even there STX gets trashed..... in all scenarios, both low and high power


compare the CPUs with same GPUs on silent mode


asus strix G16 (9955HX + 5070ti):
[image: 1strix.png]

asus zephyrus G14 (STX + 5070ti)
[image: 1zeph.png]





huge difference.......

also, 9955HX silent (55W TGP GPU) trades blows with HX370 manual (max power, 50+dbA) on certain parts:
[image: 1zeph1manual.png]



but all that gets completely owned by 9955HX turbo (manual can get extra +5%) :
[image: 1strix1turbo.png]





all this is reflected in real games FPS too

farcry6 QHD ultra:
9955HX(silent) = 83fps
STX(silent) = 54fps
STX(perf) = 92fps
9955HX(perf) = 118fps

cyberpunk QHD ultra:
9955HX(turbo) = 84fps
STX(turbo) = 62fps
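For reference, the relative gaps implied by the FPS figures above can be worked out with a few lines of Python. The numbers are copied from this post; the helper name `uplift_pct` is just for illustration.

```python
def uplift_pct(a_fps: float, b_fps: float) -> float:
    """Percentage by which a_fps exceeds b_fps (negative if a is slower)."""
    return (a_fps / b_fps - 1.0) * 100.0


# FPS figures as quoted in the post (QHD ultra).
far_cry_6 = {"9955HX silent": 83, "STX silent": 54, "STX perf": 92, "9955HX perf": 118}
cyberpunk = {"9955HX turbo": 84, "STX turbo": 62}

comparisons = [
    ("Far Cry 6, silent vs silent", far_cry_6["9955HX silent"], far_cry_6["STX silent"]),
    ("Far Cry 6, perf vs perf", far_cry_6["9955HX perf"], far_cry_6["STX perf"]),
    ("Far Cry 6, 9955HX silent vs STX perf", far_cry_6["9955HX silent"], far_cry_6["STX perf"]),
    ("Cyberpunk, turbo vs turbo", cyberpunk["9955HX turbo"], cyberpunk["STX turbo"]),
]

for label, hx, stx in comparisons:
    print(f"{label}: {uplift_pct(hx, stx):+.1f}%")
```

In like-for-like modes the 9955HX lead works out to roughly +54%, +28% and +35%. Silent 9955HX against performance-mode STX in Far Cry 6 is about 10% behind, so "similar" rather than ahead in that particular game.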


basically, silent 9955HX > performance (loud) STX...

literally silent 9955HX vs perf/loud STX HX370:

+50% Fire Strike score
+50% Cinebench MT,
+10% ST
+45% 3DMARK CPU score
~+25% SPECviewscore

with real games FPS being similar....


the links full reviews


lastly, AMD has developed the ultra-low-power interconnect used on Halo, but they didn't manage to fit it on 9955HX in time. zen6 should have it by default, which makes 24 full cores viable to run in low power mode
 
Last edited:

itsmydamnation

Diamond Member
Feb 6, 2011
3,113
3,964
136
You're comparing SoCs to SoCs and then using that as a proxy for cores / core perf/watt; that's not going to end well. Too many unknown variables, even under-the-cover firmware/AGESA config. Even on my Zen3 laptop using ATTU/RyzenController/STAPM I can create very different looking perf/watt/max perf configurations.

Also, none of what you have pasted shows anything about core loading / core clock / core voltage / wattage, let alone those things over time, what the base STAPM config is, etc. It's just a big soup of nothing.

there is an example
 
  • Wow
Reactions: fastandfurious6

itsmydamnation

Diamond Member
Feb 6, 2011
3,113
3,964
136
Yeah, as predicted. Just like basically everyone else already has concluded a long time ago (Apple, Qualcomm, Mediatek, Intel, …). AMD was just a bit late to the party.
AMD was first to do it right..........

having workloads end up on A53 was horrid; having xls threads running on RPL-U E-Cores is so bad, it's horrible.

Apple is good but they don't have to deal with supporting big wide registers.
 

StefanR5R

Elite Member
Dec 10, 2016
6,827
10,924
136
so AMD architect team really concluded that P+E combo works better most of the time?
No. AMD does not have P and E cores.

Right now they have one and the same core, but either physically optimized with a focus on peak frequency, or physically optimized with the focus more shifted towards area. (Besides different FP pipeline width in different markets.)

That's not new and I am puzzled why people still don't get in which market areas a combined non-dense + dense chip makes sense.

Just like basically everyone else already has concluded a long time ago (Apple, Qualcomm, Mediatek, Intel, …). AMD was just a bit late to the party.
AMD is doing something rather differently than that bunch. AMD doesn't put different core microarchitectures into a single chip. Again, that's old now and folks should have realized how that differs.
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,505
711
126
AMD was first to do it right..........

having workloads end up on A53 was horrid; having xls threads running on RPL-U E-Cores is so bad, it's horrible.

Apple is good but they don't have to deal with supporting big wide registers.
There are of course some drawbacks for certain cases. But the initial issues have mostly been mitigated / fixed. So the pros outweigh the cons by far, which they all have concluded.

Like I said, there is a reason why everyone in the business is heading in this direction, and staying there. If it were bad, they would not keep using big.LITTLE style solutions.
 
  • Haha
Reactions: Thibsie

Fjodor2001

Diamond Member
Feb 6, 2010
4,505
711
126
AMD is doing something rather differently than that bunch. AMD doesn't put different core microarchitectures into a single chip. Again, that's old now and folks should have realized how that differs.
AMD is not the only one that uses the same ISA on both core types, if that’s what you mean by microarchitectures? If you’re instead talking about internal arch, that does not matter as long as the ISA is the same.
 
Last edited:

Thibsie

Golden Member
Apr 25, 2017
1,172
1,382
136
There are of course some drawbacks for certain cases. But the initial issues have mostly been mitigated / fixed. So the pros outweigh the cons by far, which they all have concluded.

Like I said, there is a reason why everyone in the business is heading in this direction, and staying there. If it were bad, they would not keep using big.LITTLE style solutions.
Mitigated ? Where, lol ?
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,505
711
126
Mitigated ? Where, lol ?
E.g. in the OS scheduling mechanisms. What remaining real-life problems do you see with current generation big.LITTLE style CPUs (regardless of whether they call it P+E, classic vs compact, or whatever)?
 
Last edited:

Thibsie

Golden Member
Apr 25, 2017
1,172
1,382
136
AMD is not the only one that uses the same ISA on both core types, if that’s what you mean by microarchitectures?
No, not the ISA, the architecture.
E-Cores have nothing to do with P-Cores and I’m not even talking about ISA. But you knew that didn’t you ?

Now, I’ll stop feeding you know what…
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,505
711
126
No, not the ISA, the architecture.
E-Cores have nothing to do with P-Cores and I’m not even talking about ISA. But you knew that didn’t you ?

Now, I’ll stop feeding you know what…
I already updated my post to clarify this before you responded. Like I mentioned there, if it is internal arch that is meant instead of ISA, that does not matter. The SW won’t see/notice the difference as long as the ISA is the same (except that one core type is faster than the other of course).
 

StefanR5R

Elite Member
Dec 10, 2016
6,827
10,924
136
Consider the example of a program's inner loop parallelized by e.g. OpenMP and executed on either a mix of dense and non-dense cores (hint: they all finish their parts of the loop body at about the same time) or on a mix of cores of different core microarchitectures (hint: they handle their parts of the loop differently from each other because they have different instruction latencies and whatnot).

Which approach works? Both of them. Which works well? One of them.
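A rough way to picture this is to model an evenly split loop, as OpenMP's `schedule(static)` would do it, where the wall time is set by the slowest core. The sketch below is a simplification under stated assumptions: each core is reduced to a single relative speed, and the speed values are hypothetical, not measurements of any real CPU.

```python
def static_schedule_time(speeds: list[float], iterations: int) -> float:
    """Wall time when iterations are split evenly across cores (static schedule).

    Every core gets the same share, so the loop ends when the slowest core does.
    """
    chunk = iterations / len(speeds)
    return max(chunk / s for s in speeds)


def ideal_time(speeds: list[float], iterations: int) -> float:
    """Lower bound: work perfectly balanced in proportion to each core's speed."""
    return iterations / sum(speeds)


# Assumption: under an all-core load, dense and non-dense cores of the same
# microarchitecture run the loop body at about the same rate.
same_uarch = [1.0] * 12

# Hypothetical mix of two different microarchitectures: 4 fast + 8 slower cores.
mixed_uarch = [1.0] * 4 + [0.6] * 8

for label, speeds in [("same uarch", same_uarch), ("mixed uarch", mixed_uarch)]:
    t = static_schedule_time(speeds, 1200)
    best = ideal_time(speeds, 1200)
    print(f"{label}: static {t:.1f}, ideal {best:.1f}, efficiency {best / t:.2f}")
```

In the mixed case the fast cores sit idle waiting for the slow ones unless the runtime or the programmer switches to dynamic scheduling or rebalances the chunks, which is extra work the uniform case never needs.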
 
  • Like
Reactions: booklib28