On every modern CPU, instruction timing is wildly variable. Your 6502 example had memory running at zero latency, with an extra one-cycle penalty for more-than-8-bit addressing. CPUs now usually have three levels of cache, with instruction and data split at the first level; memory is divided into many differently timed pages; and code usually runs under translated virtual memory, so access latency at every cache level also varies with translation-cache hits and misses and the resulting page walks. The result is that a single instruction's execution can vary from one cycle to a thousand cycles. And CPUs can reorder instructions to hide that, across windows of several hundred instructions. So studying how code is actually executing needs special tools to diagnose it - Intel, for example, provides great tools for that.
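To make that variability concrete, here is a minimal sketch of the kind of measurement being described - not taken from any of the tools mentioned, just a pointer chase timed with the x86 TSC, comparing a buffer that fits in L1 with one that spills far past the last-level cache. The buffer sizes and the GCC/Clang x86-64 intrinsics are my own assumptions for illustration.

```c
/* latency_sketch.c - illustrative only: how much the cost of a single load varies.
 * Build: gcc -O2 latency_sketch.c -o latency_sketch  (x86-64, GCC/Clang assumed).
 * Note: __rdtsc() counts TSC ticks (a constant reference clock), not core cycles,
 * but it is enough to show the spread between cache-resident and DRAM-bound loads. */
#include <stdio.h>
#include <stdlib.h>
#include <x86intrin.h>          /* __rdtsc, _mm_lfence */

/* Build a random cyclic pointer chain over n slots so the hardware prefetcher
 * cannot hide the latency of each dependent load. */
static size_t *make_chain(size_t n)
{
    size_t *buf = malloc(n * sizeof *buf);
    size_t *idx = malloc(n * sizeof *idx);
    for (size_t i = 0; i < n; i++) idx[i] = i;
    for (size_t i = n - 1; i > 0; i--) {          /* Fisher-Yates shuffle */
        size_t j = (size_t)rand() % (i + 1);
        size_t t = idx[i]; idx[i] = idx[j]; idx[j] = t;
    }
    for (size_t i = 0; i < n; i++)
        buf[idx[i]] = idx[(i + 1) % n];
    free(idx);
    return buf;
}

/* Chase the chain for `iters` dependent loads; return average ticks per load. */
static double chase(size_t *buf, size_t iters)
{
    volatile size_t cur = 0;
    _mm_lfence();
    unsigned long long t0 = __rdtsc();
    for (size_t i = 0; i < iters; i++)
        cur = buf[cur];
    _mm_lfence();
    unsigned long long t1 = __rdtsc();
    return (double)(t1 - t0) / (double)iters;
}

int main(void)
{
    size_t small = 4096 / sizeof(size_t);           /* ~4 KiB: stays in L1D        */
    size_t big   = (256u << 20) / sizeof(size_t);   /* ~256 MiB: DRAM + TLB misses */
    size_t *a = make_chain(small);
    size_t *b = make_chain(big);
    printf("L1-resident chase : %6.1f ticks/load\n", chase(a, 10000000));
    printf("DRAM-sized chase  : %6.1f ticks/load\n", chase(b, 10000000));
    free(a);
    free(b);
    return 0;
}
```

On a typical desktop part the first chase costs a handful of ticks per load while the second can cost well over a hundred; tools like perf, VTune or uProf will attribute that variation to specific cache and TLB events far more precisely.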
Has anyone here postulated that Zen 5 being on N3 and N4 could mean that the single-CCD SKUs may use N4 and the dual-CCD ones may use N3? It's also possible that the E-core CCD may use N3 for minimal energy usage while the P-core CCD will benefit from the maturity of the N4 node family.
I think that for the Zen 5 generation, AMD will have native 16-core chiplets, not dual CCD. On N3.
That certainly would be exciting. So Zen 5 is expected to go up to 64 threads?
That depends. I'd expect that for the consumer market there will always be at least one big 8c CCD with Zen 5. The second CCD could be a 16c Zen 5c CCD or another 8c one - giving you up to 48 hybrid threads (8 + 16 = 24 cores with SMT).
Oh, it would. Even if it was just a single 16C chiplet for Zen 5 and we had to wait for a dual-chiplet version until Zen 6. I don't intend to upgrade to Zen 5 anyway.
Yeah, having to cut down a 16-core CCD to 4-, 6- and 8-core CPUs would be pretty wasteful.
Not only that: IMHO for the client market, competitive ST performance will remain a significant factor even in the long run. The server market OTOH is already on the verge.
I expect some interesting things from Zen 5, considering AMD has not been developing cores on a shoestring budget for a couple of years now.
Zen 3 was developed pretty much during the years of austerity at AMD. Zen 4 slightly less so, and Zen 5 should see the first fruits of R&D done under better days.
But more interesting for me is indeed packaging and SoC architecture. MI300 is almost here (next week?) to give us a glimpse of next-gen packaging.
Curious to see whether InFO-R will replace the substrate-based PHY for 2.5D packaging on the Zen 5 family. Bergamo seems to have demonstrated the limits of routing with the substrate-based interconnects, and a likely way forward is fanout-based RDLs at a minimum, if not active bridges.
Besides the issue of there being practically no more space for traces coming out from the IOD to the CCDs, there is also the problem of the next-gen IF, which as per employee LinkedIn profiles can hit up to 64 Gbps, compared to the current 36 Gbps.
I think InFO-3D could be a wildcard to enable lower-cost 3D packaging. InFO-3D fits nicely here, enabling a lower interconnect density than FE packaging like SoIC, but dense enough for SoC-level interconnects when stacking on top of the IOD. There is a big concern at the moment with F15 and F14 being underutilized, and TSMC is pushing customers from 16FF and older nodes to the N7 family while ramping down those fabs (commodity process nodes, you might say). Having any customer generously making use of N7/N6 besides the leading node would be a win-win.
Regarding the core perf gains, they have more transistors and a more efficient process to work with, so at the very least just throwing more transistors at the problem should bring decent gains, if their ~6 years (2018-2023) of 'ground-up design' of Zen 5 are to be worthwhile. Zen 4 is behind its key contemporaries in capacity in almost all the key resources of a typical OoO machine. Pretty good (though not surprising given other factors) that it even keeps up.
Nevertheless, a few AMD patents regarding core architecture that I have been reading strike me as intriguing, and I wonder if they will make it into Zen 5 in some form.
Not coincidentally, all these patents are about increasing resources without drastically increasing transistor usage.
- Dual fetch/decode and op-cache pipelines
  - This seems like something that would be very interesting for mobile: power-gate the second pipeline during less demanding loads
  - Remove the secondary decode pipeline for a Zen 5c variant? Let's say 2x 4-wide decode for Zen 5 and 4-wide for Zen 5c
- Retire queue compression
- Op-cache compression
- Cache compression (a toy sketch of the general idea follows after this list)
- Master-Shadow PRF
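Only the patent titles are public knowledge here, so purely to illustrate the general idea behind something like "cache compression" - squeezing more effective capacity out of the same SRAM/transistor budget - here is a toy base-plus-delta compressor for one 64-byte line in C. The one-base/1-byte-delta format is my own simplification for illustration, not anything from AMD's patents.

```c
/* toy_line_compression.c - a toy "base + delta" compressor for one 64-byte cache line.
 * Purely illustrative of the general idea (more effective capacity per bit of SRAM);
 * real proposals (BDI, FPC, etc.) and AMD's patented scheme differ in the details. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WORDS 8                  /* 8 x 8-byte words = one 64-byte line            */
#define PACKED_BYTES (8 + WORDS) /* one 8-byte base + eight 1-byte deltas = 16 B   */

/* Try to encode the line as base + small signed deltas.
 * Returns 1 on success (line fits in 16 bytes), 0 if it must stay uncompressed. */
static int compress_line(const uint64_t line[WORDS], uint8_t out[PACKED_BYTES])
{
    uint64_t base = line[0];
    int8_t deltas[WORDS];
    for (int i = 0; i < WORDS; i++) {
        int64_t d = (int64_t)(line[i] - base);   /* wrapping difference, range-checked */
        if (d < INT8_MIN || d > INT8_MAX)
            return 0;
        deltas[i] = (int8_t)d;
    }
    memcpy(out, &base, 8);
    memcpy(out + 8, deltas, WORDS);
    return 1;
}

static void decompress_line(const uint8_t in[PACKED_BYTES], uint64_t line[WORDS])
{
    uint64_t base;
    memcpy(&base, in, 8);
    for (int i = 0; i < WORDS; i++)
        line[i] = base + (uint64_t)(int64_t)(int8_t)in[8 + i];  /* sign-extend delta */
}

int main(void)
{
    /* Pointer-like values that differ only in their low bits compress well. */
    uint64_t line[WORDS] = {0x7fff1000, 0x7fff1008, 0x7fff1010, 0x7fff1018,
                            0x7fff1020, 0x7fff1028, 0x7fff1030, 0x7fff1038};
    uint8_t packed[PACKED_BYTES];
    if (compress_line(line, packed)) {
        uint64_t restored[WORDS];
        decompress_line(packed, restored);
        printf("64 bytes -> %d bytes, roundtrip %s\n", PACKED_BYTES,
               memcmp(line, restored, sizeof line) == 0 ? "ok" : "FAILED");
    } else {
        printf("line stored uncompressed\n");
    }
    return 0;
}
```

Lines full of pointers or small integers compress to a fraction of their size under schemes like this, while incompressible lines simply fall back to the uncompressed path; the hardware trade-off is the extra (de)compression latency and tag bookkeeping.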
But those 16 cores would only be the dense cores, as far as I understand. They would not be too exciting for desktop customers, since the ST performance might be weak.
We have seen some changes to the caches with different Zen generations, but I think Zen 5 is going to bring changes to the whole cache hierarchy. These may be significantly more radical than the Zen 2 to Zen 3 changes. I don't know whether that will result in big improvements though. Pushing single-thread performance is obviously getting harder and harder, so I am keeping my expectations low. They can much more easily push FP performance, so I am expecting a significant increase there.
It's not only about the improvements directly achieved but also the new technologies introduced (which can then be refined) and the future improvements enabled by the changes (the usual even Zen gen).
Also the excitement may be not only about the Zen cores but also the package layout with CCDs and one IOD, which with Zen 4 was still essentially unchanged since Zen 2.
I had initially thought that we would get almost everything stacked in the Zen 5 generation, but that just doesn't seem to be the case. The stacked silicon packages are more expensive and they add some limitations. They generally require that chips be directly adjacent. The Infinity Fabric fan-out used for the MCD/GCD connection seems to be an in-between tech (almost 900 GB/s), but it also has the adjacency limitation. You can't easily build something like Genoa with either tech, since Genoa does not place the chips adjacent to each other. I thought about daisy-chaining chiplets, but routing high-speed signals that distance is also a problem. They might be able to place 4 along each edge and 2 on the top and bottom, but that would require a lot of IO die design work and the chips may be too large.
Also, CPUs just do not really need that much bandwidth. Stacked devices would be lower power, but that seems to be one of the few advantages of stacked silicon for CPUs. I suspect that MCMs with Infinity Fabric connected chips (technically not chiplets) are going to stay with us for quite a while yet. They can continue to make them very cheaply, since the same chiplet is used for a huge number of products. Intel, with an expensive stacked silicon package, will likely have trouble competing with this. Intel has their own fabs, so I guess they don't take as big of a hit from having everything on the same tile/chiplet and made on the most advanced process. AMD has been splitting everything out to allow them to make IO, cache, and logic all on different processes, which should allow them to better compete on price. This is in addition to having a cheaper MCM package.
For GPUs, 2.5D or 3D stacked silicon or interconnect makes sense due to the bandwidth requirements, but AMD isn't even using stacking for consumer-level GPU devices. They are using Infinity Fabric fan-out to connect the MCDs to the GCD. The Infinity Cache also allows the use of cheaper memory, rather than HBM. Stacking seems to be reserved for the very high end like MI300. Since they probably have to use EFB to connect to the HBM, I suspect that the base dies are connected together with EFB, which is also a cost-saving packaging tech used in place of a full interposer. It would be great to be able to get HBM in consumer products though. An APU with a single HBM stack for mobile would be a powerful device. This also leads me to wonder what AMD could possibly still be making at GlobalFoundries. It looks like GF has made HBM in the past, so I was actually wondering if it is plausible that AMD would make a specialized version of HBM at GF using Infinity Fabric fan-out links rather than 2.5D connections. The PC market is going to need something to compete with Apple's M-series chips. This may require some add-on accelerator chips for video editing and such. Perhaps such things could be connected with Infinity Fabric fan-out.
I share your feelings. I was also more positive about the adoption of advanced packaging in the CPU space by AMD.
Yesn't.
AMD has been all about cheap-to-make products using a sensible tech.
The MI300 is a premium product with a price tag surely sitting well above the 128c server chips or the previous accelerators.
3D V-cache isn't exactly cheap either, considering the cache chiplets are only one node behind the main processor die.
They can get an extra $100 for it, so the cost makes sense there. The big question is how the incremental cost of more advanced packaging compares to its product benefits. And more likely than not, that tradeoff will change over time.
Yep and "isn't cheap" is not an issue isThey can get an extra $100 for it, so the cost makes sense there. The big question is how the incremental cost of more advanced packaging compares to its product benefits. And more likely than not, that tradeoff will change over time.
Is the 36 Gbps IFOP from the Zen 2 generation?
It depends on FCLK. Earlier Zen generations had slightly lower FCLK: Zen 2 was stable at 1600 MHz FCLK, while Zen 4 is around 2000 MHz.
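As a rough illustration of why FCLK is the knob here: assuming the commonly reported 32-byte-per-FCLK read and 16-byte-per-FCLK write width of an IFOP link (an assumption on my part, not a figure stated in this thread), the per-CCD link bandwidth works out as below. The Gbps figures quoted above describe the link's signalling rate rather than this aggregate, so the two are not directly comparable.

```c
/* ifop_bw.c - rough per-link IFOP bandwidth as a function of FCLK.
 * Assumes a 32 B/FCLK read and 16 B/FCLK write link width (illustrative assumption). */
#include <stdio.h>

int main(void)
{
    const double read_bytes  = 32.0;               /* bytes per FCLK, read path       */
    const double write_bytes = 16.0;               /* bytes per FCLK, write path      */
    const double fclk_mhz[]  = {1600.0, 2000.0};   /* roughly Zen 2 vs Zen 4 FCLK     */

    for (int i = 0; i < 2; i++) {
        double rd_gbs = read_bytes  * fclk_mhz[i] / 1000.0;  /* B/cycle * MHz / 1000 = GB/s */
        double wr_gbs = write_bytes * fclk_mhz[i] / 1000.0;
        printf("FCLK %4.0f MHz: ~%5.1f GB/s read, ~%4.1f GB/s write per CCD link\n",
               fclk_mhz[i], rd_gbs, wr_gbs);
    }
    return 0;
}
```

Under that assumption, the jump from ~1600 MHz to ~2000 MHz FCLK moves a link from roughly 51 GB/s to 64 GB/s of read bandwidth.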
Zen 5 won't change anything in this regard, but AFAIK Zen 6 and onwards will use MI300/MI400-style packaging.
When Zen 2 came out in 2019, it was a radical change when it comes to CPU packaging as well as core scaling with the SCF/SDF.
5 years later, in 2024, we would hope something new will come up to address the shortcomings of this technology for the next-gen CPUs:
- Latency is a known issue with AMD's IFOP, and for addressing that a few LinkedIn posts put the next-gen IF at up to 64 Gbps. This is a big jump and could have a major efficiency impact if the same IFOP is used going forward.
- Bandwidth was not an issue with earlier cores, but Zen 4 showed signs of bandwidth deficiency in many workloads with the 36 Gbps IFOP.
- Power: well, 0.4 pJ/bit for the MCD links vs ~2 pJ/bit speaks for itself. GLink is being advertised at 0.25 pJ/bit. (A quick back-of-envelope on these numbers is sketched below.)
- Die area consumed on Zen 4 for the IFOP is ~7 mm² of a 66 mm² chip (excluding the SDF/SCF that is part of L3) - that is ~10% of expensive N4P and soon N3E silicon. The GCD-MCD links have demonstrated a smaller beachfront for higher bandwidth density; GUC's GLink, for instance, needs 3 mm of beachfront to provide 7.5 Tbps of BW.
- Trace density from the IOD to the CCDs and IO: it seems a limit has already been reached on how much space is available to route the signals from the IOD to the CCDs, considering space is also needed for IO/memory/etc. traces.
AMD will have to address the above problems with a new interconnect; even their competitor is using much more exotic BE packaging in current-gen products. But I wouldn't hold my breath if next-gen CPUs are stuck on the same tech.
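To put the pJ/bit and beachfront numbers from the list above into perspective, here is a quick back-of-envelope; the 100 GB/s operating point is just an example I picked, not a figure from the thread.

```c
/* link_energy.c - what the pJ/bit and Tbps/mm figures above mean at one operating point.
 * The 100 GB/s traffic level is an arbitrary example chosen for illustration. */
#include <stdio.h>

int main(void)
{
    const double gbytes_per_s = 100.0;                  /* example per-link traffic      */
    const double bits_per_s   = gbytes_per_s * 8e9;     /* 100 GB/s = 8e11 bit/s         */
    const double pj_per_bit[] = {2.0, 0.4, 0.25};       /* IFOP-class, MCD fanout, GLink */
    const char  *label[]      = {"~2 pJ/bit  ", "0.4 pJ/bit ", "0.25 pJ/bit"};

    for (int i = 0; i < 3; i++)
        printf("%s at 100 GB/s -> %.2f W of link power\n",
               label[i], pj_per_bit[i] * 1e-12 * bits_per_s);

    /* Beachfront density: the quoted GLink figure of 7.5 Tbps over 3 mm of die edge. */
    printf("7.5 Tbps over 3 mm = %.1f Tbps per mm of die edge\n", 7.5 / 3.0);
    return 0;
}
```

So at the same traffic level, a ~2 pJ/bit substrate link burns roughly five times the power of a 0.4 pJ/bit fanout link (about 1.6 W vs 0.32 W at 100 GB/s), and the quoted GLink figure works out to about 2.5 Tbps per millimetre of die edge.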
As for the costs, AMD is doing fanout links on a 750 USD GPU product whose actual chip should be sold at half of that, if not less - and with 6 of these links. And SoIC parts like the 5800X3D sell for less than 300 USD.
I noticed one patent which attempted to address this issue by using a normal fanout when the CCDs are a single column on each side of the IOD, and adding bridges in the fanout when there are multiple columns of CCDs. So basically the same CCD/IOD can be used but packaged differently for different configs.
https://www.freepatentsonline.com/11469183.html
Rumors of an MI300C have been floating around; let's see if this is real in a couple of days. It could be a precursor.
That seems like the most logical progression.
It seems that, with the MI300 approach about to be released this year across GPU, APU and CPU, I don't think AMD is going to expend any money or effort on any half measure between the current Zen 4 (Genoa) and Zen 5 (Turin) on the SP5 socket and the "nirvana" of MI300/MI400.