But, what does AMD do on a GF process that makes any financial sense?
Related old GloFo roadmap:
12nm FinFET - 2023-2025 => ~3500-3700 USD at GloFo.
7nm FinFET - 2023-2025 => ~9000+ USD at TSMC
5nm FinFET - 2023-2025 => ~13000+ USD at TSMC
3nm FinFET - 2023-2025 => ~18000+ USD at TSMC
R&D of NTO on 12nm => ~$100M
R&D of NTO on 7nm => ~$300M
R&D of NTO on 5nm => ~$550M
Addition of 3D-NTO for 12nm => ~$130M (the R&D cost for AMD is fixed regardless of how many layers are added)
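To put the NTO numbers side by side, here is a minimal sketch that just adds up the R&D figures quoted above. The helper name `nto_cost_12nm_3d` is made up for illustration, and it assumes the ~$130M 3D add-on is paid once on top of the 12nm NTO, as the post states.

```python
# NTO (new tape-out) R&D figures quoted above, in millions of USD.
NTO_COST_MUSD = {"12nm": 100, "7nm": 300, "5nm": 550}
# One-time 3D add-on for 12nm; per the post it does not grow with layer count.
NTO_3D_ADDON_MUSD = 130


def nto_cost_12nm_3d(layers: int) -> int:
    """R&D cost (in $M) of a 12nm design with `layers` layers (hypothetical helper)."""
    if layers < 1:
        raise ValueError("layers must be >= 1")
    # Assumption: a single-layer design skips the 3D add-on entirely.
    addon = NTO_3D_ADDON_MUSD if layers > 1 else 0
    return NTO_COST_MUSD["12nm"] + addon


if __name__ == "__main__":
    for layers in range(1, 6):
        print(f"12nm, {layers} layer(s): ~${nto_cost_12nm_3d(layers)}M R&D")
    for node in ("7nm", "5nm"):
        print(f"{node}, 2D: ~${NTO_COST_MUSD[node]}M R&D")
```

Under these figures any stacked 12nm design lands at ~$230M, which is still below the ~$300M quoted for a 7nm NTO.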
Single-layer 12nm ('23-'25) => 3500-3700 USD
Two-layer 12nm => 4500-4700 USD
Three-layer 12nm => 5500-5700 USD
Four-layer 12nm => 6500-6700 USD
Five-layer 12nm => 7500-7700 USD
2016-2017 14nm => ~8000 USD
AMD's target with GlobalFoundries is to reduce cost.
12nm - 1 layer => Behind 7nm
12nm - 2 layer => Equal to 7nm node || 0.5x cost of 2D-7nm
12nm - 3 layer => Equal to 5nm node || 0.4x cost of 2D-5nm
12nm - 4 layer => Equal to 3nm node || 0.35x cost of 2D-3nm
12nm - 5 layer => Equal to 2nm node
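The cost ratios above can be cross-checked against the quoted wafer prices. This is a rough sketch only: it assumes each extra layer adds ~1000 USD (the step implied by the two- to five-layer figures) and it uses the floor of the "+" prices for TSMC, so the real ratios could be lower. The function names are made up for illustration.

```python
# Wafer prices quoted above for 2023-2025, in USD.
BASE_12NM = (3500, 3700)       # single-layer 12nm at GloFo (low, high)
PER_EXTRA_LAYER = 1000         # assumed increment per added layer
TSMC_2D = {"7nm": 9000, "5nm": 13000, "3nm": 18000}  # floors of the "+" figures
EQUIVALENT_NODE = {2: "7nm", 3: "5nm", 4: "3nm"}     # layer count -> claimed equal node


def stacked_wafer_cost(layers):
    """Return the (low, high) wafer cost in USD for an N-layer 12nm stack."""
    if layers < 1:
        raise ValueError("layers must be >= 1")
    extra = (layers - 1) * PER_EXTRA_LAYER
    return (BASE_12NM[0] + extra, BASE_12NM[1] + extra)


def cost_ratio(layers):
    """Return the (low, high) cost ratio against the claimed equivalent 2D TSMC node."""
    low, high = stacked_wafer_cost(layers)
    reference = TSMC_2D[EQUIVALENT_NODE[layers]]
    return (low / reference, high / reference)


if __name__ == "__main__":
    for layers in range(1, 6):
        low, high = stacked_wafer_cost(layers)
        line = f"{layers}-layer 12nm: {low}-{high} USD"
        if layers in EQUIVALENT_NODE:
            r_low, r_high = cost_ratio(layers)
            line += f" | {r_low:.2f}x-{r_high:.2f}x of 2D-{EQUIVALENT_NODE[layers]}"
        print(line)
```

With these inputs the ratios come out slightly above the rounded 0.5x/0.4x/0.35x figures, which fits the TSMC prices being lower bounds.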
It is meant very specifically for AMD's HPC microarchitecture, which is likely macro-sliced or whatever Zen(x) on 12nm-3D, with the performance of Zen2/Zen3 but at reduced cost. That allows for 5x00X/3x00X/2x00X-like performance at a significantly lower price point. AM4 will continue to get products, but they will return the entire stack back to GlobalFoundries.
2015 plan for 2017 AM4:
Summit => GlobalFoundries
Raven => GlobalFoundries
Bristol => GlobalFoundries
Stoney => GlobalFoundries
2023-2025 transitions from TSMC products back to GlobalFoundries products.
AM5 => TSMC-only, Higher ASP with higher than expected performance jumps.
AM4 => Transitions back to GloFo-only, Lower ASP with expected performance jumps.
AM4 GloFo Strikes Back edition is split in two architectures:
HPC at Malta-3D; Increased area density/perf at lower cost relative to TSMC-variants => High price point (~$200) ~ Budget-market
ULP at Dresden-2D; Significantly decreased costs for low-cost markets with AM1(Bhavani) & FM2+(Carrizo[A8-7680/A6-7480]) target. => Low price point (~$50) ~ Ultra-budget-market
Specifically 3D-HPC (3D-Zen);

12nm 7.5T/84CPP/64Mx
7nm 6T/57CPP/40Mx+57Mx
For two-layer(same lib as prior 12nm; no idea if new 12nm has new denser libs):
Most optimistic =>
12nm-3D virtually is 3.75T/42CPP/32Mx (There are plenty of strategies for getting a 0.5X*0.5Y shrink w/ 2Z expanded)
Least optimistic =>
12nm-3D virtually is 5.6T/63CPP/48Mx + 3rd layer => 4.2T/47CPP/36Mx
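The "virtual" figures above work out as a simple per-layer scale factor applied to the 12nm baseline. A minimal sketch, assuming a 0.5x factor for the optimistic case and 0.75x per added layer for the pessimistic case; the 0.75 is inferred from the quoted numbers, not stated anywhere, and `virtual_pitch` is a made-up helper.

```python
# 12nm baseline quoted above: track height (T), contacted poly pitch (CPP), metal pitch (Mx).
BASE_12NM = {"track": 7.5, "cpp": 84.0, "mx": 64.0}


def virtual_pitch(scale_per_layer, extra_layers=1):
    """Scale the 12nm baseline by `scale_per_layer` once per added layer."""
    if extra_layers < 0:
        raise ValueError("extra_layers must be >= 0")
    factor = scale_per_layer ** extra_layers
    return {name: value * factor for name, value in BASE_12NM.items()}


if __name__ == "__main__":
    cases = [
        ("most optimistic, 2-layer", 0.5, 1),
        ("least optimistic, 2-layer", 0.75, 1),
        ("least optimistic, 3-layer", 0.75, 2),
    ]
    for label, scale, extra in cases:
        p = virtual_pitch(scale, extra)
        print(f"{label}: {p['track']:.2f}T/{p['cpp']:.0f}CPP/{p['mx']:.0f}Mx")
```

The unrounded pessimistic values are 5.625T for two layers and 4.21875T/47.25CPP for three, which the post rounds to 5.6T and 4.2T/47CPP.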
GloFo-3D(TSV)&M3D(MIV) production-ready showcase: January 2023 to June 2023, customers receive their orders in "Early 2023"
They could possibly update Dali (Zen1+ 2/4, 3CU Vega 1st gen) to 12LP+ for better power savings and higher transistor density for better wafer yields for the very bottom of the market, but, that product is just way WAY behind at this point.
If they did that, it would NOT be 12LP+, but this node:
https://ieeexplore.ieee.org/document/9771014
14LPP/12LP/12LP+ => Doped Fins for sLVT/LVT/HVT.
The above node follows 14HP and 7LP => Undoped Fins for sLVT/LVT/HVT.
There are, however, no planned RTOs; the only plan is NTOs.
A lib-shrink/optimization of Raven2/Dali is unlikely; a project with the scope of Monet, however, is likely to pop up.
With 3D-TSV/3D-MIV being used together, as that is the most advanced option succeeding standard 12nm;
2-layer CPU (MIV-interconnect + TSV) <-- CPU Hottest so closest to heatsink:: Zen3-esque target
2-layer GPU (MIV-interconnect + TSV) <-- GPU Middle due to low-leak:: RDNA2-esque target ==> Re-used as RX6100/RX7100
IOD (TSV-interconnect) <-- IOD is bottom
~~~~~~~~
Prior WSA => "Revenue-centric" - pay more for shrinks
Current WSA => "Volume-centric" - pay less for more volume, the current metric hitting GlobalFoundries is low-utilization rate. Low-util rate means higher fab upkeep costs.
AMD's 2011-2015 3D roadmap, likely revived since node shrinks at GloFo beyond 12nm/11nm are far away till something replaces 193i:
CPU-first:
3D-CPU for Malta (cheaper than 7nm, standard perf up; Zen2/Zen3 => 5-7% increase in perf for 12nm 3D)
2D-CPU for Dresden (heavy cost cutting: 12nm Quad+ RISC-V P710/ARM A710+Custom clones)
GPU-second:
New IOD following AM5's strategy, but rather than having an in-tile GFX, it will be on-tile GFX. Potentially, IO+GFX would be re-used for low-end/budget GPU. || Malta
2D-CPU will have a 2D-GPU w/ 3D-DRAM in 2.5D(APU-function), but will be even lower-end/ultra-budget for AIB(GPU-function). || Dresden
APU-last:
3D-CPU stacked over IO+GFX tile || Malta
2D-CPU+2D-GPU includes CPU or GPU IO in Version2 bottom die(true-APU function). || Dresden
Mature node (GF-only) => Mature platform
Bleeding node (TSMC-only) => New platform
New designs+new innovations are the expected iterative design focus.
TSMC|AMD|AM5 = Maximum budget($$$), maximum (above industry) performance increase
GF|AMD|AM4 (HPC) = Reduced budget($$), expected (at industry) performance increase
GF|AMD|AM4 (ULP) = Lowest budget($), good-enough (below industry) performance increase for below expected TDP/PPT(above industry).
New products, as stated before, will be there for AM4. Of these, the Zen-category will be cheaper than previous iterations (compared to: Ryzen-7nm), and the below-Zen-category will be even cheaper (compared to: Ryzen-14nm/Ryzen-6nm).
Also, as we approach the launch of ULP:
ULP1 Micro-architecture(2023-12nm) => Cluster-based :: iCore - ILP/TLP & fCore - ILP/TLP/DLP(ST gets faster big SIMD)
ULP2 Micro-architecture(2025-11nm shrink) => Grid-based :: mCore - ILP/TLP/DLP
Example of Grid;
Standard Arch Integer: 2x 64-bit sALU + 1x64-bit IMUL + {2x 128-bit VALU + 1x 128-bit VIMUL} cannot be used for ILP.
Grid Arch Integer: 2x 64-bit Control ALUs(ILP or TLP) + 2x 64-bit ALUs(ILP/TLP/DLP) + 2x 64-bit ALUs(ILP/TLP/DLP) + 2x 64-bit IMUL(ILP/TLP/DLP)
Left-half gets PRF0, Right-half gets PRF1:: the same or fewer functional units than the standard arch, but better utilization of ILP Superscalar or DLP SIMD if the program swings either way. TLP covers the case where ILP/DLP isn't exploited, as well. ~90+% work efficiency (unit efficiency) on normal real-world programs.
Compute programs = ~42% Integer Scalar + Packed, with more focus towards Scalar(~70%) over Packed(~30%). Scalar-compute is currently executed on the fCore, not the iCore, which locks out the big perf increase of OoO-ILP optimization, as the iCore is ILP-focused while the fCore is DLP-focused. Hence the focus on a mixed-point grid-architecture core, which can handle both.
64-bit GPR+128-bit Packed (64-bit PRF0+64-bit PRF1) => much more efficient than 64-bit GPR PRF0 + 128-bit Packed PRF1. It also can scale well with 64-bit FP+128-bit VFP(64-bit PRF2+64-bit PRF3).
With HPC being the key high-ASP part on AM4, HPC1/HPC2/etc. would be 2-layer/3-layer/etc. monolithic stacks.
AM5 {
Path1. (Performance-orientated) TSMC shrink-scaling => 2D Standard(Zen) Architecture (Increased cost input, requires higher ASPs)}
AM4 NPI-post TSMC to AM5 {
Path2. (Power-orientated) GlobalFoundries stack-scaling => 3D Standard(Zen) Architecture (Reduced cost input, reduce need of higher ASP)
Path3. (Pervasive-orientated) GlobalFoundries cost-scaling => 2D Non-standard(a.k.a. Not Zen) Architecture (Maximize energy/area/cost, lowest cost input, ultra-low ASP but better profit-margin)}
CPU-side, majority of configurations + price:
AM5 New TSMC-HPC(Zen) = 105W-170W TDP (~142W to ~230W PPT) == $$$ -- ~$650 (CPU+Mobo) <== Highest revenue for single unit
AM4 New GF-HPC(Zen) = 35W-65W TDP (~47W to ~88W PPT) == $$ -- ~$250 (CPU+Mobo)
AM4 New GF-ULP(Not) = 5W-25W TDP (~7W to ~34W PPT) = $ = ~$75 (CPU+Mobo) <== Highest revenue for total market
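The PPT figures in the list above are consistent with the usual 1.35x TDP-to-PPT relationship on AM4/AM5. A minimal sketch that reproduces them; the tier labels are shorthand for the three lines above, and applying the same factor to the hypothetical GF parts is an assumption carried over from the post.

```python
# Assumed multiplier: PPT = 1.35 x TDP, matching the pairs quoted above.
PPT_FACTOR = 1.35

# TDP ranges (watts) from the three tiers listed above.
TDP_RANGES_W = {
    "AM5 TSMC-HPC (Zen)": (105, 170),
    "AM4 GF-HPC (Zen)": (35, 65),
    "AM4 GF-ULP (Not Zen)": (5, 25),
}


def ppt_from_tdp(tdp_watts):
    """Return the package power tracking limit (watts) for a given TDP."""
    if tdp_watts <= 0:
        raise ValueError("tdp_watts must be positive")
    return tdp_watts * PPT_FACTOR


if __name__ == "__main__":
    for tier, (tdp_low, tdp_high) in TDP_RANGES_W.items():
        ppt_low = round(ppt_from_tdp(tdp_low))
        ppt_high = round(ppt_from_tdp(tdp_high))
        print(f"{tier}: {tdp_low}W-{tdp_high}W TDP => ~{ppt_low}W to ~{ppt_high}W PPT")
```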
AM4 Ryzen Server (upper-band ASP) -> AM5 Ryzen Server (Top-tier HPC/Zen @ TSMC)
AM4 Ryzen Server (middle-band ASP) -> AM4 Athlon Server (Budget-tier HPC/Zen @ GF)
Prior Opteron (lowest-band ASP, pre-AM4) -> AM4 Opteron Server (Ultra-budget-tier ULP/Not Zen @ GF)
As a budget/ultra-budget platform, AM4 has an (unrelated) side benefit of going up against the international desktop sockets of:
Huawei/Hisilicon Kunpeng Desktop (Most recent re-org is to focus on Desktop and Laptop, reducing funding for Server and Mobile)
Zhaoxin LGA-socket Desktop
Starfive JH-Supreme-CPU Desktop
Lower cost + Lower Power => Lower barrier of more than one PC per household + person.
GlobalFoundries not-capable of funding tailored bleeding edge nodes => GlobalFoundries capable of funding tailored nodes

This uses ARM's Cortex A-M3D two-layer architecture as a guide, folding the core based on the L2 cache.
~44 mm2 with 4-cores => 8-cores
Assumption is that they keep 10.5T/9T and not use a more dense tailored library.
The L3 can be folded in a way to have 8 MB in 4 MB of L2. So, the output is capable of 8-cores within ~33 mm2. ~50mm2 to ~68 mm2 is a safe bet.
A smaller die can help punch down on the costs. Example:
16-core Athlon Platinum = ~3xx USD
12-core Athlon Platinum = ~2xx USD
8-core Athlon Gold = ~1xx USD
6-core Athlon Gold = ~1xx USD
To reiterate once again, there are newer 12nm FinFETs:
- Higher Yield
- Lower Variation
- Higher Perf
- Lower Power
~
https://ieeexplore.ieee.org/document/9771014
GlobalFoundries pretty much stated that if they aren't shrinking, they would move to stacking.
2012 - 3D Stacking planning included into Malta
2013 - EUV shrinking planning included into Malta
2015-2016 - Mono3D/M3D/Sequential Stacking is adopted for aggressive Logic-on-Logic.
2018 - EUV is dropped
2019 - 3D Stacking becomes major focus.
Without something new, AM4's demand gets cut down. With AM5 targeting higher power targets, AM4 would have to target lower power to not compete.
Prior WSA:
Phase one = First half-2023 existing products
Phase two = Second half-2023 existing and new products (shared minimum annual floor)
Latest WSA:
Phase one = First half-2023 existing products
Phase two = Second half-2023 existing products
Phase three = Second half-2023 new products (independent annual floor)
The WSA also indicates the removal of movement across standard nodes, instead prioritizing tailored nodes.
Prior-roadmap: Standard node -> Tailored node -> Standard node -> Tailored node
Latest-roadmap: Tailored node -> Tailored node -> Tailored node -> Tailored node
That is because the majority of AMD's revenue isn't being attributed to even a fraction of the CapEx required for 7LP/5LP/3LP.