Discussion Zen 7 speculation thread

BorisTheBlade82 · Sep 30, 2025

Joe NYC said:
I would say NVidia too. They would like to put a big number on their dGPUs and later iGPU tiles.

Congratulations, hereby you've been awarded the title "Captain Obvious of the Day" 😉
And shame on @igor_kavinski for not using the mandatory <sarcasm> tags.

igor_kavinski · Sep 30, 2025

BorisTheBlade82 said:
And shame on @igor_kavinski for not using the mandatory <sarcasm> tags.

So little faith in Intel 😛

Joe NYC · Sep 30, 2025

igor_kavinski said:
So little faith in Intel 😛

Can Qualcomm get a participation trophy?

511 · Sep 30, 2025

Joe NYC said:
Can Qualcomm get a participation trophy?

Do their GPUs even have GEMM?

MrMPFR · Sep 30, 2025

Edit: Forgot this is the Zen 7 thread, so would this be feasible at all for a CPU architecture? Maybe AMD processing in-cache 3D V-Cache for zen7 xD
Edit 2: moved to appropriate thread, link: https://forums.anandtech.com/threads/rdna-5-udna-cdna-next-speculation.2624468/post-41515323

marees · Sep 30, 2025

MrMPFR said:
So it seems, but patent specifies it can be any non-local cache, so they could be coupled to Shader Engine private cache, L2, or MALL.
If the reduced L2 (AT2 = 24MB L2 vs Navi 48 = 64MB MALL) is accurate and CCU is leveraged for RDNA 5, then coupling those to L2 would shrink the L2 available to other processes, since they require sizeable dedicated capacity. Wonder how AMD engineers would tackle this. An SE cache implementation could happen as well; it would require a much bigger SE cache, but bring other benefits like superior cache latency, closer integration with CUs (routing and latency), etc. The latter seems more likely given the supposed (not confirmed IIRC) overhaul with autonomous SE scheduling and dispatch (WGS and ADC). In this case CCUs outside SEs would complicate things a lot.

Remember CCU offloads work from CU, so overhead might be lower than what it seems. No need to duplicate instructions, but yeah still some overhead, but how much?

Highly doubt that. But some BW heavy RT instructions could be offloaded to CCUs.

Again, lots of unanswered questions, but one of the most interesting AMD patents in a long time and a massive departure architecturally from anything previous.

Edit: Forgot this is Zen 7 thread, so would this be feasible at all for a CPU architecture? Maybe AMD processing in-cache 3D V-Cache for zen7 xD

We have an RDNA 5 thread

marees · Sep 30, 2025

MrMPFR said:
Forgot this is Zen 7 thread, so would this be feasible at all for a CPU architecture? Maybe AMD processing in-cache 3D V-Cache for zen7 xD

AVX 512 implementations maybe ??

adroc_thurston · Sep 30, 2025

MrMPFR said:
so they could be coupled to Shader Engine private cache

no such thing, they hacked out the L1 outta hierarchy with RDNA4.

MrMPFR said:
Wonder how AMD engineers would tackle this.

By making it a CPU.

dangerman1337 · Oct 28, 2025

Reviving this thread to avoid going off topic in the Zen 6 thread, since @Kepler_L2 said 6 months ago that A14 was way too far off, IIRC, for the Zen 7 timeframe (does that mean Zen 7 is an RZL competitor rather than, say, a Titan/Hammer Lake one, in early 2028 or even late 2027?). If it's on the A16 process, is it possible they can still do 16 cores per CCD (A16 and A14 aren't that far apart in density and performance) and a doubling of L2 cache per core? I mean, maybe the difference will be A16 giving a slightly bigger CCD and slightly lower clock speeds. I mean, if AMD really wants to, they could probably do DDR6/AM6 Zen 8 in late 2029 on A14 SPR.

If there is a Zen 7 product on A14, I think that'll be some 2H 2028 server product. If an A16 CCD Zen 7 is like 30% cheaper than one on A14 and only a few % slower (maybe still able to do a 16 core CCD Zen 7X3D at 7GHz 😛) and it's on AM5 anyways now... I'd just do that if I was AMD. Do Zen 8 on A14 SPR in late 2029 and just have that sold alongside Zen 7 (which will probably be done for a long time; Zen 7 is likely going to be "good enough" for a very long time).

marees · Oct 28, 2025

dangerman1337 said:
Reviving this thread to avoid going off topic in the Zen 6 thread, since @Kepler_L2 said 6 months ago that A14 was way too far off, IIRC, for the Zen 7 timeframe (does that mean Zen 7 is an RZL competitor rather than, say, a Titan/Hammer Lake one, in early 2028 or even late 2027?). If it's on the A16 process, is it possible they can still do 16 cores per CCD (A16 and A14 aren't that far apart in density and performance) and a doubling of L2 cache per core? I mean, maybe the difference will be A16 giving a slightly bigger CCD and slightly lower clock speeds. I mean, if AMD really wants to, they could probably do DDR6/AM6 Zen 8 in late 2029 on A14 SPR.

If there is a Zen 7 product on A14, I think that'll be some 2H 2028 server product. If an A16 CCD Zen 7 is like 30% cheaper than one on A14 and only a few % slower (maybe still able to do a 16 core CCD Zen 7X3D at 7GHz 😛) and it's on AM5 anyways now... I'd just do that if I was AMD. Do Zen 8 on A14 SPR in late 2029 and just have that sold alongside Zen 7 (which will probably be done for a long time; Zen 7 is likely going to be "good enough" for a very long time).

Nvidia will take up one node completely (to maintain their AI dominance, as they don't have a 2nm product as of now, I think)

Zen 7 will have to use what's left after that

StefanR5R · Oct 28, 2025

2028 wafer allocations might be up to Beijing to decide.

yuri69 · Oct 28, 2025

dangerman1337 said:
Reviving this thread to avoid going off topic in the Zen 6 thread, since @Kepler_L2 said 6 months ago that A14 was way too far off, IIRC, for the Zen 7 timeframe (does that mean Zen 7 is an RZL competitor rather than, say, a Titan/Hammer Lake one, in early 2028 or even late 2027?).

By late 2027 AMD would most probably still have a backlog of Zen 6 launches - think of Threadrippers, various APU SKUs, etc.

dangerman1337 · Oct 28, 2025

yuri69 said:
By late 2027 AMD would most probably still have a backlog of Zen 6 launches - think of Threadrippers, various APU SKUs, etc.

True, but with AMD still supporting Zen 2 and Zen 3 by rebranding them into the new numbering scheme, we're probably going to see SKUs of different architectures cross over each other, because Zen 7 will be a tad more costly than low end Zen 6 (let's say a Zen 6 CCD is like 65mm2 on N2P; an A16 16 core Zen 7 CCD could be like 80+ mm2). I mean, Zen 6 low end and Zen 7 X3D could very well co-exist as launches (IMV, if I was AMD I'd prioritize launching Zen 7 X3D before other lower end client Zen 7; hell, I won't be surprised if Zen 6 X3D launches on time vs Nova Lake bLLC, if that is Q4 2026 next year).

Like a 9/10 Core Zen 6 non-X3D SKU is going to be cheaper than a Zen 7 A16 X3D SKU that uses 4nm cache tiles.

marees · Oct 29, 2025

dangerman1337 said:
Reviving this thread to avoid going off topic in the Zen 6 thread, since @Kepler_L2 said 6 months ago that A14 was way too far off, IIRC, for the Zen 7 timeframe (does that mean Zen 7 is an RZL competitor rather than, say, a Titan/Hammer Lake one, in early 2028 or even late 2027?). If it's on the A16 process, is it possible they can still do 16 cores per CCD (A16 and A14 aren't that far apart in density and performance) and a doubling of L2 cache per core? I mean, maybe the difference will be A16 giving a slightly bigger CCD and slightly lower clock speeds. I mean, if AMD really wants to, they could probably do DDR6/AM6 Zen 8 in late 2029 on A14 SPR.

If there is a Zen 7 product on A14, I think that'll be some 2H 2028 server product. If an A16 CCD Zen 7 is like 30% cheaper than one on A14 and only a few % slower (maybe still able to do a 16 core CCD Zen 7X3D at 7GHz 😛) and it's on AM5 anyways now... I'd just do that if I was AMD. Do Zen 8 on A14 SPR in late 2029 and just have that sold alongside Zen 7 (which will probably be done for a long time; Zen 7 is likely going to be "good enough" for a very long time).

This is what I was referring to

marees said:
Nvidia will take up one node completely (to maintain their AI dominance, as they don't have a 2nm product as of now, I think)

Zen 7 will have to use what's left after that

marees said:
The A16 process, scheduled for mass production in 2027, is TSMC’s first 2-nanometer node to incorporate backside power delivery network (BSPDN) technology — one of the most advanced innovations in semiconductor manufacturing.

BSPDN is a groundbreaking process technology with no commercial precedent. Traditionally, both power and signal interconnects are placed on the front side of a chip. However, as circuit dimensions shrink, interference increases, complicating design and fabrication. BSPDN flips this structure by routing the power network on the backside and the signal network on the front, thereby alleviating interconnect bottlenecks and improving power efficiency.

Samsung Electronics and Intel are also preparing BSPDN adoption, and industry consensus expects both companies to implement it at the 2-nanometer node as well.

NVIDIA’s GPU roadmap follows the sequence Hopper → Blackwell → Rubin → Feynman. The Blackwell series is currently in shipment, with Rubin expected next year. The Feynman GPU, planned for release in 2028, is believed to be the first to use TSMC’s A16 process. Although the product launch is slated for 2028, production using A16 will likely begin in the second half of 2027, allowing about a year for ramp-up to improve yield and productivity.

https://twitter.com/x/status/1983038293739745550

Nvidia likely to be the first customer for TSMC's cutting-edge 'A16' process... a shift away from its 'mature process' strategy - EBN News Center

In 2027, Nvidia will become the first customer to use the latest process of TSMC, the world's No. 1 foundry (contract chip manufacturer). Until now Nvidia has used mature processes, but

www-ebn-co-kr.translate.goog

Joe NYC · Oct 29, 2025

marees said:
Nvidia will take up one node completely (to maintain their AI dominance, as they don't have a 2nm product as of now, I think)

Zen 7 will have to use what's left after that

I don't think that's how TSMC operates. There is no benefit to TSMC in letting one customer push out the rest of TSMC customers. More like the opposite of that.

Doug S · Oct 29, 2025

Joe NYC said:
I don't think that's how TSMC operates. There is no benefit to TSMC in letting one customer push out the rest of TSMC customers. More like the opposite of that.

There isn't any way for one customer to push out the rest. If TSMC was planning on 100K wafer starts for A14 when mass production began and Nvidia came along and wrote them a megacheck for 100K wafer starts TSMC would use that money to expand capacity and they'd have 150K or 200K wafer starts when mass production begins.

That's why the claims that Apple is "buying up so much N2 capacity to hurt the competition" are stupid. Apple is buying the capacity they NEED. They aren't going to pay for huge numbers of N2 wafers just to toss a portion of them in the trash to screw the competition. Throwing away billions of dollars is no way to beat the competition! The fact that it is ~50% of N2 capacity is because TSMC had enough other committed or anticipated N2 orders for the rest. If Nvidia had come along and bought the same amount as Apple at the same time they did, then it isn't like Apple would have 50% and Nvidia would have 50%. Apple would have 33%, Nvidia would have 33%, and there would be 33% for everyone else, because TSMC would have built more initial capacity.
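
To put rough numbers on that share argument, here is a minimal Python sketch. The 50K figures are purely illustrative assumptions (picked so Apple lands at about half of a 100K wpm node), and `capacity_shares` is a hypothetical helper. It assumes the foundry simply builds enough capacity to cover every committed order, which is the simplification being argued here, not a description of how TSMC actually plans.

```python
def capacity_shares(orders_wpm):
    """Return each customer's fraction of total capacity, assuming the
    foundry builds exactly enough capacity to cover all committed orders."""
    total = sum(orders_wpm.values())
    return {name: wpm / total for name, wpm in orders_wpm.items()}


# Illustrative numbers only (wafers per month), not real order data.
apple_only = capacity_shares({"Apple": 50_000, "everyone else": 50_000})
with_nvidia = capacity_shares(
    {"Apple": 50_000, "Nvidia": 50_000, "everyone else": 50_000}
)

print(apple_only)   # Apple holds half of a 100K wpm node
print(with_nvidia)  # each party holds about a third of a 150K wpm node
```

Apple's absolute wafer count is identical in both cases; only the percentage drops, because the total grew.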

There are some limits to this: TSMC couldn't jump from the 100K wpm they are producing for N2 next year to 500K if someone had come along a few years ago with a check for 400K wpm, because at some point they'll be resource limited in building fab shells, equipping them with everything from plumbing to EUV machines, etc. But such concerns are theoretical; TSMC would be able to handle any real world surge so long as orders are placed far enough in advance.

Joe NYC · Oct 29, 2025

Doug S said:
There isn't any way for one customer to push out the rest. If TSMC was planning on 100K wafer starts for A14 when mass production began and Nvidia came along and wrote them a megacheck for 100K wafer starts TSMC would use that money to expand capacity and they'd have 150K or 200K wafer starts when mass production begins.

That's why the claims that Apple is "buying up so much N2 capacity to hurt the competition" are stupid. Apple is buying the capacity they NEED. They aren't going to pay for huge numbers of N2 wafers just to toss a portion of them in the trash to screw the competition. Throwing away billions of dollars is no way to beat the competition! The fact that it is ~50% of N2 capacity is because TSMC had enough other committed or anticipated N2 orders for the rest. If Nvidia had come along and bought the same amount as Apple at the same time they did, then it isn't like Apple would have 50% and Nvidia would have 50%. Apple would have 33%, Nvidia would have 33%, and there would be 33% for everyone else, because TSMC would have built more initial capacity.

There are some limits to this: TSMC couldn't jump from the 100K wpm they are producing for N2 next year to 500K if someone had come along a few years ago with a check for 400K wpm, because at some point they'll be resource limited in building fab shells, equipping them with everything from plumbing to EUV machines, etc. But such concerns are theoretical; TSMC would be able to handle any real world surge so long as orders are placed far enough in advance.

100% to this. TSMC, when facing strong demand from one customer, always seeks to expand capacity rather than subtract it from other customers. And this practice has served TSMC very well. Not just TSMC, the entire fabless ecosystem as well.

Most of the "Magnificent 7" and a number of others right behind them are all beneficiaries of this.

adroc_thurston · Oct 29, 2025

marees said:
Nvidia will take up one node completely (to maintain their AI dominance, as they don't have a 2nm product as of now, I think)

Zen 7 will have to use what's left after that

That's not how any of that works.
A16 is just a bad node.

poke01 · Oct 29, 2025

adroc_thurston said:
A16 is just a bad node.

why is NV so eager to use it or is that just media spreading FUD?

adroc_thurston · Oct 29, 2025

poke01 said:
why is NV so eager to use it or is that just media spreading FUD?

Because anything goes in a GPGPU rat race.
If they don't, AMD will.

marees · Nov 7, 2025

AMD Zen 7, Grimlock

A14 node for CCD, N4P for V-Cache / CCD ~98mm2
Up to 16C per CCD
FP512, 2MB L2, 4MB L3 per core
EPYC is up to 264C, 2122MB L3 Cache (wo. 3D Cache)
overall ~20% perf increase in non-gaming
160MB X3D Cache
20~36% faster than Zen 6 at 3~12W

Tape out in 2 years
Launch in 4 years ?

https://twitter.com/x/status/1986996913644978207

marees · Nov 7, 2025

Mobile APUs

Grimlock halo
Grimlock point 1 & Grimlock point 1 hi
Grimlock point 2 (krackan replacement)
Grimlock point 3 (bumblebee replacement)
Grimlock point 4 (Mendocino replacement)

30% more performance per core at 3 watts

techjunkie123 · Nov 8, 2025

zen7 go zoom.

35% overall perf increase vs zen6 in server, 15% in DT. let's go.

DrMrLordX · Nov 8, 2025

2029? No way.

marees · Nov 8, 2025

DrMrLordX said:
2029? No way.

I doubt A14 will be ready for mass market before that
