Discussion Beyond zen 6

Core logic density ain't nowhere near process limits with high clock targets. With different kind of design philosophy there might be density gains to be found in same process that just waits to be found.
You're typing a lot of words but none of them mean anything.
Logic density is very much within the process limits for all designs.
You know, they're (mostly) dense 2-2 and 2-1 and whatever cells.
 
They won’t. You will just prolly pay them more money.
Also as others have said, AMD isn’t targeting anything Intel, their competitors are ARM on server and Apple on client. Intel is irrelevant for them right now.
Yea, well I think AMD CAN'T charge more money except at the very top of the food chain. This is how markets work.

As for this talk of competitors not having an impact on AMD's pricing, I suggest you do some more research into the subject of product management and, specifically, product pricing practices.
Moving SRAM off expensive new processes will be the only way to build cost-effective CPUs going forward, and the costs look minimal anyway. On top of that, they'll charge a premium.
For L3 perhaps, but still .....

Adding a complete new chiplet then bonding it ....either 3D or chiplet isn't free either. I am guessing it is LESS expensive than using cutting edge lithography though.

Can still count on L1 and L2 staying on the compute die.

As for charging a premium, I think that in the high volume markets, that isn't possible. Gaming, HEDT, and DC for sure, but not the higher volume markets IMO.
A16 is N2 with BSPDN, Zen7 will be on next actual node - A14.
Yes, but AMD is not using the version of A14 that has BSPDN. That process will come later than 2028.

My current guess is that yields aren't good enough for AMD and that the added
 
View attachment 142011
I made a chart to better illustrate the difference IPC has when using the median of the MLID perf/power chart.
As you can see Z7 has a sizeable frequency advantage at low power that diminishes to parity or a deficit at peak power.
For reference, a standard/good tock core will have the frequency dot at 1. Zen 5 actually takes less power to hit its Fmax than Zen 4. Zen 3 has similar power draw to Zen 2 CPUs in single-core workloads too.

But maybe Zen 7 is such a large IPC uplift that the core requires that much more power. Fine.

But then you also have to remember that Zen 7 is also a tick core. You see that in the insane 35% faster perf at 3 watts. That's not due to arch. Or at least the vast majority of it is not due to arch.

Honestly I think you should add a separate line for an IPC uplift of ~10%. If the perf numbers are true, I'm expecting between 10-15% IPC.

I don't think a node shrink won't result in lower power per GHz, even at say high 6GHz.
I also don't think that AMD will use much more than ~22 watts per core for Fmax. Maybe marginally higher, say ~25. But the lower on the perf curve you are, usually the better the perf/watt uplift from the node too, so it should all balance out there.
 
Core logic density ain't nowhere near process limits with high clock targets. With different kind of design philosophy there might be density gains to be found in same process that just waits to be found.
The days of big core logic density increases AND clock speed increases have long been over. Just look at the last 10 years. It's easy to see.

Worse than that, each new process not only yields smaller improvements over the previous generation, it also costs increasingly (super-linearly) more to make the tools, and it needs more process steps (so it is more expensive) than the previous node.

Sure, there are occasional jumps. Intel's new High-NA machines, which they bought for an arm and a leg (and maybe a bit more), will provide a jump, but these are once-in-a-decade (or rarer) jumps. You aren't going to get them every 18 months, nor will you even get a true die shrink every 18 months. It's more like 3 years right now, and I am guessing this will continue to lengthen.

AMD ZEN has looked like this:

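Year | Zen Gen | Process Node (CCD) | CCD Size (mm²) | Max Clock (GHz) | Desktop Gain | Density ↑ vs Prior
-----|---------|--------------------|----------------|-----------------|--------------|-------------------
2017 | Zen | 14nm (GF) | ~213 (mono) | ~4.0 | |
2018 | Zen+ | 12nm (GF) | ~213 (mono) | ~4.3 | ~+5–8% | ~+10–15%
2019 | Zen 2 | 7nm (TSMC N7) | ~74 | ~4.7 | ~+12–15% | ~2.0× (~100%)
2020 | Zen 3 | 7nm (TSMC N7) | ~80 | ~4.9 | ~+18–22% | ~0% (same node)
2022 | Zen 4 | 5nm (TSMC N5) | ~72 | ~5.7 | ~+15–20% | ~1.8× (~80%)
2024 | Zen 5 | 4nm (TSMC N4/N4P) | ~70–75 | ~5.7–5.8 | ~+5–12% | ~+6–10%
2026–27 (est.) | Zen 6 | ~2nm (TSMC N2/N2P) | ~90–120 (est.) | ~6.2–6.5 (est.) | ~+15–25% (est.) | ~1.15–1.3× (~15–30%)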

Zen 3 was a fixed Zen 2 (Zen 2 was the move to the chiplet design), so it got a decent lift without a process change.

Look at Zen 5 though. With a 6-10% better transistor budget, it only got a 5-12% IPC increase (not including server)... thus the "Zen 5%" criticism.

Zen 7 is essentially another Zen 5. It's a cleanup node for N2, not a big shrink. I wouldn't expect big IPC increases and only modest clock speed increases that are going to be mostly due to TSMC getting the process smooth on N2 class GAA.

It's just my opinion, but those thinking Zen 7 is going to be a big leap from Zen 6, I think you are setting yourselves up for disappointment.

The BIG change I see with Zen 7 is the move to higher-core-count CCDs. I do kinda wonder how this is going to work out on desktop vs Intel's 52c NVL (Zen 7 with 32c/64t). I'm guessing that Intel still wins in Cinebench 😉.
 
It's just my opinion, but those thinking Zen 7 is going to be a big leap from Zen 6, I think you are setting yourselves up for disappointment.
Ppl should prepare for a Zen 6% style of disappointment first. I mean, it's a mere refinement core. Mobile/servers will surely benefit from the node jump, although desktop simply hopes that those 7 GHz rumors materialize.
 
Yeah, but it's indicative enough for FP bloat in da core.

No real argument. But even Blender (real Blender, not "old Blender with all of its platform-specific paths removed running as n separate identical processes" like 526.blender_r) or something would show that without relying on the weird vestigial organ that is specfp. I would be legitimately in favor of dropping it from the suite entirely.

Maybe my distaste for it is not entirely rational, but specfp deserves to be taken behind the barn.
 
You know I still 'member those AMD CAGR slides.
Yeah, perf is what matters.
Teens IPC in server and your-standard-AVX512-max-pro workloads is not really interesting. Teens in standard games, Adobe slop, web workloads, and regular video editing is... fine.
It's both (but that's true for every AMD tock ever).
Try not to overdose on copium.
Maybe my distaste for it is not entirely rational, but specfp deserves to be taken behind the barn.
Oh it's horribly old and gunky, but still.
SIR's also old as balls tbf.

In any case, SPEC2026 is this year and it gotta have something innit.
 
Specint subtests are mostly reasonable, though (though I personally loathe 657.xz - it takes forever, inexplicably does multithreading with OpenMP in the specspeed profile, and still produces bizarre results that don't correlate well with anything else.) Like, perlbench, gcc, omnet, etc are actually useful proxy tests for a wide range of real code that exists. I have doubts about the usefulness of 625.x264_s, along with hating 657.xz, but most of the int suite holds up well.

Specfp is just "why are you running this stuff that normally runs as MPI with platform paths in this bizarre, unnatural way that mostly gives you information you could have gotten by looking at the L3 capacity and memory interface?" specfp rate, especially, feels like STREAM with extra steps.
 
It's teens IPC with moar fmax.
Wonder how much the improved Infinity Fabric + new IOD will help in that regard. IPC seems tricky to estimate because Zen 5 is bandwidth bottlenecked; if something loves clocks, cores and memory bandwidth at the same time, then Zen 6 could be a really huge improvement. And then Zen 7 on top will go beyond.

Hopefully both Zen 6 and 7 scale well beyond DDR5 8000 Vs Zen 5 right now which gets AFAIK to 6400 MT/s and then struggles above.
 
Wonder how much the improved Infinity Fabric + new IOD will help in that regard.
It will help with nT (hello uncucked IMC on cIOD).
1t, well, your caches cover it well enough.
Memlat improvements are "we'll see" territory since the current validation samples are cucked and don't run prod fclk.
Hopefully both Zen 6 and 7 scale well beyond DDR5 8000 Vs Zen 5 right now which gets AFAIK to 6400 MT/s and then struggles above.
8k JEDEC is the platform strap for OMR yes.
 
I find it interesting how people still have such high expectations for Intel to make better cores and always downplay AMDs.
cough Zen 40% cough
Yes it would, process shrinks provide Cdyn reduction.
it would result in lower power per GHz? Cuz that's what I said, I just used a double negative lol
Because it's a different core with a different physical implementation and lower Vmax.
I don't think there was a single Zen core so far that has resulted in higher power to hit the same frequency at the previous gen's Fmax yet.
specfp is also a pretty poor benchmark as a proxy for... well, anything, tbh.
Code's pretty old there.
But, again, SPEC2026 is soon(tm).
Specint has held up much better than specfp from what numerous companies use in their benchmarks publicly.
IPC seems tricky to estimate because Zen 5 is bandwidth bottlenecked,
Is it? Being backend-bound has always been a large bottleneck in many workloads according to CPU profiling, yet I doubt it's bandwidth; more likely latency is the bottleneck. I haven't seen anyone profile that specifically, other than in gaming, where C&C found it to be latency-bound, not bandwidth-bound.

I guess also you could tell if Zen 5 has a large bandwidth bottleneck if it performs much better relative to Zen 4 at low frequencies since there's less pressure on memory bandwidth when performance is also lower? But I don't think that's the case either.
 
it would result in lower power per GHz?
Yes.
That's what Cdyn reduction does.
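For reference, here is the usual first-order CMOS relation behind this point. It is only a sketch: the symbols are illustrative labels rather than anything taken from the posts above, and leakage is ignored.

```latex
P_{\text{dyn}} \approx C_{\text{dyn}} \, V^{2} \, f
\quad\Longrightarrow\quad
\frac{P_{\text{dyn}}}{f} \approx C_{\text{dyn}} \, V^{2}
```

At a fixed voltage, a lower Cdyn therefore means lower power per GHz. Whether a given core actually runs at the same voltage at the same frequency after a shrink is a separate question.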
Specint has held up much better than specfp from what numerous companies use in their benchmarks publicly.
Doesn't mean it's not antique (it is).
Again, it's this year.
I don't think there was a single Zen core so far that has resulted in higher power to hit the same frequency at the previous gen's Fmax yet.
Welcome to the world of shrinks and tocks and physdes updates.
You're trying to single out stuff you'll never be able to without having direct access to super mega confidential live process node data.
 
Wonder how much the improved Infinity Fabric + new IOD will help in that regard. IPC seems tricky to estimate because Zen 5 is bandwidth bottlenecked; if something loves clocks, cores and memory bandwidth at the same time, then Zen 6 could be a really huge improvement. And then Zen 7 on top will go beyond.
It's not membw bottlenecked in 1t. It tops out at 50 GB/s read bw from memory which is below your typical 64GB/s IOD <-> CCD throughput. It's more than LNC at 40 GB/s, but nowhere close to 80-90 GB/s monsters from Apple/QC.

unscientific comparison alert!
Compared to its predecessor in Zen 4, the core seemingly shows smallest improvements in branchy stuff due to severe regression in misprediction penalty. It's evident from branch-heavy SPEC subtests such as 541.leela (third green column counting from the right), where the score didn't change compared to 7840U at all and is stuck at 7.6:
1776638761461.png
or GB6 Navigation.
https://browser.geekbench.com/v6/cpu/compare/17232642?baseline=17059876 (I took two runs with high scores).
The misprediction rates and IPC are taken from the Geekbench 6 Benchmark Internals doc. The subtests are sorted in increasing order of perf uplift.
1776639616332.png

It's only one run, but it's enough to see that low-IPC/branchy stuff tends to show the smallest improvement.
I'd say that's the main area that requires improvement, followed by undersized INT PRF, then the latency regression of 128-bit scalar ops.
 
I find it interesting how people still have such high expectations for Intel to make better cores and always downplay AMDs.
Intel should have a good boost with NVL only because ARL was so seriously crippled by latency issues that even small changes should free it up quite a bit IMO.

AMD should have a good boost because they are getting a ~25% density improvement AND a clock speed increase of about 12%. So let's say IPC goes up by ~15% (I'm guessing a little lower than this, but we will see); that still gives you a performance boost of close to 30%, which is pretty impressive. Just don't expect this to happen with Z7.
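The "close to 30%" figure follows if you assume single-thread performance is roughly IPC times clock, so the two gains multiply rather than add. A minimal sketch of that arithmetic (the function name is made up for illustration, and the model ignores memory and power limits):

```python
def combined_uplift(ipc_gain: float, clock_gain: float) -> float:
    """Total performance gain, assuming perf = IPC * clock (first-order model)."""
    return (1.0 + ipc_gain) * (1.0 + clock_gain) - 1.0


# Figures from the post: ~15% IPC and ~12% clock.
z6_estimate = combined_uplift(0.15, 0.12)
print(f"{z6_estimate:.1%}")  # 28.8%
```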
Ppl should prepare for a Zen 6% style of disappointment first. I mean, it's a mere refinement core. Mobile/servers will surely benefit from the node jump, although desktop simply hopes that those 7 GHz rumors materialize.
See above. I think Z6 is going to be more like a total 25-30% improvement over Z5, but I seriously agree on Z7.
Improvement from Zen to Zen+ was 9% on average; besides, they got no density improvement, as they used the same mask for 14 and 12nm.

The research I see shows 15% density improvement. Getting 9% performance improvement with a 15% bigger transistor budget and a 7.5% clock speed bump hardly seems unreasonable.
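As a rough cross-check, you can back out how much of that 9% is left for IPC once the clock bump is removed. This assumes perf is roughly IPC times clock and takes the ~4.0 and ~4.3 GHz max clocks from the table earlier in the thread; the helper name is hypothetical, and the 9% average may not have been measured at peak clocks.

```python
def implied_ipc_gain(perf_gain: float, clock_gain: float) -> float:
    """IPC gain implied by a perf gain and a clock gain, assuming perf = IPC * clock."""
    return (1.0 + perf_gain) / (1.0 + clock_gain) - 1.0


# Zen -> Zen+: ~4.0 GHz to ~4.3 GHz is a 7.5% clock bump.
clock_gain = 4.3 / 4.0 - 1.0
ipc_gain = implied_ipc_gain(0.09, clock_gain)
print(f"{ipc_gain:.1%}")  # 1.4%
```

Under those assumptions, most of the Zen+ gain comes from clocks, with only a point or two left over for IPC.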

Fast forward to Z7: the process is going to give 20% more transistor budget, BUT the CCD is going from 12 to 16 cores (+33%). They are also doubling the L2 from 1 MB to 2 MB, so I am guessing that this pretty much eats up the transistor budget and then some. So I am mystified as to how people are expecting some big jump in performance (other than Cinebench) from Z6 to Z7. I'm guessing 10-15% IPC and little to no clock speed improvement.
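To put a number on that, here is a back-of-the-envelope sketch of the per-core transistor budget. It assumes the CCD area stays the same and that the whole 20% gain is spread evenly across cores, neither of which is confirmed; the function name is made up for illustration.

```python
def per_core_budget_ratio(budget_gain: float, old_cores: int, new_cores: int) -> float:
    """Per-core transistor budget relative to the previous generation."""
    return (1.0 + budget_gain) / (new_cores / old_cores)


# Figures from the post: +20% budget, 12 -> 16 cores per CCD.
ratio = per_core_budget_ratio(0.20, 12, 16)
print(f"{ratio:.2f}")  # 0.90
```

On those assumptions each core gets about 10% fewer transistors than before, and that is before the larger L2 is accounted for. Any real IPC gain would have to come from a bigger CCD or a denser design.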

Now it is possible that they make some smaller core count CCDs and that the big CCD is just for super high margin and DC stuff. If that's the case, then perhaps the crazy 15-25% IPC increase from MLID could be real, but my guess is that reality is going to fall south of that on desktop.

Really though, at this point in time these are all WILD speculations only.
 