Question Zen 6 Speculation Thread

Fjodor2001 · Jan 23, 2026

Thing is that some continue to talk about core count spam, implying that many cores with low perf/core is bad. But actually we should talk about Cinememe thread count spam, with many threads with low perf/thread. Because in the end, it’s threads that are executing, not cores.

gdansk · Jan 23, 2026

Fjodor2001 said:
Because in the end, it’s threads that are executing, not cores.

No, actually I'm pretty sure front ends do not execute an instruction (they can, however, fold and eliminate instructions).

Moreover it's almost deliberately misleading to focus on perf per thread. It obscures core performance in heterogeneous configurations. In the M5, for example, there is nothing working at the rate of "perf per thread". There are 4 cores working at some higher rate and 6 cores working at some lower rate (more or less).

Fjodor2001 · Jan 23, 2026

gdansk said:
No, actually I'm pretty sure front ends do not execute an instruction (they can, however, fold and eliminate instructions)

Not sure what your point is.

gdansk said:
Moreover it's almost deliberately misleading to focus on perf per thread. It obscures core performance in heterogeneous configurations. In the M5, for example, there is nothing working at the rate of "perf per thread". There are 4 cores working at some higher rate and 6 cores working at some lower rate (more or less).

Zen is not heterogeneous, so N/A.

But otherwise you’d have to consider average perf/thread instead. And then what I mentioned is still valid.

And w.r.t. Zen6 vs NVL-S, even an NVL-S E-Core will have higher perf/thread than a Zen6 SMT thread.

gdansk · Jan 23, 2026

Fjodor2001 said:
Not sure what your point is.

Threads don't execute anything. They are simply streams of (decoded) instructions. The only valid thing to care about for rendering is total throughput. Any uniform divisor is nearly useless. Perf per core is misleading in heterogeneous designs and perf per thread is misleading for SMT designs.

adroc_thurston · Jan 23, 2026

"here's how NVL-S can still win", part 11.

Fjodor2001 · Jan 23, 2026

gdansk said:
Threads don't execute anything. They are simply streams of (decoded) instructions. The only valid thing to care about for rendering is total throughput. Any divisor is nearly useless. Perf per core is misleading in heterogeneous designs and perf per thread is misleading for SMT designs.

See my previous post. You’re trying to overcomplicate the issue to diverge from the main point.

But go with this instead then:

MT throughput = Average perf per thread * Thread count
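
To see what this identity does and does not say, here is a rough sketch on made-up numbers. It assumes a hypothetical 4 fast + 6 slow layout with one thread per core; the rates are arbitrary placeholder units, not measurements, and the function names are only illustrative.

```python
# Hypothetical thread groups: (number of threads, throughput per thread).
# The rates are arbitrary placeholder units, not benchmark results.
groups = [(4, 10.0), (6, 4.0)]

def total_throughput(groups):
    """Sum of per-thread throughput over all threads."""
    return sum(count * rate for count, rate in groups)

def average_perf_per_thread(groups):
    """Total throughput divided by total thread count."""
    thread_count = sum(count for count, _ in groups)
    return total_throughput(groups) / thread_count

threads = sum(count for count, _ in groups)
total = total_throughput(groups)
avg = average_perf_per_thread(groups)

# The identity holds by construction: average * count gives back the total.
assert abs(avg * threads - total) < 1e-9
print(total, avg)
```

With these placeholder numbers the average works out to a rate that no thread in the example actually runs at, and the average itself can only be computed once the total is already known.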

Fjodor2001 · Jan 23, 2026

adroc_thurston said:
"here's how NVL-S can still win", part 11.

will

adroc_thurston · Jan 23, 2026

Fjodor2001 said:
will

Sure buddy sure, if you can sustain 350W off that socket, maybe.

adroc_thurston · Jan 23, 2026

You'd think he would start understanding™ things after seeing Venice .pptware.
Alas.

inquiss · Jan 23, 2026

gdansk said:
No that's a bad proxy.
Total MT throughput in Cinebench = sum over performance levels of (Perf per core * number of cores with that performance level)

Really, I'm not sure why you would want to bring perf per thread into the discussion at any point. It's needlessly inaccurate.

Sometimes a thread will be happily meandering along on topic and then *pow* a 48T fairy appears and it's hard to know why or if anyone wished the fairy to change the topic at all. But it happens.

inquiss · Jan 23, 2026

Fjodor2001 said:
Not sure what your point is.

Zen is not heterogeneous, so N/A.

But otherwise you’d have to consider average perf/thread instead. And then what I mentioned is still valid.

And w.r.t. Zen6 vs NVL-S, even an NVL-S E-Core will have higher perf/thread than a Zen6 SMT thread.

No it won't. You keep saying this as fact based on your assumption that "real cores" are faster than threads but it really depends on the workload and the cores in question.

inquiss · Jan 23, 2026

adroc_thurston said:
You'd think he would start understanding™ things after seeing Venice .pptware.
Alas.

Honestly a complete lost cause. Definitely failed the Turing test of this *cough* thread

adroc_thurston · Jan 23, 2026

inquiss said:
You keep saying this as fact based on your assumption that "real cores" are faster than threads but it really depends on the workload and the cores in question.

He doesn't even need to think.
You can model Olympic Ridge nT bump off Venice numbers really.

inquiss · Jan 23, 2026

adroc_thurston said:
He doesn't even need to think.
You can model Olympic Ridge nT bump off Venice numbers really.

Yeah but that requires basing something in the real world, rather than notions

adroc_thurston · Jan 23, 2026

inquiss said:
Yeah but that requires basing something in the real world, rather than notions

Well we have Venice numbers, Zen2 numbers, Zen4 numbers (both client and server).
Allows you to model for shrinks, IP updates, cache bumps, CC bumps etc.
Mind you, SIR2017 but still counts.

AMDK11 · Jan 23, 2026

SMT thread performance is irrelevant if both threads together increase the efficient use of a single core's resources at the cost of 5% extra complexity.

SMT Zen5 in CB26 gives +37%.

And 20%+ vs ST LionCove.

Fjodor2001 · Jan 23, 2026

AMDK11 said:
SMT thread performance is irrelevant if both threads together increase the efficient use of a single core's resources at the cost of 5% extra complexity.

SMT Zen5 in CB26 gives +37%.

Capped at 48T. So perf/thread given that thread count is what matters.

AMDK11 · Jan 23, 2026

It matters, especially since Intel's 48T are physical cores that occupy physical space in the silicon as full-fledged cores. A 48T Zen has 24 physical cores and only a 5% complexity increase for SMT.

As I mentioned, SMT is designed to fully utilize core resources where ST is difficult or impossible.

adroc_thurston · Jan 23, 2026

Fjodor2001 said:
So perf/thread given that thread count is what matters.

dawg that's not a real metric.
For sockets in nT you rate peak throughput at given power.

Venice is >1.7x SIR2017 vs 9965.
OMR is ???? SIR2017 vs 9950X.

AMDK11 · Jan 23, 2026

The performance of the SMT thread does not matter if both threads provide significantly higher performance for a single core.

And that's precisely what SMT is all about. It's not just about the performance of the thread itself, but the sum of both effects on the resource of the entire core.

If SMT gives, in certain conditions (I don't mean in every case), 20-40+% compared to ST at a complexity cost of 5%, then SMT still gives a big profit, especially since it allows performance to increase within the same core.

DAPUNISHER · Jan 23, 2026

Fjodor2001 said:
will

You keep forgetting which forums you are on. We don't allow that here. Shake your pom poms in the Intel thread. Doing it here is trolling.

Geddagod · Jan 23, 2026

adroc_thurston said:
N3 has Cac reduction vs N4.

Doesn't seem all that large tbh

But also how do improvements in active power help them reduce idle/SOC power usage that would make up the much lower reported core V/F curve?

adroc_thurston said:
read the chart again.

I'm referring to Huang's performance profiling for the BPU comments, not the mispredict penalty graph

adroc_thurston said:
Pipeline flushes mean L2 fetches with tiny L1's.

I don't even think this graph would show that for 3 reasons:
1, I don't think the array length or branch count are high enough to necessitate a fetch from the L2
2, even if it did, they would require that from both the predictable and random cases, meaning that it should be eliminated from the difference,
3, and if it wasn't, missing the L1i and requiring an L2 hit would cause a latency penalty in cycles almost as large as the mispredict penalty already is. It doesn't make any sense, the graph can't be showing that.

adroc_thurston said:
Yes?
AMD front-end latency is down to having a miserably small 32K L1i (with a much nicer core otherwise).

That would be reflected in a graph like this:

Or this:

But not on the graph I posted.

Kepler_L2 said:
Doesn't seem likely when NVL-S has 75% higher TDP than OR

Way more cores though.

Fjodor2001 · Jan 23, 2026

AMDK11 said:
The performance of the SMT thread does not matter if both threads provide significantly higher performance for a single core.

And that's precisely what SMT is all about. It's not just about the performance of the thread itself, but the sum of both effects on the resource of the entire core.

If SMT gives, in certain conditions (I don't mean in every case), 20-40+% compared to ST at a complexity cost of 5%, then SMT still gives a big profit, especially since it allows performance to increase within the same core.

Agreed in principle. But to get the total MT perf you also have to multiply by core count, which differs per CPU. I.e. something like this:

Total MT perf = Core count * Perf/core without SMT * SMT boost (if applicable)

So e.g. if core count is 2x for CPU A vs B, then SMT boost 20-40% of B will not be sufficient to counter that, all else equal.
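
As a rough sketch of that arithmetic, assuming "all else equal" means identical perf/core without SMT (which is exactly the part being argued about). The core counts and the normalized perf/core of 1.0 are hypothetical placeholders, and `total_mt_perf` is just an illustrative name.

```python
def total_mt_perf(core_count, perf_per_core_no_smt, smt_boost=1.0):
    """Total MT perf = core count * perf/core without SMT * SMT boost."""
    return core_count * perf_per_core_no_smt * smt_boost

# Hypothetical CPUs: A has 2x the cores of B and no SMT; B has SMT.
# "All else equal" is modeled as the same perf/core without SMT (1.0).
cpu_a = total_mt_perf(core_count=48, perf_per_core_no_smt=1.0)

for boost in (1.2, 1.4):
    cpu_b = total_mt_perf(core_count=24, perf_per_core_no_smt=1.0, smt_boost=boost)
    print(f"SMT boost {boost:.1f}x: B/A = {cpu_b / cpu_a:.2f}")
```

Under that assumption B lands at roughly 0.6x to 0.7x of A. The result is only as good as the equal perf/core input, so changing `perf_per_core_no_smt` for either CPU moves the ratio directly.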

poke01 · Jan 23, 2026

No point discussing this further, let's just wait for Q4 26.

Fjodor2001 · Jan 23, 2026

Regarding release dates, are we expecting the X3D Zen6 SKUs to be available already in 2026H2? Or like for Zen5 only non-X3D in first wave, then X3D in e.g. 2027Q3, and X3D2 some time after that?
