
Zen 6 Speculation Thread

Page 392
Both have advantages and disadvantages. I'm saying that the comparison is not as straightforward as it might seem on the surface. I'm sure if AMD and Intel could "start over" with a completely new instruction set and OS they could do what Apple has done with the M series.
What you compile for also matters. x86_64 software still isn't compiled for AVX despite it being ages old; Intel is one of the reasons as well.
Google Chrome is compiled with SSE3/4, not exactly sure which version.
 
What you compile for also matters. x86_64 software still isn't compiled for AVX despite it being ages old; Intel is one of the reasons as well.
Google Chrome is compiled with SSE3/4, not exactly sure which version.
Yes. x86 developers are still writing for Skylake. A necessary downside of a completely open infrastructure.
 
Bulldozer's FPU was not a problem, though. It was so starved on the memory side that it was not really possible to saturate the FPU execution units on any realistic load.

It's weird how the narrative around that core worked out. It was a bad design, but the parts that made it a bad design (combination of write-through L1 and slow-as-molasses L2), were harder to see on a block diagram than the shared FPU, so everyone concentrated on that even though it was basically never the bottleneck.
Back in the day, on an old forum now defunct, I raised this question to an AMD rep that participated in the forum. His reply to me was

"Don't worry, we are going to keep these execution units fed!"

He ended up being quite wrong and my architectural concerns were absolutely on target after all.
 
Skylake is rather new; people still write for SSE4, aka pre-Nehalem 🤣🤣
Pre-Nehalem only had SSE4.1, and that's just one generation, Penryn, which actually fell out of Microsoft Windows support, of all things.
Nehalem/SSE4.2 is the new baseline requirement of *Windows*.
 
Back in the day, on an old forum now defunct, I raised this question to an AMD rep that participated in the forum. His reply to me was

"Don't worry, we are going to keep these execution units fed!"

He ended up being quite wrong and my architectural concerns were absolutely on target after all.
From what I remember, the biggest flaw with Bulldozer was CMT: basically two logical processors sharing execution resources. It was an attempt to maximize compute per die area. Kaveri fixed a lot of the front-end starvation issues by adding per-core decoders and better branch prediction, but by then it was too late, because Skylake was "born" around the same time and the Bulldozer-era cores couldn't compete. A new direction was needed, AMD realized that, and Zen was born.
 
Pre-Nehalem only had SSE4.1, and that's just one generation, Penryn, which actually fell out of Microsoft Windows support, of all things.
Nehalem/SSE4.2 is the new baseline requirement of *Windows*.

IMHO, AVX2 will become the common minimum pretty fast, if not already (except for games, maybe).
I had to retire my beloved X79 system because some software dropped pre-AVX2 support...
 
I hope most of their IPC gains this time around come from integer workloads. There are certainly low-hanging fruits (of small benefit) on the decoder side, like finally allowing two branches to be decoded per cycle for a single thread, or bringing back some Zen 4 optimizations. Not that decoding is even the main bottleneck.

Yeah, the majority of new resources/optimizations should be on integer.

AVX-512 is already fine and AMD is adding separate units for Matrix Multiplications. So, it will be all about integer on the cores and the SoCs on the client.
 
While I understand your point, and I would agree the efficiency of Apple's M core is astounding, I don't think it is better at running x64 code than Skymont.

It's like comparing a car built for road racing to one built for off-road. Sure, they are both vehicles, but they run on completely different infrastructures. Apple has a tightly controlled ecosystem. No DIY builds, very limited upgrades, a limited software base, no third-party hardware, limited third-party software (only through the App Store, etc.), no requirement to run apps from 50 years ago, etc. It's a tightly controlled, perfectly paved race track with no bumps. Due to their tightly controlled system they've even been able to start totally from scratch with their code base a few times over the years.

x64, on the other hand, is a jungle of thousands of third-party hardware parts and software apps spanning 50-plus years that have to work across generations and generations of processors and hardware. CPUs built for x64 have to be able to handle wild terrain.

Both have advantages and disadvantages. I'm saying that the comparison is not as straightforward as it might seem on the surface. I'm sure if AMD and Intel could "start over" with a completely new instruction set and OS they could do what Apple has done with the M series.
Hulk, I was talking about the new M core, as in the middle-core architecture in the M5.

We've reached a point where x64 isn't the only side making fast cores. This isn't 2010. Yes, I know that Apple doesn't do DIY/third-party hardware, but DIY is a very SMALL piece of the TAM for x64 CPUs. I don't really care if x64 can run an app that was made 50 years ago natively; most people don't care, and when comparing uarchs that is completely irrelevant, as it doesn't make a design worse.

Apple laptops, however, are very popular here and elsewhere, and pretty much own the premium laptop segment, i.e. laptops over $999. Now they are moving down to the $499-$599 budget segment, which is where the bulk of Windows laptops are sold, and they're selling well.

As for the Apple OS side, you can install outside the App Store, and yes, you can get to a root folder on a Mac. I don't know why you have this idea that Macs run iOS; they don't.

Apple’s CPUs are not good because they are built for a controlled environment, but because they are simply just good (see Apple CPUs running on macOS vs Linux: it’s the same performance for CPU tasks, or even better for some server tasks on Linux).

Your argument for Apple's cores being the best is that the ARM ISA is "new" and that Intel/AMD could do something similar. That's not even remotely true: RISC-V is a newer ISA, and RISC-V Linux doesn't have the same baggage as x86 Linux, and yet not a single RISC-V core can compete with Skymont in terms of perf/mm² or absolute perf. Not a SINGLE company has managed it with their latest RISC-V designs.
 
So, it will be all about integer on the cores and the SoCs on the client.
No, it's just perf. FP included.
Never waste your lead and all.
As for the Apple OS side, you can install outside the App Store, and yes, you can get to a root folder on a Mac. I don't know why you have this idea that Macs run iOS; they don't.
Atta boy, Gatekeeper is annoying.
Apple’s CPUs are not good because they are built for a controlled environment, but because they are simply just good (see Apple CPUs running on macOS vs Linux: it’s the same performance for CPU tasks, or even better for some server tasks on Linux).
YES
 
The only "issue" I'm aware of for Apple cores and server workloads would be the 16K minimum page size, and the fact that many real servers spend a stupid amount of I/O writing tiny log messages to disk/network.

What would be interesting is what Apple's steady-state, high-load core performance looks like when you bolt it onto a memory subsystem designed to scale to large numbers (> 128) of cores with "good" NUMA performance. I suspect still very, very good, but that fast 16 MB L2 is yum, and that thing is probably not that easy to scale.
 
This time AMD has a nice and meaty roadmap, though.
Yeah, Intel was also at the peak of their semiconductor powers, and foundries were dropping like flies; you could get away with a lot when Moore's law was alive just by being ahead in semi. I don't have much faith in Intel these days, so I guess the question is: I wonder when ARM will decide it would like some money from vendors for DC cores?
 
Hulk, I was talking about the new M core, as in the middle-core architecture in the M5.

We've reached a point where x64 isn't the only side making fast cores. This isn't 2010. Yes, I know that Apple doesn't do DIY/third-party hardware, but DIY is a very SMALL piece of the TAM for x64 CPUs. I don't really care if x64 can run an app that was made 50 years ago natively; most people don't care, and when comparing uarchs that is completely irrelevant, as it doesn't make a design worse.
DIY x64 might be small but the percentage of computers using x86/x64 is huge. It IS harder to maintain compatibility for over 50 years than to just be able to start from scratch when you feel like it because you have a tightly controlled ecosystem.
Apple laptops, however, are very popular here and elsewhere, and pretty much own the premium laptop segment, i.e. laptops over $999. Now they are moving down to the $499-$599 budget segment, which is where the bulk of Windows laptops are sold, and they're selling well.
I'm sure they are great for people who don't know what a directory tree is or couldn't care less about it.
As for the Apple OS side, you can install outside the App Store, and yes, you can get to a root folder on a Mac. I don't know why you have this idea that Macs run iOS; they don't.
Got it. It is still a very closed (East Berlin) type of system.
Apple’s CPUs are not good because they are built for a controlled environment, but because they are simply just good (see Apple CPUs running on macOS vs Linux: it’s the same performance for CPU tasks, or even better for some server tasks on Linux).
Your argument for Apple's cores being the best is that the ARM ISA is "new" and that Intel/AMD could do something similar. That's not even remotely true: RISC-V is a newer ISA, and RISC-V Linux doesn't have the same baggage as x86 Linux, and yet not a single RISC-V core can compete with Skymont in terms of perf/mm² or absolute perf. Not a SINGLE company has managed it with their latest RISC-V designs.
So are you saying that if Apple wanted to, they could make their cores run x64 natively and immediately clean the clocks of Intel and AMD in the x64 space? I'm sorry, but I find that hard to believe. I find it more likely they would run into the same design challenges that Intel and AMD have been struggling with for the past five decades. ARM was designed with efficiency in mind; x86, not so much. I mean, ARM was designed from the start to be a full 32-bit RISC architecture, with no need for uops. AMD and Intel have been working around that limitation since the PowerPC and still are!
 
I wonder when ARM will decide it would like some money from vendors for DC cores?
They already did, v9 rates are up and CSS rates are especially up.
Makes Venice an offer no one refuses.
So are you saying if Apple wanted to they could make their cores run x64 natively and immediately clean the clocks of Intel and AMD in the x64 space?
Yes.
I mean, they already do TSO (which is the tricky part).
 
The only "issue" I'm aware of for Apple cores and server workloads would be the 16K minimum page size, and the fact that many real servers spend a stupid amount of I/O writing tiny log messages to disk/network.

VM page size doesn't affect that stuff. For years we had most CPUs with a 4K VM page size and hard drives with a 512-byte sector size. It wasn't until they crossed the 2 TB barrier that sector sizes went to 4096 bytes to match the VM page size. Hopefully, if you are writing log files, you are allowing some level of write caching rather than syncing to storage with every 50-byte message. That's gonna be a MUCH bigger problem for NAND than the difference between 4K and 16K VM page size.

Ditto for network: interfaces use DMA to move data from DRAM to the wire, and the VM page size doesn't enter into it.

The major efficiency hit of larger pages is for applications that do a lot of small memory allocations and deallocations and fragment your allocation area. Modern allocators do a good job of dealing with this, but applications that think they can do a better job than the OS and manage it themselves may not. But there are some advantages to larger page sizes, such as making it easier to implement larger low-level caches and simpler page table structures, so it is probably mostly a wash.

DEC Alpha seemed to work pretty well for servers, despite its 8K VM page size. And that was over 30 years ago, when server main memory was three orders of magnitude smaller than it is today. Apple's 16K page sizes are a hassle for compatibility with poorly written software that assumes a 4K page size, meaning porting would be a problem, but that's about its only meaningful impact.
 