
Question Zen 6 Speculation Thread

Page 399
Why should they? If you're tossing out 16 and 32 bit encodings that frees up a lot of prefixes. Why wouldn't you want to use them for all the crap using those EREX or whatever the heck ridiculous 8 and 10 byte instructions some of the 64 bit stuff is saddled with?

The CPU could still ALSO handle the current 64 bit encodings (to avoid needing a translator for those) but code using the new encodings would be smaller and make more efficient use of L1I and L2 cache.

That's where I get the small single digit boost - better use of icache. Not worth all the hassle, but you would get SOMETHING out of it if you went through all that.
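To put a rough number on that "small single digit boost," here's a toy icache model in Python. The miss rates, hit cost, and miss penalty are invented assumptions purely for illustration, not measured figures for any real core:

```python
# Toy icache model: denser encodings -> more code fits in L1I -> fewer misses.
# All numbers below are invented assumptions for illustration.

def avg_cycles_per_instr(miss_rate, hit_cost=1.0, miss_penalty=10.0):
    """Crude average cost of fetching one instruction."""
    return (1 - miss_rate) * hit_cost + miss_rate * miss_penalty

base = avg_cycles_per_instr(miss_rate=0.02)     # legacy encodings (assumed)
dense = avg_cycles_per_instr(miss_rate=0.015)   # denser code, fewer misses (assumed)

speedup = base / dense
print(f"{speedup - 1:.1%} front-end gain")  # ~4% in this toy model
```

Pick different (still plausible) miss rates and you land anywhere in the low single digits, which is the point: something, but not much.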

I think "a lot" overstates how much would actually be freed up.

16 REX prefixes that were repurposed from the one-byte inc/dec instructions.

~15 instructions that are simply declared illegal in 64-bit mode.

~30 push/pop-style instructions whose size or behavior depends on the mode, plus some 32-bit and 16-bit addressing forms in 64-bit mode.

So ~64 of 256 = ~25% of the base one-byte instructions removed.

When you consider that most of the instructions dealing with vectors, complicated immediate-mode operands and the like are already routed through ModR/M, you're looking at ~64 / 65536 ≈ 0.001, or about 0.1%, of the full opcode map.
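Running the back-of-the-envelope arithmetic above as a quick Python sketch (the counts are the rough estimates from this post, not exact ISA figures):

```python
# Rough counts from the estimate above (assumptions, not exact ISA figures)
rex_prefixes = 16        # one-byte opcodes repurposed as REX in 64-bit mode
illegal_in_64 = 15       # instructions simply declared invalid in 64-bit mode
mode_dependent = 30      # push/pop-style and addressing-size special cases
freed = rex_prefixes + illegal_in_64 + mode_dependent   # ~61, call it ~64

one_byte_map = 256           # the base one-byte opcode space
full_map = 256 * 256         # crude stand-in for the opcode + ModR/M space

print(f"{freed / one_byte_map:.1%} of the base map")   # 23.8% of the base map
print(f"{freed / full_map:.2%} of the full map")       # 0.09% of the full map
```

So a big dent in the one-byte map, but a rounding error against the full encoding space.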

There may be some advantages gained by reduction in complexity of the instruction decoders, but considering everything is micro-op cached, another few micro %?

The one real advantage would be security. The legacy modes expose interrupt handling and privilege transitions in ways that are hard to make safe.
 
Seems like "Intel is back", and it always surprises me how much power it draws with so many cores.
I imagine that starting with Zen 6, AMD will also be forced to pair large cores with a lot of compact cores to stay competitive. But will AMD also be able to keep power consumption under control? Will the current design, an IOD fabbed on an older process alongside the CCDs, continue to work?
Intel doesn't appear to have an answer for Venice variants in DC. Ironically, Intel is looking strong in low end desktop and laptop, but still has no answer to AMD's HEDT and gaming solutions.

In other words, AMD is claiming all the most profitable markets and Intel is forced into lower margin, but higher volume markets.

Seems like Intel is falling behind from this POV.
 
On the subject of old instructions, I doubt the transistor count is worth a hill of beans 😉.

I think Intel actually has it right with more specific extensions to raise IPC. My thought is that moving forward we will have many more different kinds of CCDs that get mixed together for specific types of uses.

Right now we really have GPU core, big core, little core, and AI core.
 
Those extensions are why the strict definition of IPC doesn't really tell us much anymore. Some programs emit fewer but denser instructions, and others, especially older and unoptimized ones, emit more simple instructions (less data-throughput-efficient instructions, to be more pedantic). IPC is really only useful when you are talking about a specific program (or suite of programs running an identical script) running on a specific operating system version in isolation.

Now, you can talk about a given processor's peak data-operation throughput when using its most efficient instruction set extensions, with sufficient power, sufficient cooling, and its maximum specified performance RAM. You can also talk about that number when restricted to various extensions. But just throwing IPC out there with zero context tells us next to nothing.
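A toy example of the point about denser instructions (Python; the per-build cycle counts are made-up assumptions, not measurements of any real CPU):

```python
# Toy model: the same job (sum 1024 floats) compiled two ways.
# All cycle counts are illustrative assumptions, not measurements.

elements = 1024

# Scalar build: one add instruction per element
scalar_instructions = elements
scalar_cycles = 512            # assume 2 scalar adds retired per cycle

# 512-bit vector build: 16 elements per instruction
vector_instructions = elements // 16
vector_cycles = 64             # assume 1 vector add retired per cycle

scalar_ipc = scalar_instructions / scalar_cycles   # 2.0
vector_ipc = vector_instructions / vector_cycles   # 1.0

# "IPC" says the scalar build is twice as good...
print(scalar_ipc, vector_ipc)
# ...but elements processed per cycle tells the real story:
print(elements / scalar_cycles, elements / vector_cycles)  # 2.0 vs 16.0
```

Same work, half the IPC, eight times the throughput, which is exactly why IPC without context is meaningless.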
 
Now there's a big can of worms!

I agree. I think IPC is becoming quite dated as a metric.

It still comes down to how much you can jam through the pipe... just like a car. In automotive, I always tell young engineers not to think about the engine as an internal combustion engine. Think of it as an air pump. The more air you can move through the engine (with the right mix of fuel), the higher the power it will generate.

Processors are not that different I think. The more data you can move through, the more work that gets done.

You are definitely correct about needing context though. IPC alone isn't very representative IMO.
 