
Question Zen 6 Speculation Thread

Thanks.

From your linked document:


From your link:

As much as I respect this source, this is speculation ..... and I seriously doubt it.

I believe that future e-cores will support AVX10 with a 128 bit path and that P cores will likely support a full 512 bit path. This is my speculation; however, the spec linked by @511 clearly shows that it supports 3 different bit paths .... not ONLY 512bit.

Also, his statement is incomplete: "AVX10 had intended to allow 256 or 512 bit modes depending upon processor capabilities". He left off 128bit, which raises further doubt about that speculative statement.

And I don't know how you interpret adroc's statement:


... any other way than that adroc doesn't believe AVX10 supports 128b or 256b (which it clearly does). Of course, he COULD clarify his difficult-to-justify statement to explain himself.
128-bit-only mode was already killed a couple years ago.

 
I believe that future e-cores will support AVX10 with a 128 bit path and that P cores will likely support a full 512 bit path. This is my speculation; however, the spec linked by @511 clearly shows that it supports 3 different bit paths .... not ONLY 512bit.
That's not true. Initially there were three implementations of the vector length: 128/256/512b. 128 got killed, and so did the 256-bit vector (linked in the Phoronix article), so now we only have the 512-bit vector length. I mean the size supported by the hardware.

And then there is the size of the data: you can have 128/256/512b vectors with their respective instructions, which is different from the hardware's maximum supported vector length.
 
It's my understanding that AVX10 is designed to provide a single code base the ability to run on a variety of cores having various levels of bit widths in the execution path.

AMD are certainly not going to need to support anything other than 512b.

Intel P cores? Likely 512b.

Intel e cores? I'm guessing 128b and maybe 256b.

What is the point of AVX10 over AVX512 if not for the variable bit widths?
 
It's my understanding that AVX10 is designed to provide a single code base the ability to run on a variety of cores having various levels of bit widths in the execution path.
No. It's 512b only. The underlying FMA hardware can be smaller, but regs/shuffle etc no luck.
What is the point of AVX10 over AVX512 if not for the variable bit widths?
It was the point.
Then AMD showed the rightful way.
 
It's my understanding that AVX10 is designed to provide a single code base the ability to run on a variety of cores having various levels of bit widths in the execution path.
AVX10 has exactly the same register bit-width support as AVX512 with its extensions.

The ISA, however, does not specify how it should be implemented in hardware.


What is the point of AVX10 over AVX512 if not for the variable bit widths?
That it fixes the extension mess of AVX512. By unifying all popular and de facto standard extensions under one name.

Intel also intended to make 512b registers optional so they could make their life easier on desktop and keep other nice features while keeping 512b for server.

Fortunately they were made to understand that it was a stupid approach because it would fragment the ISA they wanted to unify. If they want to save space they can do it via the hw implementation (like AMD does).

This matters because if the same instructions are supported everywhere, then you can write and test your software on a cheap CPU (that does 256b datapaths in hw) and then deploy it on a faster, more expensive CPU (that does 512b datapaths in hw) and get the expected speedup without lifting a finger, at least in theory, as there are always corner cases😉

If they had kept AVX10/256, that would not be the case, making life harder for everyone.
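
To make the "same code, different datapath" argument a bit more concrete, here is a toy model in Python. It assumes, purely for illustration, that a 512b instruction on narrower hardware is simply split into several passes over the datapath and that nothing else (memory, clocks, shuffles) limits throughput. `passes_per_op` and `relative_speedup` are hypothetical helper names, not anything from a real tool.

```python
import math


def passes_per_op(vector_bits, datapath_bits):
    """Passes a single vector op needs on a datapath of the given width.

    Toy assumption: the op is split evenly and nothing else gets in the way.
    """
    return math.ceil(vector_bits / datapath_bits)


def relative_speedup(vector_bits, slow_datapath_bits, fast_datapath_bits):
    """Idealised speedup of the same code when moved to the wider datapath."""
    slow = passes_per_op(vector_bits, slow_datapath_bits)
    fast = passes_per_op(vector_bits, fast_datapath_bits)
    return slow / fast


if __name__ == "__main__":
    # Same 512b code, first on a 256b datapath, then on a 512b datapath.
    print(passes_per_op(512, 256), passes_per_op(512, 512))
    print(relative_speedup(512, 256, 512))
```

Under these assumptions the binary does not change at all between the two machines, only the number of passes per instruction does. Real workloads will deviate from this, which is the "corner cases" caveat above.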
 
No. It's 512b only. The underlying FMA hardware can be smaller, but regs/shuffle etc no luck.

It was the point.
Then AMD showed the rightful way.
Robert Hormuth describes it as follows:
Regular x86 Ecosystem Advisory Group working sessions have already surfaced “aha” moments that only an open forum can uncover. One highlight to note, cloud providers made it clear that memory tagging, once considered a debug only feature, is mission critical for protecting production workloads. Another important moment to highlight is that maintaining identical ISA support across all form factors dramatically streamlines software delivery; letting developers write once and deploy anywhere.
 
From a practical point of view you can do that with AVX512, as only Xeon Phi does not support the 128b/256b AVX512 extension, afaik. Yes, you need to check one more flag for dynamic dispatch compared to AVX10, but I would say that's hardly a problem.
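
The "one more flag" point can be sketched as a small dispatch check. This is a minimal illustration in Python, not how any particular library does it. The names `avx512f` and `avx512vl` follow the spelling Linux uses in `/proc/cpuinfo`; `avx10` is an assumed placeholder for however AVX10 support ends up being reported, and both helper functions are hypothetical.

```python
def can_use_256b_evex(flags):
    """Return True if 256b EVEX-encoded vector code may be dispatched.

    `flags` is a set of CPU feature names, e.g. parsed from /proc/cpuinfo.
    """
    # AVX10 path (assumed flag name): a single check, on the thread's
    # premise that the narrower vector lengths are part of the baseline.
    if "avx10" in flags:
        return True
    # AVX-512 path: the foundation flag plus the VL (vector length)
    # extension, i.e. the one extra flag mentioned above.
    return "avx512f" in flags and "avx512vl" in flags


def read_cpu_flags(path="/proc/cpuinfo"):
    """Best-effort flag reader for Linux; returns an empty set elsewhere."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    _, _, rest = line.partition(":")
                    return set(rest.split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print(can_use_256b_evex(read_cpu_flags()))
```

A CPU that reports `avx512f` without `avx512vl` (the Xeon Phi case mentioned above) fails the check, while everything else with AVX-512 passes it, so the extra test costs one set lookup at startup.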

Are there any instances of developers using AVX-512 to handle 256b length vectors?

No, that is dead.
AVX10 is 512b only.

Hmm. Learn something new every day. I guess it makes sense.
 