Discussion AMD's Soundwave ARM APU: The Beginning of Transformation !!!


soresu

Diamond Member
Dec 19, 2014
3,899
3,331
136
As long as x86 is strategic, it won't die.
As long as legacy software exists, x86 won't die.

Emulation might be fine for some, but businesses and infrastructure that rely on legacy x86 code for critical use cases are never going to risk a "mostly fine" implementation.
 

soresu

2-3 Neoverse licensing hikes and they'll be crawling back and begging for mercy.
The thing is that Amazon and others like them with custom Neoverse-based SoCs are not selling chips; they are selling VM instances.

That means tracking exactly how many chips have been manufactured is going to be very difficult unless ARM Ltd somehow has spies within TSMC or other foundries.

If ARM Ltd starts pulling a fast one, the licensees can just start playing creative accounting with how many chips they have actually ordered from foundries.

Or they can start modifying older Neoverse cores on the sly so that they get more performance for the same license cost.

 

soresu

Which would look like a giant fail to me. ARM ridiculed again.
Not ridiculed per se.

Their cores seem to be quite competitive in the high and mid range. It's only the so-called "efficiency" low-range A5xx that has become worthy of ridicule for its relative inefficiency: after the long wait following A55, it was replaced only by A510, which didn't blow anyone away, and then by A520, which seems pretty meh for a 2 yr update compared to the gains of both A7xx and X9nn/Xn.

I'll eat my hat double time if A530 comes out with an sOoO-style core and doubled perf/watt at iso-node, but I don't see that happening.

As for why other companies are going custom, it's more a matter of product differentiation and was bound to happen (plus, of course, the companies licensing Neoverse V and N cores are basically subsidizing A7xx and X9nn development anyway, so there's no need to worry on that score).

The question is whether any new companies like Cix are going to step up and buy into the higher-end consumer ARM core space as the WoA horizon broadens.
 

soresu

Where did the name Prometheus come from?

The only time I've heard it mentioned, other than as a Zen C core variant, was MLID saying a yr ago that it was the codename for Zen 7.
 

DZero

Golden Member
Jun 20, 2024
1,291
472
96
Not ridiculed per se.

Their cores seem to be quite competitive in the high and mid range. It's only the so-called "efficiency" low-range A5xx that has become worthy of ridicule for its relative inefficiency: after the long wait following A55, it was replaced only by A510, which didn't blow anyone away, and then by A520, which seems pretty meh for a 2 yr update compared to the gains of both A7xx and X9nn/Xn.

I'll eat my hat double time if A530 comes out with an sOoO-style core and doubled perf/watt at iso-node, but I don't see that happening.

As for why other companies are going custom, it's more a matter of product differentiation and was bound to happen (plus, of course, the companies licensing Neoverse V and N cores are basically subsidizing A7xx and X9nn development anyway, so there's no need to worry on that score).

The question is whether any new companies like Cix are going to step up and buy into the higher-end consumer ARM core space as the WoA horizon broadens.
In-order cores are no longer useful in phones and PCs. They only work in small devices that don't need an interface, and even there they are starting to lose prominence.
 

adroc_thurston

Diamond Member
Jul 2, 2023
6,051
8,534
106
The thing is that Amazon and others like them with custom Neoverse-based SoCs are not selling chips; they are selling VM instances.

That means tracking exactly how many chips have been manufactured is going to be very difficult unless ARM Ltd somehow has spies within TSMC or other foundries.
Neoverse fees are per core shipped.
ARM doesn't have to track anything; it's in the licensing agreement.
 

soresu

In-order cores are no longer useful in phones and PCs. They only work in small devices that don't need an interface, and even there they are starting to lose prominence.
There is a midpoint that seems to be developing in the research space between InO and OoO, though I'm not sure if anyone has actually produced a CPU core based on these concepts as yet.

Some of them consume only tens of milliwatts more than an in-order core, yet perform at somewhere around 70-95% of an out-of-order core of equal width.
 

gdansk

Diamond Member
Feb 8, 2011
4,210
7,076
136
So who tells them how many cores/chips then?

It has to be TSMC or the company that designed the SoC.
Lying about core shipments is not a winning move or a cost-cutting measure. You will end up paying more in damages and legal fees the moment ARM notices that revenue isn't growing despite more customers using Graviton.
 

soresu

Lying about core shipments is not a winning move or a cost-cutting measure
I don't think so either, but Amazon's business model is to use its chips internally for VM instances rather than sell them on as Intel, AMD or Samsung do, so it seems like getting an accurate count would be difficult at best if they decided to fudge the numbers.
 

StefanR5R

Elite Member
Dec 10, 2016
6,558
10,310
136
There is a midpoint that seems to be developing in the research space between InO and OoO, though I'm not sure if anyone has actually produced a CPU core based on these concepts as yet.
Ah yes, this goes way back to this patent on cores which are neither in-order nor out-of-order, yet become simultaneously in-order and out-of-order if nobody is looking. AFAIU, it hasn't been productized yet, but nevertheless the inventor already made boatloads of money off it.
 

soresu

Ah yes, this goes way back to this patent on cores which are neither in-order nor out-of-order, yet become simultaneously in-order and out-of-order if nobody is looking. AFAIU, it hasn't been productized yet, but nevertheless the inventor already made boatloads of money off it.
Boatloads of money?

Sounds like the big bois already came knocking and the product is on roadmaps.

Either way there are many research papers on these "slice out of order" cores, so I imagine there are more than a handful of different patent holders in this new IP sector.
 

StefanR5R

Any sources for that,
The notion that there is supposed to be something *between* "in order" and "out of order" reminded me of phenomena in quantum physics, in which something can be in two antagonistic states at once — until observed, at which point it collapses into either one or the other. Then, thinking of one W. Heisenberg reminded me of the other W. Heisenberg…


IGMC.
 

Tigerick

Senior member
Apr 1, 2022
782
750
106
1. Power Efficiency

On the x86 side, the higher performance / higher frequency devices are the established market that our devices have to compete in, so that's where our design focus is. We could build the same Zen microarchitecture with an ARM ISA on top instead. We could deliver the same performance per watt. We don't view the ISA as a fundamental input to the design as far as power or performance.

Mike Clark made the statement above regarding performance per watt. For comparison, I have created a table comparing x86 and ARM on Geekbench 6 ST performance, and I hope nobody believes what Mike claimed about PPW. Qualcomm, MediaTek, not to mention Apple's M4, are all surpassing AMD's KRK/STX in single-thread performance with fanless designs. You could argue that Qualcomm's X Elite performs slower than KRK/STX, but that is about to change with the upcoming 18-core Oryon V3 by the end of this year.

AMD's upcoming Venice server SoC, which is fabbed on TSMC's N2 process, has 32C/64T per compute tile. We don't know the total TDP of the 256-core part, but judging by the socket change to support 16-channel DDR5, Venice with 256/192 cores should require at least 600W. Meanwhile, NV has pre-announced its custom ARM core, Vera/Olympus, with 88 cores supporting SMT but consuming just 50W. That means with 2 tiles linked together through NVLink, NV could have a Vera Superchip with 176C/352T consuming only 100W... Go figure.
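
For anyone who wants to check the arithmetic behind that comparison, here is a minimal Python sketch. The TDP and core counts are the estimates quoted in this post (600W for a 256-core Venice, 50W for an 88-core Vera), not confirmed specifications, and the helper name `watts_per_core` is made up for illustration. 2-way SMT on Vera is also an assumption taken from the 176C/352T figure.

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """Average power budget per core; ignores uncore, I/O and memory power."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tdp_watts / cores


# All figures below are the poster's estimates, not confirmed specifications.
venice = watts_per_core(600, 256)  # 256-core Venice at an assumed 600 W
vera = watts_per_core(50, 88)      # 88-core Vera at a claimed 50 W

# Two Vera tiles linked over NVLink, assuming 2-way SMT per core.
vera_superchip_cores = 2 * 88
vera_superchip_threads = 2 * vera_superchip_cores
vera_superchip_watts = 2 * 50

print(f"Venice: {venice:.2f} W/core")
print(f"Vera:   {vera:.2f} W/core")
print(f"Ratio:  {venice / vera:.2f}x")
print(f"Vera Superchip: {vera_superchip_cores}C/{vera_superchip_threads}T "
      f"at {vera_superchip_watts} W")
```

On these numbers the per-core gap works out to roughly 4x. The comparison only means something if both TDP figures cover the same scope (CPU only versus whole package), which the post does not establish.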
 

Tigerick

2. Performance

Architecture | Date Announced | Features | ARM core | Custom Core | RISC-V
ARMv8.6-A | Q4-2019 | GEMM | | M3, A17 Pro |
ARMv8.7-A | Q4-2020 | | | Oryon V1 |
ARMv9.0-A | Q1-2021 ? | SVE2; TME; CCA | Cortex X2-X3 | | RVA-23 - SiFive P870
ARMv9.2-A | Q4-2020 | SME | Cortex X4, X925, X930 | M4, A18 |
ARMv9.3-A | Q4-2021 | Non-maskable interrupts; instructions to optimize memcpy() and memset() style operations; enhancements to PAC; hinted conditional branches | | |
ARMv9.4-A | Q4-2022 | Virtual Memory System Architecture (VMSA) enhancements; SME2; Guarded Control Stack (GCS); Confidential Computing | | Vera ?, Prometheus ? |
ARMv9.5-A | Q4-2023 | FP8 support (E5M2 and E4M3 formats) added to SME2, SVE2, Advanced SIMD (Neon) | | |
ARMv9.6-A | Q4-2024 | Improved SME efficiency with structured sparsity and quarter tile operation; new SVE instructions for expand/compact and finding first/last active element | | |

A table says a thousand words... I also listed RISC-V's latest profile, RVA23; its performance is at about Cortex-X2 level. RISC-V is still far behind in the development cycle; let's see how much improvement newer profiles bring.
 

Tigerick

3. The Arrival of LPDDR6

LPDDR6 will bring a huge bump in memory bandwidth, which means all vendors will bump the core counts of CPU, GPU and NPU. With more cores comes more power consumption. x86 with its high clock speeds won't be able to scale without a big bump in power usage. There was a rumor about Qualcomm's next-gen X Elite with a 192-bit memory bus coming with 18 cores. And if my speculation is correct, the upcoming Apple M5 Pro with 192-bit LPDDR6 should get a bump from 14 cores to 18 cores as well? We are talking about monolithic SoCs, not chiplet designs, which are more expensive and power hungry.

 | Grace | Vera | Vera Superchip | Turin / Dense | Venice SP8 | Venice SP7 | Prometheus | Clearwater Forest
CPU | Neoverse V2 | Olympus | Olympus | Zen 5 / 5c | Zen 6 / 6c | Zen 6c | Custom Armv9.4 w/SMT ? | CWF-SP
Boost Clock | | | | 4.1 GHz / 3.7 GHz | | | |
All-core | 3.0 GHz | | | 4.1 GHz / 3.35 GHz | | | |
L3 Cache | 114 MB | | | 32MB / 32MB per CCD | 48MB / 128MB per CCD | 128MB per CCD | | 32MB per Tile ?
Total L3 Cache | 114 MB | | | 512MB / 384MB | 384MB / 512MB | 768MB / 1024 MB | | 384MB ?
CPU Tiles | 1 | 1 | 2 | 16 / 12 + 1 IOD | 8 / 4 + 1 IOD | 6 / 8 + 2 IOD | | 4 x 3 + 2 IOD
Total Cores | 72 | 88 | 176 | 128 / 192 | 96 / 128 | 192 / 256 | | 192
Total Threads | 72 | 176 | 352 | 256 / 384 | 192 / 256 | 384 / 512 | | 192
TDP (CPU only) | | 50 W ? | 100 W ? | 500 W | 350 - 400W ? | 600 W ? | |
Memory | 512-bit LPDDR5X | 384-bit LPDDR6 ? | 768-bit LPDDR6 ? | 12-channel DDR5 | 8-channel DDR5-8000 | 16-channel DDR5-8000 | |
Memory Bandwidth | 240GB - 512 GB/s; 480GB - 384 GB/s | 614 GB/s | 1228 GB/s | 614 GB/s | 512 GB/s | 1 TB/s | |
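
The DDR5 bandwidth entries in the table above can be sanity-checked from channel count and data rate. This is a rough Python sketch of the usual peak-bandwidth formula (bus width in bytes times transfer rate); the function name `peak_bandwidth_gbs` is made up for illustration, a 64-bit data width per DDR5 channel is assumed, and DDR5-6400 for Turin is an assumption, since the table only says "12-channel DDR5".

```python
DDR5_CHANNEL_BITS = 64  # assumed data width per DDR5 channel, ECC bits excluded


def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mts: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    bytes per transfer (bus_width_bits / 8) * mega-transfers per second / 1000
    """
    return bus_width_bits / 8 * data_rate_mts / 1000


# Channel counts and DDR5-8000 come from the table; DDR5-6400 for Turin is assumed.
configs = {
    "Turin, 12-channel DDR5-6400 (assumed speed)": (12 * DDR5_CHANNEL_BITS, 6400),
    "Venice SP8, 8-channel DDR5-8000": (8 * DDR5_CHANNEL_BITS, 8000),
    "Venice SP7, 16-channel DDR5-8000": (16 * DDR5_CHANNEL_BITS, 8000),
}

for name, (width_bits, rate_mts) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(width_bits, rate_mts):.1f} GB/s")
```

Under those assumptions the results are 614.4, 512.0 and 1024.0 GB/s, which line up with the 614 GB/s, 512 GB/s and 1 TB/s entries. These are theoretical peaks; sustained bandwidth will be lower.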
 

Tigerick

4. BOM

NV's upcoming N1X SoC should be an entry-level ARM SoC targeting fanless tablets/notebook PCs below $600. Its 2+2 A725 cores might not be performance leaders: they are in the range of Mendocino/N200, but with a fanless design, whereas AMD and Intel have to rely on fans to keep up performance.
 

Tigerick

5. WoA 12 / Windows Server for ARM

Have you wondered how Microsoft is going to support upcoming ARM vendors like NV, AMD, MediaTek and Samsung? Microsoft could hard patch the OS to allow all upcoming CPU vendors to access the low-level kernel, or Microsoft could launch a brand new OS like Windows 12 ARM supporting all CPU vendors?

According to Wiki, macOS 15 has been updated to support ARMv9 at the operating system level. Microsoft could follow suit and modernize its OS by supporting ARMv9 as well. That means Microsoft could launch Windows 12 by the end of this year with NV/AMD's Soundwave/MediaTek.

As for Windows Server ARM, it should be based on WoA12. The only issue is that we don't have CPU vendors with flagship performance: if Microsoft is waiting for NV's Vera / AMD's Prometheus, then Microsoft could launch Windows Server on ARM by the end of 2026... we shall see...
 

Tigerick

6. Soundwave - The Beginning of Transformation!

As I predicted many months ago, AMD's Soundwave SoC signals the beginning of the transformation from the x86 to the ARM platform. Soundwave is not a custom SoC for Microsoft, period. Anyone saying AMD is making a custom SoC for Microsoft does not understand how business works. AMD won't spend billions of dollars making an SoC for Microsoft alone; AMD will have to create a flagship SoC to compete with Qualcomm, NV and MediaTek for ARM PC market share.

Microsoft just launched the latest Surface Pro 12 and Surface Laptop with Qualcomm's X Plus 8-core SoC. It is a mainstream SoC targeting the sub-$1000 PC market. We should assume Microsoft won't replace this model soon, so which family will Microsoft fit NV's and AMD's SoCs into?

NV is rumored to launch three ARM SoCs; I suspect the low-end model, N1X, which might be manufactured on the Intel 3 process, will be the choice for the Surface Go family.

That leaves AMD's Soundwave, which should/could replace the X Elite model. X Elite is the flagship ARM SoC with a 12-core CPU and a 3.8TF GPU, so what should we expect from AMD's Soundwave? That's why I don't believe the MLID leaks about Soundwave's specifications.

What I believe:
  • TSMC's N3P process targeting 2026 launch
  • 16MB of MALL Cache
What I don't believe:
  • 128-bit LPDDR5X-9600. 192-bit LPDDR6 instead ?
  • 2P + 4E for 6 cores total. I am expecting 12-18 CPU cores
  • 4 RDNA3.5 CUs. I am expecting 16-24 CUs
  • A 5-10W TDP target