Discussion: ARM Cortex/Neoverse IP + SoCs (no custom cores)

eek2121

Diamond Member
Aug 2, 2005
3,384
5,011
136
Uh, congrats to ARM, I guess, for almost catching up to AMD/Intel. They only needed a large process advantage to do it.
 

Attachments

  • pic.png

poke01

Diamond Member
Mar 8, 2022
3,733
5,077
106
These chips will make for a nice comparison against A18 and 8 Gen 4.
 

poke01

Diamond Member
Mar 8, 2022
3,733
5,077
106
This also confirms that Apple's high clocks are not just due to N3E; their design itself permits higher clocks. If ARM really had a core that could reach 4.4 GHz, they would be shouting it from the rooftops.
 

Shivansps

Diamond Member
Sep 11, 2013
3,916
1,570
136
I think Sarah answered this. I will only add that NEON has twice the number of registers, enough for the 16 AVX2 registers. That leaves you with no room for temporaries, which might create complications, though nothing really blocking.


SVE2 is already there. It's unrelated to vector length.
What matters here is that on Windows, they only emulate x64 SSE... I don't remember which version, but it doesn't expose AVX at all, not even the 128-bit one.
 

Doug S

Diamond Member
Feb 8, 2020
3,298
5,734
136
Indeed. Not sure what adroc_thurston read.

That being said - I'd expect SVE and NEON to use the same backend resources. I believe this is six full 128b pipes.

I believe the six 128b pipes, but I don't buy that they are the same backend resources. I doubt it can schedule 6 NEON instructions per cycle; more likely it is up to six 128b operations in a combination of at most 4 NEON and 2 SVE128 (no SVE256 support).
 

soresu

Diamond Member
Dec 19, 2014
3,895
3,331
136
I believe the six 128b pipes, but I don't buy that they are the same backend resources. I doubt it can schedule 6 NEON instructions per cycle; more likely it is up to six 128b operations in a combination of at most 4 NEON and 2 SVE128 (no SVE256 support).
arm_x925_01.jpg
This says 2x increase in SIMD queues - dunno what it was before but I'd wager it fits the bill of the new backend resources.
 

soresu

Diamond Member
Dec 19, 2014
3,895
3,331
136
What matters here is that on Windows, they only emulate x64 SSE... I don't remember which version, but it doesn't expose AVX at all, not even the 128-bit one.
From the way the FEX-emu devs talk about AVX on their Discord, I'm pretty sure that they don't emulate it either.
 

trivik12

Senior member
Jan 26, 2006
348
318
136
Apple should call their next core M500 to make it bigger and AMD should call their next Zen 500 as well. /s
 

soresu

Diamond Member
Dec 19, 2014
3,895
3,331
136
Apple should call their next core M500 to make it bigger and AMD should call their next Zen 500 as well. /s
If that is a zing on ARM for naming X5 as X925 then it is unwarranted.

They are basically just realigning branding, something they had already done in the past when they aligned Cortex and Mali branding with A76 and G76 back in 2018.

The Immortalis branding has likewise been changed to match Cortex X, so that the topmost IP is named Immortalis G925.

Compared to AMD's recent branding changes it's positively intelligible 🤣
 

ikjadoon

Senior member
Sep 4, 2006
241
519
146
Why are new 2024 smartphone SoCs using the 4-year-old Cortex-A78?


Some combination of price, performance, area, efficiency. Arm Ltd. confirmed ARMv9 royalties are pricier vs ARMv8 & the A78 is the fastest ARMv8 core excluding the notably larger & pricier Cortex-X1.

That is unfortunate for 1T perf, and the trend is staying alive.

That would likely explain why MediaTek aren't using a faster / newer A7xx core (all ARMv9). Yet, I still wish Arm Ltd. (and licensees) would be interested in cheaper X1, X2, X3, etc cores for mid-range SoCs.

// the bigger picture

Smartphones using the Dimensity 7xxx SoCs definitely do not have fat margins; smartphone manufacturers would likely rather spend the extra $$ that a faster 1T CPU would cost on better screens, better cameras, more features, etc.

Versus "this website loaded somewhat faster when I have great internet" or "these apps are super snappy".

A lot of negotiations, middlemen / margins, guesses about the market, etc.

Smartphone consumers -> smartphone manufacturers <-> SoC OEMs <-> CPU architects

//

Unrelatedly: I can't pretend to understand MediaTek's model numbering: the 7200 uses 2x A715 @ 2.8 GHz + 6x A510, meanwhile the 7300 uses 4x A78 @ 2.5 GHz + 4x A55. But, right, these are only the CPUs: I've not examined the complete SoC differences.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,761
106
MediaTek isn't the only culprit. Samsung is still using A78 cores for their midrange SoCs.

Exynos 1280 : 2×A78 + 6×A55
Exynos 1380 : 4×A78 + 4×A55
Exynos 1480 : 4×A78 + 4×A55

3 generations of SoCs using the same cores. I am beyond outraged.

I would be somewhat sated if they had put an X1 in the 1480. But they didn't. The poor ST performance of the A78 really hurts the experience on the midrange Galaxy phones, because Samsung's One UI Android skin is really heavy and needs strong ST performance to feel smooth.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,761
106
Are there any Geekbench 6 subtests that benefit from SVE2, like object detection does with SME?
 

SpudLobby

Golden Member
May 18, 2022
1,041
701
106
So I'm thinking that the Client CSS stuff we're seeing now is basically what Qualcomm was talking about in their lawsuit a couple of years ago with the whole "you'll have to bundle Cortex with Mali" complaint. Going to be interesting to see what that ends up looking like in practice - I have a hard time believing that Nvidia and Samsung (and, for that matter, Renesas) will just be forced onto Mali for their lineups.
Yes, I agree. I think it was an exaggeration. But it is kind of interesting, where they’re going with CSS.
Indeed. Not sure what adroc_thurston read.

That being said - I'd expect SVE and NEON to use the same backend resources. I believe this is six full 128b pipes.
Yes exactly. I don’t see why it would be SVE or NEON only. Should just be SVE x 6.
 

ikjadoon

Senior member
Sep 4, 2006
241
519
146
On a different topic, Arm's hesitancy to put absolute numbers on its charts is not inspiring, but if we want to do some pixel peeping:

cortex-x925-scperf.png


With the helpful note from @SarahKerrigan that "2023 Best-in-Class @ 3.8GHz" is likely referring to the late 2023 Apple A17 Pro (P-cores @ 3.78 GHz), as it lines up quite closely:

Each tick mark appears to be 10%. And it seems to line up: the A17 Pro is ~28% faster in 1T than the 8G3 for Galaxy, which is close to the chart's ~26% faster via the rough bar width.

I've also added the figures from @uzzi38's earlier image as 1.15x multipliers, assuming "ISO" means iso-frequency only.

x925-performance.png


Predicted 1T GB6.2 scores & "IPC" are in bold and were calculated from the bar widths. Thus, for the "Cortex-X925 @ 3.8 GHz" score, it's the X4 2287 base score * 1.33 bar width = 3041.71, rounded to 3042.

| Arm Marketing Name | Rough Bar Width | Possible SoC | Clock | GB6.2 1T | GB6.2 1T Pts / GHz "IPC" | "IPC" Relative |
|---|---|---|---|---|---|---|
| 2023 Premium Android | 1.0 | QC SD8G3 for Galaxy | 3.39 GHz | 2287 | 674.6 | 100.0% |
| 2023 Best-in-Class @ 3.8 GHz | 1.26x | Apple A17 Pro | 3.78 GHz | 2930 | 775.1 | 114.9% |
| Cortex-X925 @ 3.8 GHz | 1.33x | ?? | 3.80 GHz | 3042 | 800.5 | 118.7% |
| Cortex-X925 @ 3.8 GHz (+ sw & sys "optimizations") | 1.36x | ?? | 3.80 GHz | 3110 | 818.5 | 121.3% |
| X925 vs X4 "ISO" | 1.15x | ?? | 3.60 GHz | 2793 | 775.8 | 115.0% |
| X925 vs X4 "ISO" | 1.15x | ?? | 3.80 GHz | 2948 | 775.8 | 115.0% |
| n/a | n/a | Apple M4 | 4.38 GHz | 3715 | 848.2 | 125.7% |
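
For anyone who wants to check the derived columns, here is a minimal Python sketch of the arithmetic described above. The function names are made up for illustration; the inputs (the 2287 baseline, the 3.39 GHz clock, the bar widths) are taken straight from the table, and the bar widths themselves remain rough pixel estimates, so treat every result as approximate.

```python
# Rough re-derivation of the table's computed columns.
# All inputs come from the table; helper names are hypothetical.
BASE_SCORE = 2287       # GB6.2 1T, SD8G3 for Galaxy
BASE_CLOCK_GHZ = 3.39   # Cortex-X4 clock in that SoC


def predicted_score(bar_width: float, base_score: float = BASE_SCORE) -> float:
    """Scale the baseline score by a bar width read off Arm's chart."""
    return base_score * bar_width


def pts_per_ghz(score: float, clock_ghz: float) -> float:
    """GB6.2 1T points per GHz, used here as a crude "IPC" proxy."""
    return score / clock_ghz


def relative_ipc(score: float, clock_ghz: float) -> float:
    """Points per GHz relative to the X4 baseline (1.0 means 100%)."""
    return pts_per_ghz(score, clock_ghz) / pts_per_ghz(BASE_SCORE, BASE_CLOCK_GHZ)


if __name__ == "__main__":
    x925 = predicted_score(1.33)
    print(f"X925 @ 3.8 GHz: about {x925:.0f} pts, "
          f"{relative_ipc(x925, 3.80):.1%} of X4 pts/GHz")
```

Swapping in a different bar width (say 1.30x instead of 1.33x) shows how sensitive the "IPC" column is to a few pixels of measurement error.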

Sources:
//

So is X925 a GB6.2 1T Perf / GHz gain of +15% or +19% over X4? I can't claim that my pixel counting or rounding is very precise.

Thus a range of GB6.2 1T scores, assuming +15% Pts / GHz (2nd chart estimate from "ISO" comparisons) to +18.7% Pts / GHz (1st chart estimate from "1.33x"):

Hypothetical 3.6 GHz X925 core: ~2793 → ~2883
Hypothetical 3.8 GHz X925 core: ~2948 → ~3042
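
The two ranges above can be reproduced with a short Python sketch. It assumes, as the estimates implicitly do, that the GB6.2 1T score scales linearly with clock at a fixed pts/GHz, which real silicon will not do exactly. The function names and default uplift values are illustrative only.

```python
# Project a score range from a pts/GHz uplift range.
# Baseline: 2287 pts at 3.39 GHz (SD8G3 for Galaxy, Cortex-X4).
BASE_PTS_PER_GHZ = 2287 / 3.39


def projected_score(clock_ghz: float, uplift: float) -> float:
    """Score at clock_ghz, given a fractional pts/GHz uplift over X4.

    Assumes perfectly linear scaling with frequency (a simplification).
    """
    return BASE_PTS_PER_GHZ * (1.0 + uplift) * clock_ghz


def score_range(clock_ghz: float, low: float = 0.15, high: float = 0.187):
    """Return (low, high) projected scores for the two uplift estimates."""
    return projected_score(clock_ghz, low), projected_score(clock_ghz, high)


if __name__ == "__main__":
    for clock in (3.6, 3.8):
        lo, hi = score_range(clock)
        print(f"{clock} GHz: ~{lo:.0f} to ~{hi:.0f}")
```

The upper bound can land a point or so away from the figures above, depending on whether the rounded +18.7% or the unrounded 1.33x bar width is carried through.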

I'd rather not consider the "optimized" estimate (+21.3% IPC gain) as it's unclear what those optimizations actually are and how they would affect the Cortex-X4.

All this, when Arm could just label their silly charts. I might have some typos to fix tomorrow, too.