Discussion Qualcomm Snapdragon Thread

jdubs03 · Oct 24, 2024

Another review:

FlameTail · Oct 24, 2024

Snapdragon 8 Elite hits new record Geekbench 6 score

This is a QRD.

FlameTail · Oct 24, 2024

Snapdragon 8 Elite
2 × Phoenix-L @ 4.32 GHz [12 MB sL2]
6 × Phoenix-M @ 3.53 GHz [12 MB sL2]

Snapdragon 7 Gen 4 (Speculation)
2 × Phoenix-L @ 3.9 GHz [8 MB sL2]
2 × Phoenix-M @ 3.2 GHz [4 MB sL2]

Will a quad core CPU be unsuitable for Android today?

Even $150 phones have '8-core CPU' nowadays.

SteinFG · Oct 24, 2024

ARM really went with the nuclear option, huh. I can see their reasoning: Arm has been stuck in court for over a year already, while Qualcomm now has a good in-house core and intends to carry that momentum forward. So Arm cancels the agreement at the point where Oryon's future is bright but slightly uncertain, which is the best time to negotiate high percentage fees. Arm is betting that Qualcomm doesn't want its progress to stop and will pay extra.

FlameTail · Oct 24, 2024

SPEC numbers extracted from Geekerwan's graphs.

	A18-P	A18-E	Oryon-L	Oryon-M
Clock speed	4.04 GHz	2.2 GHz	4.32 GHz	3.53 GHz
SPEC INT	10.7	3.3	8.9	5.2
SPEC FP	16.0	5.0	14.0	8.0
Core size	3.0 mm²	0.8 mm²	2.2 mm²	0.9 mm²

Performance wise (rough estimation);

Oryon-L : Oryon-M
100% : 60%

A18-P : A18-E
100% : 33%

Core size wise ;

Oryon-L : Oryon-M
100% : 40%

A18-P : A18-E
100% : 27%
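
For anyone who wants to check the rough ratios above, here is a minimal Python sketch that uses only the figures from the table. The dictionary layout and the `ratio_pct` helper are illustrative names of my own, not anything from Geekerwan.

```python
# Figures copied from the table above (SPEC scores and core area in mm²).
cores = {
    "A18-P":   {"int": 10.7, "fp": 16.0, "area": 3.0},
    "A18-E":   {"int": 3.3,  "fp": 5.0,  "area": 0.8},
    "Oryon-L": {"int": 8.9,  "fp": 14.0, "area": 2.2},
    "Oryon-M": {"int": 5.2,  "fp": 8.0,  "area": 0.9},
}

def ratio_pct(small, big, key):
    """Return the small core's value as a percentage of the big core's."""
    return 100.0 * cores[small][key] / cores[big][key]

# Compare each smaller core against its larger sibling.
for small, big in [("Oryon-M", "Oryon-L"), ("A18-E", "A18-P")]:
    for key in ("int", "fp", "area"):
        print(f"{small} / {big} {key}: {ratio_pct(small, big, key):.0f}%")
```

The INT and FP ratios land slightly below the rounded 60% and 33% figures (roughly 57 to 58% for Oryon-M and about 31% for A18-E), while the area ratios come out at about 41% and 27%.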

FlameTail · Oct 24, 2024

It's been an eventful month for ARM and Qualcomm;

- Mediatek Dimensity 9400
- Qualcomm cancels Snapdragon Dev Kit
- Qualcomm Snapdragon 8 Elite, Snapdragon Ride Elite and Cockpit Elite with 2nd gen Oryon cores
- ARM sends notice about revoking Qualcomm's ALA
- Massive Google Tensor G4/G5 leak
- Apple M4 Pro and M4 Max unveil (October 30th?).

jdubs03 · Oct 24, 2024

FlameTail said:
It's been an eventful month for ARM and Qualcomm;

- Mediatek Dimensity 9400
- Qualcomm cancels Snapdragon Dev Kit
- Qualcomm Snapdragon 8 Elite, Snapdragon Ride Elite and Cockpit Elite with 2nd gen Oryon cores
- ARM sends notice about revoking Qualcomm's ALA
- Massive Google Tensor G4/G5 leak
- Apple M4 Pro and M4 Max unveil (October 30th?).

I’d say the 8 Elite and the Summit as a whole was a bright spot in all of this.
But the rest? Ranges from meh to not good.

DrMrLordX · Oct 24, 2024

poke01 said:
Yeah Qualcomm isn’t getting an x86 license unless something very very crazy happens

Maybe Qualcomm could sell itself to Intel. Remember when Tim Horton's "bought" Burger King in a tax inversion?

MS_AT · Oct 24, 2024

FlameTail said:
SPEC numbers extracted from Geekerwan's graphs.

	A18-P	A18-E	Oryon-L	Oryon-M
Clock speed	4.04 GHz	2.2 GHz	4.32 GHz	3.53 GHz
SPEC INT	10.7	3.3	8.9	5.2
SPEC FP	16.0	5.0	14.0	8.0
Core size	3.0 mm²	0.8 mm²	2.2 mm²	0.9 mm²

Performance wise (rough estimation);

Oryon-L : Oryon-M
100% : 60%

A18-P : A18-E
100% : 33%

Core size wise ;

Oryon-L : Oryon-M
100% : 40%

A18-P : A18-E
100% : 27%

Do you have the compiler settings that Geekerwan used? They are essential when quoting SPEC results; even Qualcomm itself says so on its slides when pointing fingers at Intel😉

Cardyak · Oct 24, 2024

mvprod123 said:
Comparison of microarchitectures.

A18 P-core and Oryon-L

View attachment 110078

A18 E-core & Oryon-M

View attachment 110080

These diagrams have a few mistakes, but the overall picture is broadly correct. The M4 small core does not feature a 192 KB L1 instruction cache; it's 128 KB. Also, the uop counts below the "Map and Rename" sections artificially inflate the scheduling bandwidth. For example, the A18/M4 P-core cannot dispatch 10 uops to each individual dispatch buffer simultaneously; there are limitations both within the allocation stage and in the number of uops each scheduler can accept per cycle. There's also something strange happening with the ST queue on the Oryon-M core: Qualcomm themselves have stated that the Oryon-L core has an ST queue length of 56, and I'm skeptical that the smaller core would have a larger buffer. I think there's some interference here from non-scheduling queues and it's affecting the measurements.

For what it's worth my diagrams can be located here: http://bit.ly/32qLLew

There are inevitably some mistakes in these too, but over time, as more testing is conducted, we can hopefully collate all of this information and present a more accurate picture.

FlameTail · Oct 24, 2024

Snapdragon Summit 2024

Day 1 Keynote (Mobile)

Day 2 Keynote (Automotive)

Day 1 Deep Dive (Mobile)

Day 2 Deep Dive (Automotive)

jdubs03 · Oct 24, 2024

MS_AT said:
Do you have the compiler settings that Geekerwan used? They are essential when quoting SPEC results; even Qualcomm itself says so on its slides when pointing fingers at Intel😉

This should help (top right).

MS_AT · Oct 24, 2024

jdubs03 said:
This should help (top right).
View attachment 110177

He used to add this information in the past. But glancing at his 8 Elite video, I haven't seen it, so the question is whether it's still the same. He might have changed things in between, or he might not have.

FlameTail · Oct 24, 2024

The Hexagon NPU of the 8 Elite adds two more scalar cores and two more vector cores.

They say it's "45% faster". If that means 45% more INT8 TOPS, then we are going from 45 TOPS (8G3) to 65 TOPS (8 Elite).
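
The arithmetic behind that estimate, as a quick Python sketch. It assumes, as above, that "45% faster" scales INT8 TOPS linearly from the 45 TOPS figure; the variable names are just for illustration.

```python
# Assumption: "45% faster" means 45% more INT8 TOPS than the previous generation.
prev_tops = 45.0   # Snapdragon 8 Gen 3 figure quoted above
speedup = 0.45     # "45% faster"

new_tops = prev_tops * (1 + speedup)
print(f"{new_tops:.2f} TOPS")
```

That works out to roughly 65 TOPS, which is where the number above comes from.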

"6nm single-chip"

@Ghostsonplanets This confirms that the Fastconnect 7900 isn't integrated into the 8 Elite SoC (which is made on 3nm), but that it's a discrete chip.

Also "AI Enhanced Wifi". ROFLed at that one.

FlameTail · Oct 24, 2024

MS_AT said:
He used to add this information in the past. But glancing at his 8 Elite video, I haven't seen it, so the question is whether it's still the same. He might have changed things in between, or he might not have.

Andrei's comment about this matter;

he's using NDK binaries on SPEC for Android which will have an inherent handicap vs iOS shared runtime libraries
it's fine given that this represents the userspace experiences between the OS', however if you would really want to look at just µarch you'd deploy glibc+jemalloc binaries on Android to get somewhat of a similar allocator behavior to what iOS does, in which case the competitive performance differences here are going to be fundamentally smaller

geekbench difference is smaller because the way it's built counteracts some of these differences at the moment, and actually if you run 6.2 (non-SME) that's probably as close as you can reasonably get for a 1:1 µarch comparison

Posted in the Chips and Cheese Discord.

FlameTail · Oct 24, 2024

MS_AT said:
But I guess no one in android space, due to marketing reasons, will dare to release a SoC with less than 8 CPU cores. Even Qualcomm had to spam M cores to keep the magical 8 number going😉

I wouldn't call it spamming. 2L+6M is a perfectly fine configuration for 8 Elite.

That's why I am wondering what they will do for lower tier chips?

If they are going with Oryon L/M, then they'll have to use fewer than 8 cores.

FlameTail said:
Snapdragon 8 Elite
2 × Phoenix-L @ 4.32 GHz [12 MB sL2]
6 × Phoenix-M @ 3.53 GHz [12 MB sL2]

Snapdragon 7 Gen 4 (Speculation)
2 × Phoenix-L @ 3.9 GHz [8 MB sL2]
2 × Phoenix-M @ 3.2 GHz [4 MB sL2]

Will a quad core CPU be unsuitable for Android today?

Even $150 phones have '8-core CPU' nowadays.

A wild bit of speculation is that Qualcomm has an Oryon-S core in the works.

If L = Large and M = Medium, doesn't that suggest that there will be a "Small" core?

Cortex X ≈ Oryon-L
Cortex A7xx ≈ Oryon-M
Cortex A5xx ≈ Oryon-S

Oryon-S would allow Qualcomm to scale the Oryon CPU all the way down to low end Snapdragon 4 series chips and Snapdragon Wearable chips.

MS_AT · Oct 24, 2024

FlameTail said:
Andrei's comment about this matter;

Posted in the Chips and Cheese Discord.

What Andrei said raises an interesting dilemma: does the end user care about a uarch comparison, or about the actual performance, in the context of Geekbench versions (6.2 vs 6.3)? I understand that introducing SME will have less impact in the Android ecosystem due to fragmentation, but I guess that vertically integrated Apple is already making use of it, since it has much better control over its software stack.

At least this gives some context to the Qualcomm slides where 6.2 was used: for a uarch vs uarch comparison it's fairer, seeing as SME, unlike Neon, is not part of the core.

On the other hand, if Andrei wanted to do accurate uarch vs uarch comparisons, he should have enabled AVX for x64 in the comparisons on the Qualcomm slides, as the AVX units belong to the core the same way Neon units do (I guess he is somewhat involved with those slides, if the rumours are accurate). But yeah, that is a digression😉

FlameTail · Oct 24, 2024

FlameTail said:
A wild bit of speculation is that Qualcomm has an Oryon-S core in the works.

If L = Large and M = Medium, doesn't that suggest that there will be a "Small" core?

Cortex X ≈ Oryon-L
Cortex A7xx ≈ Oryon-M
Cortex A5xx ≈ Oryon-S

Oryon-S would allow Qualcomm to scale the Oryon CPU all the way down to low end Snapdragon 4 series chips and Snapdragon Wearable chips.

Oryon-S doesn't even need to be an entirely new microarchitecture. To create Oryon-S, they can just take Oryon-M, tweak it a bit and massively reduce the clock speed to about 2 GHz (Oryon-M in 8 Elite runs at a crazy 3.53 GHz).

jdubs03 · Oct 24, 2024

MS_AT said:
He used to add this information in the past. But glancing at his 8 Elite video, I haven't seen it, so the question is whether it's still the same. He might have changed things in between, or he might not have.

Yea I hear ya. The A18 Pro numbers match so my assumption is it’s still the same.

FlameTail · Oct 24, 2024

There was also this interesting slide from the Mobile Deep Dive;

128 KB L1 = 1 ns
12 MB L2 = 5 ns

Can someone convert nanoseconds to clock cycles?

gdansk · Oct 24, 2024

4 cycle L1
~21 cycle L2
But they rounded so who knows
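
For reference, the conversion is just latency in ns multiplied by clock in GHz. A minimal Python sketch, assuming the slide's figures refer to the Oryon-L core at its 4.32 GHz peak clock and were rounded to the nearest nanosecond (both are assumptions, and `ns_to_cycles` is a made-up helper name):

```python
def ns_to_cycles(latency_ns, clock_ghz):
    """Cycles = time in ns multiplied by clock in GHz (i.e. cycles per ns)."""
    return latency_ns * clock_ghz

CLOCK_GHZ = 4.32  # Oryon-L peak clock; assumed to be what the slide refers to

for name, ns in [("L1", 1.0), ("L2", 5.0)]:
    # If the slide rounded to the nearest ns, the true value lies within +/- 0.5 ns.
    lo = ns_to_cycles(ns - 0.5, CLOCK_GHZ)
    hi = ns_to_cycles(ns + 0.5, CLOCK_GHZ)
    print(f"{name}: {ns_to_cycles(ns, CLOCK_GHZ):.1f} cycles "
          f"(range {lo:.1f} to {hi:.1f} if rounded to the nearest ns)")
```

At 4.32 GHz that gives about 4.3 cycles for L1 and 21.6 for L2, but the rounding window for L2 spans roughly 19 to 24 cycles, so the slide alone can't pin it down.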

FlameTail · Oct 24, 2024

gdansk said:
4 cycle L1
~21 cycle L2

So L1 latency is same as Oryon core in X Elite, whereas L2 latency regressed by a few cycles.

Almost directly explained by the fact that Oryon -> Oryon-L, L1 capacity regressed from 192 KB -> 128 KB, but L2 capacity stayed the same at 12 MB.

gdansk · Oct 24, 2024

FlameTail said:
So L1 latency is same as Oryon core in X Elite, whereas L2 latency regressed by a few cycles.

Not sure it regressed, they rounded to 5ns so it's probably the same.

FlameTail · Oct 24, 2024

It would be a great feat if they can hit 5 GHz next year with Oryon Gen 3, without regressing the latency.

FlameTail · Oct 24, 2024

FlameTail said:
View attachment 110062
It's interesting that Qualcomm has changed the rendering technique from Tile-Based Rendering (TBR) to something more akin to the Immediate Mode Rendering (IMR) used in desktop class GPUs. It's fed by the huge 12 MB of GPU cache. This signals that Qualcomm is evolving Adreno to become a desktop class GPU architecture.

But Adreno 830 wouldn't be the first smartphone GPU without TBR. Exynos 2200, with its AMD mRDNA2 GPU, demonstrated that it's possible for a smartphone to have a non-TBR GPU.

Does Adreno 830 exclusively use Binned Direct mode?

SDX_CPU_GPU Architecture Overview_26.jpg

Vince suggests that Adreno 830 could be using TBIM (Tile Based Immediate Mode Rendering), which is a hybrid of the traditional mobile TBDR (Tile Based Deferred Rendering) and traditional desktop IMR (Immediate Mode Rendering).

https://www.reddit.com/r/hardware/comments/1ga2nhf/comment/ltb8h5t
