Question: Qualcomm's first Nuvia-based SoC - Hamoa


FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106

I mean... think about it guys.

How could they not be working on an E-core?

To me, the strongest sign that they are is that they announced the Oryon CPU is coming to the Snapdragon 8 Gen 4.

You can't make a phone SoC without E-cores.
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
I mean... think about it guys.

How could they not be working on an E-core?

To me, the strongest sign that they are is that they announced the Oryon CPU is coming to the Snapdragon 8 Gen 4.

You can't make a phone SoC without E-cores.

Sure you can, if your DVFS curve is favorable enough. Intel did - relatively recently, even. MediaTek does too, depending on how broadly one wishes to stretch the definition of "e-core." Also, not terribly long ago, Qualcomm themselves did - the Snapdragon 820/821 was Kryo-only (just two clusters, one with smaller caches).
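For what it's worth, here's a minimal sketch of the DVFS argument, using the standard dynamic-power relation P ≈ C·V²·f. The effective capacitance and the voltage/frequency points below are hypothetical, purely to show the shape of the curve:

```python
# Illustrative only: why a P-core on a favorable DVFS curve can stand in
# for an E-core. Dynamic power scales roughly as P = C * V^2 * f, so
# dropping voltage along with frequency cuts power superlinearly.

# Hypothetical (voltage V, frequency GHz) operating points for one big core.
dvfs_points = [
    (0.55, 1.0),   # low-power point
    (0.70, 2.0),
    (0.90, 3.0),
    (1.05, 3.5),   # peak point
]

C_EFF = 1.2  # hypothetical effective switched capacitance, nF

for volts, ghz in dvfs_points:
    power_w = C_EFF * volts**2 * (ghz * 1e9) * 1e-9  # watts
    print(f"{ghz:.1f} GHz @ {volts:.2f} V -> ~{power_w:.2f} W "
          f"({power_w / ghz:.2f} W per GHz)")
```

At the low hypothetical point the core draws under a tenth of its peak power at roughly 30% of its peak clock; with a curve like that, the low end of one core type can plausibly cover the E-core role.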
 

Doug S

Diamond Member
Feb 8, 2020
3,601
6,366
136
I mean... think about it guys.

How could they not be working on an E-core?

To me, the strongest sign that they are is that they announced the Oryon CPU is coming to the Snapdragon 8 Gen 4.

You can't make a phone SoC without E-cores.

They could use Arm-licensed cores, or use their own cores clocked lower. The latter isn't as power- or area-efficient as a proper E-core, but it's likely better than using an Arm-designed core.
 
  • Like
Reactions: Tlh97 and SpudLobby

SpudLobby

Golden Member
May 18, 2022
1,041
702
106
They could use Arm-licensed cores, or use their own cores clocked lower. The latter isn't as power- or area-efficient as a proper E-core, but it's likely better than using an Arm-designed core.
Yeah. Per Samsung's Exynos experience and Andrei's reporting, scheduling is a mess if you mix custom and reference Cortex cores. Mixing may still come out ahead of the tradeoff, but the baseline assumption should be that it doesn't in their case.

The counterfactual case here would be exactly as you said (and this is also roughly what I expect they'll do): take their P cores and re-do the physical design with a denser layout - which would save area, lower leakage, and make the core more efficient over some clock ranges. In effect, reduce the clocks but make the most of doing so.

Like AMD did with Zen 4 -> Zen 4c.

Alternatively, they can do that *and* make some architectural trims to the P core to better fit a lower-power, lower-area role - shave off some L1, backend, and front-end resources. Whatever they do, I doubt it will be a from-scratch core, but that doesn't mean it'll be bad.

I wouldn't be surprised to see a bit of that last part, but I think lower clocks and a new physical design are baseline assumptions.
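A back-of-the-envelope sketch of that tradeoff, in the spirit of Zen 4 -> Zen 4c (same microarchitecture, denser layout, lower fmax). The area ratio and clocks are illustrative assumptions, not AMD's or Qualcomm's numbers:

```python
# Back-of-the-envelope sketch of the Zen 4 -> Zen 4c style tradeoff:
# same microarchitecture (same IPC), denser physical design, lower fmax.
# All numbers below are illustrative assumptions, not measured figures.

p_core_area_mm2 = 1.00      # normalized area of the big-core layout
dense_area_ratio = 0.65     # assume the dense layout is ~35% smaller
p_core_fmax_ghz = 4.0       # hypothetical fmax of the big layout
dense_fmax_ghz = 3.0        # hypothetical fmax of the dense layout

dense_area = p_core_area_mm2 * dense_area_ratio
perf_ratio = dense_fmax_ghz / p_core_fmax_ghz   # IPC unchanged, clocks lower
perf_per_area = perf_ratio / dense_area_ratio   # vs. big layout = 1.0

print(f"dense core: {dense_area:.2f}x area, {perf_ratio:.2f}x peak perf")
print(f"perf/area vs big layout: {perf_per_area:.2f}x")
```

Even after the clock haircut, peak perf per mm² comes out ahead of the big layout under these assumptions - which is the whole case for a compacted P core over a clean-sheet E core.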
 
  • Like
Reactions: beginner99

SpudLobby

Golden Member
May 18, 2022
1,041
702
106
Revegnus isn't really a bad leaker. He may be getting his information from combing through Korean forums or whatnot, but his accuracy is actually not bad.

See this for instance:



Turned out to be pretty much spot on, eh?
That doesn't mean Revegnus is a good leaker, because the target space is so easy to fudge: Apple's IPC gains have been virtually nonexistent from the A14 to the A16, and from the M1 to the M2 - and now even the M3, which looks like it might be different from the A17 Pro's cores.

It's a given that Qualcomm were targeting per-GHz performance, ST performance, and efficiency at *least* on par with Apple's M1/M2 or A14/A15, if that's what you're trying to say. All the leaks pointed to this range, and it makes the most sense as a prerequisite for competing in the Windows market: a lower-power, lower-clocked but performant core with great battery life. It's also just the same team that built Apple's last cores anyway.

As for the forums, he gets a lot more noise than signal. We saw this with his completely nonsensical crap about Samsung's fabs and the idea that Qualcomm's 8 Gen 2 was on Samsung 4nm or whatever for Galaxy, which made virtually no sense given that the clocks were even higher for the Galaxy version and how recently Qualcomm had left Samsung's fabs.

That was memory-holed of course, and it's one of many claims that were. This is the problem with leakers like him. He's said other absolutely delusional, out-of-nowhere garbage about fabs and Android parts. And on the Nuvia cores he was all over the place, tbh.
 

SpudLobby

Golden Member
May 18, 2022
1,041
702
106
That doesn't mean Revegnus is a good leaker, because the target space is so easy to fudge: Apple's IPC gains have been virtually nonexistent from the A14 to the A16, and from the M1 to the M2 - and now even the M3, which looks like it might be different from the A17 Pro's cores.

It's a given that Qualcomm were targeting per-GHz performance, ST performance, and efficiency at *least* on par with Apple's M1/M2 or A14/A15, if that's what you're trying to say. All the leaks pointed to this range, and it makes the most sense as a prerequisite for competing in the Windows market: a lower-power, lower-clocked but performant core with great battery life. It's also just the same team that built Apple's last cores anyway.

As for the forums, he gets a lot more noise than signal. We saw this with his completely nonsensical crap about Samsung's fabs and the idea that Qualcomm's 8 Gen 2 was on Samsung 4nm or whatever for Galaxy, which made virtually no sense given that the clocks were even higher for the Galaxy version and how recently Qualcomm had left Samsung's fabs.

That was memory-holed of course, and it's one of many claims that were. This is the problem with leakers like him. He's said other absolutely delusional, out-of-nowhere garbage about fabs and Android parts. And on the Nuvia cores he was all over the place, tbh.
Seriously, are you guys this gullible? I don't mean to be rude here, but think critically. He didn't even post a graph of IPC as an independent figure. That's a chart of a surrogate outcome - SPECint2017. The score did move from A14 -> A15 by more than the clock frequency did, which indicates some performance/GHz gain, but barely, and he made no attempt to show the actual perf/GHz figures, which of course are very small changes.

There's absolutely no way I'm giving him credit for "A15 IPC was the target" on the strength of tangentially posting Andrei F.'s SPECint results; that's the lowest bar imaginable, like shooting fish in a barrel.
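For what it's worth, backing perf/GHz out of a SPEC chart only takes the clock speeds: perf/GHz = score / frequency. A minimal sketch of the arithmetic, with placeholder scores and clocks (not Andrei's measured figures):

```python
# Hypothetical illustration: a SPECint2017 score is a surrogate for
# perf = (perf/GHz) * frequency, so the "IPC" component has to be
# backed out with the clock. Scores and clocks below are placeholders.

chips = {
    "chip A": {"spec_score": 6.0, "clock_ghz": 3.00},
    "chip B": {"spec_score": 7.0, "clock_ghz": 3.24},
}

a, b = chips["chip A"], chips["chip B"]

score_gain = b["spec_score"] / a["spec_score"] - 1
clock_gain = b["clock_ghz"] / a["clock_ghz"] - 1
perf_per_ghz_gain = (b["spec_score"] / b["clock_ghz"]) / \
                    (a["spec_score"] / a["clock_ghz"]) - 1

print(f"score gain:    {score_gain:+.1%}")          # what the chart shows
print(f"clock gain:    {clock_gain:+.1%}")          # pure frequency
print(f"perf/GHz gain: {perf_per_ghz_gain:+.1%}")   # the part that is "IPC"
```

The score line can move a lot while the perf/GHz component barely budges - exactly the surrogate-outcome problem above.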
 

SpudLobby

Golden Member
May 18, 2022
1,041
702
106
I'm pretty sure that's just a fantasy and not a leak.
To be fair, I think it's quite possible this one is legitimate and just regurgitated from other forums: Qualcomm certainly do have lower-clocked E cores for the 8 Gen 4, and we do know they're using Oryon for the 8 Gen 4 now. That means development has probably been ongoing for a while, so leaks pointing to something similar to what he implies have been showing up on Chinese and Korean forums via testing and the like.

The key is just that he's not alone on that and doesn't deserve much credit; he ruins his signal:noise ratio with a lot of BS.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
As for the forums, he gets a lot more noise than signal. We saw this with his completely nonsensical crap about Samsung's fabs and the idea that Qualcomm's 8 Gen 2 was on Samsung 4nm or whatever for Galaxy, which made virtually no sense given that the clocks were even higher for the Galaxy version and how recently Qualcomm had left Samsung's fabs.
Are you sure Revegnus said that? I don't remember, but this guy definitely did:


RGCloudS
 

Tigerick

Senior member
Apr 1, 2022
856
804
106
There is no way the Gen 3 for Galaxy is using the N3E process, because N3E only starts volume production in the fourth quarter of this year. An OEM normally has to spend a few months on packaging, testing, assembly, and so on.

Plus, Samsung would have to pay more for the Gen 3 if it were made on N3E - so it's total nonsense.
 
  • Like
Reactions: Tlh97 and FlameTail

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
GB6

X Elite:
• 15,226 multi core
• 2,956 single core

M3 Pro:
• 15,171 multi core
• 3,035 single core

I wonder if there are some deficiencies in the Oryon uncore.

How is a 6P+6E configuration (Apple M3 Pro) matching a 12P one (SD X Elite)?

Fwiw both are 12-core, so the argument that GB6 MT scales poorly with higher core counts does not apply here.
 

SpudLobby

Golden Member
May 18, 2022
1,041
702
106
GB6

X Elite:
• 15,226 multi core
• 2,956 single core

M3 Pro:
• 15,171 multi core
• 3,035 single core

I wonder if there are some deficiencies in the Oryon uncore.

How is a 6P+6E configuration (Apple M3 Pro) matching a 12P one (SD X Elite)?

Fwiw both are 12-core, so the argument that GB6 MT scales poorly with higher core counts does not apply here.
GB6 changed its MT methodology, for what it's worth, so that's possible - but everyone other than Apple seems to do poorly in it relative to other MT tests. Not saying it's biased in Apple's favor, just that QC isn't alone in looking bad.
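For context, GB6's MT methodology has the cores cooperating on one shared task, so scaling behaves more like Amdahl's law than the embarrassingly parallel rate tests of old. A rough sketch; the 90% parallel fraction is an illustrative assumption, not anything Primate Labs publishes:

```python
# Rough sketch of why a cooperative MT methodology (GB6-style) rewards
# fewer, faster cores: model it with Amdahl's law. The parallel fraction
# is an illustrative assumption, not a published figure.

def amdahl_speedup(n_cores: int, parallel_fraction: float) -> float:
    """Ideal speedup of one shared task spread across n cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

P = 0.90  # assume 90% of the workload parallelizes

for n in (4, 8, 12, 16):
    print(f"{n:2d} cores: {amdahl_speedup(n, P):4.2f}x ideal speedup")
```

Under that assumption, extra cores buy little past a handful; per-core throughput and how cheaply cores share data (the uncore) dominate - which fits a 6P+6E M3 Pro matching a 12P X Elite even at equal core counts.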
 

Gideon

Platinum Member
Nov 27, 2007
2,030
5,035
136
GB6

X Elite:
• 15,226 multi core
• 2,956 single core

M3 Pro:
• 15,171 multi core
• 3,035 single core

I wonder if there are some deficiencies in the Oryon uncore.

How is a 6P+6E configuration (Apple M3 Pro) matching a 12P one (SD X Elite)?

Fwiw both are 12-core, so the argument that GB6 MT scales poorly with higher core counts does not apply here.
Part of it might also be the "Windows tax": Geekbench has always performed better on Linux and macOS.

The Linux scores were 3236 ST and 17387 MT.

And while it's true those were with the fans at 100%, that shouldn't really matter much. GB6 tests are very short and bursty and don't tend to throttle even phones too hard (especially in ST). No way the fans alone are responsible for the nearly 10% difference in ST perf.

That said, the uncore probably is the weakest part, with the most low-hanging fruit.
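For what it's worth, the gaps computed straight from the scores quoted in this thread (the Linux run vs. the Windows figures above):

```python
# Percent gap between the Linux and Windows Geekbench 6 runs quoted above.
windows = {"st": 2956, "mt": 15226}
linux   = {"st": 3236, "mt": 17387}

for test in ("st", "mt"):
    gap = linux[test] / windows[test] - 1
    print(f"{test.upper()}: Linux {linux[test]} vs Windows {windows[test]} "
          f"-> Linux ahead by {gap:.1%}")
```

The ST gap works out to about 9.5% and the MT gap to about 14%, consistent with the "nearly 10%" figure above.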
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
GB6 changed its MT methodology, for what it's worth, so that's possible - but everyone other than Apple seems to do poorly in it relative to other MT tests. Not saying it's biased in Apple's favor, just that QC isn't alone in looking bad.
You and I are neither the first nor the last to entertain the idea that Geekbench is biased toward Apple.

There is a widespread belief, especially in the smartphone community on Twitter, that GB is biased toward Apple. When Snapdragon 8 Gen 3 multi-threaded scores came out showing it rivalled the A17 Pro, some users sarcastically commented that GB7 would be released soon.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
[Attached screenshots: Screenshot_20231105_195431_YouTube.jpg, 955.jpg]
Check this out.

Looks like the Snapdragon 8 Gen 3 at 32 FPS has roughly 75% of the GPU performance of the Snapdragon X Elite - which would put the X Elite at roughly 42-43 FPS in the same test.
 
  • Wow
Reactions: lightmanek

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
Oh and incidentally, it's faster than the best iGPU on PC: the Radeon 780M.

LOL.

I know it's only one benchmark, but still...
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106
So in the end we have a device that is really a pain in the proverbial backside to measure, but we tried. The die size is between 165 and 182mm^2, but we are confident in saying the real answer lies in the low 170s. Let's go out on a limb and call it 171 +/- 2mm^2. The thickness is also quite impressive, being more akin to cell phone dimensions than PC CPU ones. Even with preliminary numbers in hand, this could be a device to watch.
So the Snapdragon X Elite is on N4P.

Hypothetical question:

What would the die size be if it were fabbed on N3E?
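One rough way to reason about it: only the logic portion shrinks much, since SRAM cell area on N3E is essentially unchanged from N5-class nodes and analog/IO barely scales. The area split and scaling factors below are assumptions for illustration, not figures from TSMC or Qualcomm:

```python
# Back-of-the-envelope N4P -> N3E die shrink. The 60/30/10 area split and
# the scaling factors are illustrative assumptions, not vendor numbers.

die_n4p_mm2 = 171.0  # midpoint estimate quoted above

# Assumed composition of the die.
logic_frac, sram_frac, other_frac = 0.60, 0.30, 0.10

# Assumed density gains going from N4P to N3E.
logic_shrink = 1 / 1.5   # logic: assume ~1.5x density improvement
sram_shrink = 1 / 1.0    # SRAM: assume essentially no scaling on N3E
other_shrink = 1 / 1.0   # analog/IO: assume essentially no scaling

die_n3e_mm2 = die_n4p_mm2 * (logic_frac * logic_shrink +
                             sram_frac * sram_shrink +
                             other_frac * other_shrink)
print(f"~{die_n3e_mm2:.0f} mm^2")  # ~137 mm^2 under these assumptions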
 
Mar 11, 2004
23,444
5,852
146
The question is, how much are games (and other apps) actually using the GPU? The same is true of Apple's GPU. Both had to build it before games would start to focus on it, but will we see a shift to that support growing? It's been very slow so far.

Software support is huge. It's why the Switch and Steam Deck punch well above their weight. But it also shows that there's more to making a device worth using for something than just performance, specs, or features.

It's exciting to finally see these types of chips. I'm frankly disappointed, as I feel like we could have and should have had products like these several years earlier. I feel like Apple just executed a very logical idea (the lowest of low-hanging fruit) and embarrassed the market (because everyone else was refusing to make such chips), seeing that consumers don't really need a dGPU outside of gaming - and even for that, many could get by with an integrated GPU if you just made it strong enough.

AMD could have and should have been building chips like these years ago (since they already were for consoles). They would've been popular. We likely would have had viable Steam Boxes, which would have made the move to a portable easier and better. It would've helped AMD immensely - both in showing up Intel (AMD waited until Intel started talking about making such chips) and in keeping Nvidia from dominating as much, since I think console-like chips would've gotten a lot of development focus. There are posts of mine from years ago arguing that Zen could've really changed things if we'd gotten premium APUs from it at the outset.

The saddest part, though, is that we're getting these chips when games' use of the available resources looks close to the worst it's been. Last time it was this bad, it drove AMD to make Mantle, which led to DX12 and Vulkan. Further disappointing on that front: I think that's what AMD wanted to do back in the early 2010s (it was a big part of why they developed Mantle), but they were so starved for resources, and then the construction cores weren't competitive (although their design was actually about enabling large mixed CPU/GPU).

I'm wondering if there's any chance we might see dGPUs come out of this from others. That would be interesting. Even Apple could benefit (3D modeling for artists and CAD; especially if they build 3D scanning features into the Vision Pro - imagine being able to look at real objects, laser-scan them into models, and then adjust them on the fly). And it could get really interesting if anyone builds dGPUs that go very heavy on ray-tracing hardware (maybe leaving raster to a large integrated GPU).
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,762
106

In the memory hierarchy, caching frequently accessed data on chip is common across most workloads. The problem with this approach for on-device LLMs is that the parameters take up too much memory to cache. A parameter stored in a 16-bit number format such as FP16 or BF16 is 2 bytes. Even the smallest "decent" generalized large language model is LLaMA, at a minimum of 7 billion parameters. The larger versions are significantly higher quality. Simply running this model requires at minimum 14GB of memory at 16-bit precision.
Well well well.

I hope this means laptops with the Snapdragon X Elite (and even Meteor Lake chips that have an NPU) will come with a baseline of 16 GB of RAM.

Hopefully the days of paying $200 more to get a reasonable amount of RAM, like 16 GB, are over.
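The 14GB figure is just parameters × bytes per parameter. A quick sketch of the arithmetic, also covering the common quantized formats (the 13B and 70B sizes and the INT8/INT4 rows are illustrative additions, not from the quoted article):

```python
# Weight-memory footprint of an on-device LLM: params * bytes per param.
# The 7B FP16 case reproduces the 14GB figure from the quote; other sizes
# and the quantized formats are just for illustration.

GB = 1e9  # decimal gigabytes, matching the quote's arithmetic

formats = {"FP16/BF16": 2.0, "INT8": 1.0, "INT4": 0.5}  # bytes per param

for params_b in (7, 13, 70):  # billions of parameters
    sizes = ", ".join(
        f"{name}: {params_b * 1e9 * nbytes / GB:.1f} GB"
        for name, nbytes in formats.items()
    )
    print(f"{params_b}B params -> {sizes}")
```

And that's weights alone - the KV cache, activations, and the OS all need headroom on top, which only strengthens the case for 16 GB as a baseline.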