Question Qualcomm's first Nuvia-based SoC - Hamoa


Doug S

Platinum Member
Feb 8, 2020
2,302
3,605
136
Have you ever heard people say megahertz doesn't matter? This is why. It is possible to build a 1 GHz chip that runs just as fast. AMD and Intel just design their chips around a very specific performance/power/cost/marketing profile.

No, it's not. I don't know what the lowest frequency is at which you could get that level of performance, but it is far higher than 1 GHz. We can't come close to extracting enough parallelism to quadruple IPC over M3, or sextuple it over Intel/AMD, which is what would be required to match that performance level at 1 GHz.
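To put rough numbers on that (a back-of-the-envelope sketch, treating single-thread performance as roughly IPC times clock, and assuming ballpark peak clocks of about 4 GHz for M3 and about 6 GHz for current x86 flagships):

```latex
% perf ~ IPC x f; the ~4 GHz (M3) and ~6 GHz (x86 flagship) peak clocks are ballpark assumptions
\text{perf} \propto \text{IPC} \times f
\;\Rightarrow\;
\text{IPC needed at 1 GHz} \approx \text{IPC}_{\text{M3}} \times \frac{4\,\text{GHz}}{1\,\text{GHz}} \approx 4 \times \text{IPC}_{\text{M3}}
\quad\text{or}\quad
\text{IPC}_{\text{x86}} \times \frac{6\,\text{GHz}}{1\,\text{GHz}} \approx 6 \times \text{IPC}_{\text{x86}}
```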
 

mikegg

Golden Member
Jan 30, 2010
1,785
413
136
I definitely don't think the 4.3 GHz ST will look great. And we know the MT clocks are at best 3.8 GHz, FWIW. It tops out at 80 W platform power there, so in practice for ST I could see the power draw at 9-12 W or something at 3.8 GHz. Which isn't too bad; the perf on GB6 is around 2770 or so at that point? I think AMD still draws more for similar performance.



Here's a review that took power from the wall minus idle (so the display and other static draw are mostly removed). M2 ST is about 8 W.
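In other words, the methodology is roughly the following (the 8 W result is from the review; the 18 W / 10 W split is made up purely to illustrate the subtraction):

```latex
% Wall-minus-idle estimate of load power; example numbers are illustrative only
P_{\text{load}} \approx P_{\text{wall, ST load}} - P_{\text{wall, idle}}
\quad\text{e.g.}\quad 18\,\text{W} - 10\,\text{W} \approx 8\,\text{W for M2 ST}
```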

You'll also notice MT power (and we saw this in Andrei's Mac Mini review too) is a bit higher than Apple's claims, probably because Apple quotes power roughly the way Intel/AMD do: when they say the M2 tops out at 15 W (or the M3 at 17 W), they are only referring to the CPU or SoC.

Also, see that ST power is a bit higher even for the AMD and Intel stuff. I'm not 100% sure what Phoenix looks like, but I'd wager it peaks just as badly as Rembrandt, just with more performance.
They had to use Cinebench R23 for this. Cinebench R23 is hand optimized for AVX and poorly translated to use NEON.

Once again, lazy reviewer basing perf/watt on Cinebench between ARM and x86.

 

SpudLobby

Senior member
May 18, 2022
638
382
96
They had to use Cinebench R23 for this. Cinebench R23 is hand optimized for AVX and poorly translated to use NEON.

Once again, lazy reviewer basing perf/watt on Cinebench between ARM and x86.

I am aware of that, but it's not going to affect the wattage itself substantially, and you see the same totals elsewhere. I am only interested in the wattage here, because it overlaps with ST dynamic power in other workloads where the code is fair, and this is what was available. Even Andrei's M1 Mini test ended up around 7 W of dynamic ST power. The point is, nearly everyone underestimates what their chips are actually consuming, and this is abetted by AMD/Intel, and to a lesser extent Apple, given how they qualify power consumption.
 

Nothingness

Platinum Member
Jul 3, 2013
2,450
777
136
They had to use Cinebench R23 for this. Cinebench R23 is hand optimized for AVX and poorly translated to use NEON.
For what it's worth:

Cinebench R23 ST score: M2 is 28% below the 14900K.
Cinebench R24 ST score: M2 is 14% below the 14900K.

For the M1 it's -35% for R23 and -20% for R24.

It's hard to say how that'd translate to power usage and efficiency, but I guess power would be very similar given that, even though translation is poor, NEON is used in R23.
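As a rough way to quantify what the poor NEON path costs in score (a back-of-the-envelope calculation from the percentages above, nothing more):

```latex
% M2-to-14900K ratio in each Cinebench version, using the deltas quoted above
\frac{\left.\text{M2}/\text{14900K}\right|_{\text{R24}}}{\left.\text{M2}/\text{14900K}\right|_{\text{R23}}}
= \frac{1 - 0.14}{1 - 0.28} = \frac{0.86}{0.72} \approx 1.19
```

So moving from R23 to R24 improves the M2's relative standing by roughly 19%; that quantifies the score penalty, though as said above it probably changes the power draw very little.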

Refs:
 

SpudLobby

Senior member
May 18, 2022
638
382
96
For what it's worth:

Cinebench R23 ST score: M2 is 28% below the 14900K.
Cinebench R24 ST score: M2 is 14% below the 14900K.

For the M1 it's -35% for R23 and -20% for R24.

It's hard to say how that'd translate to power usage and efficiency, but I guess power would be very similar given that, even though translation is poor, NEON is used in R23.

Refs:
Yep. It would hurt energy efficiency by virtue of being less performant, but the *power* figures themselves are unlikely to change much, and I posted them more as proxies for what these chips look like under load in various workloads.

C24 is probably fairer to Arm, I'd bet.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,027
136
No they are not. It’s not on the die and it’s optional. You’re in for a ride.

Buddy, if you think opposing MHz wars helps you, I have awful news for you. We're talking frequencies because we care about performance, and AMD and Intel need higher clocks for the same performance. Asking where the frequency/power points on a core's curve lie is just standard.
Frequencies, power, and IPC are mostly disconnected. You can have a 6 GHz CPU that sips power and a 1 GHz CPU that drinks it, both made on the same process.

I suppose you think Intel is the first company with a 6 GHz CPU? IBM had this game down years ago.
No, it's not. I don't know what the lowest frequency is at which you could get that level of performance, but it is far higher than 1 GHz. We can't come close to extracting enough parallelism to quadruple IPC over M3, or sextuple it over Intel/AMD, which is what would be required to match that performance level at 1 GHz.
NVIDIA and AMD disagree: GPUs are what's powering AI right now, mind you. They are massively parallel by design, low frequency, powerful, efficient for what they do, and Turing complete.
 

Doug S

Platinum Member
Feb 8, 2020
2,302
3,605
136
Frequencies, power, and IPC are mostly disconnected. You can have a 6 GHz CPU that sips power and a 1 GHz CPU that drinks it, both made on the same process.

I suppose you think Intel is the first company with a 6 GHz CPU? IBM had this game down years ago.

NVIDIA and AMD disagree: GPUs are what's powering AI right now, mind you. They are massively parallel by design, low frequency, powerful, efficient for what they do, and Turing complete.


GPUs are only useful for running massively parallel code, which is what both shaders and 'AI' are. They are nothing like CPUs, and comparing them in this way is just plain foolish. Compile normal straight-line code like Geekbench's LLVM test on one and it would absolutely choke, with performance far worse than an Apple or Intel/AMD CPU downclocked to the same frequency, because it isn't designed for straight-line code.
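Toy illustration of the distinction (a hypothetical C++ sketch, not anything from Geekbench): the first loop is embarrassingly parallel and maps naturally onto thousands of GPU lanes, while the second has a loop-carried dependency, so its throughput is set by single-thread latency and clock, which is exactly where a GPU shader core is weakest.

```cpp
#include <cstddef>
#include <vector>

// Embarrassingly parallel: every iteration is independent, so the work can be
// spread across thousands of GPU lanes. Shaders and most "AI" kernels look like this.
void scale(std::vector<float>& v, float k) {
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] *= k;
}

// Loop-carried dependency: iteration i needs the result of iteration i - 1.
// Extra threads don't help; throughput is set by single-thread latency and clock,
// which is why "straight-line" code like this runs poorly on a GPU.
float dependent_chain(const std::vector<float>& v) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i)
        acc = acc * 0.5f + v[i];   // each step depends on the previous result
    return acc;
}
```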
 

gdansk

Platinum Member
Feb 8, 2011
2,179
2,751
136
They had to use Cinebench R23 for this. Cinebench R23 is hand optimized for AVX and poorly translated to use NEON.

Once again, lazy reviewer basing perf/watt on Cinebench between ARM and x86.

And just like that, it's a reminder of the real world. Plenty of AVX-optimized open source software without NEON ports yet.
And only a few in the opposite direction.
 

ikjadoon

Member
Sep 4, 2006
118
167
126
To be fair, Oryon only has a rough timeline of late Q2 / early Q3 ("mid-year 2024"). Arm v Qualcomm's trial won't start until September 2024, so perhaps just after Qualcomm's intended timeline.

The lawsuit is still very much ongoing; you can follow along here:


Interesting tidbit #1: AMD, Apple, Ampere, MediaTek, TSMC, NVIDIA, Cadence, Google, Synopsys, Intel, etc. are all involved in the trial now. Lots of people are giving depositions / receiving subpoenas to testify in court.

Interesting tidbit #2: So far with discovery & depositions, the judge is mostly siding with Arm and against Qualcomm.
  1. ALAs from Arm: Judge says Qualcomm's motion is partly granted, partly denied. Can't see the details.
  2. Qualcomm tried to get a deposition from Masayoshi Son. Judge rules against Qualcomm here.
  3. Qualcomm wanted discovery of Arm's IPO. Judge rules against Qualcomm here.
  4. Qualcomm wanted docs from Antonio Viana at Arm. Judge rules against Qualcomm here.
  5. Qualcomm wanted Apple's & Ampere's specific ALAs. Judge rules against Qualcomm here.
The last update is October 25, so literally yesterday haha.

Not many updates (as expected after just 3 weeks).

//

Pure speculation on my part: Axios reported recently that "Arm is in advanced talks on a large deal with an existing customer that, if it closes by year-end, would bring Q3 revenue at the high end of its guidance. But Haas [Arm CEO] says it's a 'complex deal' that might bleed into January, particularly given how negotiations can slow around the holidays. If so, Q3 would come in light but the full fiscal year would be okay. He adds that Arm has a very high degree of confidence the new contract will close."

I initially thought it can't be Qualcomm; Hamoa hasn't launched, so Arm's upcoming Q3/Q4 financials won't be significantly affected. But then I wondered: perhaps Arm & Qualcomm are restructuring all of Qualcomm's Arm licenses as part of the NUVIA settlement deal.

Or maybe this is another customer (e.g., NVIDIA or AMD for their alleged consumer desktop CPUs), so it might not matter re: this lawsuit.
 

Nothingness

Platinum Member
Jul 3, 2013
2,450
777
136
Nope. Just uncool stuff like PHP.
That's significant; I think PHP is still used a lot. I wonder what the impact of AVX use is; I couldn't find any benchmark. The JIT supports AArch64, but it doesn't seem to emit NEON instructions.

But that's just one example, and one that doesn't matter for end users; we're not talking servers, are we? I know AArch64 is a bit behind x86, but it's getting better and better support, and I don't think there's that much OSS with AVX support and no NEON.
 

FlameTail

Platinum Member
Dec 15, 2021
2,356
1,276
106
This SoC supports only DirectX12.

So will DX11 and older games work on it?

Edit: Okay, so as per my research- DirectX is reverse compatible with older versions. So in theory games using older DX versions should run.
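For what it's worth, what the runtime will actually hand an application can be probed directly. Below is a minimal, hypothetical C++ sketch (assuming the Windows SDK headers and linking against d3d11.lib; not tested on any Snapdragon device) that asks the D3D11 runtime which feature levels the installed driver will create a hardware device for. Whether a given older game then runs well is a separate question, as the replies below get into.

```cpp
// Minimal probe of which Direct3D feature levels the installed driver exposes.
// Hypothetical sketch: assumes the Windows SDK (d3d11.h) and linking against d3d11.lib.
#include <windows.h>
#include <d3d11.h>
#include <cstdio>

int main() {
    const D3D_FEATURE_LEVEL requested[] = {
        D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0,
        D3D_FEATURE_LEVEL_10_1, D3D_FEATURE_LEVEL_10_0,
        D3D_FEATURE_LEVEL_9_3,  D3D_FEATURE_LEVEL_9_1,
    };
    ID3D11Device* device = nullptr;
    ID3D11DeviceContext* context = nullptr;
    D3D_FEATURE_LEVEL achieved = D3D_FEATURE_LEVEL_9_1;

    // Ask for a hardware device on the default adapter; the runtime reports the
    // highest level from 'requested' that the driver will actually create.
    HRESULT hr = D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                                   requested, ARRAYSIZE(requested),
                                   D3D11_SDK_VERSION, &device, &achieved, &context);
    if (SUCCEEDED(hr)) {
        std::printf("Highest D3D feature level exposed: 0x%04x\n",
                    static_cast<unsigned>(achieved));
        context->Release();
        device->Release();
    } else {
        std::printf("Hardware D3D11 device creation failed: 0x%08lx\n",
                    static_cast<unsigned long>(hr));
    }
    return 0;
}
```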
 

soresu

Platinum Member
Dec 19, 2014
2,701
1,904
136
Edit: Okay, so as per my research- DirectX is reverse compatible with older versions. So in theory games using older DX versions should run.
DX12 is a completely different beast from DX10/11 or DX7/8/9.

Otherwise DXVK wouldn't need to write new code to support DX8 given how well it already supports DX10/11.

What you call "reverse compatible" is most likely MS speak for "we force ODMs to write drivers for everything else we did earlier before they get the thumbs up for DX12".

If DX12 is anything like Vulkan in versatility, you could write a translation layer for everything that came before it - but given the sorry state of MS's internal OGL -> DX12 translation layer's performance relative to Zink (OGL -> VK), it doesn't seem to be so hot for that task.
 
Jul 27, 2020
16,646
10,656
106
Edit: Okay, so as per my research- DirectX is reverse compatible with older versions. So in theory games using older DX versions should run.
Nuh uh. If it were that simple, Intel's driver team wouldn't have had to go through so much trouble with Arc, trying to support older DirectX AAA titles. QC isn't targeting gaming, so most likely older DX games will either not run or will suffer from glitches that no one other than the community will bother to fix.
 

uzzi38

Platinum Member
Oct 16, 2019
2,662
6,163
146
This SoC supports only DirectX12.

So will DX11 and older games work on it?

Edit: Okay, so as per my research- DirectX is reverse compatible with older versions. So in theory games using older DX versions should run.
So far QC's D3D11 driver sucks balls.

QC have to improve a LOT if they want the iGPU to perform reasonably across a wide variety of games. Or well... even work at all in some games, going off of prior 8cx devices.
 

soresu

Platinum Member
Dec 19, 2014
2,701
1,904
136
QC isn't targeting gaming so most likely, older DX games will not run or suffer from glitches that no one other than the community will have to fix.
By then QC will likely have a Vulkan driver running on WoA with DXVK providing all the magic, leaving only x86 binary translation as the significant stumbling block.
 

FlameTail

Platinum Member
Dec 15, 2021
2,356
1,276
106
By then QC will likely have a Vulkan driver running on WoA with DXVK providing all the magic, leaving only x86 binary translation as the significant stumbling block.
Wow, DXVK sounds amazing, converting DX9/10/11 to Vulkan.

Do you think Qualcomm will get Zink working on it too? Zink converts OpenGL to Vulkan.

That would centralise a lot of the GPU stack on Vulkan, which is not necessarily a bad thing, since the Adreno GPUs in their smartphone Snapdragons have excellent Vulkan performance.
 

soresu

Platinum Member
Dec 19, 2014
2,701
1,904
136
Do you think Qualcomm will get Zink working on it too? Zink converts OpenGL to Vulkan.
That would centralise a lot of the GPU stack on Vulkan, which is not necessarily a bad thing, since the Adreno GPUs in their smartphone Snapdragons have excellent Vulkan performance.
Not sure - it's possible that they will just leave that to devs, much as some have been doing with DXVK-Native, which links directly into the game code so that it produces Vulkan output without the OS/drivers coming into it.

In the future I could see Zink displacing native OGL driver compilers for any significantly new GPU µarch that would demand a serious compiler rewrite.

OGL is a mountain to implement all the way up to v4.6 of the API, so it seems like the path of least resistance is likely to be taken going forward, even with the translation penalty vs a well optimised native OGL driver.