
Question: Post your Geekbench AI scores!

Here ya go, folks!


So what are we looking at here?

Same system. The faster one is just overclocked using Intel XTU (5.1 GHz P-cores, 4.2 GHz E-cores, 275 W / 320 A).

The red scores? That's when the system thermally throttled pretty hard.

Hail's 7950X beating the crap out of my system: https://browser.geekbench.com/ai/v1/compare/4729?baseline=6860

Det0x's 9950X ES taking the wind out of my PC: https://browser.geekbench.com/ai/v1/compare/4408?baseline=6860
 
Can someone post a Ryzen HX 370 result?
On paper it has the highest rating at 50 TOPS.

However, Apple claimed to have the fastest NPU with the M4 at 38 TOPS, but that was before the HX 370 came out.
True to Apple's claims, they do have the fastest NPU from what I've seen so far, at least in INT8.

M4 NPU: [result screenshot]

8 Gen 3, Qualcomm's best NPU: [result screenshot]

Snapdragon X Elite - X1E78100 NPU: [result screenshot]
 
There is something else to note there.

Correction: The Apple NPU is more accurate than the CPUs!
 
True, that's why I'm curious to see how the Strix Point NPU performs.
I don't think we're going to see that soon. Either AMD hasn't been able to get the necessary software support ready, or no one who has an HX 370 laptop knows enough to test the NPU. I'm leaning towards the former.
 

This benchmark doesn’t support the AMD NPU yet.
 
Yes, the signal65 article noted that:
Despite having one of the fastest NPUs on paper, Geekbench AI 1.0 still doesn’t have the ability to measure the AMD Ryzen AI NPU performance. When I asked Primate Labs about this, I was told the reasoning was that it was a representation of where AMD stood today in terms of its consumer AI framework implementations, and that trying to integrate support through Vitis software (carryover from Xilinx) just wasn’t working out. Disappointing for sure, but also is mirrored by the fact that you cannot run the Procyon AI benchmark on AMD NPUs. Hopefully we’ll have a solution from AMD on this soon.
 
Device | Single Precision | Half Precision | Quantized
Intel Core Ultra 9 185H (NPU) | 7172 | 7176 | 11000
Qualcomm Snapdragon X Elite 80-100 (NPU) | 2177 | 11069 | 21549
M3 (NPU) | 2499 | 13971 | 14877
M4 (NPU) | 4702 | 32052 | 40743
NVIDIA GeForce RTX 4090 | 36800 | 50531 | 27568

Source: https://signal65.com/research/ai/new-geekbench-ai-1-0-benchmark-analysis-and-early-results/

Using best framework for each NPU. Added RTX GPU (ONNX DirectML) for reference.

Based on this benchmark, we can clearly see that a GPU is geared towards training (FP32 & FP16) and is not very efficient for inference (INT8/INT4).
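To make that concrete, here is a small Python sketch using only the numbers from the table above, showing how much each device gains (or loses) from dropping precision:

```python
# Geekbench AI 1.0 scores from the table above (best framework per NPU,
# RTX 4090 via ONNX DirectML). Tuples are (FP32, FP16, INT8) scores.
scores = {
    "Core Ultra 9 185H (NPU)": (7172, 7176, 11000),
    "Snapdragon X Elite (NPU)": (2177, 11069, 21549),
    "M3 (NPU)": (2499, 13971, 14877),
    "M4 (NPU)": (4702, 32052, 40743),
    "RTX 4090 (GPU)": (36800, 50531, 27568),
}

for device, (fp32, fp16, int8) in scores.items():
    # Ratios > 1.0 mean the device gains throughput from lower precision.
    print(f"{device:26s} FP16/FP32 = {fp16 / fp32:5.1f}x, "
          f"INT8/FP32 = {int8 / fp32:5.1f}x")
```

Every NPU gains substantially going from FP32 to INT8, while the 4090's quantized score actually lands below its own FP16 score, which is what the training-vs-inference observation comes down to.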
 
Is it a driver restriction? Nvidia does that a lot with the allowed throughput rates of "datacenter" formats; NV should be eating most of the various precisions alive.
 
No, I highly doubt it is.

Gaming emphasizes FP32; Nvidia hasn't really been using packed math for years.

From the Ampere whitepaper:

The GA10x SM continues to support double-speed FP16 (HFMA) operations which are supported in Turing. And similar to TU102, TU104, and TU106 Turing GPUs, standard FP16 operations are handled by the Tensor Cores in GA10x GPUs.
 
Anyone know how one may ascertain the NPU TOPS from these GB scores? Or should one just double the Quantized score to arrive at the TOPS? That would mean the M4 NPU has 80 TOPS!
 

Since different vendors are reporting different things with "TOPS" (e.g. some may be INT8, some INT4, some FP8) there's no formula for conversion. But we'll be able to see what "TOPS" figures their marketers claim, and compare to GB AI scores, and figure out a "fudge factor" to compare e.g. the TOPS figure for Qualcomm to Intel, or whatever. Obviously that's pointless once GB AI scores are available, but when something new is announced but not yet released, vendor claimed TOPS are all you have to go by.
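There's no official score-to-TOPS conversion, but the "fudge factor" idea can be sketched in a few lines of Python. The M4's 38 TOPS figure is the one quoted in this thread, Qualcomm's 45 TOPS for the X Elite NPU is its public marketing number, and the quantized scores come from the table above:

```python
# Hypothetical "fudge factor" sketch: there is no official score-to-TOPS
# conversion; this just illustrates the idea from the posts above.

def fudge_factor(claimed_tops: float, quantized_score: int) -> float:
    """Vendor-claimed TOPS per point of Geekbench AI quantized score."""
    return claimed_tops / quantized_score

# Vendor-claimed INT8 TOPS vs. Geekbench AI quantized scores.
m4 = fudge_factor(38, 40743)       # Apple's 38 TOPS claim for the M4 NPU
x_elite = fudge_factor(45, 21549)  # Qualcomm's 45 TOPS claim for X Elite

print(f"M4:      {m4 * 1000:.2f} claimed TOPS per 1000 points")
print(f"X Elite: {x_elite * 1000:.2f} claimed TOPS per 1000 points")

# The "double the quantized score" guess from above would put the M4 at:
print(f"Doubling heuristic: {2 * 40743 / 1000:.0f} TOPS")  # ~81 TOPS
```

Note the two fudge factors differ by more than 2x, which is exactly why vendor TOPS claims can't be converted to scores with a single formula.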
 
Did some test runs in preparation for HWBOT and found out that this benchmark doesn't care about threads at all... I'm getting pretty much the same score with SMT enabled/disabled on my 9950X.

16/32 SMT enabled

16/16 SMT disabled
Have you observed thread utilization? OpenVINO might limit itself to physical cores, since HT won't give you much benefit in backend-bound code. What you might see is noticeable performance scaling with DDR MT/s if the benchmark is using LLMs underneath.
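A rough way to probe the "SMT doesn't matter" observation outside the benchmark (a hypothetical sketch, not Geekbench's own harness): time a compute-bound task at different worker counts and see whether going past the physical core count changes anything:

```python
# Hypothetical sketch: time a compute-bound task at different process
# counts. If the timings are flat past (logical cores / 2), SMT isn't
# buying anything for this kind of workload.
import os
import time
from concurrent.futures import ProcessPoolExecutor

def busy(n: int) -> int:
    """Small compute-bound loop standing in for an inference kernel."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def run(workers: int, jobs: int = 8, n: int = 500_000) -> float:
    """Wall-clock seconds to finish `jobs` tasks with `workers` processes."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(busy, [n] * jobs))
    return time.perf_counter() - start

if __name__ == "__main__":
    logical = os.cpu_count() or 1
    # Compare physical-core-count workers vs. all logical cores.
    for w in (max(logical // 2, 1), logical):
        print(f"{w:2d} workers: {run(w):.2f} s")
```

This only mimics backend-bound compute; an LLM-heavy workload would be memory-bandwidth bound instead, which is where the DDR MT/s scaling would show up.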
 