Let's also keep in mind that CB24 performance scales much more with the memory subsystem than CB23 performance does.
https://chipsandcheese.com/p/cinebench-2024-reviewing-the-benchmark
The memory bandwidth available to a single CCD on Strix Halo is 1/4 of the total memory bandwidth, so MT loads get 1/2 of the total.
There is also the fact that the most-used SIMD width is 128-bit, which fits NEON perfectly, although SIMD instructions still make up less than 10% of the overall instruction mix. That was measured on x64, of course; I'm not sure whether anybody has tried to check it on ARM.
Since we don't have access to the source code, we don't know whether the SIMD usage was targeted by the developers (in which case it would most likely look the same on all architectures) or was a result of autovectorization (in which case the instruction mix might differ between architectures despite using the same compiler). Since the percentage is really low, I would assume autovectorization, so the result might be different on aarch64. It would be nice if somebody could check.
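To put rough numbers on the memory bandwidth fractions mentioned above, here is a minimal Python sketch. The 1/4-per-CCD fraction is the one stated in this post; the 256 GB/s total is my assumption (the theoretical peak of a 256-bit LPDDR5X-8000 interface, not a measured figure), and `ccd_bandwidth_gbs` is just an illustrative helper name.

```python
# Rough ceiling on memory bandwidth reachable by the CPU side of Strix Halo.
# PER_CCD_FRACTION comes from the post; TOTAL_GBS is an assumed theoretical peak.
PER_CCD_FRACTION = 0.25   # each CCD can reach at most 1/4 of total bandwidth
TOTAL_GBS = 256.0         # assumption: 256-bit LPDDR5X-8000, theoretical peak


def ccd_bandwidth_gbs(total_gbs: float, active_ccds: int) -> float:
    """Upper bound on bandwidth (GB/s) for the given number of active CCDs."""
    return total_gbs * PER_CCD_FRACTION * active_ccds


if __name__ == "__main__":
    print("1 CCD :", ccd_bandwidth_gbs(TOTAL_GBS, 1), "GB/s")
    print("2 CCDs:", ccd_bandwidth_gbs(TOTAL_GBS, 2), "GB/s")
```

Under those assumptions a single CCD tops out around 64 GB/s and both CCDs together around 128 GB/s. Real achievable bandwidth will be lower than these theoretical ceilings.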
In other words, CB24 plays more to the M4's strengths than to Zen 5's. Just something to keep in mind when you argue about a few watts here and there.
Disclaimer:
My position on the topic is that MacBooks generally have better perf and perf/W than their x64 counterparts, but accurately pinpointing how much of that is due to the SoC, how much to better laptop design (more efficient power stages, displays, etc.), and how much to macOS optimizations, based on random benchmarks from the internet, is a near-impossible task.