Question x86 and ARM architectures comparison thread.


Covfefe

Junior Member
Jul 23, 2025
14
27
46
That's because 395 clocks higher than 365 and probably has more cache that needs to be fired up.

A 0.1 GHz higher clock speed on one core and 30MB of L3 cache aren't going to increase power consumption by 16W, lol.
 

Anacapols

Junior Member
Mar 2, 2025
6
10
41
Replying to milkegg again.

You are very right that the 395 boosts further out of its optimal efficiency range in single-core loads than the M4 does, to maximize performance (which AMD needs, because Apple's cores are very fast). But imo it's fairer for architectural comparisons (core vs core only) to run both cores at a similar point on their efficiency curves, which is what happens in multicore loads on most sanely configured processors, including these two. That way you're not just testing how much power the SKU is allowed to pull (which can make any processor arbitrarily inefficient).
If you do test with both cores at similar spots on their V/f curves, it'll be as fair as any arbitrary core comparison can be in terms of showing both in their best light.

For specific CPU SKUs or other products (like a specific MacBook model), though, this is only a small piece of the efficiency puzzle; actual power limits and their corresponding efficiency levels are obviously also relevant. This approach is just for core and architectural comparisons.

You could argue that it's a little unfair, as Apple's cache hierarchy is more focused on 1T (a fast shared L2 that a single core can access) than nT, so this slightly advantages AMD's caching setup. But Apple has the memory bandwidth advantage, so I don't think it's too bad of a disadvantage.
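To make the "same spot on the curve" idea concrete, here is a minimal sketch that interpolates a score at a chosen per-core power and compares points per watt. The function names and the two curves are hypothetical; the numbers are placeholders and not measurements of any real chip.

```python
from bisect import bisect_left


def score_at_power(curve, power_w):
    """Linearly interpolate a benchmark score at a given per-core power.

    `curve` is a list of (watts, score) points sorted by watts.
    """
    watts = [w for w, _ in curve]
    if not watts[0] <= power_w <= watts[-1]:
        raise ValueError("power_w is outside the measured range")
    i = bisect_left(watts, power_w)
    if watts[i] == power_w:
        return curve[i][1]
    (w0, s0), (w1, s1) = curve[i - 1], curve[i]
    return s0 + (s1 - s0) * (power_w - w0) / (w1 - w0)


def points_per_watt(curve, power_w):
    """Efficiency at one operating point on the curve."""
    return score_at_power(curve, power_w) / power_w


# Placeholder curves, NOT measurements of any real chip.
core_a = [(2.0, 60.0), (4.0, 90.0), (8.0, 110.0)]
core_b = [(2.0, 50.0), (4.0, 80.0), (8.0, 105.0), (16.0, 120.0)]

# Same operating point for both cores: 4 W each.
print(points_per_watt(core_a, 4.0), points_per_watt(core_b, 4.0))
# Core B at its own 16 W peak: the number mostly reflects the power limit.
print(points_per_watt(core_b, 16.0))
```

With real data you would feed in measured (power, score) pairs per core. The comparison is only meaningful where both curves have been measured around the chosen power.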
 

MS_AT

Senior member
Jul 15, 2024
780
1,588
96
Let's also keep in mind that CB24 performance scales much more with the memory subsystem than CB23 does (https://chipsandcheese.com/p/cinebench-2024-reviewing-the-benchmark). The memory bandwidth available to a single CCD on Strix Halo is 1/4 of the total, and 1/2 of the total for MT loads. There is also the fact that the most-used SIMD width is 128b, which fits Neon perfectly, but SIMD instructions are still < 10% of the total. That was checked on x64, of course; I'm not sure anybody has tried to check it on ARM. Since we don't have access to the source code, we don't know whether the SIMD usage was targeted by the developers (in which case it would most likely look the same on all archs) or was a result of autovectorization (in which case the instruction mix might look different between archs despite using the same compiler). Since the percentage is really low, I would assume autovectorization, so the result might be different on aarch64. Would be nice if somebody could check.

In other words, CB24 plays more to M4's strengths than to Zen 5's. Just something to keep in mind when you argue about a few watts here and there ;)

Disclaimer:
My position on the topic is that MacBooks generally have better perf and perf/W than their x64 counterparts, but pinpointing accurately how much of that is due to the SoC, how much to better laptop design (more efficient power stages, displays, etc.), and how much to macOS optimizations, based on random benchmarks from the internet, is a near-impossible task.
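For a rough sense of scale, the fractions above can be turned into numbers. This is a sketch only: the 256-bit bus width and 8000 MT/s transfer rate are assumptions about the Strix Halo memory configuration, the result is a theoretical peak rather than a measured figure, and the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfer_rate_mt_s):
    """Theoretical peak DRAM bandwidth in GB/s (bytes per transfer x MT/s)."""
    return bus_width_bits / 8 * transfer_rate_mt_s / 1000


# Assumed configuration: 256-bit LPDDR5X at 8000 MT/s.
total = peak_bandwidth_gb_s(256, 8000)

# Fractions as stated in the post above.
single_ccd = total * 1 / 4
mt_load = total * 1 / 2

print(f"total {total:.0f} GB/s, single CCD {single_ccd:.0f} GB/s, MT {mt_load:.0f} GB/s")
```

Real sustained bandwidth will be lower than these peaks, so treat the output as an upper bound.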
 

CouncilorIrissa

Senior member
Jul 28, 2023
668
2,592
106
In other words, CB24 plays more to M4's strengths than to Zen 5's. Just something to keep in mind when you argue about a few watts here and there ;)

Disclaimer:
My position on the topic is that MacBooks generally have better perf and perf/W than their x64 counterparts, but pinpointing accurately how much of that is due to the SoC, how much to better laptop design (more efficient power stages, displays, etc.), and how much to macOS optimizations, based on random benchmarks from the internet, is a near-impossible task.
Furthermore, 128-bit SIMD is sort of a worst-case scenario for Zen 5, since its latency regressed from 1 clk to 2 clk compared to Zen 4. It's an obvious weakness of the design, sure, but it's not exactly representative of the rest of the workloads. The CB24 underperformance has been discussed ad infinitum at this point.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
Let's also keep in mind that CB24 performance scales much more with the memory subsystem than CB23 does (https://chipsandcheese.com/p/cinebench-2024-reviewing-the-benchmark). The memory bandwidth available to a single CCD on Strix Halo is 1/4 of the total, and 1/2 of the total for MT loads. There is also the fact that the most-used SIMD width is 128b, which fits Neon perfectly, but SIMD instructions are still < 10% of the total. That was checked on x64, of course; I'm not sure anybody has tried to check it on ARM. Since we don't have access to the source code, we don't know whether the SIMD usage was targeted by the developers (in which case it would most likely look the same on all archs) or was a result of autovectorization (in which case the instruction mix might look different between archs despite using the same compiler). Since the percentage is really low, I would assume autovectorization, so the result might be different on aarch64. Would be nice if somebody could check.

In other words, CB24 plays more to M4's strengths than to Zen 5's. Just something to keep in mind when you argue about a few watts here and there ;)

Disclaimer:
My position on the topic is that MacBooks generally have better perf and perf/W than their x64 counterparts, but pinpointing accurately how much of that is due to the SoC, how much to better laptop design (more efficient power stages, displays, etc.), and how much to macOS optimizations, based on random benchmarks from the internet, is a near-impossible task.
That's exactly why I used Blender as a comparison in the earlier pages: it uses AVX2 on Zen 5.

Obviously the topic has gone from "can Apple keep up with AMD's thread-count advantage?" to calculating the efficiency of an architecture using CB2024, which is bizarre.
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
While I agree with the 10v16 core argument, it doesn't really apply here as the 4+8 core zen 5 part (370) scores significantly better against the 10+4 in terms of multicore efficiency than the 16 core part does
Apple's E cores are way weaker than their P cores, so it's still 10 vs 12 + SMT, and SMT by itself gives applications like Cinebench a 20% increase.
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
Aren't Apple's E cores true E cores in a sense, so they're way weaker?
They are still very, very good, but their P cores are way higher performance.

Also, single-thread performance is still highly valued; otherwise Intel, for example, could have just shipped a 32 E-core part and it would have been fine. And that's where Apple's M chips are still king of the hill.

One also has to wonder about the practicality of a near-full desktop chip in laptops, as the power consumption makes it almost irrelevant under load and it becomes an expensive, small-screen desktop at that point. Apple has that part right as well.

Like someone else and I said: imagine *insert your favorite vendor here* doing as well as Apple, and see how much more intense the hype would be. AMD? Intel? It's truly an enviable situation to be in. I am not a fan of Apple as a company, but man, they make good chips.
 

OneEng2

Senior member
Sep 19, 2022
736
983
106
I am not a fan of Apple as a company, but man, they make good chips.
I think the main point of contention in this thread isn't whether Apple makes good chips; it's whether the design they use is somehow superior to x86.

I can't speak for everyone, but based on the evidence shown, my personal conclusion is that M4 is very good at some things, and quite weak at others. The same can be said of x86.

The two architectures came at different problems from the get go. It isn't surprising that M4 operates very efficiently. It isn't surprising that x86 scales to higher outright performance in HPC and DC.

In engineering, you never get something for nothing. It's rare when you get a win all the way around for all the things you want (nearly impossible).
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
In engineering, you never get something for nothing. It's rare when you get a win all the way around for all the things you want (nearly impossible).
You miss one really important point: Some are faster than others, smarter than others, and more capable than others. And when you have a group of people, that effect is magnified, because the smart people might be stressed, worried, or apathetic.

We blame management, but the truth is management is necessary. You can't just put 50 people in a room and expect magic to come out. An excellent team can't have mediocre engineers and good management. You need management that is forward thinking, is disciplined, has experience and knows who to pick for his team. Also being in a position of a good engineer doesn't just take work, because you may simply not be fit for that task.
However, since we are talking about very low CPU core consumption values (maximum 6.5 watts),
6.5W with 285K level of performance! That's 4-5x the advantage. x86 chips would perform at a level most would call useless at that level of power. The whole system uses a mere 16W, which is exceeded by the chip alone for competitors.
 

johnsonwax

Senior member
Jun 27, 2024
267
433
96
Apple's E cores are way weaker than their P cores, so it's still 10 vs 12 + SMT, and SMT by itself gives applications like Cinebench a 20% increase.
There's a scheduler issue here as well. Typically on macOS, once a process has threads moved to P cores, they don't fill up the E cores unless they are given a thread QoS that requires it. There are lots of documented examples of apps going from all E cores/no P cores to all P cores/no E cores. macOS is really hesitant to completely turn over the die to a single app, so if you want that, you have to tell it explicitly, and you can't do it from outside the app through a taskpolicy command, since that sets the policy for a process, not for individual threads. So Cinebench needs to be coded internally to do that.
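As a rough illustration of what "coded internally" could look like, here is a sketch in which each worker thread tags itself with a QoS class through the macOS call `pthread_set_qos_class_self_np`, reached via `ctypes`. The helper and worker names are hypothetical, this is not how Cinebench is actually written, and QoS is only a hint: the scheduler still decides where threads run. On non-macOS systems the helper does nothing and returns False.

```python
import ctypes
import sys
import threading

# QoS class values as defined in macOS <sys/qos.h>.
QOS_CLASSES = {
    "user_interactive": 0x21,
    "user_initiated": 0x19,
    "default": 0x15,
    "utility": 0x11,
    "background": 0x09,
}


def set_current_thread_qos(name):
    """Ask macOS to apply a QoS class to the calling thread.

    Returns True on success, False on failure or on non-macOS systems.
    Raises KeyError for an unknown class name.
    """
    qos = QOS_CLASSES[name]
    if sys.platform != "darwin":
        return False
    libc = ctypes.CDLL(None)
    # int pthread_set_qos_class_self_np(qos_class_t qos_class, int relative_priority);
    fn = libc.pthread_set_qos_class_self_np
    fn.argtypes = [ctypes.c_uint, ctypes.c_int]
    fn.restype = ctypes.c_int
    return fn(qos, 0) == 0


def render_worker(results, index):
    # Each worker tags itself: the setting applies per thread, not per process.
    results[index] = set_current_thread_qos("user_interactive")
    # ... the actual rendering work would go here ...


results = [None] * 4
threads = [threading.Thread(target=render_worker, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

The point of the sketch is only that the call is made from inside each thread, which is why an external per-process setting cannot replace it.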
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
If we go with Geekerwan’s board power measurement, that’s 9-10 watts per core @ 4.4 GHz. Still great, though.
Amberlake-Y with 5W TDP uses 15-16W system power at load so I would go with NBC's measurements.

 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
Amberlake-Y with 7W TDP uses 15-16W system power at load so I would go with NBC's measurements. Note that the Apple system has a higher res display as well.

I’m sure that SoC article uses software readings but NBC uses wall readings when testing laptops.

It makes sense when you look at the power consumption figures for the laptop reviews.
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
I’m sure that SoC article uses software readings but NBC uses wall readings when testing laptops.

It makes sense when you look at the power consumption figures for the laptop reviews.

Another system with a 7.5W TDP uses 18.4W under load; the difference is approximately the TDP increase.

Microsoft's article a while ago noted that the software readings are actually pretty accurate, as long as you are measuring the right thing. Intel also uses software to measure their power, since it's merely reporting back what the sensors are reading.

It would be an issue if you were talking about timeframes shorter than what the display or our eyes can perceive, but it's no problem for averages.
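The "no problem for averages" point can be sketched as follows: software power readings usually come from a cumulative energy counter, and average power is just the energy delta divided by elapsed time. The function and parameter names are hypothetical and the readings are synthetic. On Linux with Intel RAPL, values like these can be read from `/sys/class/powercap/intel-rapl:0/energy_uj`, but this sketch does not touch any hardware.

```python
def average_power_watts(samples, counter_range_uj=None):
    """Average power from cumulative energy readings.

    `samples` is a list of (time_s, energy_uj) pairs in time order.
    `counter_range_uj` is the counter's wrap-around range, if known.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    total_uj = 0
    for (_, e0), (_, e1) in zip(samples, samples[1:]):
        delta = e1 - e0
        if delta < 0:
            # The counter wrapped between these two readings.
            if counter_range_uj is None:
                raise ValueError("counter wrapped; pass counter_range_uj")
            delta += counter_range_uj
        total_uj += delta
    elapsed_s = samples[-1][0] - samples[0][0]
    if elapsed_s <= 0:
        raise ValueError("samples must span a positive time interval")
    return total_uj / 1e6 / elapsed_s


# Synthetic readings for a steady load, sampled once per second.
readings = [(0.0, 0), (1.0, 6_500_000), (2.0, 13_000_000)]
print(average_power_watts(readings))
```

Short spikes between samples are averaged away, which is fine for sustained benchmark runs but not for millisecond-scale boost behavior.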
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106

Another system with a 7.5W TDP uses 18.4W under load; the difference is approximately the TDP increase.
well this makes no sense then
1754445543579.png
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
well this makes no sense then
View attachment 128342
Screen uses about 2W? Intel and others were researching "1W" displays for a while, meaning that on average, under acceptable conditions, it would be 1-2W. That's about what my Yoga 710 consumes, unless I set it to 100%.

This would match professional and user reviews saying that the MacBooks are mostly very quiet.
 

johnsonwax

Senior member
Jun 27, 2024
267
433
96
6.5W with 285K level of performance! That's 4-5x the advantage. x86 chips would perform at a level most would call useless at that level of power. The whole system uses a mere 16W, which is exceeded by the chip alone for competitors.
Story:

Last year I did a big online Factorio event (I'm retired so I can spend my day playing video games, thank you very much) and a bunch of players get together with me to try and figure out why my design isn't performing well on the 7950X3D that's hosting my instance. It's running at 60 FPS on my system but can only manage about 50 on the 7950X3D. This is a top-of-the-line gaming processor, so it doesn't make sense. People take copies and struggle to get it to 60 even on their overclocked gaming rigs. And they finally ask what I'm running: oh, a MacBook Pro. It's top of the line, so not cheap, but still. And they can't believe I'm gaming on a laptop, for one, and for another, that it's faster than their gaming rigs at this. And someone asks why I don't use a proper gaming PC, and I explain that it's February, I'm in California, it's 70 degrees, and I'm on my patio drinking a beer with my feet up, so why would I want to be inside? Someone asked, 'Wait, are you on battery?' Yeah, though I turned performance all the way up so it's running like it's plugged in. It'll get cold in a few hours and the battery should last that long. And everyone just deflated at that, like I'd just proved God didn't exist or something. It was so funny.

So yeah, don't let anyone tell you there aren't benefits to low power - dunking on a bunch of (extremely nice) PC master race guys is one of them. So much so that if we do it again, I may just have to replace this machine with the newest Max just to dunk on their 9950s.
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
So yeah, don't let anyone tell you there aren't benefits to low power - dunking on a bunch of (extremely nice) PC master race guys is one of them. So much so that if we do it again, I may just have to replace this machine with the newest Max just to dunk on their 9950s.
I don't know when society collectively decided that protection at all costs, even to the point of denial, is for the good of society.

The faster they realize the x86 camp is *that* behind, the faster they'll get off their fat, lazy asses and do something about it.

I love PCs. I hope they do well. But the fact that I have to fight them about this suggests that it's going to take a LOT more pressure to wake them up.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
Screen uses about 2W? Intel and others were researching "1W" displays for a while, meaning that on average, under acceptable conditions, it would be 1-2W. That's about what my Yoga 710 consumes, unless I set it to 100%.

This would match professional and user reviews saying that the MacBooks are mostly very quiet.
It's tested using an external monitor. How is the M4 Pro using 80 watts avg here when NBC reported 46 watts peak under load in that earlier article? One has to be from wall power and the other a software-based reading.

edit: both figures are from NBC
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
I love PCs. I hope they do well. But the fact that I have to fight them about this suggests that it's going to take a LOT more pressure to wake them up.
Me too. Sometimes we need obnoxious companies to stir up a storm. I hope Nova Lake is good enough to turn heads.
 

DavidC1

Golden Member
Dec 29, 2023
1,711
2,778
96
It's tested using an external monitor. How is the M4 Pro using 80 watts avg here when NBC reported 46 watts peak under load in that earlier article? One has to be from wall power and the other a software-based reading.

edit: both figures are from NBC
I see what you mean.

Let me check. The 79W is under R23. 46W is under R24.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
I see what you mean.

Notebookcheck's Load Average is basically a power virus, meaning full system stress using Prime95 for the CPU and FurMark for the GPU.

The Cinebench test is just CPU, so it's only using 46W.
But it’s only testing Cinebench in that chart