Question: x86 and ARM architectures comparison thread


OneEng2

Senior member
Sep 19, 2022
736
983
106
Nope, we still compared Zen 4 to Raptor Lake. AMD was ahead on node and result.
Of course we compared them!

We didn't conclude that Raptor Lake was an inferior architecture, though, only that Zen 4 had slightly better performance. These days, it is always noted which processor is benefiting from a newer or more expensive process node; it has become hyper-critical. "The best" architecture is the one that makes the most money. This involves being able to do more with less.

I am an engineer. I can tell you that most any idiot can "do more with more". Doing more with less requires talent, though.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
And I never stated within a power envelope. Zen 5 can pound M4 in MT by a large margin .... even in laptops .... but ESPECIALLY everywhere else, where Zen 5's infrastructure is simply light years ahead of M4.
No, it can't. It only does so by pounding it with 2x the power.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
By all means. Compare away! I agree... BUT if we are evaluating an ARCHITECTURE, it seems like a very THIN argument to say .... "M4 is good in laptops, therefore it is fundamentally better everywhere" .... and that is even more true since my argument is that M4 isn't even fundamentally better than Zen 5 in laptops! It is better at some things, but not most things. Not sure how that makes it "better". Furthermore, it is more expensive AND has a process advantage.
How is the Apple M4 uArch not better when Apple is only losing by 64% while AMD has 2x the P cores and 2x the threads?

The NODE advantage does NOT give Apple this big of a lead.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
We didn't conclude that Raptor Lake was an inferior architecture, though, only that Zen 4 had slightly better performance. These days, it is always noted which processor is benefiting from a newer or more expensive process node; it has become hyper-critical. "The best" architecture is the one that makes the most money. This involves being able to do more with less.
It was, though; Zen 4 is better than Raptor Lake. Raptor Lake only seemed like it was winning benchmarks because it consumed a LOT more power.
 

Doug S

Diamond Member
Feb 8, 2020
3,369
5,917
136
Apple does amazing things with their ARM processors, but to dream that this somehow makes ARM fundamentally superior to x86 is just silly IMO.

I can't speak for others, but I've never claimed ARM is fundamentally superior to x86. I've even argued that x86's more complex instruction decoding doesn't make any real difference power-wise in the era of billion-transistor chips, because the additional transistors it requires are such a tiny portion of a modern core.

But a lot of people here seem to be trying to argue that ARM cores in general, or Apple's in particular, are somehow unsuitable for DC. That's ridiculous on its face. No one can point to a benchmark that shows Apple cores as not being appropriate for DC tasks. Test x number of Apple P cores against the same number of x86 P cores, and unless you're talking about tasks that "just happen" to be all about AVX512 (or alternatively all about SVE2), you can't find any big difference in either direction. The only thing people can point to is "well, x86 scales up to 192 cores and Apple doesn't", trying to imply that this is proof that Apple can't. That's just ignorant reasoning. It is the exact same reasoning people once used to claim Apple's cores weren't appropriate for PCs, because Apple was only using them in phones.

Assuming Qualcomm's next-gen cores are competitive with Apple's (likely better, I think, since it appears Qualcomm will be binning on frequency) and they start selling server chips as rumored, then we'll have a good comparison: the same ARM cores being used in both phones and servers. People will have to stop lying that ARM and/or Apple's cores are somehow unsuitable for DC loads. Well, maybe they'll still try to claim that about Apple's cores despite the evidence from Qualcomm's, because they just can't help themselves, but they'll be hanging on by the thinnest of threads.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
1754181969604.png

We have the full-blown M4 Pro here, with 10P+4E/14 threads, against Strix Halo. AMD has 60% more P cores and 32 threads. Yet the difference in scene completion is 5%.

How is Zen 5 on N3E going to make up for that?


Let me put it this way: if Intel had a uArch that is 5% slower and used much less power, but with 10P+4E and 14 threads compared to AMD's 16C/32T, the industry would go nuts.
 
  • Like
Reactions: Mopetar

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
I do not know why we are picking random benchmarks but let's have at it.
Oh boy, you should not have used that at ALL, because I know which benchmark that is, and it's run using an x86_64 kernel, so Rosetta 2 is being used.
IMG_2345.png


If the kernel had been native, i.e. arm64, it would say so, like below.
IMG_2346.png
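
If you want to double-check this on your own machine rather than relying on the Open Data listing, here is a minimal sketch you could paste into Blender's built-in Python console (assumptions on my part: a macOS host, and Apple's documented sysctl.proc_translated key for Rosetta detection):

Code:
import platform
import subprocess

# Under Rosetta 2 the translated process reports an x86_64 machine type,
# just like the "x86_64 kernel" shown on the Blender Open Data pages.
# 'arm64' means a native build; 'x86_64' means an Intel build (translated on Apple silicon).
print("Reported machine:", platform.machine())

# Distinguish "translated by Rosetta 2" from "actually running on an Intel Mac":
# sysctl.proc_translated is 1 under Rosetta, 0 for native arm64 processes,
# and absent on Intel hardware (-i suppresses the error in that case).
translated = subprocess.run(
    ["sysctl", "-in", "sysctl.proc_translated"],
    capture_output=True, text=True,
).stdout.strip()
print("Running under Rosetta 2:", translated == "1")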
 
  • Like
Reactions: Mopetar and gdansk

Schmide

Diamond Member
Mar 7, 2002
5,726
1,015
126
I think your power figures are off.

From notebookcheck

Cyberpunk 2077 2.2 Phantom Liberty, Ultra preset, 1920x1080:
AMD Ryzen AI Max+ PRO 395 (Radeon 8060S): 80.7 fps at 49.1 W = 1.64 fps/W
Apple M4 Max (16 cores, 40-core GPU): 47.4 fps at 42.2 W = 1.12 fps/W

Other metrics: Load Maximum 49.2 W vs 42.2 W, Load Average 42.2 W vs 42.2 W, respectively (whatever that is). They are close.

Depending on the form factor, it can go either way.
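
Spelling the perf-per-watt arithmetic out, a quick Python sketch using only the two data points above (nothing beyond those figures is assumed):

Code:
# Perf-per-watt from the notebookcheck figures above
# (Cyberpunk 2077 2.2 Phantom Liberty, Ultra preset, 1920x1080).
results = {
    "AMD Ryzen AI Max+ PRO 395 / Radeon 8060S": (80.7, 49.1),  # (fps, watts)
    "Apple M4 Max (16 cores) / 40-core GPU": (47.4, 42.2),
}

for name, (fps, watts) in results.items():
    print(f"{name}: {fps / watts:.2f} fps/W")  # ~1.64 and ~1.12 fps/W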
 

gdansk

Diamond Member
Feb 8, 2011
4,342
7,288
136
Oh boy, you should not have used that at ALL, because I know which benchmark that is, and it's run using an x86_64 kernel, so Rosetta 2 is being used.
View attachment 128134


If the kernel had been native, i.e. arm64, it would say so, like below.
View attachment 128135
Good catch. I can't find any Blender 4.4 results for M4 Max that aren't Rosetta. I can run it on my M4 Pro but I have no way of instrumenting power.
But in either case, the Strix results suggest that using Linux again causes a large performance increase, larger than the difference between the M4 Pro and Strix Halo, which the chart you posted didn't consider.
 
  • Like
Reactions: Mopetar and Gideon

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
I think your power figures are off.

From notebookcheck

Cyberpunk 2077 2.2 Phantom Liberty, Ultra preset, 1920x1080:
AMD Ryzen AI Max+ PRO 395 (Radeon 8060S): 80.7 fps at 49.1 W = 1.64 fps/W
Apple M4 Max (16 cores, 40-core GPU): 47.4 fps at 42.2 W = 1.12 fps/W

Other metrics: Load Maximum 49.2 W vs 42.2 W, respectively (whatever that is). They are close.

Depending on the form factor, it can go either way.
Cyberpunk does not max out the CPU cores, so power consumption is lower than in Blender, and games like Cyberpunk are not heavily nT applications.
 
  • Like
Reactions: Mopetar

Schmide

Diamond Member
Mar 7, 2002
5,726
1,015
126
Cyberpunk does not max out the CPU cores, so power consumption is lower than in Blender, and games like Cyberpunk are not heavily nT applications.
Yeah, but if power is lower by a few watts while performance is higher, it's basically a wash. If you normalize performance to watts, you would be hard pressed to find large differences.
 

poke01

Diamond Member
Mar 8, 2022
3,910
5,225
106
Yeah, but if power is lower by a few watts while performance is higher, it's basically a wash. If you normalize performance to watts, you would be hard pressed to find large differences.

Yes, but that's only if you ignore the differences in physical and logical core counts. And that's the point: you cannot do so when comparing uArchs on real-world nT applications.
 

OneEng2

Senior member
Sep 19, 2022
736
983
106
No, it can't. It only does so by pounding it with 2x the power.
... and who cares again? Explain why power should enter into an outright performance discussion?
How is the Apple M4 uArch not better when Apple is only losing by 64% while AMD has 2x the P cores and 2x the threads?

The NODE advantage does NOT give Apple this big of a lead.
It is losing .... full stop. It is just insult to injury that it is losing AND likely costs more to manufacture AND has a process node advantage.

You seem to be fixated on the number of cores. That's kind of silly in this day and age.

You buy processor X for some price and it performs at some level, and you buy processor Y for another price and it performs at another level. This is how the consumer sees it.

For the company, it costs X to produce processor 1 and I make this much on each one; it costs Y to produce processor 2 and I make something different.

These are the only things that matter.
I can't speak for others, but I've never claimed ARM is fundamentally superior to x86. I've even argued that x86's more complex instruction decoding doesn't make any real difference power-wise in the era of billion-transistor chips, because the additional transistors it requires are such a tiny portion of a modern core.

But a lot of people here seem to be trying to argue that ARM cores in general, or Apple's in particular, are somehow unsuitable for DC. That's ridiculous on its face. No one can point to a benchmark that shows Apple cores as not being appropriate for DC tasks. Test x number of Apple P cores against the same number of x86 P cores, and unless you're talking about tasks that "just happen" to be all about AVX512 (or alternatively all about SVE2), you can't find any big difference in either direction. The only thing people can point to is "well, x86 scales up to 192 cores and Apple doesn't", trying to imply that this is proof that Apple can't. That's just ignorant reasoning. It is the exact same reasoning people once used to claim Apple's cores weren't appropriate for PCs, because Apple was only using them in phones.

Assuming Qualcomm's next-gen cores are competitive with Apple's (likely better, I think, since it appears Qualcomm will be binning on frequency) and they start selling server chips as rumored, then we'll have a good comparison: the same ARM cores being used in both phones and servers. People will have to stop lying that ARM and/or Apple's cores are somehow unsuitable for DC loads. Well, maybe they'll still try to claim that about Apple's cores despite the evidence from Qualcomm's, because they just can't help themselves, but they'll be hanging on by the thinnest of threads.
That is quite a fair assessment of x86's "extra" decode into RISC-like, equal-length instructions .... I agree.

It is hard to prove one way or the other whether the M4 core itself would perform as well as Zen 5 in DC, since no platform exists to test the theory. I SUSPECT that it would not perform as well, simply because that is NOT what it was designed for. Zen 5 (and several previous generations) has been specifically architected "Server First" (AMD's words, not mine). It is therefore likely that M4 wouldn't fare well in such a contest.

On the flip side, Zen 5 wouldn't work well at all in a phone or tablet.

To date, this is the only ARM vs Zen 5 benchmark in DC I have seen:

It didn't look very flattering for ARM.
View attachment 128130

We have the full-blown M4 Pro here, with 10P+4E/14 threads, against Strix Halo. AMD has 60% more P cores and 32 threads. Yet the difference in scene completion is 5%.

How is Zen 5 on N3E going to make up for that?


Let me put it this way: if Intel had a uArch that is 5% slower and used much less power, but with 10P+4E and 14 threads compared to AMD's 16C/32T, the industry would go nuts.
I have definitely never said M4 was not good at anything. It does well at Blender, yet even then, it does so with a full node advantage .... and still loses to a Zen 5 part that likely costs less to make.

BTW, I also wonder how important memory bandwidth is to the Blender CPU benchmark. The M4 Max has a huge memory bandwidth advantage that may well aid it significantly in rendering a 1440p scene.
 
  • Like
Reactions: booklib28 and Tlh97