Question Intel Raptor Lake vs AMD Zen 4 vs Apple M2

Carfax83 · Feb 8, 2022

These CPUs are all going to square off against each other at some point this year, assuming nothing catastrophic occurs to delay any of the product launches. So going by what we know from official sources and informed rumor mongers (many of whom were very accurate before Alder Lake and the M1 launched), which CPU do you think will win out in these categories?

1) Single threaded performance
2) Multithreaded performance
3) Gaming performance
4) Performance per watt
5) Overall performance (who wins the majority of applications)

While I've been keeping a close eye on rumors and leaks for Zen 4 and Raptor Lake, I admittedly have not been doing so for the M2, as I'm an unrepentant Apple hater.

At least I'm honest about it... That said, this is my ranking based on what I've seen and heard:

I think the single threaded crown will go to Raptor Lake, and I say this based on informed rumors that Raptor Lake will have up to 10% more IPC than Alder Lake from microarchitectural updates and cache upgrades, along with higher clock speeds. From what I've seen, gauging IPC isn't easy as it varies so much by application, but I'd say Alder Lake already has at least a 15% across the board IPC advantage over Zen 3, so Raptor Lake could conceivably have roughly 25% better IPC than Zen 3, which is similar to what Zen 4 will reportedly possess. But I doubt Zen 4 will match Raptor Lake in clock speeds and memory latency, which is why I'm predicting Raptor Lake will take the single threaded performance crown.
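As a quick sanity check on that arithmetic, here is a minimal sketch that compounds the two percentages quoted above instead of adding them. The figures are the rumored or estimated ones from this post, not measurements, and the names are purely illustrative.

```python
# Compound the relative IPC figures quoted in the post.
# All inputs are rumors/estimates from the post, not measured values.
ALDER_LAKE_OVER_ZEN3 = 1.15    # "at least a 15%" IPC advantage over Zen 3
RAPTOR_LAKE_OVER_ALDER = 1.10  # "up to 10%" more IPC than Alder Lake


def compound(*ratios):
    """Multiply relative-performance ratios together."""
    result = 1.0
    for ratio in ratios:
        result *= ratio
    return result


raptor_over_zen3 = compound(ALDER_LAKE_OVER_ZEN3, RAPTOR_LAKE_OVER_ALDER)
print(f"Raptor Lake vs Zen 3 IPC: +{(raptor_over_zen3 - 1) * 100:.1f}%")
```

Multiplying the gains lands slightly above the additive 25% estimate, which does not change the conclusion.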

For multithreaded performance, Zen 4 should easily take it due to having more big cores than its Intel counterpart and similar IPC.

Gaming performance is more complicated because, while some games are inherently more reliant on single core performance (strategy games for instance), 3D engines are becoming increasingly parallel due to the adoption of Vulkan and DX12 in addition to modernized programming methods. Still, very few 3D engines can scale beyond 8 threads, and 6 to 8 cores remains the sweet spot for gaming and will be for some time. So overall, I feel more comfortable going with Raptor Lake for the gaming crown. Also, if rumors are correct, Raptor Lake will officially support DDR5-5600 off the bat while Zen 4 will reportedly use DDR5-5200. Raw memory speed likely won't be a significant factor, but Intel's memory controller will be right next to the CPU cores, while Zen 4's will be in the I/O die, which, while still on the same package, will definitely incur a significant latency penalty. I'm sure that penalty will be offset by a massive L3 cache, though.

On performance per watt, one would think the M2 should take this category easily... but from the small amount of research I've done on it, it seems that there won't be much of a performance increase with the M2, if any. Some rumors are even suggesting there may be a bit of a regression in that aspect. Also, since Zen 4 will be on TSMC's 5nm node, it will undoubtedly have excellent performance per watt, and I believe it will also easily crush Apple's best in single core. So for performance per watt, I'm going to go with Zen 4.

When it comes to overall performance, I'm leaning towards Zen 4, but it will be close. Raptor Lake will supposedly double the number of Gracemont efficiency cores, which will certainly help in multithreaded performance per watt, but ultimately they won't be a match for Zen 4's 16 big cores with SMT. AMD will have the core count advantage, and when that's combined with IPC parity with Raptor Lake, Zen 4 will win the majority of the benchmarks.

Abwx · Jun 8, 2022

Doug S said:
That's not true. A CPU may get better perf/watt (or worse perf/watt) if it uses wider SIMD, depending on too many factors to list.

More FP calculations imply more power, in proportion to the throughput improvement. A SIMD instruction would allow for, say, 2x execution throughput per cycle, and it would necessitate twice the number of adders and multipliers, and hence 2x the power for the most power hungry block. If anything, AVX512 is a good example, since frequencies have to be lowered considerably to cope with the higher throughput per cycle.
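A toy model of that argument, as a sketch only: it assumes throughput scales with SIMD width times clock, and that dynamic power of the FP block follows the usual C * V^2 * f relation with switched capacitance proportional to width. The function name and the sample factors are hypothetical, and static power and the rest of the core are ignored.

```python
def relative_perf_per_watt(width_factor, freq_factor, voltage_factor=1.0):
    """Perf/watt of a widened SIMD block relative to the narrow baseline.

    Assumptions (illustrative only):
      - throughput ~ width * frequency
      - dynamic power ~ C * V^2 * f, with C proportional to width
    """
    throughput = width_factor * freq_factor
    power = width_factor * voltage_factor ** 2 * freq_factor
    return throughput / power


# 2x wider units, clock lowered to 85%, same voltage: perf/watt is unchanged.
same_voltage = relative_perf_per_watt(2.0, 0.85)

# Same case, but the lower clock also permits 5% lower voltage: perf/watt improves.
lower_voltage = relative_perf_per_watt(2.0, 0.85, voltage_factor=0.95)

print(same_voltage, lower_voltage)
```

Under these assumptions the proportionality holds only at constant voltage; once voltage moves with frequency, perf/watt is no longer fixed.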

thunng8 · Jun 8, 2022

Abwx said:
More FP calculations imply more power, in proportion to the throughput improvement. A SIMD instruction would allow for, say, 2x execution throughput per cycle, and it would necessitate twice the number of adders and multipliers, and hence 2x the power for the most power hungry block. If anything, AVX512 is a good example, since frequencies have to be lowered considerably to cope with the higher throughput per cycle.

I have no idea why you are continuing with this. Cinebench R23 is definitely not optimised to run on the M1, which many have already pointed out. The SIMD code runs through a wrapper. We have no idea how efficient that wrapper is or how it affects performance, but anything that is not optimised will not be as efficient.

Abwx · Jun 8, 2022

thunng8 said:
We have no idea how efficient that wrapper is or how it affects performance, but anything that is not optimised will not be as efficient.

As I said, if the wrapper were 100% efficient then it would dispatch more instructions per cycle, and this would of course lead to more FP calculations done per cycle; power would increase accordingly.

To put it another way, if the wrapper led to only 80% CPU utilisation, then we can agree that a perfect wrapper would allow 100% CPU utilisation, and that 25% better throughput with the CPU at 100% wouldn't come for free power wise.
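The 80% to 100% figure can be put in numbers with a small sketch. The power model here is an assumption for illustration: power is normalised to 1 at full utilisation and split into a fixed part plus a part proportional to utilisation. `fixed_power_fraction` is a hypothetical parameter, not a measured M1 value.

```python
def throughput_gain(util_before, util_after=1.0):
    """Fractional throughput gain when utilisation rises."""
    return util_after / util_before - 1.0


def perf_per_watt_ratio(util_before, util_after=1.0, fixed_power_fraction=0.0):
    """Perf/watt after the change divided by perf/watt before it.

    Assumed power model: P(u) = fixed + (1 - fixed) * u, so P(1) = 1.
    """
    def power(util):
        return fixed_power_fraction + (1.0 - fixed_power_fraction) * util

    before = util_before / power(util_before)
    after = util_after / power(util_after)
    return after / before


gain = throughput_gain(0.8)  # 80% -> 100% utilisation
proportional = perf_per_watt_ratio(0.8)  # power strictly proportional to utilisation
with_fixed = perf_per_watt_ratio(0.8, fixed_power_fraction=0.3)  # 30% fixed power (hypothetical)

print(f"throughput gain: {gain:.0%}")
print(f"perf/watt ratio, proportional power: {proportional:.3f}")
print(f"perf/watt ratio, 30% fixed power:    {with_fixed:.3f}")
```

If power tracks utilisation exactly, perf/watt stays flat; any fixed power component means the extra throughput improves perf/watt, so the outcome depends on a split nobody in the thread has measured.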

thunng8 · Jun 8, 2022

Abwx said:
As I said, if the wrapper were 100% efficient then it would dispatch more instructions per cycle, and this would of course lead to more FP calculations done per cycle; power would increase accordingly.

To put it another way, if the wrapper led to only 80% CPU utilisation, then we can agree that a perfect wrapper would allow 100% CPU utilisation, and that 25% better throughput with the CPU at 100% wouldn't come for free power wise.

Now you are just being silly. We have no idea how much more efficient it would be. No amount of hand waving will change that. I have measured many other applications, and the M1 does very badly on efficiency in Cinebench compared to other industry standard benchmarks and applications.

In fact, Cinebench is possibly the worst benchmark for comparing x64 to ARM and drawing any conclusions.

Henry swagger · Jun 8, 2022

thunng8 said:
Now you are just being silly. We have no idea how much more efficient it would be. No amount of hand waving will change that. I have measured many other applications, and the M1 does very badly on efficiency in Cinebench compared to other industry standard benchmarks and applications.

In fact, Cinebench is possibly the worst benchmark for comparing x64 to ARM and drawing any conclusions.

Cinebench is the gold standard... ARM chips are weak because they have no SMT2.

Hans de Vries · Jun 8, 2022

Henry swagger said:
Cinebench is the gold standard

According to Intel, Cinema4D's Cinebench is the least relevant benchmark of all: ranked 1331st.

Oh wait...

Doug S · Jun 9, 2022

Abwx said:
More FP calculations imply more power, in proportion to the throughput improvement. A SIMD instruction would allow for, say, 2x execution throughput per cycle, and it would necessitate twice the number of adders and multipliers, and hence 2x the power for the most power hungry block. If anything, AVX512 is a good example, since frequencies have to be lowered considerably to cope with the higher throughput per cycle.

You are assuming identical FP calculations in different functional blocks will require identical power. That's only true if they have the exact same design, which they don't, because the SIMD units don't perform the same functions (e.g. the FP units may perform sqrt while the SIMD units don't, and the FP units don't support popcount while the SIMD units do) and beyond that may have completely different design points.

For example, a muladd in the FP units may use more power due to higher fanout circuits in order to achieve lower latency (whereas the SIMD units are assumed to be doing a lot of successive calculations, where throughput matters more). They may not even use the same transistor types, with one unit using lower power transistors and another using high performance transistors.

Next we have to consider wider SIMD units that obviously won't support as many instructions executing in parallel, and may have different restrictions on the types of operations that can be issued/executed/retired in parallel than narrower units.

This is only scratching the surface of possible differences. You might as well try to tell us that an FP calculation in efficiency cores is the same performance/watt as the same calculation in performance cores.

TheELF · Jun 9, 2022

Hans de Vries said:
According to Intel, Cinema4D's Cinebench is the least relevant benchmark of all: ranked 1331st.

Oh wait...

View attachment 62821

Meh, I can easily believe that less than 1% of notebook users can even start up cinema4D let alone do any work with it.

uzzi38 · Jun 9, 2022

uzzi38 said:
As a much more ambiguous statement: with Rembrandt AMD for the first time added IPUs and other IPs on die. You should expect Phoenix to integrate more.

I didn't think this would come up so quickly.

Das right, Xilinx IP on die baybee

DrMrLordX · Jun 10, 2022

Is Phoenix now Phoenix Point? Kind of a silly question I know, but I figured I'd ask anyway.

uzzi38 · Jun 10, 2022

DrMrLordX said:
Is Phoenix now Phoenix Point? Kind of a silly question I know, but I figured I'd ask anyway.

I guess lol. Everyone I've talked to still calls it Phoenix though, so idk.

shady28 · Aug 16, 2022

Just like to point out, this is what happens when you bench Rocket Lake with a Maximus motherboard and tuned Kingston Hyper-X DDR4, and use crap motherboards and cheap DDR4-3200 / DDR5-4800 on everything else on an RTX 2080.

nicalandia · Aug 16, 2022

shady28 said:
Just like to point out, this is what happens when you bench Rocket Lake with a Maximus motherboard and tuned Kingston Hyper-X DDR4, and use crap motherboards and cheap DDR4-3200 / DDR5-4800 on everything else on an RTX 2080.

View attachment 66006

The 5800X3D still beat that tuned Rocket Lake...
