Discussion Quo vadis Apple Macs - Intel, AMD and/or ARM CPUs? ARM it is!

moinmoin · Apr 12, 2020

Due to popular demand I thought somebody should start a proper thread on this pervasive topic. So why not do it myself?

For nearly a decade now Apple has treated their line of Mac laptops, AIOs and Pro workstations more of a stepchild. Their iOS line of products have surpassed it in market size and profit. Their dedicated Mac hardware group was dissolved. Hardware and software updates has been lackluster.

But for Intel Apple clearly is still a major customer, still offering custom chips not to be had outside of Apple products. Clearly Intel is eager to at all costs keep Apple as a major showcase customer.

On the high end of performance Apple's few efforts to create technological impressive products using Intel parts increasingly fall flat. The 3rd gen of MacPros going up to 28 cores could have wowed the audience in earlier years, but when launched in 2019 it already faced 32 core Threadripper/Epyc parts, with 64 core updates of them already on the horizon. A similar fate appears to be coming for the laptops as well, with Ryzen Mobile 4000 besting comparable Intel solutions across the board, with run of the mill OEMs bound to surpass Apple products in battery life. A switch to AMD shouldn't even be a big step considering Apple already has a close work relationship with them, sourcing custom GPUs from them like they do with CPUs from Intel.

On the low end Apple is pushing iPadOS into becoming a workable mutitasking system, with decent keyboard and, most recently, mouse support. Considering the much bigger audience familiar with the iOS mobile interface and App Store, it may make sense to eventually offer a laptop form factor using the already tweaked iPadOS.

By the look of all things Apple Mac products are due to continue stagnating. But just like for Intel, the status quo for Mac products feels increasingly untenable.

DrMrLordX · May 1, 2020

@Glo.

Someone reputable with access to Apple hardware could probably do some FOSS ports to iOS of command-line applications, preferably of applications that have already been properly ported to ARMv8 with NEON support. And it would be a great comparison that would add detail to arguments such as these. Curiously that work hasn't been done yet.

Thunder 57 · May 1, 2020

Richie Rich said:
I can tell you what's wrong. You don't use any single number based on real measurements to support your opinion.

Real measured data are:

A13 has 1.84 times higher integer PPC than Zen2

A13@2.65GHz is equal to Zen2@4.9 GHz (2.65 x 1.84)

An old A13 in 8-core config would have devastating performance in laptops over any x86 chip. The only competitor within 15W TDP is Cortex-A77 however it would need 12-cores to keep up in MT load. Any x86 will loose either in ST or MT load in 15W TDP. That's clear from actual measured data.

Gurman said..... Richie Rich says that A14 will have an increased signal speed over speed of light resulting in power generation from heat extraction instead power burning. But Apple will never use it because iPhone would be uncomfortably cold. No numbers, no validity. Like in science or engineering.

Do you really think multithreading was as important/effective in 2006 than today? Also, x86 had L2 cache sizes around 512kb - 1MB, not much has changed there. These Apple chips are built for ST performance with large L2 caches that plenty of 2006 code will happily fit into. They are different designs suited for different purposes. And no one cares about your PPC based of SPEC2006 alone.

Fact is, a higher power/performance A13/14 will not perform better than a high power Zen/Intel. On the other hand, low power AMD/Intel cannot compete with ARM in that space. Why is that so hard to understand? And why do you say x86 is dead in one post then suggest a magical Jim Keller K12 6ALU design moved to Zen 3 would suddenly be competitive? It's like you're just trying to stir the pot and it's annoying.

Glo. · May 1, 2020

Thunder 57 said:
Fact is, a higher power/performance A13/14 will not perform better than a high power Zen/Intel. On the other hand, low power AMD/Intel cannot compete with ARM in that space. Why is that so hard to understand?

Yep, this. Both architectures have their strengths and weaknesess. Its not that one can effectively beat the other in ALL of their purposes.

Can Apple scale their designs to HPC market? Can x86 scale into Mobile, low power Market?

Only fools believe that one architecture can replace the other fully in one of those areas. They have to complement each other, but they will never fully replace each other.

Glo. · May 1, 2020

DrMrLordX said:
@Glo.

Someone reputable with access to Apple hardware could probably do some FOSS ports to iOS of command-line applications, preferably of applications that have already been properly ported to ARMv8 with NEON support. And it would be a great comparison that would add detail to arguments such as these. Curiously that work hasn't been done yet.

We have to wait for what Nuvia has to show, with their designs. I presume only afterwards we will see more software ported to ARM ecosystem.

DrMrLordX · May 1, 2020

Glo. said:
We have to wait for what Nuvia has to show, with their designs. I presume only afterwards we will see more software ported to ARM ecosystem.

Well not really. You can get command-line versions of applications like video encoders and even Blender I think? So you can sidestep all the UI crap. Someone with Apple hardware can do a local-only compile (not uploadable to Apple Store) for a benchmark run and report results.

NostaSeronx · May 1, 2020

Glo. said:
Can Apple scale their designs to HPC market? Can x86 scale into Mobile, low power Market?

Yes, yes.

This isn't impossible, has been done by AMD & Intel with dumber, less capable teams than what Apple has.

Intel has a bigger cache with four cores in ~9W territory and it gets curbed stomp by Apple with cores that consume less. Apple A14 supposedly makes it a joke to even bother with Icelake-Y & Tigerlake-Y.

Entry-level my cabbages! It is the most premium of the bunch w/ no Intel frequency getting close to Apple ILP/MLP.

beginner99 · May 2, 2020

Glo. said:
We have to wait for what Nuvia has to show, with their designs. I presume only afterwards we will see more software ported to ARM ecosystem.

I don think ARM could easily be competitive in the Server space if they have all the needed additional tech. If you are doing high core count CPUs your overall design, interconnects and uncore will start to matter much much more than the individual core. We see it with Graviton2 that has problems when under full MT load.

Still even if all these hurdles are jumped, the real problem is the software. When intel (x86) took over the world server-side, the amount of software running was minuscule compared to today. all the custom apps that run inside corporations? forget it. These will take decades to migrate. And having a design that is great but only HPC guys will buy is probably not sustainable as the change will be too slow. Just look at NV and AMD. They still make most of the cash via consumer hardware.

Richie Rich · May 2, 2020

Thunder 57 said:
Fact is, a higher power/performance A13/14 will not perform better than a high power Zen/Intel. O

Fact is that you are lying here. A13 ouperforms Zen2 at 4.6 GHz.
SPECint2006 score:

A13 at 2.65 GHz ... 52.82 pts
Zen2 at 4.6 GHz ... 49.02 pts

The difference is quite big 8%. To reach same performace Zen2 would need to clock 5.0 GHz assuming linear scaling.

If somebody feels that SPEC2006 is outdated, what about GeekBench 5.1.1 from 7th April 2020. Because GB5 show same performance advantage and A13 is beating Ryzen 3950X at 4.6:

A13 at 2.65 GHz ... 1332 pts
Zen2 at 4.6 GHz ... 1269 pts

That's victory for A13 with 5% margin. GeekBench is combined score (not like SPEC has separated INT and FPU score) so it's even worse result for Zen2 because Apple A13 is much stronger in INT than in FPU (SPECint PPC is +84% but SPECfp PPC is only +54% over Zen2).

So the conclusion is that older SPEC2006 fits Zen2 and Coffee Lake better than A13 especially for FPU tests. Your denial hit another wall again. And what about next year new CPUs based on ARMv9 with 2048-bit SVE2? It will probably gain much more from modern Geekbench than from SPEC2006.

iPhone 11 Pro Max - Geekbench

Benchmark results for an iPhone 11 Pro Max with an Apple A13 Bionic processor.

browser.geekbench.com

System manufacturer System Product Name - Geekbench

Benchmark results for a System manufacturer System Product Name with an AMD Ryzen 9 3950X processor.

browser.geekbench.com

DrMrLordX · May 2, 2020

Richie Rich said:
SPECint2006 score:

Would you please run something on A13 other than SPEC or Geekbench?

Richie Rich · May 2, 2020

DrMrLordX said:
Would you please run something on A13 other than SPEC or Geekbench?

As long as those benchmarks corresponds with real applications it's valid. And that's reason why its used and still developed. Please feel free to provide an example of CPU which excelled in SPEC or Geekbench but terribly failed in real applications. Because otherwise it's just an another excuse to denial.

Please show me measured numbers of such a CPU.

Glo. · May 2, 2020

Richie Rich said:
Please feel free to provide an example of CPU which excelled in SPEC or Geekbench but terribly failed in real applications. Because otherwise it's just an another excuse to denial.

Please show me measured numbers of such a CPU.

Graviton v2 vs Multithreaded workloads.

Multithreaded Workloads won that battle.

Glo. · May 2, 2020

DrMrLordX said:
Well not really. You can get command-line versions of applications like video encoders and even Blender I think? So you can sidestep all the UI crap. Someone with Apple hardware can do a local-only compile (not uploadable to Apple Store) for a benchmark run and report results.

I should've cut your post to just last two sentences:

And it would be a great comparison that would add detail to arguments such as these. Curiously that work hasn't been done yet.

Only then my response would make sense from context perspective ;P

With this in mind my point still stands. We need to wait for Nuvia, who promises next-gen ARM(?) CPUs for the masses.

And to be honest, if there is anything that I am curious about its their design. Im looking forward to what they can bring up to the table, and potentially masses, also.

amrnuke · May 2, 2020

DrMrLordX said:
Well not really. You can get command-line versions of applications like video encoders and even Blender I think? So you can sidestep all the UI crap. Someone with Apple hardware can do a local-only compile (not uploadable to Apple Store) for a benchmark run and report results.

If an app has a GUI, that's part of the performance package you need to evaluate in order to make a purchase decision between two chips. Removing layers of complexity doesn't help us at all.

If all you're proposing is stripping the GUI in order to get another data point, then keep in mind that as one strips each layer of complexity from a benchmark, the results will likely incrementally approach the results of SPEC/GB, while simultaneously moving further from being a good benchmark for the actual performance of the full application.

As such, I feel strongly that a stripped-down Blender benchmark run from the command line (if even possible) would be about as useless as the SPEC and GB5 results.

DrMrLordX · May 2, 2020

Richie Rich said:
As long as those benchmarks corresponds with real applications it's valid.

They don't. Certainly not SPEC2006. So come on, let's get on with it. You do own an A13 or A12x and have a MacOS box for running Xcode, right?

moinmoin · May 2, 2020

amrnuke said:
If an app has a GUI, that's part of the performance package you need to evaluate in order to make a purchase decision between two chips.

And how do you intend to unify that when running a test across Windows, Linux (which desktop?), Android, macOS, iOS and any other UI system possibly worth testing (WebOS?)? Don't suggest porting the UI, since that's the opposite of unifying the test.

DrMrLordX · May 2, 2020

amrnuke said:
If an app has a GUI, that's part of the performance package you need to evaluate in order to make a purchase decision between two chips. Removing layers of complexity doesn't help us at all.

The GUI on most applications is a tiny sliver of the computational load. You don't really think running a CLI version of Blender is going to make it run significantly faster, do you? It would not be hard to replicate the results by using the same CLI application on other platforms. If you don't strip the GUI, you're going to have to deal with Apple's APIs which are not going to be supported by the existing codebase.

It's the only way to get numbers on something other than SPEC and Geekbench. Some other cross-platform benches like Antutu have come out and said, "don't compare scores between platforms" (iOS vs Android), so it has rendered itself useless.

@moinmoin

Porting the UI is also a lot more work than most reviewers want to do.

moinmoin · May 2, 2020

DrMrLordX said:
Porting the UI is also a lot more work than most reviewers want to do.

Regardless of who does it, it's close to impossible work since the UIs don't have a common denominator (one could pick e.g. Qt as GUI library across all systems, but then you are benchmarking that, not the underlying system). And once you do solutions native to each system you're already walking down the rabbit hole of both UI functionality and optimization.

amrnuke · May 2, 2020

DrMrLordX said:
The GUI on most applications is a tiny sliver of the computational load. You don't really think running a CLI version of Blender is going to make it run significantly faster, do you? It would not be hard to replicate the results by using the same CLI application on other platforms. If you don't strip the GUI, you're going to have to deal with Apple's APIs which are not going to be supported by the existing codebase.

It's the only way to get numbers on something other than SPEC and Geekbench. Some other cross-platform benches like Antutu have come out and said, "don't compare scores between platforms" (iOS vs Android), so it has rendered itself useless.

That's a good point you're making. Blender on Apple systems would have to run Apple's APIs. Apple's APIs as a result are a chief part of the real world performance of an application on Apple's OSes.

As for the difference between CL version of Blender versus GUI, until we can prove that there's no difference, I don't want to assume anything. That's all. Blender may not see a big hit (I don't know that for sure), but I do think that the majority of consumer facing programs rely heavily on the GUI, and removing it from the performance equation makes the benchmark far less useful.

moinmoin said:
And how do you intend to unify that when running a test across Windows, Linux (which desktop?), Android, macOS, iOS and any other UI system possibly worth testing (WebOS?)? Don't suggest porting the UI, since that's the opposite of unifying the test.

moinmoin said:
Regardless of who does it, it's close to impossible work since the UIs don't have a common denominator (one could pick e.g. Qt as GUI library across all systems, but then you are benchmarking that, not the underlying system). And once you do solutions native to each system you're already walking down the rabbit hole of both UI functionality and optimization.

Do unified tests that eliminate evaluation of the GUI reflect real-world situations as well as a test that does take into account GUI performance? I don't know the answer. GUIs can be complex and can introduce a lot of complications to the benchmarking process. I think that is valuable to benchmark since it is part of the real world experience. Of course, then you're not testing just the chip at that point. But if we want to know how Photoshop does on Macbook ARM, versus Windows x86, we need to measure the GUI too.

It seems that the more accurately the benchmark reflects the end-user experience (for programs requiring a lot of interaction with the app), the less accurate it can be at measuring the CPU's performance.

If your use case is sitting around waiting on the CPU to render or do some other computational work, then synthetic benchmarks are more useful. But if your use case involves a lot of interaction with the GUI and OS, then the benchmarks that remove the GUI/OS from the equation are less relevant. My worry with porting apps to a CLI version and eliminating the GUI portion is that we are removing useful data points for end users when deciding between systems.

I understand the technical reason for doing it, but the GUI and OS do matter for most people.

Overall, I'm opposed to the proposed idea, making CLI versions of the benchmarks, because while it might be fun to talk about a CLI Blender A12X benchmark result, it wouldn't really make much of a dent in the conversation because Blender isn't available on any machine that uses an A12X.

Further, if Apple release an ARM-powered Macbook at some point, we will surely be able to more easily compare its computational prowess to x86, but for the average person buying a Macbook on ARM, they wouldn't care anyway. All they'll care about is "it's Apple", "it feels as fast as my iPhone", and "it browses the internet well".

DrMrLordX · May 2, 2020

@amrnuke

There are certain circumstances where a graphical viewer component (see: CAD apps) can soak up system resources if the user (for example) pan & scans a model. For the raw UI though, that component has long since ceased to be a major drain on resources. Something like 7zip would not have major performance differences between its GUI and the CLI versions.

Note this Blender bug report from version 2.76:

Big performance difference between console and GUI rendering at large tile size

**System Information** Windows 7 64bit Intel i7 4Ghz Nvidia GTX 770 with latest CUDA drivers 16GB RAM **Blender Version** 2.76 **Short description of error** Certain scenes rendered at large tile size takes an enormously long time to render from the Blender Interface, while it can render at 3x-…

developer.blender.org

With an application like Blender, the duty of updating the displayed image during the render can be offloaded to the GPU via GLSL. Or you can not display the render result in realtime at all which is what is done in Blender Benchmark. Blender Benchmark shows a static image of the final render product during the render. You never see the result in realtime.

Thunder 57 · May 2, 2020

Richie Rich said:
Fact is that you are lying here. A13 ouperforms Zen2 at 4.6 GHz.
SPECint2006 score:

A13 at 2.65 GHz ... 52.82 pts

Zen2 at 4.6 GHz ... 49.02 pts

The difference is quite big 8%. To reach same performace Zen2 would need to clock 5.0 GHz assuming linear scaling.

If somebody feels that SPEC2006 is outdated, what about GeekBench 5.1.1 from 7th April 2020. Because GB5 show same performance advantage and A13 is beating Ryzen 3950X at 4.6:

A13 at 2.65 GHz ... 1332 pts

Zen2 at 4.6 GHz ... 1269 pts

That's victory for A13 with 5% margin. GeekBench is combined score (not like SPEC has separated INT and FPU score) so it's even worse result for Zen2 because Apple A13 is much stronger in INT than in FPU (SPECint PPC is +84% but SPECfp PPC is only +54% over Zen2).

So the conclusion is that older SPEC2006 fits Zen2 and Coffee Lake better than A13 especially for FPU tests. Your denial hit another wall again. And what about next year new CPUs based on ARMv9 with 2048-bit SVE2? It will probably gain much more from modern Geekbench than from SPEC2006.

iPhone 11 Pro Max - Geekbench

Benchmark results for an iPhone 11 Pro Max with an Apple A13 Bionic processor.

browser.geekbench.com

System manufacturer System Product Name - Geekbench

Benchmark results for a System manufacturer System Product Name with an AMD Ryzen 9 3950X processor.

browser.geekbench.com

So you don't like that I have to say therefore I am a liar and in denial? How about you answer some of my questions rather than spew talking points. This would be a great first start.

Richie Rich said:
(snip)

Relevant part:

Lisa Su could be calm only under condition of Zen 3 is Keller's K12 (x86 branch) with 6xALUs. If this is true a lot of people will be surprised by Zen 3 performance. Something like those leaks about 40-50% IPC uplift is true, but it's uplift everywhere (int + FPU) and not just FPU.

Yet almost every chance you get you say x86 is dead or a dinosaur or some other derogatory remark. So which is it? Jim Keller's magical K12 saves x86, or x86 is dead?

Glo. · May 2, 2020

DrMrLordX said:
@amrnuke

There are certain circumstances where a graphical viewer component (see: CAD apps) can soak up system resources if the user (for example) pan & scans a model. For the raw UI though, that component has long since ceased to be a major drain on resources. Something like 7zip would not have major performance differences between its GUI and the CLI versions.

Note this Blender bug report from version 2.76:

Big performance difference between console and GUI rendering at large tile size

**System Information** Windows 7 64bit Intel i7 4Ghz Nvidia GTX 770 with latest CUDA drivers 16GB RAM **Blender Version** 2.76 **Short description of error** Certain scenes rendered at large tile size takes an enormously long time to render from the Blender Interface, while it can render at 3x-…

developer.blender.org

With an application like Blender, the duty of updating the displayed image during the render can be offloaded to the GPU via GLSL. Or you can not display the render result in realtime at all which is what is done in Blender Benchmark. Blender Benchmark shows a static image of the final render product during the render. You never see the result in realtime.

Even on the same ISA, there can be differences between platforms with Blender performance:

I hope I added to the confusion!

DrMrLordX · May 2, 2020

Glo. said:
Even on the same ISA, there can be differences between platforms with Blender performance:

I hope I added to the confusion!

x86 hardware seems to be slower under MacOS as well.

Glo. · May 2, 2020

Enjoy.

Thala · May 2, 2020

DrMrLordX said:
If you don't strip the GUI, you're going to have to deal with Apple's APIs which are not going to be supported by the existing codebase.

You have to deal with Apples APIs in any case, even if you strip the UI.
In the best case the application is using only standard C/C++ and POSIX API but typically under iOS you are using Cocoa.

DrMrLordX · May 3, 2020

Thala said:
You have to deal with Apples APIs in any case, even if you strip the UI.
In the best case the application is using only standard C/C++ and POSIX API but typically under iOS you are using Cocoa.

Bleargh.

Discussion Quo vadis Apple Macs - Intel, AMD and/or ARM CPUs? ARM it is!

Diamond Member

Lifer

Diamond Member

Diamond Member

Diamond Member

Lifer

Diamond Member

Diamond Member

Senior member

Lifer

Senior member

Diamond Member

Diamond Member

Golden Member

Lifer

Diamond Member

Lifer

Diamond Member

Golden Member

Lifer

Diamond Member

Diamond Member

Lifer

Diamond Member

Golden Member

Lifer