
Solved! ARM Apple High-End CPU - Intel replacement


Thunder 57

Golden Member
Aug 19, 2007
1,256
941
136
I can imagine some people who haven't been paying attention, or have been in denial about the Spec FP and Int numbers are going to be in for a heck of a shock when that happens...
Everybody knew jets were going to be the big new thing, at least anyone paying attention. They were working on jets before the war started. War tends to speed up things that can bring a huge advantage.

I've been hearing the ARMada for so long now about how they will come to rise and corner the market. It hasn't happened. Windows was available on MIPS (and others) in the 90's. It didn't happen then either. At this point, show me a product or prototype. I'm waiting, and will keep waiting. It's not that I'm anti-ARM, I'm for what is best. Should ARM prove itself, I'd gladly cheer it on like how the US went from a 15 minute suborbital flight to landing on the moon in less than 10 years. I just don't see it yet.
 
  • Haha
  • Like
Reactions: Tlh97 and pcp7

Carfax83

Diamond Member
Nov 1, 2010
5,834
524
126
Similarly, Apple has been working on faster and faster ARM chips for some time, and has been testing the waters with scaling it up (A#X chips in the iPad) and actively cooling it (Apple TV) for quite some time. If you're reading between the lines, the Mac Pro appears to be Apple's LAST professional x86/64 machine. By the time they are ready to update it again (5 years from now) they'll likely have an ARM HPC chip ready.
So according to you, Apple is going to dump x86-64 and go all in on ARM in roughly 5 years. How are they going to tackle compatibility with all of the x86 software?

I can imagine some people who haven't been paying attention, or have been in denial about the Spec FP and Int numbers are going to be in for a heck of a shock when that happens...
I don't think anyone is in denial about anything. If you have read this thread from front to back, several logical reasons have been postulated as to explain why the Apple A series is so performant on a per clock basis; most notably:

1) Low target clock speeds and massive caches, which enhances IPC due to less waiting on memory

2) Ultra wide microarchitecture which specifically targets low power mobile platforms which favor burst performance

3) Total control over the software and not having to worry about breaking backward compatibility

4) Process node advantage, which Intel enjoyed for many years

This is completely different from what Intel and AMD are doing. Both Intel and AMD design CPUs to preserve backward compatibility from both a hardware and software aspect, and also target multiple platforms with the same architecture.

They use the same core in laptops, desktops, server/database and HPC. If Apple does plan on scaling up their architecture to compete in other markets other than low power devices, then they are going to have to consider that the A series' ultra wide, high IPC and low clock speed attributes would likely not be helpful in more diverse workloads.
 

wintercharm

Junior Member
Apr 26, 2017
4
20
51
I've been hearing the ARMada for so long now about how they will come to rise and corner the market.
I don't speak for them. I didn't think this would happen until I saw that ARM was *actually* finally performant in single core compared to modern desktop processors (basically the last 1-2 years). Before then, I think it was very much a pipe dream, but right now it's a bit early. The benchmarks we *do* have seem to speak very favorably for ARM (Spec and SpecInt, and Geekbench 5, flawed as they all might be). But we literally cannot get any other information, until at least *some* ARM devices that run macOS or Windows hit the market... The upcoming Qualcomm 8cx is one such ARM chip. The A14 variant that goes in the MacBook Air might be next... Then we can play around with real world performance numbers, and see what's happening.

I also think there are changes coming in how products are made at a fundamental level, that will lead to ARM gaining a stronger foothold (custom silicon is a winning strategy) but that's a discussion for another day.

Everybody knew jets were going to be the big new thing, at least anyone paying attention. They were working on jets before the war started.
Exactly... and we're just now starting to see ARM reach that level, and some companies are starting to quietly plan for this transition, while others are hedging their bets by deciding how much/how little time and resources and energy they dedicate to planning for it.

So according to you, Apple is going to dump x86-64 and go all in on ARM in roughly 5 years. How are they going to tackle compatibility with all of the x86 software?
They will pave the way with first party apps of their own, and developers who see an opportunity will follow through. Furthermore, Apple's app development environment is about having very well optimized APIs for developers to hook into, and they've spent a lot of time unifying (feature-wise) their APIs between x86/64 and ARM, to where it's now relatively easy to port apps from iOS to macOS. They further streamlined this by giving people SwiftUI as the "future" of how UI is written on the Mac, but also the Watch, iPhone, and iPad...

We can expect future apps to be written with the same core functionality calling on similar/the same APIs for all these devices. The only thing different will be the UI that's written specifically for every particular form factor, so the app can adapt to the device it's being used on, with certain features or APIs being limited or disabled, if necessary.

Apple just dropped ALL 32-bit legacy apps on macOS. I don't expect they'll have any issues dropping x86/64 after emulating them for a little bit. Windows for ARM already has an x86 emulator, so if someone runs Boot Camp on an ARM Mac, they can use that emulator for legacy software if they choose. Apple will likely write their own emulator, just like they wrote for the PPC >> Intel transition. No, performance won't be great, and Apple will tell developers to update their apps, and notify people that the app is not performing well because developers haven't updated it (they did this for the 32-bit >> 64-bit transition on iPhone/iPad). Then, they will remove the noncompliant apps from the App Store.


1) Low target clock speeds and massive caches, which enhances IPC due to less waiting on memory

2) Ultra wide microarchitecture which specifically targets low power mobile platforms which favor burst performance

3) Total control over the software and not having to worry about breaking backward compatibility

4) Process node advantage, which Intel enjoyed for many years
You're absolutely right about these things... and how they pertain to the wider market.

For 1 and 2, Apple has an in-house chip design team... they can adjust these things as necessary when targeting higher performance chips, or keep to this philosophy. Only lack of active cooling prevents these wide cores from bursting constantly. The large amounts of cache, and some data pipelining enhancements will allow the cores to remain fed, even when scaled up (for example, they could do quad channel memory in their wide desktop chip designs)

3 actually remains a major advantage for Apple. They have never been worried about backwards compatibility, and they won't start worrying now. Also, this is the major business advantage of ARM Macs -- having total control of the product from silicon to ecosystem will allow Apple to optimize far beyond what their competitors are able to do.

4. Yeah, process node advantage is something that Intel squandered, but even AMD doesn't have the preferential treatment Apple gets from TSMC. Apple is going to be the first to ship 5nm silicon, just like they were with 7nm... the sheer volume of their operation, and war chest of cash means they can buy out capacity at TSMC, Samsung, or wherever else, as they see fit...

This is completely different from what Intel and AMD are doing. Both Intel and AMD design CPUs to preserve backward compatibility from both a hardware and software aspect, and also target multiple platforms with the same architecture.
Yes, it's different, but Apple also has a reputation as a trend setter in the market... when they do something, and the experience is markedly better, many others follow suit. Apple needs to release one ARM laptop that has unbelievable performance and battery life, at a price that's just too tempting for its customers, and for the "wider swaths" of the general public. An excellent example of the last time this happened was the wedge-shaped MacBook Air in 2010. It was an affordable thin/light everyday laptop with amazing battery life, and one of the first mass market devices with an SSD as the standard configuration. It had a massive ripple effect in the laptop industry, and helped define the market for the last decade. One stellar ARM MacBook from Apple could be the rock that shifts a river, because consumers will start wondering why other devices can't have 24 hours of battery life in such a thin package, while being so cheap and fast, and OEMs will have to respond somehow.

I think that moment is coming. Some do not, and right now we're both quite rational and understand that there are going to be some caveats and tradeoffs if Apple pushes to ARM - mainly that they *will* break compatibility, and it will upset some of their users. But if the software remains macOS, in that I can still have a CLI and full access to the file system, I'm personally okay with it.
 

USER8000

Golden Member
Jun 23, 2012
1,503
705
136
Everybody knew jets were going to be the big new thing, at least anyone paying attention. They were working on jets before the war started. War tends to speed up things that can bring a huge advantage.

I've been hearing the ARMada for so long now about how they will come to rise and corner the market. It hasn't happened. Windows was available on MIPS (and others) in the 90's. It didn't happen then either. At this point, show me a product or prototype. I'm waiting, and will keep waiting. It's not that I'm anti-ARM, I'm for what is best. Should ARM prove itself, I'd gladly cheer it on like how the US went from a 15 minute suborbital flight to landing on the moon in less than 10 years. I just don't see it yet.
ARM is fine - they knew their niche and stuck to it. They went for low power and low die size, i.e., it's about cost - even their model is about licensing. That is why they succeeded - it is one of the enduring tech success stories the UK has produced. But all this stuff about Apple destroying AMD and Intel every year keeps being repeated, and it still doesn't happen - it's been at least 5 years on here. Why should Apple even want to make desktop and server CPUs?? Who is going to buy desktops and servers from them, when even they have retreated from the former? They are a niche player in laptops and desktops.
Even with laptops, again why should they care about desktop replacements, when even their own Intel laptops are thermally challenged? Apple makes slimline laptops, not power laptops. So realistically their market is casuals (phones), and people doing things on the move who want something slim. For that even the processing power of a basic quad core is enough.

Once they start scaling up designs, you will see them hit the problems others have too, and they are basing all their growth on being on cutting edge manufacturing - Intel and AMD are far more experienced at using older nodes, e.g., look at how AMD is doing that with chiplets. There is a reason for doing this - cost.

A lot here are just ignoring emerging CPU design companies from China - they don't even need to match Intel, AMD or Apple to be a threat. They need to be good enough to substitute for our products in their systems, including supercomputers, and these could be using anything from x86, ARM, MIPS or their own instruction sets.
 

beginner99

Diamond Member
Jun 2, 2009
4,404
744
126
Apple could end up with a MacBook lineup that ranges from $800 ("old" Arm Mac) >> $2700 (top end updated ARM mac) if they moved to ARM.
True, but that doesn't solve the issue of Mac Pros. Either they remain on x86, get cancelled, or Apple simply takes the loss of making their own many-core SoCs.
 

naukkis

Senior member
Jun 5, 2002
298
137
116
They use the same core in laptops, desktops, server/database and HPC. If Apple does plan on scaling up their architecture to compete in other markets other than low power devices, then they are going to have to consider that the A series' ultra wide, high IPC and low clock speed attributes would likely not be helpful in more diverse workloads.
You do understand that Intel's CPU core is also just as wide as they can build, and they are extending it to become even wider. And there is a difference between x86 and AArch64; Intel's CPU front end is at least twice as wide as Apple's. x86 holds back how wide instruction decoding can be, and there's not much point in widening the rest of the core if the front end (and also the back end) cannot keep up. Apple's approach is fine; their next CPU core will be even wider and higher performing. As will Intel and AMD CPUs too, because that's the only way to increase performance as silicon limits how high you can clock your core.
 

naukkis

Senior member
Jun 5, 2002
298
137
116
True, but that doesn't solve the issue of Mac Pros. Either they remain on x86, get cancelled, or Apple simply takes the loss of making their own many-core SoCs.
Apple has money to do their own Mac Pro chips if they want. Or they can use 3rd party chips like Cavium Thunder X3 - there are plenty of choices available in the ARM ecosystem; with Intel you are chained to one supplier.
 
  • Like
Reactions: Etain05

Nothingness

Platinum Member
Jul 3, 2013
2,143
372
126
I don't think anyone is in denial about anything. If you have read this thread from front to back, several logical reasons have been postulated as to explain why the Apple A series is so performant on a per clock basis; most notably:

1) Low target clock speeds and massive caches, which enhances IPC due to less waiting on memory

2) Ultra wide microarchitecture which specifically targets low power mobile platforms which favor burst performance

3) Total control over the software and not having to worry about breaking backward compatibility

4) Process node advantage, which Intel enjoyed for many years
You make it sound like Apple Axx chips are only good at IPC. That's simply not true. A13 is faster on SPECint 2006 than 3950X and slightly slower than a 9900K, both of which are high-end desktop chips.
 
  • Like
Reactions: Etain05

ksec

Senior member
Mar 5, 2010
389
84
101
If Apple can afford to do a custom chip every 2 years for the iPad pro, they can certainly afford to do a custom chip every 2 years for the Mac - a higher priced product, with similar sales.
This completely ignores that the Mac has a wider range of TDPs. The key word here is "afford". They can, but again it isn't much of a valid argument from a financial perspective.

(Once Amazon AWS takes over the server market with ARM, then we can start talking. But that is still at least 3-5 years away.)
 

ksec

Senior member
Mar 5, 2010
389
84
101
Apple has money to do their own Mac Pro chips if they want. Or they can use 3rd party chips like Cavium Thunder X3 - there are plenty of choices available in the ARM ecosystem; with Intel you are chained to one supplier.
I think the play here is that once AWS gains ground in server and HPC ARM usage, it will only be a matter of time before either Google or Microsoft fab their own ARM solution, or some third party comes in to corner the demand for one. Once the ARM server and HPC market sort of stabilises (or at least breaks even on R&D) with a mix of x86 and ARM, we should see enough momentum to push ARM forward in PCs. ARM also has a very decent roadmap for their high-performance cores.

But again, that is in the very long run, a ~4-5 year time frame.
 

Richie Rich

Senior member
Jul 28, 2019
271
141
76
One stellar ARM MacBook from Apple could be the rock that shifts a river, because consumers will start wondering why other devices can't have 24 hours of battery life in such a thin package, while being so cheap and fast, and OEMs will have to respond somehow.
Exactly. For an ARM MacBook with such extra performance they can ask extra money, boosting margins even further. Typical Apple. Apple can sell both ARM and x86 laptops at the same time just to demonstrate that the higher price for their ARM MacBook is still a pretty good deal. However, this will show how bad x86's perf/watt and overall performance are, and it could be the turning point for the whole SW industry towards massive adoption of ARM. Since most SW will be available for both ARM and x86, the whole industry will profit from increased competition (lower HW prices will boost sales numbers).

Another thing is Apple's unique ultra wide low power architecture. This clearly demonstrates that this is the way for server CPUs to go, due to similar clocks around 2.6 GHz (double IPC means 8x lower power consumption at iso-perf). So everybody (ARM Cortex cores, AMD and Intel) will need to adopt this ultra wide architecture somehow. And here it starts to get interesting, because only Cortex cores will follow in Apple's footsteps due to the smartphone market. AMD and Intel can bring a kind of hybrid, ultra wide and high clock. I wonder who will have the next ultra wide uarch....
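
For readers wondering where an "8x" figure could come from: a rough sketch, assuming the textbook dynamic-power model P ~ C * V^2 * f and further assuming that voltage scales linearly with clock (so P ~ f^3). Doubling IPC lets a core reach the same performance at half the clock. The function names and the capacitance parameter are made up for illustration; static power and the extra switching capacitance of a wider core are ignored unless passed in, so the 8x is best read as an upper bound, not a measurement.

```python
def relative_dynamic_power(freq_ratio: float, capacitance_ratio: float = 1.0) -> float:
    """Dynamic power relative to a baseline core, assuming P ~ C * V^2 * f."""
    voltage_ratio = freq_ratio  # assumption: voltage scales linearly with clock
    return capacitance_ratio * voltage_ratio ** 2 * freq_ratio


def iso_perf_power(ipc_ratio: float, capacitance_ratio: float = 1.0) -> float:
    """Relative power needed to match baseline performance at a different IPC."""
    freq_ratio = 1.0 / ipc_ratio  # performance = IPC * frequency
    return relative_dynamic_power(freq_ratio, capacitance_ratio)


if __name__ == "__main__":
    # Double IPC, same switched capacitance: 1/8 of the baseline power.
    print(iso_perf_power(2.0))       # 0.125
    # Double IPC, but the wider core switches twice the capacitance: 1/4.
    print(iso_perf_power(2.0, 2.0))  # 0.25
```

The second call shows how quickly the advantage shrinks once the cost of the wider core is counted, which is the caveat the one-line claim leaves out.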
 

lobz

Golden Member
Feb 10, 2017
1,011
813
106
You make it sound like Apple Axx chips are only good at IPC. That's simply not true. A13 is faster on SPECint 2006 than 3950X and slightly slower than a 9900K, both of which are high-end desktop chips.
Don't forget please that you're talking about single-threaded performance here. I'd love to see a 16 core 'A13' with 64 MB of L2 cache though........ then we can talk again
 

SAAA

Senior member
May 14, 2014
430
45
91
Don't forget please that you're talking about single-threaded performance here. I'd love to see a 16 core 'A13' with 64 MB of L2 cache though........ then we can talk again
I'd like that too. But for real, is there any reason it shouldn't scale by adding more cores?

OK there's the hard job of making it work with 4x the cores, with more memory bandwidth, with the IO that's missing compared to server and desktop chips, with software multithreading… but then?
Is there any "real world" test where this chip performs really poorly compared to Zen or Lake cores?

I was sceptical too; absolute performance wasn't there anyway so who cared, but now it isn't just some Geekbench where it shines: even SPEC says this is full-blown desktop class in your pocket. Web browser tests do too, as do scripts, Java, etc. If it's a matter of vector performance or some AI boost I'm sure it could be done with the current design without doubling size or anything.

For clocks, as others said here, going from 2.6GHz in a mobile device to at least 3, maybe 3.5GHz in a 100W envelope doesn't sound surreal.
4GHz might be too much even for single core turbo speeds if the voltage wall is very steep, but already at 3.5 nothing on the market would compare, unless Golden Cove or Zen 4 have the same fairy dust inside them.
 

lobz

Golden Member
Feb 10, 2017
1,011
813
106
I'd like that too. But for real, is there any reason it shouldn't scale by adding more cores?

OK there's the hard job of making it work with 4x the cores, with more memory bandwidth, with the IO that's missing compared to server and desktop chips, with software multithreading… but then?
Is there any "real world" test where this chip performs really poorly compared to Zen or Lake cores?

I was sceptical too; absolute performance wasn't there anyway so who cared, but now it isn't just some Geekbench where it shines: even SPEC says this is full-blown desktop class in your pocket. Web browser tests do too, as do scripts, Java, etc. If it's a matter of vector performance or some AI boost I'm sure it could be done with the current design without doubling size or anything.

For clocks, as others said here, going from 2.6GHz in a mobile device to at least 3, maybe 3.5GHz in a 100W envelope doesn't sound surreal.
4GHz might be too much even for single core turbo speeds if the voltage wall is very steep, but already at 3.5 nothing on the market would compare, unless Golden Cove or Zen 4 have the same fairy dust inside them.
I was trying to say, in a big CPU with a lot of cores it's not realistic to have the massive caches that the A13 has, which contribute a lot to those huge IPC numbers. That's why I always said it's so dumb to compare mobile ARM CPUs with huge desktop / enterprise x86 CPUs and draw the huge leaps of conclusions Richie Rich and some other guys do. Kudos to Apple, seriously, they have an awesome chip. I also very much like comparisons between ISAs and such - but can we please keep it real and acknowledge that it is just empirical and a means to satisfy some curiosity, rather than seeing a number in an Excel sheet, freaking out and flooding every forum 12 times a day with the same asinine conclusion I'm seeing here sometimes?
 

naukkis

Senior member
Jun 5, 2002
298
137
116
I was trying to say, in a big CPU with a lot of cores it's not realistic to have the massive caches that the A13 has, which contribute a lot to those huge IPC numbers. That's why I always said it's so dumb to compare mobile ARM CPUs with huge desktop / enterprise x86 CPUs and draw the huge leaps of conclusions Richie Rich and some other guys do. Kudos to Apple, seriously, they have an awesome chip. I also very much like comparisons between ISAs and such - but can we please keep it real and acknowledge that it is just empirical and a means to satisfy some curiosity, rather than seeing a number in an Excel sheet, freaking out and flooding every forum 12 times a day with the same asinine conclusion I'm seeing here sometimes?
Those are single-thread tests, so the A13 has much less effective cache than the 3950X or 9900K. Yes, it's a very bad comparison to compare a phone SoC to a desktop or server class CPU. A phone SoC is severely limited in power and thermal envelope. And yet Apple's CPU brings similar performance. It's plain ridiculous that a phone SoC is so fast - there's something really hampering x86 desktop performance.
 
  • Like
Reactions: Etain05

lobz

Golden Member
Feb 10, 2017
1,011
813
106
Those are single-thread tests, so the A13 has much less effective cache than the 3950X or 9900K. Yes, it's a very bad comparison to compare a phone SoC to a desktop or server class CPU. A phone SoC is severely limited in power and thermal envelope. And yet Apple's CPU brings similar performance. It's plain ridiculous that a phone SoC is so fast - there's something really hampering x86 desktop performance.
I'm sorry, what?
 
  • Haha
  • Like
Reactions: Tlh97 and pcp7

lobz

Golden Member
Feb 10, 2017
1,011
813
106
Both those chips have 16MB available cache for single thread operation, A13 only has 8MB.
8MB L2 shared between the 2 fast cores on A13.
512KB L2 cache available for one core on Zen 2.
128K L1 cache on A13
32K L1 on Zen 2.

No amount of L3 cache will ever make up for even a fraction of the advantage that 8-16 times (!!!) as much L2 cache plus 4 times as much L1 cache gives.
Please rethink what you wrote in your first post to me in light of this.
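
The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch using the cache sizes exactly as stated in this post (they are not independently verified here), and the dictionary keys are made-up names. The "8-16 times" range falls out of whether the shared 8MB L2 is split evenly between the A13's two fast cores or is used by a single active core.

```python
KIB = 1024
MIB = 1024 * KIB

# Sizes as quoted in the post, not independently verified.
a13 = {"l1": 128 * KIB, "l2_shared": 8 * MIB, "big_cores": 2}
zen2 = {"l1": 32 * KIB, "l2_per_core": 512 * KIB}

l1_ratio = a13["l1"] / zen2["l1"]

# One fast core active: it can use the whole shared L2.
l2_ratio_one_core_active = a13["l2_shared"] / zen2["l2_per_core"]

# Both fast cores active, L2 assumed to be split evenly between them.
l2_ratio_even_split = (a13["l2_shared"] / a13["big_cores"]) / zen2["l2_per_core"]

print(l1_ratio, l2_ratio_even_split, l2_ratio_one_core_active)  # 4.0 8.0 16.0
```

Capacity ratios alone say nothing about latency, which is the point the replies below argue about.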
 

naukkis

Senior member
Jun 5, 2002
298
137
116
8MB L2 shared between the 2 fast cores on A13.
512KB L2 cache available for one core on Zen 2.
128K L1 cache on A13
32K L1 on Zen 2.

No amount of L3 cache will ever make up for even a fraction of the advantage that 8-16 times (!!!) as much L2 cache plus 4 times as much L1 cache gives.
Please rethink what you wrote in your first post to me in light of this.
Cache is always about size vs speed. Apple's shared L2 is its last level cache and has about the same latency as those desktop CPUs' L3, so it's very comparable. Not to mention that both the 9900K and 3950X have a memory controller which itself burns much more power than the whole A13 SoC - resulting in memory latencies about half those of the A13.

Comparing a phone SoC to a full-blown desktop chip isn't fair. It's the A13 which is severely crippled in that comparison. That's the absurd point: even with all the limits of being a phone SoC, the A13 is still as fast as the best x86 chips - they are totally different levels of CPU.
 
  • Like
Reactions: Etain05

lobz

Golden Member
Feb 10, 2017
1,011
813
106
Cache is always about size vs speed. Apple's shared L2 is its last level cache and has about the same latency as those desktop CPUs' L3, so it's very comparable.
Not really. Just a part of it is reserved to act like an actual L3 cache. It is still completely different than Zen 2's L3 cache. I agree it's completely unfair, but in the opposite direction than what you're implying.
 

Carfax83

Diamond Member
Nov 1, 2010
5,834
524
126
You make it sound like Apple Axx chips are only good at IPC. That's simply not true. A13 is faster on SPECint 2006 than 3950X and slightly slower than a 9900K, both of which are high-end desktop chips.
Not sure what you're getting at here. Isn't all the lavish praise directed towards the Apple A series on account of its much higher IPC, comparative to x86 CPUs?
 

insertcarehere

Senior member
Jan 17, 2013
276
75
101
Not sure what you're getting at here. Isn't all the lavish praise directed towards the Apple A series on account of its much higher IPC, comparative to x86 CPUs?
Also that it is very competitive in at least *some* ST benchmarks with the best x86 cpus out there despite being put in an iPhone form factor vs a desktop form factor, with the power, cooling, and memory limitations implied in the former.

It should be noted that even A77, although still far behind Apple, is also pretty comparable with the very latest x86 cpus (Zen 2/Ice Lake) on an IPC basis in whatever cross-platform benchmarks available atm.
 

naukkis

Senior member
Jun 5, 2002
298
137
116
Not really. Just a part of it is reserved to act like an actual L3 cache. It is still completely different than Zen 2's L3 cache. I agree it's completely unfair, but in the opposite direction than what you're implying.
It's not about whether the last level cache is L2, L3 or L4; it's still a shared last level cache - slow compared to the core's private caches. Apple could easily extract more performance from their core by introducing a middle level cache like Intel and AMD have done, something like a 1MB private L2 cache which would have about half the latency of the last level cache. In a phone SoC that performance can't be extracted within the power limits, so it isn't implemented, but for a higher-performing, higher-power environment it is an easy way to increase performance.
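
The middle-level-cache argument can be illustrated with the textbook average memory access time (AMAT) model. This is a sketch only: every latency and hit rate below is a hypothetical number, picked so that the private L2 has half the latency of the shared last level cache as the post suggests, and so that the same fraction of accesses ends up going to DRAM in both setups. None of these figures describe a real A13, Intel or AMD part.

```python
def amat(levels, memory_latency):
    """Average memory access time in cycles for a serial cache hierarchy.

    levels: list of (hit_latency_cycles, local_hit_rate), nearest level first.
            local_hit_rate applies only to accesses that reach that level.
    """
    total = 0.0
    reach = 1.0  # fraction of all accesses that get as far as this level
    for latency, hit_rate in levels:
        total += reach * latency      # everyone reaching this level pays its latency
        reach *= (1.0 - hit_rate)     # only the misses continue downward
    return total + reach * memory_latency


# Hypothetical numbers, for illustration only.
MEMORY = 200                    # DRAM latency in cycles
L1 = (3, 0.95)                  # private L1
SHARED_LLC = (16, 0.80)         # shared last level cache, no middle level
PRIVATE_L2 = (8, 0.60)          # added private L2, half the LLC latency
LLC_BEHIND_L2 = (16, 0.50)      # LLC now only sees L2 misses

two_level = amat([L1, SHARED_LLC], MEMORY)
three_level = amat([L1, PRIVATE_L2, LLC_BEHIND_L2], MEMORY)

print(f"L1 + shared LLC:              {two_level:.2f} cycles")
print(f"L1 + private L2 + shared LLC: {three_level:.2f} cycles")
```

Under these made-up numbers the extra level lowers the average access time only slightly, because L2 misses now pay both the L2 and the LLC latency. How much a real design gains depends on the workload's hit rates, which is part of what the two posters disagree on.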
 
  • Like
Reactions: Etain05
