Solved! ARM Apple High-End CPU - Intel replacement


Richie Rich

Senior member
Jul 28, 2019
470
229
76
Here is the first rumor about an Intel replacement in Apple products:
  • ARM based high-end CPU
  • 8 cores, no SMT
  • IPC +30% over Cortex A77
  • desktop performance (Core i7/Ryzen R7) with much lower power consumption
  • introduction with the new-gen MacBook Air in mid 2020 (MacBook Pro and iMac also under consideration)
  • massive AI accelerator

Source Coreteks:
 
Reactions: vspalanki
Solution
What an understatement :D And it looks like it doesn't want to die. Yet.


Yes, A13 is competitive against Intel chips but the emulation tax is about 2x. So given that A13 ~= Intel, for emulated x86 programs you'd get half the speed of an equivalent x86 machine. This is one of the reasons they haven't yet switched.

Another reason is that it would prevent the use of Windows on their machines, something some say is very important.

The level of ignorance in this thread would be shocking if it weren't depressing.
Let's state some basics:

(a) History. Apple has never let backward compatibility limit what they do. They are not Intel, they are not Windows. They don't sell perpetual compatibility as a feature. Christ, the big...

Richie Rich

Senior member
Jul 28, 2019
470
229
76
First, you can't add switch to mobile for mobile% (your 60% mobile calc)

Second, smartphone and tablet appear to be what they call mobile, not sure the 3rd definition.

Lastly, mobile games have been going up as smartphone usage has gone up. But I think the market is pretty well saturated by now. So, first, I see the 45% as a real mobile number, and second, the future of this is a guess. You guess smartphone/tablet will grow, and I say it's all going to plateau.

Stop twisting the facts.
You cry nicely, but I'm afraid you're crying on the wrong shoulder. I did not write that article. But let's look at the facts:
  • Is Nintendo Switch a mobile device? Yes, it is.
  • Is Nintendo Switch CPU ARM based? Yes, it is. (nVidia Tegra X1 uses Cortex A57 cores)
  • Can somebody claim mobile gaming is phones + mobile consoles? Yes, he can.

The most important conclusion is: ARM devices have about 60% of worldwide gaming revenue, and that share is still rising. You can disagree, you can deny, .... but that's all you can do about it. Most of the worldwide money flows into ARM-based devices, not x86.
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
By your logic PC gaming should have decreased when the PC market started decreasing years ago. Did that happen? Honest question, I don't know :)
Different ball game when the device market is going down - fewer people may be buying PCs, but that doesn't mean that the ones who bought them before just discarded them and stopped playing, it just means that they aren't buying new hardware.

OTOH if x people have phones, you can't have a mobile gaming market of x+1 now can you?
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
Is Nintendo Switch a mobile device? Yes, it is.
It's a console, same as the 3DS, all its predecessors and the other handheld vendor devices that also used ARM chips in the past.

ARM and the mobile market are not the same thing, it just happens that the latter follows the former in release cadence to maintain its momentum - there are plenty of wall-powered ARM devices out there, including the famous Raspberry Pi line of SBCs and the nVidia SHIELD TV, which shares Switch's Tegra hardware.

Not to mention any number of the thousands of smart HDTV models that run from an ARM based SoC, or streaming devices like Chromecast and Fire TV.

The handheld console market and the mobile market are separate beasts - even 'gaming' phones are still phones, and just as limited by the anaemic choice of AAA titles in that market.
 

Nothingness

Platinum Member
Jul 3, 2013
2,371
713
136
Different ball game when the device market is going down - fewer people may be buying PCs, but that doesn't mean that the ones who bought them before just discarded them and stopped playing, it just means that they aren't buying new hardware.
That applies to the mobile device market too, doesn't it?

OTOH if x people have phones, you can't have a mobile gaming market of x+1 now can you?
I'm not sure I get it :) If people start buying more games per device the market will increase even if the number of devices stays the same. So even if the mobile market is saturated in terms of devices it doesn't mean the mobile gaming market can't increase.

The question would then be: will people spend more money to game on their mobile device? I'm certainly unable to tell as I'm not a mobile gamer and can't understand why people spend money (or even time) on Candy Crush and games like that.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
Anyone taking bets on A78/Hercules IPC bump?
Do we need to?
They pretty adamantly stated >15% performance improvement:

Then there are also implementation-specific improvements:

I'll assume TSMC's 5nm will be faster than Samsung's 7nm/5nm that is behind TSMC's 7nm/7nm+.

A76 -> A77 is 20% improvement with 7nm to 7nm
A75 -> A76 is 35% improvement with 10nm to 7nm
Somewhere between 20% and 35% has the higher confidence.

It'll beat Intel SNC/WLC given the TDP range of sub-6W but probably won't beat Apple's A14/A15.
 
Reactions: Richie Rich

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
More high-end ARM chip news: it seems the European supercomputer initiative is licensing the Neoverse Zeus platform. Whether that is effectively based on A77 or not is still unclear. I expected Zeus to be announced as N2 in February, one year after N1/Ares.

 

Richie Rich

Senior member
Jul 28, 2019
470
229
76
Anyone taking bets on A78/Hercules IPC bump?
ARM promised >15% IPC jump.
However Apple's evolution was like this:
  • A9 (2015)... +7% over Intel 9900K
  • A10 (2016)... +16%
  • A11 (2017)... +42% .... first 6xALU core
  • A12 (2018)... +65%
  • A13 (2019)... +83%

Today A77 is roughly where A9 was in 2015 in terms of performance. I expect A78 to be the last 4xALU core, similar to A10, so an IPC jump of 15-20%. I expect some significant changes toward a wider machine, like the 6xALU core of A11, with the new upcoming ARM lineup with SVE2. But even 15-20% will probably be enough to maintain the IPC advantage over Zen3.
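
To put the list above next to the 15-20% guess, here is a minimal sketch that converts the quoted cumulative figures into per-generation gains. It assumes (the post does not say so explicitly) that all five percentages are measured against the same baseline and that gains compound multiplicatively; the names `cumulative` and `generational_gains` are made up for illustration.

```python
# Cumulative figures exactly as quoted in the list above (percent over the baseline).
# Assumption: every entry is relative to the same baseline.
cumulative = {"A9": 7, "A10": 16, "A11": 42, "A12": 65, "A13": 83}


def generational_gains(cum):
    """Turn cumulative leads over one baseline into gen-over-gen gains (percent)."""
    names = list(cum)
    gains = {}
    for prev, cur in zip(names, names[1:]):
        # Ratio of the two performance levels, each expressed relative to the baseline.
        ratio = (100 + cum[cur]) / (100 + cum[prev])
        gains[f"{prev}->{cur}"] = round((ratio - 1) * 100, 1)
    return gains


gains = generational_gains(cumulative)
for step, pct in gains.items():
    print(f"{step}: +{pct}%")
```

Under those assumptions the quoted numbers imply single-generation steps of very different sizes, with the A10->A11 step (the first 6xALU core in the list) being the largest.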

I wonder why HiSilicon skipped A77 in favor of A78 Hercules. A77 is so much better than A76 that it doesn't make sense. Maybe because they know A78 is a much better design than A77? Could A78 have 6xALUs? That would be a massacre, especially in servers for Graviton3.
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
I wonder why HiSilicon skipped A77 in favor of A78 Hercules. A77 is so much better than A76 that it doesn't make sense. Maybe because they know A78 is a much better design than A77? Could A78 have 6xALUs? That would be a massacre, especially in servers for Graviton3.
Political uncertainty for a start I think, secondly it's not the first time they have done it - they skipped A75 with a 10nm A73 (the K970 in my Mate 10) and went straight to A76 with K980.
But even 15-20% will be enough to maintain IPC advantage over Zen3 probably.
Completely irrelevant, there is still a dearth of good software ported to WARM native mode - and though they have announced an OGL over DX12 initiative, I would not expect wonders considering they still haven't even got x64 on ARM binary translation working/performant yet.

On the other foot, iOS and Android have a lot of software, but most of it is not the right software to really take advantage of monster CPU cores, and much of the software that does (DCC) would just as easily gobble up more cores instead of bigger ones, or even GPU compute, as a lot of high-end DCC is moving to GPU as time goes by.
 
Reactions: Tlh97

Richie Rich

Senior member
Jul 28, 2019
470
229
76
Political uncertainty for a start I think, secondly it's not the first time they have done it - they skipped A75 with a 10nm A73 (the K970 in my Mate 10) and went straight to A76 with K980.
That's what I meant. HiSilicon knew that A76 was worth waiting for and A75 was worth skipping. That would mean A78 Hercules is a much bigger performance jump, similar to what A75->A76 was, so it makes sense for them to wait for A78. Or maybe they just save some money by skipping every second generation. ARM's roadmap shows a lower performance gain than A77, though. A performance gain of 150% and doubling IPC in just 4 years? Not bad at all. Intel managed a poor 18% in the same time.

3_575px.png
 

Richie Rich

Senior member
Jul 28, 2019
470
229
76
Actual rumors, jeez Apple prepares heavy artillery:
  • there are 3 processors based on the new A14 uarch at 5nm (probably 2-core for iPhone, 4-core for iPad and the 3rd for Mac)
  • the 1st Mac CPU will have 12 cores, 8 big + 4 little (that's a super powerful config; no one can compete in laptops and it will destroy 95% of desktops sold; only a high-clocked, power-hungry 12-core can beat that Apple beast)
  • Apple is experimenting with CPUs with more than 12 cores (not clear if it is a 4th A14 prototype or a new A15 for workstations to replace Xeons)

https://www.macrumors.com/2020/04/23/12-core-arm-macs-2021-report/amp/?__twitter_impression=true

And if A14 is based on ARMv9 and 2048-bit SVE2 vectors, that will be absolutely devastating for x86. This would be a powerful workstation packed into a thin laptop.
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
Even A64FX isn't vector widths that wide! You really don't make any sense at all. Those are consumer CPUs.
^^ What he said, A64FX is 512 bit vector length SVE, and those are meant as supercomputer chips.

256 bit is likely the max we will see in consumer chips at first.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
A64FX => 2x 512-bit wide SIMD FMA
It is 1024-bit wide; 1 x 1024-bit SVE pipe(FLA+FLB) or 2x 512-bit SVE pipes(FLA/FLB)

48 cores x 1.8 GHz x 32 64-bit FMA FLOPs => 2.7+ DP TFlops
1024 / 64 = 16 lanes, x 2 (FMA = multiply + add) = 32 FLOPs per cycle

For the FUGAKU Supercomputer, Fujitsu states they can run it up to 2.2 GHz:
48 cores x 2.2 GHz x 32 64-bit FMA FLOPs => 3.3+ DP TFlops
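
The arithmetic above can be checked with a short sketch. It only restates the figures from this post (48 cores, 2x 512-bit pipes, 64-bit elements, 1.8 and 2.2 GHz); the function name `peak_dp_tflops` is made up for illustration, and the result is a theoretical peak, not a measured number.

```python
def peak_dp_tflops(cores, ghz, simd_bits_per_core, element_bits=64):
    """Theoretical peak: cores x clock (GHz) x FLOPs per cycle per core, in TFLOPS."""
    lanes = simd_bits_per_core // element_bits  # 1024 / 64 = 16 lanes
    flops_per_cycle = lanes * 2                 # an FMA counts as multiply + add
    return cores * ghz * flops_per_cycle / 1000.0  # GFLOPS -> TFLOPS


# Two 512-bit pipes per core, as discussed above.
a64fx_base = peak_dp_tflops(48, 1.8, 2 * 512)
a64fx_fugaku = peak_dp_tflops(48, 2.2, 2 * 512)
print(f"{a64fx_base:.2f} TFLOPS at 1.8 GHz, {a64fx_fugaku:.2f} TFLOPS at 2.2 GHz")
```

Note that the total comes out the same whether the 1024 bits per core are counted as one 1024-bit pipe or as two 512-bit pipes; the peak-FLOPs formula cannot distinguish the two.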




Regardless, if Apple supports SVE2 they can run 2048-bit SIMDs.
 

soresu

Platinum Member
Dec 19, 2014
2,617
1,812
136
A64FX => 2x 512-bit wide SIMD FMA
It is 1024-bit wide; 1 x 1024-bit SVE pipe(FLA+FLB) or 2x 512-bit SVE pipes(FLA/FLB)
Their own media specifically states 512 bit SIMD x2 - not 1024, you have the wrong end of the stick.

1587697478364.png
1587697518240.png
That's just 512 x2, not 512x2 OR 1024x1 - it's the same as having 2x full AVX512 units.

Effectively it is 1024 bits of performance, but the units themselves are 512 bit.

Ah found this, I think this is what you got it from.....
1587698870549.png

That sounds like they investigated the possibility of just having 1024x1 in the design phase and decided in favour of 512x2 due to utilization concerns.
 

Doug S

Platinum Member
Feb 8, 2020
2,203
3,407
136
Actual rumors, jeez Apple prepares heavy artillery:
  • there are 3 processors based on the new A14 uarch at 5nm (probably 2-core for iPhone, 4-core for iPad and the 3rd for Mac)
  • the 1st Mac CPU will have 12 cores, 8 big + 4 little (that's a super powerful config; no one can compete in laptops and it will destroy 95% of desktops sold; only a high-clocked, power-hungry 12-core can beat that Apple beast)
  • Apple is experimenting with CPUs with more than 12 cores (not clear if it is a 4th A14 prototype or a new A15 for workstations to replace Xeons)

https://www.macrumors.com/2020/04/23/12-core-arm-macs-2021-report/amp/?__twitter_impression=true

And if A14 is based on ARMv9 and 2048-bit SVE2 vectors, that will be absolutely devastating for x86. This would be a powerful workstation packed into a thin laptop.

Why would 2048 bit SVE2 vectors be "devastating for x86"? There are precious few tasks that could make use of that, that would be wasted silicon for 98% of the userbase. If there was a big market for 2048 bit vectors, Intel would sell some AVX2048 CPUs, at least in specialized stuff like the Knights* cores.

No way Apple goes that wide if/when they implement SVE2.
 
Reactions: soresu

Richie Rich

Senior member
Jul 28, 2019
470
229
76
Why would 2048 bit SVE2 vectors be "devastating for x86"? There are precious few tasks that could make use of that, that would be wasted silicon for 98% of the userbase. If there was a big market for 2048 bit vectors, Intel would sell some AVX2048 CPUs, at least in specialized stuff like the Knights* cores.

No way Apple goes that wide if/when they implement SVE2.
I don't agree. It's not about the maximal 2048-bit vector length, as you suggest. SVE2 is devastating for x86 in every way. Variable-length 128-2048-bit vectors, sizeless types, a big push for autovectorization: all this is a totally different way of solving the same problem. Instead of hand ASM optimization (slow and expensive), as with AVX512 now, SVE2 is about shifting most of the unnecessary work to the compiler and HW (i.e. a jump from the 80's to the 21st century). BTW, if you consider matrix multiplication, 2048 bits = 32 doubles, and that's only a 4x8 matrix. CFD fluid simulations use millions of cells and so much bigger matrix sizes. So in some applications you can find 2048-bit useful even today (that's probably where Fujitsu aims their A64FX and why they chose pretty wide dual 512-bit FPUs).

However, the real HW FPU width under the SVE2 abstraction layer is something different, because you need to find a sweet spot for your typical load. That's why A64FX has 2x512-bit FPUs and Matterhorn will have 2x256-bit, as Andrei suggested. And those small efficient cores can have just 1x128-bit FPUs.
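
The "write once, run at any vector length" idea can be sketched in plain Python. This is only a toy model of the loop shape, not SVE code: `lanes_for` and `vla_daxpy` are hypothetical names, and the per-iteration mask merely imitates what a predicate-generating instruction such as SVE's WHILELT provides in hardware.

```python
def lanes_for(vl_bits, element_bits=64):
    """Number of elements that fit in one vector at a given hardware vector length."""
    return vl_bits // element_bits


def vla_daxpy(a, x, y, vl_bits):
    """Compute a*x + y with one loop that works for any vector length (toy model).

    The loop steps by the hardware lane count and masks off the tail with a
    predicate, so no separate scalar cleanup loop is needed.
    """
    n = len(x)
    lanes = lanes_for(vl_bits)
    out = list(y)
    i = 0
    while i < n:
        # Active-lane mask for this iteration (the role WHILELT plays in SVE).
        predicate = [i + lane < n for lane in range(lanes)]
        for lane, active in enumerate(predicate):
            if active:
                out[i + lane] = a * x[i + lane] + y[i + lane]
        i += lanes
    return out


x = [float(k) for k in range(37)]  # deliberately not a multiple of any lane count
y = [1.0] * 37
results = {vl: vla_daxpy(2.0, x, y, vl) for vl in (128, 256, 512, 2048)}
print(lanes_for(2048), all(r == results[128] for r in results.values()))
```

The same function body gives identical results at 128, 256, 512 and 2048 bits, which is the point being argued: the source does not change when the hardware width does. The `lanes_for(2048)` value of 32 doubles matches the 4x8 matrix figure mentioned above.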
 

Richie Rich

Senior member
Jul 28, 2019
470
229
76
He's not the one suggesting that it's about 2048b SVE2. You're the one who brought up that vector length, specifically. From this post:

https://forums.anandtech.com/threads/arm-apple-high-end-cpu-intel-replacement.2571738/post-40141635
And I still stand behind my opinion that "2048-bit SVE2" will be devastating for x86. It is a huge jump from 128-bit NEON and also a huge jump from AVX512 from a philosophy point of view. It's not my fault some people want to see only half of the words in the sentence I wrote.