Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,825
1,396
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24,576 concurrent threads
2.6 teraflops
82 gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options are 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, H.265 (HEVC), and ProRes

M3 Family discussion here:


M4 Family discussion here:

 

poke01

Platinum Member
Mar 8, 2022
2,205
2,803
106

It seems the M4 family code names will be based on islands around Ireland. With the M3, Apple used islands around Spain.

If Apple doesn't go the chop route and instead makes a unique die for each M4 tier, I would expect a new code name for each chip.

As of this post we only know the code name of the base M4 chip. The Pro and Max are not yet known.

—————————
A18 is called Tahiti and of course I am obligated to say this quote:

“who lives in Tahiti?.. Tahiti-ins I guess”
 
  • Like
Reactions: Apokalupt0

Mopetar

Diamond Member
Jan 31, 2011
8,113
6,768
136
There's something that might give good results from very wide cores but isn't yet utilized - hardware loop unrolling. It's complex to do, but when done it makes it possible to run every iteration of a loop on its own hardware, making good use of very wide execution hardware. Proper ISA support would make implementing that kind of parallelism much easier.

Vector instructions are even better in those cases. Apple supports NEON instructions in their SoCs, but they don't talk about their vector processing capabilities all that much.

Loop unrolling needs a front end capable of issuing a larger number of instructions that come from executing multiple loop iterations at once. SIMD might be faster even with fewer execution ports because it doesn't get bound up by the front end and if you have a large enough vector size the memory accesses will be better as well.
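
To make that concrete, here's a minimal illustrative sketch of the kind of loop SIMD handles well - no cross-iteration dependencies, so four floats go through per NEON instruction (hand-written intrinsics just for illustration; a compiler will usually auto-vectorize this anyway):

```c
#include <arm_neon.h>
#include <stddef.h>

// Independent iterations: four floats per NEON instruction, nothing
// for the front end to untangle between iterations.
void scale(float *dst, const float *src, float k, size_t n) {
    float32x4_t vk = vdupq_n_f32(k);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        vst1q_f32(dst + i, vmulq_f32(vld1q_f32(src + i), vk));
    for (; i < n; i++)          // scalar tail
        dst[i] = src[i] * k;
}
```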

I'm surprised that Apple hasn't added some kind of SMT yet. That's probably one of the best ways to keep a wider architecture busy without requiring too many additional transistors.
 

naukkis

Senior member
Jun 5, 2002
903
786
136
Vector instructions are even better in those cases. Apple supports NEON instructions in their SoCs, but they don't talk about their vector processing capabilities all that much.

Loop unrolling needs a front end capable of issuing a larger number of instructions that come from executing multiple loop iterations at once. SIMD might be faster even with fewer execution ports because it doesn't get bound up by the front end and if you have a large enough vector size the memory accesses will be better as well.

I'm surprised that Apple hasn't added some kind of SMT yet. That's probably one of the best ways to keep a wider architecture busy without requiring too many additional transistors.

That was specifically a solution for utilizing wider cores. Vectorization (SIMD) only works when there are no dependencies between data - basically, the dependencies have to be resolved at compile time. With hardware loop unrolling it's also possible to resolve dependencies at runtime (calculating or predicting variables from runtime data) and execute multiple loop iterations in parallel.
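
For contrast, a minimal sketch (illustrative only) of the kind of loop SIMD can't touch directly - each iteration needs the previous result, so running iterations in parallel means the hardware or compiler has to resolve or speculate that dependency at runtime:

```c
#include <stddef.h>

// Loop-carried dependency (a simple first-order recurrence / IIR filter):
// y[i] depends on y[i-1], so the iterations can't simply be mapped onto
// SIMD lanes the way an independent loop can.
void recurrence(float *y, const float *x, float a, size_t n) {
    for (size_t i = 1; i < n; i++)
        y[i] = a * y[i - 1] + x[i];
}
```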
 

FlameTail

Diamond Member
Dec 15, 2021
3,951
2,376
106
I'm surprised that Apple hasn't added some kind of SMT yet. That's probably one of the best ways to keep a wider architecture busy without requiring too many additional transistors.
Apple doesn't need to. Their cores are OoO execution monsters. OoO ensures good utilisation of resources.
 

FlameTail

Diamond Member
Dec 15, 2021
3,951
2,376
106
Let us discuss Apple's OLED MacBooks here 👇

 

Mopetar

Diamond Member
Jan 31, 2011
8,113
6,768
136
Apple doesn't need to. Their cores are OoO execution monsters. OoO ensures good utilisation of resources.

Every major CPU for general-purpose use is OoO these days. Maybe Apple has larger buffers than some other chips, but for some workloads performance will be bound by whichever type of execution port is fully occupied. If a workload is load/store heavy, that means there are ALUs that aren't getting any use. SMT makes it easier to avoid that kind of situation because another thread could use those idle ALUs.

I think Apple could do a better job than AMD/Intel because they write the operating system as well. Maybe it's just not something they ever thought to add for their phones, where it's mainly a single app executing at a time and anything in the background is restricted in what it can do, but now that they're designing desktop-class chips as well, there's a lot more reason to consider an implementation.
 

Doug S

Platinum Member
Feb 8, 2020
2,784
4,746
136
Every major CPU for general-purpose use is OoO these days. Maybe Apple has larger buffers than some other chips, but for some workloads performance will be bound by whichever type of execution port is fully occupied. If a workload is load/store heavy, that means there are ALUs that aren't getting any use. SMT makes it easier to avoid that kind of situation because another thread could use those idle ALUs.

I think Apple could do a better job than AMD/Intel because they write the operating system as well. Maybe it's just not something they ever thought to add for their phones, where it's mainly a single app executing at a time and anything in the background is restricted in what it can do, but now that they're designing desktop-class chips as well, there's a lot more reason to consider an implementation.

SMT is probably pointless in a world where you have big and little cores. How would it benefit Apple to add a second thread to its P cores when they have very capable E cores just sitting there, sipping power? That's where you want that thread to run. They aren't designing servers, they don't care as much about maximizing throughput for unlimited MT code as Intel/AMD do.

Isn't Intel dropping HT in its next gen CPUs? Right after they added their own capable E cores? I don't think that's a coincidence.
 

FlameTail

Diamond Member
Dec 15, 2021
3,951
2,376
106
2020: A14 = N5
2021: A15 = N5P
2022: A16 = N4
2023: A17 = N3B
2024: A18 = N3E
2025: A19 = N3P
2026: A20 = N2+BSPDN
2027: A21 = N2P
2028: A22 = A14 (TSMC's A14 node, not the Apple A14)
2029: A23 = A14P

[Speculation]
 

richardskrad

Member
Jun 28, 2022
56
62
61
Holy, the M3 is already scoring about 1,000 points higher than the M1 in GB6 single-thread. My M1 Air is just as snappy as the day I bought it, and it's crazy that the M3 is that much faster while retaining Apple's efficiency lead. People give Tim Cook crap for a lot of things, but you can't deny that under his watch Apple has completely changed the laptop game. AMD and Intel still haven't caught up 4+ years later.
 

Mopetar

Diamond Member
Jan 31, 2011
8,113
6,768
136
SMT doesn't use up nearly as much space as a separate e-core. Obviously having a dedicated core is better than having to share one, but there are some processes that could get by without their own dedicated e-core just fine.

You do make fair points regarding Apple not caring about the server market where SMT shines. There are other reasons that they might not want to include it as having multiple threads can pollute caches or create security concerns.

Personally I think Intel is acting foolishly if they really are abandoning Hyperthreading going forward.
 
Jul 27, 2020
20,040
13,738
146
Personally I think Intel is acting foolishly if they really are abandoning Hyperthreading going forward.
Yeah, if they are so concerned about security or performance degrading for certain applications, they should push Microsoft to let the user create affinity profiles so the affected applications don't get put on HT virtual cores. I mean, they already made Microsoft include support for their crappy Thread Director that can't handle AVX-512 code.
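
Strictly as an illustration of what such an "affinity profile" boils down to, here's a minimal Win32 sketch, under the simplistic assumption that the two HT siblings of a core are enumerated as adjacent logical CPUs (a real tool would query the actual topology with GetLogicalProcessorInformationEx instead):

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    // Keep only the first logical CPU of each assumed sibling pair.
    // (A process affinity mask covers at most 64 logical CPUs.)
    DWORD_PTR mask = 0;
    for (DWORD cpu = 0; cpu < si.dwNumberOfProcessors && cpu < 64; cpu += 2)
        mask |= (DWORD_PTR)1 << cpu;

    // Restrict this process to those CPUs, i.e. one thread per physical core.
    if (!SetProcessAffinityMask(GetCurrentProcess(), mask)) {
        fprintf(stderr, "SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    printf("Process pinned to mask 0x%llx\n", (unsigned long long)mask);
    return 0;
}
```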
 

naukkis

Senior member
Jun 5, 2002
903
786
136
Yeah, if they are so concerned about security or performance degrading for certain applications, they should push Microsoft to let the user create affinity profiles so the affected applications don't get put on HT virtual cores.

You do know that what you're proposing amounts to disabling HT. Splitting each core into two virtual cores = HT on; one thread per core = HT off. HT can of course be "disabled" by parking one virtual core of each pair, but to get 100% single-thread performance HT has to be disabled entirely, since some hardware resources are statically split between the two threads whenever HT is enabled. Gamers should disable HT for best performance when the CPU has enough threads for the current game without it.
 
Jul 27, 2020
20,040
13,738
146
I think if Intel were serious about it, they could devise some form of SMT that avoids sharing resources when the second thread isn't in use, or that can be disabled virtually. I think it shows Intel's hypocrisy that they went to the trouble of creating Thread Director for proper E-core utilization when they could've put the same effort into ensuring proper HT utilization (use it when it's beneficial, prevent it when it's not). It's like they just gave up and said, we're gonna keep on adding E-cores! Yeah, ok fine. Give us 32 or 64 E-cores then!

Maybe that's too much? OK, how about giving us 16 Skymont E-cores plus 32 additional shrunken Tremont cores!
 

naukkis

Senior member
Jun 5, 2002
903
786
136
I think if Intel were serious about it, they could devise some form of SMT that avoids sharing resources when the second thread isn't in use, or that can be disabled virtually. I think it shows Intel's hypocrisy that they went to the trouble of creating Thread Director for proper E-core utilization when they could've put the same effort into ensuring proper HT utilization (use it when it's beneficial, prevent it when it's not). It's like they just gave up and said, we're gonna keep on adding E-cores! Yeah, ok fine. Give us 32 or 64 E-cores then!

Maybe that's too much? OK, how about giving us 16 Skymont E-cores plus 32 additional shrunken Tremont cores!

In a hybrid CPU configuration the big cores are there for the best per-thread performance. If they still want to use SMT, the right cores to have it on are the E-cores - splitting a slow core's performance in half for better n-thread throughput while the big cores still maintain good single-thread performance. Having SMT on the fast cores is just a stupid thing to do.
 

SpudLobby

Senior member
May 18, 2022
989
680
106
SMT is probably pointless in a world where you have big and little cores. How would it benefit Apple to add a second thread to its P cores when they have very capable E cores just sitting there, sipping power? That's where you want that thread to run. They aren't designing servers, they don't care as much about maximizing throughput for unlimited MT code as Intel/AMD do.

Isn't Intel dropping HT in its next gen CPUs? Right after they added their own capable E cores? I don't think that's a coincidence.
Exactly.

Yes, the MT throughput gain is there for very minimal logic (something like ~5-10% die area, apparently closer to the lower end), but validation is a PITA, security is an issue, and the gains vary by workload and are more meaningful for servers. Scheduling probably becomes a lot more painful with P + E and SMT too. We'll be better off without it in client, and that's where things are headed.

Performant, efficient ST plus extremely area- and energy-efficient little cores for low-QoS tasks is the way to go for anything from a phone to a tablet, headset, or laptop.
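
On Apple's side that split is already exposed to developers through QoS classes - here's a hedged sketch using the plain-C GCD API, where the scheduler generally steers background-QoS work onto the E cores:

```c
#include <dispatch/dispatch.h>
#include <stdio.h>

// Low-priority housekeeping; macOS generally schedules
// QOS_CLASS_BACKGROUND work onto the efficiency cores.
static void background_work(void *ctx) {
    printf("background task done\n");
    dispatch_semaphore_signal((dispatch_semaphore_t)ctx);
}

int main(void) {
    dispatch_semaphore_t done = dispatch_semaphore_create(0);
    dispatch_queue_t bg = dispatch_get_global_queue(QOS_CLASS_BACKGROUND, 0);

    dispatch_async_f(bg, done, background_work);
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    return 0;
}
```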
 
  • Like
Reactions: Eug

eek2121

Diamond Member
Aug 2, 2005
3,100
4,398
136
I knew of nowhere else to post this, and I am behind on this thread, but I needed to share this somewhere with an okay amount of anonymity. I was approached by Apple recruiters 3X recently (last one was 2 hours ago) about building/implementing an LLM, so they are most certainly planning something. I know there have been rumors and such, but I haven't really been following along since I had hoped they'd focus on "on device" solutions, but with the details I was given, they are definitely planning something bigger, and I doubt, at least based on the details I was given, that they are partnering with anyone.

That being said, these details did not come with an NDA, so take the info I provided for what it is worth. 2 of the jobs were pitched via direct Apple recruiters; today's offer was an indirect one. I turned all 3 of them down because I've exited the industry and am becoming a full-time consultant.
 
  • Like
Reactions: igor_kavinski

poke01

Platinum Member
Mar 8, 2022
2,205
2,803
106
I knew of nowhere else to post this, and I am behind on this thread, but I needed to share this somewhere with an okay amount of anonymity. I was approached by Apple recruiters 3X recently (last one was 2 hours ago) about building/implementing an LLM, so they are most certainly planning something. I know there have been rumors and such, but I haven't really been following along since I had hoped they'd focus on "on device" solutions, but with the details I was given, they are definitely planning something bigger, and I doubt, at least based on the details I was given, that they are partnering with anyone.

That being said, these details did not come with an NDA, so take the info I provided for what it is worth. 2 of the jobs were pitched via direct Apple recruiters; today's offer was an indirect one. I turned all 3 of them down because I've exited the industry and am becoming a full-time consultant.
So the rumours are that Apple will do the “on-device AI” themselves, and for cloud AI (most likely chatbots) they will partner with Google.

WWDC should give a clearer understanding of their moves
 

eek2121

Diamond Member
Aug 2, 2005
3,100
4,398
136
Apple is doomed
Not really. You should maybe study economics before proclaiming this. Intel is still the #1 chip maker for a reason.
Please, dear god, stop making the chip in the phones faster for no reason. It's already so hot, the battery can only hold so much electro-juices. What if I didn't have to charge the damn thing every day?
Uh, just because you own an Android phone...

EDIT: even with my 14th gen iPhone, I do not have to charge every day, and my phone is used 8+ hours a day. With that usage it lasts nearly 2 days without needing a recharge.
Zen5 is only just widening their core from the 4 wide design of Zen1 though.

They still have a lot of open road to explore before they are tapped out.
It is a lot greater than 40%. Look at a certain previous release, which may or may not be relevant to current ones....
Let us discuss Apple's OLED MacBooks here 👇

I am so happy the current OLED rumors are confirming this. Apple has no idea how much money I'm about to throw at them as a result...unless it is false (not likely).

The next gen chips won't be as amazing as most people are expecting, but they will be great.
 

poke01

Platinum Member
Mar 8, 2022
2,205
2,803
106
So you are fully prepared for the early adopter issues, cursing at them and feeling like you wasted your money but still convincing yourself that it was the best decision you ever made because Apple?
For their brand-new products like the Vision Pro, I would wait for Gen 3 or Gen 4.

For their new generation of OLED MacBooks, it should be fine, as they are trialling OLED this year in the iPads before they put it into MacBooks.

That kinda should tell you how new the oxide OLED tech is. It’s bound to have some issues on first gen.
 

Doug S

Platinum Member
Feb 8, 2020
2,784
4,746
136
So the rumours are that Apple will do the “on-device AI” themselves, and for cloud AI (most likely chatbots) they will partner with Google.

WWDC should give a clearer understanding of their moves

I would read that Google talk as 100% about AI search. They want to keep everything on device to the extent possible, but obviously you can't do internet search on device, so they want to see if Google will write them even more ridiculously large checks to be the default for AI-enhanced internet search.