Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,586
1,000
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 Gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from the GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants) across all iDevices, aside from occasional slight clock speed differences.

EDIT:

[Screenshot: M1 Pro / M1 Max configurations]

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, H.265 (HEVC), ProRes

M3 Family discussion here:

 

soresu

Platinum Member
Dec 19, 2014
2,657
1,858
136
Yeah, they are newish in that there are a few notable changes, but the bulk of the architecture is still the same. I hope we see a new ground-up architecture with the A18, same as what AMD is doing with Zen 5.
They are already sporting a super wide core - I don't think going significantly wider a la Zen 5 is going to gain them much at this point.

Beyond a certain width you hit diminishing returns.

Going from 4-wide to 6-wide nets a bigger gain than going from 6 to 8, let alone 8 to 10, even though each step widens by the same amount.

Unless they can somehow architect a 13-16 wide µArch without an explosive power and area increase, but that seems like a stretch.

Unless some big breakthrough in CPU design happens I think we will see perf hit a hard wall without a drastic change to the underlying hardware device and materials, perhaps something like antiferromagnetic or photonic logic with topological insulator based metal layers.
 
  • Like
Reactions: Apokalupt0

soresu

Platinum Member
Dec 19, 2014
2,657
1,858
136
Not always true. Esp. if they manage to enhance the accompanying blocks.

The point was purely about throwing silicon at the problem.

The impact of diminishing returns is already starting to be a buzzkill for Apple I imagine.

Given that ARM Ltd's best 4-wide CPU core far outperforms Apple's initial 6-wide design, that point should be kinda obvious by now to anyone paying attention.

On that note, are the A7xx cores still 4 wide?

Anyone got the spec sheet on this? I seem to remember a Google Docs thing floating around some time ago.

If so I wonder if Chaberton/A730 will continue to be 4 wide.
 

Doug S

Platinum Member
Feb 8, 2020
2,254
3,487
136
Not always true. Esp. if they manage to enhance the accompanying blocks.

Doesn't matter, there are always diminishing returns for widening, because not all code has sufficient parallelism. Doesn't help as much to go from 8 to 10 wide if even under ideal circumstances the code you're running only exceeds 8 instructions that can be issued/retired at once 10 or 20 percent of the time. But maybe when you went from 6 to 8 it was 20 to 30 percent that could benefit.
 
  • Like
Reactions: Mopetar and Ajay

naukkis

Senior member
Jun 5, 2002
705
576
136
Doesn't matter, there are always diminishing returns for widening, because not all code has sufficient parallelism. Doesn't help as much to go from 8 to 10 wide if even under ideal circumstances the code you're running only exceeds 8 instructions that can be issued/retired at once 10 or 20 percent of the time. But maybe when you went from 6 to 8 it was 20 to 30 percent that could benefit.

There's something that might give good results from very wide cores but isn't yet utilized: hardware loop unrolling. It's complex to do, but once done it makes it possible to run every iteration of a loop on its own slice of the hardware, making good use of very wide execution resources. Proper ISA support would make implementing that kind of parallelism much easier, though.
 
  • Like
Reactions: soresu
Jul 27, 2020
16,208
10,261
106
There's something that might give good results from very wide cores but isn't yet utilized: hardware loop unrolling. It's complex to do, but once done it makes it possible to run every iteration of a loop on its own slice of the hardware, making good use of very wide execution resources. Proper ISA support would make implementing that kind of parallelism much easier, though.
Sounds intriguing. On that note, why don't compilers automatically emit SIMD code for loops where it's "obvious" that the task can be parallelized? Or how about generating executable code with its own virtual machine that analyzes the code as it runs? There would be overhead for small inputs, but given a large input the VM would "see" that execution is taking too long, pause it, parallelize the loop across multiple threads, and then resume from where it left off.
 

FlameTail

Platinum Member
Dec 15, 2021
2,239
1,209
106
Doesn't matter, there are always diminishing returns for widening, because not all code has sufficient parallelism. Doesn't help as much to go from 8 to 10 wide if even under ideal circumstances the code you're running only exceeds 8 instructions that can be issued/retired at once 10 or 20 percent of the time. But maybe when you went from 6 to 8 it was 20 to 30 percent that could benefit.
So if making the core wider will only bring diminishing returns, what are they gonna do!?

Are IPC gains dead?
 

Apokalupt0

Junior Member
Feb 14, 2024
10
10
41
The point was purely about throwing silicon at the problem.

The impact of diminishing returns is already starting to be a buzzkill for Apple I imagine.

Given that ARM Ltd's best 4-wide CPU core far outperforms Apple's initial 6-wide design, that point should be kinda obvious by now to anyone paying attention.

On that note, are the A7xx cores still 4 wide?

Anyone got the spec sheet on this? I seem to remember a Google Docs thing floating around some time ago.

If so I wonder if Chaberton/A730 will continue to be 4 wide.
The A715 went from 4-wide to 5-wide.
 
  • Like
Reactions: soresu