
Discussion: Apple Silicon SoC thread


Eug

Lifer
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 Gigapixels/s
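As a sanity check, the headline GPU figures above hang together if one assumes 8 ALUs per execution unit and a clock around 1.28 GHz (neither number is stated above; both are assumptions):

```python
# Back-of-envelope check that the M1 GPU numbers are self-consistent.
# The 8-ALUs-per-EU figure and the ~1.28 GHz clock are outside assumptions,
# not stated in the post.
eus = 128              # execution units (8-core GPU)
alus_per_eu = 8        # assumed ALU width per EU
clock_ghz = 1.278      # assumed GPU clock (GHz)

alus = eus * alus_per_eu                 # 1024 FP32 lanes
tflops = alus * 2 * clock_ghz / 1000     # FMA = 2 FLOPs per cycle
print(f"{tflops:.2f} TFLOPS")            # ~2.62, matching the quoted 2.6
```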

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265 (HEVC), ProRes
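For what it's worth, the 100 GB/s figure falls straight out of LPDDR5-6400 on a 128-bit bus (both assumed here, not stated above):

```python
# Sanity check of the M2's ~100 GB/s memory bandwidth, assuming LPDDR5-6400
# on a 128-bit bus (8 x 16-bit channels) -- neither spec is stated in the post.
transfer_rate_mtps = 6400          # mega-transfers per second
bus_width_bits = 128
bytes_per_transfer = bus_width_bits / 8

bandwidth_gbps = transfer_rate_mtps * bytes_per_transfer / 1000
print(f"{bandwidth_gbps:.1f} GB/s")   # 102.4, i.e. the quoted "100 GB/s"
```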

M3 Family discussion here:


M4 Family discussion here:


M5 Family discussion here:

 
Was that on an iGPU or a dGPU? When you have a single memory pool, your RAM is also your VRAM, and VRAM matters now even when you browse the web or open Chromium-based apps.
That was a dGPU, but I've played on an 8GB single-channel system myself, running SC2 with a mining app in the background. I'm an enthusiast, but I also remember when everyone thought 1280x1024 at 30 fps was fine. We're actually very adaptable, as long as we're willing to be. There might come a day when we have no more new computers, and by then any computer would be valued.
Presumably, someone bought that about 15 years ago and wisely avoided the cheapo 4GB SB laptops to allow such longevity. Maybe not.
Desktop. I know from personal experience enabling all C and P states that a significant (but not all) part of the responsiveness difference between laptop and desktop is due to those states. It was noticeable enough that, in a desktop form factor, it wasn't acceptable just to save maybe 5 W.

If it wasn't for the laptop breaking due to a fall my mother would still be using an AMD A10 laptop.

The biggest point is that this is an Apple device going for new Core i3 Windows prices. Unless they mess this up badly, they will take a nice amount of share. Between Apple taking share and RAM/SSD pricing likely outright bankrupting a few vendors, PC makers are going to have a hard time.
 
The SoC is used in phones, so it probably doesn't have the PC-class storage performance of today's NVMe SSDs (and the complicated thing Apple uses in M SoCs for their petty reasons). Various latencies and IOPS limits likely add up when running a heavy OS.
It's the same thing Apple uses in M series SoCs, because why would it be any different? As far as I know the only difference is that M series chips have more flash memory channels than A series.

Your perceptions / prejudices about Apple's SSD tech are a little strange. Do you not know that it's literally NVMe plus a couple Apple extensions? The NVMe controller here isn't a separate chip made by Phison or whomever, instead it's an Apple designed controller integrated into the main SoC. That's why the SoC only provides for connecting flash memory to the SoC, and why you don't see a M.2 socket for connecting a standalone drive with its own NVMe controller.

Their reasons for doing this aren't "petty". The big headline feature they gain is tight integration of their NVMe controller with the Secure Enclave inside the same SoC, which results in a much better encryption architecture than is possible with generic NVMe's TCG Opal. (This is the reason why Apple had to extend the NVMe protocol a bit.) There's probably other improvements, too.

I stand corrected.

Technically it's 4 LPDDR channels. But why does the number of channels matter? What matters is the bandwidth at the end, right? Not the means used to achieve it.
Bandwidth is not the only thing that matters. Or, to be more precise, you can lose theoretical bandwidth due to other limitations, which I'm going to explain...

Let's say the reason why you have high bandwidth demands is that there's a really complicated SoC with a lot of memory agents, each generating its own memory request stream. Under these circumstances, memory controller channel count is a very important scaling resource. More channels equals more open pages, and more commands in flight. The more things you have generating memory requests, the more memory subsystem parallelism you need.

This is pretty much the entire reason why LPDDR4 and LPDDR5, the JEDEC standards designed primarily for phones, have narrower channels than desktop RAM standards. Phone SoCs are all unified memory systems, with a ton of bus agents. It's very common to have more agents generating memory requests than you'd see in a traditional performance desktop PC. Hence, LPDDR5's 16-bit wide channels - even phones really need four channels. If you built the same thing but with a single 64-bit wide channel, real-world bandwidth would drop relative to 4x16, just because you'd be much more likely to suffer from stalls waiting for pages to close and open and there'd be less queue capacity.
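A toy model of that argument, with four agents each streaming through their own page (all timing numbers are made up for illustration, not real LPDDR parameters):

```python
# Toy model of the page-locality argument above: four agents each stream
# sequentially through their own DRAM page.  The page-open cost and request
# size are illustrative, not real LPDDR timings.
PAGE_OPEN_CYCLES = 10   # cost to close one row and open another
REQ_BYTES = 64
N_REQS = 1000           # requests per agent

def cycles_one_wide_channel():
    # One 64-bit (8 B/cycle) channel, one open page: the four interleaved
    # streams evict each other's row on every request.
    data_cycles = REQ_BYTES // 8
    return 4 * N_REQS * (PAGE_OPEN_CYCLES + data_cycles)

def cycles_four_narrow_channels():
    # Four 16-bit (2 B/cycle) channels, each with its own open page and its
    # own agent: one page open, then every request is a row hit, and the
    # four channels run in parallel.
    data_cycles = REQ_BYTES // 2
    return PAGE_OPEN_CYCLES + N_REQS * data_cycles

wide, narrow = cycles_one_wide_channel(), cycles_four_narrow_channels()
total_bytes = 4 * N_REQS * REQ_BYTES
print(f"1x64-bit: {total_bytes / wide:.2f} B/cycle")    # ~3.6
print(f"4x16-bit: {total_bytes / narrow:.2f} B/cycle")  # ~8.0
```

Same total bus width and data rate in both cases; the narrow-channel layout wins purely on memory-level parallelism.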
 
There's probably other improvements, too.
High speed storage controllers can throw off a lot of heat. Apple already has all the thermal management hanging off the SOC, so that's the best place to put that heat.

Bandwidth is not the only thing that matters. Or, to be more precise, you can lose theoretical bandwidth due to other limitations, which I'm going to explain...

Let's say the reason why you have high bandwidth demands is that there's a really complicated SoC with a lot of memory agents, each generating its own memory request stream. Under these circumstances, memory controller channel count is a very important scaling resource. More channels equals more open pages, and more commands in flight. The more things you have generating memory requests, the more memory subsystem parallelism you need.

This is pretty much the entire reason why LPDDR4 and LPDDR5, the JEDEC standards designed primarily for phones, have narrower channels than desktop RAM standards. Phone SoCs are all unified memory systems, with a ton of bus agents. It's very common to have more agents generating memory requests than you'd see in a traditional performance desktop PC. Hence, LPDDR5's 16-bit wide channels - even phones really need four channels. If you built the same thing but with a single 64-bit wide channel, real-world bandwidth would drop relative to 4x16, just because you'd be much more likely to suffer from stalls waiting for pages to close and open and there'd be less queue capacity.
It's worth noting that the camera is the driving factor for why Apple put NVMe storage in the iPhone; the main task on the device is streaming ~10 Gb/s off the camera sensors to compute, and about ⅓ of that to storage. I think if there's any opportunity to make that process lower power, Apple will go to pretty great lengths to take it, as there are a lot of users doing that level of recording for hours on end.
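Taking those figures at face value, the sustained write load works out like this:

```python
# Rough arithmetic on the post's figures: ~10 Gb/s off the sensors,
# about a third of that written to flash.
sensor_gbit_s = 10
to_storage_gbit_s = sensor_gbit_s / 3

write_gbyte_s = to_storage_gbit_s / 8          # bits -> bytes
per_hour_gbyte = write_gbyte_s * 3600
print(f"{write_gbyte_s:.2f} GB/s sustained")   # ~0.42 GB/s
print(f"{per_hour_gbyte:.0f} GB per hour")     # ~1500 GB of writes per hour
```

That's terabyte-scale flash writes per hour of recording, which is where sustained controller efficiency and power start to matter.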
 