Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,583
996
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4
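
As a rough sanity check on the GPU figures above, the listed numbers can be related to each other with a little arithmetic. This is only a sketch: the 8 FP32 lanes per execution unit and 2 FLOPs per lane per clock (one fused multiply-add) are assumptions, not figures from the spec list, and the resulting clock is an implied value rather than an official one.

```python
# Back-of-the-envelope check of the M1 GPU figures listed above.
# ASSUMPTIONS (not from the post): 8 FP32 lanes per execution unit and
# 2 FLOPs per lane per clock (one fused multiply-add).
EXECUTION_UNITS = 128          # from the post
MAX_THREADS = 24576            # from the post
PEAK_TFLOPS = 2.6              # from the post
LANES_PER_EU = 8               # assumption
FLOPS_PER_LANE_PER_CLOCK = 2   # assumption (FMA)


def implied_clock_ghz(tflops, eus, lanes_per_eu, flops_per_lane):
    """Clock (GHz) that would yield `tflops` under the assumptions above."""
    flops_per_clock = eus * lanes_per_eu * flops_per_lane
    return tflops * 1e12 / flops_per_clock / 1e9


threads_per_eu = MAX_THREADS // EXECUTION_UNITS
clock = implied_clock_ghz(
    PEAK_TFLOPS, EXECUTION_UNITS, LANES_PER_EU, FLOPS_PER_LANE_PER_CLOCK
)
print(f"threads per EU: {threads_per_eu}")
print(f"implied GPU clock: {clock:.2f} GHz")
```

If the lane count or FLOPs-per-clock assumption is off, the implied clock scales accordingly, so treat the result as a consistency check on the listed specs only.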

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).

EDIT:

Screen-Shot-2021-10-18-at-1.20.47-PM.jpg

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, HEVC, ProRes

M3 Family discussion here:

 

name99

Senior member
Sep 11, 2010
404
303
136
Some pertinent points from the AnandTech article.


As mentioned already, Andrei seems to think the M1 TDP is somewhere a bit north of 20 Watts. That surprised me a little as I thought it would be below 20 Watts, but I'll defer to his expertise of course.

M1 gets full memory bandwidth with just a single core, at 58 GB/s read and 35 GB/s write. Bandwidth actually decreases somewhat as you add more cores.

119145.png


Ryzen 5950X wins for Cinebench R23 ST performance. M1 and Intel Core i7-1165G7 are effectively tied for second place.

119160.png


M1 wins at Geekbench 5 ST.

111168.png


M1 wins at SPEC2006 ST, both for int and for fp.

117493.png


Ryzen 5950X wins at SPECint2017, but M1 wins at SPECfp2017 ST.

Rosetta 2 performance ranges from 50-95% native according to SPEC2006 and SPEC2017 subtests, mostly in about the 70-80% range.

Poor little Graviton2 :-(
Hopefully he gets a huge shot of adrenaline at re:invent soon...
 

Hitman928

Diamond Member
Apr 15, 2012
5,177
7,628
136
Someone did a power profile under a multicore Cinebench load: exactly 15W for the CPU cores only.

Andrei and Ars ran their tests on a Mac mini, which makes measuring power use easier (no screen), whereas that link used an MBP. However, if I am reading the power sensor program in the Twitter screenshot correctly, it shows 15W with the big cores at 3 GHz. At 3.2 GHz it would use more (again, I'm not familiar with how that power program works on Macs). Adding at least 1/3 more power to the big cores does seem like a lot for a frequency bump of less than 10%, but if they are already on the steep part of the f/v curve, it's possible. If it really is 15W with the big cores at 3.2 GHz, there is a decent amount of power unaccounted for in the Mac mini that something is using when the CPU cores are active versus idle.
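
To put a number on the f/v curve remark, here is a minimal sketch assuming the textbook dynamic power model P ∝ f × V² and ignoring leakage. The model is an assumption for illustration, not something the power utility or the review reports. It estimates how much the voltage would have to rise for a 3.0 to 3.2 GHz bump to cost one third more power.

```python
import math


def required_voltage_ratio(power_ratio, freq_ratio):
    """Voltage ratio implied by P ∝ f * V^2 (dynamic power only, no leakage)."""
    return math.sqrt(power_ratio / freq_ratio)


freq_ratio = 3.2 / 3.0     # 3.0 GHz -> 3.2 GHz, about a 6.7% bump
power_ratio = 4.0 / 3.0    # "at least 1/3" more power
v_ratio = required_voltage_ratio(power_ratio, freq_ratio)

print(f"frequency increase: {(freq_ratio - 1) * 100:.1f}%")
print(f"implied voltage increase: {(v_ratio - 1) * 100:.1f}%")
```

Under this simplified model the extra power would correspond to a voltage increase of roughly 12%, which is plausible only if the cores are already running near the top of their frequency/voltage curve.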
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Andrei and Ars ran their tests on a Mac mini, which makes measuring power use easier (no screen), whereas that link used an MBP. However, if I am reading the power sensor program in the Twitter screenshot correctly, it shows 15W with the big cores at 3 GHz. At 3.2 GHz it would use more (again, I'm not familiar with how that power program works on Macs). Adding at least 1/3 more power to the big cores does seem like a lot for a frequency bump of less than 10%, but if they are already on the steep part of the f/v curve, it's possible. If it really is 15W with the big cores at 3.2 GHz, there is a decent amount of power unaccounted for in the Mac mini that something is using when the CPU cores are active versus idle.

Wouldn't be surprising if all the better binned chips (for power consumption) went to the laptops while the Mac Minis were left with ones that had to use more voltage for the same clocks.

The CPU cores get all the spotlight but the integrated GPU performance also kicks butt on this thing, and that's Rosetta performance not native (!).
119359.png

119360.png
 

Hitman928

Diamond Member
Apr 15, 2012
5,177
7,628
136
Wouldn't be surprising if all the better binned chips (for power consumption) went to the laptops while the Mac Minis were left with ones that had to use more voltage for the same clocks.

There shouldn't be a difference of over 33% between bins though, that's a crazy amount of variation. Based on the Mac Mini power numbers, it's really hard to arrive at the total system consumption from a 15W SoC TDP, unless it's 15W core power only and there's an additional 5+ watts from the uncore. I don't know, it'd be nice to know what that power utility is actually reporting and how.
 

Hitman928

Diamond Member
Apr 15, 2012
5,177
7,628
136
I can't find where that 15W figure comes from. I only see some 1294 mW reported for the "E-cluster power".

There are two screenshots, one for E-cluster power (small cores) and one for P-cluster power (big cores). The E-cluster is using ~1.3W at 2 GHz while the P-cluster is using ~13.7W at 3 GHz. At least that's how I'm interpreting it.
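
The arithmetic behind the 15W figure can be sketched as follows, using the two cluster readings quoted above. Whether the utility's cluster numbers cover anything beyond the cores themselves is not known from the thread, so the per-core average is only an even split, not a measured value.

```python
# Cluster readings as quoted in the thread (watts).
E_CLUSTER_W = 1.294   # "some 1294 mW" for the efficiency cluster
P_CLUSTER_W = 13.7    # approximate reading for the performance cluster
P_CORES = 4

total_w = E_CLUSTER_W + P_CLUSTER_W
p_share = P_CLUSTER_W / total_w
avg_p_core_w = P_CLUSTER_W / P_CORES   # even split, an assumption

print(f"CPU cluster total: {total_w:.1f} W")
print(f"P-cluster share:   {p_share:.0%}")
print(f"average per P core: {avg_p_core_w:.2f} W")
```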
 

Viknet

Junior Member
Nov 14, 2020
9
10
51
Andrei and Ars ran their tests on a Mac mini, which makes measuring power use easier (no screen), whereas that link used an MBP. However, if I am reading the power sensor program in the Twitter screenshot correctly, it shows 15W with the big cores at 3 GHz. At 3.2 GHz it would use more (again, I'm not familiar with how that power program works on Macs). Adding at least 1/3 more power to the big cores does seem like a lot for a frequency bump of less than 10%, but if they are already on the steep part of the f/v curve, it's possible. If it really is 15W with the big cores at 3.2 GHz, there is a decent amount of power unaccounted for in the Mac mini that something is using when the CPU cores are active versus idle.

I think 3.0 GHz is the maximum when multiple cores are under load. It can reach 3.2 GHz only in single-core loads.
And the 15W is from the CPU cores only, without DRAM, GPU, regulators, etc.
 

Eug

Lifer
Mar 11, 2000
23,583
996
126
Video of app launches on a new M1 Mac (spoiler: it's pretty much instant):

I'd be curious what it would be like on an Intel MacBook Pro 16" fresh out of the box.

Cuz even on my Core m3 MacBook it's not too bad, as long as I don't try to launch MS Office 2016. I also remember being very pleased when I first got my iMac 27" in 2017. I did the same test and launched like a dozen apps and it didn't break a sweat, despite it being an i5 with no HyperThreading support. It wasn't quite as fast as that video, but it was impressive nonetheless.

I suspect 75% of the battle is down to SSD speed, and it's going to be faster too on a fresh install. I also think that's why my Core m3 MacBook is still quite decent after all these years. The PCIe SSD on my MacBook gets > 1 GB/s for transfer speeds. However, the latest M1 MacBook Pro gets over 3 GB/s, while the M1 MacBook Air gets 2.6 GB/s.

Indeed, I remember back in the day when Apple started selling Macs with PCIe SSDs. Everyone on the various forums was doing app launch icon bounce tests and marvelling at their new toys.
 

iwulff

Junior Member
Jun 3, 2017
24
7
81
Wouldn't be surprising if all the better binned chips (for power consumption) went to the laptops while the Mac Minis were left with ones that had to use more voltage for the same clocks.

The CPU cores get all the spotlight but the integrated GPU performance also kicks butt on this thing, and that's Rosetta performance not native (!).
119359.png

119360.png
It does kick ass... but most definitely because the competition never planned on creating something more powerful in a consumer iGPU. They are all constrained by current RAM bandwidth. AMD could have made an HBM2 SoC or a consumer product based on the Xbox/PS5 SoCs, but they didn't. The GPU of the M1 is impressive, but it also shines because the competition never cared about this, which to me is a shame. I definitely would have liked an HBM2 SoC with RDNA2 and Zen 3 in it. Perhaps this move from Apple will entice them. The closest we got was a joint effort from Intel and AMD.

The Rosetta performance is probably the same as if it were native to the platform (that's my assumption), since it's presumably bottlenecked by RAM bandwidth as well, just at a later stage since it's unified RAM. But who knows... perhaps it would jump another 10-20%.
 

Hitman928

Diamond Member
Apr 15, 2012
5,177
7,628
136
I think 3.0 GHz is the maximum when multiple cores are under load. It can reach 3.2 GHz only in single-core loads.
And the 15W is from the CPU cores only, without DRAM, GPU, regulators, etc.

I think the all core load frequency is a bit in question. Andrei said he couldn't confirm the frequency of the Mac Mini at all core load (though he suspected it might be 3 GHz). The screenshot on twitter shows the 3.2GHz frequency bin with 100% residency across all 4 big cores. It'd be nice to get some confirmation here.

As far as TDP calculation, my calculation was for CPU cores only (or rather SOC) taking out expected values for DRAM, GPU (idle), VRM, etc. I'm guessing that at least part of the difference (if not most) between the power utility consumption number and my calculation is that the power utility is showing cores only and there's at least a few watts being used by the uncore/SOC that it's not showing.
 

naukkis

Senior member
Jun 5, 2002
701
568
136
It does kick ass... but most definitely because the competition never planned on creating something more powerful in a consumer iGPU. They are all constrained by current RAM bandwidth.

That's not a valid argument since Apple, Intel and AMD all use 128-bit LPDDR4X memory in their laptop chips.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
So when are we going to get some people compiling FOSS benchmarks on a Mac Mini? The Cinebench R23 results do look pretty good though! Finally something meatier than GB5 and SPEC numbers . . .
 

iwulff

Junior Member
Jun 3, 2017
24
7
81
That's not a valid argument since Apple, Intel and AMD all use 128-bit LPDDR4X memory in their laptop chips.
Well, the bottleneck is real for any iGPU; even Apple can't do magic. That's also a main reason why AMD didn't scale theirs up even further, besides the market potentially being too small... so what's the catch? Does the M1 have quad DDR4 channels with advanced compression? Otherwise they would have hit a bottleneck much sooner and not been able to hit those numbers.
 

name99

Senior member
Sep 11, 2010
404
303
136
I'd be curious what it would be like on an Intel MacBook Pro 16" fresh out of the box.

Cuz even on my Core m3 MacBook it's not too bad, as long as I don't try to launch MS Office 2016. I also remember being very pleased when I first got my iMac 27" in 2017. I did the same test and launched like a dozen apps and it didn't break a sweat, despite it being an i5 with no HyperThreading support. It wasn't quite as fast as that video, but it was impressive nonetheless.

I suspect 75% of the battle is down to SSD speed, and it's going to be faster too on a fresh install. I also think that's why my Core m3 MacBook is still quite decent after all these years. The PCIe SSD on my MacBook gets > 1 GB/s for transfer speeds. However, the latest M1 MacBook Pro gets over 3 GB/s, while the M1 MacBook Air gets 2.6 GB/s.

Indeed, I remember back in the day when Apple started selling Macs with PCIe SSDs. Everyone on the various forums was doing app launch icon bounce tests and marvelling at their new toys.

Don't ascribe too much to the SSD speed; it's just not that different.
Remember that most (all?) recent Macs have the T2 providing their SSD controller, and Apple pays for high-quality fast flash (and then gets complaints that their flash isn't as cheap as the crap used in USB sticks...)

BlackMagic-Disk-Speed.png
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,672
136
I suspect 75% of the battle is down to SSD speed, and it's going to be faster too on a fresh install.
And caching. One of the reasons I tell people to not underestimate the RAM needs on a modern machine is the effect it has in keeping more relevant data in a faster memory pool. It works on smartphones, it works on computers.

My home machine got a recent upgrade to 32GB due to changes in work-related tasks, and the OS is currently using 16GB for caching.
 

Eug

Lifer
Mar 11, 2000
23,583
996
126
Don't ascribe too much to the SSD speed; it's just not that different.
Remember that most (all?) recent Macs have the T2 providing their SSD controller, and Apple pays for high-quality fast flash (and then gets complaints that their flash isn't as cheap as the crap used in USB sticks...)

BlackMagic-Disk-Speed.png
My point was that even my iMac i5 in 2017 did very well with this app load test when it was brand new out of the box, so I think a lot of it has to do with the SSD speed.

My other point though was that this gets worse over time. It's great with a fresh install and newly installed apps, but with a machine that has been in active use for a couple of years, the load times clearly increased, at least on my machine.

IOW, it's true that M1 is fast, but I suspect a large part of the great app loading performance in that video has to do with the fact that it's a brand-spankin' new machine with freshly installed apps, on a super fast SSD. Yes, the CPU matters, but that may not be the main bottleneck here.

I think the best example of this was when I was playing with some new OS installs. I ended up doing a clean install of Mojave and re-installing the apps, and all of a sudden, everything was screaming fast, including app loading. I then restored my original backup to the same computer, and the increased load times I was having before came back. So there was some sort of cruft on there that was slowing things down and I never figured out what it was. Note though the slower version of the install did have about 300 GB of photos and videos on it, so that couldn't have helped. (1 TB SSD.) The new clean OS installs were less than 30 GB even after all the applications were installed.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Well, the bottleneck is real for any iGPU; even Apple can't do magic. That's also a main reason why AMD didn't scale theirs up even further, besides the market potentially being too small... so what's the catch? Does the M1 have quad DDR4 channels with advanced compression? Otherwise they would have hit a bottleneck much sooner and not been able to hit those numbers.

Nope, it's seemingly 'magic' in the Clarke form of sufficiently advanced technology :)

From the main review here:
"Besides the additional cores on the part of the CPUs and GPU, one main performance factor of the M1 that differs from the A14 is the fact that’s it’s running on a 128-bit memory bus rather than the mobile 64-bit bus. Across 8x 16-bit memory channels and at LPDDR4X-4266-class memory, this means the M1 hits a peak of 68.25GB/s memory bandwidth."

I guess that the packaging of the ram so close to the SoC probably helps quite a bit? Then a bunch of other things.

These chips have much more bandwidth than they've had to work with in their phones, so they've no doubt got very good indeed at working round low bandwidths. They might well also be about at their maximum - the next round of bigger chips might well go quad memory, LPDDR5 etc and then.....
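
The 68.25 GB/s peak in the quoted review follows from the bus width and transfer rate. A minimal sketch of that arithmetic, also comparing it with the 58 GB/s single-core read figure mentioned earlier in the thread. The 64-bit line assumes the same 4266 MT/s transfer rate on the mobile bus, which the quote does not state.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s):
    """Peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


CHANNELS = 8                 # from the quoted review
CHANNEL_WIDTH_BITS = 16      # from the quoted review
TRANSFER_RATE_MT_S = 4266    # LPDDR4X-4266-class

m1_bus_bits = CHANNELS * CHANNEL_WIDTH_BITS            # 128-bit bus
m1_peak = peak_bandwidth_gb_s(m1_bus_bits, TRANSFER_RATE_MT_S)
# Assumption: a 64-bit mobile bus running at the same transfer rate.
mobile_peak = peak_bandwidth_gb_s(64, TRANSFER_RATE_MT_S)

SINGLE_CORE_READ_GB_S = 58.0  # figure quoted earlier in the thread
read_fraction = SINGLE_CORE_READ_GB_S / m1_peak

print(f"M1 peak:      {m1_peak:.2f} GB/s")
print(f"64-bit peak:  {mobile_peak:.2f} GB/s")
print(f"single-core read reaches {read_fraction:.0%} of peak")
```

Note that this is the same 128-bit LPDDR4X configuration other laptop chips use, so the raw bandwidth alone does not explain the GPU results.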
 

Eug

Lifer
Mar 11, 2000
23,583
996
126
And caching. One of the reasons I tell people to not underestimate the RAM needs on a modern machine is the effect it has in keeping more relevant data in a faster memory pool. It works on smartphones, it works on computers.

My home machine got a recent upgrade to 32GB due to changes in work-related tasks, and the OS is currently using 16GB for caching.
Yes caching. Good point. But was that test cached? If so, then that totally negates the test. (I didn't check.)

Anyhow, I like having 16 GB on my old MacBook. :) The caching makes MS Office reloads much more tolerable. Plus it helped with the memory usage (leak?) with PowerPoint on my 24 GB iMac. I was making a PowerPoint presentation a couple of years ago all day long and noticed things started to slow down. Eventually I checked the memory usage and it turned out PowerPoint alone was using 7 GB. I relaunched PowerPoint and it was back down to a few hundred MB.
 

jeanlain

Member
Oct 26, 2020
149
122
86
Well, the bottleneck is real for any iGPU; even Apple can't do magic. That's also a main reason why AMD didn't scale theirs up even further, besides the market potentially being too small... so what's the catch? Does the M1 have quad DDR4 channels with advanced compression? Otherwise they would have hit a bottleneck much sooner and not been able to hit those numbers.
Isn't that what the TBDR GPU design is all about? Use tile memory (apparently, some huge cache) and avoid reading from RAM as much as possible.
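
To illustrate why keeping work in tile memory can save so much bandwidth, here is a deliberately crude toy model. Every number in it is hypothetical (resolution, overdraw, frame rate, bytes per pixel), and it ignores caches, early depth rejection, compression and texture traffic, so it is a worst-case illustration of the idea rather than a description of how any real GPU behaves.

```python
# Toy framebuffer-traffic model. All parameters are hypothetical.
WIDTH, HEIGHT = 2560, 1600   # hypothetical render resolution
FPS = 60                     # hypothetical frame rate
OVERDRAW = 3                 # hypothetical average fragments per pixel
BYTES_COLOR = 4              # RGBA8 color
BYTES_DEPTH = 4              # 32-bit depth


def immediate_mode_bytes_per_s():
    """Worst case: every fragment reads and writes depth and writes color in DRAM."""
    per_fragment = BYTES_DEPTH * 2 + BYTES_COLOR
    return WIDTH * HEIGHT * OVERDRAW * per_fragment * FPS


def tile_based_bytes_per_s():
    """Depth stays in on-chip tile memory; only final color is written once per pixel."""
    return WIDTH * HEIGHT * BYTES_COLOR * FPS


imr = immediate_mode_bytes_per_s()
tbdr = tile_based_bytes_per_s()
ratio = imr / tbdr

print(f"immediate-mode (worst case): {imr / 1e9:.2f} GB/s")
print(f"tile-based:                  {tbdr / 1e9:.2f} GB/s")
print(f"ratio: {ratio:.1f}x")
```

The exact ratio depends entirely on the made-up inputs; the point is only that framebuffer traffic scales with overdraw in the first case and not in the second.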