Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,583
996
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:

Screen-Shot-2021-10-18-at-1.20.47-PM.jpg

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265 (HEVC), ProRes

M3 Family discussion here:

 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
I know it was widely reported SVE2 was mandatory, but that's 100% wrong. It is definitely optional, and says nothing about having to report as ARMv8 if it lacks SVE2.
While that ARM documentation states SVE2 is optional, it is NOT optional in the real world. An ARMv9-A core lacking SVE2 will be reported as ARMv8-A and downgraded to the ARMv8-A path.

Tool suites automatically assume SVE/SVE2 support for ARMv9-A. There has never been an optional SVE/SVE2 setting for ARMv9 cores. So the softly spoken document says optional, but the real-world big stick says it is mandatory. Failure to have SVE/SVE2 in ARMv9A downgrades the processor to ARMv8A.

Thus, SVE/SVE2 ARMv9A is required/mandatory/compulsory or the processor will be required to report ARMv8A status rather than ARMv9A status.

No end-user(laptop/smartphone) ARMv9A solution will lack SVE/SVE2, ever.

There is another ARMv9 datasheet which indicates SVE2 is mandatory. SVE/SVE2 is a base requirement for ARMv9.x and greater. This is reflected in the most recent ARMv9 overview (ARMv4 -> ARMv9) listing Enhanced Vector Processing, while prior art didn't include it. That technical document probably hasn't been updated to reflect ARM's mandatory inclusion of SVE/SVE2.

This is the base emphasis:
arch value: ‘armv9-a’ | Architecture: Armv9-A | Includes: ‘armv8.5-a’, ‘+sve’, ‘+sve2’

  • Armv9-A maps to Armv8.5-A.
  • Armv9.1-A maps to Armv8.6-A.
  • Armv9.2-A maps to Armv8.7-A.
  • The SVE2 extension is enabled by default on these architectures.
2021:
Extension - SVE/SVE2 - Enabled by default = No
2022:
Extension - SVE/SVE2 - Enabled by default = Armv9-A or later

ARMv9-A across everything has SVE/SVE2 included. While FEAT_SVE+FEAT_SVE2 was optional for custom cores, it is no longer optional for them.

Disabling SVE/SVE2 with "+nosve" will degrade the processor to ARMv8.5A. The legit machine register/code path for reporting the ARMv(x) version if SVE/SVE2 is missing is to report as ARMv8.5A.

Assuming ARMv9A
-march=apple-a16(orwhatever)
+ARMv9A +AppleInstructions within Xcode.
ARMv9A includes mandatory SVE/SVE2.
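
For anyone who would rather check what a core actually advertises than what the toolchain assumes, here is a minimal sketch. It assumes Linux on AArch64, where the `Features` line of `/proc/cpuinfo` lists capability names such as `sve` and `sve2`. The function name `sve_flags` and the sample text are hypothetical, made up for illustration only.

```python
def sve_flags(cpuinfo_text: str) -> dict:
    """Report whether 'sve' and 'sve2' appear in the first Features line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "features":
            feats = set(value.split())
            return {"sve": "sve" in feats, "sve2": "sve2" in feats}
    # No Features line found: treat both as absent.
    return {"sve": False, "sve2": False}


# Hypothetical sample, not taken from any real machine.
sample = "processor\t: 0\nFeatures\t: fp asimd aes sve sve2\n"
print(sve_flags(sample))
```

On a real Linux box you would pass in the contents of `/proc/cpuinfo` instead of the sample string. This only tells you what the kernel exposes, not which architecture version the vendor claims.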
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,222
5,224
136
BTW I don't understand why people insist on SVE1. This has never been a feature for client CPU, it's an HPC thing.

I don't, but I have seen many act like SVE2 is some kind of game changer.

If that was true, then SVE1 would have been a lot more common.

I suspect Apple will stay at ARMv8 and use GPU for big SIMD loads.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
If that was true, then SVE1 would have been a lot more common.

SVE1 was a lot narrower in focus:


SVE specifically was more aimed at ML and HPC applications. It was not included in standard ARM-licensed core designs mostly because it was Fujitsu's creation. You could still support it, but as I said earlier, it was pretty niche.
 

Eug

Lifer
Mar 11, 2000
23,583
996
126

M2 vs M1
+12.9% - Cinebench R23 multicore (30 minute loop)
+4.3% - Cinebench R23 single core
+16.0% - Geekbench 5.3 multicore
+12.0% - Geekbench 5.3 single core
+43.1% - Geekbench OpenCL / Compute
+45.0% - Shadow of the Tomb Raider bench fps (1920x1200, highest setting)
+27.8% - PugetBench Premiere Pro
-11.9% - Xcodebenchmark time (This is measured in seconds, so lower is better.)
+30.8% - 4K export time (This is measured in seconds so higher is worse. M2 did worse.)
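
Since the last two entries are time deltas while the rest are score deltas, a quick conversion puts them on the same footing. This is only a sketch: the helper name `time_delta_to_speed_delta` is made up here, and it assumes throughput is simply the inverse of elapsed time.

```python
def time_delta_to_speed_delta(time_delta_pct: float) -> float:
    """Convert a percent change in elapsed time to a percent change in speed."""
    new_time = 1.0 + time_delta_pct / 100.0  # new time relative to old
    return (1.0 / new_time - 1.0) * 100.0    # speed is the inverse of time


for label, delta in [("Xcodebenchmark", -11.9), ("4K export", 30.8)]:
    print(f"{label}: {time_delta_to_speed_delta(delta):+.1f}% speed")
```

Under that assumption, the Xcode result works out to roughly 13.5% faster and the 4K export to roughly 23.5% slower.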

So, in summary, M2 did incrementally better than M1 in CPU, and much better in GPU. However, there was one test - 4K export - where M2 actually did significantly worse than M1. I'm not sure what that 4K test does though as it wasn't spelled out in the article. However, aside from single-core, the jump in performance from M2 to M1 Pro* is significantly bigger than the jump from M1 to M2. However, battery life of the M2 machine was noticeably better than the M1 Pro.

*They talk about the 8-core M1 Pro at the beginning of the article, but it seems they are not actually testing that model. Judging by the benchmarks, they appear to be testing the 10-core M1 Pro, so that's a bit misleading.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Looking forward to benchmarks on how M2's efficiency fares compared to M1. Apple's announcement just said same battery life but included a slightly bigger battery which may imply slightly lower efficiency.
 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136
Looking forward to benchmarks on how M2's efficiency fares compared to M1. Apple's announcement just said same battery life but included a slightly bigger battery which may imply slightly lower efficiency.

The Air changed from the 'wedge' design to a flat design so there should be a little more room for a bigger battery in that model. The MacBook Pro M2 is the same form factor as the M1 model so it is likely its battery size is unchanged.
 

ashFTW

Senior member
Sep 21, 2020
302
225
96
Question regarding M1 Pro/Max die and mask set:

We know that M1 Pro is a “chop” of the M1 Max. Does that mean that the wafer is made with all M1 Max dies and that when individual chips are binned, due to defects or other reasons, some M1 Max dies are 1) physically sliced to produce the smaller die? Or is it that 2) the bottom part of the die is just fused off? Both these options seem quite wasteful, since M1 Pro is probably a higher volume part. Or 3) is the wafer made somehow using only the top portions of the M1 Max mask set, so that all chips on the wafer are M1 Pro? Can modern lithography machines do such things? Also, can they do mirror images from mask sets? This could be useful for the rumored 4xM Ultra part. And finally, 4) there are separate mask sets for M1 Max and M1 Pro. The top portion of M1 Max is almost identical to M1 Pro, so there are cost savings in mask making, but one can’t get around having separate mask sets.

Answers appreciated! Thanks!
 

Eug

Lifer
Mar 11, 2000
23,583
996
126
Question regarding M1 Pro/Max die and mask set:

We know that M1 Pro is a “chop” of the M1 Max. Does that mean that the wafer is made with all M1 Max dies and that when individual chips are binned, due to defects or other reasons, some M1 Max dies are 1) physically sliced to produce the smaller die? Or is it that 2) the bottom part of the die is just fused off? Both these options seem quite wasteful, since M1 Pro is probably a higher volume part. Or 3) is the wafer made somehow using only the top portions of the M1 Max mask set, so that all chips on the wafer are M1 Pro? Can modern lithography machines do such things? Also, can they do mirror images from mask sets? This could be useful for the rumored 4xM Ultra part. And finally, 4) there are separate mask sets for M1 Max and M1 Pro. The top portion of M1 Max is almost identical to M1 Pro, so there are cost savings in mask making, but one can’t get around having separate mask sets.

Answers appreciated! Thanks!
According to reports out there in internet land, M1 Pro is not a binned version of M1 Max. M1 Pro gets its own mask, independent of M1 Max.

M1 Pro and M1 Max each have their own binned variants though (as does M1).
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,222
5,224
136
Question regarding M1 Pro/Max die and mask set:
Or 3) is the wafer made somehow using only the top portions of the M1 Max mask set, so that all chips on the wafer are M1 Pro? Can modern lithography machines do such things?

This one. It doesn't seem like this would be a technical challenge.
 

ashFTW

Senior member
Sep 21, 2020
302
225
96
This one. It doesn't seem like this would be a technical challenge.
If feasible, this would be preferred for mask set cost saving reasons, but it may also slow down lithography steps to the point that separate mask sets are overall cheaper. I just don’t have any experience here.
 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136
According to reports out there in internet land, M1 Pro is not a binned version of M1 Max. M1 Pro gets its own mask, independent of M1 Max.

M1 Pro and M1 Max each have their own binned variants though (as does M1).


My understanding is that the way Apple did it they can use the same mask set for both, but only expose part of the mask when making M1 Pro. Clever way to save on design cost for the much lower volume Max/Ultra/"Extreme" version.
 

repoman27

Senior member
Dec 17, 2018
342
488
136
@ashFTW 4).

The Pro and Max dies each have their own mask set and are manufactured separately—there is no other way to do it. The seal ring that surrounds the Pro die does not pass through the middle of the Max's GPU and memory controllers. The Pro circuit being a proper subset of the Max significantly reduces design and validation costs. Masks (which are blocks of quartz) may not be cheap, but they're nothing compared to design and layout costs (i.e. paying a decently sized team of engineers for multiple years of work).

Also, mask sets contain quite a few masks. Practically every step in the process (deposition, etching, implantation) begins with a coating of photoresist and a photolithographic exposure. Each layer requires multiple steps, and some steps require multiple exposures (multi-patterning). TSMC N5 requires more than 80 individual masks.

Furthermore, Apple needs a lot more than one set of photomasks for any chip being manufactured in volume. A very conservative estimate for the Pro die would put it at 2.5 million units, which would work out to right around 10,000 wafers. The Pro and Max dies combined may account for as many as 8 million units annually and over 50,000 wafers. There is no way that TSMC is cranking those out in an entirely serial fashion on a single line. Nor are they shutting that line down every time an individual mask gets damaged and waiting until a replacement can be made.
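
As a rough sanity check on the wafer figure, here is a back-of-the-envelope sketch using the common gross-dies-per-wafer approximation. The 245 mm² Pro die area is an assumption taken from public estimates, not from Apple or TSMC, and the helper names are made up for this example. Yield, scribe lines and edge exclusion are ignored unless you pass a yield fraction.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Approximate gross dies per wafer: wafer area over die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2.0
    area_term = math.pi * radius ** 2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return math.floor(area_term - edge_term)


def wafers_needed(units: int, die_area_mm2: float, yield_fraction: float = 1.0) -> int:
    """Wafers required for a unit count, given an assumed yield fraction."""
    good_dies = gross_dies_per_wafer(die_area_mm2) * yield_fraction
    return math.ceil(units / good_dies)


PRO_DIE_AREA_MM2 = 245.0  # assumption: publicly estimated M1 Pro die size

print(gross_dies_per_wafer(PRO_DIE_AREA_MM2))
print(wafers_needed(2_500_000, PRO_DIE_AREA_MM2))
```

With those assumptions the result lands close to the 10,000-wafer figure above. Any realistic yield below 100% pushes the wafer count higher.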

I do not believe that there is any way to do a mirrored die without producing a mirrored mask set.
 

ashFTW

Senior member
Sep 21, 2020
302
225
96
@ashFTW 4).

The Pro and Max dies each have their own mask set and are manufactured separately—there is no other way to do it. The seal ring that surrounds the Pro die does not pass through the middle of the Max's GPU and memory controllers. The Pro circuit being a proper subset of the Max significantly reduces design and validation costs. Masks (which are blocks of quartz) may not be cheap, but they're nothing compared to design and layout costs (i.e. paying a decently sized team of engineers for multiple years of work).

Also, mask sets contain quite a few masks. Practically every step in the process (deposition, etching, implantation) begins with a coating of photoresist and a photolithographic exposure. Each layer requires multiple steps, and some steps require multiple exposures (multi-patterning). TSMC N5 requires more than 80 individual masks.

Furthermore, Apple needs a lot more than one set of photomasks for any chip being manufactured in volume. A very conservative estimate for the Pro die would put it at 2.5 million units, which would work out to right around 10,000 wafers. The Pro and Max dies combined may account for as many as 8 million units annually and over 50,000 wafers. There is no way that TSMC is cranking those out in an entirely serial fashion on a single line. Nor are they shutting that line down every time an individual mask gets damaged and waiting until a replacement can be made.

I do not believe that there is any way to do a mirrored die without producing a mirrored mask set.
Thank you @repoman27 for such a thorough answer!
 

Eug

Lifer
Mar 11, 2000
23,583
996
126

Impressive chip but not much of an upgrade over M1. How does it get better battery life despite consuming more power? Is it more efficient at idle or low CPU usage?
I think the M2 model has a slightly bigger battery too.

BTW, here is the NotebookCheck review:

 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136

Eug

Lifer
Mar 11, 2000
23,583
996
126
According to Apple's site the MBP13 with M2 has the same 58.2 Wh battery as the M1 version did.
Yes, you are correct. I've updated my post, thx.

It was the MacBook Air I was thinking of. The M2 Air has a 5% bigger battery than the M1 Air, but they have differently sized screens and other different features, so that's not a valid comparison.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Going by Notebookcheck's own table MBP 13 M2 has worse load maximum but better idle maximum:

bild_2022-06-24_10032h5kmk.png


In the text itself they write "We measured up to 56W, which later stabilized at around 48W." which seems to be the main negative change as M1 wasn't allowed to rise above 48W (47.5W according to above table). Perhaps this is the first showcase of some Intel style PL1 frequency boost on Apple Silicon?
 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136
In the text itself they write "We measured up to 56W, which later stabilized at around 48W." which seems to be the main negative change as M1 wasn't allowed to rise above 48W (47.5W according to above table). Perhaps this is the first showcase of some Intel style PL1 frequency boost on Apple Silicon?

Too soon to jump to conclusions like that. It could be the higher draw was loading data from RAM, and went down as more of the working set was in cache.

Another possibility is the chip develops hot spots if individual CPU or GPU cores are run at max for too long. The MBP13 probably doesn't have much mass in its heatsink so even if it can maintain overall SoC temperature it may not be able to remove heat fast enough to keep hot spots from developing.

Someone would have to run such tests while monitoring core frequencies and temperatures.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Too soon to jump to conclusions like that. It could be the higher draw was loading data from RAM, and went down as more of the working set was in cache.

Another possibility is the chip develops hot spots if individual CPU or GPU cores are run at max for too long. The MBP13 probably doesn't have much mass in its heatsink so even if it can maintain overall SoC temperature it may not be able to remove heat fast enough to keep hot spots from developing.

Someone would have to run such tests while monitoring core frequencies and temperatures.
The reason they did an MBP13 M2 vs MBP13 M1 comparison to begin with is that supposedly, aside from the M SoC, everything else is the same between the two. Whatever the exact cause, ~17% higher load maximum is a significant difference.
 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136
The reason they did an MBP13 M2 vs MBP13 M1 comparison to begin with is that supposedly, aside from the M SoC, everything else is the same between the two. Whatever the exact cause, ~17% higher load maximum is a significant difference.

The clock rate is probably most responsible. They went from 3.2 to 3.5 GHz with a tweaked version of the same process. You don't get a 10% jump at ISO power from N5P, and the frequency/voltage curve is not linear so you have to pay for 10% frequency with more than 10% power.
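
To put a rough number on that, here is a sketch using the textbook dynamic power relation P ∝ C·V²·f. The `voltage_exponent` parameter is an assumption about how much voltage has to rise with frequency (1.0 means voltage scales linearly with clock, 0.0 means voltage stays flat). It is not a measured curve for any Apple chip, and leakage is ignored.

```python
def relative_power(f_old_ghz: float, f_new_ghz: float, voltage_exponent: float = 1.0) -> float:
    """Relative dynamic power after a clock change, using P ~ C * V^2 * f."""
    f_ratio = f_new_ghz / f_old_ghz
    v_ratio = f_ratio ** voltage_exponent  # assumed voltage/frequency relation
    return v_ratio ** 2 * f_ratio


for exponent in (0.0, 0.5, 1.0):
    ratio = relative_power(3.2, 3.5, exponent)
    print(f"voltage_exponent={exponent}: {100 * (ratio - 1):+.1f}% power")
```

With flat voltage the power increase just tracks the clock (a bit over 9%). With voltage rising linearly with clock it is closer to 31%. The real curve sits somewhere in between and is not public.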
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,761
777
126
I mean that includes the 3090 so that doesn't seem a fair comparison? The 3090 blows the GPU performance of the M1's out of the water, it never claims to be power friendly.
 

Doug S

Platinum Member
Feb 8, 2020
2,201
3,405
136
How is the Core i9-12900K using 4.1 watts when it is off? Pretty sure the Energy Star rating doesn't allow that much "off" usage, so I'm suspicious about the accuracy of what they used to measure that.
 

Eug

Lifer
Mar 11, 2000
23,583
996
126
How is the Core i9-12900K using 4.1 watts when it is off? Pretty sure the Energy Star rating doesn't allow that much "off" usage, so I'm suspicious about the accuracy of what they used to measure that.
They used a Kill-A-Watt (or something similar) at the wall socket.

While that 4W when off seems odd, I'm almost more surprised at the idle power. I hadn't realized it was that high.

I mean that includes the 3090 so that doesn't seem a fair comparison? The 3090 blows the GPU performance of the M1's out of the water, it never claims to be power friendly.
I wonder what the numbers would be with 3070. 400 W total?