Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,607
1,018
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s
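
As a rough sanity check, the GPU figures above fall out of simple unit counts times clock. This is only a sketch: the 8 ALUs per execution unit, the 2 FLOPs per ALU per clock (one fused multiply-add), the 64 texture units, the 32 ROPs and the ~1.278 GHz clock are commonly reported estimates, not numbers from the list above, so treat them as assumptions.

```python
# Back-of-napkin check of the M1 GPU throughput figures.
# ASSUMPTIONS (not from the spec list): ALUs per EU, FLOPs per ALU per clock,
# texture unit count, ROP count and the ~1.278 GHz clock are estimates.

EXECUTION_UNITS = 128         # from the spec list
CONCURRENT_THREADS = 24576    # from the spec list
ALUS_PER_EU = 8               # assumption
FLOPS_PER_ALU_PER_CLOCK = 2   # assumption: one fused multiply-add per clock
TEXTURE_UNITS = 64            # assumption
ROPS = 32                     # assumption
CLOCK_GHZ = 1.278             # assumption


def tflops(eus, alus_per_eu, flops_per_clock, clock_ghz):
    """Peak FP32 throughput in teraflops."""
    return eus * alus_per_eu * flops_per_clock * clock_ghz / 1000.0


def fill_rate(units, clock_ghz):
    """Peak fill rate in giga-operations per second (one per unit per clock)."""
    return units * clock_ghz


if __name__ == "__main__":
    print("TFLOPS:", round(tflops(EXECUTION_UNITS, ALUS_PER_EU,
                                  FLOPS_PER_ALU_PER_CLOCK, CLOCK_GHZ), 2))
    print("Gigatexels/s:", round(fill_rate(TEXTURE_UNITS, CLOCK_GHZ), 1))
    print("Gigapixels/s:", round(fill_rate(ROPS, CLOCK_GHZ), 1))
    print("Threads per EU:", CONCURRENT_THREADS // EXECUTION_UNITS)
```

Under those assumptions the results round to the quoted 2.6 Teraflops, 82 Gigatexels/s and 41 gigapixels/s, and the thread count works out to 192 threads per execution unit.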

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:

Screen-Shot-2021-10-18-at-1.20.47-PM.jpg

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265 (HEVC), ProRes

M3 Family discussion here:

 

Mopetar

Diamond Member
Jan 31, 2011
7,885
6,130
136
I'm surprised that they made something as big as the M1 Max, and I'm curious to see how it ends up performing. It's fairly clear from iOS games that you can get some good performance out of their GPUs if you take the time to do so, but there's a pretty massive chicken and egg problem where because there aren't a lot of Apple users that run AAA titles on their Macs, there isn't a lot of incentive for the developers to take the time to optimize.
 

Eug

Lifer
Mar 11, 2000
23,607
1,018
126
AnandTech’s take:


In terms of performance, Apple is battling it out with the very best available in the market, comparing the performance of the M1 Max to that of a mobile GeForce RTX 3080, at 100W less power (60W vs 160W). Apple also includes a 100W TDP variant of the RTX 3080 for comparison, here, outperforming the NVIDIA discrete GPU, while still using 40% less power.

12A66E87-A04F-4DFF-8A22-28BB7BB01EA2.jpeg

Today’s reveal of the new generation Apple Silicon has been something we’ve been expecting for over a year now, and I think Apple has managed to not only meet those expectations, but also vastly surpass them. Both the M1 Pro and M1 Max look like incredibly differentiated designs, much different than anything we’ve ever seen in the laptop space. If the M1 was any indication of Apple’s success in their silicon endeavors, then the two new chips should also have no issues in laying incredible foundations for Apple’s Mac products, going far beyond what we’ve seen from any competitor.

---


BTW, while I think the Intel vs AMD vs Apple raw CPU and GPU performance discussions are interesting, what will be even more interesting for many real world users are the hardware accelerators included. For example, many of the Final Cut types are gonna love the hardware ProRes acceleration. This will make video editing so much easier. Not just export, but actual editing.

Also, regarding the GPU, I’m not sure how informative gaming comparisons will be. It’s more about content creation for Apple at this point.
 

majord

Senior member
Jul 26, 2015
433
523
136
The GPU one is a tough one to compare. The 16-core is basically Navi 23 clocked at half clock speed (1.3ish GHz); you'd have to clock a 6600 XT down to these levels and compare. I know that for mining workloads, ASIC power can be brought down to the 50 W mark at those clocks, but I'm still not sure that would be entirely relevant, as I believe you hit a lower voltage limit before getting clocks that low. Thus it's not in its perf/W sweet spot, something you can be sure the Apple GPUs are, down to the MHz.

Secondly, these numbers don't really tell us how that translates to real-world rasterization perf at higher TF levels for this architecture. We'll have to wait for that, I guess.
 
Jul 27, 2020
16,545
10,549
106
It's fairly clear from iOS games that you can get some good performance out of their GPUs if you take the time to do so, but there's a pretty massive chicken and egg problem where because there aren't a lot of Apple users that run AAA titles on their Macs, there isn't a lot of incentive for the developers to take the time to optimize.
Apple certainly has enough dough in the bank to fund a spectacular AAA title, if only to showcase their graphical prowess. Also, a tantalizing thought: could these powerful GPUs be laying the foundation for Apple's rumored AR/VR headset?
 

Eug

Lifer
Mar 11, 2000
23,607
1,018
126
Here is another Geekbench score! The Mac is F******* back guys!!!

Yes there is variability in the scores and often times they can rise by as much as 10-15% as results come in.

12422 is more respectable at about 1.6X M1 but still a bit short of the ~13000 some of us were guesstimating back-of-napkin. Interestingly, 13000 is about 1.7X M1, which IIRC just happens to be a performance multiplier Apple had mentioned in the keynote for some tests when comparing against M1 (which scores around 7700ish).
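
For anyone who wants to redo the napkin math, here is a minimal sketch of the multipliers. The 7700 baseline is the "around 7700ish" M1 multi-core figure from this post, so the outputs are only as good as that rough number.

```python
# Geekbench multi-core multipliers relative to M1.
# The baseline is the approximate "7700ish" M1 score quoted in the post.

M1_MULTI_SCORE = 7700


def speedup(score, baseline=M1_MULTI_SCORE):
    """How many times faster a score is than the baseline."""
    return score / baseline


def projected_score(multiplier, baseline=M1_MULTI_SCORE):
    """Score implied by a claimed multiplier over the baseline."""
    return baseline * multiplier


if __name__ == "__main__":
    print("12422 vs M1:", round(speedup(12422), 2))
    print("13000 vs M1:", round(speedup(13000), 2))
    print("1.7X M1 would be:", round(projected_score(1.7)))
```

With that baseline, 12422 comes out at about 1.61X, 13000 at about 1.69X, and a flat 1.7X would be roughly 13090.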

Meanwhile, my 2007 Mac Pro 3 GHz 8-core Xeon X5365 gets about 2300, and my 2017 iMac Core i5-7600 gets about 3800. As for a laptop, my 2017 MacBook (Core m3) gets about 1550.

A011BB8D-EDC3-417B-90A0-600EFC879794.png
 

Eug

Lifer
Mar 11, 2000
23,607
1,018
126
With 30 W CPU and 60 W GPU, I wonder how loud these things are going to get.

I’m not impressed with the 3.5 lb weight of the 14” though.


What's projected cine r23 scores?
If the 1.7X M1 applies here, maybe 12000?

Anyhow, we should be getting some real numbers next week.


Ordered 16” MacBook Pro, M1 Max with 64GB RAM, 32 GPU cores, and 4TB SSD. Can’t wait!!
Wow, that’s one beefy machine. What are you going to use that for?
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,027
136
What's projected cine r23 scores?

Zero, because like most software, you won’t find that it runs natively on a Mac. The performance, for native M1 software, is of course, better than the M1, however, we received samples of these laptops this afternoon so there is not much that I can say except…don’t hold your breath.
 

majord

Senior member
Jul 26, 2015
433
523
136
Zero, because like most software, you won’t find that it runs natively on a Mac. The performance, for native M1 software, is of course, better than the M1, however, we received samples of these laptops this afternoon so there is not much that I can say except…don’t hold your breath.

Afaik r23 is available for Mac.
 

Doug S

Platinum Member
Feb 8, 2020
2,288
3,566
136
Why did people expect 1.7x faster multithread? We know the little cores of the M1 are about 1/3 the speed of the big cores. If you take 4 + 4 × 0.33 and 8 + 2 × 0.33 as the multithread performance of M1 and M1 Pro/Max, and assume the same 3.2 GHz clock rate, it comes out to around 62.5% faster. Since nothing scales in perfectly linear fashion, getting a 60% boost is about what you'd expect.

Tests that can use all the bandwidth they can get will do better than that but they are the exception not the rule.
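
The estimate above is easy to reproduce. A minimal sketch, assuming (as the post does) that a little core is worth about one third of a big core, that both chips run at the same clock, and that scaling is perfectly linear; the function name is just for illustration.

```python
# Core-weighted multithread estimate: big cores count as 1.0,
# little cores as a fraction of a big core (about 1/3 per the post).
# Assumes equal clocks and perfectly linear scaling, which real
# workloads will not quite reach.


def relative_throughput(big_cores, little_cores, little_weight=1 / 3):
    """Throughput in 'big core equivalents'."""
    return big_cores + little_cores * little_weight


def gain_percent(new, old):
    """Percentage improvement of new over old."""
    return (new / old - 1) * 100


if __name__ == "__main__":
    m1 = relative_throughput(4, 4)          # 4 big + 4 little
    m1_pro_max = relative_throughput(8, 2)  # 8 big + 2 little
    print("Estimated gain: %.1f%%" % gain_percent(m1_pro_max, m1))
```

With a weight of exactly 1/3 the ratio is 26/16, i.e. 62.5% faster; plugging in 0.33 instead gives about 62.8%, so the conclusion does not hinge on the rounding.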
 

Doug S

Platinum Member
Feb 8, 2020
2,288
3,566
136
The die photos showing the Pro and Max dies side by side are pretty interesting. As noted by others you can see the "chop" location, but below the additional GPU cores there are some replicated structures. From left to right:

#1 "random schmear" is mirrored above the SLC block on the left
#2 "chips surrounding a bigger chip" mirrored above the SLC block on the right
#3 "more empty schmear" I don't see this one mirrored above
#4 "E and backwards E" is mirrored above #2

So what are they? One is clearly another two display controllers. What else does the Max have twice as many of as the Pro?

The big mystery is what is #3? It stands to reason the "something new" would be for off chip communication to other M1 Max dies in a larger system like Mac Pro. That block is not big enough to be a full fabric, so Apple will need an I/O die like AMD uses rather than having it built in like IBM, at least for this generation.

The I/O die will implement the fabric, and include DDR5 controllers for DIMM slots hanging off it. With up to 256 GB of LPDDR5 (unless larger LPDDR5 stacks are possible...anyone know how big those can get?) with 1.6 TB/sec of memory bandwidth in a 4 M1 Max Mac Pro it'll be fairly NUMAy when you hit the much slower DDR5 DIMMs.
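
For reference, the 256 GB and 1.6 TB/sec figures are just the per-die M1 Max maximums (64 GB, 400 GB/s) times four. A small sketch of that arithmetic follows; the four-die system is speculation, and the DDR5-4800 speed and the eight-channel DIMM pool are purely illustrative assumptions used to show how lopsided the two memory tiers could be.

```python
# Aggregate memory for a hypothetical 4-die M1 Max system.
# Per-die figures are the published M1 Max maximums.
# ASSUMPTIONS: the 4-die layout, DDR5-4800 DIMMs and 8 channels are
# illustrative only; nothing here describes a real product.

M1_MAX_CAPACITY_GB = 64
M1_MAX_BANDWIDTH_GB_S = 400


def aggregate(per_die, dies):
    """Total across identical dies, assuming perfect scaling."""
    return per_die * dies


def ddr5_channel_gb_s(mega_transfers_per_s, bus_bytes=8):
    """Peak bandwidth of one 64-bit DDR5 DIMM interface in GB/s."""
    return mega_transfers_per_s * bus_bytes / 1000.0


if __name__ == "__main__":
    dies = 4
    lpddr5_bw = aggregate(M1_MAX_BANDWIDTH_GB_S, dies)
    dimm_bw = aggregate(ddr5_channel_gb_s(4800), 8)  # hypothetical 8 channels
    print("On-package capacity (GB):", aggregate(M1_MAX_CAPACITY_GB, dies))
    print("On-package bandwidth (GB/s):", lpddr5_bw)
    print("Hypothetical DIMM bandwidth (GB/s):", round(dimm_bw, 1))
    print("Bandwidth ratio:", round(lpddr5_bw / dimm_bw, 1))
```

Whatever DIMM configuration you plug in, the gap between the two tiers is the point: the further apart they are, the more NUMA-like the system behaves.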
 

Gideon

Golden Member
Nov 27, 2007
1,655
3,744
136
The last Geekbench score is quite good. I'm still slightly disappointed at 0% ST gains compared to M1 though.

Yes, it's somewhat to be expected, but they did manage to improve the ST score for phones from 1300 (A14) to 1700 (A15). I was hoping they could also squeeze at least 100-200 extra MHz out of the M1 Pro.

Currently the ST clocks are exactly the same as my iPhone 13 mini (3.2 GHz)
 

Red_m

Junior Member
Aug 29, 2021
6
23
36
Zero, because like most software, you won’t find that it runs natively on a Mac. The performance, for native M1 software, is of course, better than the M1, however, we received samples of these laptops this afternoon so there is not much that I can say except…don’t hold your breath.

seems u know nothing and sound very silly!!

Cinebench R23 is native for M1.

90% of Adobe products are native for M1.

DaVinci Resolve is M1 native.

Wolfram Mathematica is native.

Android emulator is native.

Vectorworks is native.
 

jeanlain

Member
Oct 26, 2020
149
122
86
Yes, it's somewhat to be expected, but they did manage to improve the ST score for phones from 1300 (A14) to 1700 (A15). I was hoping they could also squeeze at least 100-200 extra MHz out of the M1 Pro.
My take is that the M1Pro/Max performance core is 1000% identical to the M1 and is built on the same lithography.
 

biostud

Lifer
Feb 27, 2003
18,257
4,772
136
As long as Apple has access to TSMC's next-gen node, and doesn't do a bad job of designing a CPU and GPU, they're going to be more efficient than their PC counterparts. If we had an AMD 5 nm based laptop CPU and a 5 nm GPU, the numbers would be different. So kudos to Apple for pushing new technology. Personally I just want to buy a video card at MSRP :p
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
The die photos showing the Pro and Max dies side by side are pretty interesting. As noted by others you can see the "chop" location, but below the additional GPU cores there are some replicated structures. From left to right:

#1 "random schmear" is mirrored above the SLC block on the left
#2 "chips surrounding a bigger chip" mirrored above the SLC block on the right
#3 "more empty schmear" I don't see this one mirrored above
#4 "E and backwards E" is mirrored above #2

So what are they? One is clearly another two display controllers. What else does the Max have twice as many of as the Pro?

The big mystery is what is #3? It stands to reason the "something new" would be for off chip communication to other M1 Max dies in a larger system like Mac Pro. That block is not big enough to be a full fabric, so Apple will need an I/O die like AMD uses rather than having it built in like IBM, at least for this generation.

The I/O die will implement the fabric, and include DDR5 controllers for DIMM slots hanging off it. With up to 256 GB of LPDDR5 (unless larger LPDDR5 stacks are possible...anyone know how big those can get?) with 1.6 TB/sec of memory bandwidth in a 4 M1 Max Mac Pro it'll be fairly NUMAy when you hit the much slower DDR5 DIMMs.

If https://twitter.com/Locuza_/status/1450296155477319683?s=20 is correct, the M1 Pro/Max die shots aren't clean and weren't scaled correctly, which puts some error bars on the die size estimates.

It seems that with the M1 Pro/Max GPUs, Apple is more than willing to put up with big dies to maximize efficiency, to an extreme extent if the slides are to be believed. I suspect that for the binned GPUs (14/24-core) the increased clocks from running fewer cores in the same thermal budget would bridge quite a bit of the compute gap vs the fully enabled versions.
 

Eug

Lifer
Mar 11, 2000
23,607
1,018
126
Why did people expect 1.7x faster multithread? We know the little cores of the M1 are about 1/3 the speed of the big cores. If you take 4 + 4 × 0.33 and 8 + 2 × 0.33 as the multithread performance of M1 and M1 Pro/Max, and assume the same 3.2 GHz clock rate, it comes out to around 62.5% faster. Since nothing scales in perfectly linear fashion, getting a 60% boost is about what you'd expect.

Tests that can use all the bandwidth they can get will do better than that but they are the exception not the rule.
You may be right, but FWIW, this is what Apple had to say about it:


“The CPU in M1 Pro and M1 Max delivers up to 70 percent faster CPU performance than M1, so tasks like compiling projects in Xcode are faster than ever.”

BTW, another score, this time M1 Pro: 1707 / 11030

944A0F45-6BBD-41EB-A906-99ABE7DF256F.jpeg