Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,825
1,396
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4
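The GPU throughput figures above hang together arithmetically. A quick sanity check in Python (a sketch: the 8 FP32 ALUs per execution unit and the ~1.278 GHz clock come from third-party die analyses, not Apple's spec sheet):

```python
# Back-of-the-envelope check of the quoted 2.6 TFLOPS figure.
# Assumptions (not from Apple): 8 FP32 ALUs per EU, FMA counted as
# 2 FLOPs per cycle, and a ~1.278 GHz GPU clock.
eus = 128                # execution units, per the spec above
alus_per_eu = 8          # assumed
flops_per_cycle = 2      # fused multiply-add
clock_hz = 1.278e9       # assumed

tflops = eus * alus_per_eu * flops_per_cycle * clock_hz / 1e12
print(f"{tflops:.2f} TFLOPS")
```

That lands at roughly 2.62 TFLOPS, consistent with the quoted 2.6.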

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as it does with iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).

EDIT:

Screen-Shot-2021-10-18-at-1.20.47-PM.jpg

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265 (HEVC), ProRes

M3 Family discussion here:


M4 Family discussion here:

 
Last edited:

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,695
136

A guy with his own renderer posted some pure CPU performance comparisons. Unfortunately there are few modern CPUs in the lineup, but the results versus the Xeon W-3245, i7-9750, and Threadripper 3990X are very interesting.

Really impressive showing. I do wish he had something that is an actual modern competitor, but it is a great result nonetheless.
 

biostud

Lifer
Feb 27, 2003
18,700
5,434
136
The 5950X is ~10bn transistors on 12nm I/O + 7nm CCDs, and a 3080 is ~28bn on Samsung 8nm; the M1 Max is 57bn transistors on TSMC 5nm. Wouldn't it be pretty bad if it weren't far more powerful and efficient than a 5950X + 3080? Also, how can Apple create such a huge monolithic chip? I wonder how many they have to scrap.
 

The Hardcard

Senior member
Oct 19, 2021
218
309
106
Really impressive showing. I do wish he had something that is an actual modern competitor, but it is a great result nonetheless.

He said it’s his hobby renderer. He could probably make a few bucks if he wanted by allowing other people to run it with those scenes. They are several orders of magnitude more complex than Cinebench, a real test of what current and near future architectures can do with rendering.
 

biostud

Lifer
Feb 27, 2003
18,700
5,434
136
Appeal to authority and ad hominem attacks don't mean much to me.

There is a lot I could discuss about why you are cherry picking, but I'll just stick with the initial argument and ask where are Andrei's numbers for Cezanne showing the M1 with a 2.5-3x efficiency lead since that is what was actually being discussed?
Especially how close the Threadripper was to the M1 in energy efficiency in 4K rendering.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
The 5950X is ~10bn transistors on 12nm I/O + 7nm CCDs, and a 3080 is ~28bn on Samsung 8nm; the M1 Max is 57bn transistors on TSMC 5nm. Wouldn't it be pretty bad if it weren't far more powerful and efficient than a 5950X + 3080? Also, how can Apple create such a huge monolithic chip? I wonder how many they have to scrap.

It's a sea of SRAM and GPU units that are pretty much redundant. Being able to sell cutdown versions works wonders for yield as well.
 

Viknet

Junior Member
Nov 14, 2020
9
10
51
Especially how close the Threadripper was to the M1 in energy efficiency in 4K rendering.
But this Threadripper 3990X has similar energy efficiency only at 70% of the per-core performance of the M1 Max.
The Apple M1 in the MacBook Air loses only about 20% when throttling from 15 W to 7 W, so I would expect the M1 Max to be 2-3x more efficient than the Threadripper at the same per-core performance.
 

naukkis

Senior member
Jun 5, 2002
903
786
136
Unless something has changed within the last year or two, for x264 those flags won't turn on SSE/AVX. If you turn off ASM then there is no code path to enable SSE/AVX, those instructions sets are built into the ASM code path and are left out if ASM is disabled no matter what your other flags are set as. So no, what I said the second time wasn't misleading. You can download and compile the software yourself and check if you want to.

Edit: Most likely those flags are fine for the rest of the software but I'm not as familiar with all of the sub-test software so I didn't want to speak on them specifically and don't know what specific flags/optimizations may be used in the release builds versus Anandtech's builds.

That's because it is SPEC. SPEC is meant to benchmark differences in hardware, not in software, so they offer source code without hand-tuned assembly. And no, nobody can use any hand-tuned assembly in any subset of the SPEC tests. The compiler is, though, free to auto-vectorize what it can, so SSE/AVX isn't turned off; only the hand-tuned assembly parts don't exist in SPEC testing.


To put it more briefly: as SPEC is a CPU benchmark designed to cross-benchmark CPUs with different instruction sets, there's absolutely no way that any subset of SPEC includes any kind of assembly.
 
Last edited:

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,695
136
That's because it is SPEC. SPEC is meant to benchmark differences in hardware, not in software, so they offer source code without hand-tuned assembly. And no, nobody can use any hand-tuned assembly in any subset of the SPEC tests. The compiler is, though, free to auto-vectorize what it can, so SSE/AVX isn't turned off; only the hand-tuned assembly parts don't exist in SPEC testing.


To put it more briefly: as SPEC is a CPU benchmark designed to cross-benchmark CPUs with different instruction sets, there's absolutely no way that any subset of SPEC includes any kind of assembly.

Yes, in my previous post I agreed that it makes sense to disable ASM. A couple of years ago, when I was running some of these tests myself, the one that really stuck out to me was x264: once I turned off ASM, whether I had the SSE/AVX and Zen flags set or not made no difference in performance. Now, this was a couple of years ago with Zen+, where I was matching the compiler Anandtech was using at the time, so it is possible that their move to a more modern version allows for better auto-vectorization on the Zen family. If I have time this week I'll try to get the LLVM compiler set up to match theirs and try it again. I am pleased to see that Anandtech is using LLVM for x86, which at least is a better match than GCC when comparing against Apple's LLVM-based compiler.
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Here's a review of an x86 mobile APU:


Look at all of the benchmarks in that article. Now look back at their article on the M1 Pro/Max. See any differences?


I see them running a bunch of Windows specific applications, plus some cross platform stuff like GIMP which may or may not have a macOS port and a lot of the Windows stuff probably lacks a Windows/ARM port too. What are you expecting Anandtech to do, port a bunch of Windows stuff to the Mac so they can give you more benchmarks?
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Interesting how the AT review states that Apple hasn't defined any max TDP for their SoC and it just goes as far as it can until thermal conditions prevent going any further. Someone needs to test an M1 Max in Alaska in minus temperatures. This also suggests that the Mac Pro with water cooling could be formidable.


I think you're misunderstanding what he's saying. There is no turbo, so it won't run faster with better cooling. It will only throttle less - and he notes that it is really hard to reach a situation where that happens - see his comment about the "high power mode" only being useful for something like running an overnight render and would have made no difference in the results he reported.
 

jeanlain

Member
Oct 26, 2020
159
136
86
I see them running a bunch of Windows specific applications, plus some cross platform stuff like GIMP which may or may not have a macOS port and a lot of the Windows stuff probably lacks a Windows/ARM port too. What are you expecting Anandtech to do, port a bunch of Windows stuff to the Mac so they can give you more benchmarks?
++
To compare CPUs with the same ISA on the same OS, you have a lot of apps and benchmark tools at your disposal. Performance differences should mostly reflect hardware differences.
Comparing CPUs with different ISAs running different OSes is another matter entirely. Different APIs (DX vs Metal for instance), different degrees of optimisation, etc.
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,886
1,103
126
++
To compare CPUs with the same ISA on the same OS, you have a lot of apps and benchmark tools at your disposal. Performance differences should mostly reflect hardware differences.
Comparing CPUs with different ISAs running different OSes is another matter entirely. Different APIs (DX vs Metal for instance), different degrees of optimisation, etc.

I think you guys are missing the point here. It doesn't have to be a like for like comparison. If they use a different API, so be it. I'm not interested in just the hardware being tested, but how the hardware using said software performs against Windows based laptops.

I strongly disagree that reviews should "mostly reflect hardware differences". On the contrary, they should reflect the ecosystem the hardware is based in just as much as the hardware itself.
 
  • Like
Reactions: Tlh97 and scannall

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
Teardown:


Ports appear to be modular swappable parts, not soldered to the logic board. If true, colour me surprised. If it's not true, then those ports are at least extremely heavily reinforced.

csevJ3w.jpeg

Also, the batteries are no longer glued in. They are stuck on with iPhone-style adhesive tape with regular pull tabs to ease removal.

The fans are not on top of the logic board. Instead, the logic board is shaped like a W with curved cutouts in which the fans sit.

xF9plqK.jpeg

M1 Max package is giant of course.
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Especially how close the Threadripper was to the m1 in energy efficiency in 4k rendering.

The guy did a back-of-the-napkin calculation using maximum TDP figure * time run, and assumed the M1 Max has a 60 W TDP. So this efficiency comparison only makes sense if TDP == power consumption:
- We know that the M1 Max (including DRAM) consumes nowhere near 60 W in CPU tasks.
- The other chips in the comparison also have increasingly tenuous relationships between listed TDP and actual power consumption.
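For what it's worth, the pitfall is easy to show with numbers. A minimal sketch of the TDP * time method versus measured draw; all wattages and the runtime below are illustrative placeholders, not measurements:

```python
# Energy-from-TDP vs energy-from-measured-power.
# Every number here is a made-up placeholder to illustrate the method.
def energy_wh(power_w, runtime_s):
    """Energy in watt-hours for a run at constant power."""
    return power_w * runtime_s / 3600

runtime_s = 1800                          # hypothetical 30-minute render
tdp_estimate = energy_wh(60, runtime_s)   # assuming TDP == actual draw
measured = energy_wh(40, runtime_s)       # assuming real draw is 40 W

error_pct = (tdp_estimate - measured) / measured * 100
print(f"TDP-based: {tdp_estimate:.1f} Wh, measured: {measured:.1f} Wh, "
      f"overestimate: {error_pct:.0f}%")
```

With those placeholder figures, the TDP-based estimate overstates energy use by 50%, which is why the comparison only works when TDP tracks real consumption.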
 
Last edited:

yottabit

Golden Member
Jun 5, 2008
1,496
529
146
So any speculation on how they will handle desktop Mac Pro? Starting with "will there be one"?

Next, would it be an SoC design? Or a separate higher-core-count (16-24) M1-family CPU with a PCIe bus for an NVMe drive and discrete graphics?

It seems like it would be a shame to give up the benefits of the unified memory we have seen so far. But at the same time I can't imagine them making a die much larger than the M1 Max.

I could see an SoC with similar die area but a higher-power, higher-frequency design. But what about the expandability people would expect from a Mac Pro?

Could it be something expandable with additional M1 "compute modules" on some proprietary slot?

It makes me excited to armchair-speculate on this one since it all seems so wild to me, and I really don't know what to expect. I personally think an M1-family CPU with some on-package memory and PCIe slot expandability is most likely.

And something like a Mac mini Pro with no traditional expandability (beyond maybe drive bays) is second most likely.
 

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
I thought this test was interesting since it includes a score that caps power utilization for an Intel mobile chip.

V-Ray CPU running through Rosetta on 10-core M1 Pro, compared against reported scores for i9-11980HK running at full tilt and i9-11980HK capped at 45 Watts.

Screen Shot 2021-10-26 at 8.07.24 PM.png

 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
Teardown:
Wow, my brain is still processing this.

A few days earlier, part of me was wishing for more ports, and they probably could have jammed some extra ports in there with some redesign. But my brain is still processing how Apple made everything mostly modular and replaceable, so my earlier wish no longer registers; I just feel happiness.
 

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
Wow, my brain is still processing this.

A few days earlier, part of me was wishing for more ports, and they probably could have jammed some extra ports in there with some redesign. But my brain is still processing how Apple made everything mostly modular and replaceable, so my earlier wish no longer registers; I just feel happiness.
I guess that is one reason why the machine is thicc and heavy.

BTW, the modular ports are definitively confirmed. Here is the W shaped motherboard, with big SoC + RAM package, with the ports removed. The 1 yuan coin is 2.5 cm (1 inch) across. Note the fan cutouts in the mobo.

16-inch-macbook-pro-teardown-lovetodream-2.jpg

(That top part in the pic is the removed heat sink / heat pipe flipped upwards.)

 
Last edited:
  • Like
Reactions: lightmanek

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
I think you guys are missing the point here. It doesn't have to be a like for like comparison. If they use a different API, so be it. I'm not interested in just the hardware being tested, but how the hardware using said software performs against Windows based laptops.

I strongly disagree that reviews should "mostly reflect hardware differences". On the contrary, they should reflect the ecosystem the hardware is based in just as much as the hardware itself.

If you use a specific app then by all means compare how that app performs on various hardware, and if some hardware is saddled with a poor quality port or has to run under emulation that's a disadvantage you want to know about - you should get what performs best for you.

If however you are trying to compare two platforms against one another, doing your comparisons in that way makes the results much less useful. If you and I had a contest to see how many pushups we could do in a minute, and you do proper military style pushups with someone watching you and only counting the ones you do with proper form, and I'm just doing whatever I consider a pushup and counting them myself, would you consider that fair? That's basically what you're suggesting by "it should reflect the ecosystem" - so in our bet if we do them differently well that's too bad for whoever is following stricter form.
 

jamescox

Senior member
Nov 11, 2009
644
1,105
136
I don’t have time to read 100 pages on the benchmarks and whatever, but considering the CPU/GPU performance and quality of other components (like the screen), what other laptops are going to come even remotely close?

I could use a new laptop and was thinking of getting a Framework laptop, but the new MacBook Pros make that seem pretty obsolete. I don’t like everything being soldered on and glued shut, but I think the Framework laptop just has a 4-core Intel chip with integrated graphics. That probably isn’t going to compete even with the low-end MacBook Pro. I also want a larger laptop, so the 16-inch Pro might be the machine to get. To compete with Apple, it seems like AMD or Intel would need to make an APU with some HBM stacks or at least some graphics memory. I don’t know where the software stack is at for unified memory architectures. I would likely run Linux.
 

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,693
136
I see them running a bunch of Windows specific applications, plus some cross platform stuff like GIMP which may or may not have a macOS port and a lot of the Windows stuff probably lacks a Windows/ARM port too. What are you expecting Anandtech to do, port a bunch of Windows stuff to the Mac so they can give you more benchmarks?

You are making this too complicated. One article has a variety of benchmarks. The other has... two.

Two benchmarks.

Yay?
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
To compete with Apple, it seems like AMD or Intel would need to make an APU with some HBM stacks or at least some graphics memory. I don’t know where the software stack is at for unified memory architectures. I would likely run Linux.

Even just 4 channels of LPDDR5 would be plenty. If there are enough CUs, just add some sort of SLC/Infinity Cache to the memory controllers on die (like on the M1) and you could easily get performance way above a 3050 Ti.

If you're worried about pins, just put the memory on package like apple does.

Considering the mining craze, current GPU prices, and the fact that Apple does huge integrated graphics now, OEMs should come around soon enough for the money they can save alone.

Yeah, that premium SKU might cost a bit and might not sell in crazy volumes. But considering that it can be assembled from existing blocks, and that AMD now makes more in a quarter than they made in the whole of 2016 (and Q4 will probably rival 2017), it should be worth it for the mindshare alone. I mean, just look how relatively small the 128-bit memory controller and 8 MB of extra cache are; it shouldn't be all that big of a problem:

Mf2IBG4.png
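The bandwidth arithmetic behind the four-channel suggestion (a sketch; LPDDR5-6400 at 32 bits per channel is my assumption, not anything from the post):

```python
# Peak theoretical bandwidth = channels x channel width x transfer rate.
# LPDDR5-6400 (6.4 GT/s) and 32-bit channels are assumed figures.
def bandwidth_gbs(channels, width_bits, transfers_per_s):
    """Peak bandwidth in GB/s."""
    return channels * (width_bits / 8) * transfers_per_s / 1e9

print(bandwidth_gbs(4, 32, 6.4e9))   # 4-channel LPDDR5 proposal
print(bandwidth_gbs(2, 64, 6.4e9))   # same 128-bit total bus, counted differently
```

Either way of counting gives ~102.4 GB/s, in line with the 100 GB/s figure quoted for the M2 earlier in the thread.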
 
Last edited:
  • Like
Reactions: lightmanek

biostud

Lifer
Feb 27, 2003
18,700
5,434
136
The question is, and maybe not for this thread, whether AMD could offer their future APUs for Xbox and PlayStation as laptop CPUs. As we can see here, a very powerful SoC can be made for laptops, but would it make sense for the diverse x86 market as well?
 

Shivansps

Diamond Member
Sep 11, 2013
3,875
1,530
136
The question is, and maybe not for this thread, whether AMD could offer their future APUs for Xbox and PlayStation as laptop CPUs. As we can see here, a very powerful SoC can be made for laptops, but would it make sense for the diverse x86 market as well?

The 4700S makes it clear that AMD is not allowed to do that.
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
Some Rust compilation benchmarks:


Deno: M1 Max: 6m11s, M1 Air: 11m15s
bat: M1 Max: 42.9s, M1 Air: 1m23s
hyperfine: M1 Max: 23.1s, M1 Air: 42.2s
ripgrep: M1 Max: 16.1s, M1 Air: 36.5s

Some comparisons from the Reddit thread:
Compiling deno from scratch on my 5950X (with everything RAM-disked) took 5m05s. This is with fully unlocked PBO and some of the best cooling I could cram into this case.
ASUS Zephyrus G15 with a 5800HS, compiling on the solid-state drive with lld as the chosen linker (which may make a significant difference; I don’t know and was too lazy to disable it temporarily): 6m43s.

So not earth-shattering, but it looks pretty darn good for software development.
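Turning those quoted build times into speedups (a small helper; the "XmYs" strings are just the format used in the post):

```python
# Speedup of the M1 Max over the M1 Air from the build times quoted above.
import re

def to_seconds(t):
    """Parse times like '6m11s' or '42.9s' into seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?([\d.]+)s", t)
    return int(m.group(1) or 0) * 60 + float(m.group(2))

builds = {                       # (M1 Max, M1 Air) times from the post
    "Deno":      ("6m11s", "11m15s"),
    "bat":       ("42.9s", "1m23s"),
    "hyperfine": ("23.1s", "42.2s"),
    "ripgrep":   ("16.1s", "36.5s"),
}
for name, (mx, air) in builds.items():
    print(f"{name}: {to_seconds(air) / to_seconds(mx):.2f}x faster on the Max")
```

The ratios land between roughly 1.8x and 2.3x, so the Max is consistently around twice as fast as the Air for these builds.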