Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,825
1,396
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops (see the sketch below)
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from GPU core count). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from the occasional slight clock speed difference).
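As a sanity check, the 2.6 TFLOPS figure follows from the 128 execution units if you assume 8 FP32 ALUs per EU and a roughly 1.28 GHz GPU clock (both are assumptions from common third-party reporting, not from Apple's spec sheet):

Code:
# Sketch: deriving the quoted 2.6 TFLOPS from the GPU specs above.
# Assumed (not in the post): 8 FP32 ALUs per EU, ~1.278 GHz clock,
# 2 FLOPs per ALU per cycle (one fused multiply-add).
eus = 128
alus = eus * 8                        # 1024 FP32 lanes in total
clock_ghz = 1.278
tflops = alus * 2 * clock_ghz / 1000
print(f"{tflops:.2f} TFLOPS")         # ~2.62, matching the quoted 2.6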

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s (see the sketch below)
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, H.265 (HEVC), and ProRes
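The 100 GB/s figure is consistent with the widely reported 128-bit LPDDR5-6400 configuration; a quick sketch (bus width and transfer rate are assumptions from third-party reporting, not stated in this post):

Code:
# Sketch: M2's "up to 100 GB/s" from an assumed 128-bit LPDDR5-6400 bus.
bus_bytes = 128 // 8            # 16 bytes move across the bus per transfer
transfers_per_s = 6.4e9         # LPDDR5-6400 = 6.4 GT/s per pin
gb_per_s = bus_bytes * transfers_per_s / 1e9
print(f"{gb_per_s:.1f} GB/s")   # 102.4, marketed as "100 GB/s"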

M3 Family discussion here:


M4 Family discussion here:

 

albertmamama

Junior Member
Aug 31, 2020
5
2
51
That is just the memory bandwidth.

What we really want to see is the MH/s for Eth mining.

I have a feeling it is going to be half the MH/s of a 3080, but at a quarter of the power. Which would be exceptional.


-Leeea tries to catch the attention of the wolves: "hey, look over there, fresh meat!"-

Eth mining is bandwidth-bound, so you can compare the two from memory bandwidth alone.
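For a rough sense of why bandwidth alone is a usable proxy: each Ethash hash performs 64 random reads of a 128-byte DAG page, so peak hash rate is roughly memory bandwidth divided by 8 KB. A back-of-envelope sketch (the ~760 GB/s and ~400 GB/s figures are approximate spec numbers, not measurements from this thread):

Code:
# Ethash is bandwidth-bound: hashrate ~= memory bandwidth / bytes per hash.
BYTES_PER_HASH = 64 * 128          # 64 random reads of a 128-byte DAG page

def ethash_mhs(bandwidth_gb_s):
    """Upper-bound hash rate in MH/s from memory bandwidth in GB/s."""
    return bandwidth_gb_s * 1e9 / BYTES_PER_HASH / 1e6

print(ethash_mhs(760))   # RTX 3080 at ~760 GB/s -> ~93 MH/s
print(ethash_mhs(400))   # M1 Max at ~400 GB/s   -> ~49 MH/s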
 

USER8000

Golden Member
Jun 23, 2012
1,542
780
136


Laptop with a Ryzen 9 5900HS:
[benchmark screenshots comparing the M1 to the Ryzen laptop]
 

BorisTheBlade82

Senior member
May 1, 2020
680
1,069
136
A 5800U at a 15 W TDP limit scores ~7500 points in MT for a perf/W score of 500. An M1 Max scores ~12400 at 34 watts for a perf/W score of 365. Clearly Zen 3 is more efficient than the M1 Max. Additionally, for ST, an M1 uses just 3.8 W to get the same score as the M1 Max, which uses 11 W. That gives the M1 a 2.9x efficiency advantage over the M1 Max. They must have really screwed up the M1 Max design to do so badly in perf/W compared to the M1. (/s)

It's easy to cherry pick data points with zero nuance/context and make whatever point you want to make.
I am not cherry picking. I am quoting Andrei Frumusanu's general assessment of the competitive landscape regarding power and performance efficiency. Has it occurred to you that people like him are able to weigh benchmark results not only by simple averages but also with their technical knowledge and experience? And has it occurred to you that you are the one cherry picking, when he clearly states that CB is a negative outlier for the M1 Max?
The amount of ignorance you are showing tells me that you are not really interested in a grown-up discussion but instead will try to grasp at straws until the end of days.
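For what it's worth, the arithmetic behind the quoted (sarcastic) comparison is plain score-divided-by-watts; the sketch below just reproduces the quote's numbers:

Code:
# Reproducing the perf/W figures from the quote above (score / watts).
def perf_per_watt(score, watts):
    return score / watts

print(perf_per_watt(7500, 15))    # 5800U MT at 15 W  -> 500
print(perf_per_watt(12400, 34))   # M1 Max MT at 34 W -> ~365
print(11 / 3.8)                   # same ST score at 11 W vs 3.8 W -> ~2.9x

The quote's point, of course, is that single data points like these say nothing about the full efficiency curve.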
 

beginner99

Diamond Member
Jun 2, 2009
5,233
1,610
136
Apple's cores are just in a totally different league from their x86 rivals, and denial of that still runs strong.
Or, to put it another way, x86 CPU manufacturers should be ashamed that they are beaten so soundly by a CPU from some lifestyle company whose main business isn't selling CPUs...

Apple should be ashamed that we can't even find software to run workloads tailored for these CPUs. Only half sarcastic. In data science it's very much either get an Intel Mac or ditch Apple. The M1 isn't there yet, or rather the software isn't. And good luck if you need to do deep learning (although that didn't really work on Macs before either).

Ditching backwards compatibility in a vertically integrated, locked platform will obviously be beneficial for performance. It's like consoles used to be: punching above their weight (here the weight is equal but the performance isn't). Neither Intel nor AMD can do that easily, because the servers their chips need to run in often run decades-old software, need decades-old instruction sets, and actually use said instructions. Both also need to take into account the devices these chips run in: their cheapest device isn't $2000, it's a $300 laptop. Apple's performance also translates directly to die size / silicon = cost.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
Going back and reviewing, I believe I mis-remembered the SPEC setup. I believe it is Anandtech that compiles with flags that turn off SSE for x264/x265. The actual flag disables ASM, which makes sense as it is highly optimized for x86, but then the fallback is to not have any SSE/AVX. You'd have to check each test's compiler flags to see what is being used.
If you mis-remember then it is your duty to remember it properly and not make misleading claims.


These are the flags used by Andrei:

Code:
-Ofast -fomit-frame-pointer
-march=x86-64
-mtune=core-avx2
-mfma -mavx -mavx2
 

jeanlain

Member
Oct 26, 2020
159
136
86
Blizzard has been making games for the Mac since they started ages ago. Including World of Warcraft. So it should be a solid build.
Blizzard has been progressively abandoning the Mac.

I'd be very surprised if WoW included specific TBDR optimisations, using dedicated Metal APIs. My bet is that they simply converted their DX12 code into Metal with some automatic translation tool like everyone else does (including Feral), checked for bugs and performance issues, and recompiled the game to ARM.
All AAA games are coded for IMR GPUs, except perhaps Baldur's Gate III, which is being ported to the iPad.
In benchmark tools specifically designed to compare a wide array of devices, from mobile to desktop (e.g., GFXBench, 3DMark Wild Life), Apple GPUs perform much better.

But for the foreseeable future, Apple GPUs will underperform in AAA games, as these games will be coded with DX/Vulkan for IMR GPUs and ported to Metal as an afterthought (if they are ported at all).
We already saw a large performance difference between macOS and Boot Camp on the same Intel Mac. This difference will get larger, due to the lack of optimisation for Apple GPUs.
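To illustrate the TBDR/IMR distinction behind this argument, here is a toy model in plain Python (no real graphics API): an immediate-mode renderer shades every fragment that passes the depth test as draws arrive, while a tile-based deferred renderer resolves visibility for a whole tile first and shades each pixel once.

Code:
# Toy overdraw model: 4 triangles covering the same pixel, submitted in
# this order. Smaller depth = closer to the camera.
depths = [5, 3, 8, 1]

# IMR: shade whenever a fragment passes the depth test at that moment.
imr_shaded, nearest = 0, float("inf")
for d in depths:
    if d < nearest:
        nearest = d
        imr_shaded += 1      # shaded now, possibly overwritten later

# TBDR: bin geometry per tile, resolve visibility, then shade once.
tbdr_shaded = 1              # only the final visible fragment is shaded

print(imr_shaded, tbdr_shaded)   # 3 vs 1: the IMR path wasted two shades

Engines written for IMR GPUs leave that kind of saving (plus tile-memory tricks) on the table, which is why ports without TBDR-specific work underperform on Apple GPUs.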
 

jeanlain

Member
Oct 26, 2020
159
136
86
So in 526.blender_r and 511.povray_r, the M1 Max massacres the 5980HS, but in Cinebench R23 it . . . doesn't? Wouldn't it be nice of AT to run a current build of Blender on their M1 Max review sample instead?
The M1 has been underperforming in Cinebench, so these results are not surprising. My guess is that Cinebench favours x86 and is not as optimised for other ISAs.
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
While there's no Blender, there are at least some Handbrake transcoding results from Tom's Guide's 14" MacBook review (4K to 1080p):

4K to 1080p transcoding of an unspecified clip took (relative times are worked out in a sketch below):
  • 4 min 48s - 14" MacBook Pro (M1 Max)
  • 4 min 51s - 14" MacBook Pro (M1 Pro)
  • 7 min 20s - Razer Blade 14 (Ryzen 9 5900HX)
  • 7 min 46s - 13" 2020 MacBook Pro (M1)
  • 8 min 10s - Dell XPS 15 (Core i7-11800H)
  • 11 min 25s - Surface Laptop Studio (11th-gen i5, probably the i5-11320H)
  • 18 min 12s - Dell XPS 13 (Core i7-1185G7)

Those 11th-gen XPSes are pitched as Mac competitors in look, form, and quality, and they are not cheap. With similar specs and the OLED screen, the 15" lands squarely in 16" MacBook Pro pricing territory. The 13" is a bit cheaper in relative terms, but just look at the performance.

The Razer Blade 14 is pretty much the best x86 laptop currently available in this form factor, especially for multithreaded work. At least it has an RTX 3080 that vastly outperforms the M1 Pro (and often the Max as well). The Dells usually have RTX 3050 Tis.
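To make the spread easier to see, here are the same results reduced to relative times (pure arithmetic on the figures quoted above):

Code:
# Relative 4K -> 1080p transcode times from the Tom's Guide figures.
times_s = {
    "M1 Max 14in MBP":        4 * 60 + 48,
    "M1 Pro 14in MBP":        4 * 60 + 51,
    "Ryzen 9 5900HX Blade":   7 * 60 + 20,
    "M1 13in MBP":            7 * 60 + 46,
    "i7-11800H XPS 15":       8 * 60 + 10,
    "i5-11320H(?) Surface":  11 * 60 + 25,
    "i7-1185G7 XPS 13":      18 * 60 + 12,
}
fastest = min(times_s.values())
for name, t in sorted(times_s.items(), key=lambda kv: kv[1]):
    print(f"{name:22s} {t / fastest:.2f}x the fastest time")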
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
4K to 1080p transcoding of an unspecified clip took:
  • 4 min 48s - 14" MacBook Pro (M1 Max)
  • 4 min 51s - 14" MacBook Pro (M1 Pro)
  • 7 min 20s - Razer Blade 14 (Ryzen 9 5900HX)
  • 7 min 46s - 13" 2020 MacBook Pro (M1)
  • 8 min 10s - Dell XPS 15 (Core i7-11800H)
  • 11 min 25s - Surface Laptop Studio (11th-gen i5, probably the i5-11320H)
  • 18 min 12s - Dell XPS 13 (Core i7-1185G7)
Is that with hardware encoding enabled? I remember the original M1 was quite a bit slower than comparable x86 CPUs in software mode, which is why I am asking.
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
Is that with hardware encoding enabled? I remember the original M1 was quite a bit slower than comparable x86 CPUs in software mode, which is why I am asking.
A good point; unfortunately the article is frustratingly light on details (not even the codec is mentioned).

Do you remember which reviews those were? Handbrake only added ARM support in version 1.4.0, released this July. Previously it must have run under Rosetta 2 emulation.


EDIT:

Overall a good observation

This guy has a very detailed video comparing the M1 Mac mini to an Intel Panther Canyon NUC (i7-1165G7), running a Handbrake 10-bit x265 encode with both emulated and ARM-native binaries:

In his (totally unrelated and different) test it took:
  • 7min 34s for the M1 Mac Mini (native)
  • 7min 53s for the i7 Tiger Lake Panther Canyon NUC
  • 10min 36s for the M1 Mac Mini (running under emulation)
So it is true that, at least in 4K x265 encoding, the difference between the M1 and the Core i7-11xx series is a lot closer than in the Tom's Guide transcoding results (though the XPS 13 is slim, so it might also be throttling hard).
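From those numbers, the Rosetta 2 penalty on this workload is easy to put a figure on:

Code:
# Emulation overhead implied by the quoted Handbrake x265 times.
native_s   = 7 * 60 + 34     # M1 Mac mini, ARM-native build
emulated_s = 10 * 60 + 36    # same machine under Rosetta 2
print(f"{emulated_s / native_s:.2f}x")   # ~1.40x: ~40% longer when emulated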
 
Jul 27, 2020
20,040
13,739
146
Interesting how the AT review states that Apple hasn't defined any max TDP for their SoC: it just goes as far as it can until thermal conditions prevent going any further. Someone needs to test an M1 Max in Alaska in sub-zero temperatures. This also suggests that a Mac Pro with water cooling could be formidable.
 

jeanlain

Member
Oct 26, 2020
159
136
86
I'm not even quite convinced it's x86 but rather it just scales extremely well with SMT.
I was referring to single-core scores. For instance, a Tiger Lake CPU scores higher than the M1 in single-thread Cinebench runs, while it's the opposite in most other single-thread tasks (EDIT: 18 out of the 22 SPEC tests).
 

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,693
136

nickmania

Member
Aug 11, 2016
47
13
81
With this chip, Apple is giving filmmakers, youtubers, etc. the best option. You can use one of the best video codecs, ProRes, and use your M1 system to render and apply effects much faster than on any other x86 computer, even desktops. They focus much more on this approach than on making a "good chip for everything". The render times in Final Cut Pro are crazy, way ahead of anything an x86 computer can manage. So you need to use their codec, their computer, and their software to get that improvement in render times and effects. When people draw the comparisons and say it's not a fair benchmark to compare renders in Premiere with renders in Final Cut, you need to understand that it IS a good benchmark if you take the user and the final result into consideration. The numbers are impressive.

Surely they are going to continue this approach in other sectors, designing the chips for specific tasks so they can sell the "whole package" a bit ahead of the x86 world, which is not going to be able to compete in the short term.

Yes, it is overpriced, but now they are selling something totally different than in the Intel era, and this time it makes sense to buy one of these overpriced machines if it fits your workflow.
 

NTMBK

Lifer
Nov 14, 2011
10,324
5,360
136

If you read into the details, it's down to the fact that the M1 has a unified memory architecture. For this workflow, not shuttling data back and forth across the PCIe bus is a huge win. This is what AMD have been saying since the Llano days ;)

Hopefully this will light a fire under PC manufacturers to take integrated graphics more seriously, and for AMD and Intel to actually bring those products to market.
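A back-of-envelope model of what "not shuttling data across PCIe" is worth; the working-set size and effective bandwidth below are illustrative assumptions, not figures from the article:

Code:
# Sketch: the copy cost a discrete GPU pays that unified memory avoids.
working_set_gb = 4       # assumed size of the data the GPU needs
pcie_eff_gb_s = 25       # assumed effective PCIe 4.0 x16 throughput
round_trip_s = 2 * working_set_gb / pcie_eff_gb_s   # upload + readback
print(f"~{round_trip_s:.2f} s of copy time per round trip")   # ~0.32 s
# On a unified-memory SoC the CPU and GPU share one pool, so this term
# vanishes; iterative workflows pay it over and over on a dGPU.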
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
If you read into the details, it's down to the fact that the M1 has a unified memory architecture. For this workflow, not shuttling data back and forth across the PCIe bus is a huge win.

Still rather amazing to be beating such a total monster of a dGPU :)

This is what AMD have been saying since the Llano days ;)

Hopefully this will light a fire under PC manufacturers to take integrated graphics more seriously, and for AMD and Intel to actually bring those products to market.

It seems really, really hard though. AMD have been making the console chips for a while now, so the theoretical capability is obviously there, but nothing 'real' has come of it yet. Never mind getting the software to play along.
 

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,694
136
If you mis-remember then it is your duty to remember it properly and not make misleading claims.


These are the flags used by Andrei:

Code:
-Ofast -fomit-frame-pointer
-march=x86-64
-mtune=core-avx2
-mfma -mavx -mavx2

Unless something has changed within the last year or two, for x264 those flags won't turn on SSE/AVX. If you turn off ASM then there is no code path to enable SSE/AVX: those instruction sets are implemented in the hand-written ASM code path and are left out if ASM is disabled, no matter what your other flags are set to. So no, what I said the second time wasn't misleading. You can download and compile the software yourself and check if you want to.

Edit: Most likely those flags are fine for the rest of the software but I'm not as familiar with all of the sub-test software so I didn't want to speak on them specifically and don't know what specific flags/optimizations may be used in the release builds versus Anandtech's builds.
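A minimal sketch (plain Python, not x264's actual code) of why the compiler flags are beside the point: x264's SIMD lives in hand-written assembly kernels selected through a runtime dispatch table, so a build with ASM disabled is left with only the plain-C fallback no matter what -mavx2 says.

Code:
# Toy model of x264-style runtime dispatch: ASM kernels vs C fallback.
HAVE_ASM = False             # analogous to configuring the build with ASM off

def sad_c(a, b):
    """Portable C-style fallback: no SIMD, runs on any ISA."""
    return sum(abs(x - y) for x, y in zip(a, b))

def sad_avx2(a, b):
    """Stand-in for a hand-written AVX2 assembly kernel."""
    return sad_c(a, b)       # imagine this being many times faster

def pick_kernel(cpu_has_avx2):
    # Kernel choice happens at runtime via CPU detection, not via -m flags.
    if HAVE_ASM and cpu_has_avx2:
        return sad_avx2
    return sad_c             # ASM disabled: the only code path left

sad = pick_kernel(cpu_has_avx2=True)
print(sad([1, 5, 9], [2, 3, 4]))   # uses the C path: 1 + 2 + 5 = 8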
 

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,694
136
I am not cherry picking. I am quoting Andrei Frumusanu's general assessment of the competitive landscape regarding power and performance efficiency. Has it occurred to you that people like him are able to weigh benchmark results not only by simple averages but also with their technical knowledge and experience? And has it occurred to you that you are the one cherry picking, when he clearly states that CB is a negative outlier for the M1 Max?
The amount of ignorance you are showing tells me that you are not really interested in a grown-up discussion but instead will try to grasp at straws until the end of days.

Appeal to authority and ad hominem attacks don't mean much to me.

There is a lot I could say about why you are cherry picking, but I'll just stick with the initial argument and ask: where are Andrei's numbers for Cezanne showing the M1 with a 2.5-3x efficiency lead, since that is what was actually being discussed?
 
Jul 27, 2020
20,040
13,739
146

A guy with his own renderer and some pure CPU power comparisons. Unfortunately there are few modern CPUs in the lineup, but the comparisons against the Xeon W-3245, i7 9750, and Threadripper 3990X are very interesting.
The battery results for the M1 Max are excellent. Only at 4K rendering does it slow down slightly on battery power.