Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,825
1,396
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from the number of GPU cores). Basically, Apple is taking the same approach with these chips as they do with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).

EDIT:

Screen-Shot-2021-10-18-at-1.20.47-PM.jpg

M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K H.264, HEVC (H.265), ProRes

M3 Family discussion here:


M4 Family discussion here:

 

biostud

Lifer
Feb 27, 2003
18,700
5,434
136
Doesn't Apple simply need to be good at everything, and then excel (destroy competition) in content creation to be a success? Nobody is going to buy this primarily as a gaming rig, but obviously it will be nice if it can handle games as well.
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Doesn't Apple simply need to be good at everything, and then excel (destroy competition) in content creation to be a success? Nobody is going to buy this primarily as a gaming rig, but obviously it will be nice if it can handle games as well.


Apple's primary goal is to take care of the existing Mac customer base. A lot of that is content creation, that's why they spend so much time talking about that and not talking about frame rate in the AAA game of the moment or about how quickly Excel calculations complete.

Once Apple feels that the existing Mac customer base is satisfied with the transition then they can see if there are opportunities to expand the Mac customer base, either by capturing parts of their existing markets they don't currently have (i.e. people doing content creation work on PCs rather than Macs) or by gaining customers in markets not traditionally viewed as strengths on the Mac platform. I have to think games would be well down that list, since the problem there isn't about convincing gamers to buy a Mac - first they would have to convince game developers it is worth porting to Metal on a "build it and they will come" hope.

I think the Mac has plenty of potential to grow its market share while continuing to completely ignore the gaming market, so I don't see Apple changing its policy of not caring about gaming on the Mac anytime soon. If Apple did care about convincing game developers to port, increasing the Mac's market share is more likely to succeed than anything else they could do, so ignoring the gaming market in favor of other markets might in a roundabout way be the best way to make that happen!
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
BTW there is a very nice website to watch for the state of M1 Mac gaming:


Overall the state is pretty bad. It's hugely fragmented and almost no new AAA games actually run, unless they have Metal implementations (in which case they actually work very well).

What does work quite well are:
  • Native titles and those originally ported to x86 Macs (in 64-bit versions only!)
  • Many indie titles on Steam work well, either under Rosetta or natively.
  • Older Windows games (DX11 and earlier), either in CrossOver (Wine) or Parallels (VM); for 32-bit games Parallels is the best pick

Just to prove the point. Here is a list of latest games used in Techpowerup GPU reviews:

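| Game | Method of running | Runs well | Runs at all | Unsupported API? |
|---|---|---|---|---|
| Assassins Creed Valhalla | - | 0 | 0 | DX12 |
| Battlefield V | - | 0 | 0 | |
| Borderlands 3 | rosetta | 1 | 1 | |
| Civilization IV | rosetta | 1 | 1 | |
| Control | parallels | 0 | 1 | |
| Days Gone | - | 0 | 0 | |
| Cyberpunk 2077 | - | 0 | 0 | DX12 |
| Deathloop | - | 0 | 0 | DX12 |
| Death Stranding | - | 0 | 0 | DX12 |
| Detroit: Become Human | - | 0 | 0 | Vulkan |
| Divinity Original Sin II | rosetta | 1 | 1 | |
| DOOM Eternal | - | 0 | 0 | Vulkan |
| F1 2021 | - | 0 | 0 | DX12 |
| Far Cry 5 | - | 0 | 0 | |
| Far Cry 6 | - | 0 | 0 | DX12 |
| Gears 5 | - | 0 | 0 | DX12 |
| Hitman 3 | - | 0 | 0 | DX12 |
| Metro Exodus | rosetta | 1 | 1 | |
| Red Dead Redemption 2 | - | 0 | 0 | DX12 |
| Resident Evil Village | - | 0 | 0 | DX12 |
| Sekiro: Shadows Die Twice | Wine | 1 | 1 | |
| Shadow of the Tomb Raider | rosetta | 1 | 1 | |
| Star Wars Squadrons | - | 0 | 0 | |
| Witcher 3 | - | 0 | 0 | |
| Watch Dogs Legion | - | 0 | 0 | |
| Total (out of 25) | | 6 | 7 | |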

  • So only 7 out of 25 run at all
  • 5 actually run well enough - just install via Steam. Interestingly, none are actually ARM native! (BG3 is pretty much the only high-profile ARM native game)
  • 1 only runs when using Wine, but runs well (Sekiro)
  • 1 barely runs in a VM (Control)
Luckily older games (DX9-DX11) are usually better supported.
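
For anyone who wants to re-check the totals, here is a minimal Python sketch that tallies the table's columns as listed (6 run well, 7 run at all). The tuple layout and the `summarize` helper are just illustrative names, not anything from TechPowerUp or the tracking site, and the rows are transcribed by hand from the table above.

```python
from collections import Counter

# (game, method of running, runs well, runs at all, unsupported API) as listed in the table.
# None means the table left that cell empty or marked it "-".
ROWS = [
    ("Assassins Creed Valhalla", None, 0, 0, "DX12"),
    ("Battlefield V", None, 0, 0, None),
    ("Borderlands 3", "rosetta", 1, 1, None),
    ("Civilization IV", "rosetta", 1, 1, None),
    ("Control", "parallels", 0, 1, None),
    ("Days Gone", None, 0, 0, None),
    ("Cyberpunk 2077", None, 0, 0, "DX12"),
    ("Deathloop", None, 0, 0, "DX12"),
    ("Death Stranding", None, 0, 0, "DX12"),
    ("Detroit: Become Human", None, 0, 0, "Vulkan"),
    ("Divinity Original Sin II", "rosetta", 1, 1, None),
    ("DOOM Eternal", None, 0, 0, "Vulkan"),
    ("F1 2021", None, 0, 0, "DX12"),
    ("Far Cry 5", None, 0, 0, None),
    ("Far Cry 6", None, 0, 0, "DX12"),
    ("Gears 5", None, 0, 0, "DX12"),
    ("Hitman 3", None, 0, 0, "DX12"),
    ("Metro Exodus", "rosetta", 1, 1, None),
    ("Red Dead Redemption 2", None, 0, 0, "DX12"),
    ("Resident Evil Village", None, 0, 0, "DX12"),
    ("Sekiro: Shadows Die Twice", "wine", 1, 1, None),
    ("Shadow of the Tomb Raider", "rosetta", 1, 1, None),
    ("Star Wars Squadrons", None, 0, 0, None),
    ("Witcher 3", None, 0, 0, None),
    ("Watch Dogs Legion", None, 0, 0, None),
]


def summarize(rows):
    """Tally the table: totals, how the running games are launched, and API blockers."""
    total = len(rows)
    runs_at_all = sum(r[3] for r in rows)
    return {
        "total": total,
        "runs_well": sum(r[2] for r in rows),
        "runs_at_all": runs_at_all,
        # Launch method, counted only for games that run at all.
        "by_method": Counter(r[1] for r in rows if r[3]),
        # Unsupported API, counted only where the table names one.
        "blocked_by_api": Counter(r[4] for r in rows if r[4]),
        "fail_share": (total - runs_at_all) / total,
    }


summary = summarize(ROWS)
print(summary)
```

Going by these rows, 18 of the 25 titles don't start at all, and 12 of those are tagged with an API (DX12 or Vulkan) that has no route onto the M1 today.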

The most obvious problem is the lack of DX12 support in Parallels. If they somehow manage to enable this, things would probably go from bad to passable (though that alone is no guarantee: 4 DX11-capable titles in the list don't run either).

And of course the self-inflicted wound of not supporting Vulkan. The latter can be bypassed by devs using MoltenVK or other such tools, but that is rarely a plug-and-play solution in practice and requires plenty of dev resources.


TL;DR - Gaming on ARM Macs:

Legacy/indie titles should work (let's say at least half of the time). Latest AAA games fail to run 4 times out of 5. If the title only supports DX12, it's a guaranteed fail.
 

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
Regarding games I don't think Apple will make any more effort than pushing universal apps as the solution of covering all their platforms:
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
Regarding games I don't think Apple will make any more effort than pushing universal apps as the solution of covering all their platforms:

Probably. I know Apple doesn't care about gaming and their users don't game at all, other than their favorites from other platforms that just happen to run in macOS as well.

I just really wish they supported Vulkan in some limited manner, pretty much as they do with Rosetta 2 and x86 support: a separate executable you have to download that comes with an EULA that essentially says that Apple has no commitments, etc ...

That would allow them to:
  • Only support M1 Macs - essentially only one generation of GPUs and no headaches for iOS
  • Instantly allow the Mac to be 9/10 as good at gaming as Linux (as Proton could then essentially be ported to macOS).
  • All of this would actually put a dent in Windows exclusivity to gaming and motivate devs to support Vulkan vs DX12
Metal and Vulkan are similar enough that such a product shouldn't be that big of a deal (for Arm Macs only). I'm sure most of it could even be implemented as a side-project by Apple's gaming-enthusiast devs, if they only gave a green light.
 
Jul 27, 2020
20,040
13,739
146
Metal and Vulkan are similar enough that such a product shouldn't be that big of a deal (for Arm Macs only). I'm sure most of it could even be implemented as a side-project by Apple's gaming-enthusiast devs, if they only gave a green light.
Apple doesn't even have to do anything.

Portability benchmark of Dota2 on MacOS - gfx-rs nuts and bolts

It's up to the developers to target MoltenVK. Epic Games' Tim Sweeney would have likely extolled the virtues of the M1's GPU with his Unreal Engine but Apple decided to alienate him through litigation in the Epic Games v. Apple case (I know Epic sued them but Apple failed to see the bigger picture and refused to settle).
 

Shivansps

Diamond Member
Sep 11, 2013
3,875
1,530
136
Because Apple is Apple, they don't support Vulkan: no one would use Metal anymore, and supporting Vulkan also means supporting development that could be used on different platforms.

Still, MoltenVK can be used, but that's pretty much a fancy wrapper, so a perf loss is expected. Supporting DX12 means implementing VKD3D on top of MoltenVK; that's two wrappers to get the game running.
Steam can get this done by porting Proton.
 

bigggggggg

Junior Member
Nov 27, 2020
18
12
41
Started reading the review, first thing to notice: oh my god, this thing is stupidly fast

edit: 11% faster than a desktop 5900X for floating point workloads. It is just incredible, I have no words.
Maybe I should regret my comments about the 12900HK being faster than this :D
 


tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
And there were people already claiming that the power-efficiency is nothing to write home about, well how about that?

View attachment 51890

In Cinebench 23 M1 has a 5% worse ST score and a 3-4% worse MT score than an 11980HK. For the ST workload the package draws 3x less and the whole machine 4x less power. For the MT workload the difference is 2.5x.
Cinebench isn't a very relevant benchmark. Look instead at gcc and bwaves in SPEC. The former is a typical integer workload and the latter is a typical floating point workload. The perf/W advantage is 4x and 6x respectively in those two workloads compared to Tiger Lake + 10nm SF.

Part of the lead in bwaves is no doubt due to the ~5x effective memory bandwidth compared to what's available on the x86 competition, but people who put Apple's lead down to its node advantage should finally shut up.

There's no company providing server-class memory bandwidth in a laptop.
 

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
Well, this may explain the discrepant clock speeds reported by Geekbench:

The CPU cores clock up to 3228MHz peak, however vary in frequency depending on how many cores are active within a cluster, clocking down to 3132 at 2, and 3036 MHz at 3 and 4 cores active. I say “per cluster”, because the 8 performance cores in the M1 Pro and M1 Max are indeed consisting of two 4-core clusters, both with their own 12MB L2 caches, and each being able to clock their CPUs independently from each other, so it’s actually possible to have four active cores in one cluster at 3036MHz and one active core in the other cluster running at 3.23GHz.

--

GPU runs at 1.3 GHz. This is not being reported correctly by Geekbench.

--

On the M1 Max (400 GB/s), the CPUs cannot max out memory bandwidth on their own: CPU memory scaling tops out at 243 GB/s. The M1 Pro's CPUs can max out its 204 GB/s though.

Andrei can get the GPU to use up to 90 GB/s.

This allows extra memory bandwidth overhead for the rest of the SoC.

--

M1 Max idle power 0.2 Watts! <--- This is my favourite performance stat of the M1 Max of the entire review. Holy frickin' wow! Wall power of the entire 16" MacBook Pro at idle with screen turned down to minimum was 7.2 Watts.

Aztec High Offscreen + 511.povray 92 Watts. (In comparison, i9-11980HK uses 220 Watts.)

--

@Doug S was right about the CPU scaling at 1.6X theoretically.

Because the new M1 Pro and Max have 2 fewer E-cores, just assuming linear scaling, the theoretical peak of the M1 Pro/Max should be +62% over the M1. Of course, the new chips should behave better than linear, due to the better memory subsystem.

However, memory makes a huge difference so in some workloads the scaling is much higher due to the higher memory bandwidth, with some scores scaling at >2X.
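
A rough way to sanity-check that +62% figure: the review doesn't spell out how it was derived, so the sketch below just assumes a simple linear model where each E-core contributes a fixed fraction (`e_rel`, a made-up parameter name) of a P-core's throughput, then solves for the fraction that gives 1.62X when going from 4P+4E (M1) to 8P+2E (M1 Pro/Max).

```python
def linear_scaling(p_new, e_new, p_old, e_old, e_rel):
    """Throughput ratio new/old if every core scales linearly.

    e_rel is the assumed throughput of one E-core relative to one P-core.
    """
    return (p_new + e_new * e_rel) / (p_old + e_old * e_rel)


def implied_e_rel(ratio, p_new, e_new, p_old, e_old):
    """Solve (p_new + e_new*x) / (p_old + e_old*x) = ratio for x."""
    return (p_new - ratio * p_old) / (ratio * e_old - e_new)


# M1 is 4P+4E, M1 Pro/Max is 8P+2E (core counts from the first post).
x = implied_e_rel(1.62, p_new=8, e_new=2, p_old=4, e_old=4)
print(f"E-core worth about {x:.2f} of a P-core")
print(f"check: {linear_scaling(8, 2, 4, 4, x):.2f}X")

# Bounds of the model: E-cores worth nothing vs. E-cores equal to P-cores.
print(linear_scaling(8, 2, 4, 4, 0.0), linear_scaling(8, 2, 4, 4, 1.0))
```

Under that assumption an E-core works out to roughly a third of a P-core in MT throughput, and the model can never exceed 2X, so the >2X scores really do have to come from the memory subsystem rather than the core count.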

117496.png


119365.png



--

GFXBench Aztec High Offscreen for M1 Max scales at 4X M1, and M1 Pro scales at 2X M1. Perfect scaling. Same goes for 3D Mark Wild Life Extreme.

Gaming not great though, not surprisingly.

--

Andrei agrees with my comments about the ProRes/ProRes RAW acceleration:

To further improve content creation, the new media engine is a key feature of the chip. Particularly video editors working with ProRes or ProRes RAW, will see a many-fold improvement in their workflow as the new chips can handle the formats like a breeze – this along is likely going to have many users of that professional background quickly adopt the new MacBook Pro’s.

--

Overall, this solidifies my resolve to buy the lowest M1 Pro SoC available in the Mac mini next year, or else the M2, as long as it has enough ports. The M1 is already twice as fast as my current primary machine, a Core i5-7600, and I'm actually buying this for a secondary machine.

An 8-core (6+2) M1 Pro machine would be way more performance than I need, and an M2 machine at say 10% faster than an M1 machine would also be more performance than I need. But I would prefer more ports than in the current M1 Mac mini.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
All mildly amazing, yes.

One thing very gently intriguing me - shouldn't there be some GPU accelerated compute tasks where the massive memory pool and the shared memory between CPU & GPU (+ other bits) mean it gets even sillier? cf. Linus Torvalds talking about APUs having some distinct advantages, years ago now.

Maybe that is mostly seen in power savings.
 

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
sixcolors: 14-inch MacBook Pro review: A Mac Pro in your backpack

mbp2021-tests-max-numbers.png

The coder geeks here will like that compile time. It's twice as fast as a 3.2 GHz 8-core Xeon W-2140B iMac Pro.

---

BTW, Andrei responded to someone about why Geekbench isn't scaling GPU scores so well on M1 Max. This is second-hand information, but apparently Andrei thinks the Geekbench tests are too short, and the M1 Max GPU is not ramping up quickly enough on such a short test.
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Looks like any benchmark & workflow that is even somewhat bandwidth-sensitive will absolutely clean house on the M1 Pro/Max. In practical terms, that should bode well for the sort of professional users who need to manipulate/edit/play back tons of high-resolution media at a time.

This crazy level of engineering work (512-bit LPDDR5 on an SoC!) almost justifies the crazy money Apple is charging for the Pros, almost.
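
To put a number on that bus width, here is a quick back-of-the-envelope sketch. It assumes LPDDR5 at 6400 MT/s (the thread only says "LPDDR5", so the transfer rate is my assumption) and uses the usual peak formula: bytes per transfer times transfers per second. The function name is just for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_mts):
    """Theoretical peak bandwidth in GB/s for a given bus width and transfer rate.

    bus_width_bits / 8 gives bytes per transfer; MT/s times bytes gives MB/s.
    """
    return bus_width_bits / 8 * transfer_rate_mts / 1000


# Assumed: LPDDR5-6400 on the M1 Max (512-bit) and M1 Pro (256-bit).
print(f"M1 Max: {peak_bandwidth_gbs(512, 6400):.1f} GB/s")
print(f"M1 Pro: {peak_bandwidth_gbs(256, 6400):.1f} GB/s")
```

That lines up with the ~400 GB/s and ~204 GB/s figures quoted for the M1 Max and M1 Pro, which is why a half-width bus on the Pro lands at exactly half the bandwidth.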
 

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,694
136
And there were people already claiming that the power-efficiency is nothing to write home about, well how about that?

View attachment 51890

In Cinebench 23 M1 has a 5% worse ST score and a 3-4% worse MT score than an 11980HK. For the ST workload the package draws 3x less and the whole machine 4x less power. For the MT workload the difference is 2.5x.

I don't think anyone was claiming the power efficiency wouldn't be really good, it's just that a particular poster was claiming 500% better than the most efficient x86 competition, which is bollocks, and these new tests show as much. Also, everything looks super power efficient compared to Intel's chips :p.
 

Cali3350

Member
May 31, 2004
127
11
81
The Anandtech article notes that the max CPU clock of the performance cores in the M1 Pro is dictated by the number of active cores in a 4-core cluster.

Apple sells a binned M1 Pro chip with 8 cores (6 performance + 2 efficiency). Do you believe they force the binning such that exactly 1 performance core is disabled in each cluster, or not? It seems it could matter (however slightly), as you could have 2 cores at 3132 MHz on a cluster with both other cores disabled and 4 cores at 3036 MHz on the other, vs a 1 + 1 disable where all cores would run at 3036 MHz.
It's a very slight difference, but it would be a difference within a chip Apple is selling as one SKU, which I'm not aware of happening before.
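
To see how slight, here is a small sketch using the per-cluster clocks quoted from the review (3228 MHz with 1 active core, 3132 with 2, 3036 with 3 or 4). It assumes every enabled P-core is loaded and uses summed MHz as a crude throughput proxy; whether Apple actually ships the asymmetric 2+4 configuration is unknown, so that case is purely hypothetical.

```python
# Peak P-core clock by number of active cores in a cluster, as quoted from the review.
CLUSTER_CLOCK_MHZ = {1: 3228, 2: 3132, 3: 3036, 4: 3036}


def aggregate_mhz(active_per_cluster):
    """Sum of core clocks across clusters with all enabled cores active.

    A crude proxy for all-core throughput; ignores memory and thermal effects.
    """
    return sum(n * CLUSTER_CLOCK_MHZ[n] for n in active_per_cluster if n > 0)


asymmetric = aggregate_mhz((2, 4))  # hypothetical: both disabled cores in one cluster
symmetric = aggregate_mhz((3, 3))   # one core disabled per cluster
print(asymmetric, symmetric)
print(f"difference: {100 * (asymmetric - symmetric) / symmetric:.2f}%")
```

So on paper the 2+4 split would be about 1% ahead in an all-core load, which is well inside normal run-to-run noise.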
 

Hitman928

Diamond Member
Apr 15, 2012
6,187
10,694
136
The memory bandwidth Apple made available to a CPU in a laptop is insane. That is one area where it really is workstation class performance in a laptop. Now, I don't know how applicable it is for most users, but for those that can use it, there is absolutely 0 competition out there today in this format. The hardware accelerated tasks are obviously also very strong and are a big benefit of Apple being vertically integrated. From a CPU and GPU compute perspective, it is still quite impressive, but more in line with the node lead they enjoy compared to the other situations where the x86 competitors don't/can't offer competitive solutions in this format.
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,886
1,103
126
These "reviews" are terrible. Most of them are just running the same crap synthetic benchmarks we already had leaked to us. Is someone actually going to run those machines through their paces at some point? Anandtech is the only one that even made an effort and even that is somewhat lacking.
 
Jul 27, 2020
20,040
13,739
146
Anandtech is the only one that even made an effort and even that is somewhat lacking.
Well, AT admitted that it was tough trying to find something that makes use of the insane bandwidth and real world gaming benchmarks for the M1 are almost non-existent due to unavailability of a native AAA game. I guess the only other thing left to test is Windows ARM edition benchmarks running in Parallels.