Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 teraflops
82 gigatexels/s
41 gigapixels/s

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from the GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from maybe slight clock speed differences occasionally).
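The 2.6-teraflop GPU figure in the spec list above can be sanity-checked with back-of-envelope math. A minimal sketch, assuming each of the 128 execution units contains 8 ALUs and the GPU runs at roughly 1.278 GHz (both figures are assumptions from outside this post, not from the spec list):

```python
# Back-of-envelope check of the M1 GPU's quoted 2.6 teraflops.
# Assumptions (not from the spec list above): 8 ALUs per execution
# unit and a ~1.278 GHz GPU clock.

eus = 128            # execution units, from the spec list
alus_per_eu = 8      # assumed ALU count per EU
flops_per_cycle = 2  # a fused multiply-add counts as 2 FLOPs
clock_ghz = 1.278    # assumed GPU clock

tflops = eus * alus_per_eu * flops_per_cycle * clock_ghz / 1000
print(f"{tflops:.2f} TFLOPS")  # prints 2.62 TFLOPS
```

The same arithmetic with 7 of the 8 GPU cores active (the binned variant) would give about 2.29 teraflops.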

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


M2
Second-generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, HEVC (h.265), and ProRes

M3 Family discussion here:

 

Eug

Lifer
Mar 11, 2000
Media accelerators are in modern PCs too. Intel Quick Sync, NVIDIA NVENC and whatever AMD calls theirs. I've run older computers (like 3rd gen Intel systems) without GPUs and man does just normal use in Windows show how much media and 2D accelerator hardware can make a huge difference.
That’s just it. Even with the discrete GPUs in the higher-end Intel Macs, there were performance issues under some specific video editing circumstances that are solved on the M1. And the Intel iGPUs were just inadequate.

Spot on.

I personally would be all over an 8x4 Apple Silicon CPU in an iMac or MBP16, and I'm sure it would be plenty fast. But at the end of the day, I expect most users would actually benefit more from Apple using all that extra space to massively increase the number of GPU cores, combined of course with faster unified memory to feed them. An M1 with twice the GPU oomph would be quite something.
I don’t think the media acceleration I was talking about is on the GPU. It’s probably T2 +/- other and not GPU or CPU.


Apple T2 Security Chip Comes to Mac mini
The Apple T2 Security Chip brings industry-leading security to your Mac mini. The T2 features an SSD controller with on-the-fly data encryption so everything stored on the SSD is automatically and fully encrypted. The Secure Enclave in T2 ensures that software loaded during the boot process has not been tampered with. T2 also features HEVC video transcoding that’s up to an incredible 30 times faster, enabling pro users to work more quickly with higher resolution video throughout their workflow.1


1 Testing conducted by Apple in October 2018 using preproduction 3.2GHz 6-core Intel Core i7-based Mac mini systems with 64GB of RAM and 2TB SSD, and shipping 3.0GHz dual-core Intel Core i7-based Mac mini systems with 16GB of RAM and 1TB SSD. Performance tests are conducted using specific computer systems and reflect the approximate performance of Mac mini.
 

IvanKaramazov

Member
Jun 29, 2020
I don’t think the media acceleration I was talking about is on the GPU. It’s probably T2 +/- other and not GPU or CPU.
Right, sorry, I didn’t mean for the acceleration scenarios you were describing. I meant rather that if one had to choose between improving either general multi-core CPU performance (via an 8+4 variant, thanks @name99) or general GPU performance in a larger M Chip, that perhaps GPU would have greater marginal benefit for more types of consumers.

Presumably they’ll aim for both, obviously.
 

Doug S

Platinum Member
Feb 8, 2020
When I have seen it mentioned in reviews, it's only been as a negative. It goes against Apple's long-standing ethos of only doing things that can be done well.

People are testing iOS apps that haven't been updated to support this yet. Check back in six months, and maybe it will be more useful. Not for every app: there's no reason someone is going to update a game that requires the phone's gyros/accelerometers to be useful on a Mac, which lacks them.

But will something like the HBO Max app get modified to go full screen, for instance? I'd be shocked if it doesn't, because the same reason that makes an app preferable to a web link on a phone is also true on a Mac. No one makes such apps for the laptop/desktop market, but if a few tweaks to your iOS app will make it suitable for that, why not?
 

Doug S

Platinum Member
Feb 8, 2020
I personally would be all over an 8x4 Apple Silicon CPU in an iMac or MBP16, and I'm sure it would be plenty fast. But at the end of the day, I expect most users would actually benefit more from Apple using all that extra space to massively increase the number of GPU cores, combined of course with faster unified memory to feed them. An M1 with twice the GPU oomph would be quite something.

Who says they won't add 4 more cores AND double the GPU cores as well? In fact I think that's almost a given, since an SoC designed for higher end Macs not only needs more CPU power but also more GPU power to beat the discrete laptop GPUs people would otherwise be wanting.

As for memory, it obviously needs to support a lot more of it. Whether that's still closely coupled LPDDR available at a wider range of capacities or DIMMs, I am willing to bet that 8+4 chip has four 64-bit memory controllers rather than the M1's two. It is almost a requirement, given that Andrei has found that a single big core in the M1 can pretty much saturate the memory bus!
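The controller math above is easy to spell out. A rough sketch, assuming the M1 uses LPDDR4X-4266 (the 4266 MT/s figure is an assumption from outside this post):

```python
# Rough bandwidth math for the memory-controller speculation above.
# Assumption: LPDDR4X-4266, i.e. 4266 mega-transfers/second per pin.

transfers_per_s = 4266e6
bits_per_controller = 64

def bandwidth_gb_s(controllers):
    # total bus width in bytes, times the transfer rate
    return transfers_per_s * controllers * bits_per_controller / 8 / 1e9

print(f"2 controllers: {bandwidth_gb_s(2):.1f} GB/s")  # ~68.3 GB/s
print(f"4 controllers: {bandwidth_gb_s(4):.1f} GB/s")  # ~136.5 GB/s
```

Doubling the controller count doubles peak bandwidth, which is the crux of the argument: a single Firestorm core that can saturate a ~68 GB/s bus gets real headroom from a 4-controller design.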
 

Eug

Lifer
Mar 11, 2000
But will something like the HBO Max app get modified to go full screen, for instance? I'd be shocked if it doesn't, because the same reason that makes an app preferable to a web link on a phone is also true on a Mac. No one makes such apps for the laptop/desktop market, but if a few tweaks to your iOS app will make it suitable for that, why not?
Because you can use Catalyst to create a proper Mac app instead. No need to depend on an iPadOS app for use on a Mac.

 

Heartbreaker

Diamond Member
Apr 3, 2006
Spot on.

I personally would be all over an 8x4 Apple Silicon CPU in an iMac or MBP16, and I'm sure it would be plenty fast. But at the end of the day, I expect most users would actually benefit more from Apple using all that extra space to massively increase the number of GPU cores, combined of course with faster unified memory to feed them. An M1 with twice the GPU oomph would be quite something.

With all the extra functional units inside the SoC, what use case do you have that still needs more than 4 performance cores?
 

IvanKaramazov

Member
Jun 29, 2020
With all the extra functional units inside the SoC, what use case do you have that still needs more than 4 performance cores?
I have very little use case for it, if I'm being honest, which is why I was agreeing with you. I think the lizard part of my brain wants 8 performance cores, while rationally I have little use for them. I do a few tasks regularly that completely peg the CPU on my 2016 MBP 16" (OCR, bizarrely, is one, which I do very frequently. I wonder if that could be sped up by the neural engine though?). But in general, I'm only very, very occasionally pushing the CPU to the max. The M1 is probably twice as fast as my current laptop in ST and thrice as fast in MT. I really don't need more than that.
 

Heartbreaker

Diamond Member
Apr 3, 2006
From what I have seen online, the CPU in my current machine has about 1/10th the single-core performance of the M1, and I seldom peg all 4 of its cores, yet it really doesn't feel slow to me. And I don't have any media encoders or support hardware at all; nearly everything has to be done through the CPU.

It would be an insane 10X CPU upgrade to get an M1. I was planning to build a new PC, but now I am starting to consider just getting a Mini and indefinitely shelving the idea of a new build...

So much power, in such a small, cool and quiet package. Very tempting.
 

Eug

Lifer
Mar 11, 2000
From what I have seen online, the CPU in my current machine has about 1/10th the single-core performance of the M1, and I seldom peg all 4 of its cores, yet it really doesn't feel slow to me. And I don't have any media encoders or support hardware at all; nearly everything has to be done through the CPU.

It would be an insane 10X CPU upgrade to get an M1. I was planning to build a new PC, but now I am starting to consider just getting a Mini and indefinitely shelving the idea of a new build...

So much power, in such a small, cool and quiet package. Very tempting.
What do you have?
 

Heartbreaker

Diamond Member
Apr 3, 2006
What do you have?

Core 2 Quad running 3.2 GHz. It might be a little more than 1/10th of M1 on ST. I was mixing up CB R23 and Geekbench.

From comparing another score online, I figure mine would get about ~175 ST and ~700 MT in CB R23.
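That estimate lines up with the "a little more than 1/10th" guess earlier in the post. A quick check, assuming the M1 scores roughly 1500 single-thread in Cinebench R23 (a figure from reviews, not from this thread):

```python
# Sanity check of the "a little more than 1/10th" ST estimate above.
# Assumption: the M1 scores roughly 1500 single-thread in CB R23.

m1_st = 1500
c2q_st_estimate = 175  # the Core 2 Quad estimate from the post

ratio = c2q_st_estimate / m1_st
print(f"{ratio:.3f}")  # prints 0.117, i.e. a bit over 1/10th
```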
 

podspi

Golden Member
Jan 11, 2011
From what I have seen online, the CPU in my current machine has about 1/10th the single-core performance of the M1, and I seldom peg all 4 of its cores, yet it really doesn't feel slow to me. And I don't have any media encoders or support hardware at all; nearly everything has to be done through the CPU.

It would be an insane 10X CPU upgrade to get an M1. I was planning to build a new PC, but now I am starting to consider just getting a Mini and indefinitely shelving the idea of a new build...

So much power, in such a small, cool and quiet package. Very tempting.

It would be more tempting if we knew it could also run, say, Windows on ARM. I have no interest in getting stuck in Apple's walled garden, but I really think I want that CPU.
 

itsmydamnation

Platinum Member
Feb 6, 2011
I don't think Andrei used the term TDP; he simply measured the CPU power draw for Cinebench as 15 W, whereas he measured it elsewhere going as high as 24 W, IIRC. I think the Notebookcheck scores are using an external monitor and thus monitoring whole-system power draw for those Renoir Cinebench figures, but that still may suggest (I'd love for people to measure for themselves here to confirm) that the M1 is achieving its score at around half the power draw of the Renoir chips.
i already did in this thread

https://forums.anandtech.com/thread...ench-5-single-core-1700.2587205/post-40355153


R23 multi-thread (8C/8T)
Score: 5851
Core clocks: 3.0 GHz, all cores
TDP limit: 17 watts
17.x watts package power consumption
2.x watts per core power consumption

will do again with higher performance mode on


R23 multi-thread (8C/8T)
Score: 6606
Core clocks: 3.3 GHz, all cores
TDP limit: 25 watts
25.x watts package power consumption
3.0–3.2 watts per core power consumption
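The two runs above reduce to a points-per-watt comparison, using only the numbers posted:

```python
# Points-per-watt for the two Cinebench R23 runs posted above.
runs = {
    "17 W limit": (5851, 17.0),  # (score, package watts)
    "25 W limit": (6606, 25.0),
}

for label, (score, watts) in runs.items():
    print(f"{label}: {score / watts:.0f} pts/W")

# The lower TDP is the more efficient operating point: the extra
# ~13% of score at 25 W costs ~47% more package power.
```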
 

beginner99

Diamond Member
Jun 2, 2009
Some additional news:

Tensorflow Acceleration on the Mac

It works on both Intel Macs and the new M1 Macs, with the latter being a lot faster. As far as I understand, it uses the GPU and not the "Neural Engine", but that's not 100% clear.

However: most stuff in production still uses TF1, and pretty much nothing uses TF 2.4 yet, so it will probably take some time until this becomes very useful. For compatibility reasons, one usually doesn't want to build models on the newest version.

But it just adds more salt to the Intel/AMD wounds. For all the talk about Navi efficiency, it could be much better (honestly, the iGPU is the true winner in the M1; it's the more baffling part).
 

Eug

Lifer
Mar 11, 2000
Haha. CodeWeavers was able to get CrossOver running on the M1 to run Windows apps.

I can't tell you how cool that is; there is so much emulation going on under the covers. Imagine - a 32-bit Windows Intel binary, running in a 32-to-64 bridge in Wine / CrossOver on top of macOS, on an ARM CPU that is emulating x86 - and it works! This is just so cool.


It is an old benchmark but has anyone got any 11.5 cinebench numbers so I can compare 8 year old systems 🤓?

Precisely for computers like this 😁
Notebookcheck?

What computers?

Core 2 Quad running 3.2 GHz. It might be a little more than 1/10th of M1 on ST. I was mixing up CB R23 and Geekbench.

From comparing the another score online I figure mine would get about ~175 ST, ~700 MT in CB R23.
I tried installing CB R23 on my Core 2 Duo MacBook running Catalina. It failed to launch.

However, it works fine on my Windows 10 Phenom II 1055T.

ST: 455
MT: 2288


BTW, it seems the Maxon illustrator broke into my home to copy my Ikea furniture. I had this table and wooden chairs in my dining room 20 years ago. Even had the same colour cushions. And hell, even our floors were the same. Also, I had Paradigm Monitor 7 speakers in that room too that looked identical. o_O

My old dining room:

[photo: diningroom1_500.jpg]

Cinebench R23:

[screenshot: R23_livingroom.PNG]
 

Carfax83

Diamond Member
Nov 1, 2010
For example, in a typical 4K encode, depending on the exact media, a 10600K can beat a 3900X by a solid 40% in encode time. That would be because of Quick Sync. But that is also irrelevant for someone doing encoding, because a 1660 Ti is more than 3x faster than a 3900X and over twice as fast as the Quick Sync encodes.

And ya know what? It makes almost no difference what CPU you use for that task. But here we are, encoding is one of the biggest categories of benchmarks these "enthusiast" sites use....

You have to wonder, if hardware acceleration is as potent as you claim (and it definitely is, performance-wise), why anybody with sense would use the CPU for encoding. The reason is that using the CPU produces superior quality and smaller file sizes. Now you can, as you say, achieve similar quality with hardware encoders by using higher bitrates, but the resulting output files will be much larger than what a CPU-based encoder can produce.

So if you like quality and efficiency, CPU encoding is king. If you like speed and convenience, hardware accelerated encoding is king.
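The file-size side of that trade-off is simple arithmetic. A sketch with made-up example bitrates (the 8 and 12 Mbps figures are illustrative assumptions, not measurements of any particular encoder):

```python
# Illustrative file-size math for the quality/bitrate trade-off above.
# The bitrates are made-up examples, not measurements.

def file_size_gb(bitrate_mbps, minutes):
    # bits/second -> bytes/second -> total bytes -> gigabytes
    return bitrate_mbps * 1e6 / 8 * minutes * 60 / 1e9

clip_minutes = 60
cpu_mbps = 8   # assumed CPU (software) bitrate for some quality target
hw_mbps = 12   # assumed hardware-encoder bitrate to match that quality

print(f"CPU encode:      {file_size_gb(cpu_mbps, clip_minutes):.1f} GB")  # 3.6 GB
print(f"Hardware encode: {file_size_gb(hw_mbps, clip_minutes):.1f} GB")   # 5.4 GB
```

Even a modest bitrate premium to match quality compounds directly into storage: here the hypothetical hardware encode is 50% larger for the same hour of video.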
 

Carfax83

Diamond Member
Nov 1, 2010
Haven't really had a chance to comment on the M1's performance yet since I've been busy all week.

Honestly, this is about the best-case scenario that I could have wanted. No one could doubt the M1 is a beast, and extremely efficient. Shots were definitely fired across the bow of Intel, AMD and x86-64, and it shows that ARM microprocessors are potentially a huge threat when done right.

But given the M1's exceptional engineering and process node advantage, Zen 3 is still very competitive from a performance standpoint in single threaded performance and superior in multithreaded despite being one node behind. There's no reason to believe that Zen 4 would not significantly outperform the M1 across most workloads, and significantly tighten the gap when it comes to performance per watt and overall efficiency given a healthy boost in IPC while maintaining a strong advantage in clock speed. I'm intentionally not counting Intel as they likely will not catch up with AMD or Apple in performance per watt until they can get their 7nm process out.

At any rate, the emergence of the M1 is nowhere near the devastating blow to x86 that many ARM proponents suggested it would be. What this has shown me personally is that despite the fanfare surrounding ultra-wide designs like the M1, CPUs with narrower microarchitectures like Zen 3 are just as relevant, with the former not being demonstrably superior to the latter. A refrain I always hear from engineers is that it's all about tradeoffs.
 

DrMrLordX

Lifer
Apr 27, 2000
What was my original goal post and what's the new one? It hasn't changed.

You said it was "the fastest" and then tried to weasel out of it when someone pointed to a benchmark that the M1 could actually lose to a mobile chip. In fact, I'll put myself out there by saying that any sufficiently multithreaded benchmark that doesn't make extensive use of the M1's fixed-function hardware will be faster on the 45W 4900H (which is barely a mobile CPU, but hey, what can you do). It's not what the M1 is really "designed to do", and so it makes no sense for it to win those benchmarks, especially given its significantly lower power consumption. But it'll lose all kinds of benches like Blender or Cinebench against a 4900H.

There's also the problem of the 4900H and M1 sharing a tiny number of benchmarks in common, though FOSS could easily solve that if/when someone actually starts bothering to compile stuff like Blender on an M1 system just to see how it would run. And no, I am not suggesting that one would buy a machine equipped with an M1 with Blender in mind.

The M1's CPU is the fastest overall amongst any laptop CPUs.

Do you remember when the 1800x came out? As great as it was, the areas where the 7700k still won benchmarks prevented the 1800x from ever being "fastest overall". Ditto when the 2700x squared off against the 8700k. Once the 9900k was out, it clearly dominated the desktop PC scene, making it indisputably the fastest. Of course you can add nebulous terms like "overall" to try and confuse things.

I'm still waiting.

And you're going to continue waiting, apparently.

IMO, it's clearly the best laptop chip.

It is if you want MacOS and everything that comes with it. Or if you don't really care what OS you get so long as it works. Which it will.

Also honestly if I wanted a no-holds-barred gaming laptop, M1 would not be my choice either. There are probably some other niches where I would not want the M1. For most users who just want a laptop that works and runs some popular software that you can get on MacOS or Windows (or just MacOS alone), it looks like a great deal.

But hands down, the M1 Macs are the most impressive thing I have seen, and I am no Apple fan. I have never owned a single Apple product in my life. They just did an amazing job here.

I'm confused as to why Apple hasn't looked to monetize their custom cores in other ways. Yes, they will make a lot of money here, but there's so much more money out there. It's a little scary if you think about it.

Whether M1 specifically is the fastest laptop chip is a needless argument.

It is! Actually what was needless was trying to establish it as "the fastest" in the first place. It's definitely the fastest within its own power envelope.

GB5 doesn't do any meaningful work, and neither does Cinebench. That Cinema4D is good for something is not an argument against GB5, because some of its subtests do meaningful work like the compiler benchmark. More people run compilers than run Cinema4D, so that subtest alone is more valuable than Cinebench.

Cinebench, at least since R20, utilizes the same codebase as shipping Cinema4D, so you can use its scores to get very detailed information about how well your system will run Cinema4D compared to others. GB5 approximates compiler performance in one of its subtests, which you have to dig into the results to find (it's not reported in the overall score). And which compiler is it meant to represent? LLVM? GCC? ICC? Visual Studio? Something else?
 

jeanlain

Member
Oct 26, 2020
Fun fact: many people are more impressed by the M1 GPU than by its CPU.
The chief reason is that there is a clear distinction between discrete and integrated GPUs, and that PC iGPUs have historically been very weak. But if PCs had console-like APUs, the M1 GPU would be far behind them in performance.
In perf/W, the M1 Firestorm core should be about 4X better (single-thread) than TGL, for instance (roughly 5 W per core for the M1 vs. about 21 W for the 1185G7).
The GPU is, according to Apple, only 3X better than the competition in perf/W.
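The per-core numbers above reduce to a simple ratio. A sketch, assuming roughly equal single-thread performance between the two cores, so that perf/W becomes a pure power ratio:

```python
# The ~4X single-thread perf/W claim above, spelled out.
# Assumption: roughly equal single-thread performance, so perf/W
# reduces to the inverse ratio of core power.

m1_core_w = 5.0    # approximate M1 Firestorm core power (from the post)
tgl_core_w = 21.0  # 1185G7 single-core power (from the post)

advantage = tgl_core_w / m1_core_w
print(f"{advantage:.1f}X")  # prints 4.2X, in line with "about 4X"
```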