Discussion Apple Silicon SoC thread


Eug

Lifer
Mar 11, 2000
23,825
1,396
126
M1
5 nm
Unified memory architecture - LPDDR4X
16 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 12 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache
(Apple claims the 4 high-efficiency cores alone perform like a dual-core Intel MacBook Air)

8-core iGPU (but there is a 7-core variant, likely with one inactive core)
128 execution units
Up to 24576 concurrent threads
2.6 Teraflops
82 Gigatexels/s
41 gigapixels/s
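Those GPU numbers hang together arithmetically. A quick sketch (assuming the commonly reported figures of 8 ALUs per execution unit and a ~1.278 GHz GPU clock, neither of which Apple states directly):

```python
# Back-of-the-envelope check of the 2.6 TFLOPS figure.
execution_units = 128          # from Apple's spec sheet
alus_per_eu = 8                # assumed (commonly reported)
fp32_ops_per_alu = 2           # fused multiply-add = 2 FLOPs per cycle
clock_ghz = 1.278              # assumed (commonly reported M1 GPU clock)

tflops = execution_units * alus_per_eu * fp32_ops_per_alu * clock_ghz / 1000
print(f"{tflops:.2f} TFLOPS")  # ~2.62, matching the 2.6 TFLOPS claim
```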

16-core neural engine
Secure Enclave
USB 4

Products:
$999 ($899 edu) 13" MacBook Air (fanless) - 18 hour video playback battery life
$699 Mac mini (with fan)
$1299 ($1199 edu) 13" MacBook Pro (with fan) - 20 hour video playback battery life

Memory options 8 GB and 16 GB. No 32 GB option (unless you go Intel).

It should be noted that the M1 chip in these three Macs is the same (aside from the GPU core count). Basically, Apple is taking the same approach with these chips as it does with the iPhones and iPads: just one SKU (excluding the X variants), which is the same across all iDevices (aside from occasional slight clock speed differences).

EDIT:


M1 Pro 8-core CPU (6+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 14-core GPU
M1 Pro 10-core CPU (8+2), 16-core GPU
M1 Max 10-core CPU (8+2), 24-core GPU
M1 Max 10-core CPU (8+2), 32-core GPU

M1 Pro and M1 Max discussion here:


M1 Ultra discussion here:


M2 discussion here:


Second Generation 5 nm
Unified memory architecture - LPDDR5, up to 24 GB and 100 GB/s
20 billion transistors

8-core CPU

4 high-performance cores
192 KB instruction cache
128 KB data cache
Shared 16 MB L2 cache

4 high-efficiency cores
128 KB instruction cache
64 KB data cache
Shared 4 MB L2 cache

10-core iGPU (but there is an 8-core variant)
3.6 Teraflops

16-core neural engine
Secure Enclave
USB 4

Hardware acceleration for 8K h.264, h.265 (HEVC), and ProRes

M3 Family discussion here:


M4 Family discussion here:

 

FlameTail

Diamond Member
Dec 15, 2021
3,952
2,376
106
The wildcard in this is Intel. It is still claimed they will be using N3B for the N3 chiplets they are sourcing from TSMC, despite sufficient time for them to have ported to N3E. But there are a lot of variables such as available N3E capacity, terms of their deal with TSMC, etc. Heck maybe the plan is to get Apple on N3E and then Intel essentially takes over the N3B capacity Apple was using. TSMC may have less incentive to improve yields if Intel is the only one using N3B, given that Intel will become a major foundry competitor and is unlikely to ever use TSMC capacity to such a degree (and even if they are unhappy and in the future need more capacity than they have themselves, their only alternative is Samsung)
Yeah, N3E is better, but if everyone moves to N3E, then what becomes of the N3B production lines? Are they going to just sit and rot? Which is why I think TSMC would be incentivised to have Intel using N3B.
 

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,693
136
The wildcard in this is Intel. It is still claimed they will be using N3B for the N3 chiplets they are sourcing from TSMC, despite sufficient time for them to have ported to N3E. But there are a lot of variables such as available N3E capacity, terms of their deal with TSMC, etc. Heck maybe the plan is to get Apple on N3E and then Intel essentially takes over the N3B capacity Apple was using. TSMC may have less incentive to improve yields if Intel is the only one using N3B, given that Intel will become a major foundry competitor and is unlikely to ever use TSMC capacity to such a degree (and even if they are unhappy and in the future need more capacity than they have themselves, their only alternative is Samsung)
That's why I asked about it in the first place. With yields like that, it's a bit concerning that Intel would finally start taking N3B wafers.
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Yeah, N3E is better, but if everyone moves to N3E, then what becomes of the N3B production lines? Are they going to just sit and rot? Which is why I think TSMC would be incentivised to have Intel using N3B.

No they don't sit and rot, they would be repurposed to provide additional N3E capacity. The process flow is different but the equipment is all the same.
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
That's why I asked about it in the first place. With yields like that, it's a bit concerning that Intel would finally start taking N3B wafers.

Well those were the reported yields when they started mass production for Apple, and that was after a year of risk production. Maybe they have a breakthrough that allows them to significantly increase yields, but it seems more likely that any improvement would be minor and N3B would have yields far below what TSMC normally considers acceptable for mass production.
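For a feel of how far below "acceptable" a given defect density puts a die this size, here's the textbook Poisson die-yield model with illustrative numbers (TSMC's actual N3B defect density isn't public, so the D0 values below are assumptions):

```python
import math

# Classic Poisson die-yield model: yield = exp(-D0 * A),
# with D0 in defects/cm^2 and die area converted from mm^2 to cm^2.
def die_yield(defect_density_per_cm2: float, die_area_mm2: float) -> float:
    return math.exp(-defect_density_per_cm2 * die_area_mm2 / 100)

die_area = 103.8  # A17 Pro, ~8.18 mm x 12.69 mm
for d0 in (0.5, 1.0, 2.0):  # hypothetical defect densities
    print(f"D0={d0}: yield ~{die_yield(d0, die_area):.0%}")
```

The point of the model: yield falls off exponentially with defect density, so even modest D0 improvements matter a lot on a ~100 mm² die.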

Without knowing the details of the deal between TSMC and Intel, which we never will, all we'll ever be able to do is guess. Heck, I don't think we know for SURE whether Intel will be using N3B, and even after they start shipping chips, unless TSMC or Intel makes a clear statement about what they are using, we will never know unless someone like TechInsights gets out the electron microscope or whatever to find out.
 

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,693
136
Heck, I don't think we know for SURE whether Intel will be using N3B, and even after they start shipping chips, unless TSMC or Intel makes a clear statement about what they are using, we will never know unless someone like TechInsights gets out the electron microscope or whatever to find out.
Also a valid point, and it'll probably be well-discussed by the time Arrow Lake finally hits the scene. Among other things.
 

mikegg

Golden Member
Jan 30, 2010
1,844
467
136
So A17P really has double the NPU resources of M3? That seems like an odd decision given how small the NPU is.
Yes. The specs are easy to find on Apple's website. Apple cut the NPU in half for the M3 series.

The NPU is not that small in die area; it's about the size of 4 e-cores. It's probably a conscious decision to reduce cost on the M series. iPhones need more NPU power anyway, due to the plethora of ML tasks (such as the camera pipeline) and the need to save power. Meanwhile, Macs don't have much ML work - that is, until local LLMs, Stable Diffusion, etc. become common. I'm sure Apple made the decision to cut the NPU in half for the M3 series before the explosion of LLMs and Gen AI.
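Apple's published Neural Engine figures line up with this, for what it's worth:

```python
# Apple's stated Neural Engine throughput: 35 TOPS for the A17 Pro's
# 16-core NPU vs 18 TOPS for the M3's 16-core NPU.
a17_pro_tops = 35
m3_tops = 18
ratio = a17_pro_tops / m3_tops
print(f"A17 Pro / M3 NPU ratio: {ratio:.2f}x")  # ~1.94x, i.e. roughly double
```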
 

mikegg

Golden Member
Jan 30, 2010
1,844
467
136
Haha. I am chuckling now.

Intel and AMD are trying to one up each other's AI capabilities by boasting who has the more TOPS, while a smartphone chip casually has more TOPS than either of them.
Intel and AMD's AI accelerators are mostly just marketing speak right now. Windows has very few ML tasks that can be offloaded to an NPU. Even macOS, which Apple fully controls, doesn't have that much except for local Siri and some small tasks like organizing photos.

I think it's going to be a few years at least before NPUs on laptops/desktops become useful.
 

Eug

Lifer
Mar 11, 2000
23,825
1,396
126
Intel and AMD's AI accelerators are mostly just marketing speak right now. Windows has very few ML tasks that can be offloaded to an NPU. Even macOS, which Apple fully controls, doesn't have that much except for local Siri and some small tasks like organizing photos.

I think it's going to be a few years at least before NPUs on laptops/desktops become useful.
Hmmm... Photo organization and background processing in Apple Photos on the Mac sometimes can take far, far too long even on M-series chips. I don't know how much Photos uses the NPU though.

I attribute it to problematic programming though. For example, take just exporting native images from Photos: if I try to export, say, 10,000 photos, the machine will gobble up all the memory, churn on the task for an extended period, and then crash. Memory leak? Remember, there is no image processing here at all (no NPU), since it's just a straight export of the originals.

OTOH, there are cheap third party apps that will do the export from the Photos database quickly with minimal memory usage.
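For comparison, a straight export of originals is just file copying, which can run in constant memory by streaming one file at a time. A minimal sketch (hypothetical paths; this is not how Photos is implemented, just what a streaming export looks like):

```python
import shutil
from pathlib import Path

# Copy every file under src_dir into dst_dir, one at a time.
# shutil.copy2 streams each file in fixed-size chunks, so memory
# use stays flat no matter how many photos there are.
def export_originals(src_dir: Path, dst_dir: Path) -> int:
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in src_dir.rglob("*"):
        if src.is_file():
            shutil.copy2(src, dst_dir / src.name)  # note: assumes unique names
            count += 1
    return count
```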
 

FlameTail

Diamond Member
Dec 15, 2021
3,952
2,376
106
Both the Apple M2 and the Radeon 780M have up to 100 GB/s of memory bandwidth (LPDDR5-6400, 128-bit).

Yet the Apple GPU is a bit more powerful.


I believe the reason is the tile-based architecture of Apple's GPU, which allows them to pack more GPU performance into the same memory bandwidth.

I guess this applies to Qualcomm as well.
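The 100 GB/s figure falls straight out of the bus math, and the tiler advantage can be illustrated with a toy overdraw model (the overdraw factor and per-pixel size below are illustrative assumptions, not measurements):

```python
# Peak LPDDR5 bandwidth for both chips: 6400 MT/s on a 128-bit bus.
bus_bits = 128
mt_per_s = 6400
gb_per_s = mt_per_s * (bus_bits / 8) / 1000
print(f"{gb_per_s:.1f} GB/s")  # 102.4

# Toy model of why a tiler helps: with overdraw, an immediate-mode GPU
# writes shaded pixels to DRAM repeatedly, while a tile-based deferred
# renderer resolves overdraw in on-chip tile memory and writes each
# final pixel to DRAM once.
pixels = 3840 * 2160
bytes_per_pixel = 4
overdraw = 3.0  # assumed average overdraw factor

imr_traffic = pixels * bytes_per_pixel * overdraw / 1e6   # MB per frame
tbdr_traffic = pixels * bytes_per_pixel / 1e6             # MB per frame
print(f"IMR ~{imr_traffic:.0f} MB/frame vs TBDR ~{tbdr_traffic:.0f} MB/frame")
```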
 
Jul 27, 2020
20,040
13,739
146
I don't know how much Photos uses the NPU though.
Does that app have the ability to recognize people's faces, let you assign names to each face and then let you search all photos of a particular person? I think that functionality would depend heavily on the NPU and could be much faster than doing it on the CPU.
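Face search in photo libraries typically works by embedding each detected face as a vector (the heavy, NPU-friendly step) and then matching by similarity, which is cheap. A toy sketch of just the matching step (the names and vectors are made up; real embeddings would come from a model with hundreds of dimensions):

```python
import math

# Cosine similarity between two embedding vectors.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-dim "embeddings" standing in for real model outputs.
named_faces = {"Alice": [0.9, 0.1, 0.0], "Bob": [0.0, 0.8, 0.6]}
query = [0.85, 0.15, 0.05]  # embedding of a newly detected face

# Assign the new face to the closest named person.
best = max(named_faces, key=lambda name: cosine(query, named_faces[name]))
print(best)  # Alice
```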
 
Jul 27, 2020
20,040
13,739
146
Is Control running natively there?
Probably using the Apple Game Porting Toolkit.

Apple would need to release a cheap game console (under $600) to get studios to take its graphics horsepower seriously. Most gamers don't get a MacBook for gaming when there are superior and cheaper options available in the same price range with far more RAM and storage.
 

eek2121

Diamond Member
Aug 2, 2005
3,100
4,398
136
Hmm. Did you read the semianalysis article I linked?


Average cost? So then the price per wafer is not fixed, and varies?

Then what the article is saying is very relevant. More lithography time significantly increases the lithography cost, and in their worst-case-scenario example, the lithography cost is up by $2000!
TSMC charges based on machine time and volume. Each contract is also different, so two companies with the exact same requirements may be charged very different prices.
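As a rough sketch of how a per-wafer price increase flows through to per-die cost (all prices, defect densities, and the edge-loss factor below are placeholders, not actual TSMC figures):

```python
import math

def cost_per_good_die(wafer_price: float, die_area_mm2: float,
                      defect_density_per_cm2: float,
                      wafer_diameter_mm: float = 300) -> float:
    # Gross dies from wafer area, with an assumed ~10% loss at the edge.
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    dies_per_wafer = int(wafer_area / die_area_mm2 * 0.9)
    # Poisson yield model: exp(-D0 * area), area converted mm^2 -> cm^2.
    yield_rate = math.exp(-defect_density_per_cm2 * die_area_mm2 / 100)
    return wafer_price / (dies_per_wafer * yield_rate)

# A $2000 lithography-cost bump on the wafer flows straight through:
base = cost_per_good_die(17_000, 100, 1.0)    # hypothetical $17k wafer
bumped = cost_per_good_die(19_000, 100, 1.0)  # same wafer, +$2000
print(f"${base:.2f} -> ${bumped:.2f} per good die")
```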
Nah it's overused on TikTok and elsewhere.
💀☠️
Intel and AMD's AI accelerators are mostly just marketing speak right now. Windows has very few ML tasks that can be offloaded to an NPU. Even macOS, which Apple fully controls, doesn't have that much except for local Siri and some small tasks like organizing photos.

I think it's going to be a few years at least before NPUs on laptops/desktops become useful.
Windows 11 has some AI stuff (generative AI in paint), but Windows 12 will change everything.

Windows 12 will also probably require a subscription for the advanced stuff. There were rumors of tying it to Office 365, but I believe Microsoft backtracked on that at some point.

It will be a long time before most PCs without giant GPUs can perform complex tasks without a cloud GPU farm supporting them somewhere.
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Almost certainly because the game is not optimized for ARM and Metal. Even if it uses Metal, it's probably using MoltenVK, which translates Vulkan to Metal.

True that most GPU benchmarks tend to understate what Apple's GPU is really capable of because very little is truly natively designed and tuned for Metal and TBDR, but that just reflects reality.

What would be interesting to see would be benchmarks that compare results using a version that's highly tuned for Apple's GPU and also a version ported using MoltenVK with no special tuning. How much does Apple lose? Is it 5% or is it 50%? I have no idea.
 

mikegg

Golden Member
Jan 30, 2010
1,844
467
136
Windows 11 has some AI stuff (generative AI in paint), but Windows 12 will change everything.

Windows 12 will also probably require a subscription for the advanced stuff. There were rumors of tying it to Office 365, but I believe Microsoft backtracked on that at some point.

It will be a long time before most PCs without giant GPUs can perform complex tasks without a cloud GPU farm supporting them somewhere.
FYI, the NPUs are probably way slower than GPUs at ML acceleration. The advantage of NPUs is just efficiency. On Windows, laptops and desktops don't need to conserve as much energy as phones do. That's why Apple cut the NPU in half for the M3. It's just not needed right now.

When LLMs or Gen AI gets big, I could see the NPUs becoming huge as well. That's why I wrote this: https://forums.anandtech.com/thread...us-with-a-cpu-and-gpu-attached-to-it.2611174/
 

FlameTail

Diamond Member
Dec 15, 2021
3,952
2,376
106
[Die shot images: A17 Pro and Exynos 2200]
Exynos 2200. You can clearly see the tag logic for the SLC marked here.

Where is the tag logic for the SLC of the A17 Pro? I guess it's the structure in between the two SLC slices?
 

Doug S

Platinum Member
Feb 8, 2020
2,785
4,750
136
Interesting that the A17P is 8.18 mm x 12.69 mm, considering that the reticle is 33 mm x 26 mm - 8 dies just barely, but quite neatly, fit.

Both the A15 and A16 were just enough larger that 8 dies could not fit - pretty sure only 6 would, but I don't have their exact dimensions. I wonder if knowing about N3B's issues informed their decision on how efficiently to fill the reticle and minimize scans per wafer?
 

FlameTail

Diamond Member
Dec 15, 2021
3,952
2,376
106
Interesting that the A17P is 8.18 mm x 12.69 mm, considering that the reticle is 33 mm x 26 mm - 8 dies just barely, but quite neatly, fit.
Yup.

33 × 26 = 1 chip
16.5 × 13 = 4 chips
11 × 13 = 6 chips
8.25 × 13 = 8 chips

11 × 13 = 143 mm².

M3's estimated size is 146 mm². This means it just barely couldn't fit 6 dies per reticle!
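The reticle arithmetic above can be sanity-checked with a simple grid model (ignoring scribe lines and edge exclusion):

```python
# EUV exposure field (reticle): 33 mm x 26 mm.
RETICLE_W, RETICLE_H = 33.0, 26.0

def dies_per_field(die_w: float, die_h: float) -> int:
    # Try both orientations of a simple grid layout.
    a = int(RETICLE_W // die_w) * int(RETICLE_H // die_h)
    b = int(RETICLE_W // die_h) * int(RETICLE_H // die_w)
    return max(a, b)

print(dies_per_field(8.18, 12.69))  # A17 Pro: 8 dies per field
print(dies_per_field(11.0, 13.0))   # the 6-chip slot from the list above
```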
Both A15 and A16 were just enough larger that 8 dies could not fit - pretty sure only 6 would but I don't have their exact dimensions. I wonder if knowing about N3B's issues informed their decision on how efficiently to fill the reticle and minimize scans per wafer?
I was asking a question related to that in the foundry thread. TSMC has a fixed price per wafer, right? So if a chip takes more time in lithography, the cost doesn't get passed on to the customer, does it?