DeathReborn
Platinum Member
- Oct 11, 2005
> Now AMD could do something like Kaby-G... put N33 and some GDDR6 on package in one unit.

They need an N34 (if they had one) chiplet with 64MB IC to put alongside a Zen 4 CCD.
> No one is saying get rid of APUs. AMD's current APUs are powerful enough for that role, and their continued evolution will keep them serving it. The pushback is merely against the wishful thinking that expects a sudden giant GPU in the APU. This is simply unrealistic. Again, no one is against APUs, just the realism of the big-GPU-APU wishful thinking.

Strix Point with 24 CU is NOT a big APU.
The Steam Deck is actually a great counterexample against the big-GPU-APU meme. The Steam Deck uses a custom part, so Valve could have ordered any size GPU in their APU that they wanted, but the Steam Deck has only half the GPU size of AMD's standard 6800U GPU.
That's a dedicated handheld game machine, and they chose only half the GPU of AMD's standard APU.
Given that, it seems unlikely that there is much OEM demand for a standard APU with a much bigger GPU section.
> Strix Point with 24 CU is NOT a big APU. It's a completely and utterly standard die, just like Rembrandt and Phoenix Point are.

That's a sudden doubling of GPU size, so it is a much bigger GPU.
> Redgamingtech posted it, so it's basically money in the bank. That guy is never off on AMD stuff.

Sure, people swear the same about MLID. It's all BS clickbait to me.
> Widening memory controller buses and/or adding dedicated on-chip caches (neither of which scale well with die shrinks, by the way) for performance which may not be valued by buyers is stupid. Law firms and consultants are not. going. to. pay. more. for the next ThinkPad Carbon just because the chip has a big iGPU + 64MB SLC to feed said GPU.

That applies as long as the GPU is only there for games and professional visualisation. If we start using GPU compute more, then it starts to make more sense to have big GPUs. If Nvidia had an x86 license, you know that's what they would be doing: adding it to normal CPUs and then helping and incentivizing key software to use the GPU compute. They aren't doing it because they lack the x86 license, but that doesn't mean AMD shouldn't; in fact, there's a chance Intel goes down that route now that they are back to making serious GPUs. If AMD had any sense, they'd be proactively going after this market, not waiting to get beaten to the punch and then reacting late.
> Sure people swear the same about MLID. It's all BS clickbait to me.

Hopes and dreams, that's where the bandwidth would come from.
Did mister "never off" explain where the BW was coming from? It's on the same socket, so the same DDR5 memory, which has only about ~100 GB/s of BW.
Edit: I looked up the rumor. He's talking about a 9 TFLOPS GPU; that is the same as an RX 6600.
The RX 6600 has 32 MB of "Infinity Cache" to compensate for its lower memory BW, AND 224 GB/s of memory BW. Best case, it seems the fictional part will have less than half the required memory BW.
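The TFLOPS comparison above can be sanity-checked with the usual RDNA FP32 convention (64 shaders per CU, 2 ops per clock for FMA); the 24 CU count and the clocks here are the rumored/listed figures, not confirmed specs:

```python
# Rough FP32 throughput for RDNA-style GPUs:
# TFLOPS = CUs * 64 shaders/CU * 2 ops/clock (FMA) * clock in GHz / 1000
def tflops(cus: int, clock_ghz: float) -> float:
    return cus * 64 * 2 * clock_ghz / 1e3

# RX 6600: 28 CUs at ~2.49 GHz boost -> ~8.9 TFLOPS, matching the ~9 TFLOPS figure
print(round(tflops(28, 2.49), 1))

# A hypothetical 24 CU APU would need roughly this clock (GHz) to hit 9 TFLOPS
needed_clock = 9.0 * 1e3 / (24 * 64 * 2)
print(round(needed_clock, 2))
```

So a 24 CU part would need to sustain nearly 3 GHz to match an RX 6600 on paper, before the bandwidth question even comes up.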
> That applies as long as the GPU is only there for games and professional visualisation.

You mean so long as we are talking about reality. So yeah, it applies.
> No one is saying get rid of APUs. AMD's current APUs are powerful enough for that role, and their continued evolution will keep them serving it. The pushback is merely against the wishful thinking that seems to have an expectation of a sudden giant GPU in the APU. This is simply unrealistic.

It depends on what you think a "big APU" is. The iGPU in Rembrandt is the equivalent of a midrange dGPU of some years ago, and more powerful than all of the PS4-generation consoles, but in terms of actual area it is certainly not huge. Of course you will not see 400+ mm^2 APU dies in a market that spans low-end to high-end laptops (the primary target). But with 200 to 300 mm^2 at their disposal, new processes becoming available, and MCM as the future, who knows if we won't see what you would today consider a "big" APU within a couple of years.
> The Steam Deck is actually a great counterexample against the big-GPU-APU meme. The Steam Deck uses a custom part, so Valve could have ordered any size GPU in their APU that they wanted, but the Steam Deck has only half the GPU size of AMD's standard 6800U GPU. That's a dedicated handheld game machine, and they chose only half the GPU of AMD's standard APU. Given that, it seems unlikely that there is much OEM demand for a standard APU with a much bigger GPU section.

That was because Valve gave more importance to battery life and cost than to pure performance, and also because they control the software layer. But there are other handhelds (even smaller than the Deck) already on the market using a 6800U, and soon a 7040U-class APU, for example. And there are several XSFF PCs (X here stands for Extra) used as multimedia and gaming stations, which can benefit from a beefier iGPU compartment. This does not mean that we will immediately see huge dies with an iGPU measuring 200+ mm^2 alone (well, we will in the HPC market, see MI300). But, with the proper balance between the CPU and iGPU sides, there are clear advantages to such solutions. E.g., in the high-end corporate market, we have often seen solutions based on an APU alone, or on a CPU plus a quite small dGPU (MX450/550 class). A Rembrandt/Phoenix would already kill all dGPU solutions of that class by offering similar performance at lower cost (or higher margin for the OEM), and even in the mainstream notebook segment smaller dGPUs could be replaced by slightly beefier APUs than we see today. The main problem here is availability more than technical/cost issues.
> It depends on what you think a "big APU" is.

I've been pretty clear on that. The reasonable expectation is a continuation of the current evolution: the GPU section keeps improving its performance along with the evolution of memory BW on the standard socket.
> I've been pretty clear on that. The reasonable expectation is a continuation of the current evolution: the GPU section keeps improving its performance along with the evolution of memory BW on the standard socket. It's unreasonable to expect that the GPU section will suddenly double while memory BW is stagnant. That would just be a waste of silicon bottlenecked by too-weak memory BW.

There are technical solutions for improving effective bandwidth (cache stacking, new memory standards). Today a new LPDDR5 speed grade (LPDDR5T, where T stands for Turbo) was launched by Hynix, with an effective transfer rate of up to 9.6 Gbps per pin. Yes, there are costs involved, and it remains to be seen how and where those costs can be justified, but there could be even more segmentation in the future based on iGPU performance.
> There are technical solutions for improving effective bandwidth (cache stacking, new memory standards). Today a new LPDDR5 speed grade (LPDDR5T, where T stands for Turbo) was launched by Hynix, with an effective transfer rate of up to 9.6 Gbps per pin. ...

See above with the Strix Point/RX 6600 example. You need more than double the BW of high-speed DDR5, and that's already after a large Infinity Cache mitigates the lower memory speed.
> See above with the Strix Point/RX 6600 example. You need more than double the BW of high-speed DDR5, and that's already after a large Infinity Cache mitigates the lower memory speed.

You need double the BW only if you need to double the performance in all departments, and only if BW is the sole limiting factor. Maybe the target is different. And the new LPDDR5T standard already gives +50% BW compared to today's LPDDR5-6400. Also, if Strix Point is on N3, the area used by a 24 CU / 12 WGP iGPU on RDNA3+ may not be much different from the area used by the Rembrandt or Phoenix iGPUs. I'm not saying we will really see 12 WGPs (after the usual hype debacle, I will believe only what is effectively delivered), but the possibility exists.
> You need double the BW only if you need to double the performance in all departments, and only if BW is the sole limiting factor. Maybe the target is different. Also, if Strix Point is on N3, the area used by a 24 CU / 12 WGP iGPU on RDNA3+ may not be much different from the area used by the Rembrandt or Phoenix iGPUs. ...

Why would you think area will stay constant, when wafers are getting more expensive at each process shrink?
> You need double the BW only if you need to double the performance in all departments, and only if BW is the sole limiting factor. Maybe the target is different. And the new LPDDR5T standard already gives +50% BW compared to today's LPDDR5-6400. ...

Except even 128-bit LPDDR5-9600 is only going to be ~150 GB/s of bandwidth. For reference, 28 CU RDNA3 on 6nm (the 7600M) is listed as 128-bit GDDR6 with 256 GB/s, and that also needs a 32 MB IC on top. To actually use 24 CUs of RDNA3 at the sort of clock speeds that N3 should permit would require a 32 MB IC at the very, very least. Caches don't shrink well with process nodes, so a decent IC to mitigate low memory bandwidth would by itself take a substantial proportion of the precious die space within such an APU.
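The peak-bandwidth figures being traded here are just bus width times per-pin data rate; a quick sketch using the numbers from the posts above:

```python
# Peak theoretical bandwidth in GB/s = bus width (bits) / 8 * data rate (GT/s)
def peak_bw_gbs(bus_bits: int, gtps: float) -> float:
    return bus_bits / 8 * gtps

print(peak_bw_gbs(128, 9.6))   # 128-bit LPDDR5T-9600 -> 153.6 GB/s
print(peak_bw_gbs(128, 6.4))   # 128-bit LPDDR5-6400  -> 102.4 GB/s
print(peak_bw_gbs(128, 16.0))  # 128-bit GDDR6 at 16 GT/s -> 256.0 GB/s (7600M class)
```

The 153.6 / 102.4 ratio is exactly the +50% claimed for LPDDR5T, and still only ~60% of what the 7600M gets from GDDR6 before its Infinity Cache is even counted.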
> You mean so long as we are talking about reality. So yeah, it applies.

The reality is any modern PC is capable of some level of GPU compute, and the fact that it's not being used by a lot of software is due to a lack of standardisation and of someone big willing to champion it. It's a great potential market for AMD, as they have the x86 CPUs and the knowledge to make integrated GPUs that could do compute well. Like most things, the market moves fast; if you don't take advantage, someone else will.
> Except even 128-bit LPDDR5-9600 is only going to be ~150 GB/s of bandwidth. ...

Just gonna mention here that my 6900HS with 4x32-bit LPDDR5-6400 only actually measures ~50 GB/s in AIDA64.
> The reality is any modern PC is capable of some level of GPU compute, and the fact that it's not being used by a lot of software is due to lack of standardisation...

And lack of need for general use cases.
> Except even 128-bit LPDDR5-9600 is only going to be ~150 GB/s of bandwidth. For reference, 28 CU RDNA3 on 6nm (the 7600M) is listed as 128-bit GDDR6 with 256 GB/s, and that also needs a 32 MB IC on top. ...

And who told you that the target for Strix Point is N33 (which measures a bit more than 200mm^2 alone)? I spoke about low-end dGPUs (GTX 1650 / RTX 2050 / RX 6400 class). Having a greater number of WGPs can also mean they can be clocked lower, for lower power consumption as well. Also, the CUs can help with some GPU compute, when available.
> Why would you think area will stay constant, when wafers are getting more expensive at each process shrink?

Because this is the historical trend in AMD's APUs? And because the jump from N4 to N3 is lower than the one between N6/N7 and N4?
> The Steam Deck 2 would be a prime candidate for an 8-core CPU / 24 CU GPU (responding to comments I read about it the past few pages, have not had much free time). When unplugged, the GPU can simply run at lower clocks to save power. When plugged in, clocks can boost to allow higher resolutions and refresh rates. One of the few complaints about the Steam Deck was that it doesn't scale performance when plugged in. Just a thought.

I used $6,000 per 6nm wafer for my estimate, so I guess that was pretty damn close. Not that it was super important for the point I was trying to make, but it helps.
You are overpricing by several thousand dollars. N6 was significantly cheaper than N5 as of 6 months ago. A customer like AMD would pay somewhere around $8,000-$9,000 for N7 (note this was $7,000-$8,000 in 2019) based on the numbers I have seen. TSMC made N6 cheaper because of less machine time involved which leads to higher volume. The early numbers I heard for N6 were around $4,000-$5,000, but that was before supply chains blew up. The real (post supply chain issues) number is likely somewhere between $5,000-$7,000. TSMC really wants everyone to transition from N7 to N6 because they can output more wafers per month, which leads to more revenue.
With the economy struggling, those prices will possibly even drop a bit.
Note that most of the numbers I have referenced above came from various leaks in 2018/2019 and a few from last year. I don’t have access to a price sheet or anything, but the sources that provided the numbers were reliable ones.
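A rough way to turn wafer prices like these into per-die costs; the ~210 mm^2 die size (N33-class, mentioned elsewhere in the thread) and the standard edge-loss approximation are my assumptions, and this ignores defect yield entirely:

```python
import math

# Very rough dies-per-wafer estimate (ignores defect yield and scribe lines):
# usable dies ~= wafer area / die area, minus an edge-loss correction term
def dies_per_wafer(die_mm2: float, wafer_diam_mm: float = 300.0) -> int:
    r = wafer_diam_mm / 2
    return int(math.pi * r**2 / die_mm2
               - math.pi * wafer_diam_mm / math.sqrt(2 * die_mm2))

# Assuming a ~210 mm^2 die and a $6,000 N6 wafer (per the posts above):
n = dies_per_wafer(210)
print(n, round(6000 / n, 2))  # candidate dies per wafer, raw cost per die in $
```

Even doubling the wafer price only moves the raw silicon cost per die by tens of dollars, which is why the die-size and yield assumptions matter more than the exact wafer quote.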
> Because this is the historical trend in AMD's APUs? And because the jump from N4 to N3 is lower than the one between N6/N7 and N4?

If it's only a small process shrink, then you won't get that much more transistor budget to include a 24 CU GPU.
> If it's only a small process shrink, then you won't get that much more transistor budget to include a 24 CU GPU.

I meant costs (20K vs. 15-16K, compared to 15-16K vs. 6-8K). Logic density in N3 is estimated at 1.7x compared to N5, so probably at least 1.5x-1.6x vs. N4.
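Putting those numbers together (the wafer prices and density ratios are the thread's estimates, not official figures), the cost per transistor can still fall even though the wafer itself gets pricier:

```python
# Thread's estimates: N3 wafer ~$20K vs. N4 ~$15.5K, with ~1.6x logic density gain
wafer_cost_ratio = 20_000 / 15_500   # ~1.29x more expensive per wafer
density_ratio = 1.6                  # ~1.6x more logic transistors per mm^2

# Relative cost per logic transistor, N3 vs. N4 (< 1.0 means cheaper)
cost_per_transistor_ratio = wafer_cost_ratio / density_ratio
print(round(cost_per_transistor_ratio, 2))  # ~0.81 -> roughly 19% cheaper
```

So under these assumptions, a same-area N3 die carries ~60% more logic for ~29% more wafer cost, which is the "big transistor increase for minimal cost increase" being debated.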
> I meant costs (20K vs. 15-16K, compared to 15-16K vs. 6-8K). Logic density in N3 is estimated at 1.7x compared to N5, so probably at least 1.5x-1.6x vs. N4.

So you are expecting a big transistor increase for minimal cost increase.