MALL doesn't need SoIC-X (MI300 MALL is run-thru a bunch of 2.5D interfaces really).
A large one, it's a memory side cache.
Good lord someone missed his comparch course in college.
It seems that Strix Halo could, in theory, be using MALL. Not sure about other Strix parts...
Client yes, battery life is nice.
Server ehhhh.
Joe is a film editor iirc.
They sold more 7950X3Ds than 13900Ks per your source, and the 7900X3D matches the Intel chip in sales, but both are still ahead of the vanilla 7950X. For halo chips they seem to have a very solid place in the lineup; that's a lot more than just existing for a marketing goal. And your logic makes the same argument for Intel, which I would also completely disagree with, since flagships aren't supposed to be huge sellers in a random month well after launch.
Oh sorry, I always mix you up with someone else on an old site who went by the same name.
Not exactly. And I may have taken some sort of elementary Computer Architecture 101 class, but it was before they had caches.
My assumption is traditional IOD + CCDs on standard organic substrate.
MALL doesn't need SoIC-X (MI300 MALL is run-thru a bunch of 2.5D interfaces really).
A large one, it's a memory side cache.
Good lord someone missed his comparch course in college.
Client yes, battery life is nice.
Server ehhhh.
Also, while latency will be higher than L3 V-Cache, it will still be a lot faster than main memory.
The question then would be around interconnect bandwidth and pJ/bit.
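For a rough feel of why pJ/bit matters here, a back-of-envelope sketch: link power is just bits per second times energy per bit. The bandwidth and pJ/bit values below are made-up placeholders for illustration, not figures for any AMD product or packaging technology.

```python
def link_power_watts(bandwidth_gb_per_s, energy_pj_per_bit):
    """Power spent moving data over a link: bits/s times joules/bit."""
    bits_per_second = bandwidth_gb_per_s * 1e9 * 8
    return bits_per_second * energy_pj_per_bit * 1e-12

# Hypothetical inputs: same bandwidth, two different energy costs per bit.
BANDWIDTH_GB_S = 64.0
power_high_cost_link = link_power_watts(BANDWIDTH_GB_S, 1.5)
power_low_cost_link = link_power_watts(BANDWIDTH_GB_S, 0.3)

print(f"{power_high_cost_link:.3f} W vs {power_low_cost_link:.3f} W")
```

The point is only that power scales linearly with both terms, so any extra cache traffic pushed over the fabric gets multiplied by whatever the link costs per bit.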
I'm not talking about removing L3, only moving the V-Cache to the memory controller side and increasing the fabric to handle the extra throughput, so you could have a max-clock symmetric 16-core design that still gets a lot of the V-Cache benefits.
I think it's more complicated than that. AMD has excellent performance and energy efficiency due to their excellent L3. Stripping the L3 out of the chiplet would result in huge hits in perf and efficiency.
They would need to increase L2 and maybe even keep some on-chiplet L3 to still keep quite a few requests local. Zen5 is rumoured to have 2MB or even 3MB of L2, so that is kinda confirming preparations for MALL cache. It might not come in Z5, maybe in Z6. According to rumors there is some sort of "ladder" L3, so maybe AMD is just going to keep L3 and throw 3D SRAM on IOD.
Curious that I've never heard of AMD's codename for their V-cache. There's gotta be something they lovingly call it in their offices.
Crystalwell on IOD, if you will.
They don't have any.
Not sure about other Strix parts
For Zen6 parts?
my assumption is traditional IOD + CCDs on standard organic substrate
I think you're onto something here. Rumors do seem to align with this take, where Zen 6 goes all in on silicon bridges to replace the organic substrate Infinity Fabric connection. It's been hinted here that Zen 6 is a big system overhaul (analogous to "Penryn to Nehalem" level of change, implying a cache restructuring) and Kepler concurs w/ the use of silicon bridges. Seeing as how SRAM doesn't scale with advanced nodes, it makes more and more sense to shift the big L3 from the CCD to the IOD, and if the penalty of having the L3 off the CCD is reduced significantly via silicon bridges, it then becomes viable. If you can stack V-cache, which is also on an older node, onto that IOD it nets you a very scalable and cost optimized product.
That would allow the MALL cache to be shared between multiple CCDs.
Between 2 CCDs each having its own 64MB of L3 as one alternative and MALL having 128MB of shared cache as another, MALL would come out ahead in most scenarios as far as achieving cache hits, but there would be a small latency hit vs. a cache hit in the CCD's own or stacked L3.
If the CCD is connected to IOD+MALL using Hybrid Bond bridges, as could happen with Venice, the extra latency would be reduced.
Looking at AMD deploying MALL in client GPUs and in the datacenter GPU / APU in MI300, I think MALL will be the answer for CPUs too, both client and server.
As far as when we could see this, definitely not in Zen 5 client, and highly unlikely in Zen 5 Turin server (even though there is a new IOD coming).
But I think it's highly likely with Zen 6, like 90+% likely.
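To put the hit-rate vs. latency trade-off above into numbers, here is a minimal sketch using the standard average-access-time formula. Every hit rate and latency below is a hypothetical placeholder picked for illustration; none of them are measured or leaked figures.

```python
def avg_access_ns(hit_rate, hit_latency_ns, miss_latency_ns):
    """Average latency for requests reaching this cache level."""
    return hit_rate * hit_latency_ns + (1.0 - hit_rate) * miss_latency_ns

def break_even_hit_rate(target_ns, hit_latency_ns, miss_latency_ns):
    """Hit rate a cache needs so its average latency equals target_ns."""
    return (miss_latency_ns - target_ns) / (miss_latency_ns - hit_latency_ns)

# Hypothetical latencies in nanoseconds.
DRAM_NS = 80.0       # cost of going to main memory on a miss
LOCAL_L3_NS = 10.0   # per-CCD L3 hit
MALL_NS = 25.0       # shared memory-side cache hit, reached over the fabric

# Hypothetical hit rates: the shared 128MB pool hits more often than a private 64MB L3.
local_l3 = avg_access_ns(0.60, LOCAL_L3_NS, DRAM_NS)
shared_mall = avg_access_ns(0.75, MALL_NS, DRAM_NS)
needed = break_even_hit_rate(local_l3, MALL_NS, DRAM_NS)

print(f"local L3: {local_l3:.2f} ns, shared MALL: {shared_mall:.2f} ns")
print(f"MALL hit rate needed to match local L3: {needed:.1%}")
```

With these placeholder inputs the higher hit rate is not quite enough to offset the slower hit, which is why how much a bridge cuts the MALL hit latency matters so much to the argument.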
No.
where Zen 6 goes all in on silicon bridges to replace the organic substrate Infinity Fabric connection.
big papa
Curious that I've never heard of AMD's codename for their V-cache. There's gotta be something they lovingly call it in their offices.
I think it's more complicated than that. AMD has excellent performance and energy efficiency due to their excellent L3. Stripping the L3 out of the chiplet would result in huge hits in perf and efficiency.
They would need to increase L2 and maybe even keep some on-chiplet L3 to still keep quite a few requests local. Zen5 is rumoured to have 2MB or even 3MB of L2, so that is kinda confirming preparations for MALL cache. It might not come in Z5, maybe in Z6. According to rumors there is some sort of "ladder" L3, so maybe AMD is just going to keep L3 and throw 3D SRAM on IOD.
Not happening.
That is what I was thinking: substantially increased L2, getting rid of L3, introducing shared MALL and shifting SRAM from L3 to it.
No.
So maybe Zen 6...
Yea it's the best L3 in the industry and there's no reason to ever get rid of it.
it looks quite challenging to overcome the low latency of L3 that Zen has
I think you're onto something here. Rumors do seem to align with this take, where Zen 6 goes all in on silicon bridges to replace the organic substrate Infinity Fabric connection. It's been hinted here that Zen 6 is a big system overhaul (analogous to "Penryn to Nehalem" level of change, implying a cache restructuring) and Kepler concurs w/ the use of silicon bridges. Seeing as how SRAM doesn't scale with advanced nodes, it makes more and more sense to shift the big L3 from the CCD to the IOD, and if the penalty of having the L3 off the CCD is reduced significantly via silicon bridges, it then becomes viable. If you can stack V-cache, which is also on an older node, onto that IOD it nets you a very scalable and cost optimized product.
You sir have strange fetishes.
BPWBB <<<< I think you can decipher that!
Ya obvious things are obvious.
only the highest end and lower volume parts (MI400, Navi5 Halo, Venice) will get the silicon bridges.
And lower priced parts - CPU, GPU client - will get FOWLP - like RDNA3. Client desktop - likely with Zen 6.
So this is probably about cost and capacity TSMC has for Hybrid Bond. When the capacity catches up, another product can adopt it.
No, desktop will be a small, least relevant extension of mobile starting with Zen6.
and from Venice, we will see where client desktop may go
With Zen 6 the compute (aka core) dies will not be the same from EPYC down to Ryzen?
No, desktop will be a small, least relevant extension of mobile starting with Zen6.
Won't ever have any relation to server anymore.
Period.
No.
with Zen 6 the compute dies will not be the same from EPYC down to Ryzen?
So the dies will only be shared between EPYC and Threadripper? Doesn't it cost AMD more to develop two different compute dies, or are they now financially able to do this? Or, and this is a big "or", is it because AMD is seeing some limits years in advance with their current method and they want to explore a more vibrant option for client to kill Intel there too?
Yes.
so the dies will only be shared between EPYC and Threadripper?
Yes.
doesn't this cost more for AMD to develop two different compute dies
$1.45B in quarterly R&D gotta count for something.
or are they now financially able to do this
No they're just folding DT into mobile.
they want to explore a more vibrant option for client to kill Intel there too
OK, when you say folding, what do you mean? Will those two share similar dies, with mobile getting a more unique, efficient design because it's mobile? Does this mean Ryzen will get new cores? I assume TR or EPYC will get a core count increase too at that point.
Yes.
Yes.
$1.45B in quarterly R&D gotta count for something.
No they're just folding DT into mobile.
Well Zen6 mobile is quite special, we'll talk about it at a later date.
those two will share similar dies with mobile getting a more unique efficient design because it's mobile
Everything is getting new cores, it's Zen6 after all.
does this mean Ryzen will get new cores
TR is not a priority but EPYC yes, Venice is another core count bump.
I assume TR or EPYC will get a core count increase too at that point.