DrMrLordX
Lifer
"Why not on both sides for cache galore?"
Thermals, and also you're adding even more L3 latency (albeit maybe not that much more).
They are not. The DQ pins are not the only pins on the interface. LPDDR5X has 72 active signals per 32-bit dual-channel controller, while LPDDR6 has 84 active signals per 48-bit dual-channel (4x half-channel) interface. That's 3/2 times the data signals for only 7/6 times the pins. Or, 96-bit LPDDR6 uses only 168 signals, while 96-bit LPDDR5X uses 216. Even after you adjust for the 8/9 loss of efficiency from sharing the DQ pins, LPDDR6 comes out ahead.
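To sanity-check those ratios, here is a quick back-of-the-envelope script (illustrative only; the bus widths and signal counts are the ones quoted above, and the 8/9 factor is the DQ-sharing penalty mentioned):

```python
# Rough pin-efficiency comparison using the figures quoted above.
lpddr5x_bits, lpddr5x_signals = 32, 72  # per 32-bit dual-channel LPDDR5X controller
lpddr6_bits, lpddr6_signals = 48, 84    # per 48-bit dual-channel (4x half-channel) LPDDR6 interface

# Signals needed to build a 96-bit bus from each interface.
print("96-bit LPDDR5X:", (96 // lpddr5x_bits) * lpddr5x_signals, "signals")  # 3 * 72 = 216
print("96-bit LPDDR6: ", (96 // lpddr6_bits) * lpddr6_signals, "signals")    # 2 * 84 = 168

# Effective data bits per signal, derating LPDDR6 by 8/9 for the shared DQ pins.
print("LPDDR5X:", round(lpddr5x_bits / lpddr5x_signals, 3), "bits/signal")        # ~0.444
print("LPDDR6: ", round(lpddr6_bits * 8 / 9 / lpddr6_signals, 3), "bits/signal")  # ~0.508, still ahead
```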
LPDDR6 is a neat and efficient design.
"So it's 48MB L3 in one plane and only 2*64MB underneath the whole CCD area (L3+12 cores)?"
It's 48M + 96M cache extension.
"Wow I didn't realize they'd economized on the pins so much. Will that efficiency help the physical size of an LPDDR6 only controller vs the equivalent width of LPDDR5X only controllers, or just shoreline? I'm guessing Synopsys probably doesn't have LPDDR6 only IP yet to directly make such a comparison. Or do they?"
Yeah, I'm not aware of any apples-to-apples comparisons. I would assume that LPDDR6 PHYs need to be a bit bigger on old process tech, because the CA signals are driven at full speed so there are more high-speed single-ended pins, but I'm really not sure. The controllers are probably a bit bigger because they are probably doing more things.
"That would certainly be a dream for AMD if they managed to sucker Intel into L3 size competition, where AMD would be stacking cheaper dies with low latency SRAM, while Intel is ballooning the N2 die size and increasing latency."
I imagine Intel's goal is to eventually stack the cache, but they needed a large L3 sooner than they could implement stacking.
"I imagine Intel's goal is to eventually stack the cache, but they needed a large L3 sooner than they could implement stacking."
Intel HBI yields are just too toilet-tier so client guys bailed out.
"I imagine Intel's goal is to eventually stack the cache, but they needed a large L3 sooner than they could implement stacking."
Is there any AMD patent that would prevent them from doing that? Since the process is being done at TSMC, it is possible that AMD has exclusive rights to the process at TSMC for some time window as well, which would force Intel to either wait for the time window to expire, or implement it on their own (already troubled) process.
"Is there any AMD patent that would prevent them from doing that? Since the process is being done at TSMC, it is possible that AMD has exclusive rights to the process at TSMC for some time window as well, which would force Intel to either wait for the time window to expire, or implement it on their own (already troubled) process."
No they don't, and Intel stacks caches for CWF/DMR anyway.
"Intel and AMD also have a patent cross license agreement."
for x86/amd64, not really anything else.
"I have to wonder if thermals were also an issue for Intel client. It STILL looks like Intel needs to pull a lot of juice to hit their performance targets and I have to wonder if they could reliably stack cache at the thermal loads required for their Fmax targets?"
Relaxing the PT a bit for stacked parts is pretty easy.
"for x86/amd64, not really anything else."
Oh so all the cache, bus, branch prediction, etc. is chopped liver?
"Oh so all the cache, bus, branch prediction, etc. is chopped liver?"
Buses are an industry thing; rest yeah.
It is always an exact match since they moved to wafer on wafer stacking.
"I have to wonder if thermals were also an issue for Intel client. It STILL looks like Intel needs to pull a lot of juice to hit their performance targets and I have to wonder if they could reliably stack cache at the thermal loads required for their Fmax targets?"
Good point. It's currently stacked on top, correct?
"BTW, strange that AMD never announced it (as far as I know); nobody asked in any interview."
Pretty sure someone (maybe Cheese) did and they pokerfaced it.
"Last time Max from High Yield did a video on this, he only speculated, and his conclusion was that AMD is still not using Wafer on Wafer packaging and is still using the intermediate step of carrier wafers."
AMD doesn't do "intermediate steps", they drive the goddamn roadmap at TSMC.
"If that's the case, then there should be further proliferation of V-Cache CPUs."
It means they get higher margins and you get uhhh.
"Competing with AMD in server space will be so much harder."
you have no idea the nightmares about to happen.
"AI all the things?"
no, it's just the exterminator.
"Subtweet: A comment on competitiveness of Arm vs. x86, especially in orchestrating GPU tasks to improve GPU utilization"
Well that's mostly down to server Tegra being crap.
"Substack article about hyperscalers' vertical integration ability to compete with merchant silicon in both CPU and GPU"
Tougher than it looks, yes.