adroc_thurston
Diamond Member
> More microbumps, wider bus?
I mean yeah moar SDPs is a go.
> Another guess would be creating connections to other CCDs, which would lead to a question "what for?"
you guys need to stop with this meme.
L3<-> L3 transfer perhaps?
> I wonder if that ever gets solved in a way that would fit satisfactorily into how AMD handles L3...
NUCA is nice and elegant.
It would be a "nice to have" on desktop, but on server it would only make sense if the entire CPU worked as a unit, rather than as various virtualized subsets. So probably too few big use cases in the overall picture.
They do have the same substrate sizes available for both -R and -L.
yeah, I don't get the obsession with making your L3 crappy and making coherency a nightmare.
More bumps = more connections. Either wider data pathways for existing layouts, or layouts are changing and they need more wires to connect more chiplets. Maybe there will be a 4-chiplet package on desktop with a separate iGPU chiplet?
*loud incorrect buzzer noise*
> with a separate iGPU chiplet?
who is this for?
> I have been thinking about getting a Copilot+ laptop, but frankly, are there any "real" uses/programs for the NPU yet? So far I have not spotted anything useful. For example, I would like the Copilot app on Windows to actually use the NPU. Or the Copilot add-ins in Office.
Afaik NPUs are also better regarding time to first token, or in other words execution latency. They work better with small batch sizes. For many applications and single-customer use cases this is helpful. But for big number crunching it should be better to move towards the GPU in the long term. The GPU also has massive support from a big and wide memory system. Replicating that for an NPU is a waste of sand.
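The latency-vs-throughput trade-off between NPU and GPU can be sketched as a toy model: completion time = fixed dispatch latency + work / throughput. All of the numbers below are invented purely for illustration, not real NPU or GPU specs:

```python
# Toy model: small batches favor the low-latency device, big batches favor
# the high-throughput device. All figures here are made up for illustration.

def run_time(batch_tokens: int, dispatch_latency_s: float, tokens_per_s: float) -> float:
    """Seconds to finish a batch on an accelerator with the given characteristics."""
    return dispatch_latency_s + batch_tokens / tokens_per_s

# Hypothetical accelerators: NPU = low latency, modest throughput;
# GPU = higher dispatch cost, much higher throughput.
npu = dict(dispatch_latency_s=0.005, tokens_per_s=500.0)
gpu = dict(dispatch_latency_s=0.050, tokens_per_s=5000.0)

small, large = 1, 1000
print(run_time(small, **npu) < run_time(small, **gpu))  # True: NPU wins tiny batches
print(run_time(large, **gpu) < run_time(large, **npu))  # True: GPU wins big batches
```

With these made-up figures the crossover sits somewhere between the two batch sizes, which is the "time to first token" point in a nutshell.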
But funnily enough, doesn't Qualcomm add a better link between the GPU and NPU to move matrix computations to the NPU (the GPU does not support such acceleration)?
Regarding software:
HW differences could be abstracted away by HALs and APIs.
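As a rough sketch of what such an abstraction could look like: callers code against one interface and a selection step hides which accelerator actually runs the op. All class and function names below are made up for illustration, not any real HAL or driver API:

```python
# Hypothetical HAL sketch: the caller never sees which backend runs matmul.
from abc import ABC, abstractmethod


class MatMulBackend(ABC):
    """Invented interface standing in for a HAL; not a real API."""

    @abstractmethod
    def matmul(self, a, b):
        ...


class CpuFallback(MatMulBackend):
    def matmul(self, a, b):
        # Naive reference implementation (lists of lists).
        n, k, m = len(a), len(b), len(b[0])
        return [[sum(a[i][x] * b[x][j] for x in range(k)) for j in range(m)]
                for i in range(n)]


class FakeNpuBackend(CpuFallback):
    # A real driver would enqueue work on NPU command queues here.
    available = True


def select_backend() -> MatMulBackend:
    # A real HAL would probe installed drivers; we just pick the fake NPU.
    return FakeNpuBackend() if FakeNpuBackend.available else CpuFallback()


backend = select_backend()
result = backend.matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)  # [[19, 22], [43, 50]]
```

The application code only ever touches `select_backend()` and `matmul`, which is the whole point: swap the backend, keep the program.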
> I have been thinking about getting a Copilot+ laptop but frankly are there any "real" uses/programs for NPU yet? So far I have not spotted anything useful. For example I would like Copilot app on windows actually use NPU. Or Copilot addons in office.
Spellcheck
> Spellcheck
So the answer is still no. No real uses as of yet.
> Spellcheck
Spellcheck as in "to use Microsoft's AI spellcheck in Office"? It actually uses the NPU?
> Spellcheck doesn't need any sort of AI. Checking grammar needs to be a bit smarter (though nowhere near needing the 50 TOPS that Copilot requires), but spellcheck has been around since before CPUs went 32-bit.
yeah but they're gonna do spellcheck using a hugeass xformer eating 4GB of your DRAM just for that.
> yeah but they're gonna do spellcheck using a hugeass xformer eating 4GB of your DRAM just for that.
What do you mean, 4 GB?
welcome to the future, gramps.
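For what it's worth, the "spellcheck needs no AI" point is easy to demonstrate: a plain dictionary lookup plus edit-distance-1 suggestions is the classic pre-AI approach and fits in a few lines. The tiny word list here is just for illustration:

```python
# Classic non-AI spellcheck: set lookup + edit-distance-1 suggestions.
DICTIONARY = {"the", "quick", "brown", "fox", "spell", "check"}


def is_spelled_correctly(word: str, dictionary=DICTIONARY) -> bool:
    return word.lower() in dictionary


def suggestions(word: str, dictionary=DICTIONARY) -> set:
    """All dictionary words one edit away (delete/transpose/substitute/insert)."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    w = word.lower()
    splits = [(w[:i], w[i:]) for i in range(len(w) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    return (deletes | transposes | substitutions | inserts) & dictionary


print(is_spelled_correctly("Fox"))   # True
print(suggestions("foxx"))           # {'fox'}
```

No transformer, no 50 TOPS, and the whole thing runs in kilobytes per dictionary entry rather than gigabytes of DRAM.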
> L3 <-> L3 transfer perhaps?
Compared to IFOP, you can do that now through a wider, faster and lower-latency interface to the IOD 😉
> Or daisy-chaining of CCDs
That is a very interesting idea, indeed. For Zen 6 I do not expect something like that to happen. For Zen 7 I think not as well (16/33C CCDs, a bigger L3$ and simply faster cores are already a decent enough update). But Zen 7 could still introduce it (core count mania). Would be sick to see a 512C Zen 7 SKU 😉
But yes, they tripled RAM bandwidth to around 1.6 TByte/s for the top end. With 8 CCDs, you'd need an interconnect at least 200 GByte/s wide per CCD in order to saturate this, and that is with each CCD demanding an equal share. Current GMI-Wide delivers 128 GByte/s (read) IIRC.
So 256 GByte/s/CCD or even more don't seem like overkill to me.
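The arithmetic behind that estimate is quick to check. The 1.6 TByte/s and 128 GByte/s figures are the ones quoted above; everything else is just division:

```python
# Sanity-check the per-CCD bandwidth share from the figures in the post.

def per_ccd_share(total_bytes_per_s: float, n_ccds: int) -> float:
    """Bandwidth each CCD needs if all chiplets demand an equal share."""
    return total_bytes_per_s / n_ccds

TOTAL_RAM_BW = 1.6e12   # ~1.6 TByte/s top-end figure from the post
GMI_WIDE_READ = 128e9   # current GMI-Wide read bandwidth (IIRC, per the post)

share = per_ccd_share(TOTAL_RAM_BW, 8)
print(share / 1e9)            # 200.0 (GByte/s per CCD)
print(share > GMI_WIDE_READ)  # True: today's GMI-Wide falls well short
```

So a next-gen link in the 256 GByte/s-per-CCD class would leave a bit of headroom over the equal-share minimum, not overkill.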
> That is a very interesting idea, indeed. For Zen 6 I do not expect something like that to happen. For Zen 7 I think not as well (16/33C CCDs, bigger L3$ and simply faster cores are already a decent enough update). But Zen 7 could still introduce it (core count mania). Would be sick to see a 512C Zen 7 SKU 😉
512C? I totally doubt it. With the meager density gains, they'd have to make the package significantly larger. 384C seems possible.
> I have been thinking about getting a Copilot+ laptop but frankly are there any "real" uses/programs for NPU yet?
game.intel.com
> For example I would like Copilot app on windows actually use NPU. Or Copilot addons in office.
You could try to implement a locally running LLM to work in Outlook and Word; it's supposedly possible, but IIRC it's not easy. Microsoft isn't super interested in letting people off the hook on paying $30/month for the full Copilot M365 experience.