Those want membw that LPDDR can never offer.
BTW, have you seen the picture of Mi450, with HBM, and the base die of each HBM stack also having a memory controller for 2x LPDDR channels? Looks insane...
> BTW, have you seen the picture of Mi450, with HBM, and the base die of each HBM stack also having a memory controller for 2x LPDDR channels? Looks insane...

Where's that posted?
> Those want membw that LPDDR can never offer.

For bulk inferencing, where multiple models may be resident, a MUCH bigger pool of LPDDR per card will mean lower costs, better availability of components, and less memory thrashing, along with lower cooling requirements and failure rates over the lifetime of ownership.
DC GPUs have long had their fixed-function graphics hardware excavated.
Thanks.
> For bulk inferencing where multiple models may be resident, a MUCH bigger pool of LPDDR per card will mean lower costs, better availability of components, and less memory thrashing, cooling requirements and failure rates over the life time of ownership.

Tons of words to say the tokens/sec ratio will be bad.
> BTW, have you seen the picture of Mi450, with HBM, and the base die of each HBM stack also having a memory controller for 2x LPDDR channels? Looks insane...

Looks normal.
Buncha trolls here
> Tons of words to say tokens/sec ratio will be bad.

Tokens/Joule and Tokens/$ matter.
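The disagreement above (raw tokens/sec vs. tokens/Joule) comes down to two simple ratios: single-stream decode is roughly bandwidth-bound, so tokens/sec scales with memory bandwidth over model size, while tokens/Joule divides that by board power. A minimal sketch, with every bandwidth, model-size, and power figure invented purely for illustration (none are vendor specs):

```python
# Back-of-envelope decode-rate estimate. All numbers below are assumptions
# for illustration, not measured or published figures for any real card.

def tokens_per_sec(membw_gb_s: float, model_gbytes: float) -> float:
    """Bandwidth-bound upper limit on single-stream decode rate:
    each generated token streams the full weight set through memory once."""
    return membw_gb_s / model_gbytes

# Hypothetical HBM card vs. hypothetical wide-LPDDR card.
hbm_bw, lpddr_bw = 3000.0, 500.0   # GB/s, assumed
model = 70.0                        # GB of weights, assumed (e.g. ~70B params @ 8-bit)

hbm_tps = tokens_per_sec(hbm_bw, model)      # ~42.9 tok/s
lpddr_tps = tokens_per_sec(lpddr_bw, model)  # ~7.1 tok/s

# Tokens/Joule can flip the comparison if the LPDDR part draws far less power:
hbm_watts, lpddr_watts = 1000.0, 150.0       # board power, assumed
print(f"HBM:   {hbm_tps / hbm_watts:.4f} tok/J per stream")
print(f"LPDDR: {lpddr_tps / lpddr_watts:.4f} tok/J per stream")
```

Under these made-up numbers the HBM part wins tokens/sec by ~6x, yet the LPDDR part edges it on tokens/Joule, which is the crux of both sides' point.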
> Wake me up when y'all want to talk about inferencing

Wake me up when Qualcomm wants to talk about inference. The joke I posted above only works because QC did not offer any meaningful metric that presents their hardware as efficient. No performance numbers, no power usage attached to that performance.
> Buncha trolls here

SemiAnalysis make their living out of taking both the industry in general and AI in particular VERY seriously. So sleep on this: if QC had a truly valuable product in their hands, we'd be swimming in benchmarks and efficiency claims.
> Wake me up when Qualcomm wants to talk about inference. The joke I posted above only works because QC did not offer any meaningful metric that presents their hardware as efficient. No performance numbers, no power usage attached to that performance.

I'm not convinced by someone just because he takes something seriously -- you still have to know what you're talking about, and not have some narrative agenda in mind before writing. Not sure I can take an appeal to authority, or a claim of false causation regarding the lack of benchmarks, seriously either.
They didn't just enter the AI market: they have had success in DCs with their AI100s, released in 2019, and have dominated mobile AI, both hardware- and software-wise, for a decade (until Mediatek's strong entry this year...). They pioneered model quantization techniques and have a unique perception stack that uses gauge-equivariant CNNs. Along with nVidia's DLSS, their modem/RF systems are among the few pieces of consumer tech that use non-trivial neural nets fruitfully, wholly at the client level.
Given their low-power pedigree, I expect the parts to be very efficient, and their direct addressing of memory-movement power overhead with a near-memory compute architecture looks right on the money to me. I expect the parts to be a popular choice for inferencing that can improve compute density per rack by being easier to cool.
