> Appears to be one of the main ones, at least.
Then the question should be why that's the case, if some other apps that use a 4090 can overflow to system RAM once the VRAM is filled. Naturally this tanks performance, but maybe that's still better than no performance at all?
> Then the question should be why that's the case, if some other apps that use a 4090 can overflow to system RAM once the VRAM is filled. Naturally this tanks performance, but maybe that's still better than no performance at all?
> Are LLMs latency-sensitive like games?
AFAIK not to latency, but definitely to bandwidth. Apart from small (<10B) models, no one uses system RAM for LLMs; once you offload some parts of a large model to it, performance tanks very quickly.
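To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bound, so every weight is read once per token and the time per token is just bytes divided by bandwidth for each memory pool. The function name and all figures (40 GB model, 1000 GB/s VRAM, 90 GB/s system RAM) are illustrative assumptions, not measurements, and the model ignores compute time and PCIe transfer costs.

```python
def decode_tokens_per_second(model_gb: float, vram_fraction: float,
                             vram_gbps: float, ram_gbps: float) -> float:
    """Rough upper bound on tokens/s for memory-bound decoding.

    Assumes each weight is read exactly once per generated token and that
    the VRAM-resident and RAM-resident parts are read one after the other.
    """
    vram_time = model_gb * vram_fraction / vram_gbps        # seconds per token
    ram_time = model_gb * (1.0 - vram_fraction) / ram_gbps  # seconds per token
    return 1.0 / (vram_time + ram_time)


# Illustrative, assumed figures only.
MODEL_GB = 40.0     # hypothetical model size
VRAM_GBPS = 1000.0  # roughly 4090-class VRAM bandwidth
RAM_GBPS = 90.0     # roughly dual-channel DDR5 bandwidth

for fraction in (1.0, 0.75, 0.5, 0.0):
    rate = decode_tokens_per_second(MODEL_GB, fraction, VRAM_GBPS, RAM_GBPS)
    print(f"{fraction:.0%} of weights in VRAM: ~{rate:.1f} tokens/s")
```

Under these assumptions the slow pool dominates as soon as a meaningful share of the weights lives in system RAM, which matches the "tanks very quickly" observation.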
> AFAIK not to latency, but definitely to bandwidth. Apart from small (<10B) models, no one uses system RAM for LLMs; once you offload some parts of a large model to it, performance tanks very quickly.
But it's system RAM that's gonna be used in the case of Halo, no?
> and it looks like all 8 cores are on the same ring with 16MB L3, which would improve performance when running on 5c cores (due to larger L3) and cost of switching to full cores should be less. So overall power efficiency vs. Strix Point should go up.
Where do you deduce this from? I agree it would help perf, but up to this point it didn't look that way. I still think 4C CCX + 4C CCX is more likely. That's what Geekbench shows. L3 would not be unified between the two CCXs.
> Where do you deduce this from? I agree it would help perf, but up to this point it didn't look that way. I still think 4C CCX + 4C CCX is more likely. That's what Geekbench shows. L3 would not be unified between the two CCXs.
> View attachment 113150
Geekbench shows Strix to have 16MB L3 (which is just the Zen5 CCX).
> Geekbench shows Strix to have 16MB L3 (which is just the Zen5 CCX).
> View attachment 113154
Yes, but it shows two "Clusters", which is accurate. I'd be willing to bet Kraken's cluster setup is the same: high-latency comms between cores from different clusters, just like Strix.
> Just a speculation, based on the fact that AMD can do ring bus with 8 stops.
> It would be wasteful if Kraken had 4x Zen 5c with a separate L3, and it would suck if the Zen 5c cores had no L3. So just connecting some dots (which may turn out to be wrong).
Joe, this is AMD we are talking about, AND presumably their lowest-profit SKU. They likely just took the Strix design and neutered the Zen 5c 8-core cluster down to 4, cut the GPU down, and otherwise retained the exact same design to save time and money.
> Joe, this is AMD we are talking about, AND presumably their lowest-profit SKU. They likely just took the Strix design and neutered the Zen 5c 8-core cluster down to 4, cut the GPU down, and otherwise retained the exact same design to save time and money.
No.
Just not gonna float in commercial et al.
> It isn't small, apparently.
> Rumor says it is slightly larger than Phoenix 1.
> AMD | Phoenix | 178 mm² |
> Qualcomm | Hamoa | 172 mm² |
> Qualcomm | Purwa | ~130 mm² |
Which would make it bigger than the Hamoa die of X Elite.
> Yes, but it shows two "Clusters", which is accurate. I'd be willing to bet Kraken's cluster setup is the same: high-latency comms between cores from different clusters, just like Strix.
Geekbench reports different cores as being in different clusters, even if technically they are in the same cluster.
> If Strix Point is 232 mm², Kraken can save maybe 15% of die area, so maybe ~198 mm² ballpark?
Smaller GPU than HPoint and also 4 Zen 5c cores, so it could eventually be smaller if it weren't for the bigger NPU; we can expect a comparable 178 mm² size.
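For anyone who wants to check the ballpark arithmetic, a quick sketch. The 232 mm² Strix Point figure and the 15% saving are the guesses from the quoted post, and the Phoenix and Hamoa areas are the ones listed earlier in the thread; the helper name is made up for illustration and none of this is a confirmed Kraken die size.

```python
def estimated_die_area(base_mm2: float, saving_fraction: float) -> float:
    """Scale a known die area down by an assumed fractional saving."""
    return base_mm2 * (1.0 - saving_fraction)


STRIX_POINT_MM2 = 232.0  # figure quoted in the thread
ASSUMED_SAVING = 0.15    # the poster's guess, not a measured value

kraken_guess_mm2 = estimated_die_area(STRIX_POINT_MM2, ASSUMED_SAVING)
print(f"Kraken guess: ~{kraken_guess_mm2:.1f} mm²")

# Compare against the die sizes listed earlier in the thread.
for name, area_mm2 in (("Phoenix", 178.0), ("Hamoa", 172.0)):
    print(f"vs {name}: {kraken_guess_mm2 / area_mm2:.2f}x")
```

So the 15% guess lands a bit above Phoenix, while the "comparable 178 mm²" estimate would need closer to a 23% saving over Strix Point.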
> It's gonna be a really awkward time point for AMD - in 2025 dGPU designs will go with Ryzen 200 (hawk), and iGPU designs will go for Ryzen 300 (kraken). Considering their die sizes, the cost of Ryzen 200 8-core and Ryzen 300 8-core will be close.
They'll be happy selling units either way. It's up to the OEMs to decide what they want to deliver to customers.
> AMD - in 2025 dGPU designs will go with Ryzen 200 (hawk)
No, they're KRK.
This was done on purpose, to highlight how tight the cache is for it to be shared between eight cores.