Recent content by basix

  1. B

    Discussion RDNA4 + CDNA3 Architectures Thread

    I suspect somewhat better bandwidth and Infinity Cache utilization. Another test shows that the 9070 XT benefits more than the 9070. And some gains with path tracing were incredible, up to +39%...
  2. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    Cerny mentioned that they want to integrate a more general-purpose CNN/DNN hardware architecture for the PS6. Those are essentially matrix cores, as you can already find in RDNA4.
  3. B

    Discussion RDNA4 + CDNA3 Architectures Thread

    In my opinion, it does not really matter from a functional point of view. 16 GByte is enough for a 9070. Sure, 18 looks bigger than 16 on paper, but practical effects are non-existent except for 1 ppm of use cases. 12 vs. 16 could have noticeable effects with raytracing, high-res screens and...
  4. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    But it could. Zen 4 showed us that the area overhead is minimal (MI300C). But anyway: taking 1x 32C or 3x 12C does not make a huge difference in core count. But the 12C chiplets could be clocked higher.
  5. B

    Question Zen 6 Speculation Thread

    6.5 GHz would already be awesome. 7 GHz is a dream we can keep dreaming until Zen 6 sees the light of day, but if you bet your house on that, good luck. A +14% clock rate together with a ~10% IPC increase still makes for a very respectable +25% ST performance increase. I would be happy with that. And...
  6. B

    Discussion RDNA4 + CDNA3 Architectures Thread

    So you think AMD is going the Nvidia route? Would be interesting to see.
  7. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    I mean, AMD's chiplet architecture is perfectly suited for different XCDs. Much better than a big 800 mm² die, as the XCDs will probably stay between ~130...170 mm². Hmm, when looking at that XCD die size: it could well be in the range of the 32C Zen 6 chiplet. So we might see an 8*32C = 256C MI400C...
  8. B

    Discussion RDNA4 + CDNA3 Architectures Thread

    More GEMM, sure, but you can do 2x FP32 and 4x GEMM in one go if you want. It is only about the ratio ;) For example, something like this:
    - RDNA5: 2x FP32, 2x GEMM (+FP4 support), 2x LDS size
    - CDNA5: 2x FP32, 4x GEMM, 4x LDS size
    I assume here that making bigger CU/SM in general could yield...
  9. B

    Discussion Nvidia Blackwell in Q1-2025

    Some rumors say $1149 and $849 for the 5080S and 5070TiS respectively. Not really attractive with that increased price tag.
  10. B

    Discussion RDNA4 + CDNA3 Architectures Thread

    Regarding the CU/SM count discussion, it looks funny to me that GB202 has 192 SMs and GB200 seems to have 148 SMs (active) according to Chipsandcheese (https://chipsandcheese.com/p/amds-cdna-4-architecture-announcement). But in the end, both GB200 and GB202 have 256 FP32 FLOPS per SM per clock cycle...
  11. B

    Question Zen 6 Speculation Thread

    I'm in on the bet with 6.66 GHz 😈 (whether that is f_base or f_max, I didn't say ;))
  12. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    Maybe like that?
    - MI450X = 1x FP64, 1x Low Precision
    - MI430X = 2x FP64, 0.5x Low Precision
    Very nice! Any link to that info? Or did they just double L0$ capacity in general? As an additional note: There was a paper from 2020 (some university together with AMD) which proposed to share...
  13. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    I cannot tell you exactly how intense matrix math is in game engines. But it is certain that matrices get used everywhere in games (you can google that if you want). And today this means that you do N-times vector * vector math instead of 1-time matrix * vector. For some part this split...
  14. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    Then enlighten me why it should not be programmable (or what you understand by that term). In the end, cooperative vectors allow regular shader (vector) code to be interwoven / extended with matrix math acceleration. No more, no less. And as shader code is programmable, the matrix...
  15. B

    Discussion RDNA 5 / UDNA (CDNA Next) speculation

    The matrix accelerators by themselves may not be programmable. But that is not the point of this whole thing. They can integrate into programmable workflows. But it seems you just want to be right about something ("matrix cores are not programmable blablablubb") and neglect the effective use cases and...
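
Two of the previews above rest on simple arithmetic, which can be checked with a rough sketch: the Zen 6 post (+14% clock and ~10% IPC giving about +25% ST performance) and the game-engine post (one matrix * vector product being the same work as N vector * vector dot products). The function names `combined_speedup` and `matvec_as_dot_products` are made up for illustration and do not come from the posts; the example matrix is arbitrary.

```python
def combined_speedup(clock_gain, ipc_gain):
    """Compound two relative gains (0.14 means +14%); the gains multiply."""
    return (1 + clock_gain) * (1 + ipc_gain) - 1


def matvec_as_dot_products(matrix, vector):
    """Matrix * vector, written as one vector * vector dot product per row."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Zen 6 post: +14% clock rate together with ~10% IPC.
st_gain = combined_speedup(0.14, 0.10)
print(f"ST gain: {st_gain:.1%}")

# Game-engine post: a 2x2 matrix times a vector needs N = 2 dot products.
# The numbers here are arbitrary illustration values.
result = matvec_as_dot_products([[1, 2], [3, 4]], [5, 6])
print(result)
```

The compounded gain comes out slightly above the rounded +25% quoted in the post, since 1.14 * 1.10 = 1.254.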