Discussion Nvidia Blackwell in Q1-2025


basix

Senior member
Oct 4, 2024
236
473
96
Well, 16 GPCs are visible on the diagram. That does not mean the chip will really look like this.

But I think 16 GPCs is reasonable. I can think of a few reasons for that:
  • Better scaling in general, because of 1.33x the GPCs (ROPs, anyone? :D)
  • Power-of-2 scaling for ML/AI workloads: A100, H100 and B200 all have 8 GPCs. Rubin HPC might have 8 or 16, and Rubin CPX would have 16 GPCs as well. ML/AI workloads like power-of-2 divisions (MI350X went back to 256 CUs because of that; MI300X featured 320 CUs)
  • GauRast: Gaussian Splatting acceleration (Neural Rendering) within the rasterizer. 1.33x the GPCs will bring a boost there (see the GauRast paper: https://arxiv.org/html/2503.16681v1)
  • Transformer attention acceleration (see GB300 or Rubin CPX) benefits from exponential functions. On GB300, Nvidia says they pimped the SFU within the SM for that (or emulate SFU EX2 functions). GauRast from the previous point will introduce additional EXP units within the rasterizer (GPC frontend). This might create some synergies between Neural Rendering and general ML/AI.
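
For anyone wondering why an EX2 unit matters for attention: the softmax step only needs a base-2 exponential, since exp(x) = 2^(x * log2(e)). Below is a minimal Python sketch of that identity. The function names are made up for illustration, and it says nothing about how the SFU or the GauRast EXP units are actually implemented in hardware.

```python
import math

# Constant used to turn a natural exponential into a base-2 one:
# exp(x) == 2 ** (x * log2(e))
LOG2_E = math.log2(math.e)


def softmax_exp(scores):
    """Reference softmax using the natural exponential."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_exp2(scores):
    """Same softmax, but every exponential is a base-2 power (EX2-style)."""
    m = max(scores)
    exps = [2.0 ** ((s - m) * LOG2_E) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical attention scores for one query against four keys.
scores = [0.5, 2.0, -1.0, 3.25]
weights_exp = softmax_exp(scores)
weights_exp2 = softmax_exp2(scores)
```

Both versions agree to within floating-point rounding, so hardware that is fast at EX2 is all the softmax in attention really needs.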
 

Win2012R2

Golden Member
Dec 5, 2024
1,206
1,245
96
Tough to get that much better performance while using the same node and a mandate to keep transistor budgets flat if not lower.
Dropping 32-bit PhysX was Borderlands 2 criminal...

But seriously, the whole package clearly does not work as they intended. Maybe it's the -10% clocks, but it just feels like something has gone wrong there. With all that "AI" money, Nvidia should be able to afford making a proper gaming card.

It definitely would have looked better if they had used N4P instead of N5.
Ada was designed using special TSMC 4N, no chance they used N5 for Blackwell.
 

jpiniero

Lifer
Oct 1, 2010
16,792
7,242
136
Ada was designed using special TSMC 4N, no chance they used N5 for Blackwell.

4N is an nVidia N5 special.

Rubin, btw, looks like what I thought Blackwell was going to be. It looks like they will double down (if not more) on tensor cores, leaving little room for raster or RT improvements beyond higher clocks. And that's assuming they don't cut core counts at the lower end to keep costs in line with N3's much higher prices.
 