Discussion Nvidia Blackwell in Q4-2024 ?


Ajay

Lifer
Jan 8, 2001
15,447
7,857
136
Edit: If GB203 is 7*8, that would make the full GB203 have 112 SMs, or about 15% less than the 4090. It'd still be faster of course because of the bandwidth.
Well, the new SM design will either have more 'horse power' per SM, or clock higher (actually, probably both).
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,991
136
Is Blackwell named after Alan F. Blackwell? Anyone know?

It's named after the ink bottle that they're going to use to record the obscene profits in their ledger once it goes on sale. They're going to have to record so many entries that it has to be as massive as a well to hold all the ink they'll need.
 

MoogleW

Member
May 1, 2022
50
27
51
I suspect the 5090 will be the full GB203 with a 384-bit bus. I also suspect GB202 is going to be close to the reticle limit of N3E, so any product has to be stupidly expensive.

Edit: If GB203 is 7*8, that would make the full GB203 have 112 SMs, or about 15% less than the 4090. It'd still be faster of course because of the bandwidth.
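For anyone checking the math on that edit, here is a rough sketch. It assumes "7*8" means 7 GPCs with 8 TPCs each and 2 SMs per TPC (a guess at the rumored layout, not a confirmed spec), and compares against the 128 SMs enabled on the 4090 and the 144 SMs on a full AD102.

```python
# Assumption: "7*8" = 7 GPCs x 8 TPCs per GPC, with 2 SMs per TPC.
# This layout is a rumor-based guess, not a confirmed specification.
GPCS = 7
TPCS_PER_GPC = 8
SMS_PER_TPC = 2

RTX_4090_SMS = 128    # SMs enabled on the RTX 4090
AD102_FULL_SMS = 144  # SMs on a fully enabled AD102 die


def sm_count(gpcs, tpcs_per_gpc, sms_per_tpc=SMS_PER_TPC):
    """Total SM count for a given GPC/TPC layout."""
    return gpcs * tpcs_per_gpc * sms_per_tpc


def percent_fewer(candidate, reference):
    """How many percent fewer SMs the candidate has than the reference."""
    return (1 - candidate / reference) * 100


gb203_sms = sm_count(GPCS, TPCS_PER_GPC)
print(f"Hypothetical full GB203: {gb203_sms} SMs")
print(f"vs RTX 4090:   {percent_fewer(gb203_sms, RTX_4090_SMS):.1f}% fewer")
print(f"vs full AD102: {percent_fewer(gb203_sms, AD102_FULL_SMS):.1f}% fewer")
```

Under those assumptions the gap to the 4090 comes out a little under the "about 15%" quoted, and noticeably larger against a full AD102.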

There is no way the 5090 is not on GB202, even if it's cut down as usual. I expect the 5090 to use somewhere between 148 and 160 SMs at 3 GHz vs a 190 SM RTX 6000 Blackwell. 60% faster than the 4090, 45% faster than full AD102.

The 5080 will be a cut-down GB203 with maybe 96 SMs, performing at RTX 4090 +10 to 15%.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
There is no way the 5090 is not on GB202, even if it's cut down as usual. I expect the 5090 to use somewhere between 148 and 160 SMs at 3 GHz vs a 190 SM RTX 6000 Blackwell. 60% faster than the 4090, 45% faster than full AD102.

The 5080 will be a cut-down GB203 with maybe 96 SMs, performing at RTX 4090 +10 to 15%.

If it's close to the N3E reticle limit and 512-bit, it's going to be stupid expensive. Even the cut down model will probably be a Titan and priced like it.
 

CakeMonster

Golden Member
Nov 22, 2012
1,391
497
136
People were buying 4090s for the price knowing full well it wasn't fully enabled (much more cut down than 3090), and assuming there would be a 4090Ti to make it look less good in 6-12 months after. If a 5090 is even more cut down I don't blame NV for thinking they can get away with it.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
Well, the new SM design will either have more 'horse power' per SM, or clock higher (actually, probably both).

I have a feeling the SM changes are going to be mainly AI-focused. Although perhaps they will upgrade the RT cores too.
 

Ajay

Lifer
Jan 8, 2001
15,447
7,857
136
I have a feeling the SM changes are going to be mainly AI-focused. Although perhaps they will upgrade the RT cores too.
Probably, but that’s more for their professional GPUs than gaming ones (IMHO). Gaming GPUs need to keep upping RT compute power if NV wants to make it more relevant in the future. Their future all-RTRT rendering pipeline project will need a whole tile dedicated to RT (or more than one) to get there. The rasterization pipeline will still be around a decade from now anyway.

Excited about Blackwell, especially since RDNA4 may well be missing its high-performance GPUs entirely. Wouldn’t want to be Lisa Su on that analyst call.
 

Ajay

Lifer
Jan 8, 2001
15,447
7,857
136
See I think you shouldn't be unless you are willing to shell out the big bux.
Excited about the technology, not the current astronomical prices. When people bought the last-gen GPUs at super-inflated prices because of the mining phase, Nvidia saw an opportunity and took it.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
Excited about the technology, not the current astronomical prices.

But with Moore's Law being Dead (and then some wrt N3E)... the current astronomical prices are not a phase. I'm guessing GB205 will, best case, be maybe a tad faster than the 4070 Ti, and (to compensate for the 40+% higher wafer prices and what's likely a decent price increase for GDDR7) they would want to charge roughly the same $799 that the 4070 Ti costs.
 

Ajay

Lifer
Jan 8, 2001
15,447
7,857
136
But with Moore's Law being Dead (and then some wrt N3E)... the current astronomical prices are not a phase. I'm guessing GB205 will, best case, be maybe a tad faster than the 4070 Ti, and (to compensate for the 40+% higher wafer prices and what's likely a decent price increase for GDDR7) they would want to charge roughly the same $799 that the 4070 Ti costs.
My guess is that the next 5070 Ti will match the performance and price of the RTX 4080 and probably have the same 16GB of RAM, but with better RT performance.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
I think GB205 is going to have a lot less than the 60 SMs that the 4070 Ti has. I'll guess 48 (4*6*2). 4080 perf is asking too much. Could even be 160-bit memory although it's probably 192-bit which would be either 12 or 18 GB.

Edit: Talking about raster perf of course. RT is TBD.
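The SM and memory numbers in that guess can be reproduced with a quick sketch. The assumptions here are mine: "4*6*2" is read as 4 GPCs x 6 TPCs x 2 SMs, each GDDR chip has a 32-bit interface with one chip per channel (no clamshell), and the chip densities are 2 GB or 3 GB.

```python
# Assumptions (not confirmed specs): one 32-bit GDDR chip per channel,
# no clamshell mode, and chip densities of either 2 GB or 3 GB.
CHIP_INTERFACE_BITS = 32


def sm_count(gpcs, tpcs_per_gpc, sms_per_tpc=2):
    """Total SMs for a GPC x TPC x SM layout such as 4*6*2."""
    return gpcs * tpcs_per_gpc * sms_per_tpc


def vram_options_gb(bus_width_bits, chip_densities_gb=(2, 3)):
    """Map each chip density (GB) to the total VRAM for the given bus width."""
    chips = bus_width_bits // CHIP_INTERFACE_BITS
    return {density: chips * density for density in chip_densities_gb}


print("GB205 guess:", sm_count(4, 6), "SMs")
for bus in (160, 192):
    print(f"{bus}-bit bus:", vram_options_gb(bus))
```

With those assumptions a 192-bit bus gives the 12 or 18 GB mentioned above, while a 160-bit bus would land on 10 or 15 GB.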
 

Ajay

Lifer
Jan 8, 2001
15,447
7,857
136
I think GB205 is going to have a lot less than the 60 SMs that the 4070 Ti has. I'll guess 48 (4*6*2). 4080 perf is asking too much. Could even be 160-bit memory although it's probably 192-bit which would be either 12 or 18 GB.

Edit: Talking about raster perf of course. RT is TBD.
48 SMs?! Well, unless NV manages a 40%+ increase in perf per SM (including any clock boost), the 5070(Ti) will simply be a miserable chip. At least then it would be cheaper than a 4080 :rolleyes: .
 

dangerman1337

Senior member
Sep 16, 2010
333
5
81

Hmmmm, doubling of Raster Engines and ROPs per GPC (may be 4X the ROPs?). Wonder what Jensen is cooking with Blackwell, because it seems the SMs are really going to be quite different to Ampere & Lovelace?

On die size, I do wonder if they'll go with 64 MB of L2 cache rather than 128 MB, which makes way more sense to me, because 128 MB of L2 cache would be crazy huge and we have a 512-bit bus with GDDR7, which is a literal doubling+ of bandwidth. I mean, the 4090 has 72 MB of L2 cache active; what would 128 MB achieve if you're literally doubling bandwidth? If 64 MB is enough, it would be preferable to adding 100-200 mm² of die space.
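As a sanity check on the bandwidth doubling claim, here is a small sketch. The 4090 baseline (384-bit GDDR6X at 21 Gbps per pin) is the shipping spec; the GDDR7 per-pin speeds of 28 and 32 Gbps are placeholder assumptions, since nothing about the actual memory speed is known.

```python
def bandwidth_gbs(bus_width_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate / 8."""
    return bus_width_bits * gbps_per_pin / 8


# RTX 4090 baseline: 384-bit GDDR6X at 21 Gbps per pin.
rtx_4090_bw = bandwidth_gbs(384, 21)

# Hypothetical GDDR7 speeds on a 512-bit bus (assumed values, not leaks).
ASSUMED_GDDR7_SPEEDS_GBPS = (28, 32)

print(f"RTX 4090: {rtx_4090_bw:.0f} GB/s")
for speed in ASSUMED_GDDR7_SPEEDS_GBPS:
    bw = bandwidth_gbs(512, speed)
    print(f"512-bit @ {speed} Gbps: {bw:.0f} GB/s ({bw / rtx_4090_bw:.2f}x the 4090)")
```

Whether it is a true doubling depends on the per-pin speed: the wider bus alone accounts for a third more bandwidth, and the rest has to come from GDDR7 clocks.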
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
On die size, I do wonder if they'll go with 64 MB of L2 cache rather than 128 MB, which makes way more sense to me, because 128 MB of L2 cache would be crazy huge and we have a 512-bit bus with GDDR7, which is a literal doubling+ of bandwidth. I mean, the 4090 has 72 MB of L2 cache active; what would 128 MB achieve if you're literally doubling bandwidth? If 64 MB is enough, it would be preferable to adding 100-200 mm² of die space.

I suspect it's because GB202 is intended to be a GDDR7 AI part and not intended for gaming or even Quadro.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136

nVidia just announced that Jensen will do a GTC keynote on March 18th. Guessing they will announce Blackwell server then.
 

Ajay

Lifer
Jan 8, 2001
15,447
7,857
136

nVidia just announced that Jensen will do a GTC keynote on March 18th. Guessing they will announce Blackwell server then.
I mean, if it is indeed called Blackwell. Last I saw it was just H100 Next.
 

MoogleW

Member
May 1, 2022
50
27
51
If it's close to the N3E reticle limit and 512-bit, it's going to be stupid expensive. Even the cut down model will probably be a Titan and priced like it.
What makes you think it's near the reticle limit?
Look at the changes Nvidia made from A100 to H100 despite the 50% transistor density budget. They doubled the number of 32-bit and 64-bit ALUs, improved tensor cores by 2X, increased L1 cache by 30% per SM and L2 cache by 20%, and increased the SM count by 20% (so overall L1 cache went up by 56%). They even still reduced the overall die size.
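The 56% figure is just the two increases compounded, as this sketch shows. The second half is a cross-check under my own assumptions: it uses the 192 KB and 256 KB per-SM L1 sizes mentioned in this post together with the SM counts of the shipping A100 (108) and H100 SXM (132).

```python
def compound_growth_pct(*increases_pct):
    """Combine independent percentage increases multiplicatively."""
    factor = 1.0
    for pct in increases_pct:
        factor *= 1 + pct / 100
    return (factor - 1) * 100


# Rounded figures from the post: +30% L1 per SM and +20% SM count.
total_l1_growth = compound_growth_pct(30, 20)
print(f"Total L1 growth from rounded figures: {total_l1_growth:.0f}%")

# Cross-check (assumed inputs): per-SM L1 sizes in KB and shipping SM counts.
A100_L1_KB, A100_SMS = 192, 108
H100_L1_KB, H100_SMS = 256, 132

a100_total_kb = A100_L1_KB * A100_SMS
h100_total_kb = H100_L1_KB * H100_SMS
exact_growth = (h100_total_kb / a100_total_kb - 1) * 100
print(f"Total L1 growth from actual sizes:    {exact_growth:.0f}%")
```

The rounded percentages slightly understate the change, since 192 KB to 256 KB is closer to a one-third increase per SM.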

On client with a 50% budget, just taking H100 and reducing the number of ALUs by cutting out 64-bit, reducing L1 cache from 256KB to the A100-level 192KB (they seem to offer 1KB per CUDA core since Turing), and reducing the scale of the hype bus link in Hopper, they should maintain a die size of 620 mm² or less.

Considering that kopite7kimi never mentioned or agreed with the other guy's 128 MB L2 cache, the L2 cache may stay the same size or shrink to maintain die sizes.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
What makes you think it's near the reticle limit?

It makes sense from the rumors about real or alleged problems with CoWoS packaging supply. I'm guessing nVidia will sell GB202 as an AI option for $$$$$$$$ and I guess they need as much bandwidth and cache as they can get to deliver the performance to justify the pricing. They could still sell a Titan or a 5090 Ti for a lot less but still a lot if they desire.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136

Seems Blackwell Server at least will ship in volume in 2H 2024. And they might be moving to a yearly release schedule for AI.
 

Tigerick

Senior member
Apr 1, 2022
651
536
106

Wow, looks like nVidia might be able to launch the Blackwell series in H2 2024, what a surprise :eek:

As for HBM memory support, I don't think it is necessary for gaming Blackwell cards
 

MoogleW

Member
May 1, 2022
50
27
51

Wow, looks like nVidia might be able to launch the Blackwell series in H2 2024, what a surprise :eek:

As for HBM memory support, I don't think it is necessary for gaming Blackwell cards
It will be announced in March and ramped up in H2; Nvidia always does this. I am starting to think gaming will be announced in H2 as well.
 

jpiniero

Lifer
Oct 1, 2010
14,590
5,213
136
Soo... what do you think the odds of Blackwell hitting high 3's in clocks? I was messing around with a projected spec list and I think it's only going to work if the clocks are mid-high 3's. Either that or they will have to add more CUDA cores/SM, which I doubt since I am guessing that if they are going to make any changes to the SM, it's going to be mainly AI and possibly RT.