Discussion Nvidia Blackwell in Q4-2024 ?


jpiniero

Lifer
Oct 1, 2010
14,598
5,216
136
Not important to me. :)

Super Frame Gen incoming? ;)

They could increase the FP32 and RT cores per SM instead. It would definitely help the comparison with previous gen Ada at least... but what do you think is more likely?

With presumed lower die sizes on the lower end products... there's only so much room left for whatever it is they want to change.
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
I hope they make some further solid performance gains in RT. If so the upper mid-range cards will probably be able to run games at top settings without having to use DLSS to compensate for lower frame rates.

If they can get that product to market for $700 or less I think they'll have a lot of happy customers.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,356
2,848
106
Super Frame Gen incoming? ;)
Hope not. :D
They could increase the FP32 and RT cores per SM instead. It would definitely help the comparison with previous gen Ada at least... but what do you think is more likely?

With presumed lower die sizes on the lower end products... there's only so much room left for whatever it is they want to change.
I don't believe in increasing the execution units per SM rather than outright increasing the SM count. I don't think the space saved would be significant, but I could be mistaken.
I think Nvidia will keep the current L2 cache size.

I would like something like this:

AD107->GB107: 30SM(+25%); 3840FP32(+25%); 120TMU(+25%); 40ROP(+25%); 24MB L2(+0%); 96-bit 30gbps GDDR7(+32%); 9GB Vram(+13%)

AD106->GB106: 48SM(+33%); 6144FP32(+33%); 192TMU(+33%); 60ROP(+25%); 32MB L2(+0%); 128-bit 30gbps GDDR7(+67%); 12GB Vram(+50%)

AD104->GB104: 72SM(+20%); 9216FP32(+20%); 288TMU(+20%); 96ROP(+20%); 48MB L2(+0%); 192-bit 30gbps GDDR7(+43%); 18GB Vram(+50%)

AD103->GB103: 108SM(+35%); 13824FP32(+35%); 432TMU(+35%); 144ROP(+29%); 64MB L2(+0%); 256-bit 32gbps GDDR7(+39%); 24GB Vram(+50%)

AD102->GB102: 192SM(+33%); 24576FP32(+33%); 768TMU(+33%); 240ROP(+25%); 96MB L2(+0%); 384-bit 32gbps GDDR7(+52%); 36GB Vram(+50%)

or version 2 with different SM config: 128FP32 -> 192FP32(+50%) and 2x more RT units, so @Mopetar will be happy

AD107->GB107v2: 20SM(-17%); 3840FP32(+25%); 80TMU(-17%); 40ROP(+25%); 24MB L2(+0%); 96-bit 30gbps GDDR7(+32%); 9GB Vram(+13%)

AD106->GB106v2: 32SM(-11%); 6144FP32(+33%); 128TMU(-11%); 60ROP(+25%); 32MB L2(+0%); 128-bit 30gbps GDDR7(+67%); 12GB Vram(+50%)

AD104->GB104v2: 48SM(-20%); 9216FP32(+20%); 192TMU(-20%); 96ROP(+20%); 48MB L2(+0%); 192-bit 30gbps GDDR7(+43%); 18GB Vram(+50%)

AD103->GB103v2: 72SM(-10%); 13824FP32(+35%); 288TMU(-10%); 144ROP(+29%); 64MB L2(+0%); 256-bit 32gbps GDDR7(+39%); 24GB Vram(+50%)

AD102->GB102v2: 128SM(-11%); 24576FP32(+33%); 512TMU(-11%); 240ROP(+25%); 96MB L2(+0%); 384-bit 32gbps GDDR7(+52%); 36GB Vram(+50%)
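
Both configs above follow from per-SM arithmetic. A minimal sketch, assuming 4 TMUs per SM and the full-die Ada SM counts (24/36/60/80/144) as the baseline; those baselines and every name in the code are illustrative assumptions, not something stated in the post:

```python
# Assumed baseline: full-die Ada SM counts (not stated in the post).
ADA_SM = {"AD107": 24, "AD106": 36, "AD104": 60, "AD103": 80, "AD102": 144}

# Speculated Blackwell configs from the post: (FP32 lanes per SM, SM counts).
CONFIGS = {
    "v1": (128, [30, 48, 72, 108, 192]),
    "v2": (192, [20, 32, 48, 72, 128]),
}

TMU_PER_SM = 4  # assumption that reproduces the TMU figures in both lists


def pct_change(new, old):
    """Change from old to new in whole percent."""
    return round((new / old - 1) * 100)


def derive(sm_count, fp32_per_sm):
    """Return (FP32 lanes, TMUs) for a die with sm_count SMs."""
    return sm_count * fp32_per_sm, sm_count * TMU_PER_SM


def spec_lines(version):
    """Build one summary line per die for the given config version."""
    fp32_per_sm, sm_counts = CONFIGS[version]
    lines = []
    for (ada, old_sm), new_sm in zip(ADA_SM.items(), sm_counts):
        fp32, tmu = derive(new_sm, fp32_per_sm)
        lines.append(f"{ada}: {new_sm}SM({pct_change(new_sm, old_sm):+d}%); "
                     f"{fp32}FP32; {tmu}TMU")
    return lines


for version in CONFIGS:
    print(version, *spec_lines(version), sep="\n  ")
```

Both versions end up with the same FP32 totals; version 2 packs them into fewer, wider SMs, which is why its SM and TMU counts drop relative to Ada.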

I think with either of these configs the die size should be smaller than the Ada predecessor's thanks to the N3 process.
 

CakeMonster

Golden Member
Nov 22, 2012
1,391
498
136
Since we're mostly speculating about new features, I'm more interested in the launch schedule and process nodes. Will they launch in the ~24m time frame we've gotten used to? Will they delay, and in that case why? To pay less for the new node, or to use a different newer node? What does that mean for what comes after? Surely the generation that could normally be expected in 2026 must have some radical changes if the 2024 generation doesn't. Or will they only iterate on architecture because shrinking the process nodes is very uncertain going forward?
 

jpiniero

Lifer
Oct 1, 2010
14,598
5,216
136
and in that case why?

AI Hype. The 4090 is selling extremely well too.

I think with either of these configs the die size should be smaller than the Ada predecessor's thanks to the N3 process.

Problem of course is that the cache has no scaling at all and I don't think the IO has much either. That's a lot of the chip. Cutting the L2 is def a possibility on the lower tier products.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,356
2,848
106
Problem of course is that the cache has no scaling at all and I don't think the IO has much either. That's a lot of the chip. Cutting the L2 is def a possibility on the lower tier products.
That's why I didn't increase L2 and kept the memory width.
In the case of the weakest one I actually decreased L2 from 32MB -> 24MB and cut the memory width by 25%. I don't like the 9GB Vram, but it can't be helped unless we use 128-bit or clamshell.
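
Even with the narrower bus on the smallest die, the bandwidth figures in the configs still go up, because peak bandwidth is bus width times per-pin data rate. A rough sketch; the Ada data rates (17/18/21/23/21 Gbps) are assumed baselines picked to match shipping cards and are not stated in the posts:

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus pins x per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps)
ADA = {  # assumed baselines, not from the posts
    "AD107": (128, 17),
    "AD106": (128, 18),
    "AD104": (192, 21),
    "AD103": (256, 23),
    "AD102": (384, 21),
}
BLACKWELL = {  # speculated configs from the posts
    "GB107": (96, 30),
    "GB106": (128, 30),
    "GB104": (192, 30),
    "GB103": (256, 32),
    "GB102": (384, 32),
}


def gain_pct(ada_die, gb_die):
    """Bandwidth gain of the speculated die over its assumed Ada baseline."""
    old = bandwidth_gb_s(*ADA[ada_die])
    new = bandwidth_gb_s(*BLACKWELL[gb_die])
    return round((new / old - 1) * 100)


for ada_die, gb_die in zip(ADA, BLACKWELL):
    new = bandwidth_gb_s(*BLACKWELL[gb_die])
    print(f"{gb_die}: {new:.0f} GB/s ({gain_pct(ada_die, gb_die):+d}% vs {ada_die})")
```

With these assumed baselines the gains match the percentages quoted next to the GDDR7 entries in the configs.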

If I add another 10% higher frequency on top of that and Nvidia sets the prices shown below, then it would be a pretty good offering in my opinion.
<=156mm2 GB107 -> $299
<=186mm2 GB106 -> $449
<=295mm2 GB104 -> $699
<=379mm2 GB103 -> $1049
<=609mm2 GB102 -> $1599
 

jpiniero

Lifer
Oct 1, 2010
14,598
5,216
136
That's why I didn't increase L2 and kept the memory width.
In the case of the weakest one I actually decreased L2 from 32MB -> 24MB and cut the memory width by 25%. I don't like the 9GB Vram, but it can't be helped unless we use 128-bit or clamshell.

If I add another 10% higher frequency on top of that and Nvidia sets the prices shown below, then it would be a pretty good offering in my opinion.
<=156mm2 GB107 -> $299
<=186mm2 GB106 -> $449
<=295mm2 GB104 -> $699
<=379mm2 GB103 -> $1049
<=609mm2 GB102 -> $1599

I am expecting GB202 and GB203 to be decently more.

The lower end... we'll see.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,356
2,848
106
I am expecting GB202 and GB203 to be decently more.

The lower end... we'll see.
I forgot that the cutdown RTX 4090 cost $1599 and not the full one.

End prices will depend on final performance, but Nvidia will have to offer better perf/price.

edit: Honestly, the prices I set are very good -> rather optimistic for that performance.
~4060Ti 9GB for $299, not bad at all. The reason I set them so low is that Nvidia, in my opinion, doesn't have another tech like Frame Generation that would "increase FPS" and be a big selling point.
 

tajoh111

Senior member
Mar 28, 2005
298
312
136
I'll be content if they can give us slot powered 12GB Blackwell GPU for $250...
Impossible if they use GDDR7.

New memory is expensive when it comes out. Add in that you want 50% more of it, combined with a selling price $50 below the current RTX 4060, and it becomes impossible unless they want to sell at cost.


Would you rather get 12GB of GDDR6 or 8GB of GDDR7 at $300 (not $250)?

With the price of 3nm chips being so high (the A17 silicon in Apple's products costs 130 dollars), combined with a switchover to GDDR7, if Nvidia can keep prices at $299 even with 8GB of GDDR7 memory, you are actually looking at relatively generous pricing. AMD can't even price the 7600 at $250, so why would you expect a price like this from Nvidia when 12GB of GDDR7 and a 15xmm2 3nm chip cost more than double what 8GB of GDDR6 and a 200mm2 6nm chip do?

https://www.notebookcheck.net/BoM-a...icantly-higher-production-costs.760590.0.html.

Considering a 110mm2 A17 silicon is 130 dollars, even with maturation of the node and better yields, a 150mm2 GPU would cost $100, potentially higher with Apple getting a better deal.
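
As a rough cross-check of that estimate, here is the same arithmetic under a deliberately naive assumption: cost scales linearly with die area, ignoring yield, wafer pricing and packaging. All variable names are illustrative; the dollar and area figures are the ones quoted in the post.

```python
A17_COST_USD = 130.0        # A17 silicon cost quoted in the post
A17_AREA_MM2 = 110.0        # A17 die area quoted in the post
GPU_AREA_MM2 = 150.0        # hypothetical small Blackwell die from the post
GPU_COST_GUESS_USD = 100.0  # the post's estimate for that die

# Naive assumption: cost is proportional to area.
cost_per_mm2 = A17_COST_USD / A17_AREA_MM2
linear_gpu_cost = cost_per_mm2 * GPU_AREA_MM2

# How much cheaper per mm2 the $100 guess is than the A17 figure.
implied_discount = 1 - GPU_COST_GUESS_USD / linear_gpu_cost

print(f"A17 cost per mm2:       ${cost_per_mm2:.2f}")
print(f"150mm2 at the same rate: ${linear_gpu_cost:.0f}")
print(f"Discount implied by $100: {implied_discount:.0%}")
```

Under that linear assumption, a $100 die implies a noticeably lower cost per mm2 than the A17 figure, so the estimate leans heavily on node maturation and better yields.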
 

Tigerick

Senior member
Apr 1, 2022
658
537
106
According to Kopite7kimi, NV will use 16-Gbit (2GB) chips initially. Well, I am speculating NV will use 24GB of GDDR7 on the RTX 5090. For those who still believe the rest of the Blackwell lineup will be using GDDR7, let me help you:

RTX 5080 - 16GB 256-BIT GDDR7
RTX 5070Ti - 12GB 192-BIT GDDR7
RTX 5060Ti - 8GB 128-BIT GDDR7
RTX 5060 - 6GB 96-BIT GDDR7

Hoho, 6GB video RAM ??? As I said earlier, I really hope that after seeing the table no members are foolish enough to believe such a config makes sense... :p

PS: Let me repeat, NV will increase the amount of L2 cache on almost all of the GB200 series, because NV crippled the amount of L2 cache in the RTX 4000 series. And the reason they were able to do that is that they use higher-bandwidth GDDR6X.
 

Tigerick

Senior member
Apr 1, 2022
658
537
106
If anything, expect the opposite. Remember there's no cache scaling with N3E at all. So basically cache costs 40%+ more.
NO, AD102 comes with 96MB L2 cache, so NV is just maintaining die size with the logic density improvement of N3E.
 

Aapje

Golden Member
Mar 21, 2022
1,382
1,864
106
RTX 5060 - 6GB 96-BIT GDDR7

Hoho, 6GB video RAM ??? As I said earlier, I really hope that after seeing the table no members are foolish enough to believe such a config makes sense... :p

You are completely ignoring the possibility of a clamshell. If the 5060 will be 96 bit, I guarantee you that they will clamshell it to 12 GB.
 

Tigerick

Senior member
Apr 1, 2022
658
537
106
You are completely ignoring the possibility of a clamshell. If the 5060 will be 96 bit, I guarantee you that they will clamshell it to 12 GB.
You are completely ignoring that the 5060 is a low end card; please double up the rest of the lineup...
 

jpiniero

Lifer
Oct 1, 2010
14,598
5,216
136
Maybe the 5060 or similar will be far enough out that 3 GB chips would be available then?
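
The capacities argued over in the last few posts all come from the same rule of thumb: one memory chip per 32 bits of bus, doubled in clamshell mode. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def vram_gb(bus_width_bits, chip_gb, clamshell=False):
    """Total VRAM in GB, assuming one memory chip per 32 bits of bus.

    Clamshell mode puts two chips on each 32-bit channel, doubling capacity
    without widening the bus.
    """
    chips = bus_width_bits // 32
    if clamshell:
        chips *= 2
    return chips * chip_gb


# The three options discussed for a 96-bit card.
print("96-bit, 2GB chips:           ", vram_gb(96, 2), "GB")
print("96-bit, 2GB chips, clamshell:", vram_gb(96, 2, clamshell=True), "GB")
print("96-bit, 3GB chips:           ", vram_gb(96, 3), "GB")
```

So a 96-bit card is 6GB with 2GB chips, 12GB with clamshell, or 9GB once 3GB chips are available.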
 

Tigerick

Senior member
Apr 1, 2022
658
537
106
RTX4080.png

Damn, I remember when NV launched the RTX4090 along with 2 versions of the RTX4080. What puzzled people was why the hell NV launched two versions of the RTX4080 with 2 different dies. I think I can answer with a table: :cool:

  • NV is going to split GB203 into two different memory configurations: 24GB and 20GB (just like upcoming RDNA5)
  • 16MB of L2 cache per 64-bit memory bus is the standard configuration for Ada Lovelace, but NV crippled it to make room for the upcoming Blackwell series. That's why my speculated Blackwell specs have an L2 cache upgrade. Definitely not a downgrade, as someone keeps saying.
  • GB205 of course would be faster than the RTX4070Ti Super with its bigger L2 cache; do you think Jensen would be stupid enough to launch something slower than the current one? Use your mind.
  • All the lineups will get either a memory size or type upgrade along with an L2 cache bump. This is my speculated Blackwell series.

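SRP | RTX-4000 Launch | Die | Memory | L2 Cache | RTX-4000 Super | Die | Memory | L2 Cache | RTX-5000 Launch | Die | Memory | L2 Cache
$1,599 | RTX 4090 | AD102 | 24GB G6X | 72 MB | - | - | - | - | RTX 5090 | GB202 | 32GB GDDR7 | 112 MB ?
$1,199 | RTX 4080 16GB | AD103 | 16GB G6X | 64 MB | - | - | - | - | RTX 5080Ti? 24GB | GB203 | 24GB GDDR7 | 96 MB
$999 | - | - | - | - | RTX 4080 Super | AD103 | 16GB G6X | 64 MB | RTX 5080 20GB | GB203 | 20GB GDDR7 | 80 MB
$899 | RTX 4080 12GB | AD104 | 12GB G6X | 48 MB | - | - | - | - | - | - | - | -
$799 | RTX 4070 Ti | AD104 | 12GB G6X | 48 MB | RTX 4070Ti Super | AD103 | 16GB G6X | 48 MB | RTX 5070Ti | GB205 | 16GB GDDR7 | 64 MB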
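
The "16MB of L2 per 64-bit" rule from the post can be checked against the speculated RTX-5000 entries. A small sketch, assuming 2GB chips with one chip per 32 bits of bus and no clamshell, so that bus width can be inferred from capacity; the function names are illustrative:

```python
L2_MB_PER_64BIT = 16  # rule of thumb stated in the post


def l2_mb(bus_width_bits):
    """L2 size implied by the 16MB-per-64-bit rule."""
    return bus_width_bits // 64 * L2_MB_PER_64BIT


def bus_width_from_vram(vram_gb, chip_gb=2):
    """Bus width in bits, assuming one chip per 32 bits and no clamshell."""
    return vram_gb // chip_gb * 32


# Memory capacities of the speculated RTX-5000 entries in the table.
for vram in (32, 24, 20, 16):
    bus = bus_width_from_vram(vram)
    print(f"{vram}GB -> {bus}-bit -> {l2_mb(bus)} MB L2")
```

Under those assumptions the 24GB, 20GB and 16GB rows line up with the rule (96, 80 and 64 MB), while the 32GB row would come out at 128 MB rather than the 112 MB marked with a question mark.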
 

Aapje

Golden Member
Mar 21, 2022
1,382
1,864
106
@Tigerick

GDDR6X uses the more expensive and more power hungry PAM4 encoding & GDDR7 is still made with DUV. GDDR6X is also limited to one RAM company, so Nvidia can't play the memory manufacturers against each other to get the best price. So I'm not convinced that GDDR7 is actually going to be more expensive for Nvidia than GDDR6X that runs at the same speeds. And the tech companies love their "bigger number better" marketing, so if they switch to GDDR7 for at least the 5070 and up, they can make a big fuss about it in their presentations.

Turing also switched completely to GDDR6, where the 10-series used both GDDR5 and GDDR5X.

And your lineup assumes that Nvidia will either use the 384 bit GB202 chip in the 5080, or will have both a 384 bit GB202 and GB203. I don't believe either for a second. The only way the 5080 is going to have 384 bit, is if the 5090 will have 512 bit. But Kopite7kimi just walked that back 8 hours ago and now claims that the bus sizes will be roughly the same.

I foresee a lot of people getting upset over the memory sizes of Blackwell, unless Nvidia does shift some things around, like using a higher end chip per tier than for Ada and/or using more clamshells (or at least for a lower price).
 

jpiniero

Lifer
Oct 1, 2010
14,598
5,216
136

Kopite seems to think the memory interfaces are the same. IOW expect memory capacity to be the same.

I foresee a lot of people getting upset over the memory sizes of Blackwell, unless Nvidia does shift some things around, like using a higher end chip per tier than for Ada and/or using more clamshells (or at least for a lower price).

Expect people to be mad. Don't expect them to use clamshell again.

I suppose by the time the refresh comes out, 3 GB chips will be available so perhaps they will use that.
 

Tigerick

Senior member
Apr 1, 2022
658
537
106
Kopite7kimi clearly refers to GB203 onward, because we all know GB202 is going to use GDDR7. If people still don't understand the words "memory interface configuration" of AD10x, well, wait for more leaks... ;)

Since NV is going to launch mobile GPUs in Q1-2025, and due to competition from the upcoming RDNA4, NV might speed up the launch of the mid-range cards. Let me finish my fun table below:

SRP | RTX-40 | Die | Memory | L2 Cache | RTX-50 Launch | Die | Memory | L2 Cache | SRP ? | RDNA 4/5 | Memory | IC ?
$599 | RTX 4070 Super | AD104 | 12GB G6X | 48 MB | RTX 5070 | GB206 | 12GB GDDR7 | 48 MB | $499 | N48 | 16GB G6 | 64 MB
$399 $449 | RTX 4060 Ti | AD106 | 8/16 GB G6 | 32 MB | RTX 5060 Ti | GB206 | 12GB G6 | 48 MB ? | $399 | N48 | 12GB G6 | 48 MB
$299 | RTX 4060 | AD107 | 8GB G6 | 24 MB | RTX 5060 | GB207 | 16GB G6 | 32 MB | $299 | N44 | 16GB G6 | 32 MB
 