
Discussion Ada/'Lovelace'? Next gen Nvidia gaming architecture speculation


Saylick

Diamond Member
Sep 10, 2012
4,060
9,486
136
Quick little update. GH100, the biggest die, may not be in an MCM product but the MCM product may consist of two smaller Hopper family dies? I'm not sure why Nvidia wouldn't MCM the big die if it's possible to MCM the smaller dies. Power consumption limits? If that's the case, why have the big die to begin with?
 

jpiniero

Lifer
Oct 1, 2010
16,841
7,285
136
Quick little update. GH100, the biggest die, may not be in an MCM product but the MCM product may consist of two smaller Hopper family dies? I'm not sure why Nvidia wouldn't MCM the big die if it's possible to MCM the smaller dies. Power consumption limits? If that's the case, why have the big die to begin with?

I suppose it could be similar dies but the MCM die has additional logic for coherency. That sounds like a total waste but perhaps they were unsure they would be able to get it working correctly?

If the single die really has ~33% more FP32 shaders, a ~20% bigger die AND 25% more power draw, that doesn't sound so great if you factor in the shrink.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
I suppose it could be similar dies but the MCM die has additional logic for coherency. That sounds like a total waste but perhaps they were unsure they would be able to get it working correctly?

If the single die really has ~33% more FP32 shaders, a ~20% bigger die AND 25% more power draw, that doesn't sound so great if you factor in the shrink.

That sounds just plain wrong, especially coming from Samsung 8LPP. Hopefully, more info comes along at GTC.
 

CakeMonster

Golden Member
Nov 22, 2012
1,630
810
136
I assume we are talking in context of the high end cards here, else it doesn't make much sense. Given that, it could be that NV have the dual chip technology nailed down, but are just weighing their options?

1) Two lower-clocked, more efficient chips, with the added complexity of connecting them? 2) Or just one chip that pushes 5nm to its max with no additional complexity, but that needs the cooling nailed down along with hand-picked dies that can reach the target frequency and have very few physical defects?
 

jpiniero

Lifer
Oct 1, 2010
16,841
7,285
136
Oh, duh, yeah. Still, seems wrong - unless GH100 is physically smaller or has some major new functional unit included (or >> cache).

I'm sure they are doubling down on AI/ML performance. Problem is the MI250 is likely to be far faster in FP64.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
How can it be slightly less than 1000mm2 and monolithic?
Fabs work with NV to push the reticle limits to the max. Yields are probably poor, but for GPUs that expensive, it doesn't matter as much.
 

Frenetic Pony

Senior member
May 1, 2012
218
179
116

The latest rumor is that the top card will draw 850W. I wonder how that would even work with current PC case designs. Maybe the stock FE cooler will be an AIO.

That's beyond even an OAM socket rating. How big would that AIO cooler have to be? 360mm might not be enough...
 

Tup3x

Golden Member
Dec 31, 2016
1,279
1,410
136
250W is about the max I consider reasonable, maybe 300W. Anything higher than that is just too much. It's not practical.
 

Borealis7

Platinum Member
Oct 19, 2006
2,901
205
106
250W is about the max I consider reasonable, maybe 300W. Anything higher than that is just too much. It's not practical.
That is what we, as consumers, were taught for the past 16-17 years by NV & AMD (remember the GTX8800 Ultra 370W?), but the truth is technology has advanced since then, and PSUs are able to pull much more power these days and deliver it quite reliably over the power rails to whatever hungry GPU might be connected to them. Advances in cooling and fan technology allow more heat to be dissipated away from the hardware and out of the case, and silicon technology can produce densely packed chips with billions of transistors that together amount to the 300-400W operating power.
Just as we'll have to adjust to the new price norms, we will need to adjust ourselves to accept more power hungry hardware, because the number of transistors per mm2 is not going to go down any time soon.
 

Panino Manino

Golden Member
Jan 28, 2017
1,144
1,383
136
Nvidia was hacked and the data stolen is out.
This should give us plenty of information about future architectures.
 

maddogmcgee

Senior member
Apr 20, 2015
411
425
136
That is what we, as consumers, were taught for the past 16-17 years by NV & AMD (remember the GTX8800 Ultra 370W?), but the truth is technology has advanced since then, and PSUs are able to pull much more power these days and deliver it quite reliably over the power rails to whatever hungry GPU might be connected to them. Advances in cooling and fan technology allow more heat to be dissipated away from the hardware and out of the case, and silicon technology can produce densely packed chips with billions of transistors that together amount to the 300-400W operating power.
Just as we'll have to adjust to the new price norms, we will need to adjust ourselves to accept more power hungry hardware, because the number of transistors per mm2 is not going to go down any time soon.

I mean at 400 watts and two hours a day use, it would be costing me $75 AUD a year in power just for gaming. Then throw in the rest of the system, PSU inefficiency, and the fact the video card would likely still use more power on the desktop (where it would easily run for another 10 hours a day) and you are starting to get into a pretty high cost per year... especially when you could turn down the resolution and use upscaling to get similar performance from a much cheaper card.
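For anyone who wants to plug in their own numbers, here is a rough sketch of that arithmetic. The 400 W, 2 hours a day and $75 AUD figures come from the post; the electricity rate is simply back-calculated from them, and the 30 W desktop draw is a made-up placeholder, not a measurement.

```python
def annual_energy_kwh(watts, hours_per_day, days=365):
    """Energy used per year in kWh for a constant draw."""
    return watts * hours_per_day * days / 1000.0


def annual_cost(watts, hours_per_day, price_per_kwh):
    """Yearly cost in whatever currency price_per_kwh is in."""
    return annual_energy_kwh(watts, hours_per_day) * price_per_kwh


# Figures from the post: 400 W card, 2 hours of gaming a day, $75 AUD a year.
gaming_kwh = annual_energy_kwh(400, 2)

# Electricity rate implied by those figures (AUD per kWh), not a quoted tariff.
implied_price = 75 / gaming_kwh

# ASSUMPTION: 30 W desktop/idle draw for the card, purely illustrative.
desktop_kwh = annual_energy_kwh(30, 10)
desktop_cost = desktop_kwh * implied_price

print(f"gaming: {gaming_kwh:.0f} kWh/year, implied rate {implied_price:.3f} AUD/kWh")
print(f"desktop (assumed 30 W): {desktop_kwh:.1f} kWh/year, about {desktop_cost:.0f} AUD")
```

Swap in your own tariff and measured wall draw; PSU losses and the rest of the system would go on top of this.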
 

OscaAndShintjee

Junior Member
Feb 22, 2022
16
11
36
Nvidia was hacked and the data stolen is out.
This should give us plenty of information about future architectures.
That's assuming that the stolen data is legitimate. The only details we've seen are threats to release drivers and firmware, evidence being likely fake leaked "code files" which tell us nothing we don't already know, written in a markup language that I can't even identify. This is all by people who can hardly put two sentences together.
Take what the jittery writers at TPU or TH write with a grain of salt; the evidence isn't there.

Edit: Looks like I was wrong about the leaked files being available to download via torrent.
 

DooKey

Golden Member
Nov 9, 2005
1,811
458
136
I mean at 400 watts and two hours a day use, it would be costing me $75 AUD a year in power just for gaming. Then throw in the rest of the system, PSU inefficiency, and the fact the video card would likely still use more power on the desktop (where it would easily run for another 10 hours a day) and you are starting to get into a pretty high cost per year... especially when you could turn down the resolution and use upscaling to get similar performance from a much cheaper card.
I would suggest that if a person is worried about the cost of electricity to game and use their computer that they shouldn't do either and concentrate on the true necessities of life.
 

Mopetar

Diamond Member
Jan 31, 2011
8,496
7,753
136
Fabs work with NV to push the reticle limits to the max. Yields are probably poor, but for GPUs that expensive, it doesn't matter as much.

Yield doesn't matter as much when you have something as massively parallel as a GPU like this. Even if a massive die like this has multiple defects they're most likely in areas that are highly redundant and those parts of the hardware can be fused off.

Even if the node isn't mature or just had an abnormally high defect rate, most dies could still be sold. Even the ones that don't have any defects still might just have the weakest performing units turned off.
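To put rough numbers on that, here is a minimal sketch using the textbook Poisson defect model, which is a simplification and not anything Nvidia or the fabs have published. The 800 mm2 die area and the 0.1 defects per cm2 density are illustrative assumptions, and the model assumes every defect lands in a unit that can be fused off, which real dies won't always satisfy.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k defects when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fraction_with_at_most(k_max, area_mm2, defects_per_cm2):
    """Fraction of dies with k_max or fewer defects (Poisson model)."""
    lam = area_mm2 / 100.0 * defects_per_cm2  # expected defects per die
    return sum(poisson_pmf(k, lam) for k in range(k_max + 1))


# ASSUMPTIONS: 800 mm2 die, 0.1 defects/cm2. Both are hypothetical.
perfect = fraction_with_at_most(0, 800, 0.1)    # zero-defect dies
salvage = fraction_with_at_most(3, 800, 0.1)    # sellable if up to 3 units can be fused off

print(f"defect-free dies: {perfect:.1%}")
print(f"dies with at most 3 defects: {salvage:.1%}")
```

Under those made-up inputs, fewer than half the dies come out perfect, but nearly all of them have three or fewer defects, which is the point about redundancy.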
 

Saylick

Diamond Member
Sep 10, 2012
4,060
9,486
136
LOL, looks like Lovelace configurations might have leaked too:

[Image: NVIDIA Ada Lovelace GPU specs table]


If I remember correctly, that SM count for the top die has been leaked out by the usual Twitter suspects already.

Looks like at best we'll see a doubling in performance from AD102 over GA102, which is in line with all of the previous leaks. ~1.7x SM counts and some clock increases as well.
 

jpiniero

Lifer
Oct 1, 2010
16,841
7,285
136
Looks like at best we'll see a doubling in performance from AD102 over GA102, which is in line with all of the previous leaks. ~1.7x SM counts and some clock increases as well.

92 TF FP32 would be like 2.3-2.5x compute power. What's interesting is that the lower tier parts don't get that much of an SM increase so their performance increases won't be anywhere near as dramatic.
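A back-of-the-envelope check of that ratio, as a sketch only: peak FP32 is usually counted as 2 FLOPs (one FMA) per lane per clock. The 84 SMs and 128 FP32 lanes per SM are full GA102 figures; the 144 SM count for AD102 and the idea that Lovelace keeps 128 lanes per SM are assumptions taken from the rumors (consistent with the ~1.7x SM count mentioned above), and the baseline clocks are just example values.

```python
def fp32_tflops(sm_count, clock_ghz, fp32_lanes_per_sm=128):
    """Peak FP32 TFLOPS, counting one FMA (2 FLOPs) per lane per clock."""
    return 2 * sm_count * fp32_lanes_per_sm * clock_ghz / 1000.0


def clock_for_tflops(target_tflops, sm_count, fp32_lanes_per_sm=128):
    """Clock in GHz needed to reach target_tflops with the given SM count."""
    return target_tflops * 1000.0 / (2 * sm_count * fp32_lanes_per_sm)


# Full GA102 (84 SMs) at two example boost clocks, chosen for illustration.
ga102_low = fp32_tflops(84, 1.70)
ga102_high = fp32_tflops(84, 1.86)

# ASSUMPTION: AD102 with 144 SMs and the same 128 lanes per SM (rumored).
ad102_clock = clock_for_tflops(92, 144)

print(f"GA102: {ga102_low:.1f} to {ga102_high:.1f} TF")
print(f"92 TF is {92 / ga102_high:.2f}x to {92 / ga102_low:.2f}x of that")
print(f"implied AD102 clock: {ad102_clock:.2f} GHz")
```

If those assumptions hold, 92 TF implies a clock of roughly 2.5 GHz, so the gain would be split between ~1.7x SMs and a sizable clock bump.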
 

Saylick

Diamond Member
Sep 10, 2012
4,060
9,486
136
92 TF FP32 would be like 2.3-2.5x compute power. What's interesting is that the lower tier parts don't get that much of an SM increase so their performance increases won't be anywhere near as dramatic.
Right, but as you're aware, performance won't scale linearly with TFLOPS. Rumormill says 2x performance increase which I think is realistic.
The clocks might get ramped up for the lower end parts to make up for the smaller increase in SM counts. Just a guess.
 

jpiniero

Lifer
Oct 1, 2010
16,841
7,285
136

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Yay to cache wars. The jump this gen will be as large as the one AMD got moving to "Infinity cache". Probably even more impact on perf and power than for AMD, because NV is so shader / RT / Tensor heavy and these are hungry for mem bw.
 

Saylick

Diamond Member
Sep 10, 2012
4,060
9,486
136
I guess Nvidia finally realized that it too has to add a big ol' block of LLC to keep up with bandwidth demands without scaling up the memory bus to ludicrous levels. Hindsight will tell us whether Nvidia's approach of using a large, traditional memory bus supplemented with a smaller LLC is better than going all out on cache with a smaller memory bus a la AMD's Infinity Cache. If in the generation following Lovelace we see Nvidia sticking with the same bus width or even reducing it, but adding even more cache, we'd know that AMD's approach won out.
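A crude way to compare the two approaches, as a first-order sketch only: if a fraction of memory requests hit in the LLC, the DRAM bus only has to serve the misses, so the bandwidth the shaders see is roughly the DRAM bandwidth divided by the miss rate. The bus widths, data rates and hit rates below are hypothetical examples, not leaked specs, and real hit rates vary a lot with resolution and workload.

```python
def dram_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Raw DRAM bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def effective_bandwidth_gbs(dram_gbs, llc_hit_rate):
    """Bandwidth seen by the GPU if only LLC misses go out to DRAM."""
    if not 0 <= llc_hit_rate < 1:
        raise ValueError("hit rate must be in [0, 1)")
    return dram_gbs / (1 - llc_hit_rate)


# ASSUMPTION: wide bus with a smaller cache (lower hit rate).
wide = effective_bandwidth_gbs(dram_bandwidth_gbs(384, 21), 0.30)

# ASSUMPTION: narrow bus with a big cache (higher hit rate).
narrow = effective_bandwidth_gbs(dram_bandwidth_gbs(256, 16), 0.50)

print(f"wide bus + small LLC:  {wide:.0f} GB/s effective")
print(f"narrow bus + big LLC: {narrow:.0f} GB/s effective")
```

Which side wins in this toy model depends entirely on the hit rates you plug in, which is pretty much why it will take hindsight to settle.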