Discussion: Ada/'Lovelace'? Next-gen Nvidia gaming architecture speculation

dr1337

Senior member
May 25, 2020
331
559
106
For the past month or so, the infamous Twitter leaker kopite7kimi has had a nice trickle of next-generation Nvidia leaks, and I haven't seen too many people talking about them.

[Attached image: k1.PNG]

Now, with further extrapolation by 3Dcenter.org and videocardz today, the full AD102 die is pegged at 18,432 FP32 cores on a massive monolithic 5 nm part. This news is really early considering Ampere launched only a few months ago, but kopite has an excellent track record and doesn't expect Lovelace for over another year anyway.
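
For anyone who wants to sanity-check the 18,432 figure, the arithmetic is simple. A minimal sketch, assuming the leaked/extrapolated layout of 144 SMs (12 GPCs of 12 SMs) with the same 128 FP32 units per SM as Ampere's GA10x dies, none of which is confirmed:

```python
# Rough sanity check of the rumored AD102 FP32 count.
# Assumed (leak/extrapolation, not confirmed): 12 GPCs x 12 SMs,
# with Ampere-style 128 FP32 units per SM.
gpcs = 12
sms_per_gpc = 12
fp32_per_sm = 128

total_sms = gpcs * sms_per_gpc        # 144 SMs
total_fp32 = total_sms * fp32_per_sm  # 18,432 FP32 "cores"

print(total_sms, total_fp32)  # 144 18432
```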

IMO this is starting to get interesting. Assuming the leaks are true, it raises the question: is there something wrong with Hopper? Or was Hopper just too far out and Nvidia is refreshing the lineup with "Ampere 2.0" in the meantime? (Too much pressure from AMD?) And holy cow, they're going to nearly double FP32 again?!?! It's hard not to be excited about such a monster graphics card 😆

What do you guys think? Is kopite off their rocker? Is Nvidia really going to give us Big Ampere? I know it's still really early for any legitimate speculation, but it's also so rare that we get such details this far out, though such a sudden shift in plans is also atypical for Nvidia. Still, it sounds very interesting, and it's going to be neat to see how this all pans out.
 

Midwayman

Diamond Member
Jan 28, 2000
5,723
325
126
MCM GPUs just appear to be harder than anyone thought. I remember them on the AMD roadmap probably 5 years ago now. The first company to crack it will own the GPU market for years.
 

CakeMonster

Golden Member
Nov 22, 2012
1,389
496
136
Turing lasted 2 years, so it wouldn't be any surprise if Ampere is also expected to last 2 years. It could have been intended that way all along, but the clickbaiters are just extracting ad revenue out of it by pretending things are happening every week.
 
  • Like
Reactions: Tlh97 and biostud

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Assuming 2x the transistors of GA102, 56.0 B transistors would come out to ~700 mm² on Samsung 7/5 nm. They might be leaning TSMC this time.

GDDR6X at 384-bit and 21 GT/s won't be balanced against that much compute, so HBM2(E) is on the table, unless there is a surprise GDDR revision.
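
To put some rough numbers behind both points (my own back-of-the-envelope assumptions, not anything from the leaks): GA102 packs ~28.3 B transistors into ~628 mm² on Samsung 8N, and 384-bit GDDR6X at 21 GT/s only delivers about 1 TB/s versus the 3090's 936 GB/s today.

```python
# Back-of-the-envelope die size and bandwidth check.
# The GA102 figures are public; the 56 B / 700 mm^2 target is the estimate above.
ga102_transistors_b = 28.3   # billions of transistors
ga102_area_mm2 = 628.0       # Samsung 8N
density = ga102_transistors_b * 1e3 / ga102_area_mm2            # ~45 MTr/mm^2

target_transistors_b = 56.0  # assumed ~2x GA102
target_area_mm2 = 700.0
needed_density = target_transistors_b * 1e3 / target_area_mm2   # ~80 MTr/mm^2
print(f"GA102: ~{density:.0f} MTr/mm^2, needed: ~{needed_density:.0f} MTr/mm^2")

# Memory bandwidth in GB/s = bus width (bits) / 8 * data rate (GT/s)
bw_21gtps = 384 / 8 * 21     # 1008 GB/s
bw_3090 = 384 / 8 * 19.5     # 936 GB/s on today's RTX 3090
print(f"384-bit @ 21 GT/s: {bw_21gtps:.0f} GB/s vs RTX 3090: {bw_3090:.0f} GB/s")
```

So you'd need roughly 80 MTr/mm² to fit 56 B transistors into 700 mm², and you'd only get ~8% more bandwidth to feed roughly twice the FP32 throughput.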
 

GodisanAtheist

Diamond Member
Nov 16, 2006
6,783
7,117
136
Assuming 2x the transistors of GA102, 56.0 B transistors would come out to ~700 mm² on Samsung 7/5 nm. They might be leaning TSMC this time.

GDDR6X at 384-bit and 21 GT/s won't be balanced against that much compute, so HBM2(E) is on the table, unless there is a surprise GDDR revision.

- TSMC can barely keep up with manufacturing for its current customers even without NV's volume parts. I honestly hope NV sticks with Samsung and that Samsung starts attracting additional customers. We can't have only one cutting-edge fab company.
 

Justinus

Diamond Member
Oct 10, 2005
3,173
1,515
136
Nvidia's margins are too thin already with how Ampere shook out; they aren't going to rush to launch an even more gargantuan GPU at even worse margins anytime soon. I'm on board with us being stuck with RDNA2 and Ampere as they are until 2022.
 
  • Like
Reactions: amenx

Mopetar

Diamond Member
Jan 31, 2011
7,831
5,980
136
Any card that large is going to be for the data center only. We already know that the 3090 didn't scale particularly well over the 3080, so throwing even more cores at a gaming card doesn't get you anywhere, even less so if they continue to push RT, since that's where the biggest bottleneck is right now.

If this is similar to Ampere, it may not be indicative of anything about the rest of the stack, since GA100 was fabbed at TSMC while the other Ampere dies were done at Samsung and used a different design in many respects.
 

marcUK2

Member
Sep 23, 2019
74
39
61
I own a 3090, but I'm a creator, so I have a use for it; otherwise I would buy a 3070/80. I've played some games just to see what it's capable of, and IMHO it would be a waste of die space to add any more shader cores.

Ray tracing is crap at the moment, but it's obviously the future, as traditional methods are maxed out in terms of quality. Having done some tests in Unreal Engine that brought my 3090 down to 1 fps with ray tracing, I would rather have 10,000 RT cores and 10,000 shader cores than another decade of significant shader-core increases paired with lame ray tracing.
 

marcUK2

Member
Sep 23, 2019
74
39
61
I'm kind of interested in why MCM GPUs should be so hard to pull off. GPU architectures are already so highly parallelizable, so why can't the cores easily be split from the scheduler and I/O and put on separate dies, like Ryzen's MCM approach?
 

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Any card that large is going to be for the data center only. We already know that the 3090 didn't scale particularly well over the 3080, so throwing even more cores at a gaming card doesn't get you anywhere, even less so if they continue to push RT, since that's where the biggest bottleneck is right now.

If this is similar to Ampere, it may not be indicative of anything about the rest of the stack, since GA100 was fabbed at TSMC while the other Ampere dies were done at Samsung and used a different design in many respects.
The number of GPCs is changing with Ada, so the scaling between the GeForce RTX 3080 and 3090, which both have 6 GPCs, is not necessarily applicable.

Oops, had a brain fart: the RTX 3090 has 7 GPCs.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
I'm kind of interested in why MCM GPUs should be so hard to pull off. GPU architectures are already so highly parallelizable, so why can't the cores easily be split from the scheduler and I/O and put on separate dies, like Ryzen's MCM approach?
Data movement = power.
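
To put a rough number on that, here's a purely illustrative sketch; the energy-per-bit figures are ballpark values for on-die wires versus off-die links, not AMD or Nvidia specs:

```python
# Illustrative only: power cost of moving data between chiplets.
def link_power_watts(bandwidth_gbps, pj_per_bit):
    """Power = bits per second * energy per bit."""
    bits_per_s = bandwidth_gbps * 1e9 * 8     # GB/s -> bits/s
    return bits_per_s * pj_per_bit * 1e-12    # pJ -> joules per second (watts)

traffic_gbps = 2000  # assume ~2 TB/s of cross-die traffic
print(f"on-die  (~0.1 pJ/bit): {link_power_watts(traffic_gbps, 0.1):.1f} W")
print(f"off-die (~1.0 pJ/bit): {link_power_watts(traffic_gbps, 1.0):.1f} W")
# ~1.6 W vs ~16 W for the same traffic, before any protocol overhead; scale the
# traffic toward full shader-to-memory bandwidth and the gap gets much worse.
```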
 
  • Like
Reactions: Kepler_L2

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
Ampere is quite literally the first architecture from Nvidia since before Fermi that I could see getting a straightforward port to 5 nm with only very minor architectural tweaks. Scaling is currently much worse with bigger dies / more cores than with past architectures, and efficiency is in the dumps (which could be largely down to Samsung's node), so perhaps they would fix a few bottlenecks, but I could seriously see Nvidia injecting Lovelace into the timeline as "Ampere 1.5" on 5 nm to "fix" Ampere and buy more time for Hopper.
 
  • Like
Reactions: coercitiv

Mopetar

Diamond Member
Jan 31, 2011
7,831
5,980
136
The efficiency on Ampere is only bad because Nvidia did what AMD had been doing for several generations: pushing the cards to the limits of the silicon, which makes them guzzle power.

Based on testing from numerous websites and forum users, the power draw can be cut dramatically with a very small decrease in clock speed and an accompanying undervolt.
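
As a rough illustration of why a small clock cut plus an undervolt helps so much (textbook dynamic-power scaling, not measured Ampere data):

```python
# Dynamic power scales roughly with frequency * voltage^2 (P ~ C * f * V^2).
# Illustrative numbers only.
def relative_power(freq_scale, volt_scale):
    return freq_scale * volt_scale ** 2

# e.g. drop clocks ~5% and undervolt ~8%
print(f"{relative_power(0.95, 0.92):.2f}x power")  # ~0.80x, i.e. ~20% less
# On a 350 W card that's on the order of 70 W saved for a ~5% clock reduction.
```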

A die shrink makes sense precisely because just adding more CUDA cores doesn't, but I think they'll want to find ways to better utilize all of those cores and overhaul the RT portions of the architecture to aim for at least doubling performance again.
 

Midwayman

Diamond Member
Jan 28, 2000
5,723
325
126
I wonder how much of the poor launch availability can be attributed to them pushing clocks too hard. Are a lot of dies failing to hit the target frequency?
 

marcUK2

Member
Sep 23, 2019
74
39
61
Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8 nm node that's bad... it's totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250 W on their own. The PCIe slot supplies 60 W and the two 12 V 8-pins supply around 140 W each, and my 3090 absolutely maxes out at 350 W. I do not raise the power limit (I have actually reduced it), and the RAM continues to draw considerable power even when the GPU clocks are under 1 GHz and the core is down at 20 W.
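
Summing the figures above (just arithmetic on the numbers quoted in this post, not additional measurements):

```python
# Power-delivery budget as reported on my card (observed values, not the
# PCIe spec limits of 75 W per slot / 150 W per 8-pin).
slot_w = 60
eight_pin_w = 140
total_in_w = slot_w + 2 * eight_pin_w       # 340 W, right around the 350 W limit

board_and_vram_w = 250                      # worst case quoted above
gpu_core_w = total_in_w - board_and_vram_w  # leaves only ~90 W for the core itself
print(total_in_w, gpu_core_w)               # 340 90
```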
 
  • Like
Reactions: CP5670

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8 nm node that's bad... it's totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250 W on their own. The PCIe slot supplies 60 W and the two 12 V 8-pins supply around 140 W each, and my 3090 absolutely maxes out at 350 W. I do not raise the power limit (I have actually reduced it), and the RAM continues to draw considerable power even when the GPU clocks are under 1 GHz and the core is down at 20 W.
How much does PWR_SRC report for power draw? I assume that's the memory power.
[Attached screenshot: 85b.png]
 

marcUK2

Member
Sep 23, 2019
74
39
61
I would hope that the MCM Hopper actually refers to a separation of CUDA cores and RT cores; it would be pretty cool to have a reasonable amount of die space devoted to RT.
 

Mopetar

Diamond Member
Jan 31, 2011
7,831
5,980
136
I wonder how much of the poor launch availability can be attributed to them pushing clocks too hard. Are a lot of dies failing to hit the target frequency?

I think it's much the same as what we saw with AMD, where the clocks and voltage are set higher than they probably should be. That does allow much more silicon to qualify, and it would mean they're holding less back in reserve for some future card once they have enough chips that fall into that bin.

We could argue about the underlying reason for that all day, but I don't think it has resulted in lower availability at launch. That's just down to limited wafers and the other components needed to make their cards.
 

Hitman928

Diamond Member
Apr 15, 2012
5,244
7,793
136
Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8 nm node that's bad... it's totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250 W on their own. The PCIe slot supplies 60 W and the two 12 V 8-pins supply around 140 W each, and my 3090 absolutely maxes out at 350 W. I do not raise the power limit (I have actually reduced it), and the RAM continues to draw considerable power even when the GPU clocks are under 1 GHz and the core is down at 20 W.

If the board power and VRAM power were so bad (compared to other modern graphics cards), you would expect the 3070, with GDDR6 (non-X) and a much simpler PCB and components, to be much more efficient than a 3080 or 3090, but that doesn't seem to be the case?

[Chart: performance per watt, 3840×2160]

NVIDIA GeForce RTX 3070 Founders Edition Review - Disruptive Price-Performance | TechPowerUp
GeForce RTX 3070 Benchmark Review - YouTube
 

marcUK2

Member
Sep 23, 2019
74
39
61
If the board power and VRAM power were so bad (compared to other modern graphics cards), you would expect the 3070, with GDDR6 (non-X) and a much simpler PCB and components, to be much more efficient than a 3080 or 3090, but that doesn't seem to be the case?

[Chart: performance per watt, 3840×2160]

NVIDIA GeForce RTX 3070 Founders Edition Review - Disruptive Price-Performance | TechPowerUp
GeForce RTX 3070 Benchmark Review - YouTube
...And then you post a graphic showing the 3070 as the most efficient solution out there, with its simple board and generic RAM... OK, lol
 
  • Haha
Reactions: Midwayman