Discussion Ada/'Lovelace'? Next gen Nvidia gaming architecture speculation

dr1337 · Dec 28, 2020

For the past month or so, infamous Twitter leaker kopite7kimi has had a nice trickle of next-generation Nvidia leaks, and I haven't seen too many people talking about it.

https://twitter.com/kopite7kimi/status/1336900538185572357

https://twitter.com/x/status/1337739048270479361

https://twitter.com/x/status/1343462867811495937

And now today with further extrapolation by 3Dcenter.org and videocardz, the full AD102 die is pegged at 18,432 FP32 cores as a massive monolithic 5nm part. Though this news is really early considering Ampere launched only a few months ago, kopite has an excellent track record and also doesn't expect Lovelace for over another year anyways.

IMO this is starting to get interesting. Assuming the leaks are true, it begs the question: is there something wrong with Hopper? Or was Hopper just too far out, and Nvidia is refreshing the lineup with "Ampere 2.0" in the meantime? (Too much pressure from AMD?) And holy cow, they're going to nearly double FP32 again?!?! It's hard not to be excited about such a monster graphics card 😆

What do you guys think? Is kopite off their rocker? Is Nvidia really going to give us Big Ampere? I know it's still really early for any legitimate speculation, but it's also so rare that we'd get such details so far out, though such a sudden shift in plans is also atypical for Nvidia. Still, sounds very interesting, and it's gonna be neat to see how this all pans out.

Midwayman · Dec 28, 2020

MCM GPUs just appear to be harder than anyone thought. I remember them on the AMD roadmap probably 5 years ago now. First company to crack it will own the GPU market for years.

CakeMonster · Dec 28, 2020

Turing lasted 2 years, it wouldn't be any surprise if Ampere is also expected to last 2 years. It could have been intended that way all the time, but the clickbaiters are just extracting ad revenue out of it by pretending things are happening every week.

Bouowmx · Dec 28, 2020

Assuming 2x transistors of GA102, 56.0 B transistors is going to be ~700 mm^2 on Samsung 7/5 nm. Might be leaning TSMC this time.

GDDR6X, 384-bit, 21 GT/s won't be balanced, so HBM2(E) is on the table, unless there is a surprise GDDR revision.
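
A quick back-of-the-envelope check of those numbers, as a sketch only. The 56.0 B transistors, ~700 mm^2, 384-bit and 21 GT/s figures are from the post; the GA102 transistor count and die area and the RTX 3090's 19.5 GT/s memory speed are assumptions taken from public spec sheets, and the function names are made up for illustration.

```python
# Inputs marked "assumption" are not from the post; they come from public specs.
GA102_TRANSISTORS = 28.3e9   # assumption: GA102 transistor count
GA102_AREA_MM2 = 628.4       # assumption: GA102 die area in mm^2
RTX3090_RATE_GT_S = 19.5     # assumption: RTX 3090 GDDR6X data rate


def density_mtr_per_mm2(transistors: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


def bandwidth_gb_per_s(bus_width_bits: int, rate_gt_per_s: float) -> float:
    """Peak memory bandwidth: bus width in bytes times transfers per second."""
    return bus_width_bits / 8 * rate_gt_per_s


ga102_density = density_mtr_per_mm2(GA102_TRANSISTORS, GA102_AREA_MM2)
next_density = density_mtr_per_mm2(56.0e9, 700)   # figures from the post
bw_now = bandwidth_gb_per_s(384, RTX3090_RATE_GT_S)
bw_next = bandwidth_gb_per_s(384, 21)             # figures from the post

print(f"GA102 density:        {ga102_density:.1f} MTr/mm^2")
print(f"Implied next density: {next_density:.1f} MTr/mm^2")
print(f"Bandwidth now / next: {bw_now:.0f} / {bw_next:.0f} GB/s")
print(f"Bandwidth gain:       {bw_next / bw_now - 1:.1%}")
```

Under these assumptions the bandwidth only grows by a single-digit percentage while the transistor budget doubles, which is the imbalance the post is pointing at.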

GodisanAtheist · Dec 28, 2020

Bouowmx said:
Assuming 2x transistors of GA102, 56.0 B transistors is going to be ~700 mm^2 on Samsung 7/5 nm. Might be leaning TSMC this time.

GDDR6X, 384-bit, 21 GT/s won't be balanced, so HBM2(E) is on the table, unless there is a surprise GDDR revision.

-TSMC can barely keep up with manufacturing for its current customers sans NV volume parts. I honestly hope NV sticks with Samsung and that Samsung starts attracting additional customers. Can't have only one cutting edge fab company.

Midwayman · Dec 28, 2020

GodisanAtheist said:
-TSMC can barely keep up with manufacturing for its current customers sans NV volume parts. I honestly hope NV sticks with Samsung and that Samsung starts attracting additional customers. Can't have only one cutting edge fab company.

Well even if Samsung gets more customers there still will only be one cutting edge fab.

JasonLD · Dec 28, 2020

MCM gpus will probably remain on compute only for a while.

Justinus · Dec 28, 2020

Nvidia's margins are too thin already with how Ampere shook out - they aren't going to rush to launch an even more gargantuan GPU at even worse margins anytime soon. I'm onboard with us being stuck with RDNA2 and Ampere as they are until 2022.

Mopetar · Dec 28, 2020

Any card that large is going to be for the data center only. We already know that the 3090 didn't scale particularly well over the 3080 so throwing even more cores at a gaming card doesn't get you anywhere, even less so if they continue to push RT since that's where the biggest bottleneck is at right now.

If this is similar to Ampere it may not be indicative of anything else since GA100 was fabbed at TSMC while the rest of the stack was done at Samsung and used a different design in many respects.

marcUK2 · Dec 28, 2020

I own a 3090, but I'm a creator so I have a use for it; otherwise I would buy a 3070/80. I've played some games just to see what it's capable of, and imho it's just a waste of die space to have any more shader cores.

Raytracing is crap at the moment, but it's obviously the future, as traditional methods are maxed out in terms of quality. Having done some tests in Unreal Engine, bringing my 3090 down to 1fps with raytracing, I would rather have 10000 RT cores and 10000 shader cores than any significant increase in shader cores and lame raytracing for the next decade.

marcUK2 · Dec 28, 2020

I'm kind of interested in why MCM GPUs should be so hard to pull off. Surely, as GPU architecture is so highly parallelizable already, why can't the cores easily be split from the scheduler and IO and be put on separate dies, a la Ryzen MCM?

Bouowmx · Dec 28, 2020

Mopetar said:
Any card that large is going to be for the data center only. We already know that the 3090 didn't scale particularly well over the 3080 so throwing even more cores at a gaming card doesn't get you anywhere, even less so if they continue to push RT since that's where the biggest bottleneck is at right now.

If this is similar to Ampere it may not be indicative of anything else since GA100 was fabbed at TSMC while the rest of the stack was done at Samsung and used a different design in many respects.

~~The number of GPC is changing with Ada, so the scaling with GeForce RTX 3080 and 3090, which both have 6 GPC, is not necessarily applicable.~~

Oops, had a brain stroke. The RTX 3090 has 7 GPC.

maddie · Dec 28, 2020

marcUK2 said:
I'm kind of interested in why MCM GPUs should be so hard to pull off. Surely, as GPU architecture is so highly parallelizable already, why can't the cores easily be split from the scheduler and IO and be put on separate dies, a la Ryzen MCM?

Data movement = power.

marcUK2 · Dec 28, 2020

Maybe we need highly rectangular chips with 1024bit interfaces?

maddie · Dec 28, 2020

marcUK2 said:
Maybe we need highly rectangular chips with 1024bit interfaces?

Internal data movement.

tviceman · Dec 28, 2020

Ampere is quite literally the first architecture from Nvidia since before Fermi that I could see getting a straightforward port to 5nm with only very minor tweaks to the architecture. Scaling is currently much worse with bigger dies / more cores than with past architectures, and efficiency is in the dumps (which could be largely to blame on Samsung's node), so perhaps they would fix a few bottlenecks, but I could seriously see Nvidia injecting Lovelace into the timeline as Ampere 1.5 on 5nm to "fix" Ampere and buy more time for Hopper.

Mopetar · Dec 30, 2020

The efficiency on Ampere is only bad because Nvidia did what AMD had been doing for several generations: pushing the cards to the limits of the silicon, which makes them guzzle power.

Based on testing from numerous websites and forum users the power draw can be cut dramatically with a very small decrease in clock speed and an accompanying under-volt.
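
To illustrate why a small clock drop plus an undervolt can cut power so much, here is a rough sketch using the textbook dynamic-power relation P ~ C * V^2 * f. It ignores leakage and memory/board power, and the 5% clock and 10% voltage reductions are hypothetical numbers picked for illustration, not measurements from any card.

```python
def relative_dynamic_power(clock_scale: float, voltage_scale: float) -> float:
    """Dynamic power relative to stock, from P ~ C * V^2 * f with C held fixed."""
    return voltage_scale ** 2 * clock_scale


# Hypothetical example: 5% lower clock combined with a 10% undervolt.
ratio = relative_dynamic_power(clock_scale=0.95, voltage_scale=0.90)
print(f"Dynamic power vs stock: {ratio:.4f}")
print(f"Performance given up (clock only): {1 - 0.95:.0%}")
```

With those made-up inputs the dynamic power lands at roughly 77% of stock for a 5% clock loss, because voltage enters the relation squared.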

A die shrink makes sense just because adding more CUDA cores doesn't make a lot of sense, but I think they'll want to find ways to better utilize all of those cores and overhaul the RT portions of the architecture to aim for at least doubling the performance again.

Midwayman · Dec 30, 2020

I wonder how much of the poor launch availability can be attributed to them pushing clocks too much? I wonder if a lot of cores are failing out on frequency?

marcUK2 · Dec 30, 2020

Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8nm node that's bad... it's actually totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250W on their own. The PCIe slot supplies 60W and the two 12V 8-pins supply around 140W each, and my 3090 absolutely maxes out at 350W. I haven't raised the power limit, but I have reduced it, and the RAM continues to draw considerable power even when the GPU clocks are under 1GHz and the GPU is down at 20W.
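
The arithmetic behind those readings can be sketched as below. All wattages are the approximate figures reported in this post, not independent measurements, and the variable names are just for illustration.

```python
# Approximate per-rail readings from the post, in watts.
rails_w = {"PCIe slot": 60, "8-pin #1": 140, "8-pin #2": 140}

BOARD_LIMIT_W = 350          # reported maximum board power
MEM_AND_BOARD_PEAK_W = 250   # reported "up to" figure for memory plus PCB

total_w = sum(rails_w.values())
# What is left for the GPU chip if memory and board really hit their peak.
gpu_remainder_w = BOARD_LIMIT_W - MEM_AND_BOARD_PEAK_W

print(f"Sum of rails: {total_w} W (limit {BOARD_LIMIT_W} W)")
print(f"GPU share at peak memory/board draw: {gpu_remainder_w} W")
```

The rails add up to slightly under the reported limit, and if the 250W figure held at full load it would leave only about 100W for the chip itself, which is the claim being made here.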

Bouowmx · Dec 30, 2020

marcUK2 said:
Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8nm node that's bad... it's actually totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250W on their own. The PCIe slot supplies 60W and the two 12V 8-pins supply around 140W each, and my 3090 absolutely maxes out at 350W. I haven't raised the power limit, but I have reduced it, and the RAM continues to draw considerable power even when the GPU clocks are under 1GHz and the GPU is down at 20W.

How much does PWR_SRC power draw say? I assume that's the memory power.

marcUK2 · Dec 30, 2020

I would hope that the mcm hopper actually refers to a separation of cuda cores and RT cores, it would be pretty cool to have a reasonable die space devoted to RT.

marcUK2 · Dec 30, 2020

Bouowmx said:
How much does PWR_SRC power draw say? I assume that's the memory power.

I've had it up near 150w. Hover over the label with the mouse and it tells you it's the RAM. I'll post a pic next time I'm at the pc

Mopetar · Dec 30, 2020

Midwayman said:
I wonder how much of the poor launch availability can be attributed to them pushing clocks too much? I wonder if a lot of cores are failing out on frequency?

I think it's much the same as what we saw with AMD, where the clocks and voltage are set higher than they probably should be, which does allow much more silicon to qualify and would mean that they're holding less back in reserve for some future card once they have enough chips that fall into that bin.

We could argue about the underlying reason or cause for that all day, but I don't think it has resulted in lower availability at launch. That's just down to limited wafers and other components needed to make their cards.

Hitman928 · Dec 30, 2020

marcUK2 said:
Having done some more testing with my 3090, I can confirm that the actual GPU chip doesn't draw that much power at stock settings, so it's not the 8nm node that's bad... it's actually totally eclipsed by the power drawn by the GDDR6X and the circuit board, which can use up to 250W on their own. The PCIe slot supplies 60W and the two 12V 8-pins supply around 140W each, and my 3090 absolutely maxes out at 350W. I haven't raised the power limit, but I have reduced it, and the RAM continues to draw considerable power even when the GPU clocks are under 1GHz and the GPU is down at 20W.

If the board power and VRAM power were so bad (compared to other modern graphics cards), you would expect that the 3070, with GDDR6 (non-X) and a much simpler PCB and components, would be much more efficient than a 3080 or 3090, but that doesn't seem to be the case?

NVIDIA GeForce RTX 3070 Founders Edition Review - Disruptive Price-Performance | TechPowerUp

GeForce RTX 3070 Benchmark Review - YouTube

marcUK2 · Dec 30, 2020

Hitman928 said:
If the board power and VRAM power were so bad (compared to other modern graphics cards), you would expect that the 3070, with GDDR6 (non-X) and a much simpler PCB and components, would be much more efficient than a 3080 or 3090, but that doesn't seem to be the case?

NVIDIA GeForce RTX 3070 Founders Edition Review - Disruptive Price-Performance | TechPowerUp
View attachment 36619
GeForce RTX 3070 Benchmark Review - YouTube

...And then shows a graphic showing the 3070 as the most efficient solution out there, with its simple board and generic RAM... ok lol
