Discussion Ada/'Lovelace'? Next gen Nvidia gaming architecture speculation


dr1337

Senior member
May 25, 2020
337
566
106
Though the GDDR7 part seems silly, afaik it's not even done being finalized by JEDEC as a spec yet, let alone coming out this year.
This is exactly what everyone thought when Ampere was leaked as having GDDR6X, which still has no JEDEC spec. I'd absolutely believe Micron would sell Nvidia GDDR7 before it's officially in the specification, but I'd sooner believe that part of his 'leak' is just speculation. Samsung is pushing standard GDDR6 modules to speeds higher than currently available G6X, so a new generation/iteration of GDDR6X does seem very likely.
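For a sense of what those per-pin speeds mean at the card level, here's a rough back-of-envelope (the bus width and the 24 Gbps figure are illustrative assumptions, not leaked specs):

```python
# Card bandwidth = (bus width in bytes) x (per-pin data rate).
# All figures below are assumptions for illustration, not leaked specs.
def card_bandwidth_gb_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    return bus_width_bits / 8 * gbps_per_pin

print(card_bandwidth_gb_s(384, 19.5))  # 3090-class GDDR6X: ~936 GB/s
print(card_bandwidth_gb_s(384, 24.0))  # hypothetical 24 Gbps GDDR6/G6X successor: ~1152 GB/s
```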
 

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
i'm getting GTX480 vibes again
This was my exact thought. This will just be Fermi v2.0, and with twice the power consumption of the original Fermi to boot. Good lord, how is anyone going to game with a 500W+ GPU without their room heating up 10 degrees? It's almost a guarantee that the AC unit will need to be turned on whenever you game.
 

Mopetar

Diamond Member
Jan 31, 2011
7,842
5,993
136
Nvidia has solved the mining problem! By making cards that draw enormous amounts of power, they make mining on them not profitable! Praise Jensen!

ETH miners actually run their cards with an underclock, since ETH mining is memory-bound. The Ampere cards perform quite efficiently when they aren't being pushed to the limit for that last 5% performance bump.

Even if they were mining something that's compute-bound, clock speeds and voltage would be adjusted to maximize profit. Nvidia would have to create a generally worse-performing card for miners to shun it.
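Rough toy model of the profit math (all the numbers are made up, just to show why shaving power matters more than the last bit of clock speed):

```python
# Toy model of why miners tune clocks/power for efficiency rather than raw speed.
# All numbers below are made-up placeholders, not real hashrates or prices.
def daily_profit(hashrate_mhs: float, board_power_w: float,
                 usd_per_mhs_day: float = 0.05, usd_per_kwh: float = 0.12) -> float:
    revenue = hashrate_mhs * usd_per_mhs_day
    electricity = board_power_w / 1000 * 24 * usd_per_kwh
    return revenue - electricity

# Stock settings: card pushed for the last few percent of core clock.
print(daily_profit(hashrate_mhs=120, board_power_w=320))
# Undervolted/underclocked core, memory clock kept high (ETH is memory-bound):
print(daily_profit(hashrate_mhs=118, board_power_w=230))
```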
 

Timmah!

Golden Member
Jul 24, 2010
1,419
631
136
How can it be slightly less than 1000mm2 and monolithic?

It will be single-die-per-wafer sized :-D

The 4090 with 144 SMs (well, probably slightly fewer) and a rumored 96 MB of L2 cache sounds great. Will be all over it, but I hope the price won't be higher than the current 3090's inflated price... which is like 2500 euros with VAT.
 

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
How is Nvidia planning to counter 256MB/512MB of stacked Infinity Cache on RDNA3?
96 MB of on-die cache, humongous monolithic die, ramping up clocks and consequently power, and most importantly of all: launch Lovelace before RDNA3 with the full might of the Nvidia marketing team.
 
  • Haha
Reactions: igor_kavinski

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
Videocardz has some leaks (more so from the usual Twitter leakers) about Hopper.

 

Aapje

Golden Member
Mar 21, 2022
1,382
1,865
106
This is exactly what everyone thought when Ampere was leaked as having GDDR6X, which still has no JEDEC spec. I'd absolutely believe Micron would sell Nvidia GDDR7 before it's officially in the specification, but I'd sooner believe that part of his 'leak' is just speculation. Samsung is pushing standard GDDR6 modules to speeds higher than currently available G6X, so a new generation/iteration of GDDR6X does seem very likely.

GDDR6X uses PAM4 signalling to send more data over the same size bus. This is expensive, so I wouldn't expect GDDR7 to use PAM4.
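Quick illustration of why PAM4 buys bandwidth (the data rates are just example numbers): each symbol encodes 2 bits instead of 1, so the same signalling rate over the same-width bus carries twice the data, at the cost of tighter voltage margins and pricier PHYs.

```python
# Per-pin data rate = symbol rate x bits per symbol. Example numbers only.
def pin_rate_gbps(symbol_rate_gbaud: float, bits_per_symbol: int) -> float:
    return symbol_rate_gbaud * bits_per_symbol

print(pin_rate_gbps(10.5, 1))  # two-level (NRZ-style) signalling: 10.5 Gbps/pin
print(pin_rate_gbps(10.5, 2))  # PAM4 (four voltage levels, 2 bits/symbol): 21 Gbps/pin
```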

GDDR7 will require a smaller production process, but Samsung has been having problems with their 5 nm yields and there is a shortage of production capacity for the foreseeable future, so they might not be able to produce it right now. For GDDR7 to make it into the first version of Lovelace, they would already need to be tooling up their plants for it, and I haven't heard anything about that, so I don't see it happening. Perhaps for a refresh in 2023.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
6,817
7,177
136
If Nvidia manages to win with just 96MB cache against RDNA3's 512MB, I'll be like "WOAHHHHH!".

- It's NV's market to lose, quite honestly. They're in kind of an awkward position (from a hobbyist perspective): if they win, it's what everyone expected, because it's what they've done for every launch since the 7xxx GHz editions, while if they lose it will be a shocking upset for the same reason.

RDNA 2 competing across the stack in raster performance while also bringing RT performance to the table is already a massive feat that I don't think most folks really expected (a lot of "AMD will be competing with the 3070" posts pre-launch).

AMD weirdly had the benefit of basically zero expectations after the Vega/Polaris debacle, although folks are certainly more expectant of RDNA 3 than of either of the prior archs.
 

Aapje

Golden Member
Mar 21, 2022
1,382
1,865
106
I think that AMD's GPU side is really benefiting from the investments on the CPU side, as they seem to have really taken advantage of innovations that were made for CPUs. For example, both multi-die packaging and Infinity Cache were developed for the CPUs but are also used in the GPUs.

I wonder if this is why Nvidia tried to buy Arm: to also take advantage of such synergy. In the future, Nvidia will be up against two competitors with strong CPU divisions, who can thus make investments that pay off in two markets, while Nvidia has to earn it all back by selling GPUs.
 

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
- It's NV's market to lose, quite honestly. They're in kind of an awkward position (from a hobbyist perspective): if they win, it's what everyone expected, because it's what they've done for every launch since the 7xxx GHz editions, while if they lose it will be a shocking upset for the same reason.

RDNA 2 competing across the stack in raster performance while also bringing RT performance to the table is already a massive feat that I don't think most folks really expected (a lot of "AMD will be competing with the 3070" posts pre-launch).

AMD weirdly had the benefit of basically zero expectations after the Vega/Polaris debacle, although folks are certainly more expectant of RDNA 3 than of either of the prior archs.
Yeah, being the market leader, it's always been Nvidia's market to lose. But a competitor hasn't been this close to Nvidia in overall performance, perf/W, and feature set in a LONG time. AMD was already on par with Nvidia in raster performance with RDNA 2, and they are expected to make another step-function jump in raster and RT performance with RDNA 3. Even if Lovelace beats RDNA 3 in RT (even if it were, say, 30-50% faster), I feel like we're still in that awkward transition phase where RT gets better every year but there's still no game that leverages RT in a way that makes it unplayable or unenjoyable with RT disabled. In other words, RT still takes a back seat to rasterization in modern games; it is only used to enhance certain visual effects, and as a whole it isn't a "requirement".

Going back to feature set, FSR 2.0 is coming out later this year, and if all reports are true, AMD is really trying to button it up so that RDNA 3 hits it out of the park without controversy or trouble.

Lastly, if AMD is already gunning for Nvidia's top end, Intel will come in and hit Nvidia's bottom end when Arc launches later this year. Rumors say that Intel is willing to make less profit per card just to get a foothold in the discrete GPU market, and they plan on launching ASAP to capture customers' wallets before AMD or Nvidia can properly launch their next-gen cards. Nvidia is being squeezed in the mobile/laptop space as well for the aforementioned reasons: since they don't design CPUs, once AMD and Intel get competitive laptop GPUs in both performance and feature set, Intel and AMD will simply bundle their CPU+GPU together and push Nvidia out.

If you've been following Nvidia's business moves in the last few months, you'll notice that Nvidia has been buying a bunch of smaller companies to flesh out their HPC/enterprise software stack. I think Nvidia knows that their core markets will be more competitive than ever before, and in response they have to diversify and expand their other markets to compensate.
 
  • Like
Reactions: GodisanAtheist

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
I think that AMD's GPU side is really benefiting from the investments on the CPU side, as they seem to have really taken advantage of innovations that were made for CPUs. For example, both multi-die packaging and Infinity Cache were developed for the CPUs but are also used in the GPUs.

I wonder if this is why Nvidia tried to buy Arm: to also take advantage of such synergy. In the future, Nvidia will be up against two competitors with strong CPU divisions, who can thus make investments that pay off in two markets, while Nvidia has to earn it all back by selling GPUs.
Absolutely. The GPU side benefits not just from leveraging lessons learned by the CPU design teams; the CPU side's success and profitability also mean the financial resources AMD can allocate to the GPU side have never been greater.
 

gdansk

Platinum Member
Feb 8, 2011
2,116
2,615
136
And both AMD and Nvidia see that the future (for certain tasks) is putting memory and compute right next to the CPU with stronger and stronger "glue".

The Arm acquisition made sense in that regard, but I don't think it is *necessary* for Nvidia to create converged accelerators, or whatever you want to call them.
 

Glo.

Diamond Member
Apr 25, 2015
5,711
4,558
136
If Nvidia manages to win with just 96MB cache against RDNA3's 512MB, I'll be like "WOAHHHHH!".
Short story: they won't.

Again: don't expect miracles from Ada. It's Ampere on a smaller node, with a larger L2 cache that lifts the memory bandwidth bottleneck.

To some degree...
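To put some rough numbers on that "to some degree": the more traffic the L2 catches, the less has to go over the GDDR bus, so the effective bandwidth seen by the SMs gets amplified. The bandwidths and hit rates below are placeholder assumptions, not specs.

```python
# Effective bandwidth is the GDDR bandwidth amplified by 1/(1 - hit rate),
# capped by the cache's own bandwidth. All numbers are placeholder assumptions.
def effective_bw_gb_s(hit_rate: float, dram_bw: float, cache_bw: float) -> float:
    return min(dram_bw / (1.0 - hit_rate), cache_bw)

dram_bw = 1000.0   # ~1 TB/s of GDDR6X, roughly 3090-class, just for scale
cache_bw = 5000.0  # made-up on-die L2 bandwidth
for hit_rate in (0.3, 0.5, 0.7):
    print(f"hit rate {hit_rate:.0%}: ~{effective_bw_gb_s(hit_rate, dram_bw, cache_bw):.0f} GB/s")
```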
 

gdansk

Platinum Member
Feb 8, 2011
2,116
2,615
136
It is, however, a very large node upgrade, and if the rumors are correct, Nvidia is willing to ship space heaters.

So it might be ahead. Do we really know enough to say it won't be?
 

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
Videocardz has some leaks (more so from the usual Twitter leakers) about Hopper.

Videocardz dropping some bombs before tomorrow's GTC presentation:

Looks like 144 SMs in total, assuming the die shot is representative. I count 12 SMs per row, and there are about 12 rows (6 above and 6 below the centerline). Nvidia will likely disable some SMs for yield reasons, however.
NVIDIA-Hopper-H100.jpg
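Counting sanity check (the per-row count and the number fused off are my own eyeballing/guesses, not confirmed):

```python
# 12 SMs per row x 12 rows (6 above, 6 below the centerline) = 144 physical SMs.
sms_per_row = 12
rows = 12
print(sms_per_row * rows)       # 144 on the full die
print(sms_per_row * rows - 12)  # e.g. a shipping SKU with ~12 SMs disabled for yield
```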


Edit: Just wanted to bring up GA100 for comparison. Overall, the layout looks pretty similar to server Ampere, but I'm sure there's some secret sauce to Hopper, e.g. next-generation tensor units and/or a doubling of FP32 units like gaming Ampere. GH100 does look a little bigger than GA100 just going off the size of the HBM stacks. In the Hopper render, you can see that the die is wider than 3 stacks of HBM (there are some gaps between each stack), while for Ampere the die is slightly less wide than 3 stacks without any gaps between them. The height of the die looks comparable between the two generations, given that GA100 is more square in aspect ratio while GH100 is wider than it is tall.
EX_HYvcXgAAMqf8.jpg
 
Last edited:

jpiniero

Lifer
Oct 1, 2010
14,605
5,224
136

Yikes, 700 W. And it's on N4 too instead of N5. Of course the big increase is in low precision (6x)
 
  • Like
Reactions: Saylick

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136

Yikes, 700 W. And it's on N4 too instead of N5. Of course the big increase is in low precision (6x)
Looks like there's a doubling of FP/tensor units somewhere, not unlike gaming Ampere, because the transistor count is only 80B, which isn't a 3x bump over GA100. Also, it's not like the die is 3x bigger either. They are up against reticle limits already.
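For reference, GA100 was around 54.2B transistors, so this works out to roughly a 1.5x bump, nowhere near 3x:

```python
# GA100 is ~54.2B transistors; GH100 is reported at 80B.
ga100 = 54.2e9
gh100 = 80e9
print(gh100 / ga100)  # ~1.48x, i.e. about a 1.5x increase in transistor budget
```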
 
Jul 27, 2020
16,329
10,340
106

Yikes, 700 W. And it's on N4 too instead of N5. Of course the big increase is in low precision (6x)
Meh. It can't run Crysis :p
 
  • Love
  • Haha
Reactions: psolord and Saylick

Saylick

Diamond Member
Sep 10, 2012
3,170
6,398
136
Looks like there's a doubling of FP/tensor units somewhere, not unlike gaming Ampere, because the transistor count is only 80B, which isn't a 3x bump over GA100. Also, it's not like the die is 3x bigger either. They are up against reticle limits already.
Yep, looks like a 2:1 ratio of FP32 to INT32 units, just like gaming Ampere:
FOdzdTYXsAMTgqv.png
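For what it's worth, here's how I read the per-SM unit counts off the diagrams (treat these as my interpretation, not official numbers):

```python
# Per-SM execution unit counts as I read them from the SM diagrams (unofficial).
sm_units = {
    "GA102 (gaming Ampere)": {"FP32": 128, "INT32": 64, "FP64": 2},
    "GH100 (Hopper)":        {"FP32": 128, "INT32": 64, "FP64": 64},
}
for chip, units in sm_units.items():
    print(chip, units, "-> FP32:INT32 =", units["FP32"] // units["INT32"], ": 1")
```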

Edit: Some more tables from Nvidia:
NVIDIA-H100-GPU-2-876x1200.jpg


Edit 2: Here's a link to what is essentially Nvidia's whitepaper: https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/

[Attached: additional Hopper spec tables and figures from the whitepaper]
 
Last edited: