Discussion Ada/'Lovelace'? Next gen Nvidia gaming architecture speculation

Page 82 of the AnandTech forums discussion.

insertcarehere

Senior member
Jan 17, 2013
Now this is pretty interesting: Nvidia seems to have changed how they package their mobile GPUs on the PCB between Ampere and Ada.

[Image: Ampere mobile package sizes]
In Ampere the PCB1 "large" package (for large notebooks) covered everything from the 3080 Ti down to the 3050 Ti, while the PCB2 "small" package covered the 3050 Ti downwards (the 3050 Ti was offered in both package sizes).
[Image: Ada mobile package sizes]
In Ada the PCB1 "large" package is limited to RTX 4090/4080 laptops, while the PCB2 "small" package covers everything from RTX 4070 laptops downwards.

This means all the thin-and-light notebooks that were limited to a 3050 Ti on Ampere could technically spec up to an RTX 4070 this generation.
 

leoneazzurro

Senior member
Jul 26, 2016
That is quite probably a consequence of the shrinking die sizes and memory buses on Ada: the 3070 mobile was based on GA104 (392 mm²) with a 256-bit bus, the 4070 mobile on AD106 (190 mm²) with a 128-bit bus. The 3060 also had a bigger die than AD106: GA106 was 276 mm² with up to a 192-bit bus.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
First RTX 4060 Laptop tested (Videocardz)
[Image: RTX 4060 laptop GPU comparison chart (Videocardz)]

I think performance is pretty good, scoring ~12% higher than the fastest RTX 3060 laptop in the notebookcheck database.
Not to my liking is the mere 8GB of VRAM, but that was to be expected.
What is terrible are the laptop prices, unless you are OK with a low-to-mid Alder Lake; those weren't as expensive.
I hope prices will be lower with an AMD CPU, but I am pretty sceptical.
 

MrTeal

Diamond Member
Dec 7, 2003

It appears the 4060 desktop has the same specs as the laptop part rather than using a further cut-down AD106. At those specs it might have difficulty beating the 3060 Ti.
Depending on clocks it should have 15-16 TF of compute, which is right around the 3060 Ti. That's similar to the 3090 Ti and 4070 Ti, so that seems reasonable. The proposed 4060 has more bandwidth relative to the 3060 Ti than the 4070 Ti does to the 3090 Ti, though, so maybe it won't suffer as much at higher resolutions relative to last gen.
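The 15-16 TF figure can be sanity-checked with the usual FP32 throughput formula (2 FLOPs per CUDA core per clock). A minimal sketch; the 3072-core count and the 2.45-2.6 GHz clock range are illustrative assumptions, not confirmed specs:

```python
# Rough FP32 throughput: 2 FLOPs (one FMA) per CUDA core per clock.
def fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    return 2 * cuda_cores * boost_ghz / 1000  # cores * GFLOP/s per core -> TF

# Assumed 4060 config: 3072 CUDA cores at plausible boost clocks.
for clock in (2.45, 2.6):
    print(f"{clock} GHz -> {fp32_tflops(3072, clock):.1f} TF")
# 2.45 GHz gives ~15.1 TF and 2.6 GHz ~16.0 TF, bracketing the 15-16 TF estimate.
```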
 

Mopetar

Diamond Member
Jan 31, 2011
The VRAM is going to make it look worse than a 3060 in some titles where 8 GB isn't enough. The benchmarks for the Harry Potter game where RT was turned on and the 3060 was beating the 3080 were kind of funny.
 

insertcarehere

Senior member
Jan 17, 2013
The VRAM is going to make it look worse than a 3060 in some titles where 8 GB isn't enough. The benchmarks for the Harry Potter game where RT was turned on and the 3060 was beating the 3080 were kind of funny.
That benchmark was from HWUB, but the 3080 seems to do fine here as per Computerbase. [Image: Computerbase Hogwarts Legacy benchmark]

I think Hogwarts Legacy just has performance issues that prevent it from being a consistently benchmarkable game in its current state.
 

jpiniero

Lifer
Oct 1, 2010
That benchmark was from HWUB, but the 3080 seems to do fine here as per Computerbase.

I think Hogwarts Legacy just has performance issues that prevent it from being a consistently benchmarkable game in its current state.

I haven't watched the video but I think the HUB benchmarks are from further in the game.
 

jpiniero

Lifer
Oct 1, 2010
Nvidia just announced that Jensen is doing a GTC keynote on March 21. That could be when the 4070 and 4060 Ti are officially announced.
 

Mopetar

Diamond Member
Jan 31, 2011
Let's just be glad we're getting 2GB of VRAM, and I'm glad he hasn't decided to start selling cards with no additional VRAM (to offset increasing costs) and then sell it as an optional upgrade that can be purchased separately.


Everything old is new again. Just make sure to get in before the shader extension packs make an appearance.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
Notebookcheck.net
| Laptop | GPU | TDP | TimeSpy graphics | Cyberpunk 2077 QHD | F1 22 QHD | The Witcher 3 v4.0 QHD |
| --- | --- | --- | --- | --- | --- | --- |
| Schenker XMG Neo 17 (Engineering Sample) | RTX 4070 | 115W + 25W | 12529 (147%) | 57 (178%) | 52 (158%) | 54 (146%) |
| Razer Blade 18 | RTX 4070 | 115W + 25W | 11683 (137%) | 54 (169%) | - | - |
| Gigabyte Aero 16 | RTX 4070 | 80W + 25W | - | 53 (166%) | 44 (133%) | - |
| MSI Katana 17 | RTX 4060 | 80W + 25W | 10299 (121%) | 46 (144%) | 41 (124%) | 41 (111%) |
| Schenker XMG Focus 15 (Engineering Sample) | RTX 4050 | 115W + 25W | 8536 (100%) | 32 (100%) | 33 (100%) | 37 (100%) |
Many more games are tested in that link; I just included a few.
In my opinion, the RTX 4060 looks best of the bunch, even if it has only 75% of the TDP. Regrettably, they didn't have the highest-TDP version; I think it would mean at least +10% performance.
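As a quick check, the percentages in the table are just each TimeSpy score divided by the RTX 4050 baseline:

```python
# Percentages in the notebookcheck table = score relative to the RTX 4050.
baseline = 8536  # RTX 4050, TimeSpy graphics
scores = {"XMG Neo 17 (RTX 4070)": 12529,
          "Razer Blade 18 (RTX 4070)": 11683,
          "MSI Katana 17 (RTX 4060)": 10299}
for name, score in scores.items():
    print(f"{name}: {round(100 * score / baseline)}%")
# -> 147%, 137%, 121%, matching the table.
```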
 

leoneazzurro

Senior member
Jul 26, 2016
Notebookcheck.net
| Laptop | GPU | TDP | TimeSpy graphics | Cyberpunk 2077 QHD | F1 22 QHD | The Witcher 3 v4.0 QHD |
| --- | --- | --- | --- | --- | --- | --- |
| Schenker XMG Neo 17 (Engineering Sample) | RTX 4070 | 115W + 25W | 12529 (147%) | 57 (178%) | 52 (158%) | 54 (146%) |
| Razer Blade 18 | RTX 4070 | 115W + 25W | 11683 (137%) | 54 (169%) | - | - |
| Gigabyte Aero 16 | RTX 4070 | 80W + 25W | - | 53 (166%) | 44 (133%) | - |
| MSI Katana 17 | RTX 4060 | 80W + 25W | 10299 (121%) | 46 (144%) | 41 (124%) | 41 (111%) |
| Schenker XMG Focus 15 (Engineering Sample) | RTX 4050 | 115W + 25W | 8536 (100%) | 32 (100%) | 33 (100%) | 37 (100%) |
Many more games are tested in that link; I just included a few.
In my opinion, the RTX 4060 looks best of the bunch, even if it has only 75% of the TDP. Regrettably, they didn't have the highest-TDP version; I think it would mean at least +10% performance.

That's unlikely, as the 4060 already starts to go asymptotically flat at 105W.


Here are all the performance curves in terms of Timespy for each chip and TDP, compared to the last generation, from a reliable source.

Edit: One thing to note is that XMG/Clevo laptops with the 4070/4080/4090 can optionally use a liquid cooling system. Jarrod received it with their 4070 model but did not use it, to compare apples with apples. It is possible that using that system substantially increases the GPU's performance. I say this because there is quite a difference between the Razer Blade and the XMG systems in the notebookcheck results, even at the same TDP settings. So either:

1) 4070 GPU chips vary greatly in performance from one individual chip to another
2) XMG's cooling is much better than the Razer Blade's
3) XMG is using the LCS

Looking at the Skyjuice table, however, it seems that 2) is the likely answer.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
That's unlikely, as the 4060 already starts to go asymptotically flat at 105W.


Here are all the performance curves in terms of Timespy for each chip and TDP, compared to the last generation, from a reliable source.
The question is whether this is also true for demanding games or only TimeSpy.
If yes, then I don't understand why they go up to 140W (115W + 25W), 33-47% higher.
[Image: TimeSpy score vs. TGP curves per chip]

The 4050 stops scaling at 95W; the 4060 and 4070 stop scaling at 105W.
This means they hit their highest clock speed at that TGP, and more does nothing except increase power consumption. Nvidia should have increased the frequency or lowered the TGP to improve perf/W.
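A rough illustration of the perf/W argument, using the RTX 4060's TimeSpy score from the earlier table and assuming, per the scaling curves, that the score stays flat past 105W:

```python
# If the score is flat past the saturation TGP, extra watts only hurt perf/W.
# Stand-in flat score: the MSI Katana 17's TimeSpy graphics result (assumption).
score = 10299
for tgp in (105, 140):  # saturation point vs. the maximum configurable TGP
    print(f"{tgp} W: {score / tgp:.1f} pts/W")
# Going from 105 W to 140 W costs 25% in perf/W for no extra performance.
```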

edit:
Those two chips are too close in size and performance to each other, but AD106 has an 8% lower max boost.
AD106: 186mm², 36 SM
AD107: 156mm², 24 SM
In my opinion, AD106 should have been made larger: 40 SM + 192-bit 12GB. I don't think it would be more than 210mm².
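A back-of-envelope check on that 210mm² guess, using the AD106/AD107 numbers above; the marginal per-SM area and the 64-bit PHY cost are rough assumptions, not measurements:

```python
# How big would a 40 SM, 192-bit AD106 be, roughly?
# Marginal area per SM inferred from the AD106/AD107 delta; this attributes
# the whole delta to SMs, which overstates the per-SM cost a bit.
area_per_sm = (186 - 156) / (36 - 24)   # = 2.5 mm^2 per SM
extra_sms = 4 * area_per_sm             # growing from 36 to 40 SM
extra_phy = 12                          # extra 64-bit GDDR6 PHY, rough guess
print(f"~{186 + extra_sms + extra_phy:.0f} mm^2")  # lands under the 210 mm^2 guess
```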
 

leoneazzurro

Senior member
Jul 26, 2016
Timespy seems to be a good index of performance scaling for the same or similar architectures. From it, you could predict the performance of the 4060 compared to the 4070/4080/4090, and also why Jarrod's and HWUB's tests saw practically no difference between the 3070 Ti and the 4070 (which is confirmed by other sources, not only Skyjuice and co.). Increasing frequency means adding more power consumption. As for why they push the 4070 and 4060 to a higher power than needed to max out their performance, I cannot say; it is a commercial choice, and in any case performance/W is better than the previous gen, though not at all TGP points. Possibly with better cooling they could squeeze out a little more performance at max TGP (the LCS example).
In any case, I have always found notebookcheck tests not really reliable; they don't always specify the exact game settings or discuss the dependence on the CPU used, while Jarrod and HWUB are far more complete and detailed.
 

MrTeal

Diamond Member
Dec 7, 2003
In the curves by Geekerwan, are those 30-series comparisons mobile parts (i.e. 3080 = GA104) or desktop (GA102)? I'm guessing RTX 3080 mobile based on the data source and the tight grouping with the 3070 line, but that's pretty terrible if the RTX 4070 is only ~10% faster than the RTX 3070, even if it is at lower power.
 

leoneazzurro

Senior member
Jul 26, 2016
These are the mobile RTX 3000 parts. The reason the 4070 posts these results is that even though Ada is far more efficient and more powerful per SM than Ampere, the dies chosen for the mobile parts are far more limited in raw resources. The 4070 as it stands has 4608 CUDA cores while the 3070 Ti has 5888, and the bus width is halved (128-bit vs 256-bit), so even though the boost clock is lower on the 3070 Ti, the efficiency gap is filled by the difference in raw resources. TSMC's 4N process is probably much more expensive than the Samsung one used for Ampere, so die costs are probably not much different. These Ada chips do quite well for their size, and they can be run at much higher power efficiency, but in absolute performance they are being compared against much bigger dies that play the "slower but wider" card and have more bandwidth available.
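The raw-resource gap described above, in numbers; core counts and bus widths are taken from the post, everything else is left out:

```python
# Raw-resource gap: mobile 4070 (AD106) vs. mobile 3070 Ti (GA104).
cores_3070ti, cores_4070 = 5888, 4608
bus_3070ti, bus_4070 = 256, 128
print(f"CUDA cores: {cores_3070ti / cores_4070:.2f}x")  # 1.28x more cores
print(f"Bus width:  {bus_3070ti / bus_4070:.0f}x")      # 2x the bus width
# Ada's per-SM and clock gains roughly cancel a 28% core deficit and,
# with the large L2 cache, a halved memory bus.
```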
 

TESKATLIPOKA

Platinum Member
May 1, 2020
In the curves by Geekerwan, are those 30-series comparisons mobile parts (i.e. 3080 = GA104) or desktop (GA102)? I'm guessing RTX 3080 mobile based on the data source and the tight grouping with the 3070 line, but that's pretty terrible if the RTX 4070 is only ~10% faster than the RTX 3070, even if it is at lower power.
Actually, the RTX 4070 laptop is doing pretty well against the previous gen. It performs a bit better than an RTX 3080 laptop, which is a full 48 SM GA104.
It could do even better if it weren't limited to just 2175MHz, vs 2370MHz for AD107.
From a performance perspective I think the new generation is doing pretty well; what I don't like is the amount of VRAM. AD106 should have been made with 192-bit 12GB memory; then it would be a decent midrange GPU for QHD.
 

MrTeal

Diamond Member
Dec 7, 2003
Actually, the RTX 4070 laptop is doing pretty well against the previous gen. It performs a bit better than an RTX 3080 laptop, which is a full 48 SM GA104.
It could do even better if it weren't limited to just 2175MHz, vs 2370MHz for AD107.
From a performance perspective I think the new generation is doing pretty well; what I don't like is the amount of VRAM. AD106 should have been made with 192-bit 12GB memory; then it would be a decent midrange GPU for QHD.
Yeah for the dies themselves AD106 is doing decently well compared to GA104, providing similar performance at ~75-80% of the power. In terms of product positioning though, in the upper mainstream where you see x70 laptops, there's really not much improvement in performance between a 3070 laptop and a 4070 laptop.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
Yeah for the dies themselves AD106 is doing decently well compared to GA104, providing similar performance at ~75-80% of the power. In terms of product positioning though, in the upper mainstream where you see x70 laptops, there's really not much improvement in performance between a 3070 laptop and a 4070 laptop.
Honestly speaking, I also don't understand the point of releasing AD106 with those specs.
It's 20% bigger, with 20% higher performance than AD107, but the same VRAM.
The AD106 laptop part theoretically has 1.38x the TFLOPs, yet performance is <=20% better.
AD106 should have come with a wider bus and 12GB of VRAM; then it would be a good GPU.
Even a few more SMs wouldn't hurt.
I think a 40 SM + 192-bit GDDR6 AD106 would be <215mm².
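The 1.38x figure follows from the SM counts and the boost clocks quoted earlier (2175 vs 2370 MHz):

```python
# Theoretical FP32 advantage of AD106 over AD107 in these laptop parts:
# 50% more SMs (36 vs 24), but an 8% lower max boost (2175 vs 2370 MHz).
ratio = (36 / 24) * (2175 / 2370)
print(f"{ratio:.2f}x")  # ~1.38x, matching the figure above
```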

Don't get me started on laptop prices; just looking at them makes me sad.
Honestly speaking, buying a new laptop with only 8GB of VRAM is not a good option, but paying 2499€ for the cheapest RTX 4080 12GB is seriously a lot.
N32 doesn't look like it will be any cheaper, based on what N22-based laptops are currently going for.
 

leoneazzurro

Senior member
Jul 26, 2016
Honestly speaking, I also don't understand the point of releasing AD106 with those specs.
It's 20% bigger, with 20% higher performance than AD107, but the same VRAM.
The AD106 laptop part theoretically has 1.38x the TFLOPs, yet performance is <=20% better.
AD106 should have come with a wider bus and 12GB of VRAM; then it would be a good GPU.
Even a few more SMs wouldn't hurt.
I think a 40 SM + 192-bit GDDR6 AD106 would be <215mm².

In general there may be several reasons:

Number one is cost: 5N and 4N are very expensive processes, so the more mainstream products must be small to be profitable in that market range.
Another reason is that these dies are mobile-first and also destined to cover the market for thinner and lighter notebooks, which cannot accommodate large memory buses/bigger dies (because of space/packaging limits); moreover, a bigger die with a 128-bit bus would have been bottlenecked by it.
But in general, yes, there is space for a GPU with intermediate specs between the 4070 mobile and 4080 mobile, possibly a "4070 Ti" with a 192-bit bus, which will probably come out next year with the usual refresh of the line (a 4060 Ti, 4080 Ti, and 4090 Ti are quite possible as well).
 

TESKATLIPOKA

Platinum Member
May 1, 2020
In general there may be several reasons:

Number one is cost: 5N and 4N are very expensive processes, so the more mainstream products must be small to be profitable in that market range.
Another reason is that these dies are mobile-first and also destined to cover the market for thinner and lighter notebooks, which cannot accommodate large memory buses/bigger dies (because of space/packaging limits); moreover, a bigger die with a 128-bit bus would have been bottlenecked by it.
1. That beefed-up AD106 would cost more (an extra 25-30mm² and 4GB of VRAM), true, but performance would also be better. Nvidia could ask more for it, which would naturally result in a higher laptop price; that's bad, but still cheaper than the RTX 4080 12GB and more future-proof than only 8GB of VRAM.

2. As you said, space is a problem. If they used HBM it would be solved, but they keep using GDDR6. With HBM they wouldn't even need a big L2 cache, and the memory PHY would also be smaller. The amount of VRAM would also no longer be a problem.

3. The current AD106 is a "big die" paired with only a 128-bit bus. It is already bottlenecked.
 

Heartbreaker

Diamond Member
Apr 3, 2006
Those two chips are too close in size and performance to each other, but AD106 has an 8% lower max boost.
AD106: 186mm², 36 SM
AD107: 156mm², 24 SM
In my opinion, AD106 should have been made larger: 40 SM + 192-bit 12GB. I don't think it would be more than 210mm².

36 is 50% more than 24 SMs. That's a fairly substantial step.
 