Question: 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
292
136
How much gain is the Samsung 7nm EUV process expected to provide?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of more than 4K60, at least 4K90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
I understood it. But this means that Samsung's 8nm process is worse than TSMC's 16nm process and nVidia can't produce anything on it.

No, what I think he's saying is that Samsung's 7nm process is worse than TSMC's 7nm process, so NV switched their high-performance parts to TSMC 7nm (N7, N7P, or N7+? Who knows?) and their low-performance parts to Samsung 8nm, and that they won't be using Samsung 7nm for anything.

At the same time they will release a ±600 mm^2 die on TSMC's 7nm process without problems...

N7 has been in volume production for almost two years now (since July 2018). If 600 mm^2 is within the reticle limit, I can see it being possible. The yields on that process should, by now, be phenomenal. Alternatively they may be using N7P, and I would still expect yields on that process to be quite good.
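To quantify that intuition, here is a minimal yield sketch using the classic Poisson defect model; the defect density here is an assumed, illustrative figure for a mature 7nm-class node, not a published TSMC number:

```python
# Die-yield sketch using the classic Poisson defect model: yield = exp(-A * D0).
# D0 is an assumed defect density for a mature node, not a published figure.
import math

die_area_cm2 = 6.0           # 600 mm^2 die
d0_defects_per_cm2 = 0.10    # assumed defect density

die_yield = math.exp(-die_area_cm2 * d0_defects_per_cm2)
print(f"Estimated yield for a 600 mm^2 die: {die_yield:.0%}")  # ~55%
```

Even under conservative assumptions like these, a big die on a mature node stays viable, especially once harvested salvage SKUs are factored in.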
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
But the difference would be so huge that the process is literally worse than TSMC's 16nm process. AMD said that they only got a 20% improvement from the 7nm process.
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
Overall performance is the IPC of the GPUs across different workloads. Fact is, Turing often does more with less. And it doesn't even use any of the advanced features it has yet: Variable Rate Shading, Mesh Shaders, Sampler Feedback, Tensor Cores, Ray Tracing. And when it does, it will literally blow RDNA1 out of the water; just wait and see how RDNA1 crumbles under the pressure of the new wave of console games.

So let me rephrase it. You took OVERALL performance benchmarks and used them as an argument in a discussion about the IPC of Turing vs RDNA1, to confirm your own views, but when faced with a real, apples-to-apples comparison of IPC between GPUs, at the same clock, same ALU count, and same bandwidth, you disregarded it as Failing Miserably, because it did not confirm your views on the topic.

Gotcha.


Sorry, all you did was prove yourself woefully wrong and embarrass yourself in the process.

You claimed NVIDIA won't have a share of 7nm from TSMC, WRONG.
You claimed NVIDIA's next gen gaming GPUs will be 8nm from Samsung, WRONG.
You claimed gaming Ampere hasn't taped out yet, WRONG.
Only GA100 is on TSMC's 7 nm process. The gaming architecture, which I never said is Ampere (I always called it next gen graphics from Nvidia), is 8 nm for 2020. 7 nm products from Nvidia at the high end are due in 2021. And I always said that the GA100 chip is a 7 nm TSMC product.

I claimed that the GAMING cards hadn't taped out, and it turns out they taped out only recently, and only the 8 nm products. I never claimed that Ampere, the GA100 HPC chip, had not taped out. Find me a post in which I talk about Ampere as a gaming architecture, rather than as Nvidia's next gen gaming.

Only you guys can blame yourselves for your lack of reading comprehension. And for your own BIASES.
 
Last edited:
  • Like
Reactions: Kuiva maa

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
I understood it. But this means that Samsung's 8nm process is worse than TSMC's 16nm process and nVidia can't produce anything on it. At the same time they will release a ±600 mm^2 die on TSMC's 7nm process without problems...

It means that Samsung's 8 nm, which is in fact a 10 nm-class process, is marginally better in terms of efficiency than 12 nm FFN, and equally woeful in terms of density compared to TSMC's 7 nm.

Again, you should work on your reading comprehension skills, mate. Or at least stop manipulating or twisting my words.
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
No, what I think he's saying is that Samsung's 7nm process is worse than TSMC's 7nm process, so NV switched their high-performance parts to TSMC 7nm (N7, N7P, or N7+? Who knows?) and their low-performance parts to Samsung 8nm, and that they won't be using Samsung 7nm for anything.



N7 has been in volume production for almost two years now (since July 2018). If 600 mm^2 is within the reticle limit, I can see it being possible. The yields on that process should, by now, be phenomenal. Alternatively they may be using N7P, and I would still expect yields on that process to be quite good.
Exactly. Samsung's processes are woeful in terms of performance compared to TSMC's.

Nvidia taped out the whole next gen gaming lineup on the 8 nm process, but they had to switch to TSMC's 7 nm because they know what AMD can do on that process; on the 8 nm process, the higher end would simply be uncompetitive against RDNA2. Nvidia can get away with 8 nm for the low end because, based on how things look now, AMD will use RDNA1 for the low end, and that takes into account both the Navi 14 and Navi 10 GPUs.
But the difference would be so huge that the process is literally worse than TSMC's 16nm process. AMD said that they only got a 20% improvement from the 7nm process.
It is only worse than TSMC's N7 process. If you believe that Nvidia would be able to make physical design optimizations on any process in the short time they had between Turing and next gen, you should discuss this with an engineer. It took Nvidia years, and billions of dollars flushed into the toilet named "16 nm TSMC", to optimize their physical design for that process. Why do you think Nvidia could simply take all of those optimizations, bring them to Samsung, or for that matter to TSMC's newer nodes, and use them straight away? It takes years.

It is even plainly put in one of the tweets that Mr. Chia translated on Twitter: Nvidia underestimated AMD and what they can achieve on TSMC's process with RDNA2. The effect? The PS5's RDNA2 GPU clocking up to 2.23 GHz for a 36 CU design.

Nvidia on the 8 nm process would not be able to get past the 2 GHz mark. On ANY GPU. That is all that is in play here.
 
Last edited:

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
We know enough now: a 52CU RDNA2 part (in the Xbox Series X) is equal to an RTX 2080 in Gears 5, which means that clock for clock, RDNA2 is about the same as RDNA1, or very slightly better (~5%). AMD only claimed a 50% performance-per-watt improvement, nothing more, and most of that is spent on reducing the power draw of RDNA2 compared to RDNA1. Granted, that means AMD can now build bigger chips: instead of a 40CU part consuming 225W, AMD can now build an 80CU part consuming about 250W.
As I have said before, and I will say again: you know nothing about RDNA2 GPUs.

But this is not the thread for this. Stop pushing the discussion in this direction with your more detailed "opinions".
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
Samsung's 14nm process is as efficient as TSMC's 16nm (Pascal). And 8nm should allow for ~60m xtors/mm^2: https://fuse.wikichip.org/news/1443/vlsi-2018-samsungs-8nm-8lpp-a-10nm-extension/
In theory, eh? The 14 and 16 nm processes from TSMC and GloFo should allow for a maximum of 40 mln xTors/mm^2. And yet the best that both Nvidia and AMD could achieve was 20-25 mln xTors/mm^2 on ALL of their 14/16 nm (and 12 nm FFN) designs.

Just like TSMC's 7 nm should allow for 80 mln xTors/mm^2, and does on Apple A chips, yet gives only 40 mln xTors/mm^2 on 7 nm Radeons, Zen 2 CPUs, and the next gen consoles?

One last time: 12 nm FFN from TSMC is a custom, Nvidia-only version developed by TSMC specifically for Nvidia, with specific physical design optimizations that are not available anywhere else. Samsung's process is marginally better than it because it lacks those optimizations. Nvidia and Sammy could have developed them if they had the time, which Nvidia does not have, based on the product timeline we know of.

That is all.
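As a sanity check on those achieved-density figures, here is a quick calculation from widely reported transistor counts and die sizes (all approximate):

```python
# Achieved density = transistor count / die area.
# Transistor counts and die sizes are widely reported figures; treat them
# all as approximate.
chips = {
    "GP102 (16nm TSMC)":    (11.8e9, 471.0),
    "TU102 (12nm FFN)":     (18.6e9, 754.0),
    "Vega 10 (14nm GF)":    (12.5e9, 495.0),
    "Navi 10 (7nm TSMC)":   (10.3e9, 251.0),
    "Apple A13 (7nm TSMC)": (8.5e9, 98.5),
}
for name, (transistors, area_mm2) in chips.items():
    density = transistors / area_mm2 / 1e6  # millions of transistors per mm^2
    print(f"{name}: {density:.1f} MTr/mm^2")
# ~25 for the 16/12/14nm GPUs, ~41 for Navi 10, ~86 for the mobile A13
```

The pattern in the post holds up: roughly 25 MTr/mm^2 on the 14/16/12 nm GPU designs, about 40 on 7 nm high-performance parts, and roughly double that on a mobile SoC.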
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
Xavier has 25.7M xtors/mm^2. Orin has 17 billion transistors. So how big would Orin have to be if Samsung can't even achieve 40M xtors/mm^2?
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
Xavier has 25.7M xtors/mm^2. Orin has 17 billion transistors. So how big would Orin have to be if Samsung can't even achieve 40M xtors/mm^2?
Xavier is a 350 mm^2 die, and a sub-30 W part.

As you know by now, mobile parts will always have a higher transistor density than high-performance parts. dGPUs are high-performance parts.

At 35 mln xTors/mm^2, Orin should be around 486 mm^2. At 30 mln xTors/mm^2, it is a 567 mm^2 die. Even at 40 mln xTors/mm^2, Orin would be a 425 mm^2 die.
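The arithmetic behind those projections, with the candidate densities taken from the post as assumptions:

```python
# Projected Orin die size = transistor count / assumed achieved density.
orin_transistors = 17e9
for density_mtr_per_mm2 in (30, 35, 40):  # assumed densities, MTr/mm^2
    area_mm2 = orin_transistors / (density_mtr_per_mm2 * 1e6)
    print(f"{density_mtr_per_mm2} MTr/mm^2 -> {area_mm2:.0f} mm^2")
# 30 -> 567 mm^2, 35 -> 486 mm^2, 40 -> 425 mm^2
```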
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
AMD said that they only got 20% improvement from the 7nm process.

I can provide some perspective here, since I owned a GF 14nm VegaFE and own a TSMC 7nm Radeon VII. The differences are night and day. I have some posts about it from last year in the AMD subforum if you want me to dig them up. Actually, I think I just found the relevant data:


To recap:

On my old system, running VegaFE locked @ 1585 MHz via undervolt and +50% power limit, the total system power draw in Superposition 1080p was ~510W. Using the same setup with Radeon VII running stock clocks and an undervolt to .991v, total system power draw was ~370W. Actual clocks would waver around 1770 MHz for Radeon VII with that configuration (same as if I used the stock voltage; AVFS is funny like that). Moving from 14nm to 7nm Vega, I was able to knock ~140W off my wall power draw while maintaining clocks approximately 200 MHz higher. That is much, much better than a 20% improvement in efficiency. Of course, Radeon VII has 4 fewer CUs than VegaFE so that helps a bit with power reduction as well. But still! It sips power compared to VegaFE. That 20% figure can't be right.
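To put a rough number on that, here is a minimal sketch of the implied GPU perf/W gain. The rest-of-system power is an assumed constant, and CU count times clock is a crude throughput proxy that ignores memory and IPC differences:

```python
# Rough GPU perf/W comparison from the wall-power figures above.
REST_OF_SYSTEM_W = 150  # assumption: non-GPU system power draw

vega_fe  = {"cus": 64, "mhz": 1585, "wall_w": 510}
radeon_7 = {"cus": 60, "mhz": 1770, "wall_w": 370}

def perf_per_watt(card):
    gpu_w = card["wall_w"] - REST_OF_SYSTEM_W
    return (card["cus"] * card["mhz"]) / gpu_w  # throughput proxy per watt

gain = perf_per_watt(radeon_7) / perf_per_watt(vega_fe) - 1
print(f"Estimated perf/W gain: {gain:.0%}")  # ~71% under these assumptions
```

Under those assumptions, the move from 14 nm Vega to 7 nm Vega looks closer to a 70% perf/W gain than 20%, which matches the wall-power observation above.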
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
DrMrLordX, thx for the information.
And for Navi, AMD has published the numbers:
[Image: AMD-Radeon-RX-5700-XT-Navi-15.jpg, AMD's performance-per-watt breakdown slide for Navi]


The 7nm process contributed just under 15% or so of the overall gains.
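One way to read that multiplicative split (the 1.5x overall figure is AMD's perf/W claim for RDNA vs. GCN; the ~1.15x process slice is this post's reading of the slide):

```python
# Multiplicative decomposition of AMD's claimed 1.5x perf/W gain for Navi.
total_gain   = 1.50  # AMD's overall RDNA vs. GCN perf/W claim
process_gain = 1.15  # "just under 15%" attributed to 7nm, per the slide
design_gain  = total_gain / process_gain
print(f"Implied design/architecture contribution: {design_gain:.2f}x")  # ~1.30x
```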


As you know by now, mobile parts will always have a higher transistor density than high-performance parts. dGPUs are high-performance parts.

Xavier is less than 10% more dense than Turing (TU116 and TU106).
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
DrMrLordX, thx for the information.
And for Navi, AMD has published the numbers:
[Image: AMD-Radeon-RX-5700-XT-Navi-15.jpg, AMD's performance-per-watt breakdown slide for Navi]

The 7nm process contributed just under 15% or so of the overall gains.

Xavier is less than 10% more dense than Turing (TU116 and TU106).
Which means that if the 8 nm Samsung process lands anywhere between 35-40 mln xTors/mm^2 for a high-density part, it does not paint a great picture for the high-performance parts Nvidia laid out on this process, and redesigning them for TSMC's 7 nm is a good choice.
 

Konan

Senior member
Jul 28, 2017
360
291
106
Nvidia taped out the whole next gen gaming lineup on the 8 nm process, but they had to switch to TSMC's 7 nm

Be honest now, you just made that up, right, mate? Just like you're saying that the GA Nvidia card is HPC-only on TSMC 7nm.
Where is the proof of this?

According to reports in The China Times and DigiTimes, Nvidia has been one of TSMC’s main customers of the 7nm process node, which seems to confirm that Ampere GPUs, including those used in consumer graphics cards such as the RTX 3080, will be based on this manufacturing process.

DigiTimes writes that in addition to reserving production slots for TSMC’s 5nm production capacity in 2021, Nvidia will be handing some of the work over to Samsung. Specifically, the Korean giant will produce lower-end Ampere graphics cards, which could use its 7nm EUV or 8nm process nodes.

The key takeaway, however, seems to be that Nvidia Ampere is going to be both a gaming and professional architecture, marking a change from the Volta/Turing days or AMD's own bifurcation of its gaming and pro GPU tech. And that performance-wise it's going to be more than just a die-shrink of the RTX 20-series' Turing.

There have never been any reports or articles from any tech press, including the "TMZs" of the world, saying that Nvidia taped everything out with Samsung and then had to switch. That is BS and you know it.
In fact, I linked a reliable source earlier who said Ampere had already taped out over a year ago, and the suggestion was that it was on 7nm (N7+).

I keep asking where you get your info from, or whether you are assuming, because you keep stating things as fact, and it is best to clarify which it is.
 
  • Like
Reactions: xpea and DXDiag

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
Be honest now, you just made that up, right mate?. Just like you're saying that the GA Nvidia card is only HPC on TSMC 7nm.
Where is the proof of this?
If I answered all of those questions, I would be writing about information that I cannot share, so let's just wait and see how things turn out, shall we? ;)
 

DXDiag

Member
Nov 12, 2017
165
121
116
You took OVERALL performance benchmarks and used them as an argument in a discussion about the IPC of Turing vs RDNA1,
Yes, overall beats any single workload.

7 nm products from Nvidia at the high end are due in 2021. And I always said that the GA100 chip is a 7 nm TSMC product.
No source has ever said that; all point to the contrary.

As I have said before, and I will say again: you know nothing about RDNA2 GPUs.
And you've shown time and time again that you know nothing about next gen NVIDIA GPUs. So please stop dwelling on the matter till we have solid info from NVIDIA themselves.

Also, AMD and the next gen consoles have given us plenty to work with regarding RDNA2, thank you very much. Maybe you chose to remain ignorant about the matter, but that's your prerogative.
 

beginner99

Diamond Member
Jun 2, 2009
5,208
1,580
136
I understood it. But this means that Samsung's 8nm process is worse than TSMC's 16nm process and nVidia can't produce anything on it.

Why? You pulled that conclusion out of thin air.

It doesn't mean that at all. It just means it's worse than TSMC 7nm. However, we don't know whether the issue is yields or clocks.
 

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
All I know is my body is beyond ready for Ampere; I need it NOW. Please come out soon and at a reasonable price so I can double down on a pair. I'd rather not buy the 2060 I have planned for when my stimulus arrives, or next check, whichever comes first. Like, please shut up and take my money, Nvidia! Don't rob us either, if at all possible, that would be great, thanks. :) I will pay for the performance, but not over $1,200 unless it's like some 1080 Ti next-level stuff.

In the market for a 2060, as my main build KRONOS needs a primary driver/gamer/secondary folder card; my 1080 Ti only does crunching. The 1080 Ti is so overkill for 1080p for what I actively play that I prefer to see it running Folding@Home; I get the most use out of it that way, and it's part of my customs and beliefs never to waste computer resources when it can be helped. If May 14th pans out and we get a summer release like I hope, then yeah, the 1080 Ti will become the primary. Can double down later.
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
I think nVidia could definitely release a GA100 Titan soon. It would be mega expensive if they do.

You could definitely question how many people would buy an insanely expensive GPU right now, but they could do it.
 
  • Like
Reactions: mopardude87

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
I think nVidia could definitely release a GA100 Titan soon. It would be mega expensive if they do.

You could definitely question how many people would buy an insanely expensive GPU right now, but they could do it.
It depends on when AMD will release their high end GPUs.

"Titan A" will however be released, based on GA100 chip.
All I know is my body is beyond ready for Ampere; I need it NOW. Please come out soon and at a reasonable price so I can double down on a pair. I'd rather not buy the 2060 I have planned for when my stimulus arrives, or next check, whichever comes first. Like, please shut up and take my money, Nvidia! Don't rob us either, if at all possible, that would be great, thanks. :) I will pay for the performance, but not over $1,200 unless it's like some 1080 Ti next-level stuff.

In the market for a 2060, as my main build KRONOS needs a primary driver/gamer/secondary folder card; my 1080 Ti only does crunching. The 1080 Ti is so overkill for 1080p for what I actively play that I prefer to see it running Folding@Home; I get the most use out of it that way, and it's part of my customs and beliefs never to waste computer resources when it can be helped. If May 14th pans out and we get a summer release like I hope, then yeah, the 1080 Ti will become the primary. Can double down later.
There is a good chance that next gen Nvidia GPUs might be lower in price compared to this gen.

However, I fully expect that the GTX 1660 Ti replacement, the RTX 3060, will start from $299...
 

Glo.

Diamond Member
Apr 25, 2015
5,661
4,419
136
Yes, overall beats any singular one workload.
So now you have moved the goalposts of a discussion that you yourself started, because it did not confirm your own personal beliefs about the topic. You claimed that RDNA1 has lower IPC than Turing; when faced with facts, an apples-to-apples comparison, you moved the goalpost to overall performance.

Gotcha x2.

Kek.
 

DXDiag

Member
Nov 12, 2017
165
121
116
So now you have moved the goalposts of a discussion that you yourself started,
There is no goalpost to be moved, period. The discussion started with the fact that Turing delivers more performance than RDNA with the same default specs, despite being a node behind.

This is wrong. IPC (Instructions per clock) is how many instructions can be processed during one clock cycle. Overall performance is entirely different from IPC.
That's true for CPUs, where you have to isolate single-threaded workloads and measure performance from there, but that's not the case with GPUs. In GPUs, overall performance tracks IPC directly, because GPUs don't separate single-threaded from multi-threaded work; GPUs are always multi-threaded.
 
  • Like
Reactions: xpea