Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
292
136
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if this is imprudent/uncalled for, just interested in the forum members' thoughts.
 

DDH

Member
May 30, 2015
168
168
111
Doubt it.

AMD could really hit Nvidia where it hurts, and that's great. They'll take turns lowering prices, and I can get a sweet nv card for cheap. I'm going to say AMD manages to get within 85% of all of nv's upcoming lineup.
And therein lies the problem. You only care about AMD competing so you can get a cheaper NVIDIA card. If you won't buy an AMD card, then you helped enable the higher prices from NVIDIA. If AMD doesn't compete in the high end, then sucks to be you, giving NVIDIA shareholders a nice slice of your hard-earned money.

Sent from my SM-N975F using Tapatalk
 

DDH

Member
May 30, 2015
168
168
111
1. We know the clock speeds. They are almost unchanged from Turing (yet TDP is up 100W), so it can't be that.
2. AMD ended up using far too many transistors. In the end the added clock speeds were very disappointing; they hit power/heat limits way too soon vs Pascal (see next point). Probably it was due to AMD's shoestring budgets at the time.
3. Pascal managed to increase clocks way higher (even the 14nm Samsung versions) with far fewer transistors used to achieve it (vs Maxwell). They did a lot of work on the low-level layout side of things on both Maxwell and Pascal.
4. Based on the Xbox Series X specs, RDNA2 seems to be to AMD what Maxwell was to Nvidia: a huge perf/watt increase and higher clocks from minimal extra transistors.

TL;DR:
  • Wasting a bunch of transistors to get the clock speed up almost never works and is rarely a good idea. See Pentium 4, Bulldozer and Vega as examples. I don't believe NVIDIA is doing it.
  • Getting clock speeds up through better physical implementation seems to work way better (see Pascal, Renoir's Vega, possibly RDNA2).
I think point one is likely wrong. The clock speeds for Turing, like Pascal, boosted way higher than what NVIDIA reported. I wouldn't be surprised to see Ampere clock speeds remain closer to what NVIDIA has listed.

Sent from my SM-N975F using Tapatalk
 

Glo.

Diamond Member
Apr 25, 2015
5,803
4,777
136
I think point one is likely wrong. The clock speeds for Turing, like Pascal, boosted way higher than what NVIDIA reported. I wouldn't be surprised to see Ampere clock speeds remain closer to what NVIDIA has listed.

Sent from my SM-N975F using Tapatalk
If 350W or more is true for the GA102 chip, there is no way in hell it stays around the rated clock speeds.

They HAVE TO clock higher under lighter load conditions.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,523
3,038
136
Then why talk about margins? Can't ignore a large segment. AMD has had APUs/CPUs/GPUs in consoles for a long time. Remember? Where did that get AMD? Or do you mean the semi-custom Zen2 processor being used? Because I still fail to see how that'll translate well into the PC market. The gaming industry is slow to evolve. You'll be stuck waiting half a decade before a majority of developers begin using upwards of 8 cores or more regularly.
Please read once more my post which you quoted first, but I will repeat myself here anyway.
I was talking about a price war between AMD and Nvidia. AMD won't cut their GPU margins to gain market share if the available supply of 7nm wafers is limited and they can sell the limited number of GPUs at higher prices. Look at RDNA1 and its selling prices: why do you think they kept them so high? Because they could sell everything they made, and customers were willing to pay the asking price.
CPUs are AMD's largest market, not to mention the most successful, so they have the highest priority to be made, and most of the 7nm wafers will be used on them.
The reason I mentioned the SoC was because of the GPUs! AMD doesn't need to worry about game developers not optimizing for their hardware (GPU), even with a smaller share of the PC GPU market, if AMD has SoCs (CPU+GPU) in two major consoles, which are soon to be released.
 
Last edited:

kurosaki

Senior member
Feb 7, 2019
258
250
86
So, double the RT cores and the image-distorting tech DLSS. DLSS is not a nice implementation if you want games to look good, it's a way to make games look ass and crank up perf. If you want more fps, just lower the res instead and apply some regular AA.
 
  • Like
Reactions: Panino Manino

KompuKare

Golden Member
Jul 28, 2009
1,191
1,487
136
AMD could really hit Nvidia where it hurts, and that's great. They'll take turns lowering prices, and I can get a sweet nv card for cheap. I'm going to say AMD manages to get within 85% of all of nv's upcoming lineup.
No /sarcasm tag, so I presume you are serious.
Well, if that attitude is common then AMD might as well exit the GPU business.
If "I wish AMD could compete at the high end" is just code for "I want them to compete to bring the prices down so I can buy Nvidia cheaper" then that's not a viable business model.
 

psolord

Platinum Member
Sep 16, 2009
2,094
1,234
136
So, double the RT cores and the image-distorting tech DLSS. DLSS is not a nice implementation if you want games to look good, it's a way to make games look ass and crank up perf. If you want more fps, just lower the res instead and apply some regular AA.

Won't you lose the screen's 1:1 pixel mapping when running at a non-native resolution, making things even worse?

Digital Foundry has said some neat things regarding DLSS 2.0 though, but I don't have a DLSS-capable card to test myself. It seems OK in the video.
 

kurosaki

Senior member
Feb 7, 2019
258
250
86

I don't know, man. I'm sitting on a 1440p monitor myself and would neither lower the res manually nor start upscaling. It's not worth it.
I just meant with my former reply that DLSS and manually downgrading by lowering the resolution are equally bad. We are cheating ourselves into higher framerates, but it looks good on paper... "4K with RTX and DLSS 3.0" WOWZA! All I hear is a fancy phrase for "1440p upscaled with a crappy image and higher FPS", but we make it look like it's all great features and clearly superior to the competition.
 
Last edited:

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
Wow, so a ~2x performance increase with DLSS and RTX @ 4K.

Impressive for the settings, but considering it has more than 4x the Tensor cores, new RT cores and 50% more bandwidth (which RT really needs), it's the best-case scenario...

Should make it abundantly clear that the pure rasterization improvement can't really be much more than ~50%, at best.
 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
I don't know, man. I'm sitting on a 1440p monitor myself and would neither lower the res manually nor start upscaling. It's not worth it.
I just meant with my former reply that DLSS and manually downgrading by lowering the resolution are equally bad. We are cheating ourselves into higher framerates, but it looks good on paper...

That video shows the old DLSS implementation, which was quite bad (sometimes even worse than simple upscaling). This is a more accurate comparison:
and this:
 
  • Like
Reactions: psolord

kurosaki

Senior member
Feb 7, 2019
258
250
86
That video shows the old DLSS implementation, which was quite bad (sometimes even worse than simple upscaling). This is a more accurate comparison:
and this:
But it will never look as good; the tradeoff is going to be a janky ride, from almost-nice upscaling to quite bad. We are cheating ourselves into higher framerates, but it looks good on paper... "4K with RTX and DLSS 3.0" WOWZA! All I hear is a fancy phrase for "1440p upscaled with a crappy image and higher FPS", but we make it look like it's all great features and clearly superior 4K perf compared to the competition.
 

n0x1ous

Platinum Member
Sep 9, 2010
2,572
248
106
I think the 4K gains will be substantial (which is what I care about), but at lower resolutions the gains will be CPU-limited, like they already are on the 2080 Ti today.
 

Asterox

Golden Member
May 15, 2012
1,039
1,823
136
1. We know the clock speeds. They are almost unchanged from Turing (yet TDP is up 100W), so it can't be that.
2. AMD ended up using far too many transistors. In the end the added clock speeds were very disappointing; they hit power/heat limits way too soon vs Pascal (see next point). Probably it was due to AMD's shoestring budgets at the time.
3. Pascal managed to increase clocks way higher (even the 14nm Samsung versions) with far fewer transistors used to achieve it (vs Maxwell). They did a lot of work on the low-level layout side of things on both Maxwell and Pascal.
4. Based on the Xbox Series X specs, RDNA2 seems to be to AMD what Maxwell was to Nvidia: a huge perf/watt increase and higher clocks from minimal extra transistors.

TL;DR:
  • Wasting a bunch of transistors to get the clock speed up almost never works and is rarely a good idea. See Pentium 4, Bulldozer and Vega as examples. I don't believe NVIDIA is doing it.
  • Getting clock speeds up through better physical implementation seems to work way better (see Pascal, Renoir's Vega, possibly RDNA2).

Well, for the Xbox Series X, the 52 CU / 1800 MHz GPU can eat 130-140W.

If we compare that to the RX 5700 XT (40 CUs at 1750 MHz, which can eat up to 240W), then it is pretty clear what AMD has with RDNA2.
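
Rough napkin math with those figures, assuming roughly similar per-CU throughput per clock (a big simplification, and the 240W board power vs the 130-140W GPU-only estimate aren't measured the same way):
  • Series X GPU: 52 CU × 1800 MHz ≈ 93,600 CU·MHz at ~135W ≈ 690 CU·MHz per watt
  • RX 5700 XT: 40 CU × 1750 MHz = 70,000 CU·MHz at ~240W ≈ 290 CU·MHz per watt
That points to something like a 2x+ perf/watt gap on paper, so treat it as a rough upper bound rather than a measured RDNA1-to-RDNA2 improvement.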

 

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
But it will never look as good; the tradeoff is going to be a janky ride, from almost-nice upscaling to quite bad. We are cheating ourselves into higher framerates, but it looks good on paper... "4K with RTX and DLSS 3.0" WOWZA! All I hear is a fancy phrase for "1440p upscaled with a crappy image and higher FPS", but we make it look like it's all great features and clearly superior 4K perf compared to the competition.
Well, not quite. Did you actually look at the videos (especially the last one)? For instance, at times Death Stranding looks better with DLSS than at native resolution with the temporal AA that's built into the engine. Even in that game, though, it has artefacts with smoke and some buildings.

I agree that this tech is sometimes overhyped, but to call it a "crappy image" is definitely hyperbole.

Plenty of people would accept the image-quality loss if it allows you to play at High/Ultra quality in upscaled 1440p/1080p instead of Low at 4K, or allows you to turn raytracing effects on.

I'm personally really looking forward to some kind of image reconstruction on RDNA2 hardware as well (considering DLSS 1.9 ran on shader cores and still had a considerable performance uptick, there's bound to be something on consoles at some point).
 
  • Like
Reactions: psolord and ozzy702

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Wow, so a ~2x perfomance increase with DLSS and RTX @ 4K.

Impressive for the settings, but considering it has more than 4x the tensor cores,new RTX cores and 50% more bandwidth (that RT really needs) it's the best-case scenario ...

Should make it abundantly clear that pure rasterization improvement can't really be much more than ~50%, at best.

Actually, GA102 has fewer Tensor Cores (384) than TU102 (544), but due to the 4x higher throughput per Tensor Core in Ampere, GA102 ends up with 2.8x the Tensor Core throughput of TU102.
Ampere has only 4 Tensor Cores per SM, whereas Turing has 8 Tensor Cores per SM.
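
The napkin math behind that 2.8x figure, using those counts and assuming similar clocks:
  • 384 / 544 ≈ 0.71x the Tensor Core count
  • 0.71 × 4 (per-core throughput) ≈ 2.8x the total Tensor throughput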

edit: correction on Tensor cores per SM between Ampere and Turing.
 
Last edited:

Gideon

Golden Member
Nov 27, 2007
1,774
4,145
136
Actually, GA102 has fewer Tensor Cores (348) than TU102 (544), but due to the 4x higher throughput per Tensor Core in Ampere, GA102 ends up with 2.8x the Tensor Core throughput of TU102.
Ampere has only a single Tensor Core per SM, whereas Turing has 2 Tensor Cores per SM.
Thank you for the correction, makes sense.
 
  • Like
Reactions: AtenRa

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
A 2x increase in RT performance is great, but it needs to be backed up with a 60+% rasterization performance increase too. The TDP is up 40% from Turing, which was itself up 10% from Pascal.

A 40-50% performance improvement would mean two back-to-back generations of little to no perf/watt improvement, which would be incredibly terrible.
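
To put rough numbers on that, using the +40% TDP figure above: a 60% raster gain at 40% more power is 1.6 / 1.4 ≈ 1.14, so only about a 14% perf/watt improvement, while a 40-50% gain works out to roughly 1.0-1.07, i.e. essentially flat perf/watt for the second generation running.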
 

DiogoDX

Senior member
Oct 11, 2012
747
279
136
Well, seems to be a fake

- source: Chiphell forums
- card render seems to be much smaller than RTX 3090 card on photos leaked before
- "2080Ti" instead of "2080 Ti"
- "Minecraft RTX" instead of "Minecraft with RTX"
Nvidia calls it Minecraft RTX. Quake 2 too.