Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
293
136
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in forum members' thoughts.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
I suspect approximately 30% better raster performance as Nvidia have been sticking to that sort of figure for a while. That means slightly faster clocks combined with a few more cores, but not too many as they'll want to shrink the chips quite a bit to lower costs (those huge dies would be very hard to do with a new 7nm process).

As ray tracing is the big thing and probably easy to improve I guess they'll at least double RT performance.

I can't see memory going up by too much as it's expensive, so say about 1/3 more (6GB->8GB, or 8GB->11GB), and it'll just be GDDR6 with faster clocks.

The one thing I'm not sure about is the tensor stuff, as that hasn't really worked. They add additional complexity, and require a whole lot of dev work and Nvidia supercomputers. They aren't in the consoles, which will just denoise/sharpen with normal shaders. It would be simple to just drop them, but if they don't they'll need more die space and/or a significantly improved architecture to make them work.
 

Midwayman

Diamond Member
Jan 28, 2000
5,723
325
126
After he said that “you’ll be impressed...” did he get really quiet and mumble under his breath “... at how much more we’re going to charge for our new cards.”

:p

They want you to feel a sense of pride and accomplishment when you can afford their cards.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
The tensor core stuff is probably staying in so they can sell the cards for neural-net based stuff too. DLSS was always a bit more of an attempt to leverage that than a justification.
 
  • Like
Reactions: soresu

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
They aren't in the consoles which will just denoise/sharpen with normal shaders
"Normal" is a bit of a misnomer when everyone is changing their uArch's and ISA's to better handle/accelerate ML workloads.

Vega 20 had significant changes to favor ML, and I would imagine that Navi and the console derivatives have similar improvements - so the "normal" shaders may not be strictly ML focused, but they should likely offer decent performance, especially now that Khronos have formed a working subgroup for ML compute in Vulkan, which is the perfect crossover for denoising graphics.
 

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
The tensor core stuff is probably staying in so they can sell the cards for neural-net based stuff too. DLSS was always a bit more of an attempt to leverage that than a justification.
Yes, the tensor cores are a selling point for the Tesla professional cards much more than any current use on the consumer-oriented GeForce line.

Considering the large income that their professional cards bring in, they won't dump tensor cores anytime soon, unless they invent some new ultimate ISA/uArch that can do ML and regular GPGPU compute at the same time as efficiently as current tensor cores can.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,814
1,294
136
The future is clearly Volta and Turing rebrands! Till Nvidia comes out with the ground-up new design that is incompatible to all previous architectures. With their new VLIW(Denver/Carmel HW re-compiling w/ SIMD -> Vector bundles) with Vector instructions GPGPU, hence the days of SIMDs are over. Nvidia will kill RDNA and Sounds with ease.
 
Last edited:

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
The future is clearly Volta and Turing rebrands! Till Nvidia comes out with the ground-up new design that is incompatible to all previous architectures. With their new VLIW(Denver/Carmel HW re-compiling w/ SIMD -> Vector bundles) with Vector instructions GPGPU, hence the days of SIMDs are over. Nvidia will kill RDNA and Sounds with ease.
There are days when I genuinely can't tell if it's acerbic sarcasm or not.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
I would be very surprised if nVidia comes out with new cards so soon. It took us how many years to go from 10x0 to 20x0? I don't see that dramatically shortening.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Easy to check on the main site - 970/980 @ 18/9/2014;
1080 @ 27/05/2016, 1070 06/10/2016.
2080 @ 20/9/2018.

So something in xx 2020 has to be expected. It won't be too early in the year, with the Super ones launched a month or two back of course.
(Well the big AI one might be. Volta was 5/10/2017.).
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
I would be very surprised if nVidia comes out with new cards so soon. It took us how many years to go from 10x0 to 20x0? I don't see that dramatically shortening.
Pascal and Turing were built, essentially, on the same process node. The die sizes on the current GPUs are very large, so the economics are probably favorable for releasing new GPUs with smaller die sizes than Turing (though AIB prices will likely stay high due to design and implementation costs).
 

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
Pascal and Turing were built, essentially, on the same process node. The die sizes on the current GPUs are very large, so the economics are probably favorable for releasing new GPUs with smaller die sizes than Turing (though AIB prices will likely stay high due to design and implementation costs).
Same size as Turing on 7nm would be bad even 12 months after 7nm mass market use.

Until Big Navi arrives and is known to be bigger, the largest 7nm chip we know of is Vega 20 at 335 mm2, which can't have been cheap that early on.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
I assume they stay with 2 SMs/TPC and 64 CUDA cores/SM but will increase RT cores from 1 to 2 per SM.
Currently TU102 consists of 36 TPCs - I assume they will add 12 TPCs (+33%) for AM102 - which would still significantly reduce die size compared to Turing.
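The TPC arithmetic above can be sanity-checked with a quick sketch (the per-TPC/per-SM figures are Turing's known layout; the +12 TPC guess is this post's speculation, not a confirmed spec):

```python
# Sanity check of the TPC/SM core-count speculation above.
# Assumes Turing's known layout: 2 SMs per TPC, 64 CUDA cores per SM.
SM_PER_TPC = 2
CORES_PER_SM = 64

def cuda_cores(tpcs: int) -> int:
    """Total CUDA cores for a given TPC count."""
    return tpcs * SM_PER_TPC * CORES_PER_SM

tu102 = cuda_cores(36)       # current full TU102
am102 = cuda_cores(36 + 12)  # speculated +33% TPCs
print(tu102, am102)          # 4608 6144
```

So the speculated part would land at 6144 CUDA cores versus TU102's 4608.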
 

Caveman

Platinum Member
Nov 18, 1999
2,539
34
91
Ok techno folks... in broad terms, who will have the fastest card available for consumer purchase Dec 2019, May 2020 and Nov 2020?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Ok techno folks... in broad terms, who will have the fastest card available for consumer purchase Dec 2019, May 2020 and Nov 2020?
Probably Nvidia - I don't think they will want to lose their 'halo' status, but their top card will cost an arm and a leg. The 'middle' range is where the fight will be - at least there should be a lot of options for a change.
 

shortylickens

No Lifer
Jul 15, 2003
80,287
17,082
136
When I worked at Hynix in 2008 they were talking about EUV lithography but never got around to it.
We shut down before it could happen. Anyway, as we all know smaller components mean lower current, so you can shove more in a chip before you overheat.
I am betting increased clock speeds and pathways but nothing major in terms of performance. I think at this point it's mostly overhauls of the architecture that lead to real advancements.
Am probably still gonna get me something like a 2080 if there's a Christmas sale. I needs more frames!
 

Bouowmx

Golden Member
Nov 13, 2016
1,150
553
146
Summary of an ol' Twitter user's now-deleted writings. Kopite7Kimi joined Twitter in 2010, and this year has reported on NVIDIA Turing SUPER very early. 3DCenter (German), English.

Concerning HPC, GA100 has a 6144-bit (6×1024) HBM interface, and GA101 has half as much (3072-bit). HPC will use TSMC 7 nm+.

GA100 allegedly has 8192 cores (8 GPC, 8 TPC per GPC, 2 SM per TPC, 64 cores per SM).

Concerning GeForce, it will use Samsung 7 nm.

More stuff in the links.


My thoughts:
8192 won't fit in the 1-2-3 pattern as used in Kepler*, Maxwell, Pascal, and Turing: -06-level has 1x (the baseline) cores, -04-level has 2x, and -00/02-level has 3x.
I'm going to guess 1-2-4?

- GeForce RTX 3060 (A106): 2048 cores
- 3070 (A104): 3072
- 3080 (A104): 4096
- Mind the gap
- 3080 Ti (A102): 7936
- TITAN (A102): 8192
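The speculated 1-2-4 tier list above can be written out as a small sketch. All figures here are rumor-derived guesses from this post, not confirmed specs; the harvested-die deltas (3072 and 7936) are the cut-down variants of the full A104/A102 dies:

```python
# Hypothetical core counts under the 1-2-4 tier pattern speculated above,
# with a 2048-core baseline for the -06 die (all rumor-based guesses).
BASELINE = 2048  # -06 die, 1x

tiers = {
    "A106 (3060)":          BASELINE,              # 1x = 2048
    "A104 (3070, cut)":     BASELINE * 3 // 2,     # harvested A104 = 3072
    "A104 (3080, full)":    BASELINE * 2,          # 2x = 4096
    "A102 (3080 Ti, cut)":  BASELINE * 4 - 256,    # harvested A102 = 7936
    "A102 (TITAN, full)":   BASELINE * 4,          # 4x = 8192
}
for name, cores in tiers.items():
    print(f"{name}: {cores}")
```

Note the jump from 4096 to 7936 - that's the "mind the gap" step, much wider than the 1-2-3 pattern's spacing in Kepler through Turing.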
 
  • Like
Reactions: GodisanAtheist

GodisanAtheist

Diamond Member
Nov 16, 2006
8,397
9,800
136
My thoughts:
8192 won't fit in the 1-2-3 pattern as used in Kepler*, Maxwell, Pascal, and Turing: -06-level has 1x (the baseline) cores, -04-level has 2x, and -00/02-level has 3x.
I'm going to guess 1-2-4?

- GeForce RTX 3060 (A106): 2048 cores
- 3070 (A104): 3072
- 3080 (A104): 4096
- Mind the gap
- 3080 Ti (A102): 7936
- TITAN (A102): 8192

- Given that nodes are lasting longer than ever, I can see NV heavily paring down the launch edition of their chips with heavily harvested parts while making up performance with clock speed (to remain competitive with the prior gen's one-tier-higher card), refreshing with a Super line, then refreshing again with the full die as an up-numbered part.

Nevertheless, I can see NV going with a 6144sp GA102 part to keep total die size down on a new node while still offering a substantial performance increase over the prior gen, as well as allowing more space for the inevitable increase in RTX hardware.
 

alcoholbob

Diamond Member
May 24, 2005
6,390
469
126
I suspect approximately 30% better raster performance as Nvidia have been sticking to that sort of figure for a while. That means slightly faster clocks combined with a few more cores, but not too many as they'll want to shrink the chips quite a bit to lower costs (those huge dies would be very hard to do with a new 7nm process).

As ray tracing is the big thing and probably easy to improve I guess they'll at least double RT performance.

I can't see memory going up by too much as it's expensive, so say about 1/3 more (6GB->8GB, or 8GB->11GB), and it'll just be GDDR6 with faster clocks.

The one thing I'm not sure about is the tensor stuff, as that hasn't really worked. They add additional complexity, and require a whole lot of dev work and Nvidia supercomputers. They aren't in the consoles, which will just denoise/sharpen with normal shaders. It would be simple to just drop them, but if they don't they'll need more die space and/or a significantly improved architecture to make them work.

30% better than the 2080 Super or 2080 Ti? Because the latter will be quite tough unless they plan on going big die right off the bat, or this is truly a new architecture that beats the pants off of Turing.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
30% better than the 2080 Super or 2080 Ti? Because the latter will be quite tough unless they plan on going big die right off the bat, or this is truly a new architecture that beats the pants off of Turing.
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...
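As an illustration of how a ~30% gain falls out of a "combo" of modest improvements, the per-factor multipliers below are made up for the example (nothing in the thread pins them down), but they show that no single factor needs to be dramatic:

```python
# Illustrative only: how modest per-factor gains compound to ~30%.
# The individual multipliers are assumptions for the example, not rumors.
clock_gain = 1.08  # slightly higher clocks
ipc_gain   = 1.10  # architectural improvements
core_gain  = 1.10  # a few more cores

total = clock_gain * ipc_gain * core_gain
print(f"{(total - 1) * 100:.0f}%")  # ~31%
```

Any similar mix of single-digit-to-10% factors multiplies out to roughly the same place.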
 

jpiniero

Lifer
Oct 1, 2010
16,979
7,382
136
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node.

I guess the question is whether it's realistic to think AMD would do GPUs on TSMC's 5 nm node in 2021. I will for now say no (and only CPUs) but it's worth watching.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...
Higher clock speeds are not likely on the new 7nm processes. Like CPU, GPUs are hitting a clock speed wall. It would be a win just to maintain the current clock speeds that Turing can reach.
 

alcoholbob

Diamond Member
May 24, 2005
6,390
469
126
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...

The reason I said that was the last jump, 980 Ti to 1080, was almost 100% the result of clockspeed gains. That gain was once in a lifetime for Nvidia and had never happened before from a node shrink. I don't believe a 300mm2 die can pull off that much of an increase over a 2080 Ti, which is a 754mm2 die.

If they pull it off I suspect they go large right away on 7nm, i.e. around 400mm2, which won't bode well for the price ($800 GTX 3080?)
 

psolord

Platinum Member
Sep 16, 2009
2,142
1,265
136
I don't know about the specs, but I do know Nvidia must be very careful with their prices this time around.

The PS5 and Xbox Scarlett are coming in one year. The PS5 is rumored to cost $500. If it can do 4K/60fps in a convincing way, people will be fleeing by the millions if you need $350 just for equal graphics on the PC.
 
  • Like
Reactions: CP5670

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
The reason I said that was the last jump, 980 Ti to 1080, was almost 100% the result of clockspeed gains. That gain was once in a lifetime for Nvidia and had never happened before from a node shrink. I don't believe a 300mm2 die can pull off that much of an increase over a 2080 Ti, which is a 754mm2 die.

If they pull it off I suspect they go large right away on 7nm, i.e. around 400mm2, which won't bode well for the price ($800 GTX 3080?)
Good point. NV should get a ~50% shrink out of 7nm. With more RTX support and CCs, they will likely need to go above 400mm2 to get a solid uptick in performance. Maybe around 500mm2 to maintain the 3080 Ti's out-of-reach halo status.
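A quick back-of-the-envelope check of that shrink figure (the ~0.5x area scaling is the assumption stated above, and the extra-area allowance for RT/tensor hardware is a made-up illustration):

```python
# Back-of-the-envelope die-size check for the ~50% area shrink assumption.
TU102_MM2 = 754   # known full TU102 die size
SCALING = 0.5     # assumed 12nm -> 7nm area scaling

ported = TU102_MM2 * SCALING  # a straight Turing port: ~377 mm^2
print(round(ported))          # 377
```

So a like-for-like TU102 port lands near 377mm2, which is why anything with meaningfully more hardware on top pushes past the 400mm2 mark discussed above.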