Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
293
136
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in forum members' thoughts.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
I suspect approximately 30% better raster performance as Nvidia have been sticking to that sort of figure for a while. That means slightly faster clocks combined with a few more cores, but not too many as they'll want to shrink the chips quite a bit to lower costs (those huge dies would be very hard to do with a new 7nm process).

As ray tracing is the big thing and probably easy to improve I guess they'll at least double RT performance.

I can't see memory going up by too much as it's expensive, so say about 1/3 more (6GB->8GB, or 8GB->11GB), and it'll just be GDDR6 with faster clocks.

The one thing I'm not sure about is the tensor stuff, as that hasn't really worked. They add additional complexity, and require a whole lot of dev work and Nvidia supercomputers. They aren't in the consoles, which will just denoise/sharpen with normal shaders. It would be simple to just drop them, but if they don't they'll need more die space and/or a significantly improved architecture to make them work.
 

Midwayman

Diamond Member
Jan 28, 2000
5,723
325
126
After he said that “you’ll be impressed...” did he get really quiet and mumble under his breath “... at how much more we’re going to charge for our new cards.”

:p

They want you to feel a sense of pride and accomplishment when you can afford their cards.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
The tensor core stuff is probably staying in so they can sell the cards for neural-net based stuff too. DLSS was always a bit more of an attempt to leverage that than a justification.
 
  • Like
Reactions: soresu

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
They aren't in the consoles which will just denoise/sharpen with normal shaders
"Normal" is a bit of a misnomer when everyone is changing their uArch's and ISA's to better handle/accelerate ML workloads.

Vega 20 had significant changes to favor ML, and I would imagine that Navi and the console derivatives have similar improvements - so the "normal" shaders may not be strictly ML focused, but they should likely offer decent performance, especially now that Khronos have formed a working subgroup for ML compute in Vulkan, which is the perfect crossover for denoising graphics.
 

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
The tensor core stuff is probably staying in so they can sell the cards for neural-net based stuff too. DLSS was always a bit more of an attempt to leverage that than a justification.
Yes, the tensor cores are a selling point for the Tesla professional cards much more than any current use on the consumer-oriented GeForce line.

Considering the large income that their professional cards bring in, they won't dump tensor cores anytime soon, unless they invent some new ultimate ISA/uArch that can do ML and regular GPGPU compute at the same time as efficiently as current tensor cores can.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,814
1,294
136
The future is clearly Volta and Turing rebrands! Till Nvidia comes out with the ground-up new design that is incompatible to all previous architectures. With their new VLIW(Denver/Carmel HW re-compiling w/ SIMD -> Vector bundles) with Vector instructions GPGPU, hence the days of SIMDs are over. Nvidia will kill RDNA and Sounds with ease.
 
Last edited:

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
The future is clearly Volta and Turing rebrands! Till Nvidia comes out with the ground-up new design that is incompatible to all previous architectures. With their new VLIW(Denver/Carmel HW re-compiling w/ SIMD -> Vector bundles) with Vector instructions GPGPU, hence the days of SIMDs are over. Nvidia will kill RDNA and Sounds with ease.
There are days when I genuinely can't tell if it's acerbic sarcasm or not.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
I would be very surprised if nVidia comes out with new cards so soon. It took us how many years to go from 10x0 to 20x0? I don't see that dramatically shortening.
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Easy to check on the main site - 970/980 @ 18/9/2014;
1080 @ 27/05/2016, 1070 06/10/2016.
2080 @ 20/9/2018.

So something in xx 2020 has to be expected. It won't be too early in the year, with the Super ones launched a month or two back of course.
(Well the big AI one might be. Volta was 5/10/2017.).
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
I would be very surprised if nVidia comes out with new cards so soon. It took us how many years to go from 10x0 to 20x0? I don't see that dramatically shortening.
Pascal and Turing were built, essentially, on the same process node. The die sizes on the current GPUs are very large, so the economics are probably favorable for releasing new GPUs with smaller die sizes than Turing (though AIB prices will likely stay high due to design and implementation costs).
 

soresu

Diamond Member
Dec 19, 2014
4,181
3,651
136
Pascal and Turing were built, essentially, on the same process node. The die sizes on the current GPUs are very large, so the economics are probably favorable for releasing new GPUs with smaller die sizes than Turing (though AIB prices will likely stay high due to design and implementation costs).
Same size as Turing on 7nm would be bad even 12 months after 7nm mass market use.

Until Big Navi arrives and is known to be bigger, the largest 7nm chip we know of is Vega 20 at 335 mm2, which can't have been cheap that early on.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
I assume they stay with 2 SMs/TPC and 64 CUDA cores/SM but will increase RT cores from 1 to 2 per SM.
Currently TU102 consists of 36 TPCs - I assume they will add 12 TPCs (+33%) for AM102 - which would still significantly reduce die size compared to Turing.
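The TPC arithmetic above can be sanity-checked with a quick sketch (the per-TPC/per-SM figures are Turing's known layout; the +12 TPC guess is this post's speculation, not a confirmed spec):

```python
# Sanity check of the TPC/SM core-count speculation above.
# Assumes Turing's known layout: 2 SMs per TPC, 64 CUDA cores per SM.
SM_PER_TPC = 2
CORES_PER_SM = 64

def cuda_cores(tpcs: int) -> int:
    """Total CUDA cores for a given TPC count."""
    return tpcs * SM_PER_TPC * CORES_PER_SM

tu102 = cuda_cores(36)       # current full TU102
am102 = cuda_cores(36 + 12)  # speculated +33% TPCs
print(tu102, am102)          # 4608 6144
```

So the speculated part would land at 6144 CUDA cores versus TU102's 4608.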
 

Caveman

Platinum Member
Nov 18, 1999
2,539
34
91
Ok techno folks... in broad terms, who will have the fastest card available for consumer purchase Dec 2019, May 2020 and Nov 2020?
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Ok techno folks... in broad terms, who will have the fastest card available for consumer purchase Dec 2019, May 2020 and Nov 2020?
Probably Nvidia - I don't think they will want to lose their 'halo' status, but their top card will cost an arm and a leg. The 'middle' range is where the fight will be - at least there should be a lot of options for a change.
 

shortylickens

No Lifer
Jul 15, 2003
80,287
17,082
136
When I worked at Hynix in 2008 they were talking about EUV lithography but never got around to it.
We shut down before it could happen. Anyway, as we all know smaller components mean lower current, so you can shove more in a chip before you overheat.
I am betting increased clock speeds and pathways but nothing major in terms of performance. I think at this point it's mostly overhauls of the architecture that lead to real advancements.
Am probably still gonna get me something like a 2080 if there's a Christmas sale. I needs more frames!
 

Bouowmx

Golden Member
Nov 13, 2016
1,150
553
146
Summary of an ol' Twitter user's now-deleted writings. Kopite7Kimi joined Twitter in 2010, and this year has reported on NVIDIA Turing SUPER very early. 3DCenter (German), English.

Concerning HPC, GA100 has a 6144-bit (6×1024) HBM interface, and GA101 has half as much (3072-bit). HPC will use TSMC 7 nm+.

GA100 allegedly has 8192 cores (8 GPC, 8 TPC per GPC, 2 SM per TPC, 64 cores per SM).

Concerning GeForce, it will use Samsung 7 nm.

More stuff in the links.


My thoughts:
8192 won't fit in the 1-2-3 pattern as used in Kepler*, Maxwell, Pascal, and Turing: -06-level has 1x (the baseline) cores, -04-level has 2x, and -00/02-level has 3x.
I'm going to guess 1-2-4?

- GeForce RTX 3060 (A106): 2048 cores
- 3070 (A104): 3072
- 3080 (A104): 4096
- Mind the gap
- 3080 Ti (A102): 7936
- TITAN (A102): 8192
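The speculated 1-2-4 tier list above can be written out as a small sketch. All figures here are rumor-derived guesses from this post, not confirmed specs; the harvested-die deltas (3072 and 7936) are the cut-down variants of the full A104/A102 dies:

```python
# Hypothetical core counts under the 1-2-4 tier pattern speculated above,
# with a 2048-core baseline for the -06 die (all rumor-based guesses).
BASELINE = 2048  # -06 die, 1x

tiers = {
    "A106 (3060)":          BASELINE,              # 1x = 2048
    "A104 (3070, cut)":     BASELINE * 3 // 2,     # harvested A104 = 3072
    "A104 (3080, full)":    BASELINE * 2,          # 2x = 4096
    "A102 (3080 Ti, cut)":  BASELINE * 4 - 256,    # harvested A102 = 7936
    "A102 (TITAN, full)":   BASELINE * 4,          # 4x = 8192
}
for name, cores in tiers.items():
    print(f"{name}: {cores}")
```

Note the jump from 4096 to 7936 - that's the "mind the gap" step, much wider than the 1-2-3 pattern's spacing in Kepler through Turing.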
 
  • Like
Reactions: GodisanAtheist

GodisanAtheist

Diamond Member
Nov 16, 2006
8,397
9,800
136
My thoughts:
8192 won't fit in the 1-2-3 pattern as used in Kepler*, Maxwell, Pascal, and Turing: -06-level has 1x (the baseline) cores, -04-level has 2x, and -00/02-level has 3x.
I'm going to guess 1-2-4?

- GeForce RTX 3060 (A106): 2048 cores
- 3070 (A104): 3072
- 3080 (A104): 4096
- Mind the gap
- 3080 Ti (A102): 7936
- TITAN (A102): 8192

- Given that nodes are lasting longer than ever, I can see NV heavily paring down the launch edition of their chips with heavily harvested parts while making up performance with clock speed (to remain competitive with the prior gen's one-tier-higher card), refreshing with a Super line, then refreshing again with the full die as an up-numbered part.

Nevertheless, I can see NV going with a 6144sp GA102 part to keep total die size down on a new node while still offering a substantial performance increase over the prior gen, as well as allowing more space for the inevitable increase in RTX hardware.
 

alcoholbob

Diamond Member
May 24, 2005
6,390
469
126
I suspect approximately 30% better raster performance as Nvidia have been sticking to that sort of figure for a while. That means slightly faster clocks combined with a few more cores, but not too many as they'll want to shrink the chips quite a bit to lower costs (those huge dies would be very hard to do with a new 7nm process).

As ray tracing is the big thing and probably easy to improve I guess they'll at least double RT performance.

I can't see memory going up by too much as it's expensive, so say about 1/3 more (6GB->8GB, or 8GB->11GB), and it'll just be GDDR6 with faster clocks.

The one thing I'm not sure about is the tensor stuff, as that hasn't really worked. They add additional complexity, and require a whole lot of dev work and Nvidia supercomputers. They aren't in the consoles, which will just denoise/sharpen with normal shaders. It would be simple to just drop them, but if they don't they'll need more die space and/or a significantly improved architecture to make them work.

30% better than the 2080 Super or 2080 Ti? Because the latter will be quite tough unless they plan on going big die right off the bat, or this is truly a new architecture that beats the pants off of Turing.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
30% better than the 2080 Super or 2080 Ti? Because the latter will be quite tough unless they plan on going big die right off the bat, or this is truly a new architecture that beats the pants off of Turing.
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...
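As an illustration of how a ~30% gain falls out of a "combo" of modest improvements, the per-factor multipliers below are made up for the example (nothing in the thread pins them down), but they show that no single factor needs to be dramatic:

```python
# Illustrative only: how modest per-factor gains compound to ~30%.
# The individual multipliers are assumptions for the example, not rumors.
clock_gain = 1.08  # slightly higher clocks
ipc_gain   = 1.10  # architectural improvements
core_gain  = 1.10  # a few more cores

total = clock_gain * ipc_gain * core_gain
print(f"{(total - 1) * 100:.0f}%")  # ~31%
```

Any similar mix of single-digit-to-10% factors multiplies out to roughly the same place.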
 

jpiniero

Lifer
Oct 1, 2010
16,979
7,382
136
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node.

I guess the question is whether it's realistic to think AMD would do GPUs on TSMC's 5 nm node in 2021. I will for now say no (and only CPUs) but it's worth watching.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...
Higher clock speeds are not likely on the new 7nm processes. Like CPU, GPUs are hitting a clock speed wall. It would be a win just to maintain the current clock speeds that Turing can reach.
 

alcoholbob

Diamond Member
May 24, 2005
6,390
469
126
30% from the combo of architectural improvements, higher clock speeds and core count increases is pretty easy. Being as this is a huge node size reduction they can probably do more but I suspect they will intentionally target that so they can pull another 30% gain with the 4xxx series on the same node. That is unless AMD produce something really good in which case they'll have to not hold back...

The reason I said that was the last jump, 980 Ti to 1080, was almost 100% the result of clockspeed gains. That gain was once in a lifetime for Nvidia and had never happened before from a node shrink. I don't believe a 300mm2 die can pull off that much of an increase over a 2080 Ti, which is a 754mm2 die.

If they pull it off I suspect they go large right away on 7nm, i.e. around 400mm2, which won't bode well for the price ($800 GTX 3080?)
 

psolord

Platinum Member
Sep 16, 2009
2,142
1,265
136
I don't know about the specs, but I do know Nvidia must be very careful with their prices this time around.

The PS5 and Xbox Scarlett are coming in one year. The PS5 is rumored to cost $500. If it can do 4K/60fps in a convincing way, people will be fleeing by the millions if you need $350 just for equal graphics on the PC.
 
  • Like
Reactions: CP5670

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
The reason I said that was the last jump, 980 Ti to 1080, was almost 100% the result of clockspeed gains. That gain was once in a lifetime for Nvidia and had never happened before from a node shrink. I don't believe a 300mm2 die can pull off that much of an increase over a 2080 Ti, which is a 754mm2 die.

If they pull it off I suspect they go large right away on 7nm, i.e. around 400mm2, which won't bode well for the price ($800 GTX 3080?)
Good point. NV should get a ~50% shrink out of 7nm. With more RTX support and CCs, they will likely need to go above 400mm2 to get a solid uptick in performance. Maybe around 500mm2 to maintain the 3080 Ti's out-of-reach halo status.
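A quick back-of-the-envelope check of that shrink figure (the ~0.5x area scaling is the assumption stated above, and the extra-area allowance for RT/tensor hardware is a made-up illustration):

```python
# Back-of-the-envelope die-size check for the ~50% area shrink assumption.
TU102_MM2 = 754   # known full TU102 die size
SCALING = 0.5     # assumed 12nm -> 7nm area scaling

ported = TU102_MM2 * SCALING  # a straight Turing port: ~377 mm^2
print(round(ported))          # 377
```

So a like-for-like TU102 port lands near 377mm2, which is why anything with meaningfully more hardware on top pushes past the 400mm2 mark discussed above.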