Question 'Ampere'/Next-gen gaming uarch speculation thread

Ottonomous · Nov 1, 2019

How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.

AtenRa · Aug 25, 2020

tviceman said:
But come on, a new node, a new architecture, and a new crazy high TDP - only for 3090 to be 50% faster than the 2080 TI? I don't buy that.

If the RTX 3090 Ampere is a conservative 30% more efficient than RTX 2080 TI Turing, that alone will translate into a 65% performance uplift over Turing based on the rumored TDP. So then 1.5 x 1.65 = 2.475, which is nearly 150% faster than the 5700 XT. But this is just at rasterization. Nvidia may (or may not) also hold a very commanding lead in RT capabilities.
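The arithmetic in the estimate above can be checked with a short sketch. It assumes, as the post implies, that the 1.5 factor is the 2080 Ti's lead over the 5700 XT and that performance scales as efficiency times board power; the function names are illustrative only.

```python
# Sanity check of the chained speedup estimate quoted above.
# Assumptions (not confirmed figures): 1.5 is the 2080 Ti's lead over
# the 5700 XT, 1.65 is the claimed 3090 uplift over the 2080 Ti, and
# performance = efficiency gain * power increase.

def chain_speedups(*factors):
    """Multiply relative-performance factors together."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result

def percent_faster(ratio):
    """Convert a performance ratio into a 'percent faster' figure."""
    return (ratio - 1.0) * 100.0

# 3090 relative to the 5700 XT under the post's assumptions.
ratio_vs_5700xt = chain_speedups(1.5, 1.65)

# Power increase implied by a 65% uplift at 30% better efficiency.
implied_power_ratio = 1.65 / 1.30

print(f"3090 vs 5700 XT ratio: {ratio_vs_5700xt:.3f}")
print(f"percent faster: {percent_faster(ratio_vs_5700xt):.1f}%")
print(f"implied power ratio: {implied_power_ratio:.2f}")
```

Note that a ratio of 2.475 means roughly 147.5% faster, which is where the "nearly 150%" comes from.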

I strongly believe the Ampere architecture is more optimized for RT and ML than for raster performance.
The new Tensor Cores on Ampere, with native INT8/INT4, will perhaps need 2x the transistor count of Turing's Tensor Cores. So if they need 2x the transistor count and Nvidia also increases the Tensor Core count vs Turing, then Tensor Cores will occupy a larger percentage of the die area in Ampere than in Turing. Add extra RT cores, extra bandwidth lanes for communication, and a few other things, and a lot of the die goes to RT and ML performance increases and not to raster.

For those reasons I believe the RTX 3090 will need to increase clocks way above what they were originally aiming for in order to reach +60% over the RTX 2080 Ti. And that is why we have the TDP increase over the Turing RTX 2080 Ti.

Ajay · Aug 25, 2020

AtenRa said:
I strongly believe the Ampere architecture is more optimized for RT and ML than for raster performance.
The new Tensor Cores on Ampere, with native INT8/INT4, will perhaps need 2x the transistor count of Turing's Tensor Cores. So if they need 2x the transistor count and Nvidia also increases the Tensor Core count vs Turing, then Tensor Cores will occupy a larger percentage of the die area in Ampere than in Turing. Add extra RT cores, extra bandwidth lanes for communication, and a few other things, and a lot of the die goes to RT and ML performance increases and not to raster.

Are the registers in Turing's tensor cores 16b? Is Ampere going to 32b registers?

AtenRa · Aug 25, 2020

Ajay said:
Are the registers in Turing's tensor cores 16b? Is Ampere going to 32b registers?

I don't know about the registers, but:

You need huge transistor allocation to have half the Tensor Cores per SM (4 Tensors per SM in Ampere vs 8 Tensors in Turing) and at the same time have 2x more FP16/FP32 throughput.
That means each Tensor core in Ampere has 4 times the performance of each Tensor Core in Turing.

From NVIDIA Ampere architecture

https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf

The A100 SM diagram is shown in Figure 7. Volta and Turing have eight Tensor Cores per SM, with each Tensor Core performing 64 FP16/FP32 mixed-precision fused multiply-add (FMA) operations per clock. The A100 SM includes new third-generation Tensor Cores that each perform 256 FP16/FP32 FMA operations per clock. A100 has four Tensor Cores per SM, which together deliver 1024 dense FP16/FP32 FMA operations per clock, a 2x increase in computation horsepower per SM compared to Volta and Turing.
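The per-SM and per-core figures in the whitepaper passage above can be verified with a few lines. This is only a sketch using the numbers quoted there; the constant and function names are made up for illustration.

```python
# Tensor Core throughput per SM, using only the figures quoted from
# the whitepaper passage above (FP16/FP32 mixed-precision FMA ops/clock).

VOLTA_TURING_CORES_PER_SM = 8
VOLTA_TURING_FMA_PER_CORE = 64

A100_CORES_PER_SM = 4
A100_FMA_PER_CORE = 256

def fma_per_sm(cores_per_sm, fma_per_core):
    """Dense FMA operations per clock delivered by one SM."""
    return cores_per_sm * fma_per_core

turing_sm = fma_per_sm(VOLTA_TURING_CORES_PER_SM, VOLTA_TURING_FMA_PER_CORE)
a100_sm = fma_per_sm(A100_CORES_PER_SM, A100_FMA_PER_CORE)

# Gain per SM versus gain per individual Tensor Core.
per_sm_gain = a100_sm / turing_sm
per_core_gain = A100_FMA_PER_CORE / VOLTA_TURING_FMA_PER_CORE

print(f"Volta/Turing: {turing_sm} FMA/clk per SM")
print(f"A100:         {a100_sm} FMA/clk per SM")
print(f"per-SM gain: {per_sm_gain:.0f}x, per-core gain: {per_core_gain:.0f}x")
```

Half as many cores, each doing four times the work, gives the 2x per-SM figure, which matches the "4 times the performance" per Tensor Core stated earlier in the post.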

Konan · Aug 25, 2020

Krteq said:
This one is huuuge

WCCFTech - NVIDIA GeForce RTX 3090 & RTX 3080 Ampere GA102 GPU Allegedly Pictured – Massive Die For Enthusiast Gaming Graphics Cards

The TW on the chip indicates Taiwan so TSMC? or maybe diffused in Korea and made in Taiwan ??

Glo. · Aug 25, 2020

https://twitter.com/x/status/1298265934939865093

Another day, another leak. Ampere brings 35% perf. increase over Turing?

Also, the release dates are in.

Krteq · Aug 25, 2020

Konan said:
The TW on the chip indicates Taiwan so TSMC? or maybe diffused in Korea and made in Taiwan ??

Check my second post right after that one... so yep, most probably diffused in Korea and made in Taiwan

Krteq said:
GP107 also manufactured at Samsung (14nm)

Krynj · Aug 25, 2020

As somebody that is currently waiting to pull the trigger on something in the $400/$500 range for my next build, should I expect any product announcement in that price range? Or will we just see price cuts to the current 20xx cards? The GPU is basically the last component I need before I pull the trigger on my new parts, so I was just looking for a bit of insight on what to expect. I haven't bought an Nvidia card in about 17 years, or built a system in about 8, so I'm a bit out of the loop to say the least.

Konan · Aug 25, 2020

Krynj said:
As somebody that is currently waiting to pull the trigger on something in the $400/$500 range for my next build, should I expect any product announcement in that price range? Or will we just see price cuts to the current 20xx cards? The GPU is basically the last component I need before I pull the trigger on my new parts, so I was just looking for a bit of insight on what to expect. I haven't bought an Nvidia card in about 17 years, so I'm not too familiar with their product launches.

This time next week we have a strong chance of learning official pricing and some product info from Nvidia in their event on Sept. 1.
Could be an RTX 3070/3060 in that price range.

Stuka87 · Aug 25, 2020

Konan said:
The TW on the chip indicates Taiwan so TSMC? or maybe diffused in Korea and made in Taiwan ??

'S TW' is for Samsung. Not TSMC.

EDIT: Correction, that's just where the final assembly took place. It has no relation to where the chip was diffused. Nvidia often has that on chips, but sometimes has 'B KOREA' on the chip instead.

DiogoDX · Aug 25, 2020

Glo. said:
https://twitter.com/x/status/1298265934939865093

Another day, another leak. Ampere brings 35% perf. increase over Turing?

Also, the release dates are in.

There was a rumor that only the 3080 will launch in September. The 3080 being 35% faster than a 2080 Ti lines up with the rumors too.

Glo. · Aug 25, 2020

DiogoDX said:
There was a rumor that only the 3080 will launch in September. The 3080 being 35% faster than a 2080 Ti lines up with the rumors too.

https://twitter.com/x/status/1298312087425486849

Nope. It's 35% over the RTX 2080 Ti, with the RTX 3090.

DooKey · Aug 25, 2020

Glo. said:
https://twitter.com/x/status/1298312087425486849

Nope. It's 35% over the RTX 2080 Ti, with the RTX 3090.

If that ends up true I'll wait for Hopper on 5nm.

sontin · Aug 25, 2020

Nonsense. The 3080 has 41% more compute units and 57.5% more bandwidth than the 2080 Super.

DiogoDX · Aug 25, 2020

Glo. said:
https://twitter.com/x/status/1298312087425486849

Nope. It's 35% over the RTX 2080 Ti, with the RTX 3090.

More in 4K. So could be 40% faster like the early rumors.

Glo. · Aug 25, 2020

I think it still will be closer to 40-45% rasterization performance uplift over previous generation.

CakeMonster · Aug 25, 2020

35% for the steps listed there is very disappointing for a 2 year wait. Especially considering the price speculations.

I don't really believe it, but my qualifications are close to zero.

CastleBravo · Aug 25, 2020

CakeMonster said:
35% for the steps listed there is very disappointing for a 2 year wait. Especially considering the price speculations.

Alternative theory; 3080 is 2080ti replacement at 35% higher performance, and 3090 is in a new performance category.

Glo. · Aug 25, 2020

CastleBravo said:
Alternative theory; 3080 is 2080ti replacement at 35% higher performance, and 3090 is in a new performance category.

https://twitter.com/x/status/1298312087425486849

Already disproven.

"35% across the board. 4k you're looking at more of a jump 2080Ti > 3090 2080 > 3080 etc "

sontin · Aug 25, 2020

A twitter post from someone is no proof.

CakeMonster · Aug 25, 2020

CastleBravo said:
Alternative theory; 3080 is 2080ti replacement at 35% higher performance, and 3090 is in a new performance category.

Yep, that's IMO what could justify another price hike for the top card, and it would make financial sense for NV if it's technically possible.

Glo. · Aug 25, 2020

sontin said:
A twitter post from someone is no proof.

The coping...

Glo. · Aug 25, 2020

The thing that baffles me is...

No way Jensen will let it be this slow. Not compared to Navi 2. Nvidia engineers know (well, they know 90-95% at this point...) how Navi 2 will behave. They can clock it to hell.

Why only 35%?

I don't buy it. It will be faster. Around 40-45% in 1080p, and faster in 4K.

sontin · Aug 25, 2020

No, but you have to decide who you believe. What happened to the Timespy extreme leaks? Huh?

Ajay · Aug 25, 2020

AtenRa said:
I don't know about the registers, but:

You need huge transistor allocation to have half the Tensor Cores per SM (4 Tensors per SM in Ampere vs 8 Tensors in Turing) and at the same time have 2x more FP16/FP32 throughput.
That means each Tensor core in Ampere has 4 times the performance of each Tensor Core in Turing.

From NVIDIA Ampere architecture

https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf

Thanks! Looking over that documentation, I think it’s pretty clear that for GA102, the total CUDA core count will go down, as will FP64 FMAs and the number of tensor cores. I don’t think consumer GPUs will need FP64 either (unless GA102 is also for engineering workstations). So I think some reasonable cuts can be made, especially if NV is aiming for higher clocks. Hopefully we’ll see actual products soon and know for sure.

Kenmitch · Aug 25, 2020

sontin said:
No, but you have to decide who you believe. What happened to the Timespy extreme leaks? Huh?

Why argue back and forth? It's a speculation thread after all.

Wait for reviews to drop. Then you can gloat or pout if needed.
