
Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.
 
Those were quite different. They weren't pushing air over a heatsink, they just had openings on the backside that enabled the blower fan to pull air from both the back and front of the card.

Nope
[attached images]
 

And here, Kopite says that power draw will actually be between 300 and 375W.

Considering the die configurations of the 102 chip, I would say he means that the RTX 3080 is going to use 300W of power, and the RTX 3090 (or something with the full die) will use 375W.
But those are just nominal specs: 300W from 2x8-pin PCIe power connectors plus 75W from the PCIe slot. So it's odd to have a 'leak' that just restates typical PCIe power specs.
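For reference, the nominal limits being quoted can be tallied from the PCIe CEM spec values (75 W from the slot, 150 W per 8-pin connector, 75 W per 6-pin connector); a quick back-of-the-envelope sketch:

```python
# Nominal board power budget from PCIe power sources.
# Per the PCIe CEM spec: 75 W from the slot, 150 W per 8-pin
# auxiliary connector, 75 W per 6-pin auxiliary connector.
def max_board_power(eight_pin=0, six_pin=0, slot_w=75):
    return slot_w + 150 * eight_pin + 75 * six_pin

print(max_board_power(eight_pin=2))            # 2x8-pin + slot -> 375
print(max_board_power(eight_pin=1, six_pin=1)) # 8-pin + 6-pin + slot -> 300
```

So "300 to 375W" is exactly the span between an 8+6-pin board and a 2x8-pin board, which is why the 'leak' reads like it was derived from the connector config alone.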
 
But those are just nominal specs: 300W from 2x8-pin PCIe power connectors plus 75W from the PCIe slot. So it's odd to have a 'leak' that just restates typical PCIe power specs.
He is talking about ACTUAL power draw. Not what is possible.
 
There was a debate on the Chiphell forums regarding the attached table (note: speculation). It looks like the thread has since been pulled, or the member locked, which is strange. Anyway, I think the TDPs for the higher-end Nvidia cards (Titan / 3090 / 3080) look reasonable.
Also note the GDDR6 for 3080 and 3090/Titan with GDDR6x.

[attached image]
 
Well, Kopite was the first to even float the GDDR6X idea. And yet people find it easy to believe that part of his info, but hard to believe the 102 dies will draw past 300W of power.

Despite his track record of being correct about Nvidia products in the past...
 
Well, Kopite was the first to even float the GDDR6X idea. And yet people find it easy to believe that part of his info, but hard to believe the 102 dies will draw past 300W of power.

Despite his track record of being correct about Nvidia products in the past...

If it's 280, 290 W that might be too close for comfort.

Remember the 2080 Ti FE was also dual 8-pin.
 
If it's 280, 290 W that might be too close for comfort.

Remember the 2080 Ti FE was also dual 8-pin.
For 3080, sure. That is what we might see. Something between 260 and 300W under load.

I find it baffling that people who know three things: the Ampere/GA100 TDP of 400W, Samsung's process being less efficient than TSMC's 7nm, and, more importantly, Nvidia being required to crank the clock speeds way past 1.9 GHz (losing efficiency), still reject the idea that next-gen Nvidia GPUs could use up to 375W of power for the Founders Edition.

Guys, wake up. The writing is already on the wall, touted time and time again by different people.
 
GA100 provides 3x more performance than GV100. So you are claiming that gaming Ampere will have 3x more performance, too? Great.
 
Erm, no, it actually sees a decrease in GFLOPS/watt compared to Volta GPUs.

Why would we see anything different with the gaming cards, especially considering there are no physical design improvements for Samsung's process, because Nvidia hadn't had the time to make them?
 
Which gaming Ampere version will have the huge Tensor Cores, which can't be used for game rendering outside of DL? A100 delivers 2.5x more performance with Tensor Cores (FP16). Claiming that GFLOPS/watt decreased is just wrong.
 
Erm, no, it actually experiences decrease in GFLOP/watt, compared to Volta GPUs.
Nope, you are measuring traditional FLOPS; this is an AI chip. Most of its transistors have gone to the Tensor Cores, for which it provides a crazy 2.5x speedup over the previous gen.

And the 400W thing is irrelevant: the A100 is the SXM4 form factor, which is rated at 400W for unlocked performance during extended usage. Nvidia could have made this chip 350W or 300W at the same clocks if they wanted to; the V100 was 300W in SXM2, and 350W and 450W respectively in SXM3.

You need to follow the world of data centers to understand this.
 
Nope, you are measuring traditional FLOPS; this is an AI chip. Most of its transistors have gone to the Tensor Cores, for which it provides a crazy 2.5x speedup over the previous gen.

And the 400W thing is irrelevant: the A100 is the SXM4 form factor, which is rated at 400W for unlocked performance during extended usage. Nvidia could have made this chip 350W or 300W at the same clocks if they wanted to; the V100 was 300W in SXM2, and 350W and 450W respectively in SXM3.

You need to follow the world of data centers to understand this.
I always find it funny when people are unable to look at the message BEYOND one particular post and read the context.

That is not the point I was making.

Secondly, Nvidia is rumored to increase Tensor and RT capabilities with the next-gen gaming cards compared to Turing. Do you guys genuinely believe those capabilities come as a free lunch from the perspective of power draw? The very reason the Ampere GA100 chip is less efficient per watt than Volta in FP32 and FP64 is those increased AI capabilities in the GA100 chip!

What do you think will happen with the gaming cards, considering that Samsung's process is LESS EFFICIENT than TSMC's 7nm process, Nvidia will crank the clock speeds through the roof to compete with AMD, and they haven't had time to optimize the physical design of the next-gen gaming cards, as they had the opportunity to do with Turing?

The end result is 300W power draw on an RTX 3080 based on the 102 chip, and an RTX 3090 drawing way past 300W of power. Why is it so hard to see this?
 
With FP16 the A100 can deliver 2x the efficiency of the V100, but with FP32 it would be worse?

And why can't you accept that Samsung has been mass-producing 8nm chips since the summer of 2018, which would have given Nvidia enough time to optimize for the process, just as with the 7nm process?
 
Why do you assume it's less efficient at FP32 and FP64? The TDP applies under maximum load. It is likely more efficient than Turing when running FP32 workloads; the die shrink alone should take care of that. Compute Ampere is different enough that it probably doesn't make sense to judge the gaming Ampere cards by how it performs.
 
Why do you assume it's less efficient at FP32 and FP64? The TDP applies under maximum load. It is likely more efficient than Turing when running FP32 workloads; the die shrink alone should take care of that. Compute Ampere is different enough that it probably doesn't make sense to judge the gaming Ampere cards by how it performs.
Because the numbers, yes theoretical, show that Ampere is actually less efficient per watt in FP32/64.

Also, so FP32 load is not "full" load now. Got it.

Also, it's actually on you to prove that the 400W TDP rating is not measured at maximum FP32 load... and that it is not full load.
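For what it's worth, the "theoretical numbers" argument can be checked with the public spec-sheet figures (assuming V100 SXM2 at 15.7 FP32 TFLOPS / 300 W and A100 SXM4 at 19.5 FP32 TFLOPS / 400 W; treat those values as my assumption, not gospel):

```python
# Theoretical FP32 efficiency from spec-sheet numbers.
# Assumed: V100 SXM2 = 15.7 TFLOPS @ 300 W, A100 SXM4 = 19.5 TFLOPS @ 400 W.
specs = {
    "V100 SXM2": (15.7, 300),
    "A100 SXM4": (19.5, 400),
}
for name, (tflops, watts) in specs.items():
    gflops_per_watt = tflops * 1000 / watts
    print(f"{name}: {gflops_per_watt:.1f} GFLOPS/W")
# V100 SXM2: ~52.3 GFLOPS/W, A100 SXM4: ~48.8 GFLOPS/W
```

On those numbers the A100 does come out slightly behind per watt on plain FP32, though whether the 400W rating was set with an FP32 workload in mind is exactly the point being argued here.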
 
So power consumption is over 900W for A100?!
[attached image]


3x more performance with FP16 and 11x more than T4 (70W).

You cannot compare that load to the gaming load we will see with the 3x00-series chips. Those are very specific use cases, and part of the reason Ampere is faster there is improvements in those functions, not raw horsepower alone. The benchmarks in that article are very much cherry-picked by Nvidia marketing. None of them translate to the gaming cards.
 
With FP16 the A100 can deliver 2x the efficiency of the V100, but with FP32 it would be worse?

And why can't you accept that Samsung has been mass-producing 8nm chips since the summer of 2018, which would have given Nvidia enough time to optimize for the process, just as with the 7nm process?
Apparently Samsung's 8nm DUV is just a bump of their 10nm process and is inferior to TSMC N7. If Samsung's 7nm EUV had delivered, Nvidia would be in a different position.
 
With FP16 the A100 can deliver 2x the efficiency of the V100, but with FP32 it would be worse?

And why can't you accept that Samsung has been mass-producing 8nm chips since the summer of 2018, which would have given Nvidia enough time to optimize for the process, just as with the 7nm process?
Considering that the CUT-DOWN, 108-SM chip draws 400W of power, why do you even believe Nvidia had time to optimize the physical design for the N7 process, especially considering it is the bog-standard N7 process, and not something like the version of TSMC 16nm that Nvidia and TSMC called "12nm FFN" for marketing purposes?

Remember, guys: the GA100 chip has 108 SMs enabled out of 128 SMs in the whole design. How much power would the full die draw? 500W? And remember, this thing has HBM2, where each stack draws around 4W. So we are talking about 20-30W for the memory subsystem.

GDDR6 and GDDR6X will draw a similar amount of power, but per memory chip!
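Using that ~4 W-per-HBM2-stack figure, a rough memory-subsystem tally looks like this (the stack count and the ~2.5 W per GDDR6 chip are my own illustrative assumptions, not leaked numbers):

```python
# Rough memory-subsystem power estimates.
# Assumed: ~4 W per HBM2 stack (from the post above),
# ~2.5 W per GDDR6 chip (illustrative guess only).
def hbm2_power(stacks, w_per_stack=4):
    return stacks * w_per_stack

def gddr6_power(chips, w_per_chip=2.5):
    return chips * w_per_chip

print(hbm2_power(6))    # e.g. 6 stacks -> 24 W
print(gddr6_power(12))  # e.g. 12 chips on a 384-bit board -> 30.0 W
```

Either way, the memory subsystem only accounts for a few tens of watts; the bulk of a 300-375W board budget is the GPU die itself.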

Why is it so hard to believe that the RTX 3080 could use 300W of power and the RTX 3090 375W, considering even the person with legit, correct info says the 102-based GPUs will draw anywhere between 300 and 375W of power?

Deal with it.
 
To be fair: one of my sources told me that the 2080 Ti replacement currently being worked on was drawing anywhere between 260 and 300W of power, depending on the workload.

What I do not know is what clock speeds it was running at. Also, considering the leak of the 3080 shroud, I believe this GPU is close to launch, which is why I assume it is the 300W GPU being talked about here.
 