Question 'Ampere'/Next-gen gaming uarch speculation thread


Ottonomous

Senior member
May 15, 2014
559
292
136
How much gain is the Samsung 7nm EUV process expected to provide?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping Turing at cheaper prices, while offering 'beefed-up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if this is imprudent/uncalled for, just interested in the forum members' thoughts.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Pascal had 128 FP32 cores per SM, and was unable to issue them in the same clock as INT ops. During the Turing gen, NV revealed that sampled workloads have an average mix of ~100 FP32 + 36 INT ops.
So it was logical that Turing moved to an SM with 64 CUDA cores, which has 64 FP32 and 64 INT units and can issue to both. This was done to avoid idling FP32 cores while INT ops are being issued.

With the Ampere gaming arch, there is an additional block of FP32 resources, and the SM now has a total of 128 CUDA cores: 128 FP32 and 64 INT units. So each clock it can execute either 128 FP32 ops or 64 FP32 + 64 INT ops.

Looks like a great fit for the mix revealed during the Turing gen? Except for the fact that it can't do 128 FP32 + 64 INT ops per clock, so when a mix is involved, "peak" throughput per clock is the same as Turing's.

It obviously is a very good tradeoff, as FP32 resource utilization will still be very good for the ~100 + 36 average mix, but to achieve peak throughput on gaming Ampere, the fewer INT ops in the mix, the better.
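For illustration, here is a minimal back-of-the-envelope sketch (my own model, not from NVIDIA's materials) of ideal per-SM issue cycles under those rules, using that ~100 FP32 + 36 INT mix. It assumes perfect scheduling and ignores register file, cache, and memory limits:

```python
# Minimal sketch: ideal cycles for a workload of FP32 + INT32 ops under
# Turing-style and Ampere-style per-clock issue rules. Assumes perfect
# scheduling; real utilization is lower (registers, caches, memory).

def turing_cycles(fp, i32):
    # Turing SM: 64 FP32 + 64 INT32 lanes, both issuable every clock.
    return max(fp / 64, i32 / 64)

def ampere_cycles(fp, i32):
    # Ampere gaming SM: each clock is either 128 FP32, or 64 FP32 + 64 INT32.
    mixed = i32 / 64                    # mixed-mode cycles to drain the INT ops
    fp_left = max(fp - mixed * 64, 0)   # FP32 already issued alongside them
    return mixed + fp_left / 128        # finish the rest in pure-FP32 mode

fp, i32 = 100, 36                       # the average mix NV quoted
t, a = turing_cycles(fp, i32), ampere_cycles(fp, i32)
print(f"Turing: {t:.3f} cycles, Ampere: {a:.3f} cycles, speedup {t/a:.2f}x")
# -> Turing: 1.562 cycles, Ampere: 1.062 cycles, speedup 1.47x
```

At zero INT ops the gap grows to 2x, and at a 50/50 mix it vanishes, which is exactly the "the fewer INT ops, the better" point.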
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Do you have a source for this info? I am not doubting you but would like to read it.

It is from
They even go on to explain that the perf gain will obviously differ depending on the instruction mix.

And there are many more factors that matter for utilization of the CUDA core resources, like register file size and ports, and L1/L2 bandwidth and sizes. In fact, the 3080 even regressed in L2 cache versus the 2080 Ti. It is as if NV ran out of die area and power budget to make it a proper 128 FP + 128 INT SM monster.
 
  • Like
Reactions: Mopetar

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
No, nVidia has warps with 32 threads. They would have to double everything within an SM. So with Ampere they doubled the FP32 throughput with relatively few transistors.
 

lixlax

Member
Nov 6, 2014
183
150
116
Is Nvidia going to get sued for selling "fake" (CUDA) cores like AMD was for Bulldozer marketing? As I understand it, there is still only half the number of actual CUDA cores compared to what is marketed; it's just that each core can do up to 2 operations per cycle!??
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,773
3,150
136
Is Nvidia going to get sued for selling "fake" (CUDA) cores like AMD was for Bulldozer marketing? As I understand it, there is still only half the number of actual CUDA cores compared to what is marketed; it's just that each core can do up to 2 operations per cycle!??
No.
Understand that there is a difference between:
Data paths
Execution units
Instruction dispatch/retire

Ampere has all the execution units claimed (128 FP and 64 INT) but not the other resources to sustain them every cycle.
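To make that concrete, here is a hedged sketch (my own model; the shared dispatch budget is an assumption for illustration, not a published spec) of how units can exist without being fed every cycle:

```python
# Illustrative model: retired ops per clock are capped by the narrowest
# resource in the pipeline, not by the execution unit count alone.
# All numbers here are assumptions chosen to match the behavior above.

def sustained_ops_per_clock(wanted_fp, wanted_int,
                            fp_units=128, int_units=64, dispatch_width=128):
    """Clamp a requested per-clock mix by the unit counts and a shared
    dispatch budget that covers both pipes."""
    int_issued = min(wanted_int, int_units, dispatch_width)
    fp_issued = min(wanted_fp, fp_units, dispatch_width - int_issued)
    return fp_issued, int_issued

print(sustained_ops_per_clock(128, 0))   # (128, 0): pure FP32 hits the peak
print(sustained_ops_per_clock(128, 64))  # (64, 64): INT eats dispatch, FP32 halves
```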
 

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Curious, why do you say that? Do you think they plan on getting Hopper out the door by late 2021?
I'm very unsure about Hopper GeForce, but Hopper A100 successor is a possibility.
Coming in 2021: Intel Xe-HP with MCM, 4x4096 cores at 1.3 GHz for ~42 TFLOPS. The NVIDIA A100 has 6912 cores at 1.4 GHz for 19 TFLOPS, or, if all 8192 cores were enabled, 23 TFLOPS. I assume NVIDIA wants to get ahead of this with its own MCM architecture on TSMC 5 nm.
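Those figures all drop out of the standard FMA formula, FP32 TFLOPS = cores × 2 ops per FMA × clock in GHz / 1000; a quick check:

```python
def fp32_tflops(cores, ghz):
    # Each core retires one FMA (2 FLOPs) per clock at the given frequency.
    return cores * 2 * ghz / 1e3

print(fp32_tflops(4 * 4096, 1.3))  # Xe-HP 4-tile:          ~42.6
print(fp32_tflops(6912, 1.4))      # A100 as shipped:       ~19.4
print(fp32_tflops(8192, 1.4))      # full GA100, all cores: ~22.9
```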
 
  • Like
Reactions: nnunn

CakeMonster

Golden Member
Nov 22, 2012
1,391
498
136
Probably way too early to speculate about next gen. I would like to know, but realistically we won't. If it takes 24 months or more, the 3090 will look better and the 3080 worse.
 

amenx

Diamond Member
Dec 17, 2004
3,906
2,123
136

MrTeal

Diamond Member
Dec 7, 2003
3,569
1,699
136
What a disaster this would be if true...
Maybe we can get some temporary leeway on the no profanity rule in the tech forums?
 

Kenmitch

Diamond Member
Oct 10, 1999
8,505
2,249
136
What a disaster this would be if true...

Karma rearing its ugly head. /s
 

Mopetar

Diamond Member
Jan 31, 2011
7,837
5,992
136
Curious, why do you say that? Do you think they plan on getting Hopper out the door by late 2021?

I have a feeling that the Samsung process has a sizable share of the blame for the massive power numbers we're seeing for this card. Sure, it's fair to say that with all of the other stuff they added, driving PPW wasn't a priority, but it's still a big jump.

If that is the case, NVidia wants to get to 5nm as soon as they can. Just look at how badly AMD was hamstrung by GF being a node behind Intel for so long.

Assuming AMD finally has a card worth all of the usual hype that the community builds up before a launch, that doesn't leave NVidia with a lot of room. Ideally it means we get even better prices than we already have as both companies butt heads for market share.

I wouldn't be surprised if Hopper is more conservative at pushing RT or new features, but offers a massive boost in efficiency. Obviously they get a bigger uplift if the Samsung process really is at fault, but I have no doubt that there are architecture gains to be made, especially on newer technology.
 
  • Like
Reactions: A///

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,560
14,514
136
What a disaster this would be if true...
I hate miners for what they have done to the video card industry.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
I'm intrigued. You would think being able to accurately measure the GPU power would be something nVidia wouldn't want you to be able to do.

nVidia likes to only show GPU (the chip) power, and exclude all other power consumption (rest of board, memory, etc). It's why their TDP numbers are typically off.
 

Konan

Senior member
Jul 28, 2017
360
291
106

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
Actually, for newer cards, it only shows % of power limit. At least in HWMonitor.

Try HWiNFO. The amount of data provided is impressive.
