
'Ampere'/Next-gen gaming uarch speculation thread

Page 87

Ottonomous

Senior member
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.
 
Lol, power scales with bandwidth, not with size... and GDDR6X has higher bandwidth efficiency.
Anyway, your calculation is nonsense.

Memory power scales with bandwidth and capacity. Do you think double the VRAM on the RTX Titan Ampere is going to come free of cost? Hint: the 17 Gbps on the Titan Ampere vs the 21 Gbps on the RTX 3090 is for power reasons.


2nd Gen NVIDIA TITAN GA102-400-A1 5376 24GB 17Gbps
GeForce RTX 3090 GA102-300-A1 5248 12GB 21Gbps
GeForce RTX 3080 GA102-200-Kx-A1 4352 10GB 19Gbps

Here is a review of the RX 5500 XT 4GB and 8GB from Sapphire, which are identical other than VRAM capacity. The 8GB version draws more power than the 4GB version.

 
Memory power scales with bandwidth and capacity. Do you think double the VRAM on the RTX Titan Ampere is going to come free of cost? Hint: the 17 Gbps on the Titan Ampere vs the 21 Gbps on the RTX 3090 is for power reasons.

The power adder based on capacity is mostly the power needed for refreshing the cells, which is much smaller than the dynamic power required for accessing a page. So Glo's calculation continues to be nonsense.

I also have a hint for you: the 8 GByte version drawing more power than the 4 GByte version is also attributable to the fact that the 8 GByte version is doing more work per unit of time (e.g. games running faster).

And now go back and read again what Glo claimed/calculated before nitpicking that power-bandwidth scaling is not totally correct.
 
The power adder based on capacity is mostly the power needed for refreshing the cells, which is much smaller than the dynamic power required for accessing a page. So Glo's calculation continues to be nonsense.

I wanted to point out that your statement about power scaling only with bandwidth and not size was incorrect. BTW, you do not have a detailed breakdown of memory power scaling with capacity, so one could argue you are speculating.
 
I wanted to point out that your statement about power scaling only with bandwidth and not size was incorrect. BTW, you do not have a detailed breakdown of memory power scaling with capacity, so one could argue you are speculating.

Nope, I am not speculating; I am speaking from experience with our designs using different amounts of memory, where I have precise numbers. You can also roughly calculate this by looking at the datasheet of the memories - no speculation needed.
The thing is, the correlation with used bandwidth is much higher than the correlation with capacity*. Glo just multiplied power by capacity, such that double the capacity took double the amount of power.

*I could imagine a combination where you have very low effective bandwidth usage while having huge capacity, where the capacity-based power starts to dominate.
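To make the scaling argument concrete, here is a toy model of memory subsystem power with a bandwidth-driven dynamic term and a capacity-driven refresh term. The coefficients (pJ/bit, W/GB of refresh) are made-up placeholders for illustration, not datasheet values:

```python
# Toy GDDR power model: dynamic (bandwidth-driven) power dominates,
# refresh (capacity-driven) power is a small adder.
# Coefficients below are illustrative assumptions, not real datasheet numbers.

def mem_power_w(bandwidth_gbs, capacity_gb,
                pj_per_bit=7.0, refresh_w_per_gb=0.1):
    # Dynamic power: (GB/s * 8 Gbit/GB) * pJ/bit -> milliwatt-scale units -> W
    dynamic_w = bandwidth_gbs * 8 * pj_per_bit / 1000
    # Refresh power: roughly proportional to capacity
    refresh_w = capacity_gb * refresh_w_per_gb
    return dynamic_w + refresh_w

base    = mem_power_w(760, 10)    # ~43.6 W under these assumed coefficients
doubled = mem_power_w(760, 20)    # ~44.6 W: double capacity adds only ~1 W
fast    = mem_power_w(1520, 10)   # ~86.1 W: double bandwidth nearly doubles power
print(base, doubled, fast)
```

Under this model, doubling capacity at fixed bandwidth adds only the refresh term, while doubling bandwidth nearly doubles the total, which is the point being argued: simply multiplying power by capacity overstates the cost of more VRAM.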
 
The R9 290X memory subsystem was using 80 W of power, with 6000 MHz GDDR5 and 16 memory chips.
The RX 480 memory subsystem was using 37 W of power, with 7000 MHz GDDR5 and 8 memory chips.

Well, in this case the R9 290X has double the bus width. It's not quite double the bandwidth, but it's very close. The RX 480's memory might have the advantage of using a better process as well.

Yes, memory is different because capacity doesn't have a linear relation with power consumption. There's some extra power if you double the number of chips, but nowhere near double. Of course, if you double capacity with the same chip count, the increase will be minimal.
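Putting rough numbers on that comparison, using the bus widths and data rates quoted above (the power figures are the ones cited in the earlier post; per-GB/s values are back-of-envelope):

```python
# Peak bandwidth = bus width (bits) * data rate (Gbps per pin) / 8 -> GB/s
def bandwidth_gbs(bus_bits, gbps):
    return bus_bits * gbps / 8

r9_290x = bandwidth_gbs(512, 6)   # 16 chips x 32-bit = 512-bit bus -> 384.0 GB/s
rx_480  = bandwidth_gbs(256, 7)   # 8 chips x 32-bit = 256-bit bus -> 224.0 GB/s

# Power per unit of bandwidth (W per GB/s), using the 80 W / 37 W figures above
print(80 / r9_290x)               # ~0.21 W per GB/s
print(37 / rx_480)                # ~0.17 W per GB/s (newer memory/process helps)
```

So the 290X carries roughly 1.7x the bandwidth at about 2.2x the memory power, which is broadly consistent with bandwidth (plus process generation) driving most of the difference rather than chip count alone.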
 
84 SM Ampere GPU - 40% Rasterization perf. improvement, over RTX 2080 Ti, 375W TGP - RTX 3090.
68 SM GPU - 10-15% rasterization improvement over RTX 2080 Ti, 320W TGP - RTX 3080.
48 SM GPU - 10% rasterization improvement over RTX 2080 Super - RTX 3070 Ti.
40 SM GPU - RTX 2080 performance - RTX 3070.
36 SM GPU - RTX 2070 Super performance - RTX 3060 Ti
30 SM GPU - RTX 2070 performance - RTX 3060
24 SM GPU - RTX 2060 performance - RTX 3050 Ti.
20 SM GPU - GTX 1660 Ti performance - RTX 3050.
LOL, nope, prepare for your second biggest expectations fail. The 84 SM GPU is 60% faster or more.
 
Wonder how that's going to work. Would be a strange bus layout compared to the other cards.

Sounds like the Founders Edition will have the irregular PCB and funky double-sided fans on the card. The partner cards will probably be three slots thick, with some interesting designs... maybe... hehe.
Probably some water-cooled versions.

As for the bus layout, what do you think?
 
It's hard to believe, because it would make these cards too good for deep learning stuff unless NV starts blocking that at the driver level. There would be very little need to get a Titan or Quadro card, which usually have more RAM.
Maybe they'll do some distributed deep learning to improve DLSS? Also, if so, spooky.
 
It's hard to believe, because it would make these cards too good for deep learning stuff unless NV starts blocking that at the driver level. There would be very little need to get a Titan or Quadro card, which usually have more RAM.

Yeah, that does sound like NVidia. Didn't they put a stop to using GeForce in datacenters, or was that words only and not actually anything that prevents the cards from being used?
 
Yeah, that does sound like NVidia. Didn't they put a stop to using GeForce in datacenters, or was that words only and not actually anything that prevents the cards from being used?

That's words only, but still legally binding. But I wasn't thinking about datacenter use, rather workstation use, for which you can use GeForce for anything you want. The limitation with GeForce usually was the amount of RAM. If your data doesn't fit into the GPU's memory, then training gets very slow, so a slower Quadro with more VRAM actually gives better performance. Hence my comment that this amount of VRAM is doubtful.
 
That's words only, but still legally binding. But I wasn't thinking about datacenter use, rather workstation use, for which you can use GeForce for anything you want. The limitation with GeForce usually was the amount of RAM. If your data doesn't fit into the GPU's memory, then training gets very slow, so a slower Quadro with more VRAM actually gives better performance. Hence my comment that this amount of VRAM is doubtful.
You wouldn't use a single card. That's the point. NVidia's software suite would probably prevent the workload from being run if it detects GeForce cards and not Quadros. There's an open-source project on the internet, but it doesn't get much love.

If you're talking about straight-up workstation use for rendering stuff, then certain software makes use of what Quadro offers at the hardware level beyond what GeForce does. Quadros use ECC VRAM. If you're doing heavy renders of a mechanical device or system in software like SolidWorks, a small error may really cost a lot. It's a no-brainer to use Quadro at that point, including the ability to render larger files as you pointed out, simulation work, and raw performance at the upper end of Quadros.

There's always going to be a major benefit to choosing Quadro cards, especially since no business is going to risk messing up because they saved a few thousand dollars.
 
LOL, nope, prepare for your second biggest expectations fail. The 84 SM GPU is 60% faster or more.
Are you a guy like Blue Nugroho from Twitter, who to this day believes the PS5 is not RDNA2-based?

Your posts in both threads suggest that.
 
Wonder how that's going to work. Would be a strange bus layout compared to the other cards.

It'd have to be either 2GB chips, or two 1GB chips sharing the same bus.

It does make sense from a financial standpoint; they can charge something in the range of the Ti.
 

Jensen is also doing a GTC keynote on October 5th. Probably Ampere Quadros are the focus, but they could also talk about the second-tier Ampere gaming cards.
 
It'd have to be either 2GB chips, or two 1GB chips sharing the same bus.

It does make sense from a financial standpoint; they can charge something in the range of the Ti.

Micron stated the cards would all have a base of 12GB. If this is true (and it should be, given the source), doubling the memory would bring you to 24GB, not 20. The only way to end up with 20 is to gimp the bus in some way. And while nVidia has done this before (GTX 970), I'm not sure they want to go down that road again.

Now, if the non-Founders cards do have 10GB, that would mean they are running a different memory bus than the FE cards, which would be really weird and something I don't think we have ever seen, as it would mean AIB cards would have less memory bandwidth than the FE cards.
 
Micron stated the cards would all have a base of 12GB. If this is true (and it should be, given the source), doubling the memory would bring you to 24GB, not 20. The only way to end up with 20 is to gimp the bus in some way. And while nVidia has done this before (GTX 970), I'm not sure they want to go down that road again.

The 3080 is 320 bit, so 10 (or 20). The 3090 is 384, so 12 or 24.
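The bus-width arithmetic behind those capacity options can be sketched quickly: with one 32-bit channel per GDDR6X chip and chips available in 1GB or 2GB densities, the bus width fixes the chip count and therefore the two possible capacities:

```python
# VRAM capacity options from bus width: one 32-bit channel per chip,
# chips come in 1GB or 2GB densities (8Gb / 16Gb).
def capacity_options_gb(bus_bits):
    chips = bus_bits // 32
    return (chips * 1, chips * 2)   # (all 1GB chips, all 2GB chips)

print(capacity_options_gb(320))     # RTX 3080, 320-bit: (10, 20)
print(capacity_options_gb(384))     # RTX 3090, 384-bit: (12, 24)
```

This is why 20GB only works on a 320-bit card and why a 12GB "base" implies a 384-bit bus; a 20GB figure on a 384-bit part would require disabling channels or mixing densities.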
 
Ahh, that tweet did specifically state the 3080 at the bottom. It would still be very strange to put that much memory on that card.
I'd imagine they'll launch a 20GB version if AMD has cards with more VRAM. For marketing, a bigger number is better.
 