
'Ampere'/Next-gen gaming uarch speculation thread

Page 87

Ottonomous

Senior member
How much is the Samsung 7nm EUV process expected to provide in terms of gains?
How will the RTX components be scaled/developed?
Any major architectural enhancements expected?
Will VRAM be bumped to 16/12/12 for the top three?
Will there be further fragmentation in the lineup? (Keeping turing at cheaper prices, while offering 'beefed up RTX' options at the top?)
Will the top card be capable of >4K60, at least 90?
Would Nvidia ever consider an HBM implementation in the gaming lineup?
Will Nvidia introduce new proprietary technologies again?

Sorry if imprudent/uncalled for, just interested in the forum members' thoughts.
 
Lol, power scales with bandwidth, not with size... and GDDR6X has higher bandwidth efficiency.
Anyway, your calculation is nonsense.

Memory power scales with bandwidth and capacity. Do you think double the VRAM on the RTX Titan Ampere is going to come free of cost? Hint: the 17 Gbps on the Titan Ampere vs the 21 Gbps on the RTX 3090 is for power reasons.


2nd Gen NVIDIA TITAN GA102-400-A1 5376 24GB 17Gbps
GeForce RTX 3090 GA102-300-A1 5248 12GB 21Gbps
GeForce RTX 3080 GA102-200-Kx-A1 4352 10GB 19Gbps

Here is a review of the RX 5500 XT 4GB and 8GB from Sapphire, which are identical other than VRAM capacity. The 8GB version draws more power than the 4GB version.

 
Memory power scales with bandwidth and capacity. Do you think double the VRAM on the RTX Titan Ampere is going to come free of cost? Hint: the 17 Gbps on the Titan Ampere vs the 21 Gbps on the RTX 3090 is for power reasons.

The power adder based on capacity is mostly the power needed for refreshing the cells, which is much smaller than the dynamic power required for accessing a page. So Glo's calculation continues to be nonsense.

I also have a hint for you: the 8 GByte version drawing more power than the 4 GByte version is also attributable to the fact that the 8 GByte version is doing more work per unit of time (e.g. games running faster).

And now go back and read again what Glo claimed/calculated before nitpicking that power-bandwidth scaling is not totally correct.
 
The power adder based on capacity is mostly the power needed for refreshing the cells, which is much smaller than the dynamic power required for accessing a page. So Glo's calculation continues to be nonsense.

I wanted to point out that your statement about power scaling only with bandwidth and not size was incorrect. BTW, you do not have a detailed breakdown of memory power scaling with capacity, so one could argue you are speculating.
 
I wanted to point out that your statement about power scaling only with bandwidth and not size was incorrect. BTW, you do not have a detailed breakdown of memory power scaling with capacity, so one could argue you are speculating.

Nope, I am not speculating; I am speaking from experience with our designs using different amounts of memory, where I have precise numbers. You can also roughly calculate this by looking at the datasheet of the memories - no speculation needed.
The thing is, the correlation with used bandwidth is much higher than the correlation with capacity*. Glo just multiplied power by capacity, such that double the capacity took double the amount of power.

*I could imagine a combination where you have very low effective bandwidth usage while having huge capacity, where the capacity-based power starts to dominate.
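To make the scaling argument concrete, here is a toy model of memory subsystem power with a bandwidth-driven dynamic term and a capacity-driven refresh term. The coefficients (pJ/bit, W/GB of refresh) are made-up placeholders for illustration, not datasheet values:

```python
# Toy GDDR power model: dynamic (bandwidth-driven) power dominates,
# refresh (capacity-driven) power is a small adder.
# Coefficients below are illustrative assumptions, not real datasheet numbers.

def mem_power_w(bandwidth_gbs, capacity_gb,
                pj_per_bit=7.0, refresh_w_per_gb=0.1):
    # Dynamic power: (GB/s * 8 Gbit/GB) * pJ/bit -> milliwatt-scale units -> W
    dynamic_w = bandwidth_gbs * 8 * pj_per_bit / 1000
    # Refresh power: roughly proportional to capacity
    refresh_w = capacity_gb * refresh_w_per_gb
    return dynamic_w + refresh_w

base    = mem_power_w(760, 10)    # ~43.6 W under these assumed coefficients
doubled = mem_power_w(760, 20)    # ~44.6 W: double capacity adds only ~1 W
fast    = mem_power_w(1520, 10)   # ~86.1 W: double bandwidth nearly doubles power
print(base, doubled, fast)
```

Under this model, doubling capacity at fixed bandwidth adds only the refresh term, while doubling bandwidth nearly doubles the total, which is the point being argued: simply multiplying power by capacity overstates the cost of more VRAM.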
 
The R9 290X memory subsystem was using 80 W of power, with 6000 MHz GDDR5 and 16 memory chips.
The RX 480 memory subsystem was using 37 W of power, with 7000 MHz GDDR5 and 8 memory chips.

Well, in this case the R9 290X has double the bus width. It's not quite double the bandwidth, but it's very close. The RX 480's memory might have the advantage of using a better process as well.

Yes, memory is different because capacity doesn't have a linear relation with power consumption. There's some extra power if you double the number of chips, but nowhere near double. Of course, if you double capacity with the same chip count, the increase will be minimal.
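Putting rough numbers on that comparison, using the bus widths and data rates quoted above (the power figures are the ones cited in the earlier post; per-GB/s values are back-of-envelope):

```python
# Peak bandwidth = bus width (bits) * data rate (Gbps per pin) / 8 -> GB/s
def bandwidth_gbs(bus_bits, gbps):
    return bus_bits * gbps / 8

r9_290x = bandwidth_gbs(512, 6)   # 16 chips x 32-bit = 512-bit bus -> 384.0 GB/s
rx_480  = bandwidth_gbs(256, 7)   # 8 chips x 32-bit = 256-bit bus -> 224.0 GB/s

# Power per unit of bandwidth (W per GB/s), using the 80 W / 37 W figures above
print(80 / r9_290x)               # ~0.21 W per GB/s
print(37 / rx_480)                # ~0.17 W per GB/s (newer memory/process helps)
```

So the 290X carries roughly 1.7x the bandwidth at about 2.2x the memory power, which is broadly consistent with bandwidth (plus process generation) driving most of the difference rather than chip count alone.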
 
84 SM Ampere GPU - 40% Rasterization perf. improvement, over RTX 2080 Ti, 375W TGP - RTX 3090.
68 SM GPU - 10-15% rasterization improvement over RTX 2080 Ti, 320W TGP - RTX 3080.
48 SM GPU - 10% rasterization improvement over RTX 2080 Super - RTX 3070 Ti.
40 SM GPU - RTX 2080 performance - RTX 3070.
36 SM GPU - RTX 2070 Super performance - RTX 3060 Ti
30 SM GPU - RTX 2070 performance - RTX 3060
24 SM GPU - RTX 2060 performance - RTX 3050 Ti.
20 SM GPU - GTX 1660 Ti performance - RTX 3050.
LOL, nope, prepare for your second biggest expectations fail. The 84 SM GPU is 60% faster or more.
 
Wonder how that's going to work. Would be a strange bus layout compared to the other cards.

Sounds like the Founders Edition will have the irregular PCB and funky double-sided fans on the card. The partner cards will probably be three slots thick, with some interesting designs... maybe... hehe.
Probably some water-cooled versions.

As for the bus layout, what do you think?
 
It's hard to believe, because it would make these cards too good for deep learning stuff unless NV starts blocking that at the driver level. There would be very little need to get a Titan or Quadro card, which usually have more RAM.
Maybe they'll do some distributed deep learning to improve DLSS? Also, if so, spooky.
 
It's hard to believe, because it would make these cards too good for deep learning stuff unless NV starts blocking that at the driver level. There would be very little need to get a Titan or Quadro card, which usually have more RAM.

Yeah, that does sound like NVidia. Didn't they put a stop to using GeForce in datacenters, or was that words only and not actually anything that prevents the cards from being used?
 
Yeah, that does sound like NVidia. Didn't they put a stop to using GeForce in datacenters, or was that words only and not actually anything that prevents the cards from being used?

That's words only, but still legally binding. But I wasn't thinking about datacenter use, rather workstation use, for which you can use GeForce for anything you want. The limitation with GeForce usually was the amount of RAM. If your data doesn't fit into the GPU's memory, then training gets very slow, so a slower Quadro with more VRAM actually gives better performance. Hence my comment that this amount of VRAM is doubtful.
 
That's words only, but still legally binding. But I wasn't thinking about datacenter use, rather workstation use, for which you can use GeForce for anything you want. The limitation with GeForce usually was the amount of RAM. If your data doesn't fit into the GPU's memory, then training gets very slow, so a slower Quadro with more VRAM actually gives better performance. Hence my comment that this amount of VRAM is doubtful.
You wouldn't use a single card. That's the point. NVidia's software suite would probably prevent the workload from being run if it detects GeForce cards and not Quadros. There's an open-source project on the internet, but it doesn't get much love.

If you're talking about straight-up workstation use for rendering stuff, then certain software makes use of what Quadro offers at the hardware level beyond what GeForce does. Quadros use ECC VRAM. If you're doing heavy renders of a mechanical device or system in software like SolidWorks, a small error may really cost a lot. It's a no-brainer to use Quadro at that point, including the ability to render larger files as you pointed out, simulation work, and raw performance at the upper end of Quadros.

There's always going to be a major benefit to choosing Quadro cards, especially since no business is going to risk messing up because they saved a few thousand dollars.
 
LOL, nope, prepare for your second biggest expectations fail. The 84 SM GPU is 60% faster or more.
Are you a guy like Blue Nugroho from Twitter, who to this day believes the PS5 is not RDNA2-based?

Your posts in both threads suggest that.
 
Wonder how that's going to work. Would be a strange bus layout compared to the other cards.

It'd have to be either 2GB chips, or two 1GB chips sharing the same bus.

It does make sense from a financial standpoint; they can charge something in the range of the Ti.
 

Jensen is also doing a GTC keynote on October 5th. Probably Ampere Quadros are the focus, but they could also talk about the second-tier Ampere gaming cards.
 
It'd have to be either 2GB chips, or two 1GB chips sharing the same bus.

It does make sense from a financial standpoint; they can charge something in the range of the Ti.

Micron stated the cards would all have a base of 12GB. If this is true (and it should be, given the source), doubling the memory would bring you to 24GB, not 20. The only way to end up with 20 is to gimp the bus in some way. And while nVidia has done this before (GTX 970), I'm not sure they want to go down that road again.

Now, if the non-Founders cards do have 10GB, that would mean they are running a different memory bus than the FE cards, which would be really weird and something I don't think we have ever seen, as it would mean AIB cards would have less memory bandwidth than the FE cards.
 
Micron stated the cards would all have a base of 12GB. If this is true (and it should be, given the source), doubling the memory would bring you to 24GB, not 20. The only way to end up with 20 is to gimp the bus in some way. And while nVidia has done this before (GTX 970), I'm not sure they want to go down that road again.

The 3080 is 320 bit, so 10 (or 20). The 3090 is 384, so 12 or 24.
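The bus-width arithmetic behind those capacity options can be sketched quickly: with one 32-bit channel per GDDR6X chip and chips available in 1GB or 2GB densities, the bus width fixes the chip count and therefore the two possible capacities:

```python
# VRAM capacity options from bus width: one 32-bit channel per chip,
# chips come in 1GB or 2GB densities (8Gb / 16Gb).
def capacity_options_gb(bus_bits):
    chips = bus_bits // 32
    return (chips * 1, chips * 2)   # (all 1GB chips, all 2GB chips)

print(capacity_options_gb(320))     # RTX 3080, 320-bit: (10, 20)
print(capacity_options_gb(384))     # RTX 3090, 384-bit: (12, 24)
```

This is why 20GB only works on a 320-bit card and why a 12GB "base" implies a 384-bit bus; a 20GB figure on a 384-bit part would require disabling channels or mixing densities.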
 
Ahh, that tweet did specifically state the 3080 at the bottom. It would still be very strange to put that much memory on that card.
I'd imagine they'll launch a 20GB version if AMD has cards with more VRAM. For marketing, a bigger number is better.
 