Nvidia 2000-series: A bad value for DC?

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,278
3,902
75
Let's look at the specs for the new RTX 2000 series. A 2070 will have 2304 cores at a base clock of 1410MHz. A 1080 has 2560 cores at a base clock of 1607MHz. A GTX 1080 is going for as little as $440 right now. MSRPs for 2070s will be around $500.

Only two things are potentially improved with the 2000 series: RAM speed and IPC. The 2070 and 2080 have 14Gbps GDDR6, while the 1080 only has 10Gbps GDDR5X, all on a 256-bit bus. But how many projects are VRAM-limited? Maybe, I'm guessing, Folding@Home and PrimeGrid GFN? Any others? There was also mention that FP and integer work can be done simultaneously. That might help.
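As a rough sanity check on the figures above, here is a minimal Python sketch. It assumes peak memory bandwidth is simply the per-pin data rate times the bus width divided by 8, and it uses cores times base clock as a crude compute proxy that ignores boost clocks and any IPC change. The function names are made up for illustration.

```python
# Back-of-the-envelope comparison using only the figures quoted in the post.
# Assumption: peak bandwidth (GB/s) = data rate (Gbit/s per pin) * bus width (bits) / 8.

def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s."""
    return data_rate_gbps * bus_width_bits / 8


def core_clock_product(cores: int, base_clock_mhz: float) -> float:
    """Cores x base clock in 'core-GHz'; ignores boost clocks and IPC."""
    return cores * base_clock_mhz / 1000


cards = {
    "GTX 1080": {"cores": 2560, "clock": 1607, "rate": 10, "bus": 256},
    "RTX 2070": {"cores": 2304, "clock": 1410, "rate": 14, "bus": 256},
}

for name, c in cards.items():
    bw = memory_bandwidth_gbs(c["rate"], c["bus"])
    cc = core_clock_product(c["cores"], c["clock"])
    print(f"{name}: {bw:.0f} GB/s, {cc:.0f} core-GHz")
```

On these assumptions the 2070 has noticeably more bandwidth but a smaller cores-times-clock product than the 1080, so whether it comes out ahead depends on how memory-bound a given project is and on whatever IPC gain Turing delivers.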

Didn't I see somebody around here mention they had access to a Volta Quadro or something? If anyone here does, how's the IPC on those shaders? (PrimeGrid PPS Sieve would tell you.) But, then, Turing seems to be different from Volta anyway.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,728
14,758
136
I just ordered another 1080TI for $730. With the prices they are charging for the 2xxx series, I still win in price/perf.
 

UsandThem

Elite Member
May 4, 2000
16,068
7,380
146
People can't really blame them for the high prices. They have no true competitor, and as they found out during the mining craze, many people were willing to pay significantly more than MSRP for their cards. By charging more upfront, they will keep more of the money instead of the distributors and retailers. I bet their stockholders and investors are doing back-flips tonight.

I have to say though the prices are crazy (IMHO), and I think a lot of people who just game will spend more time on their consoles instead of dropping $500 - $1200 on a GPU. There will still be enough people who might bite their tongue, and still reluctantly shell out the money for a new GPU despite the high prices, where they will be fine for a while. However, if this is the pricing trend going forward, it can very well backfire on them.

Hopefully AMD can have a competitive product in the next several years, and who knows, maybe Intel's second venture into discrete GPUs will go a hell of a lot better than their first attempt at it. ;)
 

lane42

Diamond Member
Sep 3, 2000
5,721
624
126
These are Turings, not Voltas?
Are these what we want for D.C., or are the Voltas coming?
A grand for a 2080ti :mad:
 

UsandThem

These are Turings, not Voltas?
Are these what we want for D.C., or are the Voltas coming?
A grand for a 2080ti :mad:

$1200 for the Founders Edition. MSRP of $999 for AIB, but I imagine there will be next to no cards at that price. Just like Pascal, the AIB cards should all be well over the Founders Edition's price. My personal bet is that almost all of the AIB cards for sale will be around the $1200 mark, and go up from there.
 

Ken g6

I think Nvidia's naming convention is alphabetical. Fermi < Kepler < Maxwell < Pascal < Turing < Volta. So consumer Volta might be coming.
 

StefanR5R

Elite Member
Dec 10, 2016
5,677
8,208
136
I think Nvidia's naming convention is alphabetical. Fermi < Kepler < Maxwell < Pascal < Turing < Volta. So consumer Volta might be coming.
But...
Ryan Smith said:
For complete details on the Turing architecture, please see our companion article. But in short Turing is an evolution of the Volta architecture, taking everything that made the GV100 fast, and then improving on it.
On the other hand, all of the new features for neural network acceleration and raytracing acceleration are irrelevant to existing Distributed Computing applications, which are FP32 centric. (FP64 projects, notably Milkyway, being the rare exceptions, are not benefiting from those features either.)

I am aware of one change in Volta over Pascal which affects FP32, but I don't quite understand whether or not existing applications benefit from it: The CUDA cores (re-?)gained individual program counters and stacks, allowing for finer grained thread scheduling. I have only briefly looked at the articles on Turing so far, and am not sure whether this update of Volta was carried over to Turing. — Edit, also, L1 and L2 caches in Volta were tweaked, but the corresponding details for Turing are not yet published.

There was an arguably small step up in process technology from TSMC 16 nm FinFET (Pascal) to TSMC 12 nm FFN (Volta and Turing), which promises somewhat increased performance per Watt; though not as much as the step from TSMC 28 nm (Maxwell) to Pascal — AFAIU.

Looking at specs that are relevant to FP32 GPGPU computing:

1070 vs. 2070
150 W : 175 W (1 : 1.167)
1920 shaders : 2304 shaders (1 : 1.200)
The shader count was increased a little bit more than the power target.
This is good for performance as well as for perf/Watt, at least in workloads which are able to utilize all shaders.

1080 vs. 2080
180 W : 215 W (1 : 1.194)
2560 shaders : 2944 shaders (1 : 1.150)
The shader count was not increased as much as the power target.
While performance should go up, this is a bad sign for perf/Watt.

1080Ti vs. 2080Ti
250 W : 250 W (1 : 1.000)
3584 shaders : 4352 shaders (1 : 1.214)
The shader count was increased, but not the power target.
Good for perf/Watt and for performance, at least in workloads which are able to utilize all shaders.
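The three comparisons above can be reproduced with a short Python sketch. It uses only the power targets and shader counts listed here; "shaders per Watt" is just a proxy and assumes equal clocks and full shader utilization, which real workloads may not reach.

```python
# Power target (W) and shader count for each Pascal/Turing pair, as listed above.
pairs = [
    ("1070", "2070", 150, 175, 1920, 2304),
    ("1080", "2080", 180, 215, 2560, 2944),
    ("1080Ti", "2080Ti", 250, 250, 3584, 4352),
]


def ratios(old_w, new_w, old_shaders, new_shaders):
    """Return (power ratio, shader ratio, shaders-per-Watt ratio), new vs. old."""
    power = new_w / old_w
    shaders = new_shaders / old_shaders
    return power, shaders, shaders / power


for old, new, old_w, new_w, old_s, new_s in pairs:
    power, shaders, per_watt = ratios(old_w, new_w, old_s, new_s)
    print(f"{old} vs. {new}: power x{power:.3f}, shaders x{shaders:.3f}, "
          f"shaders/Watt x{per_watt:.3f}")
```

A shaders-per-Watt ratio above 1 corresponds to the "good for perf/Watt" cases, and a ratio below 1 to the 1080 vs. 2080 case.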

(Note, there are plenty of Distributed Computing applications which are not able to utilize all shaders out of the box. IOW they do not scale well to GPUs with higher shader count. In Folding@home, this can be partially fixed by switching from Windows to Linux. In BOINC, there are fixes like running two or more jobs on the same GPU at once, or giving arcane command line arguments in app_config which are specific to the particular application, or by finding optimized applications from 3rd parties.)
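For the BOINC fix of running two jobs on one GPU, here is a minimal Python sketch that writes out an app_config.xml. The element names (app_config, app, name, gpu_versions, gpu_usage, cpu_usage) follow BOINC's client configuration format as far as I know it; the application name "example_app" is a placeholder, and the real short name has to be looked up for each project. The helper function itself is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, jobs_per_gpu: int, cpus_per_job: float = 1.0) -> str:
    """Build an app_config.xml that lets `jobs_per_gpu` tasks share one GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # placeholder; project-specific
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage is the fraction of a GPU each task claims: 0.5 means two tasks per GPU.
    ET.SubElement(gpu, "gpu_usage").text = str(1 / jobs_per_gpu)
    ET.SubElement(gpu, "cpu_usage").text = str(cpus_per_job)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(build_app_config("example_app", jobs_per_gpu=2))
```

The resulting file would go into the project's directory under the BOINC data directory, and the client picks it up after re-reading its config files or restarting. Whether two concurrent jobs actually raise throughput depends on the application.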

Edit,
Only two things are potentially improved with the 2000 series: RAM speed and IPC. The 2070 and 2080 have 14Gbps GDDR6, while the 1080 only has 10Gbps GDDR5X, all on a 256-bit bus. But how many projects are VRAM-limited? Maybe, I'm guessing, Folding@Home and PrimeGrid GFN?
Good point. I am seeing appreciable memory controller utilization in SETI@home/cuda90 as well. On average not as high as shader utilization, but with occasional peaks. Having headroom for these peaks may help overall throughput a little bit.
 

Howdy

Senior member
Nov 12, 2017
572
480
136
$1200 for the Founders Edition. MSRP of $999 for AIB, but I imagine there will be next to no cards at that price. Just like Pascal, the AIB cards should all be well over the Founders Edition's price. My personal bet is that almost all of the AIB cards for sale will be around the $1200 mark, and go up from there.

With these "higher" price schemes the 20xx has, it has also kept the 10xx series pricing inflated. The 10xx have relaxed a little but are still at or slightly under original pricing from the AIB from what I am seeing.
 

Ken g6

On the other hand, all of the new features for neural network acceleration and raytracing acceleration are irrelevant to existing Distributed Computing applications, which are FP32 centric.
They don't appear to help my favorite project, PPS Sieve, either, since I made it INT64-centric. o_O

I hadn't seen that Volta article, thanks!

What I'm trying to figure out is this: If you have 2304 "CUDA cores", do you have 2304 INT32 cores and 2304 FP32 cores, or 1152 of each?
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,208
1,146
136
Remember when the Titan came out and it was either $899 or $999? That was the top card at the end of every Nvidia product cycle for the last few generations. Now they raise the bar in an industry that has been flat for years. I guess 1080Ti's will be the fire sale item after bitcoin completely fails.
 

StefanR5R

On the other hand, all of the new features for neural network acceleration and raytracing acceleration are irrelevant to existing Distributed Computing applications, which are FP32 centric.
They don't appear to help my favorite project, PPS Sieve, either, since I made it INT64-centric. o_O
I totally forgot that not all projects leave the point floating around... I am idly wondering, are Amicable, Collatz, Enigma, RC5-72 operating on integer data too?

What I'm trying to figure out is this: If you have 2304 "CUDA cores", do you have 2304 INT32 cores and 2304 FP32 cores, or 1152 of each?
From what I took from the news pieces so far: Superficially, you have 2304 "cores" of each. Turing intro article:
Ryan Smith said:
[...] the Turing architecture Streaming Multiprocessor (SM) itself is also learning some new tricks. In particular here, it’s inheriting one of Volta’s more novel changes, which saw the Integer cores separated out into their own blocks, as opposed to being a facet of the Floating Point CUDA cores.
But perhaps it is better to say just "ALU", not "core". Volta first quick look:
[Image: Volta SM diagram (volta_sm_575px.png)]
Ryan Smith said:
Now there are a bunch of unknowns here, including how flexible these cores are, and how much die space that they take up versus FP32 CUDA cores. But at a high level, this is looking like a relatively rigid core, which would make it very die-space efficient. By lumping together so many ALUs within a single core and without duplicating their control logic or other supporting hardware, the percentage of transistors in a core dedicated to ALUs is higher than on a standard CUDA core. The cost is flexibility, as the hardware to enable flexibility takes up space. So this is a very conscious tradeoff on NVIDIA’s part between flexibility and total throughput.
IOW there are separate arithmetic logic units, but they share register files, cache, dispatch/scheduler units, etc. On the other hand, much of the shared stuff has been beefed up vs. Pascal also, e.g. the added scheduling hardware.
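To get a feel for what separate integer and floating-point ALUs could mean for a given instruction mix, here is a deliberately crude Python toy model. It is an assumption-laden upper bound, not a description of how Turing actually schedules: it pretends INT and FP instructions can overlap perfectly and ignores the shared dispatch/scheduler units, data dependencies and memory stalls.

```python
def ideal_speedup(int_fraction: float) -> float:
    """Best-case speedup of separate INT/FP ALUs over one shared ALU.

    Toy model: with a shared ALU, every instruction takes one slot, so the
    cost is int_fraction + fp_fraction = 1. With separate ALUs that overlap
    perfectly, the cost is whichever stream is longer.
    """
    if not 0.0 <= int_fraction <= 1.0:
        raise ValueError("int_fraction must be between 0 and 1")
    fp_fraction = 1.0 - int_fraction
    return 1.0 / max(int_fraction, fp_fraction)


# Hypothetical instruction mixes, from pure FP to pure integer.
for f in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(f"integer share {f:.2f}: ideal speedup x{ideal_speedup(f):.2f}")
```

In this model a purely integer workload (or a purely FP one) gains nothing from the split, while a balanced mix gains the most. Real gains would be smaller because the shared hardware limits how many instructions can be issued at once.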
 

lane42

UsandThem nailed it. Saw preorders at microcenter for 2080ti, $1150-1250.
1080ti's in the $650 range, almost half the price...
 

Shmee

Memory & Storage, Graphics Cards Mod Elite Member
Super Moderator
Sep 13, 2008
7,522
2,533
146
I would be interested in seeing how these do in terms of Ethereum mining and Equihash/Zhash. That said, the high price means they still may not be worth it. We will see, plus the software takes time to develop.
 

StefanR5R

Reviews which include something resembling a computational benchmark:
  • AnandTech should have FAHBench when they get their review out.
No other reviewer ever heard of consumer GPUs being used for computational workloads. I am not surprised, but am nevertheless rolling my eyes.
 

Markfw

I saw one bench that showed it about 25% faster than the current 1080TI, but at almost twice the price, so I am not biting anytime soon.
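The price/perf argument can be put into numbers with a small Python sketch. The 25% speedup is the figure from this post; the prices are the preorder figures reported earlier in the thread ($650 for a 1080 Ti, $1150 to $1250 for a 2080 Ti), with $1200 taken as an assumed midpoint. The function names are made up for illustration.

```python
def perf_per_dollar_ratio(speedup: float, price_new: float, price_old: float) -> float:
    """Perf per dollar of the new card relative to the old one (1.0 = equal value)."""
    return speedup / (price_new / price_old)


def break_even_speedup(price_new: float, price_old: float) -> float:
    """Speedup the new card needs just to match the old card's perf per dollar."""
    return price_new / price_old


speedup = 1.25       # "about 25% faster", from the benchmark mentioned above
price_1080ti = 650   # preorder-era price reported earlier in the thread
price_2080ti = 1200  # assumed midpoint of the reported $1150-1250 range

value = perf_per_dollar_ratio(speedup, price_2080ti, price_1080ti)
needed = break_even_speedup(price_2080ti, price_1080ti)
print(f"2080 Ti perf per dollar vs. 1080 Ti: x{value:.2f}")
print(f"Speedup needed to break even: x{needed:.2f}")
```

This leaves out power cost, which matters for cards that crunch around the clock, so it is only a first-order comparison.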
 

StefanR5R

This may be reference TDP (and ~reference clocks), not Founders Edition.
On page 5 Nate Oh said:
Because NVIDIA is not productizing any other reference-quality GeForce RTX 2080 Ti and 2080 card besides the Founders Editions, which are non-reference by specifications, we've gone ahead and emulated the true reference specifications with a 90MHz downclock and lowering the TDP by roughly 10W. This is to keep comparisons standardized and apples-to-apples, as we always look at reference-to-reference results.
Either the "FE" designation is missing by mistake in the FAHBench graph, or he ran this test indeed only with the -10 W TDP modification on the 20 series cards.

So whether it's one or the other is not 100 % sure. The other problem which I have with this result is that I wonder how much the 1080 FE and 1080 Ti FE cards are constrained by their blower style cooler in this test. For a given chip and workload, lesser cooler performance causes the chip to work at a slightly worse frequency:voltage combination. (When hotter, the chip needs higher voltage at same frequency.)
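The effect of a worse frequency:voltage combination can be sketched in Python with the common first-order approximation that dynamic power scales with voltage squared times frequency. The voltages below are hypothetical, and the model leaves out leakage power, which also rises with temperature.

```python
def relative_dynamic_power(v_new: float, v_old: float,
                           f_new: float = 1.0, f_old: float = 1.0) -> float:
    """Ratio of dynamic power under the approximation P ~ C * V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Hypothetical example: a hotter chip needing 1.05 V instead of 1.00 V
# to hold the same clock frequency.
ratio = relative_dynamic_power(1.05, 1.00)
print(f"Dynamic power at the same clock: x{ratio:.4f}")
```

Under a fixed power target, that extra power has to come out of clock speed instead, which is why a weaker cooler can cost some performance in a benchmark like this.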

After I saw AnandTech's FAHBench graph I was keen on seeing Phoronix' FAHBench results because it would have shown not only the performance but also the power consumption (while also having the influence of blower style cooler for 10 series vs. open cooler for 20 series). And it would have been Linux results.
 

Markfw

The AnandTech review has FAHBench results. These are at Founders Edition clock speeds, I believe. I guess if you can get a 2080 for about the same price as a 1080Ti, then the 2080 is the way to go for folding.

[Image: AnandTech FAHBench results graph (100948.png)]
So it's $1350 vs $719 for the 2080TI vs the 1080TI FTW3 top card. The 1080TI they benched was stock, so I bet the FTW3 would do better.

BTW, I downloaded it and ran it on Win 10 (score 86) and Linux (score 95) with the default parameters. He does not say what he used, but it can't be those parameters.

So if anyone knows what was used, let me know and I will run it again, but I still say I doubt those numbers using a 1080TI FTW3.

Edit: Not sure what parameters they used, but if I choose nav for the work unit and OpenCL, I get 156.6 in Windows.
 

alcoholbob

Diamond Member
May 24, 2005
6,271
323
126
There's no obvious IPC improvement with Turing over Volta with regard to pure rasterization ability. And RTX cards are no faster than the 10 series when it comes to per-core rasterization in DX11. The improvements are all from asynchronous compute, which carries over to about a 20% IPC improvement in low-level API rasterization.
 

Wiz

Diamond Member
Feb 5, 2000
6,459
16
81
Thanks, I came looking for exactly this info.
I need to replace a workstation and was thinking of waiting until the vendor I use has 2080 cards in systems, but it looks like I'd be farther ahead if I went with a pair of 1080Ti cards.
 