NVIDIA Pascal Thread

alcoholbob · Mar 19, 2016

casiofx said:
We can get 2x efficiency by having the same performance of 980ti but half the power consumption.

Maybe the reason some people are disappointed is that they want the big Pascal, the same way people are disappointed that Polaris is not the Fury X replacement.

The other reason people are disappointed is that NVIDIA has a staggered release schedule to maximize profits at customers' expense. If you have a 980 Ti and the 1080 comes out, your card is instantly devalued by $100-200, depending on what the MSRP comes out at, but there's no product on the market that actually beats it in performance. It's simply a devaluation caused by a timed price drop that maximizes NVIDIA's profits at customers' expense.

3DVagabond · Mar 19, 2016

casiofx said:
We can get 2x efficiency by having the same performance of 980ti but half the power consumption.

Maybe the reason some people are disappointed is that they want the big Pascal, the same way people are disappointed that Polaris is not the Fury X replacement.

Did you miss the /s? All I'm talking about was the reported clocks and I was joking.

Sweepr · Mar 19, 2016

Don't want to start a new thread so I'm posting here. GDC paper from Ville Timonen, Graphics Programmer from Remedy Entertainment about Quantum Break and DX12:

GPU perf: Do things right, match DX11
Not trivial on all architectures
Messing up GPU mem mgmt can be costly

CPU perf: Easy to outperform DX11
But are you really API overhead bound?
Instancing, LODding, good culling: You're not swamping the driver with draws.

DX11 drivers are able to circumvent HW pitfalls. We're matching DX11 GPU perf on Maxwell + AMD.
CPU perf: Sure, DX12 can be much faster, but if your engine design is such that you don't swamp the API with draw calls, the actual API overhead might not be significant in your overall CPU cost. We saved ~10% overall renderer time.

http://wili.cc/research/northlight_dx12/GDC16_Timonen_Northlight_DX12.pptx
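The point about API overhead can be put in numbers: the overall CPU saving is capped by the share of CPU time that actually goes to the API. Below is a minimal Python sketch of that bound. The function name `overall_saving` and the example shares and speedup are hypothetical illustrations, not figures from the slides.

```python
def overall_saving(api_share: float, api_speedup: float) -> float:
    """Fraction of total CPU time saved when only the API-bound share gets faster.

    api_share   -- fraction of CPU time spent in API/driver overhead (0..1), hypothetical
    api_speedup -- how many times faster that share becomes under the new API
    """
    if not 0.0 <= api_share <= 1.0:
        raise ValueError("api_share must be between 0 and 1")
    if api_speedup <= 0:
        raise ValueError("api_speedup must be positive")
    # Only the API share shrinks; the rest of the frame is unchanged.
    return api_share * (1.0 - 1.0 / api_speedup)


if __name__ == "__main__":
    # Illustrative shares only: a draw-call-light engine vs a draw-call-heavy one.
    for share in (0.05, 0.20, 0.50):
        saving = overall_saving(share, api_speedup=2.0)
        print(f"API share {share:.0%}: overall saving {saving:.1%}")
```

Under these assumed inputs, an engine that spends a fifth of its CPU time in the API saves about a tenth overall even if the API cost is halved, and an engine that spends 5% there saves far less.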

Silverforce11 · Mar 19, 2016

@Sweepr

DX12 is primarily about CPU overhead reduction & Multi-Thread Rendering.

For GPUs to benefit from these APIs in GPU-bound scenarios, the hardware needs to support Async Compute, which gives it "multi-thread" capabilities by using multiple engines to process render queues in parallel.

So if they aren't using Async Compute, the performance uplift will come only from the MTR/CPU side of the engine, and if they aren't CPU bound, then no improvements are expected.

Nothing we didn't know already.

xorbe · Mar 19, 2016

IllogicalGlory said:
These aren't mobile. They have desktop CPUs and desktop motherboards.

Gigabyte Technology Co., Ltd. H67M-D2-B3
ASRock Z170 Extreme6

For the "980 Ti" and "970" respectively.

You put them on cards for initial testing in the lab?

ShintaiDK · Mar 19, 2016

xorbe said:
You put them on cards for initial testing in the lab?

MXM modules in a PCIe slot via a converter would do.

moonbogg · Mar 19, 2016

I can't wait for the Pascal and the Skylake-E! Instead of TWO hundred FPS I'll get THREE!

raghu78 · Mar 19, 2016

alcoholbob said:
Only the metal gates are 16nm with TSMC 16FF+; the silicon is still the 20nm that TSMC was demoing all the way back in 2011-2012. TSMC has literally been making silicon on this manufacturing process for 5+ years. I'm not surprised they are ready for a big die by Q1 2017; it's literally year 6 on this process since they first taped out working commercial silicon.

Sorry, but demoing is not the same as volume production. Apple's A8 and A8X were the first high-volume TSMC 20nm chips. They launched in Q3/Q4 2014. 16FF+ products launched in Q3 2015, roughly a year after TSMC 20nm products launched. We are likely to see Nvidia 16FF+ GPUs in Q3 2016, roughly 9-12 months after the first TSMC 16FF+ products. Apple now commands wafer allocation at the bleeding-edge foundry nodes. The rest have to wait in a queue for allocation.

JDG1980 · Mar 19, 2016

Sweepr said:
We've had rumours about 2 Pascal GPUs being launched after Computex. This could very well be GP106 or a low-clocked GP104 ES, who knows.

I wouldn't be surprised if at least one of the 3DMark results was from GP106. The Drive PX 2 has been promised (at least for limited private release) relatively soon, and some basic deduction indicates it will probably be using GP106: the TFlops rating is what we currently see with GM204 (and we know that the prototypes actually use GM204), so with a die shrink, that level of performance should drop down one chip size. Thus, we should expect to be seeing some GP106 tests. Of course, that says little about when it will be released to the public. We also have to take into account the possibility that current early samples are being tested with GDDR5 and the actual product will have GDDR5X.

Silverforce11 · Mar 19, 2016

Interesting discussion here for upcoming GP106 and GP104.

http://www.overclock.net/t/1595065/...performance-entries-spotted/210#post_25003734

Basically cut down GP104 (970 replacement) = 980Ti.

Full GP106 is ~40% above 960, or around the 970 performance.

One can assume the full GP104 to be 980Ti +20% or so based on the 970/980 relationship.

That's actually a very plausible result, shrinking Maxwell GM200 down into a mid-range GP104 will get that kind of performance profile, along with a few uarch tweaks.

Head1985 · Mar 20, 2016

Nope... Epic fail if true.
The 980 Ti is only 30-35% faster than the GTX 970. They just can't release a 1070 with +30% performance against the 224-bit, 56 ROPs, 3.5GB GTX 970. They need to deliver a +50-60% performance boost on a NEW NODE to force people to upgrade.
Also, the GTX 970 is already very, very heavily cut down and is still on average only 15% slower than the GTX 980.
The 1070 won't be 20% slower. So if the 1070 is only at 980 Ti level, the 1080 will be only 10% faster than the Titan X and 15% faster than the 980 Ti.

BTW the 980 Ti is NOT the flagship!! It's the equivalent of the 40nm era's GTX 570. Why is everyone comparing Pascal with the GTX 570? They need to compare it to the Titan X. That is the flagship.

To not be an epic fail on a new node:
1070: 50% faster than the GTX 970, i.e. 10% faster than the Titan X / 15% faster than the 980 Ti (but this would still be the worst jump on a new node ever)
1080: 50% faster than the GTX 980, i.e. 25% faster than the Titan X / 30% faster than the 980 Ti.

Edit: The GTX 970 is 60% faster than the GTX 960. Why would they increase GP106 performance by 60% and GP104 by only 30%? Makes zero sense.
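The percentages in this post are quoted against different baselines (970, 980, 980 Ti, 960), so they cannot simply be added or subtracted. A small Python sketch of the conversion follows. The helper name `rebase` is hypothetical, and the inputs are the poster's own figures, not measurements.

```python
def rebase(a_over_base: float, b_over_base: float) -> float:
    """Return how much faster A is than B, given both as 'x faster than base'.

    Inputs and output are fractions: 0.50 means '50% faster'.
    """
    return (1.0 + a_over_base) / (1.0 + b_over_base) - 1.0


if __name__ == "__main__":
    # Figures below are the ones quoted in the post, not benchmark results.
    ti_over_970 = 0.30        # 980 Ti vs 970 (low end of the quoted 30-35% range)
    gtx980_over_970 = 0.15    # 980 vs 970
    gtx970_over_960 = 0.60    # 970 vs 960

    # A 1070 that is 50% faster than the 970, expressed against the 980 Ti.
    print(f"1070 (+50% vs 970) vs 980 Ti: {rebase(0.50, ti_over_970):+.1%}")

    # A 1080 that is 50% faster than the 980, expressed against the 980 Ti.
    gtx1080_over_970 = (1.0 + gtx980_over_970) * 1.50 - 1.0
    print(f"1080 (+50% vs 980) vs 980 Ti: {rebase(gtx1080_over_970, ti_over_970):+.1%}")

    # A GP106 that is 40% faster than the 960, expressed against the 970.
    print(f"GP106 (+40% vs 960) vs 970: {rebase(0.40, gtx970_over_960):+.1%}")
```

With these inputs the ratios work out to roughly +15% and +33% against the 980 Ti, close to the 15% and 30% targets listed above, while a GP106 at +40% over the 960 would land about 12.5% below a 970 rather than level with it.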

Silverforce11 · Mar 20, 2016

980Ti is ~30% faster than 980.

Not 970. Unless you're talking OC 970 models.

Importantly, we do not know the die sizes of GP106 and GP104.

GM204 is a very big mid-range chip by NV's standards.

3DVagabond · Mar 20, 2016

People upgraded from 780 ti to 980. It doesn't need to be anywhere near 50% faster.

Head1985 · Mar 20, 2016

Silverforce11 said:
980Ti is ~30% faster than 980.

Not 970. Unless you're talking OC 970 models.

Where?
http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_980_Ti/31.html
1080p: 21% faster
1440p: 25% faster
VS GTX 970
1080p: 35% faster.

Silverforce11 · Mar 20, 2016

3DVagabond said:
People upgraded from 780 ti to 980. It doesn't need to be anywhere near 50% faster.

Indeed, it was around 10% at release; some sites even found it less than that (a lower-boosting 980, I guess; Anand's reference got to 1.265 GHz).

With Polaris 11 and 10, at least we know roughly what to expect as they reveal Polaris 11's die size as ~110-120mm2, and Polaris 10 as ~232mm2.

With Pascal, we have no idea what to expect. However, mid-range Kepler GK104 was much smaller than GM204.

Head1985 · Mar 20, 2016

3DVagabond said:
People upgraded from 780 ti to 980. It doesn't need to be anywhere near 50% faster.

Those are "must have the best GPU on the market" buyers. Not GTX 670/970 buyers... If the 1070 is only 35% faster than the 970, people will not upgrade.

Silverforce11 · Mar 20, 2016

Head1985 said:
Where?
http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_980_Ti/31.html
1080p: 21% faster
1440p: 25% faster
VS GTX 970
1080p: 35% faster.

And ~40% faster at 1440p.

A cut-down mid-range GP104 at ~980 Ti performance is not too bad.

So full GP104 is likely up to 980Ti +20%.

That falls in line with Kepler 670/680 vs Fermi 580 situation. New node, new uarch.

Don't forget that GM204 is large for a mid-range chip. If GP104 is ~292mm2 or around there, 20% faster than GM200 at 601mm2 is good.
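The die-size argument can be expressed as performance per unit area. Below is a minimal Python sketch. The function name `perf_density_ratio` is hypothetical, the 292 mm² figure and the +20% are the speculative numbers from the post, and 601 mm² is the GM200 size quoted there.

```python
def perf_density_ratio(perf_ratio: float, new_area_mm2: float, old_area_mm2: float) -> float:
    """Performance per mm^2 of the new die relative to the old die.

    perf_ratio -- new chip performance divided by old chip performance
    """
    if new_area_mm2 <= 0 or old_area_mm2 <= 0:
        raise ValueError("die areas must be positive")
    return perf_ratio * old_area_mm2 / new_area_mm2


if __name__ == "__main__":
    # Speculative GP104 (292 mm^2, 1.20x performance) vs GM200 (601 mm^2).
    ratio = perf_density_ratio(1.20, new_area_mm2=292.0, old_area_mm2=601.0)
    print(f"Performance per mm^2 vs GM200: {ratio:.2f}x")
```

Under those rumoured inputs the smaller die would deliver roughly 2.5 times the performance per mm², which is the sense in which +20% over GM200 would be "good" for a chip of that size.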

Head1985 · Mar 20, 2016

Silverforce11 said:
A cut-down mid-range GP104 at ~980 Ti performance is not too bad.

So full GP104 is likely up to 980Ti +20%.

That falls in line with Kepler 670/680 vs Fermi 580 situation. New node, new uarch.

Don't forget that GM204 is large for a mid-range chip. If GP104 is ~292mm2 or around there, 20% faster than GM200 at 601mm2 is good.

Well, nope... And again, the 1080 will not be 20% faster than the 1070.
970 vs 980 is 15% on average.
If the 1070 is on par with the cut-down 980 Ti, then the 1080 will be only 15% faster than the 980 Ti and 10% faster than the Titan X.
The 680 was 30-35% faster than the GTX 580, the Titan X of 40nm.

And 10% vs the Titan X would be a pretty epic fail.

Silverforce11 · Mar 20, 2016

Head1985 said:
Well, nope... And again, the 1080 will not be 20% faster than the 1070.
970 vs 980 is 15% on average.
If the 1070 is on par with the cut-down 980 Ti, then the 1080 will be only 15% faster than the 980 Ti and 10% faster than the Titan X.
The 680 was 30-35% faster than the GTX 580, the Titan X of 40nm.

We're dealing with 5-10% margins here.

Do you know how cut down the 2nd tier GP104 will be?

How about if GP104 is a small chip, like Polaris 10?

Right, so much unknown and you are basing your certainty on what? Be flexible, it's only rumors. lol

Silverforce11 · Mar 20, 2016

There's also this very plausible theory that Maxwell only existed because of the cancellation of the 20nm node.

NV's prior road maps did not have Maxwell. It was a Kepler -> Pascal leap.

This theory proposes that Maxwell is Pascal with most of the FP64 and the mixed-mode ops stripped out: a purely gaming, FP32-focused design. That would explain the nice gains versus Kepler on the same 28nm node despite only a small die size increase (GK104 -> GM204 etc.); NV has extracted a lot of performance from it.

Now, if that theory is true, the real comparison for Pascal should be made versus Kepler.

We're talking GK104 vs GP104, and GK110 vs GP100. What kind of performance leap is that? It's a massive perf leap.

Head1985 · Mar 20, 2016

Silverforce11 said:
We're dealing with 5-10% margins here.

Do you know how cut down the 2nd tier GP104 will be?

How about if GP104 is a small chip, like Polaris 10?

Right, so much unknown and you are basing your certainty on what? Be flexible, it's only rumors. lol

I really don't care how big GP104 is. If it's not delivering +50% performance on brand-new 16FF+ (vs the 970 and 980), it will be an epic fail.
Btw the 970 is already heavily cut down and the 980 is still only 15% faster. They really can't cut down the 1070 even more. So it will be 15% again.
I think it will be less cut down than the 970 after the GTX 970 controversy, and they will just underclock it more.
Right now the 980 Ti is only 35% faster on average than the GTX 970 at 1080p (my resolution), and if the 1070 only matches it and delivers +35% performance on a brand-new node, I will not upgrade and I will call it an epic fail.

Even the GTX 970, on the same node as the GTX 670/770, delivered a 40-60% increase vs the 770/670. How can they deliver only 35% more performance on a new node?
http://www.techpowerup.com/reviews/MSI/GTX_970_Gaming/27.html

Sweepr · Mar 20, 2016

Silverforce11 said:
Interesting discussion here for upcoming GP106 and GP104.

http://www.overclock.net/t/1595065/...performance-entries-spotted/210#post_25003734

Basically cut down GP104 (970 replacement) = 980Ti.
Full GP106 is ~40% above 960, or around the 970 performance.
One can assume the full GP104 to be 980Ti +20% or so based on the 970/980 relationship.

That's actually a very plausible result, shrinking Maxwell GM200 down into a mid-range GP104 will get that kind of performance profile, along with a few uarch tweaks.

GTX 1070 is about 45% faster than GTX 970.
Although we have no numbers yet, GTX 1080 is probably 20-30% above this.
GTX 1060 is about 40% faster than GTX 960. GTX 1060 features 3GB GDDR5 and comes with 192bit bus.
GTX 1060 is about 10% faster than a GTX 970.

That's in line with my expectations. A ~200-300mm² GP104 beating 601mm² GM200 would be no small feat.

Cloudfire777 · Mar 20, 2016

Silverforce11 said:
Interesting discussion here for upcoming GP106 and GP104.

http://www.overclock.net/t/1595065/...performance-entries-spotted/210#post_25003734

Basically cut down GP104 (970 replacement) = 980Ti.

Full GP106 is ~40% above 960, or around the 970 performance.

One can assume the full GP104 to be 980Ti +20% or so based on the 970/980 relationship.

That's actually a very plausible result, shrinking Maxwell GM200 down into a mid-range GP104 will get that kind of performance profile, along with a few uarch tweaks.

Let's see:
Less heat and power,
Same performance as GTX 980 Ti but ~140-150W
$300-350 price vs the 980 Ti's $650?
Async support?
Full DX12 support

Sign me up for 1 or two of those GTX 1070s 🙂
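The wishlist above amounts to a performance-per-watt and performance-per-dollar comparison. Here is a rough Python sketch of it. The `Card` class and `advantage` helper are hypothetical names, the 1070 values are midpoints of the rumoured ranges in the post, and the 250 W figure for the 980 Ti is an assumed board power that does not come from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    relative_perf: float   # performance relative to a shared baseline (1.0 = 980 Ti)
    board_power_w: float   # board power in watts
    price_usd: float       # price in US dollars

    @property
    def perf_per_watt(self) -> float:
        return self.relative_perf / self.board_power_w

    @property
    def perf_per_dollar(self) -> float:
        return self.relative_perf / self.price_usd


def advantage(new: Card, old: Card) -> tuple[float, float]:
    """Return (perf/W ratio, perf/$ ratio) of `new` relative to `old`."""
    return (new.perf_per_watt / old.perf_per_watt,
            new.perf_per_dollar / old.perf_per_dollar)


# 980 Ti: $650 is the price quoted in the post; 250 W is an assumed board power.
GTX_980_TI = Card("GTX 980 Ti", 1.0, 250.0, 650.0)
# Rumoured card: midpoints of the ranges in the post, purely hypothetical.
RUMOURED_1070 = Card("GTX 1070 (rumoured)", 1.0, 145.0, 325.0)

if __name__ == "__main__":
    watt_ratio, dollar_ratio = advantage(RUMOURED_1070, GTX_980_TI)
    print(f"perf/W: {watt_ratio:.2f}x, perf/$: {dollar_ratio:.2f}x")
```

If the rumoured numbers held, equal performance at those midpoints would mean roughly 1.7 times the performance per watt and twice the performance per dollar; changing `relative_perf` shows how quickly the picture shifts if the card ends up slower or faster than a 980 Ti.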

Head1985 · Mar 20, 2016

Cloudfire777 said:
Let's see:
Less heat and power,
Same performance as GTX 980 Ti but ~140-150W
$300-350 price vs the 980 Ti's $650?
Async support?
Full DX12 support

Sign me up for 1 or two of those GTX 1070s 🙂

15% above the GTX 980 Ti at a 160W TDP and 400 USD would be much better, and it would actually be an upgrade from the GTX 970.
GTX 980 Ti performance, i.e. +35%, is not an upgrade.
670 to 970 was around a 60% performance boost, and it was on the same node. How the hell can someone be glad about +35% performance vs the GTX 970 on a new node?

Timmah! · Mar 20, 2016

3DVagabond said:
People upgraded from 780 ti to 980. It doesn't need to be anywhere near 50% faster.

People on these boards, or similar ones. The same way they are now excited about Polaris and itching to get it, or even two in CrossFire, despite the fact that it's almost a given at this point that it won't really be faster than the current AMD GPUs, which of course they already own (or comparable Nvidia stuff).

I think the majority is not so enthusiastic about new HW and upgrades only when truly needed.
