NVIDIA Pascal Thread

tential · Apr 5, 2016

Arachnotronic said:
Didn't somebody promise us that AMD was 6+ months ahead of NVIDIA in terms of a 14/16nm GPU? NV is in volume production today on this monster while all AMD has talked about are relatively small Polaris chips.

Lol.

It was the exact opposite, actually. Someone promised us that Nvidia would launch 6 months ahead of AMD and that we'd see the 1080 in May (per the title of this thread), with AMD 6 months behind.

nurturedhate · Apr 5, 2016

Arachnotronic said:
Didn't somebody promise us that AMD was 6+ months ahead of NVIDIA in terms of a 14/16nm GPU? NV is in volume production today on this monster while all AMD has talked about are relatively small Polaris chips.

Lol.

AMD demoed working silicon in early Jan 2016. Nvidia hasn't demoed anything to date.
AMD stated release sometime around summer 2016. Nvidia states Q1/2017.

Sounds like 6 months to me.

Bigger question: why are you trying to start a "this company is better than that company" flame war in a thread about Pascal?

Timmah! · Apr 5, 2016

AtenRa said:
GP100 has 3840 Cuda Cores capable of 32bit, and half of those can do 64bit.

So no 5760 cores? Other people seem to think otherwise. I am officially confused.
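
One possible way to reconcile the two numbers, as a quick sketch: if the 5760 figure simply tallies the FP64 units on top of the FP32 cores, the arithmetic lines up. That interpretation is an assumption, not something confirmed in this thread.

```python
# Figures quoted by AtenRa above; treating FP64 units as separately
# counted "cores" is an assumption made only to explain the 5760 number.
fp32_cores = 3840
fp64_units = fp32_cores // 2          # "half of those can do 64bit"
combined = fp32_cores + fp64_units    # total if both kinds are tallied together

print(f"FP32: {fp32_cores}, FP64: {fp64_units}, combined: {combined}")
```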

Kenmitch · Apr 5, 2016

Arachnotronic said:
AMD is hyping Polaris 10/11 because their current GPU sales are in the toilet and by hyping them up in the press, they would create an image of "being ahead" of NV (no doubt to help boost its stock price). NV, which actually makes a lot of $ from selling GPUs obviously doesn't want to signal to gamers that "hey, the stuff we're trying to sell you is crappy and obsolete, wait for the new stuff!"

???

airfathaaaaa · Apr 5, 2016

Despoiler said:
You must have missed the context of the post. Pascal's async compute abilities are supposedly only as good as GCN 1.0

Not GP100, P100. P100 has no ROPs per the block diagram and none are listed on the spec sheet. Why would it need them as a data center product?

EDIT: Jesus, I was thinking GP100 is the consumer class chip, but Nvidia has both a product name and a chip name. They seemingly interchange the two on their website, when they are the same thing.

Actually what people have said is that Maxwell is a derivative of Pascal, Maxwell being created when 20nm fell through. The GP consumer series chips could be radically different from P100 to be more suited for gaming. You can see that P100 Pascal adds back the DP and compute that Maxwell is missing, as well as adding NVLINK and HBM2, both of which have been known for some time. Pascal has changed its compute capability level, which is new information. It's really not all that different spec-wise compared to Maxwell. We still don't know if it can be used in conjunction with DX12 or Vulkan.

The post was actually pretty clear in saying it will have some GCN capabilities but no async at all, which by the looks of it is true.
Volta is what Pascal would have been, but they rushed it because of DX12. That's why in the middle of 2015 they changed the roadmap to put Pascal a bit higher than Maxwell.

Hi-Fi Man · Apr 5, 2016

RussianSensation said:
So you think GP100 for Tesla and GP102 will have > 4000 CCs, > 240 TMUs and Async Compute? Hmm... I don't believe it. NV has never done that before IIRC. They used to subsidize Flagship GeForce with Quadro/Tesla; hence why it made sense to make a Big Daddy for 3 of those markets. Now you are suggesting NV will start making Big Quadro/Tesla chips which are completely separate from Big Daddy gaming line?

Btw, GP104 is nowhere to be found. If it were launching in May, it's pretty odd to not unveil it now. Starting to sound like GP104 May launch may have been rumormill.

Seems NV is in no rush to ship GeForce GP100/102 if they are selling 8x Tesla P100 cards via DGX-1 for $129,000! This just reinforces what many of us predicted -- mid-range Pascal 970/980 successors for 2016.

It's entirely plausible. Just look at the previous, smaller GK104 chip: it had a lower FP64 pipe ratio than GK110/GK210 and fewer compute-oriented features. NVIDIA could just be taking this one step further, and it makes sense to do so. If true, it would mean that NVIDIA has realized that big enterprise compute workloads have evolved so much and become so different from graphics workloads that it makes sense to sell a chip tailored for each market, and I believe they have the size and capability to do so.

CentroX · Apr 5, 2016

Arachnotronic said:
Didn't somebody promise us that AMD was 6+ months ahead of NVIDIA in terms of a 14/16nm GPU? NV is in volume production today on this monster while all AMD has talked about are relatively small Polaris chips.

Lol.

Lol indeed. AMD has actually demoed a game on Polaris. What has Nvidia done? PowerPoint slides.

Kris194 · Apr 5, 2016

Did you just compare a rumored 232 mm^2 GPU built with gaming in mind to a 610 mm^2 monster built for the HPC market?

Adored · Apr 5, 2016

Kris194 said:
Did you just compare a rumored 232 mm^2 GPU built with gaming in mind to a 610 mm^2 monster built for the HPC market?

I think you'll find it was Arachnotronic that did that first. http://forums.anandtech.com/showpost.php?p=38146232&postcount=1000

I find it hard to believe that anyone truly believes a powerpoint slide and render of a product is somehow "more impressive" than a working demo of another product, regardless of how good the render looks on paper.

CentroX · Apr 5, 2016

Some SweClockers forum users believe Nvidia will release HBM2 cards for consumers this year because Samsung ramped up production in January. They're wrong, right?

Kris194 · Apr 5, 2016

Hard to say. The new Titan is rumored to be released in either Q4 2016 or Q1 2017.

Despoiler · Apr 5, 2016

airfathaaaaa said:
The post was actually pretty clear in saying it will have some GCN capabilities but no async at all, which by the looks of it is true.
Volta is what Pascal would have been, but they rushed it because of DX12. That's why in the middle of 2015 they changed the roadmap to put Pascal a bit higher than Maxwell.

Went back and re-read it. It's confusing because it says it won't have GCN-like capabilities, but then goes on to say it will be close to GCN 1.0. I was thinking GCN 1.0 has ACEs, so it is capable of async at a rudimentary level.

MrTeal · Apr 5, 2016

Adored said:
I think you'll find it was Arachnotronic that did that first. http://forums.anandtech.com/showpost.php?p=38146232&postcount=1000

I find it really hard to believe that anyone really believes a powerpoint slide and render of a product is somehow "more impressive" than a working demo of another product, regardless of how good the render looks on paper.

Well, he did also hold up the Drive PX2. Interestingly, I'm pretty sure it has new (and much smaller) silicon.
I didn't save the live stream and the pics I've seen are crappy, but it's clearly a different and smaller GPU. Still, not a demo.

Cookie Monster · Apr 5, 2016

From the GP100 announcement, it looks like they may have a separate GPU, e.g. GP102 just for speculation's sake (other than the GP104), for the consumer/gaming market, because everything in GP100 is heavily HPC oriented, including the number of FP64 cores, register files etc.

I mean, they are talking up deep learning a lot, but did anyone see the 5TFLOPS (FP64) numbers? That is absolutely huge for a single GPU! Surprised they weren't highlighting this more compared to all the VR/deep learning, the trendy stuff.

Wonder when we will get some news on the gaming products, which still make up the majority of their revenue/profits.

Adored · Apr 5, 2016

MrTeal said:
Well, he did also hold up the Drive PX2. Interestingly, I'm pretty sure it has new (and much smaller) silicon.
I didn't save the live stream and the pics I've seen are crappy, but it's clearly a different and smaller GPU. Still, not a demo.

Yes, I forgot about that, thanks. I remember looking closely at the die during the presentation and thinking it must be 250-350mm2 though, which seems really large for the probable GP106/GP107 chip.

Silverforce11 · Apr 5, 2016

Arachnotronic said:
Guess the people who said Pascal was Maxwell on 16FF+ were dead wrong.

Lol, is that your takeaway from the info?

Pascal still is Maxwell+ on 16FF. What's the plus? FP64, FP16 mix-mode, NVLink, HBM2.

This is what NV says about GP100 on that blog post: Tesla P100: Built for HPC and Deep Learning

300W TDP is actually going backwards. AS EXPECTED because they tore out all the power hungry features in Maxwell to make it a gaming focused chip, and now Pascal needs to put those back in to compete in the HPC market.

This is actually the biggest change in terms of gaming performance:

GP100’s SM incorporates 64 single-precision (FP32) CUDA Cores. In contrast, the Maxwell and Kepler SMs had 128 and 192 FP32 CUDA Cores, respectively. The GP100 SM is partitioned into two processing blocks, each having 32 single-precision CUDA Cores, an instruction buffer, a warp scheduler, and two dispatch units. While a GP100 SM has half the total number of CUDA Cores of a Maxwell SM, it maintains the same register file size and supports similar occupancy of warps and thread blocks.

For those who have paid attention to the talks of wavefronts and warp sizes in game engines.

Each SM under the control of the instruction buffer, scheduler and dispatch units now only has 64 CC to process, instead of 128 (Maxwell) and 192 (Kepler).

This is GCN-like and means that a warp/wavefront of 64 used by console optimized engines will instantly hit peak CC utilization.

What this means in real effective terms is much less potential for inefficiency of CC usage if the game engine is poorly optimized for the warp/wavefront. I'm going to call it now, Pascal will perform great in GCN-optimized game engines, better than Maxwell and vastly better than Kepler.

Despite only a small increase in FP32 or potential gaming performance, comparing GM200 ~7TFlops to GP100 ~10.6 TFlops, in effect the change above means each paper spec flop is worth more for gaming due to improved CC utilization (for those under the impression Maxwell was already 100% utilization, lololol).

While some were expecting a doubling of performance, it's not going to happen given how HPC compute focused GP100 has to be, compared to GM200 which was made for gaming.

IMO, expect a ~60% improvement in gaming, higher if games are GCN-optimized, which is actually good news because most of the AAA stuff now is already console optimized. We could well see it performing ~80% faster in most AAA games due to the console-effect in full swing. Not too bad for an HPC focused chip!
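
For what it's worth, the numbers in this post can be put side by side with a rough sketch. It only uses the ~7 and ~10.6 TFlops paper figures and the ~60%/~80% guesses quoted above; the function name is made up for illustration and nothing here is a measurement.

```python
# Paper FP32 figures as quoted in the post (approximate, not measured).
gm200_tflops = 7.0
gp100_tflops = 10.6
paper_ratio = gp100_tflops / gm200_tflops  # raw paper-spec speedup

def implied_per_flop_gain(predicted_speedup, paper=paper_ratio):
    """How much more each paper flop must be 'worth' to hit the prediction."""
    return predicted_speedup / paper

print(f"paper ratio: {paper_ratio:.2f}x")
for guess in (1.6, 1.8):  # the ~60% and ~80% predictions from the post
    gain = implied_per_flop_gain(guess)
    print(f"{guess:.1f}x gaming speedup implies {gain:.2f}x per-flop efficiency")
```

In other words, the prediction amounts to the paper-spec gain of roughly 1.5x plus a single-digit to high-teens percentage improvement in utilization.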

Azix · Apr 5, 2016

When is the next opportunity for them to demo consumer cards? Kinda strange they haven't said anything about them. We can assume this Tesla is the max specs, though.

KaRLiToS · Apr 5, 2016

:thumbsup: Great post Silverforce11

jpiniero · Apr 5, 2016

Azix said:
When is the next opportunity for them to demo consumer cards? Kinda strange they haven't said anything about them. We can assume this Tesla is the max specs, though.

Computex, at the end of May. But it seems really unrealistic to think anything will ship then.

Considering GP100 doesn't have any (?) ROPs it seems pretty likely now that the GP102 rumor is true although you won't see it until well into 2017. I guess that would be 3840 cores without the DP. It does feel like that would still be too large for a consumer GPU so I guess we will have to see.

Silverforce11 · Apr 5, 2016

Silverforce11 said:
That's a pretty big leap, 17B transistors on TSMC 16FF, that they have said in marketing (think best case scenario) achieves 2x the density compared to 28nm.

GM200 is 8B transistors, at 600mm2. A perfect 2x is 16B. But we know it's rarely the case, not everything scales well. A more realistic target is around 14B transistors at ~550mm2 on 16nm FF.

IIRC, the FP64 target for Intel's KNL (Knights Landing) is above 3TFlops, right? Say 3.5TFlops DP/FP64 is a target to reach for big Pascal. So yeah, it needs to be a huge chip.

I posted that a few days ago, because some folks were having strange ideas that GP100 is going to be smaller due to node challenges.

Basically, NV went ALL IN. They don't want to just beat Intel's KNL (Knights Landing) at the estimated ~3.5 FP64 TFlops, they want to smash it and retain their HPC market dominance. GP100 will wipe the floor with KNL at over 5.3TFlops FP64.

This is also very bad for AMD, because Intel's push will lead to more OpenCL, x86 etc ecosystems in HPC. NV knows its HPC dominance is due to CUDA so any erosion on the ecosystem has harsh long-term damage.

What do gamers get out of this? Jack all this year. 610mm2 on a brand new node that has issues with tiny mobile chips... think about the horrible yields.

Where's GP104? AWOL. For NV to get enough functional GP100 Tesla, one would have to assume most of the 16nm wafers are dedicated to producing GP100. Remember that HPC is their priority market, one that is very profitable and under threat from Intel.

GP107 is likely going to be the first GTX this year, launching bottom up. GP106 later, as it's still a small chip and yields will be acceptable, so the smaller portion of 16nm wafers that's not dedicated to GP100 can keep these two small chips going to consumers.

Don't forget, Raja was public in an interview saying AMD is well ahead of NV on the FF transition. He would not be saying that unless he was certain. The cause for NV's 16nm FF consumer GTX delay is GP100.

Does this analysis make sense to some?
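
As a sanity check on the density and FP64 figures in this post, here is a quick sketch using only the numbers quoted above (8B at 600mm2 for GM200, the 14B at ~550mm2 guess, and the 17B and 610mm2 figures for GP100). The helper name is made up for illustration, and the inputs are forum figures, not verified specs.

```python
def density_mtr_per_mm2(transistors_billion, area_mm2):
    """Transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000.0 / area_mm2

gm200 = density_mtr_per_mm2(8, 600)       # 28nm baseline from the post
guess = density_mtr_per_mm2(14, 550)      # the "more realistic" 16FF guess
gp100 = density_mtr_per_mm2(17, 610)      # figures quoted for GP100

print(f"GM200: {gm200:.1f} MTr/mm2")
print(f"14B @ 550mm2: {guess:.1f} MTr/mm2 ({guess / gm200:.2f}x)")
print(f"17B @ 610mm2: {gp100:.1f} MTr/mm2 ({gp100 / gm200:.2f}x)")

# FP64 margin over the ~3.5 TFlops estimate for Intel's part.
fp64_margin = 5.3 / 3.5
print(f"FP64 margin: {fp64_margin:.2f}x")
```

If the quoted figures hold, GP100 would land slightly above the marketing 2x density claim rather than below it.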

PhonakV30 · Apr 5, 2016

Check this out:

http://cdn.wccftech.com/wp-content/uploads/2016/03/NVIDIA-Drive-PX-2_Official_12.jpg

and what they showed us before:

https://a.disquscdn.com/uploads/mediaembed/images/3456/5711/original.jpg

:whiste:

jpiniero · Apr 5, 2016

Silverforce11 said:
Where's GP104? AWOL. For NV to get enough functional GP100 Tesla, one would have to assume most of the 16nm wafers are dedicated to producing GP100. Remember that HPC is their priority market, one that is very profitable and under threat from Intel.

Most of the 16FF wafers are probably going to Apple for the A10. There should be enough to go around though since TSMC doubled production in March. The consumer GPUs are probably just behind.

Arachnotronic · Apr 5, 2016

Silverforce11 said:
Where's GP104? AWOL. For NV to get enough functional GP100 Tesla, one would have to assume most of the 16nm wafers are dedicated to producing GP100. Remember that HPC is their priority market, one that is very profitable and under threat from Intel

Your understanding of NVIDIA's business needs some work. Tesla is a very, very small portion of NV's overall revenues; GeForce GTX is far more important to the company in terms of total revenue and gross profit dollars.

Arachnotronic · Apr 5, 2016

Silverforce11 said:
Don't forget, Raja was public in an interview saying AMD is well ahead of NV on the FF transition. He would not be saying that unless he was certain. The cause for NV's 16nm FF consumer GTX delay is GP100.

Does this analysis make sense to some?

edit: Nvm.

Rvenger · Apr 5, 2016

I can tell you that I sell way more Quadros than anything. GeForce is not Nvidia's primary focus. Not even close.
