NVIDIA Pascal Thread


Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
From the other thread:

back to topic:
http://www.hardwareluxx.de/index.ph...it-pascal-gpu-und-samsung-gddr5-speicher.html

GP106 spotted in Revision A1 (should be the production one) and made in week 13 of 2016 (we are in week 15). Still hot from the oven!

For such a small chip, qualification should be fast and we can expect to see boards in retail before September.

Then on the chiphell forum, someone showed a ~300mm² Pascal with 8GB GDDR5 (the link doesn't work for me, but here it is: https://www.chiphell.com/thread-1563086-1-1.html )

Looks like Nvidia is going to deploy Pascal from top to bottom at a very fast pace (this is where deep pockets and a big R&D budget help).


GP106
nviida-gtc2016-drive-px2-6-rs.jpg


GP104 - ~ 290-300mm²
GP106 - ~195-205mm²
 

DooKey

Golden Member
Nov 9, 2005
1,811
458
136
I'm starting to feel that itch for new shinies. Come to me my precious ones.........AMD or NV.....just give me some new shinies to buy!!
 

Adored

Senior member
Mar 24, 2016
256
1
16
That's identical to GK104's die size. Does anyone have the actual measurements of that?
 

MrTeal

Diamond Member
Dec 7, 2003
3,554
1,658
136
Wouldn't 2560 CCs make more sense than 3072, given how the cores are arranged in GP100? Granted, GP104 might not maintain the same GPC:TPC:SM:CC ratio as GP100, but that does seem likely. GK110 had the same FP32 CUDA core count per SMX as GK104, just with additional FP64 units. GM204 and GM200 shared the same SMM to CC ratio as well.
 

Kris194

Member
Mar 16, 2016
112
0
0
GTX X80 - 3072 CUDA cores, with updated Maxwell to Pascal Arch, and higher core clocks.
GTX X70 - 2560 CUDA cores.

I don't think so. If they keep the 2:4:6 ratio, it will be more like:

x80 - 2560 CUDA cores
x70 - 2304(?) CUDA cores

To have 3072 cores, GP104 would need 4.8 GPCs, which doesn't make any sense.
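
A rough sketch of that arithmetic, assuming GP104 keeps the published full-GP100 layout of 64 FP32 cores per SM and 10 SMs per GPC. That carry-over is an assumption, not a known GP104 spec, and the function name is made up for illustration.

```python
# Assumption: GP104 reuses the full GP100 layout
# (64 FP32 CUDA cores per SM, 10 SMs per GPC).
CORES_PER_SM = 64
SMS_PER_GPC = 10
CORES_PER_GPC = CORES_PER_SM * SMS_PER_GPC  # 640 cores per GPC


def gpcs_needed(cuda_cores: int) -> float:
    """GPC count implied by a CUDA core count if the GP100 ratio holds."""
    return cuda_cores / CORES_PER_GPC


if __name__ == "__main__":
    for cores in (3072, 2560, 2304):
        sms = cores // CORES_PER_SM
        print(f"{cores} cores = {sms} SMs = {gpcs_needed(cores):.1f} GPCs")
```

Under that assumption, 2560 cores is a whole number of GPCs while 3072 is not. 2304 cores is not a whole number either, so it would have to be a cut-down part with some SMs disabled.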
 

Head1985

Golden Member
Jul 8, 2014
1,863
685
136
I think AMD will win this generation. Vega 11 with HBM2 and 4096 SP will crush this.

Pascal looks pretty lame.
256-bit, 8 GHz GDDR5, 64 ROPs, 2560 SP is just lame.
It had better have 3072 SP or it's crap.
 

el etro

Golden Member
Jul 21, 2013
1,581
14
81
Good point on the memory chips; they look awesome and will make the PCB smaller. I just wanted it to be GDDR5X, which would push the bandwidth of the chips higher without the need for wider, more power-hungry buses.
 

jpiniero

Lifer
Oct 1, 2010
14,511
5,159
136
If it's really 300 mm2, then my pricing estimates of $399/$649 are likely too low. Maybe it does have 384-bit memory. This does seem like it could be trouble if it has the same core:SM ratio that GP100 does.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I think AMD will win this generation. Vega 11 with HBM2 and 4096 SP will crush this.

Pascal looks pretty lame.
256-bit, 8 GHz GDDR5, 64 ROPs, 2560 SP is just lame.
It had better have 3072 SP or it's crap.

It's Vega 10 that has 4096 SP, not Vega 11.

Also they may not compete at all.

Polaris 10 is the first one, and that looks to be 2304ish SP and 256bit GDDR5.

Unless GP104 pulls a GDDR5X or HBM2. Then nobody with a GTX970/290 and up is going to upgrade this year.
 

Glo.

Diamond Member
Apr 25, 2015
5,662
4,421
136
I think AMD will win this generation. Vega 11 with HBM2 and 4096 SP will crush this.

Pascal looks pretty lame.
256-bit, 8 GHz GDDR5, 64 ROPs, 2560 SP is just lame.
It had better have 3072 SP or it's crap.

If you are correct, that means a 232 mm2 GPU with 2560 GCN4 cores will compete in performance with a 300 mm2 GPU with 2560 CUDA cores, and will win in a clock-for-clock comparison.

But overall you are correct: the X80 will have 2560 CUDA cores.
 

MrTeal

Diamond Member
Dec 7, 2003
3,554
1,658
136
It's Vega 10 that has 4096 SP, not Vega 11.

Also they may not compete at all.

Polaris 10 is the first one, and that looks to be 2304ish SP and 256bit GDDR5.

Unless GP104 pulls a GDDR5X or HBM2. Then nobody with a GTX970/290 and up is going to upgrade this year.

Source? That's the same number as Fiji, which seems extremely unlikely for the next-gen flagship. It also leaves very little room between 4096 and the proposed P10 at 2304 SP for Vega 11 to slot into. Hawaii was 41% larger than Tahiti; Fiji was 41% larger than Hawaii. The gap at the start of GCN was even larger, with Tahiti having 60% more shaders than Pitcairn while Pitcairn had twice as many as Cape Verde. Even if you space the 14nm chips out evenly, that would give V11 1/3rd more shaders than P10 and V10 1/3rd more shaders than V11.
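
A minimal sketch of the "spaced out evenly" arithmetic, reading "evenly" as an equal ratio between neighbouring chips. That reading is an assumption, and the function name is hypothetical. The 2304 and 4096 SP endpoints are the rumoured figures from this thread, not confirmed specs.

```python
def evenly_spaced_tiers(low: int, high: int, steps: int) -> list[float]:
    """Shader counts from low to high with the same ratio between each step."""
    ratio = (high / low) ** (1 / steps)
    return [low * ratio ** i for i in range(steps + 1)]


if __name__ == "__main__":
    # Rumoured endpoints: P10 at 2304 SP, V10 at 4096 SP, V11 in between.
    p10, v11, v10 = evenly_spaced_tiers(2304, 4096, 2)
    print(f"P10 {p10:.0f} SP, V11 {v11:.0f} SP, V10 {v10:.0f} SP")
    print(f"step ratio: {v11 / p10:.3f}")
```

With those endpoints each step works out to about 1.333x, which puts the middle chip at roughly 3072 SP.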
 

Adored

Senior member
Mar 24, 2016
256
1
16
Nvidia aren't going to build a 300mm2 bandwidth-starved GPU on 16FF+. Either it has a 384-bit bus or 256-bit bus with compression + arch enhancements meaning bandwidth constraints are further lessened. I prefer the latter.

Don't forget that Samsung's process is denser than TSMC's as well, so AMD's 232mm2 could be closer to Nvidia's 275mm2. AMD also have a history of wider buses in a smaller area; look at Hawaii and Tonga. However, the memory amounts in the leaks don't appear to support a 384-bit bus.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Source? That's the same number as Fiji, which seems extremely unlikely for the next-gen flagship. It also leaves very little room between 4096 and the proposed P10 at 2304 SP for Vega 11 to slot into. Hawaii was 41% larger than Tahiti; Fiji was 41% larger than Hawaii. The gap at the start of GCN was even larger, with Tahiti having 60% more shaders than Pitcairn while Pitcairn had twice as many as Cape Verde. Even if you space the 14nm chips out evenly, that would give V11 1/3rd more shaders than P10 and V10 1/3rd more shaders than V11.

http://hexus.net/tech/news/graphics/91592-amd-greenland-vega-10-said-4096-stream-processors/
 

jpiniero

Lifer
Oct 1, 2010
14,511
5,159
136
Nvidia aren't going to build a 300mm2 bandwidth-starved GPU on 16FF+. Either it has a 384-bit bus or 256-bit bus with compression + arch enhancements meaning bandwidth constraints are further lessened. I prefer the latter.

I suppose one option is that both products are cut so that 8 GHz GDDR5 would be enough, and then a follow-up in 2017 would up the core counts and add GDDR5X. Or it's going to be an epic paper launch just to mess with AMD and Polaris, but it won't really be available until September or October.
 

DooKey

Golden Member
Nov 9, 2005
1,811
458
136
Nvidia aren't going to build a 300mm2 bandwidth-starved GPU on 16FF+. Either it has a 384-bit bus or 256-bit bus with compression + arch enhancements meaning bandwidth constraints are further lessened. I prefer the latter.

Don't forget that Samsung's process is denser than TSMC's as well, so AMD's 232mm2 could be closer to Nvidia's 275mm2. AMD also have a history of wider buses in a smaller area; look at Hawaii and Tonga. However, the memory amounts in the leaks don't appear to support a 384-bit bus.

I hope you're right. Unfortunately we're talking about GloFo executing Samsung tech, and they don't have the best reputation for execution...
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Vega 10 is speculation. It can't be Vega 10, because then there is no room for Vega 11.
Polaris 10: 2560 SP
Vega 10: 4096 SP
Vega 10 cut: 3584 SP

Where is Vega 11?
Vega 11: 3072 SP
Vega 11 cut: 2560 SP? Same SP count as Polaris 10? No way.

If Polaris 10 doesn't get GDDR5X, it's bottlenecked. Then a 2500-2800 SP cut-down Vega 11 part is going to be much faster. Not to mention it could have another TMU/ROP layout as well.
 

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
If Polaris 10 doesn't get GDDR5X, it's bottlenecked. Then a 2500-2800 SP cut-down Vega 11 part is going to be much faster. Not to mention it could have another TMU/ROP layout as well.

You make so many matter-of-fact statements without knowing so many of the variables. You don't know Polaris's:

uArch changes
sp performance changes
memory compression changes
clockspeed
memory speed

Why don't you let AMD engineers worry about matching up a memory interface with their graphics core?
 

Adored

Senior member
Mar 24, 2016
256
1
16
If Polaris 10 doesn't get GDDR5X, it's bottlenecked. Then a 2500-2800 SP cut-down Vega 11 part is going to be much faster. Not to mention it could have another TMU/ROP layout as well.

Most people would have said the 980 would be totally bottlenecked with bandwidth too but it doesn't appear to be the case. I prefer to compare a Polaris 10 at ~Fury X performance level to the 980 rather than Hawaii.

Yes it means AMD will need to have made another leap in compression or other architectural advances, but I don't feel it's impossible. I fully expect Nvidia to have made a similar leap if not greater.

I think we are all going to be surprised at just how far a 256-bit bus can stretch.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Most people would have said the 980 would be totally bottlenecked with bandwidth too but it doesn't appear to be the case. I prefer to compare a Polaris 10 at ~Fury X performance level to the 980 rather than Hawaii.

Yes it means AMD will need to have made another leap in compression or other architectural advances, but I don't feel it's impossible. I fully expect Nvidia to have made a similar leap if not greater.

I think we are all going to be surprised at just how far a 256-bit bus can stretch.

The 980 is bandwidth bottlenecked. ;)

If you dream of Fury X performance level, then it is a really long way to 512GB/sec. And even Fury X benefits from faster memory. So does Hawaii. And now some 192-224GB/sec bus will be enough?
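
For reference, a small sketch of where those bandwidth figures come from: peak bandwidth is bus width times per-pin data rate, divided by 8 to convert bits to bytes. The function name is made up for illustration, and the 6/7/8 Gbps GDDR5 speeds are example data rates, not confirmed Polaris or Pascal specs.

```python
def bandwidth_gb_per_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    # A 256-bit GDDR5 bus at a few example per-pin data rates.
    for rate in (6.0, 7.0, 8.0):
        print(f"256-bit @ {rate:.0f} Gbps: {bandwidth_gb_per_s(256, rate):.0f} GB/s")
    # Fury X for comparison: 4096-bit HBM at 1 Gbps per pin.
    print(f"4096-bit @ 1 Gbps: {bandwidth_gb_per_s(4096, 1.0):.0f} GB/s")
```

So a 256-bit bus lands at 192 to 224 GB/s with 6 to 7 Gbps GDDR5 and 256 GB/s at 8 Gbps, still half of the 512 GB/s that Fury X gets from HBM.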
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
You make so many matter-of-fact statements without knowing so many of the variables. You don't know Polaris's:

uArch changes
sp performance changes
memory compression changes
clockspeed
memory speed

Why don't you let AMD engineers worry about matching up a memory interface with their graphics core?

If it bugs you so much, ignore it. :)
 

Adored

Senior member
Mar 24, 2016
256
1
16