Upcoming GTX 750 Ti tested

ams23

Senior member
Feb 18, 2013
907
0
0
No, you don't get it. The GTX 750 Ti has a quad-channel 128-bit memory interface, which obviously limits performance in bandwidth-limited scenarios. The SMXs themselves are very clearly just as capable as any other Kepler SMX.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
Yup, perf/SMX can only be determined if there are no other limiting factors or if the ratio bandwidth/GFLOP is roughly the same.
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
Yup, perf/SMX can only be determined if there are no other limiting factors or if the ratio bandwidth/GFLOP is roughly the same.

boxleitnerb, I'm only presuming that Nvidia relies on more, but less complex, CUDA cores in order to improve the perf/W of its architectures. This technique allowed Nvidia to double theoretical shader performance from the GTX 580 to the GTX 680 with only one node jump. Before Kepler, Nvidia relied on complex CUDA cores to achieve maximum GPGPU performance.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
That doesn't negate anything I've said. Currently, a judgement of architectural efficiency of Maxwell is not possible.
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
That doesn't negate anything I've said. Currently, a judgement of architectural efficiency of Maxwell is not possible.

Your statement doesn't negate what I presumed (about less complex stream processors).
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
According to SweClockers, both the 750 and 750 Ti perform about 25% better than the 650 and 650 Ti, and both require no additional power connectors. If that is true, then Maxwell has some crazy energy efficiency, given that it is still on 28nm.

GTX 650 Ti has a rated TDP of 110W. If I'm not mistaken, the official limit for a card with no external power connector is 75W. That would represent such a big improvement that I'm skeptical they managed to pull it off on the same process node. If they did, then it's a massive coup.
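To put a rough number on that skepticism, here is a minimal sketch of the perf/W gain these figures would imply. It assumes TDP is a fair stand-in for real power draw and that the new card uses the full 75W slot budget, neither of which is confirmed; the function and constant names are just for illustration.

```python
# Lower bound on the perf/W gain implied by the figures quoted in this post.
# Assumptions (not confirmed): TDP approximates real power draw, and the
# new card draws the full 75 W available from the slot (it may draw less).

GTX_650_TI_TDP_W = 110.0      # rated TDP of the GTX 650 Ti
SLOT_POWER_LIMIT_W = 75.0     # limit for a card with no external connector
RELATIVE_PERFORMANCE = 1.25   # "about 25% better", per the SweClockers report


def perf_per_watt_gain(rel_perf, new_power_w, old_power_w):
    """Return how many times more performance per watt the new card delivers."""
    return rel_perf / (new_power_w / old_power_w)


gain = perf_per_watt_gain(RELATIVE_PERFORMANCE, SLOT_POWER_LIMIT_W, GTX_650_TI_TDP_W)
print(f"Implied perf/W gain: at least {gain:.2f}x")
```

If the card actually draws less than 75W, the implied gain would be even larger, which is why the claim looks so aggressive for the same process node.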
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
Your statement doesn't negate what I presumed (about less complex stream processors).

It does negate this part:
Viewing these scores, it seems that performance per stream processor of Maxwell will be lower than Kepler's.

You cannot judge performance of the SMX if there is another limiting factor. How hard is that to understand?
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
It seems legit that the older claims said the big Maxwell chip packs 6000 stream processors.
If Maxwell needs 960 SPs clocked at 1100 MHz to beat 768 Kepler SPs clocked at 928 MHz + boost, and consumes less on the same 28nm process, there's the magic.

Exactly!
I played around with some numbers yesterday:

GTX 650 Ti:
768 cores @ 1046MHz (803328)
4995 GPU points in 3DMark11

GTX 750 Ti:
960 cores @ 1176MHz (1128960)
5608 GPU points in 3DMark11

5608/4995 = 1.13
1128960/803328 = 1.40

These numbers mean that if the two GPUs had the same performance efficiency, GTX 750 Ti should score roughly 40% more than GTX 650 Ti. But it was only 13% more.
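For anyone who wants to redo the arithmetic, here is a small sketch using only the figures listed above. Note that cores x clock is a naive throughput proxy that ignores memory bandwidth, ROPs and everything else, so treat the "efficiency" number as illustrative only. Run as written, the score ratio comes out nearer 1.12 than 1.13.

```python
# Compare naive shader throughput (cores x clock) with the 3DMark11 GPU score.
# All figures are the ones quoted in this post; the 750 Ti values are rumored.

cards = {
    "GTX 650 Ti": {"cores": 768, "clock_mhz": 1046, "gpu_score": 4995},
    "GTX 750 Ti": {"cores": 960, "clock_mhz": 1176, "gpu_score": 5608},
}


def throughput(card):
    """Naive throughput proxy: core count multiplied by clock in MHz."""
    return card["cores"] * card["clock_mhz"]


old = cards["GTX 650 Ti"]
new = cards["GTX 750 Ti"]

throughput_ratio = throughput(new) / throughput(old)
score_ratio = new["gpu_score"] / old["gpu_score"]
# Fraction of the extra theoretical throughput that shows up in the score.
scaling_efficiency = score_ratio / throughput_ratio

print(f"Throughput ratio:   {throughput_ratio:.2f}")
print(f"Score ratio:        {score_ratio:.2f}")
print(f"Scaling efficiency: {scaling_efficiency:.2f}")
```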

Ergo, I think Maxwell cores are weaker than Kepler cores. But Maxwell cores use less power than Kepler cores.

That goes hand in hand with the rumor about a huge core count for Maxwell. They will need it for great performance, and Maxwell will allow it because it uses little power. BUT Nvidia has to wait for 20nm to be able to build these 3000+ core GPUs.
So the Maxwell cards coming next month are just an introduction to the architecture, meant to show off its power reductions. That is probably why Nvidia gave the green light to build the first ones on 28nm too. It's not about performance now.

SweClockers also said 2 days ago that the GTX 750 (non-Ti) could be fed through the PCIe slot alone, which also reinforces my belief that Maxwell is indeed very power efficient.

My 5 cents
:thumbsup:
 

Imouto

Golden Member
Jul 6, 2011
1,241
2
81
With weaker cores I'd really like to know the die size of this thing.
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
With weaker cores I'd really like to know the die size of this thing.

It's not weaker, it's less complex. That is Nvidia's way to improve efficiency.

But they are surely prioritizing perf/W gains in graphics (DX/OGL) and professional markets over general GPGPU performance... This is the inverse of AMD's approach with its GCN architecture.
In the next round we may see a replay of the R800 vs. Fermi battle, with Maxwell in the R800 role and Pirate Islands in the Fermi role.

It does this part:


You cannot judge performance of the SMX if there is another limiting factor. How hard is that to understand?

Which limiting factor do we have in these scores? Memory speed?

The 928 MHz + boost card (the original 650 Ti) generally runs above 1000 MHz. If Nvidia managed to pack the performance of a 110W TDP card into a card that needs no external power connector, that is a great improvement.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
It will be interesting to see how Maxwell does in cryptocoin (Scrypt) mining.

The reports so far indicate that the GTX 750 Ti will have 960 SPs. The question, then, is how a Maxwell SP compares to a GCN SP in hash rate. A lot of this depends on whether Nvidia fixed the integer rotate bottleneck (this takes only 1 cycle on GCN, but 3 on Kepler). If Maxwell can keep up with GCN in hashing on a per-SP basis, then the GTX 750 Ti could have a hash rate between a 7850 and 7870 - probably something from 350 to 400 KH/sec. At an estimated MSRP of $150, that would be competitive with AMD's Pitcairn-based cards on a KH/$ basis, while using less power. Of course all this is speculation - if they didn't make any adjustments to integer rotate, mining performance will remain inferior.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,605
6
81
Which limiting factor do we have in these scores? Memory speed?

The 928 MHz + boost card (the original 650 Ti) generally runs above 1000 MHz. If Nvidia managed to pack the performance of a 110W TDP card into a card that needs no external power connector, that is a great improvement.

Yes, memory bandwidth. Both the 750 Ti and the 650 Ti have 86.4 GB/s.
How should the former fully utilize its computing power in all situations?

The GTX 680 was already a bit bandwidth limited and it has roughly 3 TF and 192 GB/s. The 750 Ti has about 2/3 of the computing power but only 1/2 of the bandwidth. So it likely is even more bandwidth limited.
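The same argument can be written as bytes of bandwidth per FLOP. This is only a sketch: the core counts and clocks for the 650 Ti and 750 Ti are the ones quoted earlier in the thread (the 750 Ti's are rumored), the GTX 680 figure is the rough 3 TF mentioned above, and peak FLOPS assumes 2 FLOPs per core per clock (one fused multiply-add).

```python
# Bandwidth available per unit of peak compute, for the three cards discussed.
# Clocks and core counts for the 650 Ti / 750 Ti come from earlier posts in
# this thread and are not confirmed specifications.


def peak_gflops(cores, clock_mhz):
    """Peak single-precision GFLOPS, assuming 2 FLOPs per core per clock."""
    return 2 * cores * clock_mhz / 1000.0


cards = {
    "GTX 680": {"gflops": 3000.0, "bandwidth_gbs": 192.0},  # "roughly 3 TF"
    "GTX 650 Ti": {"gflops": peak_gflops(768, 1046), "bandwidth_gbs": 86.4},
    "GTX 750 Ti": {"gflops": peak_gflops(960, 1176), "bandwidth_gbs": 86.4},
}

# GB/s divided by GFLOPS gives bytes per FLOP; lower means more starved.
bytes_per_flop = {
    name: card["bandwidth_gbs"] / card["gflops"] for name, card in cards.items()
}

for name, ratio in bytes_per_flop.items():
    print(f"{name}: {ratio:.3f} bytes/FLOP")
```

With these inputs the 750 Ti ends up with the lowest bandwidth per FLOP of the three, which is the point being made; the exact ratios shift with whatever clocks you assume.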
 

ams23

Senior member
Feb 18, 2013
907
0
0
The rumored specs of GTX 750 Ti make it obvious that these are not "less complex" CUDA cores. The # of CUDA cores per SMX is the same. The # of TMU's per SMX is the same. The # of ROP's per 32-bit mem. channel is the same.
 

ams23

Senior member
Feb 18, 2013
907
0
0
These numbers mean that if the two GPUs had the same performance efficiency, GTX 750 Ti should score roughly 40% more than GTX 650 Ti. But it was only 13% more.

No it shouldn't, because memory bandwidth is essentially the same between GTX 750 Ti and GTX 650 Ti. It is very clear that GTX 750 Ti is bandwidth-limited in this particular benchmark.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
No it shouldn't, because memory bandwidth is essentially the same between GTX 750 Ti and GTX 650 Ti. It is very clear that GTX 750 Ti is bandwidth-limited in this particular benchmark.

No that is incorrect. 3DMark Performance preset will only utilize around 40-50% of the memory bus with a bandwidth of 86GB/s.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
GTX 750 Ti + GTX 750 is confirmed Maxwell.

TechPowerUp released GPU-Z version 0.7.6, the latest version of the popular lightweight graphics subsystem information, monitoring, and diagnostic utility. Version 0.7.6 comes with support for new GPUs, including NVIDIA's upcoming Maxwell architecture.

Added support for NVIDIA GTX Titan Black, GTX 750 (GM107), GTX 750 Ti (GM107), GeForce 840M (GM108), GTX 760 (192-bit), GT 750M (Apple), GT 735M, GT 720M

http://www.techpowerup.com/197297/techpowerup-gpu-z-0-7-6-released.html
 

ams23

Senior member
Feb 18, 2013
907
0
0
No that is incorrect. 3DMark Performance preset will only utilize around 40-50% of the memory bus with a bandwidth of 86GB/s.

3dmark is either GPU bound or CPU bound depending on the version and settings. There are only two logical reasons why a purported GTX 750 Ti would be "only" 13% ahead of GTX 650 Ti: it is GPU bound (which would mean memory bus / ROP bound, not shader bound!!!) or it is CPU bound (which doesn't appear to be the case here).

Try comparing GTX 650 Ti to GTX 650 Ti Boost. The only difference between the two (other than a meager 5-10% difference in clock speed) is the extra 32-bit mem. channel and corresponding increase in ROP's for the latter GPU. The CUDA core count is identical between the two. According to your logic, these two GPU's should have nearly identical perf. in 3dmark11, which is clearly not the case at these GPU-limited settings because the GTX 650 Ti Boost is more than 20% ahead of GTX 650 Ti in comparison! Now I'm not saying that 3dmark11 is completely bound by bandwidth (it isn't), nor does it scale linearly with bandwidth (it doesn't), but clearly bandwidth makes a significant impact on the end result.

If mem. bus interface (and hence ROP throughput) is increased on GTX 750 Ti to match that of GTX 650 Ti Boost, then it should always perform greater than or equal to the latter. The CUDA cores are not weaker in any way as far as I can tell based on the rumored specs (the CUDA cores per SMX, the TMU's per SMX, and ROP's per mem. channel are not cut down in any way).

On a side note, the pconline 3dmark11 P results for GTX 750 Ti look suspicious to me. Not only is it strange that it says HD 4600 for the GPU, but it also doesn't make sense that a GTX 750 Ti would ever be slower than a GTX 650 Ti when looking at the purported specs.

And FWIW, GTX 750 Ti appears to be a Kepler GPU that is influenced to some extent by Kepler.M (rather than a Maxwell GPU). It may be a bridge to Maxwell, so to speak. I don't think it is coincidence that CUDA core count, mem. bandwidth, and pixel fillrate [ROP throughput] are all 5x greater compared to Kepler.M in Tegra K1 (note that texture fillrate is 11x greater).
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
GTX 650 Ti Boost scores 22% more than GTX 650 Ti. Why?

A) GTX 650 Ti Boost is clocked 12% higher than GTX 650 Ti.
B) GTX 650 Ti Boost has 24 ROPs while GTX 650 Ti has 16 ROPs.
C) Memory bandwidth is moot on the 3DMark Performance preset. Sure, you can add a couple of GPU points, but it does not impact the 3DMark score because the memory bus is not being maxed out.

Memory bandwidth comes into play in the 3DMark Extreme preset, 1080p gaming and such. Not the 3DMark Performance preset with 720p, 1xAA and a measly 500MB of VRAM usage.
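A quick sketch of how much of that 22% the clock difference can account for. It assumes the score scales linearly with clock, which is the most generous case for the clock explanation, and it cannot tell ROPs apart from bandwidth, since both come with the extra memory channel.

```python
# Split the GTX 650 Ti Boost's lead into a clock part and a leftover part.
# Assumption: score scales linearly with core clock (a best case for clocks).

SCORE_GAIN = 1.22   # Boost scores 22% more than the plain 650 Ti
CLOCK_GAIN = 1.12   # Boost is clocked 12% higher
ROPS_BOOST = 24
ROPS_PLAIN = 16

# Whatever is left after dividing out the clock gain must come from
# something else: ROPs, memory bandwidth, or both.
residual_gain = SCORE_GAIN / CLOCK_GAIN
rop_ratio = ROPS_BOOST / ROPS_PLAIN

print(f"Gain not explained by clocks: {(residual_gain - 1) * 100:.1f}%")
print(f"ROP count ratio:              {rop_ratio:.2f}x")
```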

But who cares. I'm not buying this GPU anyway. GM104 on 20nm will be my next GPU (SLI).
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
GTX 650 Ti Boost scores 22% more than GTX 650 Ti. Why?

A) GTX 650 Ti Boost is clocked 12% higher than GTX 650 Ti.
B) GTX 650 Ti Boost has 24 ROPs while GTX 650 Ti has 16 ROPs.
C) Memory bandwidth is moot on the 3DMark Performance preset. Sure, you can add a couple of GPU points, but it does not impact the 3DMark score because the memory bus is not being maxed out.

Memory bandwidth comes into play in the 3DMark Extreme preset, 1080p gaming and such. Not the 3DMark Performance preset with 720p, 1xAA and a measly 500MB of VRAM usage.

But who cares. I'm not buying this GPU anyway. GM104 on 20nm will be my next GPU (SLI).

Memory bandwidth matters.

55307.png


The desktop 640 is clocked higher (950 mhz vs 900 mhz) but scores lower.
 

el etro

Golden Member
Jul 21, 2013
1,584
14
81
If mem. bus interface (and hence ROP throughput) is increased on GTX 750 Ti to match that of GTX 650 Ti Boost, then it should always perform greater than or equal to the latter. The CUDA cores are not weaker in any way as far as I can tell based on the rumored specs (the CUDA cores per SMX, the TMU's per SMX, and ROP's per mem. channel are not cut down in any way).

Understand: less complex cores don't mean weaker cores.

Memory bandwidth matters.

In all tests(in this case)?


55307.png


The desktop 640 is clocked higher (950 mhz vs 900 mhz) but scores lower.

I saw it. But how do you conclude that the Maxwell card is bandwidth limited?
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
Sigh. That is an entirely different benchmark.

The Fire Strike test runs at 1080p. It uses more VRAM. It has more graphics effects running, like tessellation and shadows.
Whatever. Believe what you guys want. I'm not taking part in this discussion anymore.
 

mindbomb

Senior member
May 30, 2013
363
0
0
I don't think the 128-bit bus on this card or Bonaire is that big of a deal. There is 7 GHz GDDR5, and DX11 has good texture compression, so I feel the 960 cores and 16 ROPs are going to end up being the bigger problem.
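As a rough illustration of what faster memory would buy on a 128-bit bus, here is a sketch of the standard bandwidth calculation. The 5.4 Gbps rate is inferred from the 86.4 GB/s figure quoted earlier in the thread, and "7 GHz" is read as 7 Gbps effective per pin; both are assumptions.

```python
# Peak memory bandwidth from bus width and effective per-pin data rate.
# Assumptions: 5.4 Gbps is back-calculated from the 86.4 GB/s quoted in this
# thread, and "7 GHz GDDR5" is taken to mean 7 Gbps effective per pin.


def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


rumored = bandwidth_gbs(128, 5.4)
with_7gbps = bandwidth_gbs(128, 7.0)

print(f"128-bit @ 5.4 Gbps: {rumored:.1f} GB/s")
print(f"128-bit @ 7.0 Gbps: {with_7gbps:.1f} GB/s")
print(f"Increase:           {(with_7gbps / rumored - 1) * 100:.0f}%")
```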