Question Why is the GTX 970 so much faster than the GTX 770?

Sam K.

Junior Member
Apr 21, 2020
2
0
6
Hello! I'm quarantined at home and, since I don't have anything better to do at the moment, I'm trying to figure out various GPU specs and their impact on real-world performance!

Anyhow, I hit a wall while comparing the GTX 770 with the GTX 970 and need your opinions on the matter.

First, look at the screenshot below. As you can see, both the GTX 770 and the 970 are running at the same core clock (1,215 MHz) and memory clock (3,506 MHz) and have the same 256-bit memory bus, which eliminates any bandwidth-related variables. So far so good.

Since we know that the GTX 770 has 1,536 cores whereas the 970 has 1,664, it's easy enough to calculate the theoretical GFLOPS of both GPUs at this 'exact' moment:

GTX 970: 0.002 × 1,215 MHz × 1,664 cores ≈ 4,043 GFLOPS.
GTX 770: 0.002 × 1,215 MHz × 1,536 cores ≈ 3,732 GFLOPS.
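
(In case anyone wants to check the math, here's the same calculation as a tiny snippet; the 0.002 factor is just 2 FLOPs per core per clock for an FMA, with MHz converted to GFLOPS:)

Code:
// Theoretical FP32 throughput = 2 FLOPs (one FMA) per CUDA core per clock.
// Clocks and core counts are the boosted values from the screenshot.
#include <stdio.h>

int main(void) {
    const double clock_ghz = 1.215;   /* both cards boosting to 1,215 MHz */
    const int cores_770    = 1536;    /* GK104, GTX 770 */
    const int cores_970    = 1664;    /* GM204, GTX 970 */

    double gflops_770 = 2.0 * clock_ghz * cores_770;   /* ~3,732 GFLOPS */
    double gflops_970 = 2.0 * clock_ghz * cores_970;   /* ~4,044 GFLOPS */

    printf("GTX 770: %.1f GFLOPS\n", gflops_770);
    printf("GTX 970: %.1f GFLOPS\n", gflops_970);
    printf("770 deficit: %.1f%%\n",
           100.0 * (1.0 - gflops_770 / gflops_970));   /* ~7.7% */
    return 0;
}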

As you can see, the GTX 770 is just ~8% slower than the GTX 970, or at least it should be, yet the frame rate suggests that the 770 is actually ~43% slower!

My question is a simple 'why'? Why the huge difference? What am I missing here? They should perform within a ~10% margin because they have the exact same memory bandwidth and clocks, and yet...

It's just super confusing!

So, any ideas?

[Attached screenshot: 770v970.jpg]
 

pepone1234

Member
Jun 20, 2014
36
8
81
Kepler had problems fully utilizing its CUDA cores in games not optimized for the architecture.
Its warp schedulers couldn't keep all of the CUDA cores in each SM busy on their own, which meant it needed per-game optimizations.

This is why the Maxwell SM has only 128 CUDA cores instead of 192.
 

NTMBK

Lifer
Nov 14, 2011
10,247
5,043
136
Kepler had problems fully utilizing its CUDA cores in games not optimized for the architecture.
Its warp schedulers couldn't keep all of the CUDA cores in each SM busy on their own, which meant it needed per-game optimizations.

This is why the Maxwell SM has only 128 CUDA cores instead of 192.

This. Kepler requires the shader compiler to find instruction-level parallelism (ILP) in order to max out the TFLOPS. From the Kepler tuning guide:

Also note that Kepler GPUs can utilize ILP in place of thread/warp-level parallelism (TLP) more readily than Fermi GPUs can. Furthermore, some degree of ILP in conjunction with TLP is required by Kepler GPUs in order to approach peak single-precision performance, since SMX's warp scheduler issues one or two independent instructions from each of four warps per clock.

This means that either you need the shader to be written in a way that provides ILP (e.g. loop unrolling), or you need very smart shader compilers to extract it.
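
To make that concrete: each Kepler SMX has 192 CUDA cores but four warp schedulers, so issuing one instruction from each of four warps per clock only covers 128 lanes; the last third of the SMX only gets used when a warp has a second, independent instruction that can be dual-issued. Here's a toy sketch of what "providing ILP" looks like in code (names and numbers are purely illustrative, not from the tuning guide):

Code:
// Two independent accumulators give each warp a second in-flight instruction
// (ILP) on every loop iteration; a single accumulator would be one long
// dependency chain that can't be dual-issued.
__global__ void dot_ilp2(const float *x, const float *y, float *out, int n)
{
    int tid    = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;

    float acc0 = 0.0f;   /* dependency chain 0 */
    float acc1 = 0.0f;   /* dependency chain 1, independent of chain 0 */

    // Unrolled-by-2 grid-stride loop: the two FMAs don't depend on each
    // other, so the scheduler can issue them back to back from one warp.
    for (int i = tid; i + stride < n; i += 2 * stride) {
        acc0 = fmaf(x[i],          y[i],          acc0);
        acc1 = fmaf(x[i + stride], y[i + stride], acc1);
    }
    // (Leftover tail elements and the block-wide reduction are omitted.)
    out[tid] = acc0 + acc1;
}

Maxwell sized the SM at 128 cores with the same four schedulers precisely so that plain one-instruction-per-warp issue keeps it fed, which is why it doesn't depend on this kind of hand-holding.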

Also, the GeForce 970 has more than triple the L2 cache (1.75 MB vs. 512 KB), so it has better average latency and less pressure on the memory bus. And Maxwell tweaked the cache hierarchy, combining the L1 and texture caches.
 
  • Like
Reactions: Sam K.

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Kepler was basically forgotten by nVidia once Maxwell came out. When The Witcher 3 launched, a GTX 960 was faster than a GTX 780, and came close to the less-than-a-year-old GTX 780 Ti. It made a lot of people very unhappy. nVidia did do some tweaks to make Kepler a bit faster in The Witcher 3, but they also had review sites test with special features like HairWorks disabled so the Kepler cards didn't look as bad.

The same holds true for newer games for the reasons stated above.
 
  • Like
Reactions: Ranulf and Sam K.

Sam K.

Junior Member
Apr 21, 2020
2
0
6
Kepler had problems fully utilizing its CUDA cores in games not optimized for the architecture.
Its warp schedulers couldn't keep all of the CUDA cores in each SM busy on their own, which meant it needed per-game optimizations.

This is why the Maxwell SM has only 128 CUDA cores instead of 192.

Could it be due to the number of ROPs? I just noticed that the 970 has 24 more ROPs than the 770 (56 vs. 32), which is a rather significant leap.

Do you think the drastically lower ROP count might be bottlenecking the 770?
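
For scale, peak pixel fill rate is roughly ROPs × core clock (one pixel per ROP per clock), so on paper the gap would look like this (a rough sketch, assuming those 32 vs. 56 figures are right):

Code:
// Rough peak pixel fill rate = ROPs * core clock, one pixel per ROP per clock.
#include <stdio.h>

int main(void) {
    const double clock_ghz = 1.215;   /* same boost clock on both cards */
    const int rops_770 = 32;          /* GK104 / GTX 770 */
    const int rops_970 = 56;          /* GTX 970 as configured */

    printf("GTX 770: %.1f Gpixels/s\n", rops_770 * clock_ghz);   /* ~38.9 */
    printf("GTX 970: %.1f Gpixels/s\n", rops_970 * clock_ghz);   /* ~68.0 */
    return 0;
}

Though I guess whether that matters depends on whether the game is actually fill-rate-bound rather than shader-bound.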
 

amenx

Diamond Member
Dec 17, 2004
3,980
2,220
136
Kepler was basically forgotten by nVidia once Maxwell came out. When The Witcher 3 launched, a GTX 960 was faster than a GTX 780, and came close to the less-than-a-year-old GTX 780 Ti. It made a lot of people very unhappy. nVidia did do some tweaks to make Kepler a bit faster in The Witcher 3, but they also had review sites test with special features like HairWorks disabled so the Kepler cards didn't look as bad.

The same holds true for newer games for the reasons stated above.
You are thinking GTX 970 vs 780 maybe, not the 960.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
You are thinking GTX 970 vs 780 maybe, not the 960.

On launch day the 960 was faster. After the driver update for Kepler, and some game changes (such as dropping the crazy tessellation), the 780 pulled ahead.
 

amenx

Diamond Member
Dec 17, 2004
3,980
2,220
136
On launch day the 960 was faster. After the driver update for Kepler, and some game changes (such as dropping the crazy tessellation), the 780 pulled ahead.
Nope, you are wrong. The GTX 960's release date was Jan 22, 2015. This is release-date performance:
I presume you have an alternate source/link?
I remember very clearly that the 960 was a very weak card for an x60 card, and it was much criticized for not keeping up with traditional Nvidia x60 performance.
 
Last edited:

nurturedhate

Golden Member
Aug 27, 2011
1,752
720
136
Nope, you are wrong. The GTX 960's release date was Jan 22, 2015. This is release-date performance:
I presume you have an alternate source/link?
I remember very clearly that the 960 was a very weak card for an x60 card, and it was much criticized for not keeping up with traditional Nvidia x60 performance.
It's not about whether the 960 was good or bad (it was a bad card); it's about whether it was faster specifically in The Witcher 3 with HairWorks on. It's more a statement on how high levels of tessellation broke Kepler than anything about the 960 and its Witcher 3 performance.
 

Spjut

Senior member
Apr 9, 2011
928
149
106
For modern games, there are strong indications that Nvidia doesn't care about Kepler anymore. The worst case right now is Doom Eternal; it even seems to be downright broken on Kepler.

Graphs start at 4:40. There's barely any difference between the various Kepler GPUs featured in the test, either.

Digital Foundry says the console version compares to Medium settings on PC.

If someone back in 2013 had said that the PS4, or even the freaking HD 7790, would humiliate the 780 Ti and Titan in late-gen games, he'd have been called an idiot.
 
Last edited:

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Nope, you are wrong. The GTX 960's release date was Jan 22, 2015. This is release-date performance:
I presume you have an alternate source/link?
I remember very clearly that the 960 was a very weak card for an x60 card, and it was much criticized for not keeping up with traditional Nvidia x60 performance.

As stated by nurturedhate, I was specifically referencing The Witcher 3 and the release date of that game.
 

Byte

Platinum Member
Mar 8, 2000
2,877
6
81
I've been using a 770 for a long time, as I have all my other cards at work doing stuff. It's pretty damn slow, but it will run things at low res, and most Blizzard games run OK. Going to a 970 almost feels like a two-generation leap. The Kepler gen is almost useless nowadays, and the only thing keeping mine going this long is that it's the 4GB RAM version. So to all those who said you'd never use that much VRAM: you can go down in history with the fake Bill Gates quote, "X RAM ought to be enough."
 

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
I've been using a 770 for a long time, as I have all my other cards at work doing stuff. It's pretty damn slow, but it will run things at low res, and most Blizzard games run OK. Going to a 970 almost feels like a two-generation leap. The Kepler gen is almost useless nowadays, and the only thing keeping mine going this long is that it's the 4GB RAM version. So to all those who said you'd never use that much VRAM: you can go down in history with the fake Bill Gates quote, "X RAM ought to be enough."

Yeah, I feel the same way about my 4GB 960. Bought it locally for $50 and it was great. It really stinks for today's games, but it played BF5 with higher than 2GB of VRAM usage at 900p. Performance was lackluster, that was for sure. I had a 2GB 770 back in the day in another build and it was pretty good for its time; I jumped to a 970 and it was a good upgrade.

This 1080 Ti may be the first card since the GTX 280 I've had that honestly didn't compromise on anything, lol. I picked it over the 2080 Super for its cheaper used pricing and its 3GB of additional VRAM, but what's funny is that nothing has really pushed past 6GB for me. COD BO4 did, though, but I see no game there that warrants the usage, and honestly I think it's allocating more than it's using.
 

loki1944

Member
Apr 23, 2020
99
35
51
Hello! I'm quarantined at home and, since I don't have anything better to do at the moment, I'm trying to figure out various GPU specs and their impact on real-world performance!

Anyhow, I hit a wall while comparing the GTX 770 with the GTX 970 and need your opinions on the matter.

First, look at the screenshot below. As you can see, both the GTX 770 and the 970 are running at the same core clock (1,215 MHz) and memory clock (3,506 MHz) and have the same 256-bit memory bus, which eliminates any bandwidth-related variables. So far so good.

Since we know that the GTX 770 has 1,536 cores whereas the 970 has 1,664, it's easy enough to calculate the theoretical GFLOPS of both GPUs at this 'exact' moment:

GTX 970: 0.002 × 1,215 MHz × 1,664 cores ≈ 4,043 GFLOPS.
GTX 770: 0.002 × 1,215 MHz × 1,536 cores ≈ 3,732 GFLOPS.

As you can see, the GTX 770 is just ~8% slower than the GTX 970, or at least it should be, yet the frame rate suggests that the 770 is actually ~43% slower!

My question is a simple 'why'? Why the huge difference? What am I missing here? They should perform within a ~10% margin because they have the exact same memory bandwidth and clocks, and yet...

It's just super confusing!

So, any ideas?

Some of that, beyond architecture, is going to be driver optimization and when the game came out. When Maxwell first came out, my 780 Ti would beat the GTX 980 most of the time, but now not so much. I don't think it's in Nvidia's interest to keep refining drivers for older cards, though from what I've seen, AMD cards seem to hold up longer (at least my range of 290/290X/390X/Nano).
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,149
1,096
136
I will throw this out there: the 970 is probably the best card for dollar/performance that Nvidia has had in quite some time, with regard to being able to hold its own in any game at 1080p. I still have a 7950 that is quite amazing considering its age, same caliber as the 970. I would say the 1080 is probably up there with the 970, but if you consider the price-to-performance ratio, the 970 wins that battle. The 960 is useless, but I don't want to start a flame war. Another card that did well long after its best years were gone was the GTX 460.

Consider the lack of innovation since 2014. That is probably why the 970 still performs well. In the past, the leap from generation to generation was massive. After the 1080, we have not seen much.
 

Guru

Senior member
May 5, 2017
830
361
106
Didn't Maxwell have much better memory compression? Something that carried over into Pascal as well, and was one of the main differences between Pascal's memory performance and Polaris's.

On top of that, its core architecture was a lot more balanced and could push data through faster, while Kepler had more bottlenecks, especially in run-of-the-mill games that weren't optimized for it.

So yeah, there is a lot of difference between the two in terms of architecture, plus the main ones like much better memory compression, more ROPs, a more dynamic and aggressive boost clock, etc...

And overall, games tend to be optimized for newer architectures, so over time the older generations become "slower" as games are tuned for the newer cards!

So yeah, 5% from ROPs, 5% from memory compression, 5% from a dynamic boost clock, 10% from core architecture improvements, 10% from game optimizations, and you can easily get up to 50% faster.

And as others have said, Nvidia's GTX 460, GTX 970, and GTX 1060 6GB are some of the better value cards from Nvidia. In between those, Nvidia has really been terrible at offering good value to consumers. How many people were tricked by the GTX 960, for example, which was a garbage card: a low-end, low-tier card advertised as mid-range and overpriced at $240.

The GTX 660 was another dud, an overpriced "mid-range" card that was actually low-end and much slower than the GTX 770, etc...

So yeah, with Nvidia you really have to be careful, but if you do catch one of these value cards, they end up being amazing for several years and hold their own.
 
Last edited:
  • Like
Reactions: Ranulf

VirtualLarry

No Lifer
Aug 25, 2001
56,389
10,072
126
The GTX 460 was a unique gem, IMHO. It was "special", in that it was superscalar, unlike its other Fermi brethren, IIRC. It sure was a long-lived card; a friend of mine used his for YEARS with his Core 2 Quad rig. (I eventually managed to snag a 2GB version for him.)
 

DamZe

Member
May 18, 2016
187
80
101
Oh, I remember rocking a GTX 770 2GB and regretting ever buying that thing, even with the deep discount, back in 2014. It was painful playing The Witcher 3 at 1080p on medium settings while regularly dropping to 40-something FPS.

Kepler was a one-trick pony with too little VRAM for the price, and the optimizations it needed to run well were forgone the moment Maxwell hit the shelves.
 

JustMe21

Senior member
Sep 8, 2011
324
49
91
I think the biggest performance increase came from introducing tile-based rendering into the Maxwell architecture, which also reduced the power requirement.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Some of that, beyond architecture, is going to be driver optimization and when the game came out. When Maxwell first came out, my 780 Ti would beat the GTX 980 most of the time, but now not so much. I don't think it's in Nvidia's interest to keep refining drivers for older cards, though from what I've seen, AMD cards seem to hold up longer (at least my range of 290/290X/390X/Nano).

There are a couple of practical reasons AMD cards "hold up longer" than Nvidia's, or at least why first-gen GCN has held up better than Kepler.

-Nvidia replaced their entire product stack between the 700 and 900 series going from Kepler to Maxwell, again between the 900 and 1000 series going from Maxwell to Pascal, and then between the 1000 and 2000 series going from Pascal to Turing. They haven't sold new Kepler cards for a long time. AMD, on the other hand, has just sort of introduced new GCN generations piece by piece, rebadging older cards for newer product stacks. They were selling first-gen GCN cards up until the 400 series, and they're still selling fourth-gen GCN (aka Polaris). So it has been in AMD's interest to keep up driver support for older generations of GCN, while neglecting Kepler doesn't really affect Nvidia's bottom line.

-The difference between each new Nvidia architecture has been more significant than the difference between GCN generations. There were improvements and refinements made to GCN, but the core architecture remained the same, which made backporting driver improvements relatively easy. The differences between Nvidia architectures likely mean you can't take driver improvements made for Turing or Pascal and apply them as easily to Kepler.

So it's a mixture of it being both necessary and simpler for AMD to keep up driver support for older GCN products. That's not to say Nvidia makes bad products, but you go to them for the latest and greatest. Buy a card at whatever price point you can afford to replace at the same price in 2 or 3 years, whether that's an "xx80" or an "xx60" card. I mean, consider someone who bought a GTX 680 or 780 with the intention that "oh, I'll just buy the fastest Nvidia graphics card and I'll be set to play games for this whole console generation". Then look at those benchmarks from Doom Eternal. Woof. Anyone spending big money on a 2080, or 3080, expecting to get top-level performance out of it for 5 years or more is making a sucker's bet.

And it's not like AMD is some champion of the people in this situation. Again, things have worked out this way out of necessity and relative ease for AMD. They don't have the R&D resources and the brand value to do what Nvidia does with replacing their whole product stack with a new architecture every couple years. When they have done a full architecture replacement, they've behaved just like Nvidia, or worse. Just look at the last cards on their previous "Terascale" architecture, the Radeon HD 6000 series -- the last WHQL driver was in 2015, and the last beta driver was in 2016. I'm betting a 6970 would do even worse in Doom Eternal - heck, it doesn't even support Vulkan, it would have to run the game in OpenGL. Once RDNA gains market share, we'll probably see GCN finally fall by the wayside.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
There are a couple of practical reasons AMD cards "hold up longer" than Nvidia's, or at least why first-gen GCN has held up better than Kepler.

-Nvidia replaced their entire product stack between the 700 and 900 series going from Kepler to Maxwell, again between the 900 and 1000 series going from Maxwell to Pascal, and then between the 1000 and 2000 series going from Pascal to Turing. They haven't sold new Kepler cards for a long time. AMD, on the other hand, has just sort of introduced new GCN generations piece by piece, rebadging older cards for newer product stacks. They were selling first-gen GCN cards up until the 400 series, and they're still selling fourth-gen GCN (aka Polaris). So it has been in AMD's interest to keep up driver support for older generations of GCN, while neglecting Kepler doesn't really affect Nvidia's bottom line.

-The difference between each new Nvidia architecture has been more significant than the difference between GCN generations. There were improvements and refinements made to GCN, but the core architecture remained the same, which made backporting driver improvements relatively easy. The differences between Nvidia architectures likely mean you can't take driver improvements made for Turing or Pascal and apply them as easily to Kepler.

So it's a mixture of it being both necessary and simpler for AMD to keep up driver support for older GCN products. That's not to say Nvidia makes bad products, but you go to them for the latest and greatest. Buy a card at whatever price point you can afford to replace at the same price in 2 or 3 years, whether that's an "xx80" or an "xx60" card. I mean, consider someone who bought a GTX 680 or 780 with the intention that "oh, I'll just buy the fastest Nvidia graphics card and I'll be set to play games for this whole console generation". Then look at those benchmarks from Doom Eternal. Woof. Anyone spending big money on a 2080, or 3080, expecting to get top-level performance out of it for 5 years or more is making a sucker's bet.

And it's not like AMD is some champion of the people in this situation. Again, things have worked out this way out of necessity and relative ease for AMD. They don't have the R&D resources and the brand value to do what Nvidia does with replacing their whole product stack with a new architecture every couple years. When they have done a full architecture replacement, they've behaved just like Nvidia, or worse. Just look at the last cards on their previous "Terascale" architecture, the Radeon HD 6000 series -- the last WHQL driver was in 2015, and the last beta driver was in 2016. I'm betting a 6970 would do even worse in Doom Eternal - heck, it doesn't even support Vulkan, it would have to run the game in OpenGL. Once RDNA gains market share, we'll probably see GCN finally fall by the wayside.

You seem to imply that each new set of cards that nVidia puts out is a ground-up redesign. This is not true. They are iterations on the previous one; they just come up with new code names for each one. As for AMD, GCN is *NOT* the architecture of each chip, it is the ISA. AMD has had GCN 1.0 through GCN 1.5.

It was already stated earlier in this thread why Kepler didn't hold up: it required some very specific optimizations from the game developer to make it perform well. Also, when the 600 series came out, nVidia ripped out a ton of the compute hardware compared to Fermi. At the same time, AMD added a ton of compute. That compute helped make the AMD cards more future-proof as later games started to use compute more and more. AMD had also added features that were unused at the time of launch but came into use later on.

There is also the fact that nVidia is not interested in keeping old cards going; they want their newest cards to look better. At the time the RTX series launched, nVidia's biggest competition was themselves. AMD has proven they keep older cards going longer. Just look at benchmarks of the 390 from when it came out versus now. It is punching way above its weight.

Not sure why you bring up the HD 6000 series; it launched in 2010, at roughly the same time as the GTX 500 series. Both of these GPU families are next to worthless today, TEN YEARS later. The most recent driver for the GTX 580, by the way, is from 2015.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
You seem to imply that each new set of cards that nVidia puts out is a ground-up redesign. This is not true. They are iterations on the previous one; they just come up with new code names for each one. As for AMD, GCN is *NOT* the architecture of each chip, it is the ISA. AMD has had GCN 1.0 through GCN 1.5.

It was already stated earlier in this thread why Kepler didn't hold up: it required some very specific optimizations from the game developer to make it perform well. Also, when the 600 series came out, nVidia ripped out a ton of the compute hardware compared to Fermi. At the same time, AMD added a ton of compute. That compute helped make the AMD cards more future-proof as later games started to use compute more and more. AMD had also added features that were unused at the time of launch but came into use later on.

There is also the fact that nVidia is not interested in keeping old cards going; they want their newest cards to look better. At the time the RTX series launched, nVidia's biggest competition was themselves. AMD has proven they keep older cards going longer. Just look at benchmarks of the 390 from when it came out versus now. It is punching way above its weight.

Not sure why you bring up the HD 6000 series; it launched in 2010, at roughly the same time as the GTX 500 series. Both of these GPU families are next to worthless today, TEN YEARS later. The most recent driver for the GTX 580, by the way, is from 2015.

What I said is that the difference between each architecture Nvidia has put out has been more significant than the difference between generations of GCN. That's not the same as saying every new Nvidia architecture has been a "ground-up redesign". (Btw, the GCN "1.1, 1.2, 1.3", etc. numbering was never AMD's official nomenclature for new iterations of GCN. That was Ryan Smith of AnandTech, who did it because the differences were so minimal that he thought it was more appropriate than treating the iterations as new architectures.)

The reduction in compute functionality plays a role in why Kepler hasn't held up as well as first-gen GCN, sure. However, that mainly applies to GK104 and GK106; the Titan had a lot of the compute functionality the lower-end chips lacked and was considered a compute beast at the time of release.

I brought up the HD 6000 series because it's the immediate predecessor of the first generation of GCN cards. If AMD's so good at keeping older cards going, surely the 6000 series held up just as well, right? Well, it turns out you can't even properly test it, because the last beta driver release was in 2016. AMD didn't end driver support for the 7000 series until 2018, so why the early end for the 6000 series? My point is that it indicates there are other factors at play besides Good Guy AMD just keeping up support for your old graphics card because they're such a swell company.

(And by the way, that's incorrect; the last driver release for the 580 is from 2018.)