Dual core vs Quad core with and without HT (PCLab.pl)

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
PCLab.pl has put together an interesting review that compares the performance of a 4770K in several modes of operation: dual core w/o HT, dual core w/ HT, quad core w/o HT and finally quad core w/ HT.

They also use two different clock speeds. It's a fairly good exposé on the state of multithreading in modern PC games, and it confirms something many of us have long known or suspected: that NVidia has much better multithreading support than AMD in their drivers.

I remember Balla posted a thread about a week or so ago asking whether anyone was CPU limited in AC IV. These graphs suggest that the lackluster performance he was getting was due to AMD's inferior multithreading support:

2C_gpu_acbf.png


2CT_gpu_acbf.png


4C_gpu_acbf.png


I'm not sure if the game supports DirectX multithreaded rendering like its predecessor did, but the engine is clearly multithreaded and as you can see, NVidia's drivers make much better use of the CPU's resources.

NVidia has had general multithreading support in their drivers since the first dual core processors became available years ago, and has obviously been improving upon and refining it since then, as the difference is rather astonishing imo..
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
I'm not sure if the game supports DirectX multithreaded rendering like its predecessor did, but the engine is clearly multithreaded and as you can see, NVidia's drivers make much better use of the CPU's resources.

NVidia has had general multithreading support in their drivers since the first dual core processors became available years ago, and has obviously been improving upon and refining it since then, as the difference is rather astonishing imo..

nv drivers need more CPU power?
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
nv drivers need more CPU power?

AMD has optimized for a singlethread driver. nVidia for a multithreaded one. Obviously the nVidia way scales better.

But check the review, it also depends largely on the game.

w2.png

sc2.png

c3j.png


It's not that black and white.

But it's perhaps worth noting that nVidia's drivers work very well with HT in scaling terms.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
But check the review, it also depends largely on the game.

Yeah, if the game engine cannot take advantage of multiple cores to begin with, then NVidia's driver multithreading enhancements seem to go bust.

But it's perhaps worth noting that nVidia's drivers work very well with HT in scaling terms.
Indeed, look at the Crysis 3 benchmark with HT enabled compared to the one you posted:

4CT_gpu_c3j.png


Turning on HT allows the underclocked quad core to outperform the double clocked dual core.. Underclocking the CPU is hurting HT's performance I'm certain, as HT performance depends greatly on the bandwidth and latency of the cache. In other reviews I've seen with stock or overclocked CPUs, HT gives a bigger boost in Crysis 3..

I think PCLab.pl needlessly complicated the review by adding the clock speed variable into the mix, or at the least they should have covered all the bases by testing the quad core at 4.6 GHz as well with HT on and off.

Instead they left the quad core at only 2.3 GHz throughout the entire review. :|
 

SPBHM

Diamond Member
Sep 12, 2012
5,046
402
126
it's funny because AMD CPUs have more (slower) cores, while their VGA drivers do better with fewer, higher-performance cores relative to NV.
 

mikk

Diamond Member
May 15, 2012
3,987
1,882
136
I'm not sure if the game supports DirectX multithreaded rendering like its predecessor did, but the engine is clearly multithreaded and as you can see, NVidia's drivers make much better use of the CPU's resources.


It is the other way around. Nvidia has issues with 3 or fewer threads and therefore performs subpar. AMD's driver does not have issues with 2 threads. With 4 threads or more, Nvidia runs properly as it should. The bad 2-thread driver behavior from Nvidia has been known for ages. AC4 barely uses 2 cores, so it has nothing to do with a better driver MT capability. In general AC4 performs better with Nvidia.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,392
8,258
126
from those first two graphs it looks more like nvidia sucks at getting performance from dual cores. AMD has nice constant performance. nvidia's 2C performance blows goats.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
from those first two graphs it looks more like nvidia sucks at getting performance from dual cores. AMD has nice constant performance. nvidia's 2C performance blows goats.

It's the offset of a singlethreaded vs a multithreaded driver with command lists.
 

mikk

Diamond Member
May 15, 2012
3,987
1,882
136
It's the offset of a singlethreaded vs a multithreaded driver with command lists.


Multithreaded driver command lists are a different thing. What games besides Civilization 5 and Project Cars make use of driver command lists?
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
from those first two graphs it looks more like nvidia sucks at getting performance from dual cores. AMD has nice constant performance. nvidia's 2C performance blows goats.

Did you look at the whole article? For 2C performance AMD is significantly ahead. Add HT and Nvidia and AMD are pretty much equal (@ 4.6 GHz).

Which is really to be expected since, at least for Intel, only the sub-$100 market really lacks HT.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
It is the other way around. Nvidia has issues with 3 or fewer threads and therefore performs subpar. AMD's driver does not have issues with 2 threads. With 4 threads or more, Nvidia runs properly as it should. The bad 2-thread driver behavior from Nvidia has been known for ages. AC4 barely uses 2 cores, so it has nothing to do with a better driver MT capability. In general AC4 performs better with Nvidia.

Um no..

Looking at the benchmarks, you begin to see a pattern. In multithreaded games, NVidia starts to get the edge when more than two threads are being used by the game.

This is indicative of their driver model, which I'm sure is designed at the foundational level to tap into the multithreaded capabilities of both the engine and the CPU..

AMD's driver model on the other hand cannot seem to scale beyond two threads, even if the engine supports it.

This BF4 graph supports what I'm saying, and Frostbite 3 engine purportedly can scale all the way to 8 threads:

bf4.png


4CT_gpu_b3.png
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Multithreaded driver command lists are a different thing. What games besides Civilization 5 and Project Cars make use of driver command lists?

Assassin's Creed 3.. But these improvements aren't due to driver command lists..

They are just general multithreading enhancements built in to the driver over time by NVidia.

Here's an old Tech Report article concerning NVidia implementing multithreading into their drivers..

That was 8 years ago, so I'm sure the driver's multithreading capabilities have been vastly expanded since then.
 

mikk

Diamond Member
May 15, 2012
3,987
1,882
136
Um no..

Looking at the benchmarks, you begin to see a pattern. In multithreaded games, NVidia starts to get the edge when more than two threads are being used by the game.


If true, Nvidia drivers should produce a higher CPU load overall, but we don't know if this is the case (in this test at least). It's possible that in some games Nvidia has the lower CPU overhead and in others AMD does. That would be a second explanation, if these tests are really all CPU bound.


AMD's driver model on the other hand cannot seem to scale beyond two threads, even if the engine supports it.

I played Crysis 3 on a HD7970 and i5-4670 not long ago and the scaling was perfect, almost 100%. If AMD didn't scale with more than 2 cores in general, they would be hopelessly lost in many other games, which is not the case. Nvidia has a known issue when only 3 or fewer threads are available, and that's why the scaling looks much better on Nvidia. Did PCLab use Windows 8.1? Maybe AMD doesn't gain from the DX11.1 speedup in BF4 for some reason. Nvidia gains quite a lot under Windows 8.1 due to the DX11.1 speedup in BF4.


Assassin's Creed 3.. But these improvements aren't due to driver command lists..

CPU multithreading itself is nothing new. ShintaiDK implied the difference is due to driver command lists, which are supposed to further improve the scaling. Of course it is a possibility that BF4 uses driver command lists.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I played Crysis 3 on a HD7970 and i5-4670 not long ago and the scaling was perfect, almost 100%. If AMD didn't scale with more than 2 cores in general, they would be hopelessly lost in many other games, which is not the case. Nvidia has a known issue when only 3 or fewer threads are available, and that's why the scaling looks much better on Nvidia. Did PCLab use Windows 8.1? Maybe AMD doesn't gain from the DX11.1 speedup in BF4 for some reason. Nvidia gains quite a lot under Windows 8.1 due to the DX11.1 speedup in BF4.

OK I must admit that I made a mistake by saying that AMD's drivers don't scale beyond two threads. It would perhaps be more accurate to say that AMD drivers don't use more than two threads as effectively as NVidia, but they still use it.

But it really depends on what area of Crysis 3 you're benching as well. The "Root of all Evil" level is the most GPU intensive area in the game by far, and the benchmarks show that.

Conversely, the "Welcome to the Jungle" level is the most CPU intensive area in the game, due to all of the physics on the grass..

CPU multithreading itself is nothing new. ShintaiDK implied the difference is due to driver command lists, which are supposed to further improve the scaling. Of course it is a possibility that BF4 uses driver command lists.

Judging by the large deltas in BF4 and AC4, it would not be inconceivable that both the FB 3.0 engine and Ubisoft's updated Anvil Next engine support DX11 multithreading.

The performance deltas are just too large for general driver multithreading enhancements to explain I think.. I remember Repi (the BF lead programmer) wanting to implement driver command lists in BF3, but he couldn't get it to work right with NVidia's drivers and so it was never implemented.

With BF4 however, maybe whatever issues previously prevented driver command lists from being implemented have been resolved..

As for AC IV, it's highly likely that it supports driver command lists, as AC 3 did and they're using an updated version of the same engine...
 

parvadomus

Senior member
Dec 11, 2012
685
14
81
Looks like the NV driver performs better with more threads, and AMD's with fewer.
Anyways, use a quad-core @ 4.6 GHz and the 290X will destroy the 780 easily. Personally I find the review irrelevant, as it does not add this configuration to the graphs to show real world performance.
 

Skurge

Diamond Member
Aug 17, 2009
5,195
1
71
2c 4.6 vs 4c 2.3
makes sense...

also if you go with many cores + high performance per core you can hide this characteristic.

Oh yeah. Only just saw that the 2c tests were at exactly double the clocks. Although I think they should have thrown 3c tests in there as well.
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
I don't see a point in a test like this if they change the clocks. This is like that logic that people who know nothing about CPUs use to determine "effective" clockspeed (I have 4 cores at 3.5GHz each, so I have a 14GHz CPU!!!).
 

Grooveriding

Diamond Member
Dec 25, 2008
9,107
1,260
126
Some of the most pointless and broken benchmarks I have ever seen... what are they trying to prove? Don't play games with your Haswell underclocked to 2.3 GHz? Okay....

I've seen other broken reviews from PCLab before as well. Their Battlefield 4 benchmarks were the complete opposite of benches done on all the other review sites. I haven't looked at everything they've ever done, so those may have been aberrations and broken testing on their part, but in general it's not a site I would bother to even open, given the other options out there and bizarre tests like this one.
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
Underclocked CPU can make sense to make the test CPU bound.

Yes, so it does prove which GPU is more thread-bound and which is more speed-bound... but neither test is useful in the real world unless you're looking at a laptop, which can only barely match this GPU power with a dual-GPU configuration, and on top of that the quad-core will always have HT, will most likely have a higher base speed, and will have a higher turbo speed for sure.
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I don't see a point in a test like this if they change the clocks. This is like that logic that people who know nothing about CPUs use to determine "effective" clockspeed (I have 4 cores at 3.5GHz each, so I have a 14GHz CPU!!!).

I think the reason they did that was to increase dependency on and highlight the performance gain from multithreading as much as possible.
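
The 2C @ 4.6 GHz vs 4C @ 2.3 GHz setup discussed above can be sanity-checked with a simple Amdahl's law model. This is only a rough sketch and not anything taken from the PCLab.pl review: the function name `relative_speed` and the parallel fractions are made up for illustration, and the model ignores HT, cache effects and driver overhead entirely.

```python
def relative_speed(cores: int, clock_ghz: float, parallel_fraction: float) -> float:
    """Amdahl-style speed estimate in arbitrary units (higher is faster).

    Hypothetical model: the serial part of the work runs on one core,
    the parallel part is split evenly across all cores, and speed
    scales linearly with clock.
    """
    serial_time = (1.0 - parallel_fraction) / clock_ghz
    parallel_time = parallel_fraction / (cores * clock_ghz)
    return 1.0 / (serial_time + parallel_time)


# Illustrative parallel fractions only, not measured values.
for p in (0.0, 0.5, 0.9, 1.0):
    dual = relative_speed(2, 4.6, p)
    quad = relative_speed(4, 2.3, p)
    print(f"parallel fraction {p:.1f}: 2C@4.6 = {dual:.2f}, 4C@2.3 = {quad:.2f}")
```

In this model the 2.3 GHz quad core can at best tie the 4.6 GHz dual core, and only when the work is perfectly parallel. That is why adding clocks together into an "effective" speed is misleading, and why this kind of setup leans so heavily on how well the game and driver spread work across threads.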