Overclock your uncore. :)

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Just figured I'd post some results of a little uncore bump. It's only a single bench, but I think it demonstrates the importance of uncore overclocking.

I ran some benches for Idontcare to see how my PhI and PhII stacked up against his Q6700 (I think that's what he has). As the cores of the PhII get pushed to higher speeds, I think the 1.8GHz uncore starts to hold them back; it can't feed them fast enough, so to speak.

IDC has a bench he runs that crunches some numbers, 28 jobs. The faster your CPU is the faster the bench finishes.

For this run I had four instances going at once (one for each core). The cores were at 3.7GHz, the uncore untouched. The benches finished from 8:47 - 8:50.

With this run I had the uncore at 2070MHz, but the core at just 3.4GHz. Even with the cores having a ~300MHz disadvantage the jobs finished in 8:17 - 8:19. I think as the cores got to higher MHz the uncore couldn't feed them, so the scaling gets pretty poor.

And here I ran the uncore at 2070MHz with the cores at just under 3.7GHz.

So, unless my math is off (very possible :) ), the moderate overclock of the uncore to 2070MHz gained about a 10% speed advantage at the same core speed: 477 seconds for the first finished job at 2070MHz/3.68GHz vs. 527 seconds for the first finished job with the chip at 1.8GHz/3.7GHz.
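
For anyone who wants to check the math, here is a minimal Python sketch of the comparison. The times are the ones quoted in this post (8:47 for the first finished job at 1.8GHz uncore, 477 seconds at 2070MHz); the function names are just illustrative.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '8:47' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(baseline_s: float, tuned_s: float) -> float:
    """Return how many times faster the tuned run is than the baseline."""
    return baseline_s / tuned_s


stock_uncore_s = to_seconds("8:47")  # first finished job, 1.8GHz uncore / 3.7GHz core
tuned_uncore_s = 477                 # first finished job, 2070MHz uncore / 3.68GHz core

gain = speedup(stock_uncore_s, tuned_uncore_s) - 1.0
time_saved = 1.0 - tuned_uncore_s / stock_uncore_s

print(f"stock uncore run: {stock_uncore_s} s")
print(f"throughput gain:  {gain:.1%}")
print(f"time saved:       {time_saved:.1%}")
```

Note the two ways of stating the result: the jobs finish in roughly 9.5% less time, which works out to roughly 10.5% more work per unit of time, so "about 10%" holds either way.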

I'm probably not telling you guys anything you didn't know, but a moderate overclock of the uncore delivered 10% better performance clock for clock. So, if you were just using your multiplier to overclock your PhII, it's a good idea to play around with your HT speed, as you're leaving some oomph on the table otherwise.
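
To illustrate why raising the HT reference clock moves the uncore while the core multiplier alone does not, here is a rough Python sketch. It assumes both clocks are derived as reference clock times a multiplier, with a 200MHz stock reference and a 9x uncore (NB) multiplier; those values and the core multipliers below are assumptions that happen to reproduce the clocks quoted in this post, not settings stated anywhere in the thread.

```python
def derived_clocks(ref_mhz: float, core_mult: float, nb_mult: float) -> dict:
    """Core and uncore (NB) clocks, assuming each is reference clock x multiplier."""
    return {"core_mhz": ref_mhz * core_mult, "nb_mhz": ref_mhz * nb_mult}


# Assumed settings (hypothetical, chosen to match the clocks quoted above).
multiplier_only = derived_clocks(ref_mhz=200, core_mult=18.5, nb_mult=9)
raised_reference = derived_clocks(ref_mhz=230, core_mult=16, nb_mult=9)

print("multiplier only: ", multiplier_only)
print("raised reference:", raised_reference)
```

Under those assumptions, the multiplier-only overclock leaves the uncore at 1800MHz, while a 230MHz reference gives 2070MHz uncore and 3680MHz core, in line with the 2070MHz/3.68GHz run.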

Nothing earth-shattering here, but I just wanted to share some results in case anyone was curious about what overclocking the uncore could provide in performance. Any thoughts, please feel free to share. :)
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Originally posted by: Modular
Maybe I'm slow, but what's an "uncore"?

The Phenom chips have the cores as well as the IMC (integrated memory controller) and L3 cache that operate at a different clock speed than the cores. So overclocking the IMC/L3 can give some decent performance increases in certain scenarios.
 

faxon

Platinum Member
May 23, 2008
2,109
1
81
the part of the CPU that isn't ALU, FPU, or cache. newer CPUs that have other functions built into them (RAM controllers, PCI-E, etc.) have these functions in an area of the CPU called the un-core or uncore, which basically means the part that isn't the core
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
PhII Uncore

An extra benefit of the Socket-AM3 Phenom II processors is that their uncore (memory controller + L3 cache) will be clocked at 2.0GHz instead of 1.8GHz like the two processors launching today. By comparison the Phenom 9850 and 9950 both have a 2.0GHz uncore clock; AMD had to go down to 1.8GHz to launch the Phenom II at 2.8GHz and 3.0GHz today.

As 45nm yields improve AMD will increase the uncore frequency, but today it's at 1.8GHz and the AM3 chips will have it at 2.0GHz.

There was a guy in the GPU forum who saw pretty impressive fps gains in certain games on a TripleCF 4850 rig when he overclocked his PhII 940 uncore (as much as or more than from overclocking the core speed).

I think this is a major factor in how much better the i7 chips perform with multiGPU rigs versus C2Q chips (cores are able to share data more freely, better throughput, smoother transfer to the correct GPU, etc).
 

DrMrLordX

Lifer
Apr 27, 2000
22,533
12,402
136
As I've said before, "uncore" only applies to Nehalem chips. On K10 chips it's the Northbridge or NB. Otherwise the OP's recommendations are spot-on. It's a shame the NB does not/can not reach the same clock speeds as the cores can in K10 chips.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Originally posted by: DrMrLordX
As I've said before, "uncore" only applies to Nehalem chips. On K10 chips it's the Northbridge or NB. Otherwise the OP's recommendations are spot-on. It's a shame the NB does not/can not reach the same clock speeds as the cores can in K10 chips.

Sorry, it's just the terminology I was using with IDC in PM's... but as long as you guys know what I'm talking about I guess. :)

I'd like to keep tinkering and pushing to see where I stop getting performance returns. I've seen plenty of PhII's with the L3/NB anywhere between 2.2 and 2.7GHz.
 

Flipped Gazelle

Diamond Member
Sep 5, 2004
6,666
3
81
I've gotten my L3/NB - PhII "Uncore", it makes sense to call it that since it refers to the L3 cache & memory controller - to 2.3 successfully w/a .1v boost. More than that causes massive instability.
 

Gikaseixas

Platinum Member
Jul 1, 2004
2,836
218
106
i have my "uncore" at 2.4 but never tried to push it higher. I'll try 2.5 to see what happens :laugh:
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: SlowSpyder
Originally posted by: DrMrLordX
As I've said before, "uncore" only applies to Nehalem chips. On K10 chips it's the Northbridge or NB. Otherwise the OP's recommendations are spot-on. It's a shame the NB does not/can not reach the same clock speeds as the cores can in K10 chips.

Sorry, it's just the terminology I was using with IDC in PM's... but as long as you guys know what I'm talking about I guess. :)

Yes Intel came up with the term uncore to describe the non-core components inside the chip, but there is no reason to not use the terminology when speaking of the non-core areas on an AMD chip.

The generic use of the terminology to refer to the uncore areas of AMD chips as one does to the uncore area of Nehalem is no different than calling Nehalem a monolithic quad-core chip just like Phenom, or calling C2D and X2 both dual-core chips.

To be sure the uncore contains different things in the AMD versus Intel chips, but at this time the overlap in components is relevant as both chips allow independent clock domains of their IMC/L3$ versus the core, and changing that clockspeed can have a real impact on application performance just as SlowSpyder is observing.

edit: [shameless self-interest plug] If anyone here has a yorkfield quad (preferably a QX) or an i7 (preferably a 965) and would like to run this benchmark at 3-4 different clockspeeds on your rig so I can add data to my graph I sure would appreciate it, pm me if interested [/shameless self-interest plug]
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Originally posted by: Idontcare
edit: [shameless self-interest plug] If anyone here has a yorkfield quad (preferably a QX) or an i7 (preferably a 965) and would like to run this benchmark at 3-4 different clockspeeds on your rig so I can add data to my graph I sure would appreciate it, pm me if interested [/shameless self-interest plug]

Looking for grounds to do a system upgrade or something? ;)

If AT can refer to it as an "uncore" on the AMD chips then I certainly don't have any problem using that terminology. (See the link I gave above.)
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Originally posted by: Rhoxed
Originally posted by: DrMrLordX
That's some serious voltage on a Phenom II. What are you using to cool it?

xigmatek with 2 110cfm fans

I was thinking about going above 1.5, but figured with summer coming I better keep it a bit lower. The room my PC is in gets very warm in summer. So I'm not sure that 1.55 volts and an 83F room would be a good idea. ;)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Denithor
Looking for grounds to do a system upgrade or something? ;)

:thumbsup: Always!

Originally posted by: Denithor
If AT can refer to it as an "uncore" on the AMD chips then I certainly don't have any problem using that terminology. (See the link I gave above.)

Agreed! I hadn't checked it out when I made my post, now I see and agree with your observation and implication.
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
I've actually wondered the same thing about the i7 chips - how much effect do you see from pushing the uncore above its rated point?

Especially in cases where the i7 really shines - multiGPU gaming for example (take a look, the i7 965 shows an obscene advantage over the QX9770 and E8400 in those benchies).

EDIT - I know that kind of setup is actually built by <0.05% of people out there - think of this as theoretical or conceptual interest in the topic, and I'm more interested in CPU throughput/efficiency than the GPU muscle itself. You simply don't see the true power of the i7 architecture when it's only pushing a single GTX so you have to look where the performance comes into play and adjust from there.

So - OC the uncore of that 965 and see how performance scales. And I want to see if the same holds true for PhII with an overclocked uncore...
 

Rhoxed

Golden Member
Jun 23, 2007
1,051
3
81
Originally posted by: SlowSpyder
Originally posted by: Rhoxed
Originally posted by: DrMrLordX
That's some serious voltage on a Phenom II. What are you using to cool it?

xigmatek with 2 110cfm fans

I was thinking about going above 1.5, but figured with summer coming I better keep it a bit lower. The room my PC is in gets very warm in summer. So I'm not sure that 1.55 volts and an 83F room would be a good idea. ;)

i keep ambient about 22C (72F) and i live in florida, so air conditioner cranking always.
 

Martimus

Diamond Member
Apr 24, 2007
4,490
157
106
Originally posted by: Idontcare
edit: [shameless self-interest plug] If anyone here has a yorkfield quad (preferably a QX) or an i7 (preferably a 965) and would like to run this benchmark at 3-4 different clockspeeds on your rig so I can add data to my graph I sure would appreciate it, pm me if interested [/shameless self-interest plug]

Will you be making these results available to the rest of us? (I would like to see that graph when it is completed.)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Martimus
Originally posted by: Idontcare
edit: [shameless self-interest plug] If anyone here has a yorkfield quad (preferably a QX) or an i7 (preferably a 965) and would like to run this benchmark at 3-4 different clockspeeds on your rig so I can add data to my graph I sure would appreciate it, pm me if interested [/shameless self-interest plug]

Will you be making these results available to the rest of us? (I would like to see that graph when it is completed.)

Absolutely, provided the contributors don't mind. Here is what I have accumulated so far:

http://i272.photobucket.com/al...arisonwithPhenom-1.gif

Note that both SlowSpyder's and Peter Trend's PhenomII results are plotted separately (same CPU, different rigs though) but the best fit lines for both sets of PhenomII scaling results actually overlap so well that you can't see the purple best fit line, which is underneath the red best fit line.

Yes this graph shows PhenomII has lower IPC than Phenom for this particular application, which in turn has lower IPC than Kentsfield. Slowspyder and I are at a loss to explain why this is the case but our current working theory is that the application loves L2$.

To determine whether this hypothesis is true we are hoping to add some Yorkfield results (increase in L2$ should result in higher IPC) and some i7 results (smaller L2$ should result in lower IPC).

Another interesting observation, as Slowspyder led off with in the OP, is that the uncore clockspeed of his PhII needed to be 2.07GHz to match the results seen with Peter Trend's 1.8GHz uncore...the primary delta in their systems (from what I could tell) is that SlowSpyder is using Vista64 whereas Peter Trend is using regular 32bit Vista.

At any rate this is very much a real application, Metatrader is a forex platform which allows one to optimize their trade algorithms by way of backtesting historical data and then filtering (rank-sorting) the results as a means of extracting the set of parameters which delivered superior results on historical market data. (the usual caveat applies - past results are not indicative of future performance, etc etc)

So for me when I look to make computer purchases I am looking at the price/performance for a rig that delivers the highest number of "passes per minute" per dollar. More passes per minute means less time I have to wait to get my greedy hands on the next set of optimized parameters, with which I hope to increase my quantitative trading profitability.
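
As a rough illustration of that metric, here is a small Python sketch that ranks rigs by passes per minute per dollar. The rig names, pass counts, run times, and prices are made-up placeholders, not benchmark results from this thread.

```python
def passes_per_minute_per_dollar(passes: int, elapsed_s: float, price_usd: float) -> float:
    """Throughput (passes per minute) divided by what the rig cost."""
    passes_per_minute = passes / (elapsed_s / 60.0)
    return passes_per_minute / price_usd


# Hypothetical rigs: (passes completed, elapsed seconds, price in dollars).
rigs = {
    "rig_a": (600, 300.0, 200.0),
    "rig_b": (600, 240.0, 400.0),
}

# Best value first: the faster rig only wins if its extra speed outpaces its extra cost.
ranked = sorted(
    rigs,
    key=lambda name: passes_per_minute_per_dollar(*rigs[name]),
    reverse=True,
)

for name in ranked:
    print(name, round(passes_per_minute_per_dollar(*rigs[name]), 3))
```

With these placeholder numbers the slower, cheaper rig comes out ahead, which is the kind of result that makes replacing an existing fleet hard to justify.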

At any rate I'd venture to guess there are at least tens of thousands of users of this program globally, maybe hundreds of thousands. It matters to us to get results sooner instead of later and for some of us it also matters to get those results in a cost-efficient manner. I've got an aging fleet of Q6600's OC'ed to 3.3GHz that I'd like to be able to justify replacing, but so far I have not been able to generate a compelling argument to do that.

(my apologies Steve in advance for adding all this background content to your thread, please folks pm me with any further questions you may have so we avoid derailing Slowspyder's thread on the merits of OC'ing your uncore to maintain some "balance" with your core's need for L3$ bandwidth)
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
No need IDC, I'm glad to get the back story on this. It's good to know what it did and why there's no reason to replace what you have, at least with what you've had tested so far.

 

cusideabelincoln

Diamond Member
Aug 3, 2008
3,275
46
91
You should post the results of your core at stock and your uncore as high as it will go, then we can compare those results to your various core-only overclocks.

And since you guys are hypothesizing the application loves L2 cache, it makes perfect sense that we're seeing the results of the uncore/L3 overclocking.
 

TC91

Golden Member
Jul 9, 2007
1,164
0
0
would ocing the fsb with a lower multiplier have a similar but much less significant effect on a core 2 system?