NB overclocking on K10 chips w/out L3 cache

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Formulav8 approached me about putting together some benchmark numbers to see what effects NB overclocking would have on a Propus chip (he is testing an L3-less dual-core K10 . . . Rana I think?). Here are the results at four different NB speeds:

3705.3 mhz, 285 mhz HTT, 2850 HT Link
DDR3-1520 6-7-5-15 1T
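The clocks listed here all look like the 285 MHz reference clock (HTT) times a per-domain multiplier. Here is a minimal Python sketch of that arithmetic. The multipliers (13x core, 10x/9x/8x/7x NB, and a 16:3 memory ratio) are assumptions inferred from the reported speeds, not settings confirmed anywhere in this post.

```python
# Reference clock (HTT) in MHz, taken from the settings line above.
HTT_MHZ = 285.0

# Hypothetical multipliers, inferred from the reported clocks (not from the post).
CPU_MULT = 13
NB_MULTS = [10, 9, 8, 7]
MEM_RATIO = 16 / 3  # DDR3 data rate per MHz of reference clock


def derived_clock(ref_mhz, multiplier):
    """Return the clock of a domain driven by the reference clock."""
    return ref_mhz * multiplier


core = derived_clock(HTT_MHZ, CPU_MULT)
nb_speeds = [derived_clock(HTT_MHZ, m) for m in NB_MULTS]
ddr_rate = derived_clock(HTT_MHZ, MEM_RATIO)

print(f"core: {core:.1f} MHz")
print("NB steps:", ", ".join(f"{s:.1f}" for s in nb_speeds))
print(f"DDR3 data rate: {ddr_rate:.0f}")
```

The reported values (3705.3, 2850.2 and so on) are a fraction of a MHz higher, which would fit a real reference clock of roughly 285.02 MHz.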

NB: 2850.2 mhz

Everest Read: 10597 Write: 5325 Latency: 37.0 ns

SuperPi mod 1.5 XS 1M: 20.951s B43F7FA9 32M: 18m 25.449s 7BF80A1

Cinebench R10 1 CPU: 4014 x CPU: 14259 Multiprocessor Speedup: 3.55x

Fritz Chess Relative Speed: 18.70 Kilo nodes per second: 8976

Aquamark3 (64-bit patch) CPU: 16,661

ScienceMark 2.0 Overall: 2416.89

Cinebench R11.5 CPU: 4.24

7-zip 32MB Dictionary Size, Passes 10 Total CPU Usage: 369% Total Rating/Usage: 3658 MIPS Total Rating: 13521 MIPS

TrackMania Nations Forever 228 fps (1024x768, all settings none/lowest/fastest except anisotropic 2x, 8800GTX)

Blender: 8.57

PiFast43 32m (standard settings, largest FFT): 84.40 seconds

NB: 2565.1 mhz

Everest Read: 10088 Write: 5029 Latency: 38.6 ns

SuperPi mod 1.5 XS 1M: 21.044s C823DABA 32M: 18m 34.232s 89C45805

Cinebench R10 1 CPU: 3948 x CPU: 14198 Multiprocessor Speedup: 3.60x

Fritz Chess Relative Speed: 18.46 Kilo nodes per second: 8859

Aquamark3 (64-bit patch) CPU: 16,337

Sciencemark 2.0 Overall: 2380.94

Cinebench R11.5 CPU: 4.18

7-zip 32MB Dictionary Size, Passes 10 Total CPU Usage: 368% Total Rating/Usage: 3583 MIPS Total Rating: 13218 MIPS

TrackMania Nations Forever 225 fps (1024x768, all settings none/lowest/fastest except anisotropic 2x, 8800GTX)

Blender: 8.64

PiFast43 32m (standard settings, largest FFT): 85.02 seconds

NB: 2280.1 mhz

Everest Read: 9797 Write: 4791 Latency: 39.1 ns

SuperPi mod 1.5 XS 1M: 21.216s 552086B3 32M: 18m 37.882s 62D890E1

Cinebench R10 1 CPU: 3959 x CPU: 14072 Multiprocessor Speedup: 3.55x

Fritz Chess Relative Speed: 18.40 Kilo nodes per second: 8833

Aquamark3 (64-bit patch) CPU: 16,065

Sciencemark 2.0 Overall: 2350.44

Cinebench R11.5 CPU: 4.16

7-zip 32MB Dictionary Size, Passes 10 Total CPU Usage: 371% Total Rating/Usage: 3576 MIPS Total Rating: 13296 MIPS

TrackMania Nations Forever 223 fps (1024x768, all settings none/lowest/fastest except anisotropic 2x, 8800GTX)

Blender: 8.66

PiFast43 32m (standard settings, largest FFT): 85.77 seconds

NB: 1995.1 mhz

Everest Read: 9093 Write: 4617 Latency: 41.2 ns

SuperPi mod 1.5 XS 1M: 21.388s FEE1B54B 32M: 18m 52.921s F5A8B5E1

Cinebench R10 1 CPU: 3973 x CPU: 13993 Multiprocessor Speedup: 3.52x

Fritz Chess Relative Speed: 18.30 Kilo nodes per second: 8785

Aquamark3 (64-bit patch) CPU: 15,688

Sciencemark 2.0 Overall: 2320.70

Cinebench R11.5 CPU: 4.12

7-zip 32MB Dictionary Size, Passes 10 Total CPU Usage: 373% Total Rating/Usage: 3502 MIPS Total Rating: 13123 MIPS

TrackMania Nations Forever 217 fps (1024x768, all settings none/lowest/fastest except anisotropic 2x, 8800GTX)

Blender: 8.68

PiFast43 32m (standard settings, largest FFT): 86.36 seconds
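To put the raw numbers in perspective, here is a minimal Python sketch that compares the lowest NB speed (1995.1 MHz) against the highest (2850.2 MHz) for a handful of the results above. The `RESULTS` table and the helper names are made up for illustration; the values are copied straight from this post, and timed benchmarks are treated as "lower is better".

```python
# name: (value at NB 1995.1 MHz, value at NB 2850.2 MHz, higher_is_better)
RESULTS = {
    "Everest read": (9093, 10597, True),
    "Everest write": (4617, 5325, True),
    "Everest latency (ns)": (41.2, 37.0, False),
    "SuperPi 1M (s)": (21.388, 20.951, False),
    "Cinebench R10 xCPU": (13993, 14259, True),
    "Fritz Chess knps": (8785, 8976, True),
    "Aquamark3 CPU": (15688, 16661, True),
    "7-zip total MIPS": (13123, 13521, True),
    "TrackMania fps (low settings)": (217, 228, True),
    "PiFast43 32m (s)": (86.36, 84.40, False),
}


def pct_change(low_nb, high_nb):
    """Percent change going from the low-NB value to the high-NB value."""
    return (high_nb - low_nb) / low_nb * 100


def improvement(low_nb, high_nb, higher_is_better):
    """Percent improvement, with the sign flipped for lower-is-better results."""
    change = pct_change(low_nb, high_nb)
    return change if higher_is_better else -change


NB_GAIN = pct_change(1995.1, 2850.2)  # NB clock increase in percent

print(f"NB clock: +{NB_GAIN:.1f}%")
for name, row in RESULTS.items():
    print(f"{name}: {improvement(*row):+.1f}%")
```

Memory bandwidth and latency move the most, while the compute-bound benchmarks gain only a few percent for a roughly 43% NB clock increase.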

Here are some newer numbers to show the effects of NB clocks on applications that tax the video card (somewhat, more on this below):

NB: 2850

CineBench R10 OpenGL: 4486 CB-GFX

CineBench R11.5 OpenGL: 35.29 fps

Aquamark3 (64-bit patch) GFX: 19,515 Overall: 123,030

Trackmania Nations Forever 60.5 fps (1280x1024, all settings high, 8x antialiasing, anisotropic 16x, 8800GTX)

3DMark Vantage CPU Score: 33736 Graphics: 7274 Overall: P9049

NB: 2565

CineBench R10 OpenGL: 4182 CB-GFX

CineBench R11.5 OpenGL: 34.97 fps

Aquamark3 (64-bit patch) GFX: 19,185 Overall: 120,761

Trackmania Nations Forever 60.1 fps (1280x1024, all settings high, 8x antialiasing, anisotropic 16x, 8800GTX)

3DMark Vantage CPU Score: 34047 Graphics: 7270 Overall: P9050

NB: 2280

CineBench R10 OpenGL: 3975 CB-GFX

CineBench R11.5 OpenGL: 34.45 fps

Aquamark3 (64-bit patch) GFX: 18,822 Overall: 117,997

Trackmania Nations Forever 59.8 fps (1280x1024, all settings high, 8x antialiasing, anisotropic 16x, 8800GTX)

3DMark Vantage CPU Score: 33985 Graphics: 7267 Overall: P9044

NB: 1995

CineBench R10 OpenGL: 3951 CB-GFX

CineBench R11.5 OpenGL: 33.58 fps

Aquamark3 (64-bit patch) GFX: 18,672 Overall: 116,992

Trackmania Nations Forever 59.3 fps (1280x1024, all settings high, 8x antialiasing, anisotropic 16x, 8800GTX)

3DMark Vantage CPU Score: 33537 Graphics: 7291 Overall: P9064

I was going to create graphs and such, but found it somewhat tedious compared to just cutting and pasting the data in a post. I'm so lazy.

edit: formulav8 made some graphs using my data. Not 100% complete but I'm not complaining.

http://www.vbcodesource.info/graphs/drmrlordx/

(did not embed jpgs with img tags to avoid slow loads).

Thanks for the graphs, formulav8!

edit edit: I have added the above extra benchmark results to show what happens when you raise NB speed when running applications that tax the video card. Unfortunately, my monitor will not support resolutions higher than 1280x1024, so that's the best I can do; however, that was only really a factor in the Trackmania Nations Forever benching.

It seems that the benchmark showing the greatest benefit from NB speed is Cinebench R10 OpenGL. R11.5, not so much.
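One rough way to compare how sensitive each graphics test is to NB speed is to divide its percentage gain by the percentage increase in NB clock (1995 to 2850 MHz). The Python sketch below does that with the numbers from this post. The "scaling factor" is just an illustrative metric made up here, not anything the benchmarks report.

```python
NB_LOW_MHZ, NB_HIGH_MHZ = 1995, 2850

# name: (score at NB 1995, score at NB 2850), all higher-is-better
GFX_RESULTS = {
    "Cinebench R10 OpenGL": (3951, 4486),
    "Cinebench R11.5 OpenGL": (33.58, 35.29),
    "Aquamark3 GFX": (18672, 19515),
    "Trackmania Nations Forever": (59.3, 60.5),
}


def pct_gain(low, high):
    """Percent gain going from the low value to the high value."""
    return (high - low) / low * 100


nb_gain = pct_gain(NB_LOW_MHZ, NB_HIGH_MHZ)

# Fraction of the NB clock increase that shows up in the score
# (1.0 would mean perfect scaling with NB speed).
scaling = {
    name: pct_gain(low, high) / nb_gain
    for name, (low, high) in GFX_RESULTS.items()
}

for name, factor in sorted(scaling.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {factor:.2f}")
```

By this measure Cinebench R10 OpenGL picks up about a third of the NB clock increase, while the other three tests stay well below that.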

edit edit edit: Added 3DMark Vantage scores. Highly inconsistent. I couldn't generate enough trial keys to do enough runs to normalize data, but it seems like there's a lot of noise involved.
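A quick sanity check on the Vantage numbers, as a Python sketch: it looks at how far the scores spread across the four NB speeds and whether they move in one direction at all. With a single run per setting this cannot separate noise from a real effect, so treat it as an illustration only. The helper names are hypothetical.

```python
from statistics import mean

# Scores listed in order of NB speed: 2850, 2565, 2280, 1995 MHz
VANTAGE = {
    "CPU": [33736, 34047, 33985, 33537],
    "Graphics": [7274, 7270, 7267, 7291],
    "Overall": [9049, 9050, 9044, 9064],
}


def spread_pct(scores):
    """Range of the scores as a percentage of their mean."""
    return (max(scores) - min(scores)) / mean(scores) * 100


def is_monotonic(scores):
    """True if the scores only rise or only fall across the NB steps."""
    pairs = list(zip(scores, scores[1:]))
    return all(a >= b for a, b in pairs) or all(a <= b for a, b in pairs)


for name, scores in VANTAGE.items():
    print(f"{name}: spread {spread_pct(scores):.2f}%, "
          f"monotonic: {is_monotonic(scores)}")
```

None of the three score series moves consistently with NB speed, and the overall score varies by well under 1%, which fits the impression that run-to-run variation is swamping whatever NB effect exists.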

Anyway, as you can see, there are gains to be had, though it should be noted that they are not amazing; still, I did not expect NB speed to play much of a role in Propus' performance (and I certainly did not expect memory read and write numbers to be affected by it). Also, my Cinebench R10 single CPU results are a bit odd in that they seemed to go up as NB speed decreased after a certain point. How odd. The highest NB speed still gave me the best single CPU and multi CPU scores so that was as expected.

Blender scores were generated using an optimized 32-bit build of version 2.5 alpha 2 while rendering a test file known as test.blend: http://www.eofw.org/bench/test.blend?PHPSESSID=819e1c2b501ae28ff68c17f67b1a8a69

Another bizarre finding was that Cinebench R11.5 had to be run at a LOWER vcore than other benchmarks because the heat from high vcore settings made it unstable at speeds of 3.7 ghz or higher. I am planning to improve my cooling on this chip to see what positive results can be had from dropping a few more degrees @ load.

Also, this should demonstrate, without a doubt, that Propus just can't hang with Deneb at the same clock speed. With a better IMC, maybe it could, but as it stands, it is simply inferior. Propus MIGHT be competitive in benchmarks/applications with an enormous working set, but I haven't found anything to demonstrate that possibility (yet).
 

Hyperlite

Diamond Member
May 25, 2004
5,664
2
76
Interesting! good work! might have to go back and try and tweak my NB a little...2223mhz right now...but IIRC my board protests much above that.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Interesting! good work! might have to go back and try and tweak my NB a little...2223mhz right now...but IIRC my board protests much above that.

Not sure how lucky I got with this chip, but the NB is good for speeds up to around 2.85 ghz as a part of a consolidated overclock. Isolated it could hit 2.95 ghz. The real trick to getting things stable in that department was an NB voltage of 1.3-1.4v and a CPU-NB voltage of 1.35-1.44v. I understand that that's well beyond the "safe" 10% range so overvolt at your own risk. FYI 3.7 ghz core speed + 2.85 ghz NB only really required 1.3v/1.35v respectively . . . it was only when pushing core speeds to the bleeding edge (for this chip) and when running vcore at 1.52v and above that higher NB speeds were really necessary.

A lot of people said that the relationship between NB speed and memory speed would affect memory stability, but on this chip, it did not.

Thanks for the NB testing

No problem. Aside from SuperPi 32M, it didn't take that long to complete. I might add PiFast numbers later for the Zoners out there.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
522
126
It definitely shows there are improvements to be expected from overclocking the northbridge whether you have Level 3 cache or not. Too bad AMD can't get the memory controller to clock nearly as high as the core clock, as latency would get a huge boost. :)


Jason
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Yeah, I've been wanting higher clocks out of the NB/IMC as well. Kinda sad that we don't get NB speeds of 4 ghz, sometimes not even on ln2-cooled Denebs.

There's got to be a reason (if not multiple reasons) for NB speeds to be as low as they are on K10 variants, but at least the speeds seem to be creeping upwards a bit. Not sure what E0 stepping and Thuban will bring.
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Yeah, I've been wanting higher clocks out of the NB/IMC as well. Kinda sad that we don't get NB speeds of 4 ghz, sometimes not even on ln2-cooled Denebs.

There's got to be a reason (if not multiple reasons) for NB speeds to be as low as they are on K10 variants, but at least the speeds seem to be creeping upwards a bit. Not sure what E0 stepping and Thuban will bring.

TDP, those IMCs run warm so that performance increase is coming at a cost (minimal in our eyes, not so minimal in AMD's). There's no reason to raise the IMC because there's no need for more bandwidth at the moment. Short of a die shrink for the Phenom, I don't think we'll see much change.

If we're lucky AMD keeps the Phenom around and continues to better the design. Phenom III maybe? Certainly wouldn't be the first time they threw out a third refresh.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
TDP, those IMCs run warm so that performance increase is coming at a cost (minimal in your eyes, not so minimal in AMD's). There's no reason to raise the IMC because there's no need for more bandwidth at the moment. Short of a die shrink for the Phenom, I don't think we'll see much change.

Interesting that you should bring that up . . . temps barely budged no matter how much NB and CPU-NB voltage I fed into the chip. Raising NB clockspeed didn't seem to do much either. Things got a lot less stable at higher core speeds when I tried to push the limit on NB overclocking though . . . and let's face it, performance never really comes for free.


If we're lucky AMD keeps the Phenom around and continues to better the design. Phenom III maybe? Certainly wouldn't be the first time they threw out a third refresh.

You never know. Hexcores should stick around for awhile, at least, or so I'd think, and possibly quads too.
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Yeah I guess your eyes are the better judge on temps there, what with their thermal sensing capabilities.

Of course they'll be around, utter nonsense to think otherwise; or even comment on the speculation for that matter. Product cycles are funny though, they have this out with the old, in with the new thing going on.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
522
126
TDP, those IMCs run warm so that performance increase is coming at a cost (minimal in our eyes, not so minimal in AMD's). There's no reason to raise the IMC because there's no need for more bandwidth at the moment. Short of a die shrink for the Phenom, I don't think we'll see much change.


From my own testing of an unlocked Sempron II @ 3807mhz, going from a 1410mhz northbridge to a 2256mhz northbridge cost about 6 watts on average.
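As a back-of-the-envelope figure, that works out to well under a watt per 100 MHz of NB clock. Here is a minimal Python sketch of the arithmetic. It assumes power rises linearly with NB clock over that range, which is a simplification and was not something measured here. The function name is made up for illustration.

```python
def watts_per_100mhz(nb_low_mhz, nb_high_mhz, delta_watts):
    """Average added power per 100 MHz of NB clock, assuming a linear rise."""
    return delta_watts / (nb_high_mhz - nb_low_mhz) * 100


# 1410 MHz -> 2256 MHz northbridge, about 6 W measured difference
cost = watts_per_100mhz(1410, 2256, 6.0)
print(f"{cost:.2f} W per 100 MHz")
```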

Also, the benefit of a faster memory controller is latency, not simply raw bandwidth improvements. It's the decrease in latency that gives the performance improvements when overclocking the cacheless northbridge.


Jason
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Yeah I guess your eyes are the better judge on temps there, what with their thermal sensing capabilities.

Damn right.

Seriously, at high clockspeeds and high voltages, anything that's going to manifestly affect TDP @ stock is definitely going to affect TDP (and temps) while overclocked. Hell, take a look at formulav8's numbers . . . 6 watts?

If I can run up NB and CPU-NB voltages without temps budging, that ought to tell you something.

The only problem I've seen with high NB speeds on chips that are out today is that they don't all top out at the same speed and that my NB (at least) becomes a lot harder to stabilize as the heat output from the cores increases. However, 2.2-2.4 ghz NB seems doable on just about every K10 out there.

Of course they'll be around, utter non-sense to think otherwise; or even comment on the speculation for that matter. Product cycles are funny though, they have this out with the old, in with the new thing going on.

It is theoretically possible that AMD might go with X2s on the ultra-low end, x6s on the low-to-mid range, and Bulldozer/Zambezi in the mid-to-high end as of next year. With Thuban doing such a good job of staying within Deneb's power envelope, production of quads seems somehow less-relevant. If AMD can continue to find a ready niche for quads, then I'm sure they'll be around for awhile, but if quads get pushed into "hey grandma can afford that" territory, then AMD may as well just make X2s instead to improve their margins. Why sell an x4 for $50 when you can sell an X2 for $35-$40? I have a feeling that the 1035T/1055T will push quad prices into the cellar.
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Like I said, the bandwidth is not needed and it raises the overall TDP.

There are going to be six core parts that only work as quads so I don't see them going anywhere.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Like I said, the bandwidth is not needed and it raises the overall TDP.

Yeah, but the latency reduction is what really matters, and on chips with L3, it really really matters.

There are going to be six core parts that only work as quads so I don't see them going anywhere.

I'm sure they will at least offload "failed" Thubans as quads so long as they create them. The real question is: will there be E0 Deneb or Propus? Will they tape out an entirely new quad with Thuban's new features that is a "native" quad and not just a Thuban with two failed cores? Only time will tell.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,560
14,514
136
Yeah, but the latency reduction is what really matters, and on chips with L3, it really really matters.



I'm sure they will at least offload "failed" Thubans as quads so long as they create them. The real question is: will there be E0 Deneb or Propus? Will they tape out an entirely new quad with Thuban's new features that is a "native" quad and not just a Thuban with two failed cores? Only time will tell.

Why the extra expense ? Maybe even a 5 core, 4 core and even a 3 core out of a failed hex. Makes perfect sense. No scrap, sell it all !
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Why the extra expense ? Maybe even a 5 core, 4 core and even a 3 core out of a failed hex. Makes perfect sense. No scrap, sell it all !

I'm sure we'll see x4s and possibly x5s, heck we're already seeing the Thuban x4s. And you sort of made my point for me: why go to all that expense, indeed?

They may phase out "native" quads and just sell the scraps as x4s/x5s with X2s staying in the sub-$50/sub-65w TDP segment.

Unlocking a cheap Sempron to a hexacore is gonna be sweeeet.

Ha! If AMD does that, they're crazy.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
update: pifast43 numbers added.

Settings were all defaults (items listed under entry '0'), 32000000 decimal places, 4096k FFTs.
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Another gratuitous update: formulav8 informed me that there were significant gains to be had in vid-card intensive apps when increasing NB speed on an L3-less K10.5 chip. So, I ran a few quick benches using software I already had installed to see if he was right, and the results have been pasted in the original post, below the initial run of results.

Cinebench R10 OpenGL seemed to benefit the most from increased NB speed. Aquamark3 did okay, too. Cinebench 11.5 OpenGL and Trackmania Nations Forever showed less improvement.

I admit that it is not an impressive array of graphical benchmarks, but I don't have the software available just now to do anything more interesting than that. Perhaps I should install 3dmark Vantage next.
 

formulav8

Diamond Member
Sep 18, 2000
7,004
522
126
All your results are looking real good. Good to see there are gains just speeding up the memory controller without cache. Definitely thanks for going through the pain of benching and recording. I know from doing the dual core reviewing that it's a pain. You actually did many more NB speeds than I did as well. I wish I had the patience. :)


Jason
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
Just having fun playing with a chip that is fast becoming obsolete thanks to Thuban and Zosma. If they ever do an L3-less E0-stepping hexcore, however, this data may yet be of some use to someone (for a short time).

Added Vantage numbers. Weird stuff. It seems like there's a lot of statistical noise with that benchmark, and I didn't have enough trial keys to do more runs for normalization.

One thing I also don't quite get . . . why is Cinebench R10's OpenGL test so sensitive to NB speed, as opposed to R11.5's? One big thing I noticed is that the slowdowns on both OpenGL tests in the parts where fps dips to its lowest level seem mitigated somewhat by high NB speed.