What is the benefit of a higher FSB with C2D?

mshan

Diamond Member
Nov 16, 2004
7,868
0
71
From what I've read, C2D chips (in either the 65 nm or 45 nm series) differ only in clock speed, cache, and FSB.

What is the benefit of a higher FSB supposed to be?

Also, I just bought an Asus P5W DH Deluxe, and it says it supports an 800 or 1066 FSB, but the memory choices are DDR2-667 and DDR2-800.

I take it 1066 is supposed to be matched with 800, but I thought these Intel chips worked best when the FSB and memory speed were in sync?
 

mshan

Diamond Member
Nov 16, 2004
7,868
0
71
Thanks for the info!

Looks like my mobo (Asus P5W DH Deluxe) was optimally designed for a 1066 FSB CPU and DDR2-800.

But I always thought these Intel CPUs worked best when in "sync" (CPU and memory bus at the same frequency).

Is this true, and why did Asus design this mobo this way?
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
Originally posted by: mshan
But I always thought these Intel CPUs worked best when in "sync" (CPU and memory bus at the same frequency).

Is this true, and why did Asus design this mobo this way?

The Core 2s run slightly faster with faster-clocked memory:

http://www.xbitlabs.com/articl...uo-memory-guide_5.html

DDR2-533 is the memory speed that runs 1:1 with 1066 MHz FSB Core 2s, but as seen in the review above, DDR2-800 offers a small performance benefit in applications.
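The divider arithmetic behind that 1:1 claim can be sketched as follows (a rough back-of-the-envelope illustration; the function names are mine, not from any review, and nominal marketing rates are used):

```python
# Rough divider arithmetic for Intel FSB vs. DDR2 speeds.
# Nominal marketing rates are used; real base clocks are 266.67 MHz etc.,
# so the off-1:1 ratios come out slightly approximate.

def fsb_base_clock(fsb_effective_mhz: float) -> float:
    """Intel's FSB is quad-pumped: 4 transfers per base clock."""
    return fsb_effective_mhz / 4

def ddr2_base_clock(ddr2_effective_mhz: float) -> float:
    """DDR2 is double-pumped: 2 transfers per base clock."""
    return ddr2_effective_mhz / 2

def fsb_dram_ratio(fsb_effective_mhz: float, ddr2_effective_mhz: float) -> float:
    """DRAM-to-FSB base-clock ratio (1.0 means running 'in sync')."""
    return ddr2_base_clock(ddr2_effective_mhz) / fsb_base_clock(fsb_effective_mhz)

print(fsb_dram_ratio(1066, 533))   # 1.0 -> DDR2-533 is 1:1 with a 1066 FSB
print(fsb_dram_ratio(1066, 800))   # ~1.5 -> DDR2-800 runs the memory bus faster
print(fsb_dram_ratio(1333, 1066))  # ~1.6 -> DDR2-1066 on a 1333 FSB
```

So pairing a 1066 FSB CPU with DDR2-800, as the P5W DH Deluxe encourages, runs the memory bus above 1:1 rather than in sync.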
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
FSB speed doesn't translate to performance most of the time (since most benchmarks are aimed squarely at a core's ability to number-crunch). It's hard to quantify its gains, since increasing the FSB simply speeds up the communication between the CPU and the northbridge. Theoretically this means faster access to the memory controller, i.e. lower latency to memory thanks to the higher FSB speed. From this, the system should feel a little more responsive, but that's not quantifiable, as I mentioned earlier. (I guess going from 1066 MHz to 1333 MHz is pretty negligible in that respect as well.)

However, it's really different for Core 2 Quads. They pack a lot of number-crunching prowess (seeing as each is two Core 2 Duo dies in an MCM), but the major problem for these quads is that the two dies communicate via the FSB. That's a definite drawback because of the latency added by the communication between die 1 and die 2. I suspect this could probably be felt during heavy multitasking (say you're playing Crysis and have Wallmaster in the background changing wallpapers at random; every time the wallpaper changes, it could cause a stutter in Crysis). So a higher FSB is a must for quads, but not so much for duals.

And take a look at the Core 2 Duo/Quad overclocking guide. It explains in depth how memory speed works (i.e. how DDR2-1066-rated memory would work with a 1333 MHz FSB).
 

maximal

Junior Member
Nov 12, 2005
7
0
0
My understanding is that the Core architecture was always FSB-starved; Intel realized this and built the C/C2 architectures around large caches and efficient prefetchers to compensate for the FSB bottleneck. AMD, on the other hand, equally recognized the FSB as a major bottleneck to CPU performance and decided to eliminate it altogether. Arguably AMD was the wiser, but it failed to capitalize on this with its overall CPU architecture. Now Intel has seen the light and is abolishing the FSB in its next architecture (Nehalem), which already seems to be breaking new ground in terms of performance (based on early previews).
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Originally posted by: maximal
My understanding is that the Core architecture was always FSB-starved; Intel realized this and built the C/C2 architectures around large caches and efficient prefetchers to compensate for the FSB bottleneck. AMD, on the other hand, equally recognized the FSB as a major bottleneck to CPU performance and decided to eliminate it altogether. Arguably AMD was the wiser, but it failed to capitalize on this with its overall CPU architecture. Now Intel has seen the light and is abolishing the FSB in its next architecture (Nehalem), which already seems to be breaking new ground in terms of performance (based on early previews).
It's true that the high latency penalty of the Core architecture is its Achilles' heel and would have been its downfall if not for the large caches and smart prefetchers.

But increasing the FSB doesn't seem to translate to more performance in any benchmark I've seen... e.g. the E4300 (800 FSB) at 1.8 GHz performed exactly like the E6300 (1066 FSB) would have at 1.8 GHz. I'm not really sure why, come to think of it.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: Foxery
FSB increases do not have a significant impact on Core 2 performance. Intel mostly does this to keep multipliers low, and to push an upgrade path for motherboard and RAM makers.

Nailed it!

The desktop segment hasn't really needed an FSB boost since we hit 800 MHz (4 × 200 MHz).

When was the last time a laptop cycle was based on increasing the FSB? You rarely upgrade a laptop, unlike desktops where the CPU can be readily upgraded, so there is less "pressure" to ensure a laptop refresh brings the mobo and RAM makers along for the ride.

Originally posted by: magreen
It's true that the high latency penalty of the Core architecture is its Achilles' heel and would have been its downfall if not for the large caches and smart prefetchers.

But increasing the FSB doesn't seem to translate to more performance in any benchmark I've seen... e.g. the E4300 (800 FSB) at 1.8 GHz performed exactly like the E6300 (1066 FSB) would have at 1.8 GHz. I'm not really sure why, come to think of it.

Servers and blades are a different story from the desktop market: you can have 2-4 sockets, and memory contention becomes a real issue for performance scaling as you add more and more cores onto the back of the same FSB, even with those advanced prefetchers.

This is why, for instance, Skulltrail (2 sockets) operates with a dual-FSB architecture.

Originally posted by: maximal
My understanding is that the Core architecture was always FSB-starved; Intel realized this and built the C/C2 architectures around large caches and efficient prefetchers to compensate for the FSB bottleneck. AMD, on the other hand, equally recognized the FSB as a major bottleneck to CPU performance and decided to eliminate it altogether. Arguably AMD was the wiser, but it failed to capitalize on this with its overall CPU architecture. Now Intel has seen the light and is abolishing the FSB in its next architecture (Nehalem), which already seems to be breaking new ground in terms of performance (based on early previews).

AMD most certainly capitalized on it. The K7 core architecture was strong, but the IMC strategy made the K8 a product with no equal for many, many business quarters and helped propel AMD into the lucrative multi-socket server markets where those Opteron 8xxx parts sold for $2k+.

Where AMD lost ground was that while they were busy tweaking the IMC end of the business (AM2, DDR -> DDR2, etc.), they were not aggressive enough in improving the core side of things. This is where Intel caught up and surpassed AMD when it transitioned to Core 2.

To say "Intel saw the light" implies the company's decision makers were oblivious to the option of integrating the memory controller all along. This is obviously not the case: for Intel and AMD it was always a "known" option for improving performance, but it has drawbacks in terms of cost (die size), and you can rest assured some very smart people were making some very intelligent decisions (at both companies) as to when going with an IMC would provide the greatest returns to shareholder value.
 

mshan

Diamond Member
Nov 16, 2004
7,868
0
71
For a dual-core C2D CPU, is 2 MB of L2 cache the best balance point in terms of performance, heat production, and price?
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Originally posted by: Idontcare
Originally posted by: magreen
It's true that the high latency penalty of the Core architecture is its Achilles' heel and would have been its downfall if not for the large caches and smart prefetchers.

But increasing the FSB doesn't seem to translate to more performance in any benchmark I've seen... e.g. the E4300 (800 FSB) at 1.8 GHz performed exactly like the E6300 (1066 FSB) would have at 1.8 GHz. I'm not really sure why, come to think of it.

Servers and blades are a different story from the desktop market: you can have 2-4 sockets, and memory contention becomes a real issue for performance scaling as you add more and more cores onto the back of the same FSB, even with those advanced prefetchers.

This is why, for instance, Skulltrail (2 sockets) operates with a dual-FSB architecture.

I'm glad you responded, IDC; you're the person I wanted to ask this. I knew the part about why Opterons, with their IMC and HTT, beat Xeons in the 2P and especially 4P space. But there's something I don't understand in the desktop space.

From Anand's original review of Core back in June '06, we saw that increasing memory bandwidth by running memory at DDR2-667 or DDR2-800 did almost nothing to increase Core performance (whereas on the X2 it makes a large performance difference). His explanation was that the high latency of the Core architecture's non-integrated memory controller masked any increase in memory bandwidth, whereas the X2 systems with their IMC didn't have that high latency, so increasing memory speed led to a direct increase in overall system performance. He showed that memory write bandwidth doesn't change at all with increased memory speed on a Core system, and explained that it's only Core's really smart memory prefetchers that keep the system afloat.

But according to all this, wouldn't increasing the FSB lower that high latency in an Intel system? So wouldn't a higher FSB allow the effect of running the RAM faster on an Intel system to be noticeable as a speedup in system performance?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: magreen
From Anand's original review of Core back in June '06, we saw that increasing memory bandwidth by running memory at DDR2-667 or DDR2-800 did almost nothing to increase Core performance (whereas on the X2 it makes a large performance difference). His explanation was that the high latency of the Core architecture's non-integrated memory controller masked any increase in memory bandwidth, whereas the X2 systems with their IMC didn't have that high latency, so increasing memory speed led to a direct increase in overall system performance. He showed that memory write bandwidth doesn't change at all with increased memory speed on a Core system, and explained that it's only Core's really smart memory prefetchers that keep the system afloat.

But according to all this, wouldn't increasing the FSB lower that high latency in an Intel system? So wouldn't a higher FSB allow the effect of running the RAM faster on an Intel system to be noticeable as a speedup in system performance?

The simple answer is that the "cheap" way to increase bandwidth on the memory side of the equation was to increase the latency of the memory modules and of the way the chipset accesses those modules.

This Anandtech article by Kris Boughton really shed the light on the specifics of this issue: ASUS ROG Rampage Formula: Why we were wrong about the Intel X48

Had the Core architecture not included its "aggressive" prefetchers, I fully expect we'd observe a marked correlation between FSB (memory speed) and application performance.

The reason Core's performance does not depend heavily on FSB is that the applications we use on the desktop are limited more by reading data into the CPU's L2/L1 than by writing data out to RAM and the hard drive. Since we are "read-speed" limited, this is where massive L2 cache sizes and prefetchers "hide" the read latency of accessing RAM on the other side of the MCH. Prefetchers can't preemptively "write" to RAM, though, so you can't hide the write latency... but our desktop applications are not (typically) "write-speed" limited, so we just don't experience this as an issue.

You can rest assured that were AMD's Opterons and K8s equipped with prefetchers and L2 cache sizes as aggressive as the C2D's, you would be equally hard-pressed to see any improvement in performance as bandwidth increases. But that would increase die size and production cost.
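The latency-hiding argument above can be put in numbers with a toy model (my own illustration with made-up latency figures, not data from this thread): when the caches and prefetchers satisfy almost every read, even a large cut in memory latency barely moves the average.

```python
# Toy model: average read latency when most loads hit cache
# (including lines the prefetchers pulled in ahead of time).
# The latency figures below are illustrative, not measured.

def avg_read_latency_ns(hit_rate: float, cache_ns: float = 3.0,
                        mem_ns: float = 90.0) -> float:
    """Expected load latency given the fraction of reads served from cache."""
    return hit_rate * cache_ns + (1 - hit_rate) * mem_ns

# With a 99% effective hit rate, cutting memory latency by 20%
# (say, via a faster FSB) changes the average by only ~5%:
base = avg_read_latency_ns(0.99, mem_ns=90.0)
fast = avg_read_latency_ns(0.99, mem_ns=72.0)
print(round(base, 2), round(fast, 2))  # 3.87 3.69
```

This is the same reason a less cache-heavy design (like K8 without such prefetchers) shows a much stronger response to memory speed: its effective hit rate is lower, so the `mem_ns` term dominates.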
 

Denithor

Diamond Member
Apr 11, 2004
6,298
23
81
Originally posted by: mshan
For a dual-core C2D CPU, is 2 MB of L2 cache the best balance point in terms of performance, heat production, and price?

Among the 65 nm chips, the 2 MB parts perform slightly below their bigger brothers in the lineup (L2 cache: 4MB or 2MB?).

But you're better off with a 45 nm chip (Wolfdale vs. Conroe Performance).

Among the 45 nm chips, if you're into gaming, go for the 6 MB parts; otherwise the 3 MB parts will be perfectly acceptable (Influence of L2 Cache Size on Wolfdale Performance).

Basically, you're comparing the E7200 (2.53 GHz, 3 MB cache) at $130 to the E8400 (3.0 GHz, 6 MB cache) at $190. Pick whichever suits your needs better.
 

maximal

Junior Member
Nov 12, 2005
7
0
0
Originally posted by: Idontcare

AMD most certainly capitalized on it. The K7 core architecture was strong, but the IMC strategy made the K8 a product with no equal for many, many business quarters and helped propel AMD into the lucrative multi-socket server markets where those Opteron 8xxx parts sold for $2k+.

Where AMD lost ground was that while they were busy tweaking the IMC end of the business (AM2, DDR -> DDR2, etc.), they were not aggressive enough in improving the core side of things. This is where Intel caught up and surpassed AMD when it transitioned to Core 2.

To say "Intel saw the light" implies the company's decision makers were oblivious to the option of integrating the memory controller all along. This is obviously not the case: for Intel and AMD it was always a "known" option for improving performance, but it has drawbacks in terms of cost (die size), and you can rest assured some very smart people were making some very intelligent decisions (at both companies) as to when going with an IMC would provide the greatest returns to shareholder value.

What I meant was: Intel finally realized that it had squeezed as much performance as it could out of FSB designs with its smart cache/prefetcher architecture, and decided to tackle the IMC engineering challenge, something AMD had a couple of years' head start on. Intel was never oblivious to the IMC as a performance-boosting design; it simply realized (correctly) that it would not yield a significant enough performance improvement with its current C/C2D/C2Q designs. Intel's own Pat G, in a recent interview, stated that an IMC-only approach was not able to keep current multicore designs loaded to their satisfaction; an IMC design would only yield a performance leap when coupled with an overall architecture overhaul. Nehalem seems to fit that philosophy perfectly.

From a business perspective, keeping the FSB alive as long as possible also makes a lot of sense on Intel's part, since it has an old and large chipset business that AMD lacked, so AMD could abolish the FSB faster and sooner with minimal financial repercussions. In the desktop space Intel should be congratulated; they were able to milk the FSB for all it was worth. What they underestimated was how competitive AMD's IMC-based K8 design ended up being in the multi-socket server space, and that cost them lucrative server market share. It can also be argued that AMD's IMC switch was primarily motivated by its weak competitive position in the server space; back then its Athlon MP architecture couldn't hold a candle to Intel's Xeons. So both companies have gained some and lost some as a consequence of their strategic decisions, a perfectly natural competitive-market response; after all, there is only one pie. I absolutely agree that the people behind these decisions know what they are doing.

 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
What about the effect of FSB on Core 2 Quads? I've always thought the quads never really "doubled" the L2 cache, since they were two separate dies, i.e. the total pool of L2 cache couldn't be shared. Is it safe to assume a Core 2 Quad is equivalent to dual-socket Core 2 Duos on a single FSB (or is there more to it than that)?