Memory divider: does it make a difference to have memory faster than the CPU?

GundamF91

Golden Member
May 14, 2001
1,827
0
0
With my E4500 I can run 3 GHz at a 333 MHz FSB. My memory is DDR2-800, so it can do its default 400 MHz. I have an Abit IP35E board, so in order to take full advantage of the memory, I run the CPU:memory divider at 1:1.20.

I've always read that the divider of choice is 1:1. I think this dates from the time when the CPU bus was faster than memory, so if the CPU had to wait for memory, you ran async. Now my situation is reversed: my memory is faster than the CPU bus, so it's the memory waiting on the CPU. Since the CPU is the one with the processing power and the memory is just taking orders from it, does it really make a difference when the memory divider makes the memory run faster than the CPU? It seems all you end up with is memory sitting idle, waiting for the CPU.
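Just to show the arithmetic behind my numbers (a rough sketch; the values are specific to my setup):

```python
# CPU:memory divider arithmetic for my setup (sketch; numbers are my own case).
fsb = 333            # FSB clock in MHz (quad-pumped, so 1333 MT/s effective)
divider = 1.20       # CPU:memory divider of 1:1.20

mem_clock = fsb * divider   # memory clock in MHz, ~400 (399.6 exactly)
ddr_rate = 2 * mem_clock    # DDR2 transfers twice per clock -> ~DDR2-800

print(round(mem_clock, 1), round(ddr_rate, 1))  # prints: 399.6 799.2
```

So the 1:1.20 divider is what lets the sticks run at (roughly) their rated DDR2-800 instead of DDR2-667 at 1:1.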
 

SerpentRoyal

Banned
May 20, 2007
3,517
0
0
1:1 may yield a higher overclock. The rule is to use the 1:1 memory divider to lock in the highest stable core speed, then use a memory divider to overclock the RAM.

A 1:1.20 memory divider will increase system performance by a very small amount. You should be able to measure the improvement with 1M and 32M Super Pi.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
I never wanted to get into a p***ing contest on these issues, and I've had some productive and friendly exchanges with many here.

Some time back, maybe a couple of weeks ago, I posted a brief argument that a CPU-to-RAM ratio other than 1:1 was equivalent to putting "wait-states" in the mix. In other words, there are clock cycles that aren't carrying any data mixed in with those that are.

I agree with another colleague here that this may be splitting hairs over a minor loss in effective speed, and others have cited benchmarks that might show slight gains.

All I've discovered so far, however, is that running the tightest latencies that work at 800 MHz doesn't score higher benchies than even tighter latencies at speeds below 800 MHz. I still want to do a little "tuning" with that same test and run it again. I'll report back with anything significant, and I'm always looking to see what the resta y'all discover.
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Originally posted by: BonzaiDuck
I never wanted to get into a p***ing contest on these issues, and I've had some productive and friendly exchanges with many here.

Some time back, maybe a couple of weeks ago, I posted a brief argument that a CPU-to-RAM ratio other than 1:1 was equivalent to putting "wait-states" in the mix. In other words, there are clock cycles that aren't carrying any data mixed in with those that are.

I agree with another colleague here that this may be splitting hairs over a minor loss in effective speed, and others have cited benchmarks that might show slight gains.

All I've discovered so far, however, is that running the tightest latencies that work at 800 MHz doesn't score higher benchies than even tighter latencies at speeds below 800 MHz. I still want to do a little "tuning" with that same test and run it again. I'll report back with anything significant, and I'm always looking to see what the resta y'all discover.

In short, don't waste your time searching for the magic combination to get DDR2-1000+ from memory rated at DDR2-800, and also don't stress over not hitting CAS3 DDR2-667 with the same sticks.

On the whole, your games and other applications will not run any faster or slower.
 

magreen

Golden Member
Dec 27, 2006
1,309
1
81
Basically, with AM2 X2 memory speed makes a big difference due to the integrated memory controller, and on Core2 memory speed makes little difference due to the high inherent latencies in the FSB architecture.
 

GundamF91

Golden Member
May 14, 2001
1,827
0
0
I'm wondering about the "theoretical" side of the debate: when memory is running faster than the CPU, wouldn't the memory simply sit there waiting for CPU instructions, leaving you with wasted memory cycles? I don't see how running 1:1.20 could improve bandwidth if there's no CPU cycle there to feed or retrieve the data.
 

n7

Elite Member
Jan 4, 2004
21,281
4
81
In theory perhaps.

But in reality, the extra bandwidth helps.

All you have to do is look at DDR3, which is basically designed to run 1:2 (twice as fast as the FSB), to realize that bandwidth is obviously important.

I don't know the exact whys, but I know it helps.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
Let's refine the discussion a bit.

I think there are certain memory operations that exclude the CPU from the loop.

That would explain any performance gains you get from higher memory speed (in MHz) on a divider that differs from 1:1.

And it's true -- the benchmark tests are synthetic, but you have to have some basis for comparison.

SiSoft Sandra has a "processor efficiency" benchtest that is explained as the comparison to an optimal match of L2-cache speed and memory speed.

And of course, in all the choices of benchmark tests, as you increase CPU speed when over-clocking the FSB and memory, you also increase the speed of L1 and L2 cache.
 

Plekto

Junior Member
Dec 19, 2007
5
0
0
Also remember that the difference between fast memory - say, CAS2 DDR-400 versus CAS4 DDR2-800 - is very, very tiny. The memory is almost always waiting on the CPU or FSB, so while synthetic bandwidth tests may show some difference, in real life it would only matter in something like a database, a web server, running a compiler, or similar. The net effect is very, very minor for us normal users.

95% of the time your game is just going to be sitting there burning cycles like using newspaper to start a fire.

L2 cache is by far the more important issue. An E2220 (when it comes out) is basically an E4600 with 1MB of cache. The difference of 2MB over 1MB is noticeable - far more than messing with overpriced memory.
 

MadScientist

Platinum Member
Jul 15, 2001
2,182
62
91
Ok, Duck
I've turned to the 1:1 side for now. I've got 2 configs for my son to try 3D rendering benchmarks on that are Memtest86+ and Prime95 stable (8 hr; no time to test longer). Both at 3.404 GHz CPU speed and 1.2775 vcore, MCH at the 1.25V default:
1. RAM at 1:1, 756 MHz, 3-3-3-8-2T, 2.2V
2. RAM at 1:1.25, 946 MHz, 5-4-4-12-2T, 2.2V

I was hoping to get the VDIMM down, but my RAM doesn't like 2.1V at 3-3-3-8-2T even after bumping the vMCH up 2 notches.

I'll post back with the results. It might be awhile, my son's no ball of fire.

Q6600, 3.4 GHz @ 1.2775 vcore
Geforce 7600GT
4 x 1 GB Crucial Ballistix PC6400
Vista Ultimate 64
2 x Maxtor 300 GB SATA HDs
Samsung SH-S203B SATA DVD Burner
Corsair 620W PSU
Cooler Master Hyper TX2 HS/fan
Antec 900 case
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Originally posted by: n7
In theory perhaps.

But in reality, the extra bandwidth helps.

All you have to do is look @ DDR3, which is basically designed to run 1:2, twice as fast as the FSB to realize that bandwidth is obviously important.

I don't know the exact whys, but i know it helps.

Then why does almost every review of the same X38 mobo running DDR2 get the same FPS numbers and general performance scores as DDR3, give or take a few points? Is it that noticeable? Answer: no, not really.
 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
Originally posted by: MadScientist
Ok, Duck
I've turned to the 1:1 side for now. I've got 2 configs for my son to try 3D rendering benchmarks on that are Memtest86+ and Prime95 stable (8 hr; no time to test longer). Both at 3.404 GHz CPU speed and 1.2775 vcore, MCH at the 1.25V default:
1. RAM at 1:1, 756 MHz, 3-3-3-8-2T, 2.2V
2. RAM at 1:1.25, 946 MHz, 5-4-4-12-2T, 2.2V

I was hoping to get the VDIMM down, but my RAM doesn't like 2.1V at 3-3-3-8-2T even after bumping the vMCH up 2 notches.

I'll post back with the results. It might be awhile, my son's no ball of fire.

Q6600, 3.4 Ghz @ 1.2775 vcore
Geforce 7600GT
4 x 1 GB Crucial Ballistix PC6400
Vista Ultimate 64
2 x Maxtor 300 GB SATA HDs
Samsung SH-S203B SATA DVD Burner
Corsair 620W PSU
Cooler Master Hyper TX2 HS/fan
Antec 900 case

During this season, I need to show my gratitude for good fortune -- like someone trying something with the same RAM-module configuration I have yet to test. I have this other "build" in progress with 4x1GB rebadged Tracers. ["Newegg LanFest Crucials"]

I'll be looking forward to your results. Did you ever post earlier what board and chipset you're running with this setup?
 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
Hope y'all had a great holiday.

I've been testing my "NewEgg LanFest Crucial" [Tracer] DDR2-800's with a 5:6 ratio -- what some of you are calling 1.0 : 1.2.

So -- over-clocking my Q6600 to the (easy) 3.0 GHz at FSB 1,333 -- the memory speed is set to (2 x 400 =) 800 MHz. I'm testing timings tightened from 4,4,4,12 to 4,3,4,8.

Earlier this week, I had these set at 3,3,3,8,1T @ DDR2-667. To get the tighter timings at the lower speed, I had to run the modules at a VDIMM voltage of 2.15V. (But the BIOS monitor shows it as 2.19V -- "gettin' close . . . . ")

Right now, I'm running these -- as I said -- in "native" DDR2-800 mode with the 4,3,4,8,2T timings -- at 2.075V, with Memtest86+ showing no errors. They ran the Prime95 Blend test for 3 hours at 2.10V.

I lose about 500 MB/s in bandwidth with the memory running at 800 MHz, versus the 667 MHz setting with tight timings.
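A back-of-the-envelope sketch (my own arithmetic, not a measurement) of why memory running faster than the FSB can't really add bandwidth on this platform: the quad-pumped 64-bit FSB itself sets a throughput ceiling.

```python
# Theoretical peak-bandwidth ceilings (real benchies land well below these).
def fsb_peak_mb_s(fsb_mhz):
    return fsb_mhz * 4 * 8          # quad-pumped, 64-bit (8-byte) bus

def mem_peak_mb_s(ddr_rate, channels=2):
    return ddr_rate * 8 * channels  # double data rate, 64 bits per channel

print(fsb_peak_mb_s(333))   # 10656 MB/s: the FSB ceiling at a 333 MHz FSB
print(mem_peak_mb_s(800))   # 12800 MB/s: dual-channel DDR2-800 already exceeds it
```

So dual-channel DDR2-800 can theoretically deliver more than the FSB can carry, which fits with the small (or negative) gains I'm seeing from running the divider above 1:1.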

Now I still have to read some things that require multiple passes -- like your college Atomic Physics text and the "Special Theory of Relativity."

But there is something in DDR2 memory called "additive latency," which affects CAS. Without pursuing greater illumination and understanding (the Christmas brandy, you know), it allows for a steadier stream of data without clock-cycle gaps -- with no delays between banks or rows. This MAY be the industry's justification for higher-speed, looser-latency memory running at a different speed than the CPU's external frequency.

But in the synthetic bandwidth benchmarks, I've yet to prove that anything is better than a 1:1 ratio. My highest bandwidth benchies have come in between 9,500 and 9,750 MB/s, and I can't match them unless I over-clock the CPU to a level requiring a higher VCORE (not my comfort level) at a 1:1 ratio. Even so, I would still need to loosen the memory timings to something like 4,3,4,8.

With this 5:6 divider and those timings, I only get a benchmark result of about 8,700 MB/s.

My holiday pronouncement : We'll probably be better off with processors stock-rated at 1,333. An over-clock to 1,600 is only a 20% leap. Unless there are motherboard or chipset limitations, this would provide the 1:1 ratio, the full benefit of DDR2-800 speed, and the latency settings would be a matter of choosing quality dual-channel kits.

Maybe I sound like Rip Van Winkle, because I would be surprised if someone hadn't said this two years ago. And I know there are many who are running their DDR2-1066's at the rated spec.

But this still looks like "the bottom line" to me. And we'll have to wait and see how the DDR3 hype turns into real performance gains. So far, the tests run in this month's CPU Power User Magazine are still eating dust -- from my 5:6 DDR2-800 benchies of 8,700 MB/s.

 

lopri

Elite Member
Jul 27, 2002
13,310
687
126
Well, the title of this thread is kinda misleading.. I haven't seen any memory running faster than the CPU on a current Intel platform. :D Obviously you guys are talking about the FSB here - and this discussion itself shows a great deal about the imbalance of the FSB system and the confusion surrounding it. Basically, the CPU runs much, much faster than the FSB, and memory runs quite a bit faster than the FSB. You can imagine where the famous 'FSB bottleneck' comes from. It was a crucial factor for Intel's chipset business, however, so they couldn't give it up. The introduction of the Opteron series and its subsequent success finally made Intel change their attitude towards the whole FSB nonsense, and we will see their first effort this time around next year.

Also, you cannot possibly discuss the benefit/penalty of sync/async memory, or bandwidth vs. latency, without the prerequisite: under which chipset (or even mobo) and which configuration? Every chipset (mis)handles memory differently. One visible trend, though, is that Intel is moving towards higher bandwidth at the cost of latency, and there will always be overhead in running memory at a non-integer ratio (i.e. 1:1, 1:2, 1:4 will be more efficient than 5:6, 4:5, 2:3, etc., regardless of absolute performance). One of the biggest reasons for this is obviously the quad-core CPUs.
 

MadScientist

Platinum Member
Jul 15, 2001
2,182
62
91
Originally posted by: lopri
Well, the title of this thread is kinda misleading.. I haven't seen any memory running faster than CPU on current Intel platform. :D Obviously you guys are talking about FSB here - and this discussion itself shows a great deal about the imbalance of FSB system and the confusions surrounding it.
Getting back to the OP's question, does everyone here agree then that his system would run better (or no differently) if he ran his RAM at 1:1, 333 MHz, in sync with his FSB, instead of at its rated 400 MHz?

The following is a quote from AnandTech's latest article, "Overclocking Intel's New 45nm QX9650: The Rules Have Changed". Not sure if this helps or adds to the confusion. Their recommendation of starting at an FSB of 400 MHz may not be doable for some, depending on the CPU; i.e., I can do 3.6 GHz (9 x 400 MHz) stable on my Q6600, but prefer to keep it lower due to temps.

"Tuning Memory Subsystem Performance

Earlier, we talked about the importance of first testing your motherboard's memory subsystem before moving to the CPU. When you think about it, the reason is clear. Encountering an error while testing blindly provides absolutely no helpful information as to the source of the problem. Since both the CPU and memory stability are dependent on the FSB it only makes sense that we remove them from the equation and first tune our motherboard at our target FSB. This is accomplished by setting the target FSB (we recommend you start at 400 MHz) in the BIOS, making certain to select a CPU multiplier which places the final processor frequency at or below the default value. Next, loosen up all primary memory timings and set the memory voltage to the modules' maximum rated value. Assuming the system is in good working order, we can now attribute all observed errors to discrepancies in the MCH settings and nothing else.


Preparing to run Prime95's blend test for the first time


Boot the system in Windows and launch an instance of Prime95. From the menu select "Options" then "Torture Test..." and highlight the option to run the blend test (default). Now click "OK" to start the test. The blend test mode runs larger FFT values, meaning the processor must rely heavily on the memory subsystem when saving and retrieving intermediate calculation results. Although a true test of system stability would require many hours of consecutive testing, in the interest of time let the program execute for a minimum of 30 minutes.

If you encounter no errors (and the system is indeed still running), you can consider the memory subsystem "stable" at this point. If this is not the case, exit Windows, enter the BIOS, and try slightly increasing the MCH voltage. Repeat this process until you find you can complete (at least) a 30 minute run with no errors. If for some reason you find that increasing the MCH voltage continues to have no effect on stability, or you have reached your allowable MCH voltage limit, you may be attempting to run the MCH higher than what is achievable under stable conditions. Setting Command Rate 2N - if available in the BIOS - loosening tRD, or removing two DIMMs (if you are running four) may help. If you find modifications to those items allow for completion of an initial Prime95 test, be sure to continue the testing by reducing the MCH voltage until you find the minimum stable value before moving on.

On the other hand, if you find that you can comfortably complete testing with additional MCH voltage margin to spare then you are in a good position to dial in some extra performance. Whether or not you wish to depends on your overall overclocking goal. Generally, more performance requires more voltage; this means more heat, higher temperatures, and increased operating costs. If efficiency is your focus, you may wish to stop here and move on to the next phase in tuning. Otherwise, if performance is your only concern, decreasing tRD is a great way of improving memory bandwidth, albeit usually at the expense of a higher MCH voltage.

In the end, as long as the system is stable, you are ready to move on to the next step. The insight necessary to determine just what to change and the effect it will have on stability and performance is something that comes only with experience. We cannot teach you this; experimenting further at a later time will help you sharpen these skills.


Select a Memory Divider and Set Some Timings

The latest generation of Intel memory controllers provides a much more expansive choice in memory dividers than ever before. That said, there are only three that we ever use, the most obvious of these being 1:1. Setting 1:1, simply put, means that the memory runs synchronously with the FSB. Keep in mind though that the FSB is quad-pumped (QDR) and memory is double data rate (DDR). For example, setting an FSB of 400MHz results in a 1.6GHz (4 x 400) effective FSB frequency at DDR-800 (2 x 400), assuming your memory is running 1:1. Selecting 5:4 at an FSB of 400MHz sets a memory speed of DDR-1000 (5/4 x 2 x 400). The other two dividers we would consider using besides 1:1 are 5:4, and in the case of DDR3, 2:1.

Regrettably, there are rarely any real performance gains by moving to memory ratios greater than 1:1. While it is true that many synthetic benchmarks will reward you with higher read and copy bandwidth values, the reality of the situation is that few programs are in fact bottlenecked with respect to total memory throughput. If we were to take the time to analyze what happens to true memory latency when moving from DDR2-800 CAS3 to DDR2-1000 CAS4, we would find that overall memory access times might actually increase. That may seem counterintuitive to the casual observer and is a great example of why it's important to understand the effect before committing to the change.

Start your next phase of tuning by once again entering the BIOS and selecting a memory divider. As mentioned earlier, even though there are many choices in dividers you will do best to stick to either 1:1 or 5:4 when using DDR2 and 2:1 when running DDR3. Next set your primary timings - typically, even the worst "performance" memory can handle CAS3 when running at about DDR2-800, CAS4 to about DDR2-1075, and CAS5 for anything higher. These are only approximate ranges though and your results will vary depending on the design of your motherboard's memory system layout, the quality of your memory, and the voltages you apply. You may find it easiest to set all primary memory timings (CL-tRCD-tRP) to the same value when first testing (i.e. 4-4-4, 5-5-5, etc.), and as a general rule of thumb, cycle time (tRAS) should be set no lower than tRCD + tRP + 2 when using DDR2 - for DDR3 try to keep this value between 15 and 18 clocks inclusive."
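To put rough numbers on the article's DDR2-800 CAS3 versus DDR2-1000 CAS4 point, here's a back-of-the-envelope sketch (first-word CAS delay only; it ignores tRCD, tRP, and command rate):

```python
# First-word CAS delay in nanoseconds: CAS cycles divided by the memory clock.
def cas_ns(cas_cycles, ddr_rate):
    mem_clock_mhz = ddr_rate / 2             # DDR: clock is half the data rate
    return cas_cycles / mem_clock_mhz * 1000 # cycles / MHz * 1000 = ns

print(cas_ns(3, 800))    # DDR2-800 CAS3  -> 7.5 ns
print(cas_ns(4, 1000))   # DDR2-1000 CAS4 -> 8.0 ns: access time actually rises
```

That's the counterintuitive result the article is describing: the "faster" divider setting can have a longer real access time.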



 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
Thanks, MadScientist. I either missed that article before, or read through it too quickly.

 

BonzaiDuck

Lifer
Jun 30, 2004
16,178
1,777
126
It's probably easier to start at 400 MHz under the implicit scenario of the hardware addressed by the article: a QX9650 @ 1,333 FSB default. For our Q6600s @ 1,066 FSB, it's a bigger leap.

Somewhere else today -- perhaps in an article also at AnandTech about OC'ing the Kentsfield Q6600 B3 stepping -- they suggest that a "safe" voltage for the B3 is a maximum of 1.46+ volts.

[I wanna say . . . . . "Are you KID-DING?!"]