Poll: Samsung says Rambus = Lowest latency / Highest Bandwidth?

fkloster

Diamond Member
Dec 16, 1999
4,171
0
0
Read here. Is this BS?



<< Measured at either the component or system level, Rambus DRAMs have the fastest latency. Surprisingly, due to the mismatch between its interface and core timing, the 133 MHz SDRAM is significantly slower than the PC100 SDRAM. The RDRAM's low latency coupled with its 1.6 gigabyte per second bandwidth provide the highest possible sustained system performance >>



NO FLAMES / NO HATRED: NO NEED FOR THAT CRAP
 

Vegito

Diamond Member
Oct 16, 1999
8,329
0
0
There's probably a clause saying that when they license to manufacture Rambus, they have to publish this Rambus garbage... so how is it that PC133 beats Rambus in actual tests? Those guys are smoking crack.
 

PowerJoe

Senior member
Oct 9, 1999
887
0
0
That's why the latency graph says "Lower numbers mean better performance". Some people just can't read...

-PJ
 

KarlHungus

Senior member
Nov 16, 1999
638
0
0
That report is BS. The writer generalizes the best case for Rambus as being the overall system latency. Yes, in the case of a single Rambus memory chip the latency could very well be less than SDRAM. Unfortunately for Rambus and their backers this configuration is far from the norm in PCs where there are multiple chips in each RIMM, and multiple RIMMs in each system (if you can afford it ;) ). Any report with static numbers referring to Rambus latency isn't telling the whole truth.
 

Osangar

Junior Member
Sep 19, 2000
22
0
0
Much of the latency of DRDRAM comes from the way it is used in PC memory. (DRDRAM uses the same DRAM cells as SDRAM so at that level the latencies are the same.)

For small embedded devices with memory located close to the processor and soldered directly to the board, Rambus can get very good latency, maybe even better than SDRAM. In a typical PC setup, with the processor interfacing to the memory through a chipset and very long channels, DRDRAM is at a big latency disadvantage.

There are some ways to reduce this. A fast bus like the P4's can help a lot, and an on-chip memory controller like the EV7's (Alpha 21364) is even better.

A while ago one of the Compaq engineers working on the EV7 bus defended the use of Rambus in a thread at aceshardware, and what he said was quite interesting.

What it boiled down to was that the high bandwidth per pin allows you to move the memory controller on chip. This reduces the latency below that of SDRAM or DDR SDRAM with an off-chip memory controller. The approach was only practical with a memory like Rambus that has very high bandwidth per pin: even DDR would require nearly 4 times as many pins for memory as DRDRAM, and the total number of pins is limited by production technologies.
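To put rough numbers on that pin argument (the figures below are my assumptions, not from the thread: PC800 RDRAM moving 1.6 GB/s over a 16-bit channel versus PC2100 DDR moving 2.1 GB/s over a 64-bit bus):

```python
# Rough bandwidth-per-pin comparison. Figures are assumptions, not from
# the thread: PC800 RDRAM (16 data pins) vs PC2100 DDR SDRAM (64 data pins).
rdram_bw, rdram_pins = 1.6e9, 16
ddr_bw, ddr_pins = 2.1e9, 64

pin_ratio = ddr_pins / rdram_pins  # DDR needs 4x the data pins
per_pin_ratio = (rdram_bw / rdram_pins) / (ddr_bw / ddr_pins)

print(pin_ratio)                # 4.0
print(round(per_pin_ratio, 1))  # RDRAM delivers ~3x the bandwidth per pin
```

That 4x data-pin ratio is where the "nearly 4 times as many pins" figure comes from; per-pin bandwidth lands around 3x because the DDR bus runs slightly faster in aggregate.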

So what Samsung said was not necessarily untrue, it just does not apply to current PC designs.
 

Noriaki

Lifer
Jun 3, 2000
13,640
1
71
I'll do my best to keep my personal opinion of Rambus (at least the company) out of this thread.

I've read in several places that a single Rambus chip has very good latency; the trouble comes in when you start toggling between active and sleep states.

Because a single Rambus chip produces a lot of heat, if you had all of them active at once a RIMM would need a fan. What they do to combat this is keep only 1 chip on the RIMM active at a time while all the others sleep; combined with the heat spreader plate, this lowers the overall heat/power to be on par with SDRAM.

If there were some other way to combat the heat, and you could have all the Rambus chips active at once, it would significantly help latency. I'm not saying Rambus would end up with lower latency than SDRAM, but we can't know for sure without seeing it in action.

There is the common argument that adding more RIMMs to a channel slows the channel, and that may be true. But all the data is latched together on a clock cycle, the same as it is with SDRAM; adding DIMMs makes a system slower too, you just don't notice it because they are clocked together. Rambus's delay from adding RIMMs may be larger, but by far the biggest culprit in Rambus's latency is this business of flipping between Active and Wait states.

Perhaps measured at the system level without using Wait states Rambus does have the lowest latency.

As for the article overall, it's very poorly written. If you ignore the latency added by state switching on RIMMs then most of it is true, but barely... Let's just pick a few of my favourite parts:



<< sustainable system bandwidth has jumped a factor of 10 over SDRAM. >>


PC100/PC133 SDRAM have 800MB/s and 1GB/s respectively; Rambus has 1.6GB/s. He said sustainable, not peak, but even so I find it hard to believe he got a factor of 10.
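A quick sanity check on that factor-of-10 claim, using the peak figures just quoted (taking PC133 at its exact 1.064 GB/s peak):

```python
# Peak bandwidth figures, bytes per second.
pc100 = 800e6    # PC100 SDRAM
pc133 = 1.064e9  # PC133 SDRAM (~1 GB/s)
rdram = 1.6e9    # single PC800 Rambus channel

print(rdram / pc100)  # 2.0x over PC100
print(rdram / pc133)  # ~1.5x over PC133
# Nowhere near a factor of 10, even before sustained-vs-peak effects.
```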

Let's say he is right... so now we are talking sustainable bandwidth, meaning in practice, not theoretical, right? But then:


<< The system latency of PC100 and 133 MHz SDRAM adds five clocks to the component latency. The total SDRAM system latency is 90 ns for PC100 and 75 ns for 133 MHz SDRAM.
...
Adding in the component latency, the RDRAM system latency is 70 ns, significantly faster than both PC100 and 133 MHz SDRAM.
>>


That's a pretty theoretical number crunch. So make up your mind, guy: are we talking theoretical numbers, or real-world?
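For what it's worth, the article's own numbers are at least internally consistent. The clock periods and the five-clock system overhead are from the quoted text; the implied component latencies are my back-calculation:

```python
# Quoted system latencies and clock periods; component latency is
# back-calculated as: system = component + 5 clocks of overhead.
clock_ns = {"PC100": 10.0, "PC133": 7.5}
system_ns = {"PC100": 90.0, "PC133": 75.0}
overhead_clocks = 5  # "adds five clocks to the component latency"

for part, clk in clock_ns.items():
    component = system_ns[part] - overhead_clocks * clk
    print(part, component)  # implied component latency, ns
```

This yields 40 ns implied component latency for PC100 and 37.5 ns for PC133, but that is exactly the "theoretical number crunch" being complained about.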




<< The superior electrical characteristics of a Rambus system eliminate the two-cycle addressing problem, requiring only 10 ns to drive the address to the RDRAM. >>


Hrmm, didn't he just finish saying that Rambus has a 1.25ns cycle? So isn't 10ns 8 clocks instead of 2? Oh yeah, that solves the two-cycle problem. As I recall the statement is true: it only takes 10ns to address Rambus, and 20ns for SDRAM. But he makes it sound like Rambus has managed to squeeze the whole address phase into 1 clock cycle. It would be 1 clock on SDRAM, but it isn't on Rambus...
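The cycle arithmetic behind that complaint, as a sketch (cycle times per the quoted article):

```python
# Address-drive time expressed in each technology's own clock cycles,
# using the cycle times the article itself quotes.
rambus_cycle_ns, sdram_cycle_ns = 1.25, 10.0
rambus_addr_ns, sdram_addr_ns = 10.0, 20.0

print(rambus_addr_ns / rambus_cycle_ns)  # 8 Rambus clocks, not 1
print(sdram_addr_ns / sdram_cycle_ns)    # 2 SDRAM clocks (the "two-cycle problem")
```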

I can't say I see any outright lies, but it's misleading at best...
 

fkloster

Diamond Member
Dec 16, 1999
4,171
0
0


<< Because a single Rambus chip produces a lot of heat, if you had all of them active at once a RIMM would need a fan. What they do to combat this is have only 1 chip on the RIMM active at a time, while all the others sleep, with the heat spreader plate, this lowers the overall heat/power to be on par with SDRAM. >>

-Noriaki

I have a big problem with this statement, Noriaki. I have had two RIMMs running in my setup for many months and their heat spreaders are lukewarm to the touch, not hot. Also, in my BIOS I can manually disable my RDRAM clock sleep state; I have had it disabled for months with no added heat under heavy loads (Unreal flyby for days). And do you have a link for your statement about only 1 chip on the RIMM being active at one time? Power consumption of RIMMs per pin should be much lower than SDRAM's, should it not?
 

Mark R

Diamond Member
Oct 9, 1999
8,513
16
81
Rambus chips produce a lot more heat than SDRAM. A typical PC800 Rambus chip produces about 1.6 W when active, compared to a PC133 SDRAM chip which produces 0.4 W. Additionally, idle Rambus chips produce about 0.4 W, idle SDRAM chips use about 0.06 W. [Source: Samsung data sheets]

[Active in this case is taken as meaning having pages open, therefore being ready to send/receive, rather than actually transmitting/receiving data]
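Plugging those per-chip figures into a module shows why a one-active-device policy matters (the 8-chips-per-module configuration is my assumption for illustration):

```python
# Per-chip power figures as quoted from the Samsung data sheets (watts).
RDRAM_ACTIVE, RDRAM_IDLE = 1.6, 0.4
SDRAM_ACTIVE = 0.4
CHIPS = 8  # assumed chips per module

all_active = CHIPS * RDRAM_ACTIVE                     # 12.8 W per RIMM
one_active = RDRAM_ACTIVE + (CHIPS - 1) * RDRAM_IDLE  # 1.6 + 7*0.4 = 4.4 W
sdram_all = CHIPS * SDRAM_ACTIVE                      # 3.2 W per DIMM

print(all_active, one_active, sdram_all)
```

With every device active the RIMM dissipates roughly 4x what an SDRAM DIMM does; keeping one device active brings it near SDRAM territory, at a latency cost.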

I think the limitation of only having 1 or 2 active Rambus chips at a time is a limitation of the i820 chipset - unfortunately, I don't have this information to hand.

This limitation may well have been deliberate in order to prevent overheating of RIMMs - 4 active chips on a RIMM would certainly risk overheating.


 

Osangar

Junior Member
Sep 19, 2000
22
0
0
Power management doesn't kill Rambus performance, it just gives away an advantage that DRDRAM should have. It limits the 820 chipset to 8 open pages total, but this is still more than the 4 that SDRAM is limited to.

Read here.

http://www.realworldtech.com/page.cfm?ArticleID=RWT112299000000

This is from Paul DeMone's Rambus article. Paul is a DRAM designer and this is the best article on the web about Rambus. The only thing to remember is that his analysis is based on the 820 chipset which is missing a lot of the features of later Rambus designs.

The 840 for example uses interleaved memory to hide latency and supports many more open pages. The 850 (for the P4) should be synchronous which should reduce latency and the faster FSB should reduce the total latency as well.
 

fkloster

Diamond Member
Dec 16, 1999
4,171
0
0
Linkified

I must argue that IF a RIMM is limited to one active chip at a time, then it would certainly NOT be a limitation of the i820 chipset but of the actual serial architecture of RDRAMs.
 

Osangar

Junior Member
Sep 19, 2000
22
0
0
On the contrary, it would be a limitation of the chipset. The i820 only supports 8 active pages total; RDRAM supports either 8 or 16 open pages per device, and the i840 supports at least 32 open pages.
 

fkloster

Diamond Member
Dec 16, 1999
4,171
0
0
So the chipset (i820) has the ability to use only one chip on any given RIMM at a time?
 

Osangar

Junior Member
Sep 19, 2000
22
0
0
This is a little unclear. It would make sense, since the chipset can only support 8 open pages and one chip can have as few as 8 open pages, but I do not know whether the pages can span more than a single chip. It isn't that important, though, since it is the number of open pages that counts.
 

HayZeus 2000

Senior member
Dec 22, 1999
234
0
0
BS. Since Noriaki has already dissected the article, I'll just add that skewing a whitepaper is nothing new.

Okay, now to clear up some issues:

The one-device-per-channel limitation is a design limitation of the Rambus interface itself. Think about it: Rambus is serial in nature, so a single Rambus chip (device) can spit out 16 bits of data at a time on a 16-bit bus (channel).

I believe the 8 active page limitation is based on the fact that current Rambus chips contain 16 pages per chip. Since these pages share sense amps with their neighbors, only half of the pages may be open at a given time. The most pages a single Rambus chip can have is 32, but to my knowledge a 32 page Rambus chip has yet to be mass-produced.

Lastly,


<< I have a big problem with this statement Noriaki. I have had two rimms running in my setup for many months and their heat spreaders are luke warm to the touch >>



Sounds like your heat spreaders are doing a good job, then. Think about it: at 1.6W dissipation for the active Rambus device, that's close to the dissipation of a laptop processor. While most of the time a device wouldn't stay active long enough to heat up enough to impact performance, the possibility exists that it could remain open too long and overheat; hence the heat spreader.

edit: fixed italics
 

Osangar

Junior Member
Sep 19, 2000
22
0
0
"The one device per channel limitation is a design limitation of the Rambus interface itself. Think about it: Rambus is serial in nature-a single Rambus chip (device) can spit out 16 bits of data at a time on a 16 bit bus (channel)."

I disagree. The number of open banks, and therefore the number of open pages, is limited by the implementation, not by the RDRAM technology itself. Since the technology is serial, there is nothing to prevent having more than 1 device active except for power management, which is done by the chipset. Here is a quote from Paul DeMone's article:


"A 128 Mbyte DIMM containing eight SDRAMs has only 4 banks since the devices operate in parallel, but a 128 Mbyte RIMM with eight DRDRAMs can have 256 banks. Since only a single page in a bank can be open at a time, having lots of banks should allow many pages to be open and reduce the chances of page conflicts between two or more memory access threads of locality."

Of course, he does go on to explain the limitations imposed by shared sense amps, etc. Everything I have read about RDRAM says it can support a large number of open pages.

I am confident that the limitation is imposed by the chipset, not the technology itself. If you have links to the contrary I would like to see them.
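The bank arithmetic behind DeMone's comparison works out like this (device and bank counts are from the quote; the 32-banks-per-DRDRAM figure is implied by 256 banks across 8 devices):

```python
# Bank counts for a 128 MB module, per the DeMone quote above.
sdram_banks_per_dimm = 4      # eight SDRAMs operating in parallel share 4 banks
drdram_devices = 8
drdram_banks_per_device = 32  # implied: 256 total banks / 8 devices
drdram_banks_per_rimm = drdram_devices * drdram_banks_per_device

print(drdram_banks_per_rimm)                          # 256
print(drdram_banks_per_rimm // sdram_banks_per_dimm)  # 64x more banks than the DIMM
```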
 

HayZeus 2000

Senior member
Dec 22, 1999
234
0
0
When I said that a single RDRAM chip can spit out 16 bits of data on a 16-bit channel, hence the device limitation, I assumed a state of ATTNR or ATTNW, i.e. actually reading or writing data, which is fairly self-evident. If by active you mean a state of ATTN and NOT reading/writing data, then you are right: that is a function of the chipset. I apologize for inadvertently adding to the confusion I was trying to clear up. :)

 

rmblam

Golden Member
Aug 24, 2000
1,237
0
0
Excerpts from here

"As you add more devices to a RAMBUS system, the entire system has higher and higher read latency. So, while individual RDRAM chips might have a read latency (access time) of 20ns, which is about the same read latency as some SDRAMs, once you stick them in a system with three full RIMMs the overall system latency (which is the total amount of time from when the CPU sends out the read command and the data arrives back at it) will be either slightly better or significantly worse than the system latency for an SDRAM system"

"Further aggravating the read latency situation is the fact that RAMBUS doesn't support critical word first bursting. When the CPU asks for 8 bytes of data from a conventional SDRAM, the memory system sends it back 16 bytes of data, under the presumption that it'll probably need those extra 8 bytes shortly. Nevertheless, the 8 bytes that were specifically asked for--the critical word--arrive at the CPU first, with the other freebie bytes coming next. RDRAM doesn't do this. It just sends you a whole 16 byte train of data, and if the 8 bytes you asked for are at the end of that train, then you'll just have to wait until they get there."
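The critical-word-first difference in that excerpt can be sketched as a toy model (this is illustrative pseudologic, not real controller code):

```python
# Toy model: a 16-byte line fetched as two 8-byte words.
# SDRAM-style burst reorders so the requested (critical) word comes first;
# RDRAM-style burst delivers the line in fixed order regardless.
def sdram_burst(line_words, requested_index):
    # Rotate the burst so the critical word arrives first.
    return line_words[requested_index:] + line_words[:requested_index]

def rdram_burst(line_words, requested_index):
    # Whole line arrives in fixed order; the CPU may wait for its word.
    return line_words

line = ["word0", "word1"]  # two 8-byte halves of a 16-byte line

# CPU asks for word1: SDRAM delivers it first, RDRAM makes it wait.
print(sdram_burst(line, 1))  # ['word1', 'word0']
print(rdram_burst(line, 1))  # ['word0', 'word1']
```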

"Finally, since the bus is so long and passes through so many devices, the capacitances added in by the loads of all of the attached devices significantly increase bus signal propagation time. So again, the more devices you stick on the RAMBUS channel, the worse the latency gets."

"It's important to note that since all the RDRAMs on a channel share the same data bus, only one device per channel can be in either the ATTNR or ATTNW states at any given time."

"So with RAMBUS' power management states, you basically trade off power savings for performance. The average system read latency, and thus the overall system performance, of a RAMBUS-based system will vary widely depending on how the chipset handles these states. The more RDRAMs that a system keeps in the lower power states, the less power it will use but the worse its performance will be.

One method of managing RDRAM's power states is called a closed page policy"



"Since only a device in the Active state can have active banks, and since only one device per channel can be active under a closed page policy, this policy limits the number of active banks you can have on a channel to the number of banks you can have active on a single RDRAM. So if you were using 32 bank RDRAMs with a closed page policy, the largest number of active banks (and hence open rows) you could have on a channel would be 16."

"The i820 also limits the total number of open pages per channel to 8, which sort of throttles RAMBUS' potential performance. Hopefully, future Intel chipsets will allow more open pages than this."

"This means that even though you could theoretically leave all the RDRAMs in a system in the ATTN state so that you could have half of the system's banks active, it's doubtful that you'd ever want to do this for systems with more than a few RDRAMs. The power consumption would be pretty high, and you might roast something."

"a RDRAM die is just larger than an SDRAM die, so an individual RDRAM chip generates more heat when all its parts are running full bore. This makes the issue of spreading this heat out especially pressing"

"Industry climate and public opinion aside, however, it seems that in the end, neither RDRAM nor DDR SDRAM will "win" in any sort of general sense. The two technologies are different enough to where they'll be used in specific markets in order to meet specific application usage profiles and specific system design requirements. How that plays out in the mainstream PC market remains to be seen. With the rumblings about Intel's possible intention to produce chipsets for the P4 that support DDR SDRAM, what was once thought to be an imminent, unstoppable descent of RDRAM into the mainstream now looks like a very complicated market scenario"