Accessing System Memory
Accessing system memory involves a rather complicated sequence of events. First, the CPU requests data from where it believes those data reside, that is, from the logical or virtual address space that is created for every application and program running. This virtual address space must be translated into the real or physical address space, a task handled largely by the memory controller, an integral part of the chipset. After the correct physical address has been determined using the translation cues stored in the CPU's translation lookaside buffers (TLBs), the address signals have to be generated. The first selection narrows the location of the data down to one side of a memory module by means of the chip-select signal. Afterwards, since we are dealing with DRAM, the first signal sent from the memory controller to the memory is the row address, issued by means of a Row Activate command.
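To make the "narrowing down" concrete, the physical address can be thought of as a bit field that the memory controller slices into column, bank, and row coordinates. The following sketch uses illustrative bit widths (10 column bits, 2 bank bits, 13 row bits) for a hypothetical module; real controllers use mappings specific to the installed memory.

```python
# Illustrative decode of a physical address into DRAM coordinates.
# The bit widths below are assumptions for a hypothetical module,
# not taken from any particular datasheet.

COL_BITS, BANK_BITS, ROW_BITS = 10, 2, 13

def decode_address(phys_addr: int) -> dict:
    """Split a physical address into column, bank, and row fields."""
    col = phys_addr & ((1 << COL_BITS) - 1)
    bank = (phys_addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (phys_addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return {"row": row, "bank": bank, "col": col}

print(decode_address(0x00ABCDEF))
```

Note that all addresses sharing the same row and bank bits land in the same DRAM page, which is why sequential accesses tend to stay "in page".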
Time-Muxed Row and Column Address Generation and the Three Key Latencies
tRCD and CAS Delay
Instead of using a handshake protocol to acknowledge that the row is ready, synchronous DRAM (SDRAM) specifies a fixed time after which it is safe to assume that the row is open, the so-called Row-to-Column Delay (tRCD). Once this statistically sufficient interval has elapsed and tRCD is satisfied, the row decoders are turned off and the column decoders are turned on by asserting the Column Address Strobe (CAS) command line. This allows the same address lines that carried the row address to now carry the column address as part of a Read command. This sequence of events, using the same lines to perform two different tasks, is called time-multiplexing or "time-muxed DRAM addressing". After the correct column has been selected and the data have been retrieved (prefetched) from the memory cells into the output buffers, the data are ready to be released to the bus. This second time interval is called the CAS delay or CAS latency.
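The command sequence above can be sketched as a simple cycle-by-cycle timeline. The timing values here (tRCD of 3, CAS latency of 2) are illustrative examples, not taken from a specific device.

```python
# Minimal sketch of the time-muxed command sequence for one read to a
# closed row, counting memory-bus clock cycles. Values are illustrative.

tRCD = 3   # Row-to-Column Delay: ACTIVATE -> READ
CL   = 2   # CAS latency: READ -> first data word

def read_timeline(trcd: int, cl: int) -> list:
    """Return (cycle, event) pairs for a single read to a closed row."""
    return [
        (0, "ACTIVATE: row address on the shared address lines"),
        (trcd, "READ: column address on the same lines"),
        (trcd + cl, "data appear on the bus"),
    ]

for cycle, event in read_timeline(tRCD, CL):
    print(f"cycle {cycle}: {event}")
```

The point of the sketch is that the address lines are reused: the same wires carry the row address at cycle 0 and the column address at cycle tRCD.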
tRP
As long as the requested data are found within the same row (or page) of memory, consecutive accesses are "in page", the so-called "page hits". Any request for data stored outside the currently open row misses that page and is therefore called a page miss. In that case, the open page has to be closed and the new page opened. This sequence of events includes disconnecting the wordlines, writing all data back from the sense amplifiers to the memory cells, and finally shorting the bitlines and their complements (bitlines "bar") to return everything to a virgin state. This process is generally referred to as RAS precharge, and the time required to execute all the steps involved is called the precharge latency or tRP.
In order to retrieve the next set of data, the appropriate memory row has to be opened with a Bank Activate command, and the cycle completes.
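Put together, the cost of an access depends on whether the target row is already open. A minimal sketch, using the three latencies discussed above with illustrative example values:

```python
# Cycles until first data for a page hit vs. a page miss.
# Example timings only, not from a real datasheet.

CL, tRCD, tRP = 2, 3, 2

def access_latency(page_hit: bool) -> int:
    """Cycles until data arrive, for an open (hit) or closed (miss) row."""
    if page_hit:
        return CL               # row already open: only the CAS latency
    return tRP + tRCD + CL      # precharge, re-activate, then read

print("page hit :", access_latency(True), "cycles")
print("page miss:", access_latency(False), "cycles")
```

With these example values a page miss costs 7 cycles against 2 for a hit, which is why keeping accesses within an open page matters so much for performance.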
Latency Listings
There is no general consensus on how to list the latency parameters: some vendors start with the precharge, others use tRCD first. However, the JEDEC Solid State Technology Association, formerly known as the Joint Electron Device Engineering Council (JEDEC), has set forth guidelines for the nomenclature and the code printed on modules to specify these parameters. According to these specifications, the sequence used is CAS Latency - tRCD - tRP - tRAS, where tRAS is the minimum bank cycle time, that is, the time a row must be kept open before another request can force it to be closed. Therefore, a module specified as 2-3-2-7 uses a CAS latency of 2, a tRCD of 3, a precharge delay of 2, and a tRAS of 7.
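A rating such as 2-3-2-7 can be unpacked mechanically following the JEDEC ordering just described; the small helper below is a hypothetical illustration of that mapping.

```python
# Parse a module rating like "2-3-2-7" using the JEDEC ordering
# described above: CL - tRCD - tRP - tRAS.

def parse_timings(spec: str) -> dict:
    cl, trcd, trp, tras = (int(x) for x in spec.split("-"))
    return {"CL": cl, "tRCD": trcd, "tRP": trp, "tRAS": tras}

print(parse_timings("2-3-2-7"))
# {'CL': 2, 'tRCD': 3, 'tRP': 2, 'tRAS': 7}
```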
In general, lower latencies yield better performance, but there are a number of exceptions. Most memory devices only support latency settings of 2 and higher; however, there have been memory chips capable of running at 1-1-1-2, notable examples being the EMS HSDRAM / ESDRAM series. One important distinction between the CAS delay and the other latencies is that the CL options have to be supported in hardware in the form of pipeline stages, whereas the other latencies are simply "time-to-complete" values.