In-depth look at RAM timings: ATi vs nVidia vs Samsung

Sentential

Senior member
Feb 28, 2005
677
0
0
http://www.ocforums.com/showthread.php?t=403149

Felinuz and I have been looking into why ATi cards always seem to clock poorly compared to their nVidia-based counterparts. What we have found is very interesting. Pulling from Samsung's physical tech sheet:

Samsung Default Timings

/********************************************************************************/
/* Define Statement */
/********************************************************************************/

/********************************************************************************/
/* -12 Specification */
/********************************************************************************/
`ifdef S12 // -12 spec
`define tGR 2 // Gapless 2 tCK
`define tRTW 14 // Read to Write at same bank - CL=9tCK, tCDLR=5tCK

`define tRC 35 // Row cycle time(min) - operation (tCK)
`define tRFC 45 // Row cycle time(min) - Auto Refresh (tCK)
`define tRASmin 25 // Row active minimum time (tCK)
`define tRASmax 100000 // Row active maximum time (tCK)
`define tRCDRD 12 // Ras to cas delay(min) for Read (tCK)
`define tRCDWR 7 // Ras to cas delay(min) for Write (tCK)
`define tRP 10 // Row precharge time(min) (tCK)
`define tRRD 9 // Row to row delay(min) (tCK)
`define tWR 7 // Last data in Row precharge (7 tCK)
`define tCDLR 5 // Last data in to Read delay (6 tCK)
`define tCDLW 0 // Last data in to Write delay (0 tCK)
`define tCCD 2 // Col. address to col. address delay (3 tCK)
`define tCKmin 1.2 // Clock minimum cycle time (ns) - CL=9
`define tCKmax 6 // Clock maximum cycle time (ns) - CL=9

`define tCC5 2.5 // Clock minimum cycle time at cas latency=5 (ns)
`define tCC6 2.22 // Clock minimum cycle time at cas latency=6 (ns)
`define tCC7 1.8 // Clock minimum cycle time at cas latency=7 (ns)
`define tCC8 1.6 // Clock minimum cycle time at cas latency=8 (ns)
`define tCC9 1.4 // Clock minimum cycle time at cas latency=9 (ns)
`define tCHmin 0.45 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCHmax 0.55 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmin 0.45 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmax 0.55 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tDQSCK 0.23 // DQS out edge to clock edge (min:-0.26, max:0.26)
`define tDQSQ 0.14 // Data strobe edge to output data edge (min:-0.16, max:+0.16)
//`define tSLZ 0.75 // DQS low-Z to valid DQS delay @ Read Preamble (min:tCK-0.75, max:tCK+0.75)
//`define tSHZ 0.75 // Valid DQS to DQS Hi-Z delay @ Postamble (min:tCK/2+0.75, max:tCK/2+0.75)
//`define tHZQ 0.75 // Data out active to High-Z (min:tCK/2-0.75, max:tCK/2+0.75)
//`define tDQCK 0.16 // out data edge to clock edge (min:-0.16, max:0.16)
`define tDQSS1min 0.85 // Write command to first DQS latching transition - WL=1
`define tDQSS1max 1.15 // Write command to first DQS latching transition - WL=1
`define tDQSS2min 1.85 // Write command to first DQS latching transition - WL=2
`define tDQSS2max 2.15 // Write command to first DQS latching transition - WL=2
`define tDQSS3min 2.85 // Write command to first DQS latching transition - WL=3
`define tDQSS3max 3.15 // Write command to first DQS latching transition - WL=3
`define tDQSS4min 3.85 // Write command to first DQS latching transition - WL=4
`define tDQSS4max 4.15 // Write command to first DQS latching transition - WL=4
`define tDQSS5min 4.85 // Write command to first DQS latching transition - WL=5
`define tDQSS5max 5.15 // Write command to first DQS latching transition - WL=5

//`define tSDQS 0 // DQS-in setup time (ns)
//`define tWPREH 0.25 // DQS-in hold time (0.25 tCK)
//`define tSIHmin 0.45 // DQS-in high level width (0.45tCK)
//`define tSIHmax 0.55 // DQS-in high level width (0.55tCK)
//`define tSILmin 0.45 // DQS-in low level width (0.45tCK)
//`define tSILmax 0.55 // DQS-in low level width (0.55tCK)
//`define tSICmin 0.9 // DQS-in cycle time (0.9 tCK)
//`define tSICmax 1.1 // DQS-in cycle time (1.1 tCK)
`define tDSC 1 // DQS-in cycle time (1 tCK)

`define tIS 0.3 // Input setup time (ns)
`define tIH 0.3 // Input hold time (ns)
`define tMRD 7 // Mode register set cycle time (6tCK)
`define tDS 0.16 // Data in & DM set-up time (ns)
`define tDH 0.16 // Data in & DM hold time (ns)

//`define tDIPW 1.75 // Data in & DM input pulse width (ns)
//`define tDV 0.375 // Output DQS valid window (tCK)

//`define tPDEX 7.5 // Power Down exit Time (ns)
//`define tXSA 20000 // Exit self refresh to bank active command (tCK)
`define tREF 7.8 // Refresh interval time (us)
`define tWPST 0.4 // DQS write postamble time (tCK)
`define tDAL 24 // Auto precharge write recovery + precharge time (ns)
`endif


/********************************************************************************/
/* -14 Specification */
/********************************************************************************/
`ifdef S14 // -14 spec
`define tGR 2 // Gapless 2 tCK
`define tRTW 14 // Read to Write at same bank - CL=9tCK, tCDLR=5tCK

`define tRC 31 // Row cycle time(min) - operation (tCK)
`define tRFC 39 // Row cycle time(min) - Auto Refresh (tCK)
`define tRASmin 22 // Row active minimum time (tCK)
`define tRASmax 100000 // Row active maximum time (tCK)
`define tRCDRD 10 // Ras to cas delay(min) for Read (tCK)
`define tRCDWR 6 // Ras to cas delay(min) for Write (tCK)
`define tRP 9 // Row precharge time(min) (tCK)
`define tRRD 8 // Row to row delay(min) (tCK)
`define tWR 6 // Last data in Row precharge (7 tCK)
`define tCDLR 5 // Last data in to Read delay (6 tCK)
`define tCDLW 0 // Last data in to Write delay (0 tCK)
`define tCCD 2 // Col. address to col. address delay (3 tCK)
`define tCKmin 1.4 // Clock minimum cycle time (ns) - CL=9
`define tCKmax 6 // Clock maximum cycle time (ns) - CL=9

`define tCC5 2.5 // Clock minimum cycle time at cas latency=5 (ns)
`define tCC6 2.22 // Clock minimum cycle time at cas latency=6 (ns)
`define tCC7 1.8 // Clock minimum cycle time at cas latency=7 (ns)
`define tCC8 1.6 // Clock minimum cycle time at cas latency=8 (ns)
`define tCC9 1.4 // Clock minimum cycle time at cas latency=9 (ns)

`define tCHmin 0.45 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCHmax 0.55 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmin 0.45 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmax 0.55 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tDQSCK 0.26 // DQS out edge to clock edge (min:-0.26, max:0.26)
`define tDQSQ 0.16 // Data strobe edge to output data edge (min:-0.16, max:+0.16)

//`define tSLZ 0.75 // DQS low-Z to valid DQS delay @ Read Preamble (min:tCK-0.75, max:tCK+0.75)
//`define tSHZ 0.75 // Valid DQS to DQS Hi-Z delay @ Postamble (min:tCK/2+0.75, max:tCK/2+0.75)
//`define tHZQ 0.75 // Data out active to High-Z (min:tCK/2-0.75, max:tCK/2+0.75)
//`define tDQCK 0.16 // out data edge to clock edge (min:-0.16, max:0.16)
`define tDQSS1min 0.85 // Write command to first DQS latching transition - WL=1
`define tDQSS1max 1.15 // Write command to first DQS latching transition - WL=1
`define tDQSS2min 1.85 // Write command to first DQS latching transition - WL=2
`define tDQSS2max 2.15 // Write command to first DQS latching transition - WL=2
`define tDQSS3min 2.85 // Write command to first DQS latching transition - WL=3
`define tDQSS3max 3.15 // Write command to first DQS latching transition - WL=3
`define tDQSS4min 3.85 // Write command to first DQS latching transition - WL=4
`define tDQSS4max 4.15 // Write command to first DQS latching transition - WL=4
`define tDQSS5min 4.85 // Write command to first DQS latching transition - WL=5
`define tDQSS5max 5.15 // Write command to first DQS latching transition - WL=5

//`define tSDQS 0 // DQS-in setup time (ns)
//`define tWPREH 0.25 // DQS-in hold time (0.25 tCK)
//`define tSIHmin 0.45 // DQS-in high level width (0.45tCK)
//`define tSIHmax 0.55 // DQS-in high level width (0.55tCK)
//`define tSILmin 0.45 // DQS-in low level width (0.45tCK)
//`define tSILmax 0.55 // DQS-in low level width (0.55tCK)
//`define tSICmin 0.9 // DQS-in cycle time (0.9 tCK)
//`define tSICmax 1.1 // DQS-in cycle time (1.1 tCK)
`define tDSC 1 // DQS-in cycle time (1 tCK)

`define tIS 0.35 // Input setup time (ns)
`define tIH 0.35 // Input hold time (ns)
`define tMRD 6 // Mode register set cycle time (6tCK)
`define tDS 0.18 // Data in & DM set-up time (ns)
`define tDH 0.18 // Data in & DM hold time (ns)

//`define tDIPW 1.75 // Data in & DM input pulse width (ns)
//`define tDV 0.375 // Output DQS valid window (tCK)

//`define tPDEX 8.75 // Power Down exit Time (ns)
//`define tXSA 20000 // Exit self refresh to bank active command (tCK)
`define tREF 7.8 // Refresh interval time (us)
`define tWPST 0.4 // DQS write postamble time (tCK)
`define tDAL 280 // Auto precharge write recovery + precharge time (ns)
`endif

/********************************************************************************/
/* -16 Specification */
/********************************************************************************/
`ifdef S16 // -16 spec
`define tGR 2 // Gapless 2 tCK
`define tRTW 12 // Read to Write at same bank - CL=8tCK, tCDLR=4tCK

`define tRC 27 // Row cycle time(min) - operation (tCK)
`define tRFC 34 // Row cycle time(min) - Auto Refresh (tCK)
`define tRASmin 19 // Row active minimum time (tCK)
`define tRASmax 100000 // Row active maximum time (tCK)
`define tRCDRD 9 // Ras to cas delay(min) for Read (tCK)
`define tRCDWR 5 // Ras to cas delay(min) for Write (tCK)
`define tRP 8 // Row precharge time(min) (tCK)
`define tRRD 7 // Row to row delay(min) (tCK)
`define tWR 5 // Last data in Row precharge (7 tCK)
`define tCDLR 4 // Last data in to Read delay (6 tCK)
`define tCDLW 0 // Last data in to Write delay (0 tCK)
`define tCCD 2 // Col. address to col. address delay (3 tCK)
`define tCKmin 1.6 // Clock minimum cycle time (ns) - CL=8
`define tCKmax 6 // Clock maximum cycle time (ns) - CL=8

`define tCC5 2.5 // Clock minimum cycle time at cas latency=5 (ns)
`define tCC6 2.22 // Clock minimum cycle time at cas latency=6 (ns)
`define tCC7 1.8 // Clock minimum cycle time at cas latency=7 (ns)
`define tCC8 1.6 // Clock minimum cycle time at cas latency=8 (ns)
`define tCC9 1.4 // Clock minimum cycle time at cas latency=9 (ns)

`define tCHmin 0.45 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCHmax 0.55 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmin 0.45 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmax 0.55 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tDQSCK 0.26 // DQS out edge to clock edge (min:-0.26, max:0.26)
`define tDQSQ 0.18 // Data strobe edge to output data edge (min:-0.16, max:+0.16)

//`define tSLZ 0.75 // DQS low-Z to valid DQS delay @ Read Preamble (min:tCK-0.75, max:tCK+0.75)
//`define tSHZ 0.75 // Valid DQS to DQS Hi-Z delay @ Postamble (min:tCK/2+0.75, max:tCK/2+0.75)
//`define tHZQ 0.75 // Data out active to High-Z (min:tCK/2-0.75, max:tCK/2+0.75)
//`define tDQCK 0.16 // out data edge to clock edge (min:-0.16, max:0.16)
`define tDQSS1min 0.85 // Write command to first DQS latching transition - WL=1
`define tDQSS1max 1.15 // Write command to first DQS latching transition - WL=1
`define tDQSS2min 1.85 // Write command to first DQS latching transition - WL=2
`define tDQSS2max 2.15 // Write command to first DQS latching transition - WL=2
`define tDQSS3min 2.85 // Write command to first DQS latching transition - WL=3
`define tDQSS3max 3.15 // Write command to first DQS latching transition - WL=3
`define tDQSS4min 3.85 // Write command to first DQS latching transition - WL=4
`define tDQSS4max 4.15 // Write command to first DQS latching transition - WL=4
`define tDQSS5min 4.85 // Write command to first DQS latching transition - WL=5
`define tDQSS5max 5.15 // Write command to first DQS latching transition - WL=5

//`define tSDQS 0 // DQS-in setup time (ns)
//`define tWPREH 0.25 // DQS-in hold time (0.25 tCK)
//`define tSIHmin 0.45 // DQS-in high level width (0.45tCK)
//`define tSIHmax 0.55 // DQS-in high level width (0.55tCK)
//`define tSILmin 0.45 // DQS-in low level width (0.45tCK)
//`define tSILmax 0.55 // DQS-in low level width (0.55tCK)
//`define tSICmin 0.9 // DQS-in cycle time (0.9 tCK)
//`define tSICmax 1.1 // DQS-in cycle time (1.1 tCK)
`define tDSC 1 // DQS-in cycle time (1 tCK)

`define tIS 0.4 // Input setup time (ns)
`define tIH 0.4 // Input hold time (ns)
`define tMRD 5 // Mode register set cycle time (6tCK)
`define tDS 0.2 // Data in & DM set-up time (ns)
`define tDH 0.2 // Data in & DM hold time (ns)

//`define tDIPW 1.75 // Data in & DM input pulse width (ns)
//`define tDV 0.375 // Output DQS valid window (tCK)

//`define tPDEX 10 // Power Down exit Time (ns)
//`define tXSA 20000 // Exit self refresh to bank active command (tCK)
`define tREF 7.8 // Refresh interval time (us)
`define tWPST 0.4 // DQS write postamble time (tCK)
`define tDAL 272 // Auto precharge write recovery + precharge time (ns)
`endif


/********************************************************************************/
/* -20 Specification */
/********************************************************************************/
`ifdef S20 // -20 spec

`define tGR 2 // Gapless 2 tCK
`define tRTW 10 // Read to Write at same bank - CL=7tCK, tCDLR=3tCK

`define tRC 21 // Row cycle time(min) - operation (tCK)
`define tRFC 27 // Row cycle time(min) - Auto Refresh (tCK)
`define tRASmin 15 // Row active minimum time (tCK)
`define tRASmax 100000 // Row active maximum time (tCK)
`define tRCDRD 7 // Ras to cas delay(min) for Read (tCK)
`define tRCDWR 4 // Ras to cas delay(min) for Write (tCK)
`define tRP 6 // Row precharge time(min) (tCK)
`define tRRD 5 // Row to row delay(min) (tCK)
`define tCDLR 3 // Last data in to Read delay (6 tCK)
`define tCDLW 0 // Last data in to Write delay (0 tCK)
`define tCCD 2 // Col. address to col. address delay (2 tCK)
`define tCKmin 2 // Clock minimum cycle time (ns) - CL=9
`define tCKmax 6 // Clock maximum cycle time (ns) - CL=9

`define tCC5 2.5 // Clock minimum cycle time at cas latency=5 (ns)
`define tCC6 2.22 // Clock minimum cycle time at cas latency=6 (ns)
`define tCC7 1.8 // Clock minimum cycle time at cas latency=7 (ns)
`define tCC8 1.6 // Clock minimum cycle time at cas latency=8 (ns)
`define tCC9 1.4 // Clock minimum cycle time at cas latency=9 (ns)

`define tCHmin 0.45 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCHmax 0.55 // Clock high pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmin 0.45 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tCLmax 0.55 // Clock low pulse width (min:0.45tCK, max:0.55tCK)
`define tDQSCK 0.26 // DQS out edge to clock edge (min:-0.26, max:0.26)
`define tDQSQ 0.16 // Data strobe edge to output data edge (min:-0.16, max:+0.16)

//`define tSLZ 0.75 // DQS low-Z to valid DQS delay @ Read Preamble (min:tCK-0.75, max:tCK+0.75)
//`define tSHZ 0.75 // Valid DQS to DQS Hi-Z delay @ Postamble (min:tCK/2+0.75, max:tCK/2+0.75)
//`define tHZQ 0.75 // Data out active to High-Z (min:tCK/2-0.75, max:tCK/2+0.75)
//`define tDQCK 0.16 // out data edge to clock edge (min:-0.16, max:0.16)
`define tDQSS1min 0.85 // Write command to first DQS latching transition - WL=1
`define tDQSS1max 1.15 // Write command to first DQS latching transition - WL=1
`define tDQSS2min 1.85 // Write command to first DQS latching transition - WL=2
`define tDQSS2max 2.15 // Write command to first DQS latching transition - WL=2
`define tDQSS3min 2.85 // Write command to first DQS latching transition - WL=3
`define tDQSS3max 3.15 // Write command to first DQS latching transition - WL=3
`define tDQSS4min 3.85 // Write command to first DQS latching transition - WL=4
`define tDQSS4max 4.15 // Write command to first DQS latching transition - WL=4
`define tDQSS5min 4.85 // Write command to first DQS latching transition - WL=5
`define tDQSS5max 5.15 // Write command to first DQS latching transition - WL=5

//`define tSDQS 0 // DQS-in setup time (ns)
//`define tWPREH 0.25 // DQS-in hold time (0.25 tCK)
//`define tSIHmin 0.45 // DQS-in high level width (0.45tCK)
//`define tSIHmax 0.55 // DQS-in high level width (0.55tCK)
//`define tSILmin 0.45 // DQS-in low level width (0.45tCK)
//`define tSILmax 0.55 // DQS-in low level width (0.55tCK)
//`define tSICmin 0.9 // DQS-in cycle time (0.9 tCK)
//`define tSICmax 1.1 // DQS-in cycle time (1.1 tCK)
`define tDSC 1 // DQS-in cycle time (1 tCK)

`define tIS 0.5 // Input setup time (ns)
`define tIH 0.5 // Input hold time (ns)
`define tMRD 5 // Mode register set cycle time (6tCK)
`define tDS 0.25 // Data in & DM set-up time (ns)
`define tDH 0.25 // Data in & DM hold time (ns)

//`define tDIPW 1.75 // Data in & DM input pulse width (ns)
//`define tDV 0.375 // Output DQS valid window (tCK)

//`define tPDEX 12.5 // Power Down exit Time (ns)
//`define tXSA 20000 // Exit self refresh to bank active command (tCK)
`define tREF 7.8 // Refresh interval time (us)
`define tWPST 0.4 // DQS write postamble time (tCK)
`define tDAL 260 // Auto precharge write recovery + precharge time (ns)
`endif
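
The row timings above are given in clock cycles (tCK), which makes the four speed grades hard to compare at a glance. Here is a rough Python sketch, not anything from Samsung, that multiplies a few of them by each grade's tCKmin to get nanoseconds. The dictionary layout and the function name `cycles_to_ns` are my own; the numbers are copied from the defines.

```python
# Timings copied from the Samsung defines above.
# tCKmin is in ns; every other entry is a cycle count (tCK).
SPECS = {
    "-12": {"tCKmin": 1.2, "tRC": 35, "tRFC": 45, "tRASmin": 25, "tRCDRD": 12, "tRP": 10},
    "-14": {"tCKmin": 1.4, "tRC": 31, "tRFC": 39, "tRASmin": 22, "tRCDRD": 10, "tRP": 9},
    "-16": {"tCKmin": 1.6, "tRC": 27, "tRFC": 34, "tRASmin": 19, "tRCDRD": 9, "tRP": 8},
    "-20": {"tCKmin": 2.0, "tRC": 21, "tRFC": 27, "tRASmin": 15, "tRCDRD": 7, "tRP": 6},
}


def cycles_to_ns(spec):
    """Convert each cycle-count timing to ns at the grade's minimum clock period."""
    tck = spec["tCKmin"]
    return {
        name: round(cycles * tck, 2)
        for name, cycles in spec.items()
        if name != "tCKmin"
    }


if __name__ == "__main__":
    for grade, spec in SPECS.items():
        print(grade, cycles_to_ns(spec))
```

With these inputs tRC comes out between 42 and 43.4 ns for all four grades, so the cycle counts appear to scale with the clock period to hold roughly the same absolute time.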

We have discovered that ATi does in fact use considerably tighter timings than what Samsung specifies.

Samsung tRCDRD = 7
ATi tRCDRD = 7

Samsung TRCDWR = 4
ATi = 4

Samsung tRP = 6
ATi = 5

Samsung tRAS = 15
ATi = 14

Samsung TRRD = 5
ATi = 5

Samsung TWR = 10
ATi = 7

Samsung TR2W = ?
ATi = CL + 3

Samsung TW2R = 4
ATi = 3

Samsung TR2R = 5?
ATi = 2

Samsung WR latency = ?
ATi = 1.5

Samsung Cas latency = 8
ATi = 7

Samsung CMD latency = ?
ATi = 0

Samsung STR latency = ?
ATi =

Samsung TRFC = 27
ATi = 27
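
To make the differences in the list above easier to see, here is a small Python sketch that keeps only the pairs where both numbers are known and reports how many cycles ATi shaves off. The names `TIMINGS` and `tighter_than_spec` are made up for illustration, and the timing labels follow the define names. Entries marked "?" or left blank, and the uncertain TR2R "5?", are left out rather than guessed.

```python
# (Samsung, ATi) pairs in clock cycles, copied from the list above.
# Entries with an unknown or uncertain value on either side are omitted.
TIMINGS = {
    "tRCDRD": (7, 7),
    "tRCDWR": (4, 4),
    "tRP": (6, 5),
    "tRAS": (15, 14),
    "tRRD": (5, 5),
    "tWR": (10, 7),
    "TW2R": (4, 3),
    "CAS latency": (8, 7),
    "tRFC": (27, 27),
}


def tighter_than_spec(timings):
    """Return {name: cycles saved} for timings where ATi runs below Samsung's figure."""
    return {
        name: samsung - ati
        for name, (samsung, ati) in timings.items()
        if ati < samsung
    }


if __name__ == "__main__":
    for name, saved in tighter_than_spec(TIMINGS).items():
        print(f"{name}: ATi is {saved} cycle(s) tighter")
```

On the numbers posted here, five of the nine comparable timings are tighter on the ATi side, with tWR showing the largest gap at 3 cycles.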

In addition, it appears that the limiting factor is not the typical tRP / tRCD but is in fact the CAS latency. Armed with this information, Felinuz and I are starting to mod BIOSes to take advantage of this, both in terms of clock speed and performance.
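
One way to look at the CAS latency point is to take the tCC5 to tCC9 values from the Samsung sheet and work out what each CAS setting costs in nanoseconds at its minimum cycle time, along with the clock that cycle time corresponds to. This Python sketch is only arithmetic on the posted numbers; `cas_delay_ns` and `max_clock_mhz` are names I picked, and the MHz figure is the command clock, not the effective data rate.

```python
# Minimum cycle time (ns) per CAS latency, copied from the tCC5..tCC9 defines.
TCC_NS = {5: 2.5, 6: 2.22, 7: 1.8, 8: 1.6, 9: 1.4}


def cas_delay_ns(cl, tck_ns):
    """Absolute CAS delay in ns: CAS latency in cycles times the clock period."""
    return cl * tck_ns


def max_clock_mhz(tck_ns):
    """Clock frequency in MHz for a given clock period in ns."""
    return 1000.0 / tck_ns


if __name__ == "__main__":
    for cl, tck in TCC_NS.items():
        delay = cas_delay_ns(cl, tck)
        clock = max_clock_mhz(tck)
        print(f"CL={cl}: {delay:.2f} ns at up to {clock:.1f} MHz")
```

The delay stays between 12.5 and about 13.3 ns while the clock goes from 400 MHz at CL=5 to roughly 714 MHz at CL=9. That would fit the idea that holding a lower CAS latency caps how far the memory can clock, though it is a reading of the spec sheet and not a measurement.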

More to come later ...... I gotta learn to read hex in order to pull nVidia's timings.
 

xtknight

Elite Member
Oct 15, 2004
12,974
0
71
OK, it looks like there are multiple specifications here. Do you guys think tightening timings will have any advantage over simply overclocking? With tighter timings, you won't be able to clock as high, so wouldn't this deliver diminishing returns?

If you guys need any data I'd be glad to provide it for my cards (Leadtek GeForce 6800NU-NV41 and ATI Radeon 9500 PRO).

By the way, the hex numbers converted to decimal look like they're a whole series of timings.
 

Peter

Elite Member
Oct 15, 1999
9,640
1
0
It's a question of balancing latency against throughput. If ATi feels their GPUs benefit more from low latency, then that setup makes sense.
 

Elcs

Diamond Member
Apr 27, 2002
6,278
6
81
Originally posted by: Peter
It's a question of balancing latency against throughput. If ATi feels their GPUs benefit more from low latency, then that setup makes sense.

What they feel and what is best in the real world may be different by certain users' standards.

I'm quite happy with standard overclocking and staying within widely regarded "safe" limits, so while this is interesting to read, it probably won't interest me enough to tamper with settings.
 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Originally posted by: Peter
It's a question of balancing latency against throughput. If ATi feels their GPUs benefit more from low latency, then that setup makes sense.

Exactly!

All those very well educated people over there at ATI have probably found this happy medium without stressing the memory. I wouldn't mess too much with it.

-Kevin
 

ssvegeta1010

Platinum Member
Nov 13, 2004
2,192
0
0
Great finds!


I can just see the posts now. "OMG I was OCing my videocard and I dropped the CAS to 2.5, like my RAM. Now my video is all messed up. Please help, my case is a Xaser III and I have a UV/LED fan blowing on my videocard, right under my 2nd cold cathode!!!" :D
 

gac009

Senior member
Jun 10, 2005
403
0
0
It'll be nice to read what you find, but no way am I messing with the timings on my card(s).
 

Munky

Diamond Member
Feb 5, 2005
9,372
0
76
You can adjust memory timings in a program like ATI Tray Tools; it's nothing a simple reboot won't fix if you screw up. I, for one, would be interested in getting more OC out of my x800, although it's not exactly hurting for more performance right now.
 

blckgrffn

Diamond Member
May 1, 2003
9,676
4,309
136
www.teamjuchems.com
Well, since they are running the memory tighter than spec, it would be interesting to find out whether loosening the timings does in fact allow higher clock speeds, and whether those higher speeds perform better than the tighter timings. Why not try it? If I still had an x800 series card I would be trying it out :)

Remember THUGSROOK's 6800U BIOS used the GT timings because they were tighter and made for better performance? I guess that was at the same clock speeds, however, so that would make sense.

Anyway, let us know what you find out!

Nat