How do Intel's Netburst quad-pumping and AGP4X/8X work?

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
I remember asking this when AGP4X and the Netburst bus were just released, but either I forgot the answer or no definitive explanation was given. Anyway, after a long time, I just realized that I still don't know, so I'm asking again. Please correct me if I'm wrong anywhere.

As we all know, a clock cycle has 2 edges, a rising and a falling edge. DDR (used on DDR-SDRAM, RAMBUS, AMD/Alpha's EV6 bus, AGP2X, etc.) doubles the available bandwidth by exploiting this fact and sending 2 words per clock cycle (1 on the rising edge and 1 on the falling edge). What about QDR? How do the Netburst bus, AGP4X and QDR memory manage to double the bandwidth again when we've already run out of clock edges? And what about AGP8X?
 

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
That didn't help in explaining how quad pumping and AGP8X work... but thanks anyway.
 

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
I think quad pumping sends 2 on rise and 2 on fall, hence doubling the bandwidth.
 

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
Does that mean that AGP8X does 4 on the rising edge and 4 on the falling edge? Doesn't this just mean the doubling/quadrupling of the bus width? Unless the 2 or 4 words are sent sequentially rather than in parallel.
 

Lonyo

Lifer
Aug 10, 2002
21,938
6
81
As you'll remember from our P4X333 Review, the AGP 8X specification (AGP 3.0) runs at 66MHz and transfers data eight times per clock cycle. The 32-bit wide AGP bus when running in 8X mode results in a total of 2.1GB/s of bandwidth between the R300 and your PC's North Bridge.
How much extra bandwidth does AGP 8X provide? The AGP 3.0 specification defines that a total of 8 transfers are made on every clock cycle with a frequency of 66MHz. Since the bus is still 32-bits wide this doubles the bandwidth of AGP 4X (1.06GB/s) to 2.1GB/s.

I think it does 4 on rise, 4 on fall.
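
The figures in the quoted text can be checked with a little arithmetic. The following is only a rough sketch: the function name `peak_bandwidth_bytes_per_s` is made up for illustration, and it assumes the rounded 66MHz clock and 32-bit bus width given in the quote, with peak bandwidth taken as clock rate times transfers per clock times bus width in bytes.

```python
def peak_bandwidth_bytes_per_s(clock_hz, transfers_per_clock, bus_width_bits):
    """Peak bandwidth = clock rate * transfers per clock * bytes per transfer.

    Hypothetical helper for illustration; it ignores protocol overhead.
    """
    return clock_hz * transfers_per_clock * bus_width_bits // 8


# Assumption: the rounded 66MHz clock and 32-bit bus from the quoted review.
AGP_CLOCK_HZ = 66_000_000
AGP_BUS_BITS = 32

for mode in (1, 2, 4, 8):
    bw = peak_bandwidth_bytes_per_s(AGP_CLOCK_HZ, mode, AGP_BUS_BITS)
    print(f"AGP {mode}X: {bw / 1e9:.3f} GB/s")
```

Note that the bus width never changes between modes here; only the number of transfers per clock does, which is why each step doubles the result.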
 

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
My question is, how exactly does it do 2 on rise, 2 on fall or 4 on rise, 4 on fall?
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
Since it can't magically make more wires appear, it has to be sending them one after the other. The bus controller is probably just internally multiplying the clock or something along those lines.
 

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
Lawrence, EV6 and DDR made use of the fact that there are 2 edges per clock cycle. Since these 2 edges are fully utilized in DDR's case, I'm wondering how AGP4X/8X and Netburst are able to extract even more bandwidth from the bus...

Matthias99, are you sure? I'm kinda looking for a straight, accurate answer rather than educated guesses, no offence. I do appreciate your conjecture; it's certainly possible.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Contrary to popular belief (and to intuition), every double-pumped setup that I have seen uses only the rising edge, not the rising and falling edges. Once you see this, the way that higher divider systems work is essentially an extended variation of the way double-pumping works.

The general gist of it is that sending really fast clocks flying around on a PCB is very hard, but creating fast clocks in silicon from slower ones is much easier. So it's hard to route an 800MHz clock around on a motherboard PCB, and it's much easier to take a 200MHz clock and then generate a 4x faster clock from it. The way that CPUs usually generate faster clocks from the slower system clocks is by using a phase-locked loop (PLL). PLLs are pretty cool circuits that enable you to lock one signal to another signal by aligning along one edge. There's a link to how PLLs work here.

"Ah, yes," you say, "but that's not what I was asking. I want to know about quad-pumping, not faster clock signals." But the two are very related and use very similar techniques. To generate 4 clock signals out of phase with each other, one common method is to use a DLL (Delay Locked Loop) (link to the differences between DLLs and PLLs) to evenly divide the clock into multiple phase regions, and then you use a phase-comparator circuit (this edge is slower than this other edge, this edge is faster than this other edge, these two edges are aligned) to maintain the lock with the original signal. So if you want to create four synchronized delayed signals, you chain four DLLs together, tap off the intermediate nodes, and then check the phase alignment of the original signal to the output with a phase comparator to adjust.
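
To make the timing concrete, here is a minimal sketch of the idea in Python. It models only the sampling instants, not the DLL or phase-comparator circuitry, and it assumes ideal, equally spaced delays. The names `strobe_times_ns` and `send` are hypothetical, and the 200MHz clock is just the example figure used above.

```python
def strobe_times_ns(clock_hz, phases, cycles=1):
    """Sampling instants (in ns) from `phases` equally delayed copies of
    each rising edge of the base clock. Idealized: no jitter, no skew."""
    period_ns = 1e9 / clock_hz
    step_ns = period_ns / phases
    return [c * period_ns + p * step_ns
            for c in range(cycles)
            for p in range(phases)]


def send(words, clock_hz, phases):
    """Pair each word with the strobe it would go out on. The words travel
    one after another over the same wires; the bus width is unchanged."""
    cycles = -(-len(words) // phases)  # ceiling division
    return list(zip(strobe_times_ns(clock_hz, phases, cycles), words))


# Assumed example: a 200MHz base clock split into 4 phases.
for t, w in send(["w0", "w1", "w2", "w3", "w4"], 200_000_000, 4):
    print(f"{t:5.2f} ns  {w}")
```

In this model a 200MHz clock has a 5 ns period, so four phases give one transfer every 1.25 ns. That is the same spacing an 800MHz clock would give, without an 800MHz signal ever being routed across the board.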
 

JustAnAverageGuy

Diamond Member
Aug 1, 2003
9,057
0
76
Now the question I'm asking myself.

Why don't they just quad pump D(Q)DR memory and turn PC3200 into PC6400?

or, am I missing the point?
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Now the question I'm asking myself. Why don't they just quad pump D(Q)DR memory and turn PC3200 into PC6400?
They do. It's called QDR memory and it uses a method similar to what you describe, as opposed to DDR2 memory, which is more of a quad-pumped extension of DDR memory.
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
No, I'm not sure. However, I know a bit about circuit design, and there are only so many ways to push data across a bus. Let me put it this way: that's how I'd try to do it (multiplying the clock), although I'm sure other methods could be utilized. :)
 

Goi

Diamond Member
Oct 10, 1999
6,771
7
91
Thanks pm. So in a quad-pumped scenario, the chipset basically takes the incoming onboard FSB clock and creates 4 different busses of the same bus width, out of phase with one another, to create an illusion of 4 transfers per clock cycle?

As for your first paragraph, that's certainly enlightening news. Is this the case with DDR-SDRAM/RAMBUS? I've never seen a single article saying that these 2 double-pumped setups aren't using both edges. Also, from my logic circuits design class (which was taken many years ago), I seem to recall that in the classic case ("single" pumping), the falling edge was used for signalling most of the time.