Don't know if anyone has pointed this out, but wouldn't rendering 4 frames ahead (or worse, 8 frames) add extra latency? At 60fps, each frame occupies about 16.7ms. If you're rendering 4 frames into the future, you now have roughly 67ms of lag before the on-screen visuals can update to match your input. Double that to 8 frames and you're looking at over 1/10th of a second of visual lag. Got a really intensive game that drops the framerate to 30fps? Now you're looking at over 1/5th of a second of lag with 8 frames queued. Presumably double buffering (or more) is already taken into account with the AFR model, but that's still a heck of a lot of visual lag, over 2x what you'd experience with triple buffering alone, if my understanding of all this is correct.
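For anyone who wants to check the arithmetic, here's a quick sketch. It assumes the simplest possible model: every queued frame adds one full frame time (1000/fps milliseconds) of delay, and nothing else contributes. The function name `queue_latency_ms` is just something I made up for illustration, and real drivers may well behave differently.

```python
def queue_latency_ms(frames_ahead: int, fps: float) -> float:
    """Delay added by rendering `frames_ahead` frames into the future, in ms.

    Assumption: each queued frame costs exactly one frame time.
    """
    frame_time_ms = 1000.0 / fps  # about 16.7 ms at 60fps, 33.3 ms at 30fps
    return frames_ahead * frame_time_ms


if __name__ == "__main__":
    for fps in (60, 30):
        for frames_ahead in (4, 8):
            lag = queue_latency_ms(frames_ahead, fps)
            print(f"{frames_ahead} frames ahead at {fps}fps: {lag:.1f} ms")
```

Under that model, 4 frames at 60fps comes to about 67ms and 8 frames at 30fps to about 267ms, which is where the "over 1/10th" and "over 1/5th of a second" figures come from.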
AFR is nice because it provides a very efficient balance of resources and I guess it's easy to implement, but dynamic load balancing is where it's at. Alternatively, 3dfx-style alternating scan lines provided a good way of load balancing without sacrificing compatibility, though it only helped with fillrate, since all objects still had to be re-rendered on each card. Still, when was the last time the vertex load of a game was its limitation?
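To show what I mean about scan-line interleaving only helping fillrate, here's a toy model. It's a sketch under my own assumptions, not how the 3dfx hardware actually worked internally: scan lines are dealt out round-robin, pixel work is counted per line owned, and every card is charged the full vertex count. The names `scanline_owner` and `per_card_work` are hypothetical.

```python
def scanline_owner(y: int, num_cards: int = 2) -> int:
    """Card that draws scan line `y` when lines are dealt out round-robin."""
    return y % num_cards


def per_card_work(height: int, width: int, num_vertices: int, num_cards: int = 2):
    """Rough per-card workload for one frame under scan-line interleaving.

    Assumption: pixel (fill) work splits by scan line, but every card
    still processes all of the scene's vertices.
    """
    work = []
    for card in range(num_cards):
        lines = sum(1 for y in range(height) if scanline_owner(y, num_cards) == card)
        work.append({
            "card": card,
            "pixels": lines * width,   # fill work is divided between cards
            "vertices": num_vertices,  # geometry work is duplicated on each card
        })
    return work


if __name__ == "__main__":
    for entry in per_card_work(height=480, width=640, num_vertices=10_000):
        print(entry)
```

With two cards at 640x480, each one fills half the pixels but both chew through all 10,000 vertices, so the scheme scales fillrate and nothing else.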