[wider speculative execution]
[stalls after branch misprediction]
Assuming that a context switch happens in less than 10 clock cycles, that could allow 50% smoother gaming performance on average.
Branching is deciding between two (sometimes more) points in a task at which to continue. A context switch is interrupting one task and taking on another.
Re smooth games:
~20 cycles on e.g. a 4 GHz CPU = 5 ns (nanoseconds)¹
frame-to-frame time at e.g. a 120 Hz video frame rate: ~8.3 ms (milliseconds; 1 ms is a million nanoseconds)
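To make the comparison concrete, here is a quick back-of-the-envelope sketch in Python. The numbers are just the example figures from this post (4 GHz, ~20 cycles, 120 Hz), and the variable names are made up for illustration:

```python
import math

CPU_HZ = 4e9        # example clock rate used above
STALL_CYCLES = 20   # approximate branch misprediction penalty used above
FRAME_HZ = 120      # example video frame rate used above

stall_s = STALL_CYCLES / CPU_HZ   # duration of one stall, in seconds
frame_s = 1 / FRAME_HZ            # frame-to-frame time, in seconds
ratio = frame_s / stall_s         # how many such stalls fit into one frame

print(f"stall: {stall_s * 1e9:.1f} ns")
print(f"frame: {frame_s * 1e3:.2f} ms")
print(f"stalls per frame: {ratio:.2e} (~{math.log10(ratio):.1f} orders of magnitude)")
```

With these inputs, roughly 1.7 million such stalls fit into a single frame, i.e. the stall is about six orders of magnitude shorter than the frame time.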
Also, IIRC ~10 ms (milliseconds) is the ballpark that audio latency needs to stay under, otherwise musicians perceive a lag while hearing themselves on headphones (this covers the whole chain of recording, processing, and playing back live audio, with all the necessary buffering included). I suppose similar care needs to be taken for game audio.
________
¹) The ~20-cycle penalty of a branch misprediction that Agner Fog wrote about requires that the stalled pipeline can be refilled with instructions and data which already reside in 1st level cache, AFAIU. However, even a last-level cache miss results in a stall of "only" a few hundred cycles, which is still ~five orders of magnitude smaller than a video frame time. But it follows that [speculative] prefetching of instructions and data is much, much more important than speculative execution for purposes like video game smoothness.