NTMBK
Lifer
Actually Maxwell was perfectly capable of this as well, it just wasn't enabled in the desktop cards, only in mobile SoCs (i.e. Tegra X1):
Desktop Maxwell != mobile Maxwell. For a start, mobile Maxwell was built on 20nm! It was also compute capability 5.3 (the only part that was): http://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities
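To make the compute capability point concrete, here is a rough sketch. It assumes the rule from the programming guide linked above, that half-precision arithmetic needs compute capability 5.3 or higher. The helper name and the constants are made up for illustration and are not part of any NVIDIA API.

```python
# Compute capabilities as (major, minor) tuples.
# Assumption: desktop Maxwell parts report 5.0 or 5.2; Tegra X1 reports 5.3.
DESKTOP_MAXWELL = [(5, 0), (5, 2)]
TEGRA_X1 = (5, 3)


def has_native_fp16_arithmetic(cc):
    """Hypothetical helper: True if this compute capability exposes
    half-precision arithmetic (assumed threshold: 5.3 and up)."""
    return cc >= (5, 3)


for cc in DESKTOP_MAXWELL + [TEGRA_X1]:
    print(cc, has_native_fp16_arithmetic(cc))
```

This only says whether the instructions are exposed at all. It says nothing about how fast they run on a given part.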
The whole history of Maxwell and Pascal is a bit weird. Pascal is a bit of a change from Maxwell though, with SMs having only 64 shaders each. Each GPC has the same number of FP32 shaders (640), but split across 10 SMs instead of 5. That means more registers and shared memory per shader, which will help with complex HPC kernels.
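The per-GPC arithmetic works out as below. This is just a sketch using the figures from this post. The 128 shaders per SM for the 5-SM layout is implied by 640 / 5 rather than stated, and the function name is made up.

```python
def fp32_per_gpc(sms_per_gpc, shaders_per_sm):
    """FP32 shaders in one GPC = SMs per GPC * shaders per SM."""
    return sms_per_gpc * shaders_per_sm


# Pascal layout described above: 10 SMs of 64 shaders each.
pascal = fp32_per_gpc(10, 64)

# The 5-SM layout it is compared against: 128 shaders per SM (implied by 640 / 5).
older = fp32_per_gpc(5, 128)

print(pascal, older)  # 640 640
```

Same shader count per GPC, but with twice as many SMs each shader shares its SM's register file and shared memory with half as many neighbours.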