If you are going to engage in a technical discussion, you need to use technical terms correctly and be clear about what you are saying. Your claim that a cascode amplifier works as a single transistor is ridiculous. The Miller cap of the output device is not shorted to ground (that would imply both the drain and the gate of the output transistor are grounded). Rather, the gate of the output transistor is an AC (not DC) ground, so there can't be any AC feedback effect. You also tend to jump around on topics, because this really has nothing to do with a discussion of digital circuits.
Higher transconductance improves fT. The gate-to-drain cap of a FET is also included in the fT calculation/measurement (it's in the denominator), so a higher fT actually shows that the increased capacitance isn't enough to offset the increased transconductance of the device. Your argument that transistors aren't inherently fast enough to pass a quality square wave over ~5 GHz is ridiculous. You do know there are digital circuits out there that run well above 5 GHz, right?
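For reference, the usual first-order textbook expression for a FET's fT is sketched below. It is a small-signal approximation that ignores parasitic resistances, and the symbols g_m, C_gs and C_gd are the standard ones rather than anything defined earlier in this thread:

```latex
f_T \approx \frac{g_m}{2\pi\,\left(C_{gs} + C_{gd}\right)}
```

With the gate-to-drain capacitance sitting in the denominator, fT only goes up if g_m grows faster than the total gate capacitance.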
A higher transconductance doesn't necessarily improve fT. It improves the gain within the frequency range that is immune to the Miller effect, but at the higher frequencies of interest, the more gain there is, the more the Miller cap is able to tame that gain. What is gained from the better transconductance is lost because the product of the gain and the Miller cap increases as well; that is, the negative feedback ratio increases with the increased gain.
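The Miller multiplication this argument relies on is usually written as follows. This is the standard first-order form for a common-source stage; the load resistance R_L is an assumed symbol that the post does not mention:

```latex
C_{in} \approx C_{gs} + C_{gd}\,\bigl(1 + |A_v|\bigr), \qquad |A_v| \approx g_m R_L
```

So the capacitance seen at the gate grows with the voltage gain of the stage, which is the "gain times Miller cap" product referred to above.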
If we take two N-channel FETs, the driven FET has its drain connected to the source of the output FET, while the gate of the latter is connected to the positive rail. The output device is always on; it's the input device that is switched on and off, and it is in series with the output device.
Since the output device has its gate connected to the positive rail, that gate is shorted to ground AC-wise, so its Miller cap is also AC-shorted to ground.
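To put rough numbers on the cascode point, here is a minimal Python sketch comparing the Miller-multiplied input capacitance of a plain common-source stage with that of the input device in a cascode. All component values are made up for illustration, and it assumes the textbook first-order result that the input device of a cascode sees a load of about 1/g_m of the upper device, so its voltage gain is about g_m1/g_m2:

```python
def miller_input_cap(c_gs, c_gd, stage_gain_mag):
    """First-order input capacitance: C_gs + C_gd * (1 + |A_v|)."""
    return c_gs + c_gd * (1.0 + stage_gain_mag)

# Hypothetical values, chosen only to make the arithmetic easy to follow.
gm = 10e-3        # transconductance of each FET, in siemens
r_load = 1e3      # load resistance of the common-source stage, in ohms
c_gs = 20e-15     # gate-source capacitance, in farads
c_gd = 5e-15      # gate-drain capacitance, in farads

# Common-source stage: the input device sees the full voltage gain.
cs_gain = gm * r_load

# Cascode: the input device is loaded by roughly 1/gm of the upper device,
# so its own voltage gain is about gm/gm = 1 (assumption: matched devices).
cascode_input_gain = gm / gm

c_in_cs = miller_input_cap(c_gs, c_gd, cs_gain)
c_in_cascode = miller_input_cap(c_gs, c_gd, cascode_input_gain)

print(f"common-source input cap: {c_in_cs * 1e15:.1f} fF")
print(f"cascode input cap:       {c_in_cascode * 1e15:.1f} fF")
```

The point of the comparison is only that the gain seen by the input device's gate-drain capacitance, not the overall gain of the stage, sets the Miller multiplication.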
About switching speed: if we take a 5 GHz square wave, i.e. a periodic signal with a 0.2 ns period, the fundamental is a 5 GHz sine, and it takes 0.05 ns (a quarter period) for this sine to reach full amplitude. By the same token, the third harmonic is at 15 GHz, and it takes about 0.017 ns to reach full amplitude.
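The quarter-period figures above can be checked with a few lines of Python. This is just the arithmetic, nothing device-specific, and the function names are made up for the example:

```python
def period_ns(freq_ghz):
    """Period in nanoseconds of a signal whose frequency is given in GHz."""
    return 1.0 / freq_ghz

def quarter_period_ns(freq_ghz):
    """Time for a sine to go from its zero crossing to its peak."""
    return period_ns(freq_ghz) / 4.0

fundamental_ghz = 5.0
third_harmonic_ghz = 3 * fundamental_ghz

for name, f in (("fundamental", fundamental_ghz),
                ("third harmonic", third_harmonic_ghz)):
    print(f"{name}: {f:.0f} GHz, period {period_ns(f):.4f} ns, "
          f"zero-to-peak {quarter_period_ns(f):.4f} ns")
```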
If the signal is limited to those two harmonics, then cross conduction will occur for 0.01 ns, that is, 20% of the duration of the first half period, during which both transistors of a complementary pair will be conducting simultaneously, and the same happens when the signal switches in the other direction. That's just too slow: during the cross conduction the current will rise sharply, and one or both of the transistors will be destroyed by excessive current. So harmonics of much higher order are necessary to get fast enough rising and falling edges.
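To get a feel for how the edge sharpens as harmonics are added, here is a rough Python sketch that builds the Fourier partial sum of a unit-amplitude square wave and measures how long the edge takes to swing between two thresholds. The ±0.5 threshold is an arbitrary assumption standing in for the switching thresholds of the two devices; it does not model any real transistor or predict actual cross-conduction current:

```python
import math

def partial_square(t_ns, f0_ghz, n_harmonics):
    """Fourier partial sum of a unit-amplitude square wave using the first
    n_harmonics odd harmonics (1st, 3rd, 5th, ...)."""
    w = 2.0 * math.pi * f0_ghz  # angular frequency in rad/ns
    return (4.0 / math.pi) * sum(
        math.sin((2 * k - 1) * w * t_ns) / (2 * k - 1)
        for k in range(1, n_harmonics + 1)
    )

def transition_time_ns(f0_ghz, n_harmonics, threshold=0.5):
    """Time for the rising edge to go from -threshold to +threshold.

    The partial sum rises monotonically from 0 at t = 0 up to its first
    peak at t = 1 / (4 * n_harmonics * f0), so bisection on that interval
    finds the +threshold crossing; odd symmetry gives the other half.
    """
    lo, hi = 0.0, 1.0 / (4.0 * n_harmonics * f0_ghz)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if partial_square(mid, f0_ghz, n_harmonics) < threshold:
            lo = mid
        else:
            hi = mid
    return 2.0 * hi

f0_ghz = 5.0
half_period_ns = 0.5 / f0_ghz
for n in (1, 2, 5, 10):
    t = transition_time_ns(f0_ghz, n)
    print(f"{n:2d} odd harmonic(s): edge takes {t:.4f} ns "
          f"({100.0 * t / half_period_ns:.1f}% of a half period)")
```

Passing `n_harmonics=2` corresponds to the fundamental plus third harmonic case discussed above; larger values show how much the edge tightens with more harmonics.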
To digress somewhat: with bipolar transistors of the same model there are manufacturing variations, and the lower the gain, the higher the bandwidth of the device. That seems counterintuitive at first, but it is logical when we look at the gain × Miller cap product. And when we look at different devices with the same current capability, those that are 3-5 GHz capable have far lower gain, by a factor of at least 10, than devices that are only good for 100-300 MHz...