I've already done some of those tests.

1C/2T
SMT scaling is pretty insane (~40%). I've also found very little in the way of an SMT penalty (sometimes 3% or so of a loss with SMT enabled and a single thread locked to a single core without the SMT logical core being parked)... but it's a pretty tricky thing to reliably quantify... it's basically inside the noise floor.
Update:
I've found a few examples where the SMT penalty is rather large, going through the numbers in their final form... but I've not found this play out in real applications... just two synthetic benchmarks.
3.925GHz, DDR4-2667 16-16-16-38 1T
AIDA64 PhotoWorxx:
SMT ON: 21,567
SMT OFF: 24,805
Penalty: 13%
AIDA64 VP8 (6700k scores 7,521 @ 4GHz, so both results here are awesome):
SMT ON: 7,748
SMT OFF: 9,336
Penalty: 17%
Cinebench R10 Single Threaded
SMT ON: 4,774
SMT OFF: 4,923
Penalty: 3%
Cinebench R11.5 Single Threaded
SMT ON: 1.55
SMT OFF: 1.62
Penalty: 4%
Cinebench R15 Single Threaded
SMT ON: 150
SMT OFF: 160
Penalty: 6%
Most examples, of course, show no change at all with or without SMT enabled.
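
For anyone who wants to check the percentages, here's a quick sketch of how the penalty figures above can be reproduced. It assumes the penalty is the score lost with SMT on, relative to the SMT OFF score, which is the formula that matches the rounded numbers I posted. The function and variable names are just for illustration.

```python
def smt_penalty(smt_on, smt_off):
    """Percent of the SMT OFF score that is lost when SMT is enabled.

    Assumed formula: (off - on) / off * 100.
    """
    return (smt_off - smt_on) / smt_off * 100


# (SMT ON, SMT OFF) scores copied from the results above
scores = {
    "AIDA64 PhotoWorxx": (21567, 24805),
    "AIDA64 VP8": (7748, 9336),
    "Cinebench R10 ST": (4774, 4923),
    "Cinebench R11.5 ST": (1.55, 1.62),
    "Cinebench R15 ST": (150, 160),
}

# Round to whole percent, same as the figures in the post
penalties = {name: round(smt_penalty(on, off)) for name, (on, off) in scores.items()}

for name, pct in penalties.items():
    print(f"{name}: {pct}%")
```

Differences of a few percent, like the Cinebench ones, are close to run-to-run variation, so treat them as rough.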