I meant the jump from Skylake to Golden Cove. I think it was over 40% if memory serves. BTW, I just encoded a 4K 60 FPS video with Handbrake x265 with HT on and HT off.
HT off results:
HT on results:
So HT on completed the task 8.7% faster. Doesn't seem like much until you realize that this workload pegged all 32 threads on my CPU at 100% load, and with HT off, all 24 threads were also pegged to 100%.
This is a highly threaded workload of course, but my point is that having HT on still made the CPU notably faster and more efficient against 24 physical cores.
You are doing that test in isolation from two things that are being proposed: 1) The P cores gain back the transistors used to support HT and spend them on ST improvement. 2) Some (in this case, a pair) of the P cores are replaced with E-cores.
In your case, you have 8 P cores performing at speed 1, 16 E cores performing at speed ~0.7, and 8 HT threads performing at about 0.1. Exchange two of those P cores for 8 more E cores, and use the transistors freed up from HT to make the P cores 10% faster. That gives you 6 * 1.1 = 6.6 plus 24 * 0.7 = 16.8, for a total of 23.4. Your current processor has 8 + 11.2 + (8 * 0.1) = 20.
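If you want to play with those assumptions yourself, here is a minimal Python sketch of the same back-of-envelope math. The function name and parameter names are made up for illustration, and the weights (1.0 per P core, 0.7 per E core, 0.1 per HT thread, 10% ST uplift) are just the rough guesses from this post, not measured values.

```python
def relative_throughput(p_cores, e_cores, ht_threads,
                        p_speed=1.0, e_speed=0.7, ht_gain=0.1):
    """Sum the rough per-unit weights for a fully loaded MT workload.

    All weights are assumptions taken from the estimate above,
    normalized so that one current P core without HT equals 1.0.
    """
    return p_cores * p_speed + e_cores * e_speed + ht_threads * ht_gain


# Current layout: 8 P cores with HT, 16 E cores.
current = relative_throughput(p_cores=8, e_cores=16, ht_threads=8)

# Hypothetical layout: 6 P cores without HT (assumed 10% faster), 24 E cores.
proposed = relative_throughput(p_cores=6, e_cores=24, ht_threads=0, p_speed=1.1)

print(f"current:  {current:.1f}")
print(f"proposed: {proposed:.1f}")
print(f"change:   {(proposed / current - 1) * 100:.0f}%")
```

Change any of the weights and the conclusion can flip, which is exactly why these numbers are situational.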
Those are rough numbers, and are highly situationally dependent, but, the point is that by sacrificing 2 performance cores and HT on the performance cores, you gain space for 8 additional E cores. That's exchanging two hardware cores for 8 that are roughly 70% as good. You also gain performance on the P cores. Now, you also have a processor that is even faster in lightly threaded loads, including games if that's important to you, and also faster in heavy MT loads.
Your performance hit for disabling HT doesn't take into account the fact that the P cores would be FASTER without the HT circuitry. That's not because the HT circuitry imposes a significant penalty just by existing, but because it takes up space and transistor budget that could go towards making the core inherently faster for single-threaded work. The P cores would then have lower resource utilization, but they would also generate less heat per unit of work performed in ST (unused resources are power gated in modern processors). You can see this now by disabling HT on most any modern processor and watching its power draw and heat dissipation drop, because it's using fewer of its internal resources each clock. This means MORE power budget for the rest of the processor and the ability to sustain higher all-core clocks under load.