A bit off topic, but how come CPU designers haven't thought of using a persistent cache for storing decoded instructions? Something connected directly to the CPU with the lowest possible latency, so the CPU decoders don't have to work as hard? Wouldn't that save a lot of CPU time and increase instruction throughput? Most consumer workloads are repetitive in nature: boot the PC, load the OS, launch frequently used software, use frequently used functions of said software.

Because logic transistors, such as those in ALUs, FPUs, branch predictors, and decoders, use a lot of power per transistor, while caches are very power efficient. Power is pretty much the biggest limiter to performance nowadays.
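
For anyone who wants to play with the idea, here is a rough toy sketch of a cache for decoded instructions. It is only an illustration under loud assumptions: the class `DecodedOpCache`, the helper `toy_decode`, the LRU replacement policy, and the capacity of 64 entries are all made up for this example and do not describe any real CPU. It only counts how often the decoder would be skipped; it says nothing about power or latency.

```python
from collections import OrderedDict


class DecodedOpCache:
    """Toy LRU cache mapping an instruction address to its decoded form.

    Hypothetical model for illustration only, not a real hardware design.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def fetch(self, address, decode):
        # Hit: reuse the stored decoded form and mark it most recently used.
        if address in self.entries:
            self.entries.move_to_end(address)
            self.hits += 1
            return self.entries[address]
        # Miss: run the (expensive) decoder and store the result.
        self.misses += 1
        decoded = decode(address)
        self.entries[address] = decoded
        # Evict the least recently used entry once over capacity.
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return decoded


def toy_decode(address):
    # Stand-in for a real decoder; just tags the address.
    return ("uop", address)


def run_loop(cache, loop_length, iterations):
    # Replays the same straight-line loop body many times.
    for _ in range(iterations):
        for address in range(loop_length):
            cache.fetch(address, toy_decode)
    return cache.hits, cache.misses


# A loop body that fits in the cache is decoded once, then always hits.
small_loop = run_loop(DecodedOpCache(capacity=64), loop_length=32, iterations=100)
# A loop body larger than the cache thrashes under LRU and never hits.
large_loop = run_loop(DecodedOpCache(capacity=64), loop_length=128, iterations=100)

print(small_loop)  # (3168, 32)
print(large_loop)  # (0, 12800)
```

In this toy model, repetitive code only benefits while its working set fits in the cache, so "repetitive workload" by itself does not guarantee that decode work is avoided.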