technically . . . Intel has been flogging the same design since the late 90s.
No. Current machines are not derivatives of the P6 line, like Core 2 was. The current Intel core lineage starts at Sandy Bridge, which was a radical redesign. This is often hard for people to believe, because SNB targeted the same width as its predecessors, so it's externally very similar. However, moving from holding register state in the ROB to a full PRF completely changed the way data flows inside the core, and forced a redesign of basically every component past decode. SNB shares more in design with Pentium 4 than it does with P6. (This is probably also why this radical design departure was not marketed by Intel, and SNB was just "2nd gen core architecture". Marketing the new CPUs as "yeah, it shares a lot with the failed Pentium 4 line, but we've only kept the good ideas and ditched the bad ones, honest" would probably have been too hard a sell.)
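To make the ROB-to-PRF distinction concrete, here's a toy Python sketch (my own illustration, heavily simplified, not Intel's actual structures). In a ROB-value design, each ROB entry holds the speculative result itself, and retirement copies data into the architectural register file; in a PRF design like SNB's, values live only in the physical register file, and retirement just commits a rename-table mapping, so data never moves:

```python
# Toy sketch of PRF-style register renaming (the P4 / Sandy Bridge scheme).
# Values are written once into the physical register file (PRF); the rename
# table (RAT) maps architectural names to physical registers, and "retiring"
# only recycles the old mapping -- no value is ever copied at retirement.

class PRFRenamer:
    def __init__(self, num_phys):
        self.prf = [0] * num_phys              # physical register file
        self.free = list(range(num_phys))      # free-list of physical regs
        self.rat = {}                          # arch reg name -> phys index

    def rename_dest(self, arch_reg):
        """Allocate a fresh physical register for a new write."""
        phys = self.free.pop(0)
        prev = self.rat.get(arch_reg)          # freed when this write retires
        self.rat[arch_reg] = phys
        return phys, prev

    def write(self, phys, value):
        self.prf[phys] = value                 # result written once, in place

    def read(self, arch_reg):
        return self.prf[self.rat[arch_reg]]    # current speculative value

    def retire(self, prev_phys):
        """Commit: just recycle the old mapping; no data movement."""
        if prev_phys is not None:
            self.free.append(prev_phys)

r = PRFRenamer(8)
p0, old0 = r.rename_dest("rax")
r.write(p0, 42)
p1, old1 = r.rename_dest("rax")   # a second write to rax gets a new phys reg
r.write(p1, 99)
r.retire(old1)                    # retiring frees p0; the value 99 never moved
print(r.read("rax"))              # -> 99
```

The point of the change is visible in `retire`: it touches bookkeeping only, which is why switching to a PRF rearranged how data flows through every stage past decode.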
Intel has either lost their talent or pushed the boundaries of their design to its limits. Perhaps both. Iteration is not working for them. Innovate or get left behind.
... Have you been paying attention at all? Intel's current lineup hasn't improved much at all recently,
because they have shipped the same core 3 times under new names. The reason, as explained by Raja Koduri in this very article, is that their architecture teams had been strictly targeting 10nm, and since 10nm has been delayed for so long, they have just shipped the same CPUs for a very long time.
As Raja Koduri and Jim Keller said in the AnandTech article:
Our products will be decoupled from our transistor capability. We have incredible IP at Intel, but it was all sitting in the 10nm process node. If we had had it on 14nm then we would have better performance on 14nm.
More to the point, you are advocating for Intel to do a radical redesign to beat Apple, when Apple's designs are converging on the exact same design ideas that Intel uses. The difference, and the reason why Apple is beating Intel right now, is that Apple has a better process to manufacture their CPUs on. Since it has become pretty clear that the ideal CPU is a wide OoO machine running at relatively low clocks with an awesome memory subsystem, the only realistic way Intel can beat them is to build a similarly wide OoO machine. Which they are doing: the main new feature described was a widening of the rename and retire stages.
VLIW didn't work in its intended application, and the implementation was also bad. Didn't mean something non-x86 had to end that way. Look at what Apple is doing with ARM right now.
You appear to be putting quite a lot of emphasis on the ISA. Don't. One of the big lessons everyone has learned is that the ISA doesn't really matter that much, so long as it's not something stupid and radically different (like IA64) from the known-good ISAs for shipping complex OoO cores (like ARM and x86). Apple does get a bit of an advantage from using ARMv8, because it's a newer and less crufty ISA with fixed-length instructions, but basically all that buys them is not having to implement a uop cache. Which costs Intel a low single-digit % of power and a low single-digit % of die area.
Beyond that, there is no major difference between ARMv8 and x86, and there is no known way forward that would give massive performance increases that could not be implemented on x86 (or ARM).
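The fixed-length advantage comes down to finding instruction boundaries. A toy Python sketch (my own illustration; the one-byte length prefix stands in for x86's far messier prefix/opcode/ModRM rules) of why wide decode is easy for ARMv8 and hard for x86, which is the problem the uop cache works around:

```python
# With 4-byte fixed-length instructions (ARMv8-style), instruction i starts
# at byte 4*i: every boundary is known up front, so N decoders can all start
# in parallel.
def fixed_length_boundaries(code, width=4):
    return list(range(0, len(code), width))

# With variable-length encoding (x86-like), an instruction's length is only
# known after examining its bytes, so boundary-finding is inherently serial.
# Here the first byte of each instruction encodes its total length.
def variable_length_boundaries(code):
    boundaries, pc = [], 0
    while pc < len(code):
        boundaries.append(pc)
        pc += code[pc]          # must decode byte pc before pc+1 is known
    return boundaries

print(fixed_length_boundaries(bytes(12)))                     # -> [0, 4, 8]
print(variable_length_boundaries(bytes([2, 0, 3, 0, 0, 1])))  # -> [0, 2, 5]
```

A uop cache sidesteps the serial scan by caching already-decoded uops, which is why dropping it saves some power and area but buys little beyond that.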
Are you saying it was impossible for Intel to do that?
No, it would not be impossible for Intel to ship an (x86) core that is much better, given a radically better process.
Intel has long gotten comfortable being the top dog in semiconductor process tech. Even when not in density, still in transistor performance. (Which really does matter more than density for high-end CPUs.) They stumbled badly with 10nm, and are now shipping 14nm(+++) silicon against TSMC 7nm, making this the first time they have had to compete against superior transistor performance since, IIRC, AMD's 90nm SOI was better than Intel's 90nm bulk. (IIRC Intel hit the wall with shrinking the SiO2 gate dielectric, while AMD widely touted that their superior SOI process made this a non-problem for them. Which was true, all the way until the very next node at 65nm, when AMD hit it even worse. Oops.)