Okay, so first of all: this sounds far too good to be true. You mean the perf/watt gap between the latest Cove and this new super-secret architecture is bigger than all of the perf/watt increases between the P5 and the latest Cove? Also, can you clarify what the phrase "Later cove core" means? Are we referring to Sunny, Willow, Golden, or even something later?

The source I found didn't specify which Cove core it meant on the roadmap, other than that it was a later one rather than an earlier one. Since we only know about Goldencove, I originally had Goldencove in that spot.
Palmcove (10nm) -> Sunnycove (10nm+) -> Willowcove (10nm++) -> Goldencove (10nm+++) -> --- Cove (7nm) -> --- Cove (7nm+) -> "Deadsun 1" (7nm++) -> "Deadsun 2" (5nm)
Most of it was about the first 5nm CPU core only, though. DS1 will probably go halfway (the 7nm up-port wasn't mentioned at all) and DS2 will go all the way.

Surely the designs for Intel's 5nm architectures are way too far out to really offer any granular detail on performance metrics?

Anything canned between May 3, 2013 and June 21, 2018 might be back on the table. Also, this architecture has more funding than any previous one. It is being worked on by all teams, with the C2DG/Haifa team in the command seat from the get-go, rather than being built at Folsom or Oregon and then handed to Haifa to fix if it isn't feasible for production. It is very much an all-in architecture, much like how AMD went with Zen.
I'm currently watching these teams for any updates on the potential perf/watt improvements and IA-FUTURE:
BiTS team (known work on IA32 EL/Houdini/IA32 and Intel 64 DBT instructions)
DRVFS team (known work on SoftMachines)
EPIC team (known work on EPIC ISA and EPIC instruction set extension to IA32/Intel 64)
The metrics are probably from planning and simulations with the 7nm risk production, with an increased margin for hitting the end goal on the 5nm node.
The DBT unit shouldn't translate ARM instructions to x86 instructions. (Specifically, the DBT unit sits after the decoder, not before it.) Rather, it converts decoded R/M/I operands/addresses from physical (e.g. finite, R0-R15) to virtual (e.g. infinite, vR0-vR∞). It also has some other features allowing for thread rotation and automatic fencing, which are used in conjunction with the consistency/security manager around the retire/scheduler.
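
To make the physical-to-virtual idea a bit more concrete, here is a toy Python sketch. It is purely my own illustration, not anything from the roadmap: the `VirtualRenamer` class, its `rename` method, and the "every write gets a fresh vR" policy are all assumptions, and it ignores memory/immediate operands, thread rotation, and fencing entirely.

```python
from itertools import count


class VirtualRenamer:
    """Toy model (hypothetical): map finite R0-R15 onto an unbounded vR space."""

    def __init__(self, num_arch_regs=16):
        # Unbounded counter standing in for the "infinite" virtual register pool.
        self._next_id = count()
        # Assumption: each architectural register starts with its own virtual name.
        self.current = {
            f"R{i}": f"vR{next(self._next_id)}" for i in range(num_arch_regs)
        }

    def rename(self, dest, sources):
        """Rename one decoded op: sources read the live mapping, dest gets a new vR."""
        renamed_sources = [self.current[s] for s in sources]
        new_dest = f"vR{next(self._next_id)}"
        self.current[dest] = new_dest
        return new_dest, renamed_sources


# Three decoded ops as (destination, [sources]); R1 is written twice.
program = [("R1", ["R2", "R3"]), ("R1", ["R1", "R4"]), ("R5", ["R1"])]
renamer = VirtualRenamer()
renamed = [renamer.rename(dest, srcs) for dest, srcs in program]

for (dest, srcs), (vdest, vsrcs) in zip(program, renamed):
    print(f"{dest} <- {srcs}  becomes  {vdest} <- {vsrcs}")
```

The point is just that the two writes to R1 end up in different virtual registers, so the finite register names stop creating false dependencies. Whether the real unit works anything like this is unknown.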