MorphCore paper - rumored new architecture in Skylake


scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
Yep. And I am sure they already do that. To get some kind of "magic solution" we need something extremely different like IA64 style.

And that's not going to happen due to the market.


I'm sure they do as well. I think in the long run, processors would be faster if they had continued down the Itanium route. But, there was market pressure for both 64 bit, and backwards compatibility. Itanium was a tacit admission that there were things in x86 that really needed an overhaul. And still do.
 

Nothingness

Diamond Member
Jul 3, 2013
3,296
2,368
136
Yep. And I am sure they already do that. To get some kind of "magic solution" we need something extremely different like IA64 style.
IA64 was not a radical departure as far as ISA and micro-arch go. As an example, predication and VLIW were already used by some DSPs.

And that's not going to happen due to the market.
I hope you'll be proven wrong, things have to move. And anyway I want x86 ISA to die :biggrin:
 
Mar 10, 2006
11,715
2,012
126
He obviously means completely different architecture from the current 'Core' architecture that's been around for years.

You knew this too, I assume you just want to be pedantic as you have nothing better to do.

You do realize that Intel has made significant changes to the Core architecture over the years, right?

SNB was a pretty big overhaul; Intel added the uOp cache, totally rebuilt the branch predictor, brought a complete overhaul to the on-die interconnect (ringbus), implemented AVX, made key OoO structures (ROB, load/store buffers) larger, implemented a physical register file, etc.

Haswell was another big improvement that I'd say is on the scope of the NHM -> SNB transition.

I don't want to be pedantic, but I think it's pretty poor form to imply that Intel is just phoning it in with its microarchitectures because AMD can't get it together. There's a lot of improvements/modifications that go on under the hood, and if you think that throwing away an established architecture "for the sake of something new" is a better idea than intelligently evolving a known-to-be-good architecture, then I don't know what to tell you.
 

Nothingness

Diamond Member
Jul 3, 2013
3,296
2,368
136
You do realize that Intel has made significant changes to the Core architecture over the years, right?

SNB was a pretty big overhaul; Intel added the uOp cache, totally rebuilt the branch predictor, brought a complete overhaul to the on-die interconnect (ringbus), implemented AVX, made key OoO structures (ROB, load/store buffers) larger, implemented a physical register file, etc.

Haswell was another big improvement that I'd say is on the scope of the NHM -> SNB transition.

I don't want to be pedantic, but I think it's pretty poor form to imply that Intel is just phoning it in with its microarchitectures because AMD can't get it together. There's a lot of improvements/modifications that go on under the hood, and if you think that throwing away an established architecture "for the sake of something new" is a better idea than intelligently evolving a known-to-be-good architecture, then I don't know what to tell you.
I certainly agree that Intel made significant improvements and changes, but they have also shown in recent years that they have probably gotten the most out of the existing underlying principles. And given how good they are, I have little hope we'll see more great improvements without radical changes way beyond PRF, micro-op cache or other "tweaks". So a new (micro-)architecture might be needed, and not only "for the sake of something new".

OTOH I'm afraid that, as ShintaiDK wrote, the market prefers a legacy status quo, which makes sense. But this also likely slows down innovation.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
An update regarding the feasibility of MorphCore in upcoming Intel microarchitectures:
According to the authors, they made some additions to an existing uarch and only needed additional multiplexers to switch between the old and new units or different logic blocks. They assumed a 2.5% stage delay hit for a pipeline stage needing the multiplexer, and so reduced the clock, and roughly the ST performance, by this amount.

But there is a flaw in this logic (pun intended). A typical pipeline stage of today's x86 designs has a processing time of about 20-25 FO4 inverter delays. For multiplexers I found numbers from 2.5 to 4 FO4 delays. This would mean a clock frequency hit of 10-20%. Including wire delays, such an addition to an existing efficient core would cost about that much ST performance.
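
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It is not from the paper; the function names are made up for illustration, and it assumes the multiplexer delay simply adds to the critical path of the stage (no wire delay, no retiming). The ratio mux/stage gives the 10-20% figure as an increase in stage delay; the resulting drop in clock frequency is slightly smaller, since frequency is the inverse of the cycle time.

```python
def stage_delay_increase(mux_fo4: float, stage_fo4: float) -> float:
    """Fractional growth of a pipeline stage's delay when a mux is added to it.

    Assumption: the mux sits directly on the critical path, so its delay
    adds linearly to the stage delay (both measured in FO4 inverter delays).
    """
    return mux_fo4 / stage_fo4


def frequency_loss(mux_fo4: float, stage_fo4: float) -> float:
    """Fractional drop in clock frequency for the same change.

    Frequency is 1 / cycle time, so the loss is 1 - old_delay / new_delay.
    """
    return 1.0 - stage_fo4 / (stage_fo4 + mux_fo4)


if __name__ == "__main__":
    # Corner cases built from the numbers quoted in the post:
    # fast mux in a long stage, and slow mux in a short stage.
    for mux, stage in [(2.5, 25.0), (4.0, 20.0)]:
        print(
            f"mux={mux} FO4, stage={stage} FO4: "
            f"delay +{stage_delay_increase(mux, stage):.1%}, "
            f"frequency -{frequency_loss(mux, stage):.1%}"
        )
```

Either way the hit is several times larger than the 2.5% assumed by the authors, which is the point.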

I think it would be a better compromise to add special 8T SMT cores instead (this just costs silicon area) and to power on whichever cores are most efficient for the task at hand, while saving power on the others (power gated). This is a bit like big.LITTLE. This way the 8T cores could be optimized even further for multithreaded tasks.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,663
2,526
136
He obviously means completely different architecture from the current 'Core' architecture that's been around for years.

'Core' is not an architecture, it's purely a marketing term.

SNB was a pretty big overhaul

SNB was not a big overhaul. It was a completely new uarch. This is easy to miss because Intel reused a lot of the names for structures from the old one, but the design of the system is radically different, with SNB using a completely new OoO system with a dataless ROB and PRF instead of the old "in-flight data lives in ROB" model.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,663
2,526
136
Just to reinforce my point:

made key OoO structures (ROB, <snip>) larger

They did not make the ROB larger. They completely removed the old ROB. It no longer exists. The structure they now call ROB is not called that by academia or anyone else in the industry, and it has a completely different structure and function.
 

jpiniero

Lifer
Oct 1, 2010
16,783
7,235
136
I think it would be a better compromise to add special 8T SMT cores instead (this just costs silicon area) and to power on whichever cores are most efficient for the task at hand, while saving power on the others (power gated). This is a bit like big.LITTLE. This way the 8T cores could be optimized even further for multithreaded tasks.

The purpose of MorphCore isn't really to improve performance; it's to be more energy efficient in MT workloads. The ST performance is going to go down on a core that is Morphed.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
Just to reinforce my point:

They did not make the ROB larger. They completely removed the old ROB. It no longer exists. The structure they now call ROB is not called that by academia or anyone else in the industry, and it has a completely different structure and function.

Ah, very trixie! It's too bad that I have too many irons in the fire to keep up with the latest trends in computer architecture. I really need to pick up the latest edition of "Computer Architecture, Fifth Edition: A Quantitative Approach", my knowledge is getting very outdated :(

15 years ago, it wasn't that hard to keep abreast of current trends in CPU and GPU technology and systems. Now the hardware and software elements are so complex it's just not doable.
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
The purpose of MorphCore isn't really to improve performance; it's to be more energy efficient in MT workloads. The ST performance is going to go down on a core that is Morphed.
This MorphCore compromise can be avoided by such dark silicon cores. It's just what I call adaptivity. Heterogeneous cores fall into this category. And you know what many people want: to have their cake and eat it, too. How many are whining about the "slow" progress in ST performance? Intel just avoids the ranges far up north of the power/performance knee. Would they accept such a loss?

And as long as these 8 threads also include ones affecting the user's perception of system performance, this might lead to a perceived >10x delay (1/8 * in-order perf.).
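
To make that estimate concrete, here is a very simplified model in Python. The function name and the relative-performance values are assumptions for illustration only, not measurements: it assumes the morphed core's in-order throughput is split evenly across the threads and ignores any latency hiding that SMT would win back.

```python
def per_thread_slowdown(threads: int, in_order_relative_perf: float) -> float:
    """Slowdown of one thread versus running alone on the OoO core.

    in_order_relative_perf is the in-order core's single-thread performance
    as a fraction of the out-of-order core's (hypothetical input).
    Each thread is assumed to get 1/threads of that in-order throughput.
    """
    if threads < 1 or in_order_relative_perf <= 0:
        raise ValueError("need at least one thread and a positive relative perf")
    per_thread_perf = in_order_relative_perf / threads
    return 1.0 / per_thread_perf


if __name__ == "__main__":
    # Illustrative values only: any in-order perf below 0.8x of the
    # OoO core pushes the per-thread slowdown past 10x with 8 threads.
    for rel in (0.8, 0.6, 0.5):
        print(f"in-order perf {rel:.1f}x -> {per_thread_slowdown(8, rel):.1f}x slower per thread")
```

So under this model the ">10x" holds whenever the in-order mode delivers less than 80% of the OoO core's single-thread performance.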
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Ah, very trixie! It's too bad that I have too many irons in the fire to keep up with the latest trends in computer architecture. I really need to pick up the latest edition of "Computer Architecture, Fifth Edition: A Quantitative Approach", my knowledge is getting very outdated :(

15 years ago, it wasn't that hard to keep abreast of current trends in CPU and GPU technology and systems. Now the hardware and software elements are so complex it's just not doable.
That book is good, but 2k pages (incl. all appendices) is a lot to read in one's spare time. ;) How about more recent slide sets?
Just to reinforce my point:



They did not make the ROB larger. They completely removed the old ROB. It no longer exists. The structure they now call ROB is not called that by academia or anyone else in the industry, and it has a completely different structure and function.
In the Bulldozer uarch they just call it "retirement queue".
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
SNB was not a big overhaul. It was a completely new uarch. This is easy to miss because Intel reused a lot of the names for structures from the old one, but the design of the system is radically different, with SNB using a completely new OoO system with a dataless ROB and PRF instead of the old "in-flight data lives in ROB" model.

P4 had a PRF and decoded i$. SNB ain't got nothing on Netburst. :)

They completely removed the old ROB.

Nah, the data array just got moved (into the PRF). The ROB lives on as retirement, events and faulting control logic.
 
Apr 30, 2015
131
10
81
Maybe Intel have discovered big.LITTLE. In my view, the future for retail-device computers lies in SoCs, not traditional CPUs.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,583
10,224
126
Maybe Intel have discovered big.LITTLE. In my view, the future for retail-device computers lies in SoCs, not traditional CPUs.

Beema/Mullins? Carrizo-L? Braswell? Those are all SoCs, I think.

Edit: I guess if you see retail PCs all being low-end, and non-upgradable, then SoCs make sense. In that case, they should all be in mini-PC form factors.
 