
Intel Ocean Cove Thread (next gen core design)


TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
I think the key to intel's next architecture (and the key to AMD's current one) is the ability to implement MCM. It is obvious that process tech improvements are slowing and becoming more expensive, so the necessity to leverage die area of multiple packages will become important for performance in the next decade.
Anything that more cores would give a big performance boost to already gets a big performance boost from GPUs, which is why both companies came to the conclusion back in 2005 that they would have to incorporate GPUs into their CPUs as co-processors. The main thing is that Intel is trying to get some good GPU IP going, that's my guess.
For very specialized things that can't be run on a GPU, people already use many-core ARM chips or stuff like Xeon Phi: a lot of cores, each running very slowly. A mainstream CPU doesn't need such a co-processor and won't get one for many, many years.
I'm opting for Moonbogg's alien technology :D I think that something completely new and modern is the way forward rather than redesigning the old and then sleepwalking with that for another decade. Unless they'd enjoy being overtaken by their sole competitor in the near future.
They can't be overtaken. They could, in theory, be matched, since once you are maximized there is no beyond maximized, and if they did get matched, Intel would still be the first who got there.
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,486
2,363
136
Interesting news. I just hope AMD stays competitive. Last 10 years before Ryzen were pretty painful. At least Intel doesn't have the process advantage anymore, so both Intel and AMD will be playing on a level field.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
Didn't JK want to make a new kind of ARM core with AMD before he left? I thought I read that somewhere.

Rumor was he was more interested in K12 than Zen, and when K12 got shelved, he started looking elsewhere.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I like to think this time, it is different. ( I hope ),

Haven't you heard of the term "The more things change, the more they stay the same"?

Revolutionary means it can apply ideas that are radically different from what has been done before. The problem with revolution is that it fails at finding the right balance. Intel's EPIC ISA failed because it was off balance and tried to push too much work onto compilers. Pentium 4's Trace Cache didn't work because the design tried to rely on it exclusively - once they re-adopted the idea for Sandy Bridge, not as a replacement for existing ideas but as a complement to them, it worked very well.

I hope they find the right balance, but do not expect a real, total ground-up redesign like the architectures that failed. They don't work.

Regarding the alien architecture Moonbogg was talking about. Sure it might be fantastic but I doubt you'll be able to run anything you currently have on it. :D
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
Pentium 4's Trace Cache didn't work because the design tried to rely on it exclusively - once they re-adopted the idea for Sandy Bridge, not as a replacement for existing ideas but as a complement to them, it worked very well.

Indeed. Too bad AMD didn't figure that out until Zen. I'm really curious as to what more can be done. Two things that come to mind lately are the uop cache and micro-op fusion. Is there anything interesting out there that is being worked on? I'm sure there is, but we won't find out until it comes out.

Regarding the alien architecture Moonbogg was talking about. Sure it might be fantastic but I doubt you'll be able to run anything you currently have on it. :D

No, I'm pretty sure they've got that all figured out. Haven't you seen Independence Day? ;)
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
No, I'm pretty sure they've got that all figured out. Haven't you seen Independence Day? ;)

Yea I did. The alien mothership got hacked by a PowerBook 5300 that didn't even have WiFi. I'm pretty sure if they invaded us now we'd totally own them with our Coffee Lake Ultrabooks.

uop cache and micro-op fusion.

Yea, however the uop cache is based on the Trace Cache, and uop fusion may be a feature particular to x86, so neither is entirely new. The technologies that are being used today often come from ideas that were thought of in the 70's.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
Yea I did. The alien mothership got hacked by a PowerBook 5300 that didn't even have WiFi. I'm pretty sure if they invaded us now we'd totally own them with our Coffee Lake Ultrabooks.

Well there was clearly a PowerPC to Alien translation/emulation layer going on.

Yea, however the uop cache is based on the Trace Cache, and uop fusion may be a feature particular to x86, so neither is entirely new. The technologies that are being used today often come from ideas that were thought of in the 70's.

I never did look into the trace cache all that much. It sounds very much like a uop cache. Are there differences? Micro-op fusion has been around since Conroe I want to say, so certainly not new. I'm just wondering if there is anything else out there that we may see in a new architecture. I've heard about Speculative Multithreading from David Kanter among a few others. By the sounds of it, it just wasn't all that feasible. Sort of like the compilers that were supposed to make Itanium awesome.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I never did look into the trace cache all that much. It sounds very much like a uop cache. Are there differences?

The biggest difference is in the way the CPU relies on them. Pentium 4 had a single decoder and no L1 instruction cache. The Trace Cache was supposed to replace both. If it had a high hit rate it would have been great, because it would have cut out several pipeline stages. But despite its large physical size it wasn't big enough. They could have opted for 3-wide decoders and an L1 I-cache, but this was back in the early 2000s, when architecture was limited by transistor counts and die sizes, and it would simply have made the core too large. Plus it might have impacted frequency somewhat.

The idea behind the Trace Cache was that by minimizing decoders and cutting out the L1 I-cache you'd end up with a core that's small and scales well in frequency, while the Trace Cache would make up for the lack of decoders and I-cache. The core was still bloated and way too large, and it also used lots of power.

Sandy Bridge's uop cache merely helps out in case the 4-wide decoders and the L1 I-cache are not enough. So the worst-case scenario isn't as absolutely horrific as it was with Netburst chips.
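
To put rough numbers on that worst-case argument, here is a back-of-the-envelope sketch. It assumes a very simplified model in which a fraction of uops comes from the decoded-uop cache at one rate and the rest comes through the decoders at another rate. The function name and the widths (3 and 1 for a Netburst-like front end, 4 and 4 for a Sandy Bridge-like one) are illustrative assumptions, not measured figures, and the model ignores everything else in the pipeline.

```python
def frontend_uops_per_cycle(hit_rate, cache_width, decode_width):
    """Average uops delivered per cycle by a two-path front end.

    hit_rate     -- fraction of uops served from the decoded-uop cache (0..1)
    cache_width  -- uops per cycle delivered on a cache hit (assumed)
    decode_width -- uops per cycle delivered by the decoders on a miss (assumed)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average of the time each uop spends on its delivery path.
    cycles_per_uop = hit_rate / cache_width + (1.0 - hit_rate) / decode_width
    return 1.0 / cycles_per_uop


# Illustrative widths only (assumptions for this sketch).
NETBURST_LIKE = {"cache_width": 3, "decode_width": 1}
SANDY_BRIDGE_LIKE = {"cache_width": 4, "decode_width": 4}

for hit_rate in (1.0, 0.9, 0.5, 0.0):
    p4 = frontend_uops_per_cycle(hit_rate, **NETBURST_LIKE)
    snb = frontend_uops_per_cycle(hit_rate, **SANDY_BRIDGE_LIKE)
    print(f"hit rate {hit_rate:.0%}: "
          f"Netburst-like {p4:.2f}, Sandy Bridge-like {snb:.2f} uops/cycle")
```

In this toy model the Netburst-like front end falls toward 1 uop per cycle as the hit rate drops, while the Sandy Bridge-like one stays flat because its fallback path is as wide as its cache path.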

Micro Op fusion has been used since Pentium M back in 2003.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
The biggest difference is in the way the CPU relies on them. Pentium 4 had a single decoder and no L1 instruction cache. The Trace Cache was supposed to replace both. If it had a high hit rate it would have been great, because it would have cut out several pipeline stages. But despite its large physical size it wasn't big enough. They could have opted for 3-wide decoders and an L1 I-cache, but this was back in the early 2000s, when architecture was limited by transistor counts and die sizes, and it would simply have made the core too large. Plus it might have impacted frequency somewhat.

Yes, as it were, the P4 was already significantly larger in die size than contemporary Athlons. So to clarify, the trace cache is basically a uop cache, except that in modern designs, there is still an L1 instruction cache?

Micro Op fusion has been used since Pentium M back in 2003.

Wow, that is a bit further back than I had thought. I'm not surprised it came from the Pentium M team though.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Yes, as it were, the P4 was already significantly larger in die size than contemporary Athlons. So to clarify, the trace cache is basically a uop cache, except that in modern designs, there is still an L1 instruction cache?

And the decoders, the lack of which may be more important than L1 I-cache. Remember, the Netburst designs having only 1 decoder was a big deal, because going from 1 to 2 decoders results in a relatively easy, large improvement, in some cases nearly 2x, and Pentium III design had 3 of them.

x86 designs use decoders to convert more complicated x86 instructions into internal micro-ops. The Trace Cache stores decoded instructions, so if the instructions you need are in the Trace Cache (called a cache hit), you get to skip the decoding stages, which also cuts down on pipeline stages. You get the benefit of not needing to decode, and the benefit of a shorter pipeline, which cuts the cost of a branch misprediction.

The Trace Cache had to do more, so it's more complicated and bigger than a uop cache.
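
For anyone who wants to see the "skip decode on a hit" idea in motion, here is a minimal toy sketch. It assumes a per-address LRU cache of decoded uops, which is much closer to a simple uop cache than to the real Trace Cache (which stored traces that follow branches). The instruction-to-uop table, the `UopCache` class, and `run_loop` are all made up for illustration and do not reflect real Intel uop encodings.

```python
from collections import OrderedDict

# Hypothetical instruction -> micro-op table, for illustration only.
UOP_TABLE = {
    "add eax, [mem]": ["load tmp, [mem]", "add eax, tmp"],
    "inc ecx": ["add ecx, 1"],
    "cmp ecx, 100": ["sub.flags ecx, 100"],
    "jl loop": ["branch.lt loop"],
}


class UopCache:
    """Tiny LRU cache mapping an instruction address to its decoded uops."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0      # fetches served without touching the decoder
        self.decodes = 0   # fetches that had to go through the decoder

    def fetch(self, address, instruction):
        if address in self.entries:
            # Cache hit: reuse the stored uops and skip the decode step.
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address]
        # Cache miss: "decode" the instruction, then store the result.
        self.decodes += 1
        uops = UOP_TABLE[instruction]
        self.entries[address] = uops
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return uops


def run_loop(cache, program, iterations):
    """Fetch every instruction of `program` once per iteration; return uops issued."""
    issued = 0
    for _ in range(iterations):
        for address, instruction in enumerate(program):
            issued += len(cache.fetch(address, instruction))
    return issued


program = list(UOP_TABLE)        # a 4-instruction loop body
big = UopCache(capacity=8)       # the loop fits
small = UopCache(capacity=2)     # the loop does not fit and thrashes

run_loop(big, program, 100)
run_loop(small, program, 100)
print(f"big cache:   {big.decodes} decodes, {big.hits} hits")
print(f"small cache: {small.decodes} decodes, {small.hits} hits")
```

When the loop fits, the decoder is only used on the first pass. When it does not fit, every fetch falls back to the decoder, which is the situation where having only one decoder behind the cache hurts.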

I'm not surprised it came from the Pentium M team though.

Don't make the mistake of thinking of them as a better team. Internally I thought they might have different goals. The original Pentium M team, from Haifa, was more core-focused, while the Oregon team was more platform and I/O focused. You get the beefier core from Haifa, and things like Hyperthreading and integrated memory controller + QPI from Oregon. I am not sure if the distinction exists anymore. If it does, Skylake is Haifa's.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
The Trace Cache stores decoded instructions, so if the instructions you need are in the Trace Cache (called a cache hit), you get to skip the decoding stages, which also cuts down on pipeline stages. You get the benefit of not needing to decode, and the benefit of a shorter pipeline, which cuts the cost of a branch misprediction.

Yea, I am familiar with the decoding. I just thought that a uop cache consisted of decoded instructions. Hence, "micro-op cache". If you get a hit there you save the decoding stage and some power. I'm just trying to understand how the trace cache was different.
 

Thunder 57

Diamond Member
Aug 19, 2007
4,080
6,806
136
Don't make the mistake of thinking of them as a better team. Internally I thought they might have different goals. The original Pentium M team, from Haifa, was more core-focused, while the Oregon team was more platform and I/O focused. You get the beefier core from Haifa, and things like Hyperthreading and integrated memory controller + QPI from Oregon. I am not sure if the distinction exists anymore. If it does, Skylake is Haifa's.

I don't mean to say that the P4 was total crap. It must've had excellent branch prediction for its time to keep that pipeline filled. I am really curious as to what Intel did to keep Prescott IPC right around Northwood despite the significantly longer pipeline. Branch prediction, increased L1 and L2. Beyond that I have no idea.
 

naukkis

Golden Member
Jun 5, 2002
1,020
853
136
And the decoders, the lack of which may be more important than L1 I-cache. Remember, the Netburst designs having only 1 decoder was a big deal, because going from 1 to 2 decoders results in a relatively easy, large improvement, in some cases nearly 2x, and Pentium III design had 3 of them.

One decoder was never a problem with Netburst, as instructions are decoded only if L1i misses. Of course having more decoders and a real L1i is beneficial if there's die area to use for that and a way to power down the decoders when the uOP cache hits, to prevent them from wasting power. Both options need transistors, which weren't free in the Netburst era.

Nowadays Intel CPUs power down the decode phase when loops fit in the uOP cache, so there is no power loss from having a more complicated front end, which is most beneficial at context switches when the uOP cache is empty (the uOP cache has to be virtual).

But nowadays it is pretty much guaranteed that x86 will lose the efficiency race against ARM because of complex instruction decode wasting power - and getting even with ARM means that hardware compatibility with x86 will be lost.
 

stuff_me_good

Senior member
Nov 2, 2013
206
35
91
I love proper competition and all new technology, and I like the idea of Intel getting Jim Keller, but at the same time I'm worried that in a few years' time AMD is going to fall off a cliff like in the Phenom days, because there is no one to steer the CPU design, and in their infinite wisdom they will do something horribly disastrous.

Yes, Zen still has a lot to give, but if Intel has a new design in 4 years that is faster out of the box than their current one at that time, it will trounce the latest and greatest Zen design, which has barely been able to catch up to Intel's latest *lake design.

And then we get to the point where the monopoly continues and people are bitching about not having competition, yet doing nothing about it, like voting with their wallets.
 

NTMBK

Lifer
Nov 14, 2011
10,455
5,842
136
I love proper competition and all new technology, and I like the idea of Intel getting Jim Keller, but at the same time I'm worried that in a few years' time AMD is going to fall off a cliff like in the Phenom days, because there is no one to steer the CPU design, and in their infinite wisdom they will do something horribly disastrous.

Yes, Zen still has a lot to give, but if Intel has a new design in 4 years that is faster out of the box than their current one at that time, it will trounce the latest and greatest Zen design, which has barely been able to catch up to Intel's latest *lake design.

And then we get to the point where the monopoly continues and people are bitching about not having competition, yet doing nothing about it, like voting with their wallets.

A modern CPU design team is way bigger than one person. Hopefully the culture and processes that created Zen are still in place at AMD, and will lead to more great designs in the future.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
One decoder was never a problem with Netburst, as instructions are decoded only if L1i misses. Of course having more decoders and a real L1i is beneficial if there's die area to use for that and a way to power down the decoders when the uOP cache hits, to prevent them from wasting power. Both options need transistors, which weren't free in the Netburst era.

Please read previous posts.

Pentium 4 did not have an L1I cache. The Trace Cache was it. And I bet that while the overall impact of having just 1 decoder may not have been big, it would still have been non-negligible, and in some scenarios the impact was huge: when there was nothing in the Trace Cache, it was indeed a 1-wide chip.

I don't mean to say that the P4 was total crap. It must've had excellent branch prediction for its time to keep that pipeline filled.

Yea, I just thought to mention that because, just like with Jim Keller, Steve Jobs, Elon Musk, and even teams like Haifa, people tend to put them on a pedestal. Yea, they do good, sure. Remember that Jobs got evicted from Apple before he came back to shape Apple into what it is today. We need mistakes to shape us for the better.

About the Trace Cache, yeah, there's low-level technical differences but the concept is similar. They both store decoded instructions.
 

jpiniero

Lifer
Oct 1, 2010
16,840
7,284
136
Kind of assuming this is the server-focused product that comes after the Rapids. Given Intel's interest in accelerators, maybe this is where they really get serious about enabling non-CPU tiles.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,791
136
Most likely it will be Zen 5 vs Ocean Cove. Zen 2 would have taped out already, and Zen 3 would have its key design goals frozen.
Mike Clark already started defining what the future of Zen 5 would look like and incoming challenges would definitely charge them up which is a very good thing for innovation.
Hopefully AMD can generate enough profits in the next 18 months to secure resources and talents for Zen 5, the outcome after 2021-22 would be an epic showdown again after so many sloppy years.
Core vs Core design, no more process handicap.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,576
126
Ok, when...
AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.
 

TigerMonsoonDragon

Senior member
Feb 11, 2008
589
39
91
AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.
So, the architecture in question WILL have the fixes?
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
AMD and Intel have both answered the question about when fixed chips will be released. It was discussed several times in the big thread on Meltdown/Spectre.

I believe Intel said fixed chips would be out this year. Can't remember AMD's answer.
AMD's 470 chipsets already have the Spectre 1 and 2 mitigations in place. Meltdown is Intel only. Intel has said sometime this year.
 

LTC8K6

Lifer
Mar 10, 2004
28,520
1,576
126
AMD's 470 chipsets already have the Spectre 1 and 2 mitigations in place. Meltdown is Intel only. Intel has said sometime this year.
I'm not aware of any hardware fixes by either Intel or AMD.