Discussion Intel current and future Lakes & Rapids thread


dangerman1337

Senior member
Sep 16, 2010
333
5
81
Is Sapphire Rapids mostly the Process step of Process > Architecture > Optimization? Intel's post-Tiger Lake roadmap seems confusing.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Sapphire Rapids is supposedly the first major architectural rework (Zen or Pentium M style) since Core was introduced.

We'll see what performance it brings.

"Revolution" isn't always a good thing. It's only when you can bring about a good change. Oftentimes revolutions and revolutionary things are only done for the sake of being revolutionary. Pentium 4 was revolutionary. Bulldozer architecture was revolutionary.

Also physical limits prohibit revolutionary increases in performance from going on indefinitely. Going from one revolution to another means that much more work. Eventually nature says: STOP!

If you see successful architectures, they are often a very well thought out implementation of ideas already known for years.
 
Mar 10, 2006
11,715
2,012
126
Sapphire Rapids is supposedly the first major architectural rework (Zen or Pentium M style) since Core was introduced.

There is such a big "TOCK" in the works, but it's not clear when the product intercept point is for the core. Sapphire Rapids could be a good time for it.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
https://www.techpowerup.com/237404/intel-sapphire-rapids-micro-architecture-succeeds-tiger-lake

https://www.intel.com/content/www/u...sets/platform-codenames.html#search-0=tinsley

Sapphire Rapids server platform is named Tinsley

Successor to Tiger Lake
Big arch change
Supposedly 7nm (I'm wary of this one)

I think we will see Sapphire Rapids in 2021 on 7nm or 10nm+++, depending on how Intel executes on 7nm development. IMO we might see a launch cadence something like this:
Cannonlake - 2018, Icelake - 2019, Tigerlake - 2020, Sapphire Rapids - 2021
 

crashtech

Lifer
Jan 4, 2013
10,521
2,111
146
It seems that splitting the uarch into two is overdue, one for max efficiency, and one for max performance. They've done a fairly good job of making Core a "Swiss Army Knife," but I'm not sure of the value of trying to have it all with basically one core design anymore.
 
Mar 10, 2006
11,715
2,012
126
It seems that splitting the uarch into two is overdue, one for max efficiency, and one for max performance. They've done a fairly good job of making Core a "Swiss Army Knife," but I'm not sure of the value of trying to have it all with basically one core design anymore.

It's not even about efficiency vs performance, it's about performance/efficiency for each type of workload.

The trade-offs required in a core designed to be used alongside 27+ copies of itself for data center/HPC workloads are very different from the ones that you'll want to make for an ultra-fast, highly efficient client core.

That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

This strategy has been leading to sub-optimal designs all around for a while now, and with the ARMy doing custom ARM cores tailored specifically for server/data center while using the standard Cortex A-series cores for mobile workloads, a significant vulnerability on Intel's part exists.

The divergence is not only desirable but absolutely necessary if Intel is to compete effectively.
 

crashtech

Lifer
Jan 4, 2013
10,521
2,111
146
It's not even about efficiency vs performance, it's about performance/efficiency for each type of workload.

The trade-offs required in a core designed to be used alongside 27+ copies of itself for data center/HPC workloads are very different from the ones that you'll want to make for an ultra-fast, highly efficient client core.

That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

This strategy has been leading to sub-optimal designs all around for a while now, and with the ARMy doing custom ARM cores tailored specifically for server/data center while using the standard Cortex A-series cores for mobile workloads, a significant vulnerability on Intel's part exists.

The divergence is not only desirable but absolutely necessary if Intel is to compete effectively.
Yeah, I guess they gave this idea a go with Atom et al, but we all know how that's going. I don't know if a closer derivative of Core can diverge that far and be effective, or perhaps it would have been done already.
 
Mar 10, 2006
11,715
2,012
126
Yeah, I guess they gave this idea a go with Atom et al, but we all know how that's going. I don't know if a closer derivative of Core can diverge that far and be effective, or perhaps it would have been done already.

Atom is a different design point than what a hypothetical "Client Core" would deliver. Atom has to be small/cheap, so the implementation will focus on transistor density rather than on transistor performance (denser metal stack, use of lower performance/lower leakage transistors across the design, etc.). You lose frequency capability when your goal is density & extreme efficiency.

A "Client Core" would not be designed with such a low cost in mind. Basically, a hypothetical Client-only core would probably still be built with high performance/higher leakage transistors, thicker upper metal layers, etc. But such a core roadmap wouldn't see a widening of the vector units from 256b to 512b (for AVX3 compatibility, just gang two AVX256 units together and execute at half throughput) -- even 256 bit units are probably overkill for client workloads, so maybe ganging a bunch of 128b units would be the right choice for such a core.
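
As a rough illustration of the "gang two 256-bit units together" idea in the paragraph above, here is a minimal Python sketch. It only models the dataflow under one plausible reading: a 512-bit add on sixteen 32-bit lanes is split into two halves, and each half takes one slot on a 256-bit (eight-lane) unit, so a single 512-bit op costs two 256-bit slots. The function names `add_256` and `add_512_split`, the 32-bit lane width, and the slot counting are all assumptions made for illustration, not a description of any real Intel design.

```python
from typing import List, Tuple

LANE_BITS = 32
LANES_256 = 256 // LANE_BITS   # 8 lanes fit in one 256-bit unit
LANES_512 = 512 // LANE_BITS   # 16 lanes make up one 512-bit vector
LANE_MASK = (1 << LANE_BITS) - 1


def add_256(a: List[int], b: List[int]) -> List[int]:
    """Lane-wise add on one hypothetical 256-bit unit, wrapping at 32 bits."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & LANE_MASK for x, y in zip(a, b)]


def add_512_split(a: List[int], b: List[int]) -> Tuple[List[int], int]:
    """Run a 512-bit add as two 256-bit halves.

    Returns the 16-lane result and the number of 256-bit unit slots used.
    """
    assert len(a) == len(b) == LANES_512
    result: List[int] = []
    slots_used = 0
    for start in range(0, LANES_512, LANES_256):
        end = start + LANES_256
        # Each half occupies one slot on a 256-bit unit (illustrative model).
        result.extend(add_256(a[start:end], b[start:end]))
        slots_used += 1
    return result, slots_used


if __name__ == "__main__":
    a = list(range(LANES_512))
    b = [10] * LANES_512
    out, slots = add_512_split(a, b)
    print(out, slots)
```

In this toy model the wide operation always consumes two narrow slots, which is the sense in which throughput is halved relative to native 256-bit work on the same hardware.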

Heck, just validating these ISA extensions takes away valuable engineering effort that could go into tuning the cores to run common legacy code faster, so don't be surprised if the client cores adopt new ISA extensions much more slowly than the higher end stuff (à la Atom, which still doesn't even have AVX).

Point is, there are so many trade-offs that hurt both client and server that come from having to design a single core for both.
 

Ajay

Lifer
Jan 8, 2001
15,332
7,792
136
https://www.techpowerup.com/237404/intel-sapphire-rapids-micro-architecture-succeeds-tiger-lake

https://www.intel.com/content/www/u...sets/platform-codenames.html#search-0=tinsley


Sapphire Rapids server platform is named Tinsley

Successor to Tiger Lake
Big arch change
Supposedly 7nm (I'm wary of this one)

Well, 7nm should be with EUV, which will allow Intel to move away from multi-patterning for at least that node. I would think the odds of hitting 7nm on schedule are much better than what we've seen the past couple of nodes with SADP & SAQP.
 
Mar 10, 2006
11,715
2,012
126
Well, 7nm should be with EUV, which will allow Intel to move away from multi-patterning for at least that node. I would think the odds of hitting 7nm on schedule are much better than what we've seen the past couple of nodes with SADP & SAQP.

True. The foundries are eyeing EUV insertion in their 7nm nodes for 2H 2019 mass production, so if EUV is good to go by then, Intel should have an easier (though still by no means easy) time with 7nm.
 

eddman

Senior member
Dec 28, 2010
239
87
101
I don't know much about the technical aspects of process nodes. Is it possible that Intel is having such issues compared to Samsung and TSMC because its 10 nm process is supposed to have smaller pitches?
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
Well, we have had 5 years without IPC improvement. No wonder Apple is pissed with Intel's current actions.

Next year's desktop CPU should have at least a 5% IPC increase...
 

VirtualLarry

No Lifer
Aug 25, 2001
56,227
9,990
126
A "Client Core" would not be designed with such a low cost in mind. Basically, a hypothetical Client-only core would probably still be built with high performance/higher leakage transistors, thicker upper metal layers, etc. But such a core roadmap wouldn't see a widening of the vector units from 256b to 512b (for AVX3 compatibility, just gang two AVX256 units together and execute at half throughput) -- even 256 bit units are probably overkill for client workloads, so maybe ganging a bunch of 128b units would be the right choice for such a core.
Sounds like... Zen's approach to AVX/AVX2 workloads?
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
I doubt Ivy Bridge and Kabylake have identical IPC.
Sapphire Rapids is coming in 2020, Skylake was here in 2015, that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?
 

Dayman1225

Golden Member
Aug 14, 2017
1,152
973
146
Sapphire Rapids is coming in 2020, Skylake was here in 2015, that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?


I believe Icelake is a new core; it's the Architecture step on the PAO scale, probably with a few % IPC bump and 10nm+
 

mikk

Diamond Member
May 15, 2012
4,112
2,108
136
Sapphire Rapids is coming in 2020, Skylake was here in 2015, that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?


OK, but this is rubbish because Icelake is coming before Sapphire Rapids.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

We also have to consider that Apple executes much better as a company. There are always people who do better than others in the exact same circumstances. Also, lots of top talent (including from Intel) moved to Apple. The future is brighter for them. This may not be just a technical issue, but something far larger, like company culture and mindset.

Atom is nowhere near A11 Bionic, despite being aimed at devices of similar cost. A11 does what 15W Intel chips do at 5W or less. That's an unprecedented achievement. I don't think Intel had that kind of lead over AMD.
 

TheGiant

Senior member
Jun 12, 2017
748
353
106
OK, but this is rubbish because Icelake is coming before Sapphire Rapids.
Well, do you really think Icelake will see the light of day before maybe late 2019?

My opinion:

2017 to Q4 2018: CFL 6C
Q4 2018: introduction of 8C desktop; IMO CFL 8C, still on 14nm+++ with 115W TDP
Q4 2019: Icelake, up to 10% IPC, 8C, 4.5GHz base, DDR5-based

We will see what AMD brings to the table
 

eddman

Senior member
Dec 28, 2010
239
87
101
That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency

A11 does what 15W Intel chips do at 5W or less.

Is it really that bad though? In Geekbench, the 4.5 W i7-7Y75 more or less matches A11 in single-threaded tests, and it's still a 14 nm chip. Obviously it loses in multi-threaded because of A11's 4 companion cores. The upcoming CNL-Y should be a better comparison.

http://browser.geekbench.com/v4/cpu/compare/3753091?baseline=4216188

This is still a great achievement by Apple though. No question about that.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Is it really that bad though? In Geekbench, the 4.5 W i7-7Y75 more or less matches A11 in single-threaded tests, and it's still a 14 nm chip. Obviously it loses in multi-threaded because of A11's 4 companion cores.

This is still a great achievement by Apple though. No question about that.

This is a fantastic achievement. If Intel had these cores, they'd be parading around the country touting its benefits.

Since A11 isn't out for comprehensive benchmarking, let's look at A10X. The A10X is a valid alternative because its performance is essentially the same as the A11's.

-While the Single Core performance may be comparable between A10X and Core M, multi-threading is faster on the A10X and A11.

-A10X goes into a tablet that weighs 1 lb, is 6mm thick, and has no thermal issues. Their A-series chips also have no issues with throttling from top performance. With Core M ones, ever since Broadwell in 2015, we have come to expect a rather significant variation in performance depending on thermal design, set TDP, and code being run. Often, companies have to use the 7W cTDP-up setting to lessen the performance loss from going with Core M.

-Battery life on devices with Apple chips is 40-60% better than on Core M ones. The former is roughly comparable with Atom-based devices while the latter is essentially equal to 15W rated Core chips - TDP does not matter for battery life because modern chips have advanced power management. Core is substantially behind.

-A10X/A11's GPU performance is in the class of expensive eDRAM-equipped Iris parts. The Iris parts are so expensive no one uses them outside of super expensive custom configs or absolute top of the line devices. Also, nothing in that class exists at 5W.

-Technically, A11 Bionic is absolutely awesome. No one outside of Apple has achieved true asynchronous multi-processing. Many companies have come and gone trying to achieve what they have done with A11. It's really hard to get the 4 small cores and 2 big cores in A11 to work in synergy. The controller in the A11 seamlessly manages the transition between the two and essentially acts like a better version of Intel's Hyperthreading. The small cores also allow low-usage power consumption to be better as well, a benefit that Hyperthreading lacks.


Apple achieves the best of both worlds - Atom-like thermals and battery life, along with Core-like performance. They even top that by doing better than Core on graphics. Their multi-threading implementation with 4 small cores is really awesome too. Top-notch in all metrics.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Intel had better hope Apple never gets that degree of efficiency working at higher TDPs.