Discussion Intel current and future Lakes & Rapids thread

Dayman1225 · Sep 28, 2017

https://www.techpowerup.com/237404/intel-sapphire-rapids-micro-architecture-succeeds-tiger-lake

https://www.intel.com/content/www/u...sets/platform-codenames.html#search-0=tinsley

Sapphire Rapids server platform is named Tinsley

Successor to Tiger Lake
Big arch change
Supposedly 7nm (I'm wary of this one)

dangerman1337 · Sep 28, 2017

Is Sapphire Rapids mostly the Process step of Process > Architecture > Optimization? Intel's post-Tiger Lake roadmap seems confusing.

CatMerc · Sep 28, 2017

Sapphire Rapids is supposedly the first major architectural rework (Zen or Pentium M style) since Core was introduced.

IntelUser2000 · Sep 28, 2017

CatMerc said:
Sapphire Rapids is supposedly the first major architectural rework (Zen or Pentium M style) since Core was introduced.

We'll see what performance it brings.

"Revolution" isn't always a good thing. It's only when you can bring about a good change. Oftentimes revolutions and revolutionary things are only done for the sake of being revolutionary. Pentium 4 was revolutionary. Bulldozer architecture was revolutionary.

Also physical limits prohibit revolutionary increases in performance from going on indefinitely. Going from one revolution to another means that much more work. Eventually nature says: STOP!

If you see successful architectures, they are often a very well thought out implementation of ideas already known for years.

Arachnotronic · Sep 28, 2017

CatMerc said:
Sapphire Rapids is supposedly the first major architectural rework (Zen or Pentium M style) since Core was introduced.

There is such a big "TOCK" in the works, but it's not clear when the product intercept point is for the core. Sapphire Rapids could be a good time for it.

raghu78 · Sep 28, 2017

Dayman1225 said:
https://www.techpowerup.com/237404/intel-sapphire-rapids-micro-architecture-succeeds-tiger-lake

https://www.intel.com/content/www/u...sets/platform-codenames.html#search-0=tinsley

Sapphire Rapids server platform is named Tinsley

Successor to Tiger Lake
Big arch change
Supposedly 7nm (I'm wary of this one)

I think we will see Sapphire Rapids in 2021 on 7nm or 10+++, depending on how Intel executes 7nm development. IMO we might see a launch cadence something like this:
Cannonlake - 2018, Icelake - 2019, Tigerlake - 2020, Sapphire Rapids - 2021

crashtech · Sep 28, 2017

It seems that splitting the uarch into two is overdue, one for max efficiency, and one for max performance. They've done a fairly good job of making Core a "Swiss Army Knife," but I'm not sure of the value of trying to have it all with basically one core design anymore.

Arachnotronic · Sep 28, 2017

crashtech said:
It seems that splitting the uarch into two is overdue, one for max efficiency, and one for max performance. They've done a fairly good job of making Core a "Swiss Army Knife," but I'm not sure of the value of trying to have it all with basically one core design anymore.

It's not even about efficiency vs performance, it's about performance/efficiency for each type of workload.

The trade-offs required in a core designed to be used alongside 27+ copies of itself for data center/HPC workloads are very different from the ones that you'll want to make for an ultra-fast, highly efficient client core.

That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

This strategy has been leading to sub-optimal designs all around for a while now, and with the ARMy doing custom ARM cores tailored specifically for server/data center while using the standard Cortex A-series cores for mobile workloads, a significant vulnerability on Intel's part exists.

The divergence is not only desirable but absolutely necessary if Intel is to compete effectively.

crashtech · Sep 28, 2017

Arachnotronic said:
It's not even about efficiency vs performance, it's about performance/efficiency for each type of workload.

The trade-offs required in a core designed to be used alongside 27+ copies of itself for data center/HPC workloads are very different from the ones that you'll want to make for an ultra-fast, highly efficient client core.

That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

This strategy has been leading to sub-optimal designs all around for a while now, and with the ARMy doing custom ARM cores tailored specifically for server/data center while using the standard Cortex A-series cores for mobile workloads, a significant vulnerability on Intel's part exists.

The divergence is not only desirable but absolutely necessary if Intel is to compete effectively.

Yeah, I guess they gave this idea a go with Atom et al, but we all know how that's going. I don't know if a closer derivative of Core can diverge that far and be effective, or perhaps it would have been done already.

Arachnotronic · Sep 28, 2017

crashtech said:
Yeah, I guess they gave this idea a go with Atom et al, but we all know how that's going. I don't know if a closer derivative of Core can diverge that far and be effective, or perhaps it would have been done already.

Atom is a different design point than what a hypothetical "Client Core" would deliver. Atom has to be small/cheap, so the implementation will focus on transistor density rather than on transistor performance (denser metal stack, use of lower performance/lower leakage transistors across the design, etc.). You lose frequency capability when your goal is density & extreme efficiency.

A "Client Core" would not be designed with such a low cost in mind. Basically, a hypothetical Client-only core would probably still be built with high performance/higher leakage transistors, thicker upper metal layers, etc. But such a core roadmap wouldn't see a widening of the vector units from 256b to 512b (for AVX3 compatibility, just gang two AVX256 units together and execute at half throughput) -- even 256 bit units are probably overkill for client workloads, so maybe ganging a bunch of 128b units would be the right choice for such a core.

Heck, just validating these ISA extensions takes away valuable engineering effort that could go into tuning the cores to run common legacy code faster, so don't be surprised if the client cores adopt new ISA extensions much more slowly than the higher end stuff (ala Atom, which still doesn't even have AVX).

Point is, there are so many trade offs that hurt both client and server that come from having to design a single core for both.
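
To make the "gang two AVX256 units together and execute at half throughput" idea above concrete, here is a minimal sketch, not a description of any real Intel or AMD pipeline. It models a 512-bit integer add as two passes through a 256-bit unit. The function names, the lane counts, and the "passes" counter are all assumptions made up for illustration.

```python
# Hypothetical model: a 512-bit vector add executed on 256-bit hardware.
# Lane widths and names are illustrative assumptions, not a real ISA model.
LANES_512 = 16  # 16 x 32-bit lanes = 512 bits
LANES_256 = 8   # 8 x 32-bit lanes = 256 bits


def add_256(a, b):
    """One pass through a 256-bit unit: add 8 lanes of 32-bit integers."""
    assert len(a) == len(b) == LANES_256
    # Mask to 32 bits so each lane wraps around like a hardware register.
    return [(x + y) & 0xFFFFFFFF for x, y in zip(a, b)]


def add_512_split(a, b):
    """Run a 512-bit add as two 256-bit operations (low half, high half).

    Returns the 16-lane result and the number of 256-bit passes consumed.
    """
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])
    high = add_256(a[LANES_256:], b[LANES_256:])
    return low + high, 2


a = list(range(16))
b = [10] * 16
result, passes = add_512_split(a, b)
print(result, passes)
```

Each 512-bit operation consumes two 256-bit slots, so peak 512-bit throughput is half of what native 512-bit units would give, while the result stays identical. That is the compatibility-over-throughput trade the post describes.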

Ajay · Sep 28, 2017

Dayman1225 said:
https://www.techpowerup.com/237404/intel-sapphire-rapids-micro-architecture-succeeds-tiger-lake

https://www.intel.com/content/www/u...sets/platform-codenames.html#search-0=tinsley

Sapphire Rapids server platform is named Tinsley

Successor to Tiger Lake
Big arch change
Supposedly 7nm (I'm wary of this one)

Well, 7nm should be with EUV, which will allow Intel to move away from multi-patterning for at least that node. I would think the odds of hitting 7nm on schedule are much better than what we've seen the past couple of nodes with SADP & SAQP.

Arachnotronic · Sep 28, 2017

Ajay said:
Well, 7nm should be with EUV, which will allow Intel to move away from multi-patterning for at least that node. I would think the odds of hitting 7nm on schedule are much better than what we've seen the past couple of nodes with SADP & SAQP.

True. The foundries are eyeing EUV insert in their 7nm nodes for 2H 2019 mass production, so if EUV is good to go by then, Intel should have an easier (though still by no means easy) time with 7nm.

eddman · Sep 28, 2017

I don't know much about the technical aspects of process nodes. Is it possible that Intel is having such issues compared to Samsung and TSMC because its 10 nm process is supposed to have smaller pitches?

TheGiant · Sep 28, 2017

Well, we have had 5 years without IPC improvement. No wonder Apple is pissed with Intel's current actions.

Next year's desktop CPU should have at least a 5% IPC increase...

mikk · Sep 28, 2017

TheGiant said:
Well, we have had 5 years without IPC improvement. No wonder Apple is pissed with Intel's current actions.

I doubt Ivy Bridge and Kaby Lake have identical IPC.

VirtualLarry · Sep 28, 2017

Arachnotronic said:
A "Client Core" would not be designed with such a low cost in mind. Basically, a hypothetical Client-only core would probably still be built with high performance/higher leakage transistors, thicker upper metal layers, etc. But such a core roadmap wouldn't see a widening of the vector units from 256b to 512b (for AVX3 compatibility, just gang two AVX256 units together and execute at half throughput) -- even 256 bit units are probably overkill for client workloads, so maybe ganging a bunch of 128b units would be the right choice for such a core.

Sounds like... Zen's approach to AVX/AVX2 workloads?

TheGiant · Sep 28, 2017

mikk said:
I doubt Ivy Bridge and Kaby Lake have identical IPC.

Sapphire Rapids is coming in 2020, Skylake was here in 2015; that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?

Dayman1225 · Sep 28, 2017

https://videocardz.com/73045/intel-to-launch-b360-h370-and-h310-chipsets-in-q12018

Nothing about Cannonlake on this roadmap

EDIT: completely missed it's a desktop roadmap

Dayman1225 · Sep 28, 2017

TheGiant said:
Sapphire Rapids is coming in 2020, Skylake was here in 2015; that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?

I believe Icelake is a new core; it's the Architecture step on the PAO scale, probably a few % IPC bump and 10nm+

mikk · Sep 28, 2017

TheGiant said:
Sapphire Rapids is coming in 2020, Skylake was here in 2015; that's 5 years.....

Do you see any architectural changes on the roadmap until Sapphire Rapids?

OK, but this is rubbish because Icelake is coming before Sapphire Rapids.

IntelUser2000 · Sep 28, 2017

Arachnotronic said:
That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency; A11 Bionic CPU cores are designed to run typical client workloads faster, while something like the Kaby Lake core is designed for a very wide range of power envelopes, types of software (e.g. code that benefits from wide vectors as well as code that doesn't), and so on.

We also have to consider that Apple is a much better executed company. There are always people that do better than others in the exact same circumstances. Also, lots of top talent (including from Intel) moved to Apple. The future is brighter for them. This may not be just a technical issue, but something far larger, like company culture and mindset.

Atom is nowhere near A11 Bionic despite being aimed at devices of a similar cost. A11 does what 15W Intel chips do at 5W or less. That's an unprecedented achievement. I don't think Intel ever had that kind of lead over AMD.

TheGiant · Sep 29, 2017

mikk said:
OK, but this is rubbish because Icelake is coming before Sapphire Rapids.

Well, do you really think Icelake will see the light of day before maybe late 2019?

My opinion:

2017 - Q4 2018: CFL 6C
Q4 2018: introduction of 8C desktop; IMO CFL 8C, still on 14nm+++ with 115W TDP
Q4 2019: Icelake, up to 10% IPC, 8C, 4.5GHz base, DDR5-based

We will see what AMD brings to the table

eddman · Sep 29, 2017

Arachnotronic said:
That's one of the reasons Kaby Lake-Y, for example, gets slapped around by an A11 Bionic in terms of efficiency

IntelUser2000 said:
A11 does what 15W Intel chips do at 5W or less.

Is it really that bad though? In geekbench, the 4.5 W i7-7Y75 more or less matches A11 in single-threaded tests, and it's still a 14 nm chip. Obviously it loses in multi-threaded because of A11's 4 companion cores. The upcoming CNL-Y should be a better comparison.

http://browser.geekbench.com/v4/cpu/compare/3753091?baseline=4216188

This is still a great achievement by apple though. No question about that.

IntelUser2000 · Sep 30, 2017

eddman said:
Is it really that bad though? In geekbench, the 4.5 W i7-7Y75 more or less matches A11 in single-threaded tests, and it's still a 14 nm chip. Obviously it loses in multi-threaded because of A11's 4 companion cores.

This is still a great achievement by apple though. No question about that.

This is a fantastic achievement. If Intel had these cores, they'd be parading around the country touting its benefits.

Since A11 isn't out for comprehensive benchmarking, let's look at A10X. The A10X is a valid alternative because the performance is essentially the same as the A11.

-While the Single Core performance may be comparable between A10X and Core M, multi-threading is faster on the A10X and A11.

-A10X goes into a tablet that weighs 1 lb and is 6 mm thick, and has no thermal issues. Their A-series chips also have no issues with throttling from top performance. With Core M ones, ever since Broadwell in 2015, we have come to expect a rather significant variation in performance depending on thermal design, set TDP, and code being run. Often, companies have to use the 7W cTDP-up setting to lessen the performance loss from going Core M.

-Battery life on Apple-chip devices is 40-60% better than on Core M ones. The former is roughly comparable with Atom-based devices, while the latter is essentially equal to devices with 15W-rated Core chips; TDP does not matter for battery life because modern chips have advanced power management. Core is substantially behind.

-A10X/A11's GPU performance is in the class of expensive eDRAM-equipped Iris parts. The Iris parts are so expensive that no one uses them outside of super expensive custom configs or absolute top-of-the-line devices. Also, Iris does not exist at 5W.

-Technically, A11 Bionic is absolutely awesome. No one outside of Apple has achieved true asynchronous multi-processing. Many companies have come and gone trying to achieve what they have done with A11. It's really hard to get the 4 small cores and 2 big cores in A11 to work in synergy. The controller in the A11 seamlessly manages the transition between the two and essentially acts like a better version of Intel's Hyperthreading. The small cores also improve power consumption at low usage, a benefit that Hyperthreading lacks.

Apple achieves the best of both worlds: Atom-like thermals and battery life, along with Core-like performance. They even top that by doing better than Core on graphics. Their multi-threading implementation with 4 small cores is really awesome too. Top-notch in all metrics.

DrMrLordX · Sep 30, 2017

Intel had better hope Apple never gets that degree of efficiency working at higher TDPs.
