Discussion Intel current and future Lakes & Rapids thread

mikk · Aug 20, 2020

IntelUser2000 said:
@Edrick Pretty good chance we'll see one. I can't see why not. As to when it'll arrive, server parts are a priority, so maybe summer?

Icelake-X would be a waste of 10nm capacity, I don't think we will see this, it's unlikely.

IntelUser2000 · Aug 20, 2020

LightningZ71 said:
which is 160 cores, a crazy high, near outlandish number, and likely near impossible to wire to the I/O die successfully with even tech from two years down the road), which in a 2P system, is 24-28 compute dies, or 192-224 cores.

The simple math ignores a lot of reality, such as 5nm only giving a small benefit in terms of perf/watt, and power being the big limiter of modern MPUs. The clock-down might be big enough that it's not worth all that effort. It's not just clock speeds that are improving at a glacial pace, but voltage scaling too.

Also they might be able to get that many if they use the last generation Zen 3. With Zen 4, this is where it straddles the line of being unreasonable.

Wider cores follow the inverse square law, where power increase and number of transistors are roughly equal to square of the performance improvement - 2x perf gain, 4x area and power.
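For anyone who wants to plug numbers into that rule of thumb (it is usually called Pollack's rule), here is a minimal Python sketch. It assumes area and power both scale with the square of per-core performance, exactly as stated above; the function names are made up for illustration and real designs only follow this loosely.

```python
import math


def relative_cost_for_speedup(perf_gain: float) -> float:
    """Area/power multiplier needed for a per-core performance multiplier,
    assuming cost grows with the square of performance (rule of thumb only)."""
    return perf_gain ** 2


def relative_perf_for_area(area_gain: float) -> float:
    """Inverse view: per-core performance multiplier bought by an area multiplier."""
    return math.sqrt(area_gain)


if __name__ == "__main__":
    for gain in (1.1, 1.2, 1.5, 2.0):
        cost = relative_cost_for_speedup(gain)
        print(f"{gain:.1f}x perf -> ~{cost:.2f}x area and power")
```

Under this assumption, even a modest per-core gain eats noticeably into the area and power budget that could otherwise go to more cores.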

This is why I also doubt that, without some substantial changes, they can fit enough Golden Coves in 10nm to make Sapphire Rapids. Substantial changes mean things like abandoning the pursuit of 5GHz.

Exist50 · Aug 20, 2020

IntelUser2000 said:
The simple math ignores a lot of reality, such as 5nm only giving a small benefit in terms of perf/watt, and power being the big limiter of modern MPUs. The clock-down might be big enough that it's not worth all that effort. It's not just clock speeds that are improving at a glacial pace, but voltage scaling too.

Also they might be able to get that many if they use the last generation Zen 3. With Zen 4, this is where it straddles the line of being unreasonable.

Wider cores follow the inverse square law, where power increase and number of transistors is roughly equal to square of the performance improvement - 2x perf gain, 4x area and power.

Where are you seeing that 5nm is only a minor improvement?

IntelUser2000 · Aug 20, 2020

Exist50 said:
Where are you seeing that 5nm is only a minor improvement?

Remember 20nm? What about 10nm? 5nm is the same. It's slightly more than a half node when you consider how it was 20 years ago and what a full node really was.

I should have said small, but compared to what he's suggesting it's "minor".

Exist50 · Aug 20, 2020

IntelUser2000 said:
Remember 20nm? What about 10nm? 5nm is the same. It's slightly more than a half node when you consider how it was 20 years ago and what a full node really was.

I should have said small, but compared to what he's suggesting it's "minor".

Definitely not. I'm not sure what gave you that impression, but 5nm is a full node improvement.

IntelUser2000 · Aug 20, 2020

Exist50 said:
Definitely not. I'm not sure what gave you that impression, but 5nm is a full node improvement.

Sure, if you go by the "new" standards it's a full node. But the transistor performance gain is actually the same as from 16 to 10. Back then many skipped 10nm and waited for 7nm. The same happened with 20nm.

Adonisds · Aug 20, 2020

IntelUser2000 said:
Sure, if you go by the "new" standards it's a full node. But the transistor performance gain is actually the same as from 16 to 10. Back then many skipped 10nm and waited for 7nm. The same happened with 20nm.

What makes you say that?

Thala · Aug 20, 2020

IntelUser2000 said:
Sure, if you go by the "new" standards it's a full node. But the transistor performance gain is actually the same as from 16 to 10. Back then many skipped 10nm and waited for 7nm. The same happened with 20nm.

It is 30% power saving at iso performance and 15% performance at iso power compared to 7nm according to TSMC - pretty much within expectation for a new node and certainly not "minor".
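To put the 30% figure in core-count terms, here is a rough Python sketch. It assumes the same core design at the same clocks and a fixed power budget for the cores only, ignoring I/O die and uncore power; the function name is hypothetical.

```python
def cores_at_same_power(power_saving_iso_perf: float) -> float:
    """If each core needs (1 - saving) of its old power at unchanged clocks,
    a fixed core power budget feeds 1 / (1 - saving) times as many cores."""
    if not 0.0 <= power_saving_iso_perf < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return 1.0 / (1.0 - power_saving_iso_perf)


# Figure quoted in the post above (TSMC, N5 vs 7nm, iso-performance).
N5_POWER_SAVING = 0.30

if __name__ == "__main__":
    factor = cores_at_same_power(N5_POWER_SAVING)
    print(f"Same core power budget feeds ~{factor:.2f}x the cores")
```

That works out to roughly 1.43x the cores from the node alone, before any change to the cores themselves or to the rest of the package.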

LightningZ71 · Aug 20, 2020

It's also a non-trivial improvement in circuit density. If they choose to take all 30% of the power improvement AND they also move the 12nm I/O die to N7 or some other power improved node, then there will be enough total package power improvement headroom to accommodate 50% more CCDs while maintaining the current MHz targets. That's 96 cores per package, and, with SMT2, 384 threads in a 2P server, with wider cores and higher IPC than Rome.

That's going to be hard to keep up with, even with ~38 core XCC dies at 10nm+/SF.
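The core and thread counts above can be checked with a short Python sketch. It takes Rome as 8 CCDs of 8 cores each with SMT2 and assumes the proposed part keeps 8 cores per CCD; the `Package` class is just an illustrative helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    ccds: int
    cores_per_ccd: int = 8     # assumed unchanged from Rome
    threads_per_core: int = 2  # SMT2

    @property
    def cores(self) -> int:
        return self.ccds * self.cores_per_ccd

    def threads(self, sockets: int = 1) -> int:
        return self.cores * self.threads_per_core * sockets


rome = Package(ccds=8)
proposed = Package(ccds=rome.ccds * 3 // 2)  # 50% more CCDs

if __name__ == "__main__":
    for name, pkg in (("Rome", rome), ("Proposed", proposed)):
        print(f"{name}: {pkg.ccds} CCDs, {pkg.cores} cores, "
              f"{pkg.threads(sockets=2)} threads in 2P")
```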

itsmydamnation · Aug 21, 2020

IntelUser2000 said:
Sure, if you go by the "new" standards it's a full node. But the transistor performance gain is actually the same as from 16 to 10. Back then many skipped 10nm and waited for 7nm. The same happened with 20nm.

20nm also sucked for other reasons. Like it was a hot mess. I had a 20nm chip in an old mobile. Was hot, was messy.

Sony Xperia Z5 - Full phone specifications

www.gsmarena.com

Outside of the processor I did like the phone.

DrMrLordX · Aug 21, 2020

mikk said:
Icelake-X would be a waste of 10nm capacity, I don't think we will see this, it's unlikely.

It was hard enough for anyone to get a 10980XE. I can't imagine Intel making a similar mistake with IceLake-SP dice.

jpiniero · Aug 21, 2020

mikk said:
Icelake-X would be a waste of 10nm capacity, I don't think we will see this, it's unlikely.

I still think Icelake-SP is going to be cancelled in the end or very tiny volume only, but if they do bother there's going to be a large percentage of chips with <16 cores. How many of those are they going to be able to actually sell as an SP product? Maybe Xeon-W would be more realistic than doing a short run of Icelake-X.

Better chance of a Sapphire Rapids-X happening.

Thala · Aug 21, 2020

LightningZ71 said:
It's also a non-trivial improvement in circuit density. If they choose to take all 30% of the power improvement AND they also move the 12nm I/O die to N7 or some other power improved node, then there will be enough total package power improvement headroom to accommodate 50% more CCDs while maintaining the current MHz targets. That's 96 cores per package, and, with SMT2, 384 threads in a 2P server, with wider cores and higher IPC than Rome.

That's going to be hard to keep up with, even with ~38 core XCC dies at 10nm+/SF.

How can you assume that you can have both 50% more cores AND wider and higher IPC SMT2 cores? How much more power are you assigning to each core in your calculation compared to Rome?

jpiniero · Aug 21, 2020

Thala said:
How can you assume that you can have both 50% more cores AND wider and higher IPC SMT2 cores? How much more power are you assigning to each core in your calculation compared to Rome?

Isn't he talking about Genoa and not Milan? That seems not unrealistic for Genoa.

LightningZ71 · Aug 21, 2020

Milan is N7+, Genoa is 5nm. The shrink will allow wider cores in a smaller CCD. N5 is power draw improved, and the extra width per core, while increasing power draw in and of itself, should be coupled with the usual generational improvements in core power draw. Add to that an improved I/O design and process and the TOTAL PACKAGE power envelope should be able to handle 12 CCDs.

Lodix · Aug 22, 2020

@itsmydamnation
That is a flaw in the design from Qualcomm, not of the process node. Can we stop blaming foundries for bad designs from the companies? Same thing now with people trashing SS for Nvidia's upcoming GPUs. Vega was a failure when jumping from 28nm to 14nm FinFET.

lobz · Aug 22, 2020

Lodix said:
@itsmydamnation
That is a flaw in the design from Qualcomm, not of the process node. Can we stop blaming foundries for bad designs from the companies? Same thing now with people trashing SS for Nvidia's upcoming GPUs. Vega was a failure when jumping from 28nm to 14nm FinFET.

So both AMD and NVIDIA canceled their 20nm GPUs because Qualcomm designed a bad SoC? Come on...

Thala · Aug 22, 2020

jpiniero said:
Isn't he talking about Genoa and not Milan? That seems not unrealistic for Genoa.

Yes, we are talking about Genoa. I was asking how much of a power increase he accounts for in the cores themselves, given they are higher IPC and SMT, which is not power neutral.

LightningZ71 · Aug 22, 2020

My basic assumption is that the change to Genoa from the preceding design will be essentially a refinement of the existing Zen 3 logical design, with a transition to the rules for 5nm. While the change from Zen 2 to Zen 3 is expected to be big (wider core, 8 core CCX, possible support for AVX512 by combining two 256-bit units, etc.), the change from Zen 3 to Zen 4, I believe, will focus on improvements to the existing logical design and not a big tear-up, with the possible exception of adding resources to support SMT4.

With those assumptions, I propose that the big shrink from 7nm to 5nm will allow enough room on the Epyc package for at least 12 CCDs, and, if they make improvements on the package, the substrate, their ability to pack in the pins, and other key points, they could potentially offer a lower frequency, power optimized part with 16 CCDs. This assumes that the I/O die also moves to an improved process, which doesn't have to be TSMC N7, but instead could use a trailing, but improved from GloFo 12lp, node from TSMC or Samsung.

All of those changes can allow the total package power draw of Epyc to fit into the current board and socket envelope.

I want to make it clear, I don't expect parts with more than 12 CCDs from Genoa, I just suggest that it would be possible to make one. I think that there is more than enough space and power available to offer 12 CCD genoas that are optimized around providing maximum cores and threads at decent clock speeds, and continue to offer 8 CCD genoas that focus on maximizing clock speeds in 64 cores. The server market, as we all know, is not uniform. Some need maximum threads per rack unit, and others need maximum performance per core or thread. Genoa can potentially serve both parties.
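The threads-versus-clocks trade-off can be sketched in Python as a per-core power budget under a fixed package envelope. The wattages below are made-up placeholders picked for round numbers, not measurements or specs for any real Epyc part, and the function name is hypothetical.

```python
def watts_per_core(package_w: float, io_die_w: float,
                   ccds: int, cores_per_ccd: int = 8) -> float:
    """Average power left for each core once the I/O die has taken its share."""
    if ccds <= 0 or cores_per_ccd <= 0:
        raise ValueError("need at least one CCD and one core per CCD")
    return (package_w - io_die_w) / (ccds * cores_per_ccd)


# Hypothetical placeholders only, chosen for illustration.
PACKAGE_W = 240.0
IO_DIE_W = 48.0

if __name__ == "__main__":
    for ccds in (8, 12, 16):
        budget = watts_per_core(PACKAGE_W, IO_DIE_W, ccds)
        print(f"{ccds} CCDs ({ccds * 8} cores): ~{budget:.2f} W per core")
```

With these placeholders the per-core budget falls from 3 W to 2 W to 1.5 W, which is the sense in which the 8 CCD part can chase clock speed and the 12 or 16 CCD part would have to run slower.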

eek2121 · Aug 22, 2020

Why are we talking about AMD chips in an Intel thread? Please stay on topic.

Ajay · Aug 23, 2020

IntelUser2000 said:
Remember 20nm? What about 10nm? 5nm is the same. It's slightly more than a half node when you consider how it was 20 years ago and what a full node really was.

I should have said small, but compared to what he's suggesting it's "minor".

TSMC N5's xtor performance is like that of a half node, but with an ~85% increase in logic density, which is more like a full node. That's why the N5 node will be used (plus the large reduction in masks/process steps).

mikk · Aug 26, 2020

https://twitter.com/x/status/1298439604748472320

Tigerlake based Pentium. This is a huge update because the current Pentium Gold 6405U runs at 2.4 GHz without Turbo. Single-core score is up from 600 to 1100 points.
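As a quick check on how big that jump is, here is a small Python sketch using only the two scores mentioned above (the function name is made up):

```python
def uplift_percent(old_score: float, new_score: float) -> float:
    """Percentage gain going from old_score to new_score."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100.0


if __name__ == "__main__":
    gain = uplift_percent(600, 1100)
    print(f"600 -> 1100 points is a ~{gain:.0f}% uplift ({1100 / 600:.2f}x)")
```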

All entries are still based on stepping 1, I wonder if this is the shipping stepping already and stepping 2 comes with LPDDR5 support some time later.

DrMrLordX · Aug 26, 2020

How many 10nm wafers are going to go towards Pentium production under the current circumstances?

TheGiant · Aug 26, 2020

DrMrLordX said:
How many 10nm wafers are going to go towards Pentium production under the current circumstances?

IMO all bad dies and the dies that can't clock high enough
that single core score is as high as 2600X/10400 and that is nice for a pentium
going from 600 to 1100 is a conroe jump at the low end

DrMrLordX · Aug 26, 2020

TheGiant said:
IMO all bad dies and the dies that can't clock high enough

TigerLake in general still has to compete with IceLake-SP, so wafer allocation may not be pretty. There may only be one TigerLake die though (4c) so the proportionate number of failed dice that could be released as Pentiums may be greater than when dealing with something like Comet Lake.
