Discussion Intel current and future Lakes & Rapids thread

which is 160 cores, a crazy high, near outlandish number, and likely near impossible to wire to the I/O die successfully with even tech from two years down the road), which in a 2P system, is 24-28 compute dies, or 192-224 cores.

The simple math ignores a lot of reality, such as 5nm only giving a small benefit in terms of perf/watt, and power being the big limiter of modern MPUs. The clock-down might be big enough that it's not worth all that effort. It's not just clock speeds that are improving at a glacial pace, but voltage scaling too.

Also they might be able to get that many if they use the last generation Zen 3. With Zen 4, this is where it straddles the line of being unreasonable.

Wider cores follow the inverse square law, where the power increase and the number of transistors are roughly equal to the square of the performance improvement - 2x perf gain, 4x area and power.

This is why I also doubt that, without some substantial changes, they can fit enough Golden Coves in 10nm to make Sapphire Rapids. Substantial changes mean things like abandoning the pursuit of 5GHz.
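
To put numbers on the square relationship described above (it resembles what is often called Pollack's rule, where performance scales with roughly the square root of core complexity), here is a minimal Python sketch. The function names and the exponent of 2 are illustrative assumptions taken from the post's rule of thumb, not measured data.

```python
def cost_multiplier(perf_gain: float, exponent: float = 2.0) -> float:
    """Area/power multiplier needed for a given per-core performance gain.

    Assumes the rule of thumb from the post: cost grows with the square
    of the performance improvement (exponent=2.0 is an assumption).
    """
    if perf_gain <= 0:
        raise ValueError("perf_gain must be positive")
    return perf_gain ** exponent


def cores_in_same_budget(baseline_cores: int, perf_gain: float) -> int:
    """How many wider cores fit in the area/power budget of the baseline cores."""
    return int(baseline_cores / cost_multiplier(perf_gain))


if __name__ == "__main__":
    for gain in (1.1, 1.2, 1.5, 2.0):
        print(f"{gain:.1f}x perf -> {cost_multiplier(gain):.2f}x area and power")
```

Under this assumption a 20% wider core costs about 44% more area and power, which is why core width and core count trade off against each other inside a fixed budget.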
 
Where are you seeing that 5nm is only a minor improvement?
 
Remember 20nm? What about 10nm? 5nm is the same. It's slightly more than a half node when you consider how it was 20 years ago and what a full node really was.

I should have said small, but compared to what he's suggesting it's "minor".
 
Definitely not. I'm not sure what gave you that impression, but 5nm is a full node improvement.
 
Sure, if you take the "new" standards it's a full node. But the transistor performance gain is actually the same as from 16 to 10. Back then many skipped 10nm and waited for 7nm. The same happened with 20nm.

It is a 30% power saving at iso-performance and a 15% performance gain at iso-power compared to 7nm, according to TSMC - pretty much within expectation for a new node and certainly not "minor".
 
It's also a non-trivial improvement in circuit density. If they choose to take all 30% of the power improvement AND also move the 12nm I/O die to N7 or some other power-improved node, then there will be enough total package power headroom to accommodate 50% more CCDs while maintaining the current MHz targets. That's 96 cores per package, and, with SMT2, 384 threads in a 2P server, with wider cores and higher IPC than Rome.

That's going to be hard to keep up with, even with ~38 core XCC dies at 10nm+/SF.
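
The arithmetic in this post can be checked with a short Python sketch. The 8 CCDs of 8 cores each are Rome's 64-core configuration and the 0.70 power factor is the TSMC iso-performance claim quoted above; treating every CCD as drawing equal power at unchanged clocks, and all the helper names, are assumptions made for illustration.

```python
CORES_PER_CCD = 8        # cores per compute die, as in Rome
BASELINE_CCDS = 8        # 64-core Rome package
N5_POWER_FACTOR = 0.70   # 30% less power at iso-performance (TSMC claim)


def package_cores(ccds: int, cores_per_ccd: int = CORES_PER_CCD) -> int:
    """Cores in one package."""
    return ccds * cores_per_ccd


def system_threads(ccds: int, smt: int = 2, sockets: int = 2) -> int:
    """Hardware threads in a multi-socket system."""
    return package_cores(ccds) * smt * sockets


def relative_ccd_power(ccds: int, power_factor: float = N5_POWER_FACTOR) -> float:
    """Total CCD power relative to the 8-CCD baseline.

    Assumption: every CCD draws the same power and clocks stay unchanged.
    """
    return ccds * power_factor / BASELINE_CCDS


if __name__ == "__main__":
    ccds = int(BASELINE_CCDS * 1.5)            # 50% more CCDs -> 12
    print(package_cores(ccds))                 # 96
    print(system_threads(ccds))                # 384
    print(f"{relative_ccd_power(ccds):.2f}")   # 1.05
```

With these assumptions the twelve CCDs alone would draw about 5% more than Rome's eight, so the remaining headroom has to come from moving the I/O die off 12nm, which is the second condition the post names.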
 
20nm also sucked for other reasons. Like, it was a hot mess. I had a 20nm chip in an old mobile: it was hot, it was messy.
Outside of the processor, I did like the phone.
 
Icelake-X would be a waste of 10nm capacity. I don't think we will see it.

I still think Icelake-SP is going to be cancelled in the end or very tiny volume only, but if they do bother there's going to be a large percentage of chips with <16 cores. How many of those are they going to be able to actually sell as an SP product? Maybe Xeon-W would be more realistic than doing a short run of Icelake-X.

Better chance of a Sapphire Rapids-X happening.
 
How can you assume that you can have both 50% more cores AND wider and higher IPC SMT2 cores? How much more power are you assigning to each core in your calculation compared to Rome?
 
Isn't he talking about Genoa and not Milan? That seems not unrealistic for Genoa.
 
Milan is N7+, Genoa is 5nm. The shrink will allow wider cores in a smaller CCD. N5 improves power draw, and the extra width per core, while increasing power draw in and of itself, should be coupled with the usual generational improvements in core power draw. Add to that an improved I/O design and process, and the TOTAL PACKAGE power envelope should be able to handle 12 CCDs.
 
@itsmydamnation
That's a flaw in the design from Qualcomm, not in the process node. Can we stop blaming foundries for bad designs from the companies? Same thing now with people trashing SS for Nvidia's upcoming GPUs. Vega was a failure when jumping from 28nm to 14nm FinFET.
 
Yes, we are talking about Genoa. I was asking how much of a power increase he accounts for in the cores themselves - given they are higher IPC and SMT, which is not power neutral.
 
My basic assumption is that the change to Genoa from the preceding design will be essentially a refinement of the existing Zen 3 logical layout, with a transition to the design rules for 5nm. While the change from Zen 2 to Zen 3 is expected to be big (wider core, 8-core CCX, possible support for AVX512 by combining two 256-bit units, etc.), the change from Zen 3 to Zen 4, I believe, will focus on improvements to the existing logical design and not a big tear-up, with the possible exception of adding resources to support SMT4.

With those assumptions, I propose that the big shrink from 7nm to 5nm will allow enough room on the Epyc package for at least 12 CCDs, and, if they make improvements to the package, the substrate, their ability to pack in the pins, and other key points, they could potentially offer a lower-frequency, power-optimized part with 16 CCDs. This assumes that the I/O die also moves to an improved process, which doesn't have to be TSMC N7; it could instead be a trailing node from TSMC or Samsung that improves on GloFo 12LP.

All of those changes can allow the total package power draw of Epyc to fit into the current board and socket envelope.

I want to make it clear: I don't expect parts with more than 12 CCDs from Genoa, I'm just suggesting that it would be possible to make one. I think there is more than enough space and power available to offer 12-CCD Genoas that are optimized around providing maximum cores and threads at decent clock speeds, and to continue to offer 8-CCD Genoas that focus on maximizing clock speeds across 64 cores. The server market, as we all know, is not uniform. Some need maximum threads per rack unit, and others need maximum performance per core or thread. Genoa can potentially serve both parties.
 
TSMC N5's transistor performance is like that of a half node, but at an ~85 decrease in area - that's more like a full node. That's why the N5 node will be used (plus the large reduction in masks/process steps).
 

Tigerlake-based Pentium. This is a huge update because the current Pentium Gold 6405U runs at 2.4 GHz without Turbo. Single-core score up from 600 to 1100 points.

All entries are still based on stepping 1. I wonder if this is already the shipping stepping, with stepping 2 coming with LPDDR5 support some time later.
 
IMO all bad dies and the dies that can't clock high enough

TigerLake in general still has to compete with IceLake-SP, so wafer allocation may not be pretty. There may only be one TigerLake die though (4c) so the proportionate number of failed dice that could be released as Pentiums may be greater than when dealing with something like Comet Lake.
 