My guess is manufacturability.
Density or power improvements for the cache redesign? I mean, is there another reason to go through the trouble?
Rocket Lake isn't monolithic. Intel isn't talking about the Icelake-U 4c yields but they must be awful.
There is no possible universe where TGL-U/Y are not monolithic, it simply makes zero sense financially and technically for Intel to do otherwise.
Oh, but the yields, the yields. And that product has been pushed back until well into 2020.
There are already engineering samples of 38C 10nm ICL-SP XCC dies back from the fabs, no? (Not that they'll actually be able to yield well enough to make a profitable or even competitive product, but at least Intel wants everyone to know that they're attempting this.)
Right, which is an interesting product in and of itself. Considering how late Rocket Lake-U shows up, I'm curious as to why Intel would even need it if monolithic Tigerlake-U yields were projected to be acceptable. I mean really, if they could cut down on the iGPU of Tigerlake to strap on two more cores, then why would they even need Rocket Lake-U at all, since it's only GT1 anyway? They wouldn't. Smart money is that Intel can't just do that, either because their yield equation would go to pot (for whatever reason; maybe the iGPUs are more defect-resistant, though looking back at Cannonlake I doubt it) or because Intel is going to fluff up their 10nm yields some by splitting the iGPU/SoC off into a separate 10nm die (like the one going onto Rocket Lake-U!) so they don't have to go full-blown monolithic for Tigerlake. If the yields on 10nm are still bad enough, there's all the incentive in the world for Intel to produce an EMIBed Tigerlake. They're already producing an EMIBed Rocket Lake-U, so why not?
The only client chips we've seen on leaked roadmaps that even hinted at a multi-chip module strategy are Rocket Lake-U
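To put rough numbers on the "yield equation" argument, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A * D0). Everything in it is an assumption for illustration: the die areas, the defect density `D0`, and the helper names are hypothetical, not Intel figures, and it ignores packaging cost and any assembly yield loss from EMIB.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of defect-free dies under the simple Poisson yield model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_package(die_areas_mm2, defects_per_cm2):
    """mm^2 of wafer area consumed per good package.

    Assumes each die is tested before assembly (known-good die), so a
    defect only scraps the die it lands on, not the whole package.
    """
    return sum(a / poisson_yield(a, defects_per_cm2) for a in die_areas_mm2)


D0 = 1.0  # hypothetical defects per cm^2 for an immature node

monolithic = silicon_per_good_package([150.0], D0)   # one 150 mm^2 die
split = silicon_per_good_package([75.0, 75.0], D0)   # two 75 mm^2 dies

print(f"monolithic: {monolithic:.0f} mm^2 of silicon per good package")
print(f"split:      {split:.0f} mm^2 of silicon per good package")
```

Under these made-up numbers the split design burns noticeably less silicon per good part, which is the incentive being described. The gap shrinks quickly as `D0` falls, so the argument only holds while yields are poor.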
Everything I've read indicates that Rocketlake isn't monolithic. 14nm Willow Cove + 10nm SoC/iGPU (gen12).
RKL-U is a monolithic, 14nm 6+1 die with 6 Willow Cove cores and GT1 Gen12/Xe graphics with up to 32 EUs.
That one is going to be interesting. It's like Piednol finally got what he wanted, albeit far too late. 10c though, woof.
Rocket Lake will also have H and S series parts, and it looks like 6+1 and 10+1 monolithic 14nm dies are planned.
The original Willowcove is referred to as Ultra-wide OoO, whereas the previous cores (Skylake, Palmcove/Cannonlake, Icelake/Sunnycove) are referred to as Super-wide OoO.
Just want to throw that out there again.
No, the reference is to its OoO execution capability.
Was this an official name? That is just so embarrassing.
Where did you hear this? Is the Re-Order Buffer increasing by a substantial amount again? On Sunny Cove it’s already 352.
No, the reference is to its OoO execution capability.
Everything before Willowcove is super-wide OoO, everything that is Willowcove and after is ultra-wide OoO.
Skylake and beyond is a Super-wide OoO x86 core
Willowcove and beyond is an Ultra-wide OoO x86 core
That would be depth, not width. I want to believe Intel is stupid enough to misname something so fundamental, but maybe this one is too much.
Where did you hear this? Is the Re-Order Buffer increasing by a substantial amount again? On Sunny Cove it’s already 352.
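The width-versus-depth distinction can be shown with a toy model. The sketch below is not a model of any Intel core: `simulate`, its parameters, and the latencies are all hypothetical. It assumes every instruction is independent, allocates up to `width` instructions per cycle into a reorder buffer of `rob_size` entries, and retires in order.

```python
from collections import deque


def simulate(latencies, width, rob_size):
    """Cycles needed to retire every instruction in a toy OoO machine.

    latencies: per-instruction execution latency, in program order.
    width:     max instructions allocated and retired per cycle.
    rob_size:  reorder buffer entries (the "depth" of the window).
    """
    rob = deque()          # completion cycle of each in-flight instruction
    next_idx = 0
    retired = 0
    cycle = 0
    while retired < len(latencies):
        cycle += 1
        # Retire in program order: only the oldest entry may leave.
        done = 0
        while rob and done < width and rob[0] <= cycle:
            rob.popleft()
            retired += 1
            done += 1
        # Allocate new instructions while the ROB has free entries.
        issued = 0
        while next_idx < len(latencies) and issued < width and len(rob) < rob_size:
            rob.append(cycle + latencies[next_idx])
            next_idx += 1
            issued += 1
    return cycle


# Four blocks of one long-latency load followed by 99 single-cycle ops.
program = ([200] + [1] * 99) * 4

for width, rob_size in [(4, 32), (8, 32), (4, 512)]:
    print(f"width={width} rob={rob_size}: {simulate(program, width, rob_size)} cycles")
```

In this model, doubling the width with a small ROB barely helps, because the stalled load at the head of the buffer blocks retirement and the window fills up. A deeper ROB lets later loads start while the first one is still outstanding. That is why a bigger Re-Order Buffer is a depth change rather than a width change.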
It was from a LinkedIn profile of someone who was at AMD for Bulldozer/Piledriver and then went to Intel to work on Skylake/Cannonlake (early) and Sunnycove/Willowcove (later), with SKL/SNC/CNL all being Super-wide OoO and WLC being Ultra-wide OoO.
Where did you hear this? Is the Re-Order Buffer increasing by a substantial amount again? On Sunny Cove it’s already 352.
Geez, that would be insane!
My speculation is Willowcove will jump to six reservation stations;
2 computational, 2 two-port store data, 2 load+store AGU parts.
10 ports -> 16 ports
The Lakes use a unified RS design (going all the way back to P6). The output port count increased by 1, maybe 2, from Skylake to Cove; I don't remember exactly.
It was from a LinkedIn profile of someone who was at AMD for Bulldozer/Piledriver and then went to Intel to work on Skylake/Cannonlake (early) and Sunnycove/Willowcove (later), with SKL/SNC/CNL all being Super-wide OoO and WLC being Ultra-wide OoO.
Skylake -> two reservation stations; 1 computation+store data, 1 2load+1store portion
Sunnycove -> four reservation stations; 1 computation, 1 two-port store data, 2 load+store portions
My speculation is Willowcove will jump to six reservation stations;
2 computational, 2 two-port store data, 2 load+store AGU parts.
10 ports -> 16 ports
It doesn't actually stop there... there is also AVX3, which is the enhanced hardware version of AVX512 and which, according to another profile, is planned for Willowcove cores but not Tigerlake cores.
Geez, that would be insane!
https://images.anandtech.com/doci/14514/BackEnd.jpg
The Lakes use a unified RS design (going all the way back to P6). The output port count increased by 1, maybe 2, from Skylake to Cove; I don't remember exactly.
They went to 10 from 8. So it's two more. That picture is misleading btw... the RS is segmented into four uop types, but is not entirely separated like in Atom for example.
As @dmens stated earlier, I thought that width was related to the number of decode and execution units, not to OoO execution; out-of-order execution is purely related to the depth of the pipeline.
It was from a LinkedIn profile of someone who was at AMD for Bulldozer/Piledriver and then went to Intel to work on Skylake/Cannonlake (early) and Sunnycove/Willowcove (later), with SKL/SNC/CNL all being Super-wide OoO and WLC being Ultra-wide OoO.
There are already 4 lanes in AVX512 (2 in AVX). Saying "rather than" implies that there is currently a flat 512-bit lane, which there isn't.
It doesn't actually stop there... there is also AVX3, which is the enhanced hardware version of AVX512 and which, according to another profile, is planned for Willowcove cores but not Tigerlake cores.
EVEX vector length bits in the prefix:
00b: 128bit (XMM)
01b: 256bit (YMM)
10b: 512bit (ZMM)
Rather than executing vertically (1 instruction -> 1 512-bit datapath),
Willowcove could execute horizontally (1 instruction -> 4 128-bit datapaths),
which is known to reduce power consumption.
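As a small illustration of the EVEX vector length encoding listed above, the sketch below maps the two length bits to a register width and counts how many 128-bit lanes that width covers. The function names are made up for this example, and the `11b` encoding is simply treated as reserved. It says nothing about how any particular core actually schedules those lanes.

```python
# EVEX vector length field -> vector width in bits (values from the list above).
EVEX_VECTOR_LENGTH = {
    0b00: 128,  # XMM
    0b01: 256,  # YMM
    0b10: 512,  # ZMM
}


def vector_length_bits(ll):
    """Return the vector width in bits for a 2-bit EVEX length field."""
    if ll not in EVEX_VECTOR_LENGTH:
        raise ValueError(f"reserved or invalid vector length encoding: {ll:#04b}")
    return EVEX_VECTOR_LENGTH[ll]


def lanes_128(ll):
    """Number of 128-bit lanes covered by the encoded vector length."""
    return vector_length_bits(ll) // 128


for ll in (0b00, 0b01, 0b10):
    print(f"{ll:02b}b: {vector_length_bits(ll)} bits, {lanes_128(ll)} x 128-bit lanes")
```

The lane counts line up with the "4 lanes in AVX512, 2 in AVX" point made in the reply.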
https://images.anandtech.com/doci/14514/BackEnd.jpg
According to Intel's slides, Skylake has two RS and Sunnycove has four.
https://images.anandtech.com/doci/13699/Ronak20.jpg
The execution of Intel's FPU can only operate within one mode, while utilizing all resources.
There are already 4 lanes in AVX512 (2 in AVX). Saying "rather than" implies that there is currently a flat 512-bit lane, which there isn't.
Not true. Lanes have always been part of the AVX execution design.
The execution of Intel's FPU can only operate within one mode, while utilizing all resources.
Everything I'm coming up with when I look for AVX3 turns out to be AVX512. So what exactly is this AVX3 you speak of?
The execution of Intel's FPU can only operate within one mode, while utilizing all resources.
SSE128, AVX128, AVX256, AVX512 => The new FPU can run any of them, and it will always be 1-to-1 with the hardware.
SSE128 on AVX512 => Bad
AVX128 on AVX512 => Bad
AVX256 on AVX512 => Bad
The next FPU uses AVX512's VL-encoding like SVE. This is further enhanced within AVX3.
Yes, it's true. It is widely believed that after Skylake, Intel barely designed SunnyCove, and as you can see it has had an x86 Willow Cove core design ready for a long time; it's quite possible that Golden Cove is also at the final stage.
As @dmens
The widely held theory is that while Skylake has had an extended life span, the design team hasn’t just been resting on its laurels and has instead been cranking out new architectures. I hope this is true, as that could mean we are in for a backlog of steady IPC gains in the near future.
Intel will launch its new mainstream desktop series in Q1 2020. The Comet Lake-S, aka 10th Gen Core series, will feature up to 10 cores and 20 threads. The CPUs will be divided into three power tiers: 125W, 65W and 35W.
The new Intel CPUs will require 400-series motherboards as the socket has been changed to LGA1200. Assuming that the information provided by Xfastest is accurate, this would force a change right before switching to a smaller node. Comet Lake is still based on Skylake cores, meaning it is still built on the 14nm fabrication process. The first mainstream desktop architecture to use 10nm is believed to be Ice Lake.
Hmm. Seems like a strange situation. I also wonder how the notebook OEMs are going to deal with all these 14nm products clogging up the channel. It could get kinda ugly.
According to Digitimes, Intel says that prices (the real prices OEMs are paying?) are much higher for Icelake compared to Comet Lake as well.
Just the opposite - I guess the point is that Icelake is extremely limited volume because of the awful yield. Tigerlake will likely just be limited.
Hmm. Seems like a strange situation. I also wonder how the notebook OEMs are going to deal with all these 14nm products clogging up the channel. It could get kinda ugly.