I thought I alone was thinking that. I would be surprised if Rocket Lake-S was released this year, even though it's now realistic they could do it thanks to the demand destruction from the virus.
It could still be Sunny Cove on 14nm. This could get them a nice uplift in ST performance, provided that the clocks do not go down significantly (more than 10%). I still have no idea how Intel thinks that part will be able to compete with Zen 3 desktop parts though. The added bonus of a GPU is nice, but it didn't do much for them in 2019 when they lost share in desktop. Plus, AMD will have a native 8C/16T Zen 3 based APU with RDNA2 in early 2021, so this advantage will be nullified.
Hey, you guys fans of curveballs on your expectations?
Got a great one for you.
Intel wishes that was the only RDNA2 APU in the near future.
Single core CPUs???? Well, actually no.
With that space you can have another Skylake core, so by your math an 8-core Skylake CPU uses the same space as a 6-core Sunny Cove CPU, and the performance increase is actually negative.
Too much concentration on single-core performance in 2020 leaves the market wide open for others to succeed.
Hmm, I’m sure this has nothing to do with how completely Intel 10nm sucks. We have no visibility into what Intel had to do vis-a-vis the design implementation to get the damn CPUs to work at anywhere near decent clocks. The whole 10nm product stack for CPUs is a disaster just based on the monumental screw-up that P1274 was.
It's relevant in context to the claim "Sunny Cove (architecture) is ridiculously complex".
If we compare Sunny Cove to Skylake we see 18-19% extra IPC for 38% more transistors (217M vs ~300M), so it's a really good performance increase for the extra transistors. Not ridiculously complex.
If we compare Sunny Cove to Zen 2 we see about 9% better IPC with 25% fewer transistors (~300M vs ~400M; put the other way, Zen 2 uses about 33% more). Not ridiculously complex either.
Then there's the distinction between complexity and transistor count. More complexity enables fewer transistors for similar performance, which is the opposite of what people here seem to assume. It's a trade-off that I think most if not all CPU manufacturers would prefer, as it generally means less die space on the same process for similar performance.
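For anyone who wants to check the percentages, here is a rough sketch of the arithmetic using only the approximate per-core figures quoted in the post. The comparison against Pollack's rule (the rule of thumb that performance grows roughly with the square root of the transistor budget) is an added yardstick, not something the post itself invokes, and the helper names are made up for illustration.

```python
import math

# Approximate per-core transistor counts (millions) as quoted in the post.
SKYLAKE_M = 217
SUNNY_COVE_M = 300
ZEN2_M = 400


def relative_change(new, old):
    """Fractional change going from `old` to `new` (0.38 means +38%)."""
    return (new - old) / old


# Sunny Cove relative to Skylake: about +38% transistors.
sunny_vs_skylake = relative_change(SUNNY_COVE_M, SKYLAKE_M)

# Sunny Cove relative to Zen 2: 25% fewer transistors...
sunny_vs_zen2 = relative_change(SUNNY_COVE_M, ZEN2_M)

# ...which is the same gap as Zen 2 having about 33% more than Sunny Cove.
zen2_vs_sunny = relative_change(ZEN2_M, SUNNY_COVE_M)

# Assumed yardstick (Pollack's rule): performance ~ sqrt(transistor count).
pollack_expected_gain = math.sqrt(SUNNY_COVE_M / SKYLAKE_M) - 1

print(f"Sunny Cove vs Skylake transistors: {sunny_vs_skylake:+.1%}")
print(f"Sunny Cove vs Zen 2 transistors:   {sunny_vs_zen2:+.1%}")
print(f"Zen 2 vs Sunny Cove transistors:   {zen2_vs_sunny:+.1%}")
print(f"Square-root-rule IPC gain for that budget: {pollack_expected_gain:+.1%}")
```

Under that square-root assumption, a 38% larger core would be expected to gain roughly 17-18%, so the quoted 18-19% IPC uplift is about what the extra transistors should buy.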
Yeah, I will believe it when I see AnandTech (or Tom's or the like) review it. No leaks, no opinions will sway me.
@witeken about Rocket Lake on Twitter:
"AVX-512, +20-30% IPC and >5.0GHz will surely steamroll Ryzen."
This guy is golden
Hey, it's not me who is measuring the size of a single CPU core without taking into account the size of a group of CPU cores and measuring the rest of the package.
I don't usually, but I went to Twitter and read all the replies. Your post is a nice summation of those.
It's like he ate some bad shrooms, crazy tripping
If we compare core transistor counts with the L3 cache taken out, things won't look so radical. 2MB of L3 takes 175M transistors, so after deducting that it leaves about 40M transistors for Skylake, 50M for Zen 2 and 125M for Sunny Cove. L3 cache isn't the problem: it packs lots of transistors into a small area and doesn't consume much power.
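The subtraction behind those numbers can be sketched as follows. The 175M-per-2MB figure is the post's own estimate. The L3 slice sizes per core are assumptions added here (2MB for Skylake and Sunny Cove, 4MB for Zen 2 based on 16MB per 4-core CCX), chosen because they reproduce the post's results.

```python
# The post's estimate: transistors (millions) taken by 2 MB of L3.
L3_M_PER_2MB = 175

# name -> (quoted per-core transistors in millions, assumed MB of L3 per core)
cores = {
    "Skylake": (217, 2),
    "Zen 2": (400, 4),       # assumption: 16 MB L3 shared by a 4-core CCX
    "Sunny Cove": (300, 2),  # assumption: 2 MB L3 slice per core
}


def logic_only(total_m, l3_mb):
    """Transistors (millions) left once the assumed L3 share is deducted."""
    return total_m - L3_M_PER_2MB * l3_mb / 2


remaining = {name: logic_only(total, l3) for name, (total, l3) in cores.items()}

for name, value in remaining.items():
    print(f"{name}: ~{value:.0f}M transistors outside L3")
```

This gives 42M, 50M and 125M, matching the rounded figures in the post. Whether 175M per 2MB is the right deduction is a separate question.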
@mikk It doesn't rule out a chiplet design at all. We basically knew about it from the same sources that leaked the Rocket Lake info. 14+14 or 14+10 doesn't leave much room for imagination.
Keller said core transistors, so you don't have to take out the L3 cache numbers.
Second, you are overestimating the transistor count for the caches. Even with 8T cells and 1 extra bit for ECC, it is still under 150 million. They are likely still using 6T SRAM (8T is for L2) for L3 caches.
Your estimate of 40 million transistors for Skylake is so far off that I suggest you recheck your data. A core-only number like that last existed in the early Pentium 4 days. Even Prescott beats that figure by a wide margin, never mind the far more complex uarchs of today! 6T SRAM with 1 ECC bit is only 120 million transistors.
Another counterargument applies if you think the number includes the L3. Even with 120 million transistors for the L3 cache, that means only about 90 million for Skylake! Again, those are 2004 numbers!
There's also 48 bits of tags for every 64-byte (512-bit) cache line.
That's what Intel gives you: every two cores with 4MB of L3 added for Skylake adds about 435 million transistors to their reported die transistor counts, and that's where the 217 million core transistors figure comes from. It seems lowish for a core, but every switching transistor uses power, and if they don't limit the number of switching logic transistors, power usage will skyrocket. Like with Sunny Cove.....
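The cache figures traded in the posts above can be roughly reproduced by counting storage cells only. This is a back-of-the-envelope sketch under stated assumptions: binary megabytes, 1 ECC bit per byte, 48 tag bits per 64-byte line stored in the same cell type as the data, and nothing counted for decoders, sense amplifiers or control logic. The function and parameter names are hypothetical.

```python
def l3_cell_transistors(size_mib, transistors_per_cell,
                        ecc_bits_per_byte=0, tag_bits_per_line=0,
                        line_bytes=64):
    """Transistors in the SRAM storage cells of a cache (cells only)."""
    size_bytes = size_mib * 1024 * 1024
    data_bits = size_bytes * 8
    ecc_bits = size_bytes * ecc_bits_per_byte
    tag_bits = (size_bytes // line_bytes) * tag_bits_per_line
    return (data_bits + ecc_bits + tag_bits) * transistors_per_cell


# 2 MiB of L3 under the different assumptions discussed in the thread.
plain_6t = l3_cell_transistors(2, 6)
ecc_6t = l3_cell_transistors(2, 6, ecc_bits_per_byte=1)
ecc_tags_6t = l3_cell_transistors(2, 6, ecc_bits_per_byte=1,
                                  tag_bits_per_line=48)
ecc_8t = l3_cell_transistors(2, 8, ecc_bits_per_byte=1)

for label, count in [("6T, data only", plain_6t),
                     ("6T + ECC", ecc_6t),
                     ("6T + ECC + tags", ecc_tags_6t),
                     ("8T + ECC", ecc_8t)]:
    print(f"{label}: {count / 1e6:.1f}M transistors")
```

With these assumptions 2MB of 6T SRAM with ECC lands around 113M, adding the tags brings it to about 123M, and 8T with ECC comes out near 151M (about 144M if a megabyte is taken as 1,000,000 bytes). All of these sit below the 175M estimate.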
He clearly must be taking a pretty strong blue pill. /s
Btw, on a more serious note, I remember well his hilariously bad/wrong predictions about Intel's processes.
Taking away his exaggeration, what’s the more reasonable performance estimate?
Also, what’s the delta to Zen 3? He seems to be comparing to Zen 2, which is not the right comparison.
Actually reasonable claims involve throwing away the possibility of a complete or even close to complete backport out of the window, and if that weren't enough, also throwing some clocks out too.
The claim that "14nm clocks better than 10nm, so Rocket Lake can simultaneously be a complete (or even close to complete) backport and have superior clocks to 10nm products" severely lacks a fundamental understanding of how CPUs are designed.
There's no easier way for me to put it.
uzzi, you're often quite sensible, but I'm not sure what you think is fundamentally missing here. Yes, I think witeken's prediction is wildly optimistic, but fundamentally 10nm hasn't seemed to provide any inherent improvement in clock speed, and if you ignore power consumption and die size, there's not all that much more to it.
Alright, fine then. I can give a more in-depth explanation, but not a full depth rundown. I don't even know if giving this much info is fine, I'm just going to hope it's fine, because the last thing I want is a friend of mine getting in trouble. And more than anything else, I'm sick to death of hearing the 5GHz Willow Cove on 14nm thing.
Some of this info will actually contradict what I've said in the past... or at least seem that way at first. Please read through the full post before jumping to conclusions.
So as I'm sure all are aware, your average CPU is designed with dozens, if not hundreds, of IPs put together. Well, if you wanted to do a straight backport, it would take you between 2 and 6 months, depending on the amount of IP you have to work with, the number of timings you'd have to rework, that kind of thing. But if you did that... well, you'd end up with an absolutely atrocious product.
So if you want to try and get something usable, then you need to do some reworking of those IPs. Each of them will have been validated for the node they're based on, for different degrees of clocks, power, area and durability (not sure if this is the correct word, but I'm sure you get the gist), and now has to be validated for the node you want them to be on. So you'll be playing around with all 4 of those variables to get something you can work with.
But you see, when backporting this IP to another node directly, it won't meet the same targets as on the original node, even in a best-case scenario. There would be a serious deficit in those four categories compared to the original node, and I'm not talking about something that just a straight reduction in density would be able to solve. You first need to find a balance between those things on the new node, but this is complicated by the fact that different nodes are specced to run a different number of logic levels even at the same frequency. To roughly quote: they'd have to do a significant amount of deep reworking of the IPs if they wanted to get that IP to run at the same frequency as 10nm products without something atrocious in the other categories. This is what could take dozens of months to complete.
Even by the end of all this, you'd end up with a product that would require obscene amounts of power just to try and sustain clocks around 4GHz. In fact, to quote them specifically: "With the time this would take to lower then intended frequency an terrible power ( seeing power would be an issue even at a very "low" frequency -> you don't run a cpu at +4GHz consistently for no good reason ) they should screw it and just go samsung / tsmc."
And well, I became certain we weren't looking at a direct backport the second D0cTB said, just a couple of days ago, that Rocket Lake started actual development in 2019. They don't have the time for a complete backport.
This is why I wouldn't suggest you believe the full backport rumours so easily. I won't claim to know what it is directly (though I know a couple of people who were told by the bunny in DMs why they should also think RKL-S isn't a backport; they didn't want to share and I don't plan on probing them for info), but I can say with confidence that it is not a full backport. A full backport would probably be outperformed by Comet Lake.