Discussion Intel current and future Lakes & Rapids thread

uzzi38 · Dec 19, 2020

shady28 said:
Not strictly true, Tom's uses 2933 for stock config on intel, at least during the 5600X review

The sites that used same-speed RAM also tended to get different results from sites like AT that used the chipset's certified JEDEC settings. The win for Zen 3 is *mostly* still there, but it is a lot less clear cut, and in some cases Intel wins.

TPU used exactly the same DDR4-3200 RAM settings, using what is a Zen 2 friendly FlareX CL14 kit, and got raked over the coals for it by the AMD crowd when it showed Comet Lake winning in the aggregate with a 2080 Ti.

One has to wonder what would have happened if someone used an Intel friendly RAM kit, stuck that into a Zen 3, and then ran those benchmarks.

Tom's test setup, using 2933 on Intel, 3200 on Zen for stock :

View attachment 36001

Computerbase.de also uses MFR rated RAM speeds :

View attachment 36004

Lol that's not why TPU "got raked over the coals", they acknowledged themselves their results weren't accurate and were a result of what looked like a driver bug between Zen 3 and Turing.

Also, "Zen 2 friendly RAM kit"? It's a memory kit lol, there's no magic AMD favouring or Intel favouring kits.

Btw, Gamers Nexus uses DDR4-3200 for both AMD and Intel and gets some of the best results for Zen 3 vs Comet Lake. So much so that an overclocked 10900K (including memory and cache OC) can't match stock Zen 3 results.

Exist50 · Dec 19, 2020

dmens said:
Heh. Wait until you see the sustained performance of a ”low power” Rocketlake chip. If anyone even bothers testing it, that is.

As pointed out several times, we know the base clocks and the rough IPC increase. This just reads as denial at this point.

jpiniero · Dec 19, 2020

dmens said:
We will see about that.

Not worse than Comet Lake in perf/w isn't a very ambitious goal.

Det0x · Dec 19, 2020

Hulk said:
Yes that is true. But as you wrote they are squeezing Sunny Cove into 14nm so they could have done it 3 years ago if they had a little foresight. Even at the expense of overall clocks they would be in a better place today with a more efficient architecture even if it were at lower clocks.

They have repeated all of the bad decisions they made with the P4 all over again. Now they have to dig out of that hole. Except this time they don't have a "Core" analog in their back pocket, AMD is all over them, and Apple is in the game along with ARM. Now we'll see what Intel is really made of.

On another thought I don't know if AMD's 7nm parts can all core up to a continuous 4+GHz as reported by HWinfo's Average Effective Clock report.

Can someone with an AMD 7nm part turbo up all cores to 90%+ "Total CPU Usage" (HWinfo) for a few minutes and report "Average Effective Clock?" Also from HWinfo.

I'm starting to think the transistor density is too much at 7nm for all core sustained speeds of 4GHz as reported by HWinfo's Average Effective Clock. We've seen Comet Lake load 10 cores to 4950MHz Average Effective Clock so that's why I'm curious. Something is going on here.

I don't know why, but it is only in Handbrake that Zen acts in the way you describe. In everything else I've found it to boost normally (4.5GHz+ average effective clocks with 100% usage)

Normal boosting in IBT high, prime and cinebench etc

Low effective clock in handbrake

Same settings and same run on everything

dmens · Dec 19, 2020

Exist50 said:
As pointed out several times, we know the base clocks and the rough IPC increase. This just reads as denial at this point.

Heh, and that tells you what exactly about power efficiency? Are you accusing someone of denying your opinion, which in turn is based on rumors? LOL.

There are so many red siren warning signs on this product that any positive rumor regarding it should be treated with extreme caution.

dmens · Dec 19, 2020

jpiniero said:
Not worse than Comet Lake in perf/w isn't a very ambitious goal.

Yeah, I know right? And yet, somehow... well... watch it happen.

Exist50 · Dec 19, 2020

dmens said:
Heh, and that tells you what exactly about power efficiency?

Clocks, IPC, and power consumption can tell you efficiency. I shouldn't have to explain something so basic.

dmens said:
Are you accusing someone of denying your opinion, which in turn is based on rumors? LOL.

While your position seems to be based on literally nothing. I'll take rumors (many of which have proven to be accurate) over that.

dmens said:
There are so many red siren warning signs on this product

Now you're either handwaving, or citing your own denial as its own cause.

dmens · Dec 19, 2020

Exist50 said:
Clocks, IPC, and power consumption can tell you efficiency. I shouldn't have to explain something so basic.

Yes, please explain these basic matters. Tell me how you extrapolate power efficiency from IPC. I'd love to hear it.

While your position seems to be based on literally nothing. I'll take rumors (many of which have proven to be accurate) over that.

Yeah I know... so many years wasted working at Intel on CPU design and micro-architecture... not worth anything at all.

Now you're either handwaving, or citing your own denial as its own cause.

Well, it is not my fault you don't recognize the warning signs. I'll give you one to chew on... ever wonder what the power/perf cost would be cranking the skylake core up to 5 wide decode without the benefit of a gate delay reduction from process? Then you look at the planned frequencies on this thing... and the PL2 power numbers from the spec... doesn't take much to put that together. OK well, it might take an EE degree. Hah. But you know what, enjoy your rumors, if they make you feel better.

Exist50 · Dec 19, 2020

dmens said:
Yes, please explain these basic matters. Tell me how you extrapolate power efficiency from IPC. I'd love to hear it.

Efficiency = perf/power = IPC * frequency / TDP.

As I said, very basic.
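
For readers who want to plug numbers into the definition above, here is a minimal Python sketch. The function names and every input value are illustrative assumptions, not measured or leaked figures. It evaluates a single operating point only, so it says nothing about how IPC, clocks, and power move together under a real workload.

```python
def perf_per_watt(ipc: float, freq_ghz: float, power_w: float) -> float:
    """Efficiency as defined in the post: IPC * frequency / power."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return ipc * freq_ghz / power_w


def relative_efficiency(new: tuple, old: tuple) -> float:
    """Ratio of two (ipc, freq_ghz, power_w) operating points."""
    return perf_per_watt(*new) / perf_per_watt(*old)


# Hypothetical operating points: (relative IPC, GHz, watts).
baseline = (1.00, 3.7, 125.0)   # assumed older core
candidate = (1.18, 3.5, 125.0)  # assumed +18% IPC at a lower clock

ratio = relative_efficiency(candidate, baseline)
print(f"relative perf/W: {ratio:.3f}")
```

The result is only as good as the three inputs, which is the point the two posters disagree on.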

dmens said:
Yeah I know... so many years wasted working at Intel on CPU design and micro-architecture... not worth anything at all.

You have all the reputability of Piednoel at this point. You certainly have not written anything that demonstrates the least bit of architectural knowledge. On the other hand, you've consistently shown that your sole goal on this forum seems to be trash talking Intel. It's trolling, not technical discussion.

dmens said:
I'll give you one to chew on... ever wonder what the power/perf cost would be cranking the skylake core up to 5 wide decode without the benefit of a gate delay reduction from process?

Already taken into account. You're just ignoring the IPC benefit that provides, and thus the ability to hit the same performance at lower clocks and voltage, assuming roughly equal critical path vs CML.

dmens · Dec 19, 2020

Exist50 said:
Efficiency = perf/power = IPC * frequency / TDP.

As I said, very basic.

Let's see... IPC moves around and is highly workload dependent, frequency moves around based on the whims of a PCU, and TDP was rendered utterly meaningless about a decade ago. Nice try though. Pro-tip: all the variables of your "equation" on the right are intertwined. For you to claim to be able to extract efficiency as a function of any of them in isolation is pure fantasy.

You have all the reputability of Piednoel at this point. You certainly have not written anything that demonstrates the least bit of architectural knowledge. On the other hand, you've consistently shown that your sole goal on this forum seems to be trash talking Intel. It's trolling, not technical discussion.

Ooooof, low blow. FWIW, I did see that guy's resume... let's just say that mine is a bit less.... verbose.

Already taken into account. You're just ignoring the IPC benefit that provides, and thus the ability to hit the same performance at lower clocks and voltage, assuming roughly equal critical path vs CML.

Hope you realize the critical gate depth increase is one of the hardest problems to solve when growing the width of the machine. Put it another way: your assumption is absolutely ridiculous.

shady28 · Dec 19, 2020

uzzi38 said:
Lol that's not why TPU "got raked over the coals", they acknowledged themselves their results weren't accurate and were a result of what looked like a driver bug between Zen 3 and Turing.

No, that isn't what they found.

They did a later test with high speed RAM that showed that if you used overclocked RAM, the tables turned. However, their results for their standard test bed of a 2080 Ti + DDR4-3200 remain intact.

Big criticism here is that they could have run even higher speed RAM on the Intel platform.

They stuck with DDR4-3800 to keep things 'even', however that just so happens to be the best performing speed *for AMD* due to AMD's Infinity Fabric on their test box maxing at 1900MHz x 2 = DDR4-3800. Zen could use higher speed RAM, but it would be out of sync with the Infinity Fabric and hence suffer a performance penalty.
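
The sync argument above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the 1900 MHz fabric ceiling is the figure quoted for that test box, the function names are made up, and real chips vary in the fabric clock they can hold.

```python
def fclk_for_1to1(ddr_rate_mts: int) -> float:
    """DDR transfers twice per clock, so the memory clock is half the
    transfer rate. Running 1:1 needs the fabric clock to match it."""
    return ddr_rate_mts / 2


def runs_in_sync(ddr_rate_mts: int, max_fclk_mhz: float = 1900.0) -> bool:
    """True if the kit can run 1:1 under the assumed fabric ceiling."""
    return fclk_for_1to1(ddr_rate_mts) <= max_fclk_mhz


for rate in (3200, 3600, 3800, 4000, 4600):
    mode = "1:1 (in sync)" if runs_in_sync(rate) else "out of sync"
    print(f"DDR4-{rate}: needs FCLK {fclk_for_1to1(rate):.0f} MHz -> {mode}")
```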

*It is not the best performing speed for Intel.* Intel will continue to get a performance boost to 4600 and beyond.

It would have been much more interesting if they had run the Intel rig with the fastest RAM that it would support, rather than artificially limiting it to the max speed the Zen 3 could effectively use.

Here's an example of the results they got.

Note again, they are using 3800 because that is the most effective speed for AMD.

uzzi38 said:
Also, "Zen 2 friendly RAM kit"? It's a memory kit lol, there's no magic AMD favouring or Intel favouring kits.

Btw, Gamers Nexus uses DDR4-3200 for both AMD and Intel and gets some of the best results for Zen 3 vs Comet Lake. So much so that an overclocked 10900K (including memory and cache OC) can't match stock Zen 3 results.

Intel platforms do better with speed, AMD with lower CL, at least on Zen 2. This particular kit is optimized for AMD.
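
The speed versus CL trade-off can be put in numbers with the usual first-word latency and peak bandwidth formulas. This is a rough sketch: only the DDR4-3200 CL14 kit comes from the thread, the other two kits are hypothetical examples, and CAS latency alone does not capture the secondary timings that also affect real performance.

```python
def cas_latency_ns(ddr_rate_mts: int, cl: int) -> float:
    """CAS latency in nanoseconds: CL cycles at a clock of rate/2 MHz."""
    return cl * 2000 / ddr_rate_mts


def peak_bandwidth_gbs(ddr_rate_mts: int, bus_bytes: int = 8) -> float:
    """Theoretical peak per 64-bit channel, in GB/s."""
    return ddr_rate_mts * bus_bytes / 1000


kits = [
    (3200, 14),  # the CL14 kit mentioned in the thread
    (3800, 16),  # hypothetical
    (4000, 19),  # hypothetical
]

for rate, cl in kits:
    print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns, "
          f"{peak_bandwidth_gbs(rate):.1f} GB/s per channel")
```

For the DDR4-3200 CL14 kit this gives 8.75 ns, so a faster kit with looser timings can have more bandwidth and still a higher CAS latency.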

Exist50 · Dec 19, 2020

dmens said:
Let's see... IPC moves around and is highly workload dependent

It's Sunny Cove uarch. We know how it behaves.

dmens said:
frequency moves around based on the whims of a PCU, and TDP was rendered utterly meaningless about a decade ago

Intel's used base clock all-core non-AVX (high) frequency to define TDP for years now. If you claim they've changed that with Rocket Lake, you'll need to offer evidence.

dmens said:
Pro-tip: all the variables of your "equation" on the right are intertwined. For you to claim to be able to extract efficiency as a function of any of them in isolation is pure fantasy.

I literally defined efficiency. If you're just going to ignore the very definition, then you've abandoned even the pretense of honest discussion.

dmens said:
Hope you realize the critical gate depth increase is one of the hardest problems to solve when growing the width of the machine. Put it another way: your assumption is absolutely ridiculous.

Except all available evidence suggests roughly equal max frequency to Comet Lake on effectively the same process. It can't have changed much.

Unless you insist that every rumor/leak we've seen is wrong, in which case it's as I said - denial.

Thunder 57 · Dec 19, 2020

shady28 said:
*It is not the best performing speed for Intel.* Intel will continue to get a performance boost to 4600 and beyond.

Show me a link to some benchmarks. You might get a few fps, but I doubt it would be anything significant.

Intel platforms do better with speed, AMD with lower CL, at least on Zen 2. This particular kit is optimized for AMD. (citation needed)

RAM is RAM. This is nothing more than marketing. It's not like there is RGB "designed" for any certain motherboard brand.

dmens · Dec 19, 2020

Exist50 said:
It's Sunny Cove uarch. We know how it behaves.

Really? Do you know how it behaves when backported to a different, older process?

Intel's used base clock all-core non-AVX (high) frequency to define TDP for years now. If you claim they've changed that with Rocket Lake, you'll need to offer evidence.

That is not remotely close to how it is defined. The base clock is what the part is guaranteed to deliver with all its cores without any turbo gimmicks, and as such, due to die variation, is heavily guard-banded such that normally, running at base clock should get nowhere close to the TDP.

This is because TDP is a wattage limit *recommendation* given to OEMs. As such, it is a loose number that Intel/AMD can and will play games with depending on how much thermal margin they feel like having and still sell the most parts they can. Feeling desperate? Cut the margin and bin out the parts that don’t match up.

I literally defined efficiency. If you're just going to ignore the very definition, then you've abandoned even the pretense of honest discussion.

Yeah well, your definition is flawed to the point of being totally useless, so that is your own fault.

Except all available evidence suggests roughly equal max frequency to Comet Lake on effectively the same process. It can't have changed much.

LOL. Here’s one way you can converge a design with a higher gate depth to the same frequency: you increase the voltage. Guess what happens to power when you do that?
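
A rough way to put a number on the voltage point: to first order, dynamic (switching) power scales as C * V^2 * f. The sketch below assumes that textbook relation, ignores leakage, and uses made-up voltages rather than any real part's figures.

```python
def dynamic_power_ratio(v_new: float, v_old: float,
                        f_new: float = 1.0, f_old: float = 1.0) -> float:
    """Relative dynamic power for the same switched capacitance,
    using the first-order model P ~ C * V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Hypothetical: same frequency, voltage raised from 1.20 V to 1.30 V.
ratio = dynamic_power_ratio(1.30, 1.20)
print(f"dynamic power changes by {100 * (ratio - 1):.1f}%")
```

Under this model an 8% voltage bump at unchanged frequency costs roughly 17% more dynamic power, before any leakage increase.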

Unless you insist that every rumor/leak we've seen is wrong, in which case it's as I said - denial.

Yes yes, I am denying the veracity of your preferred rumors. Guilty as charged. Happy?

Exist50 · Dec 19, 2020

dmens said:
Really? Do you know how it behaves when backported to a different, older process?

Process doesn't change uarch, and IPC is solely a factor of uarch.

dmens said:
That is not remotely close to how it is defined. The base clock is what the part is guaranteed to deliver with all its cores without any turbo gimmicks, and as such, due to die variation, is heavily guard-banded such that normally, running at base clock should get nowhere close to the TDP.

AnandTech Forums: Technology, Hardware, Software, and Deals (www.anandtech.com)

dmens said:
Yeah well, your definition is flawed to the point of being totally useless, so that is your own fault.

Lol, post your definition then. Fairies per unicorn?

dmens said:
LOL. Here’s one way you can converge a design with a higher gate depth to the same frequency: you increase the voltage. Guess what happens to power when you do that?

They can't increase voltage. It goes up to nearly 1.4V on Comet Lake. Not even recommended for sustained overclocks. They've pushed it as far as they can.

dmens said:
Yes yes, I am denying the veracity of your preferred rumors. Guilty as charged. Happy?

Well then what do you think clocks and TDP will be? If you're so insistent that everyone else is wrong, put a stake in the ground.

shady28 · Dec 19, 2020

Thunder 57 said:
Show me a link to some benchmarks. You might get a few fps, but I doubt it would be anything significant.

RAM is RAM. This is nothing more than marketing. It's not like there is RGB "designed" for any certain motherboard brand.

It's the timings that are set for a particular brand, for one. There are many people who have not been able to effectively use RAM set up for Intel on AMD rigs; this used to be a big issue.

As for the benchmarks, you can find them all over.

You could watch this guy as he slowly gimps his 10900K down to the level of most of these benchmark review sites (DDR4-3200 CL 14 / Ring 4GHz) - and loses 21% of his FPS in SOTR.

Or here - this is the effect of just going from DDR4-3800 to 4000 without changing any timings.

Testing 4000MHz RAM: games

Digital Foundry investigates whether RAM speed affects fps for gaming on Intel, and what's more important: frequency vs timings.

www.eurogamer.net

Red bar is DDR4-4000 :

Hulk · Dec 19, 2020

Det0x said:
I don't know why, but it is only in Handbrake that Zen acts in the way you describe. In everything else I've found it to boost normally (4.5GHz+ average effective clocks with 100% usage)

Normal boosting in IBT high, prime and cinebench etc
View attachment 36018 View attachment 36019 View attachment 36022

Low effective clock in handbrake
View attachment 36023

Same settings and same run on everything

This is interesting. Too bad we don't have more data but let's try to analyze this.

We have two 16 core Zen 3 parts, which both have average effective CPU of about 82%.

12 core Zen 3 at 86% and 12 core Zen 2 at 80%.

10 core Comet Lake at 97%

The rest of the scores are from 8 or less core parts and except for 1 result they all are 95% or greater.

The most obvious conclusion is that Handbrake isn't able to use 16 cores/32 threads very effectively. But then we have the 80% usage for the 12 core Zen 3 and the 90% for the 10 core Comet Lake.

Without more scores to analyze I'm stumped.

I noticed that the three results you provided with nearly 100% CPU usage were benchmarks, which are specifically designed to stress all available cores.

Do you have any programs that you use day-to-day that stress the CPU higher than Handbrake?

dmens · Dec 19, 2020

Exist50 said:
Process doesn't change uarch, and IPC is solely a factor of uarch.

Too bad for you, power efficiency is dependent on far more than just uarch.

It's been fun but I have better things to do than to educate dilettantes.

DrMrLordX · Dec 20, 2020

Exist50 said:
It's Sunny Cove uarch. We know how it behaves.

No, it's Cypress Cove. It doesn't seem to behave the same per clock as Sunny Cove (nor Willow Cove), at least comparing GB5 results from IceLake-U to the few Rocket Lake-S leaks we have thus far. We need more data to fully understand what has changed. I think it's safe to say that we will not be seeing a +18% IPC uplift.

Exist50 · Dec 20, 2020

DrMrLordX said:
No, it's Cypress Cove. It doesn't seem to behave the same per clock as Sunny Cove (nor Willow Cove), at least comparing GB5 results from IceLake-U to the few Rocket Lake-S leaks we have thus far. We need more data to fully understand what has changed. I think it's safe to say that we will not be seeing a +18% IPC uplift.

No matter what they call it, it's a Sunny Cove backport. There may be some fudge factor in the numbers from the uncore, but by and large its measured IPC should be in line with any other Sunny Cove implementation.

naukkis · Dec 20, 2020

Exist50 said:
No matter what they call it, it's a Sunny Cove backport. There may be some fudge factor in the numbers from the uncore, but by and large its measured IPC should be in line with any other Sunny Cove implementation.

It's a totally new core; backporting a whole core isn't possible. In silicon processes everything goes in one direction: you can take an old design from an older process and simply shrink it to a new process with little effort, but taking a new design back to an older process is something totally different. "Backporting" to an old process means they have to design a new CPU arch for the older process, which is just as expensive as designing a new arch for a new process, so Intel didn't even consider it until its new processes failed totally.

Rocketlake is a totally desperate move from Intel. But as their Skylake arch is clocked way over its efficiency point, a new, bigger CPU arch for 14nm could be more efficient at some point at high frequency, making at least some sense for some products.

Exist50 · Dec 20, 2020

naukkis said:
It's a totally new core; backporting a whole core isn't possible. In silicon processes everything goes in one direction: you can take an old design from an older process and simply shrink it to a new process with little effort, but taking a new design back to an older process is something totally different. "Backporting" to an old process means they have to design a new CPU arch for the older process, which is just as expensive as designing a new arch for a new process, so Intel didn't even consider it until its new processes failed totally.

Rocketlake is a totally desperate move from Intel. But as their Skylake arch is clocked way over its efficiency point, a new, bigger CPU arch for 14nm could be more efficient at some point at high frequency, making at least some sense for some products.

This is complete nonsense. For most cores, you just take your same RTL (which includes uarch) and redo the backend work to harden it on the new process. Assuming it's not 100% synthesizable, which some are. This works the same for a shrink or a backport.

Intel's difficulty is that they have custom circuits for each process, so they have to replicate them or find replacements when they port. This doesn't change the uarch, however.

naukkis · Dec 20, 2020

Exist50 said:
This is complete nonsense. For most cores, you just take your same RTL (which includes uarch) and redo the backend work to harden it on the new process. Assuming it's not 100% synthesizable, which some are. This works the same for a shrink or a backport.

The hardest part is critical path optimization and wire delay. If they design their uarch for a given process, backporting it to an older process means that structures become bigger and too slow to meet timing targets. It will work, but only at much lower frequencies than it was originally designed for.

So what is needed is a complete redesign of all structures for the older process - a full redesign. And if the design was optimized for a smaller process, there are structures that simply aren't physically possible to replicate in the older process.

For AMD, their designs are fully synthesizable, but there won't be any backports, as their design for 14nm is much better than a direct backport of a 7nm design to 14nm. Even their 7nm design needed some tweaks, like cutting the L1i cache in half, to make physical implementation possible. Think about what they would need to cut out if they wanted to backport that to transistors twice as big.

DrMrLordX · Dec 20, 2020

Exist50 said:
No matter what they call it, it's a Sunny Cove backport. There may be some fudge factor in the numbers from the uncore, but by and large its measured IPC should be in line with any other Sunny Cove implementation.

There is no evidence to indicate that is the case. There is scant evidence to indicate that is not the case. Backporting is not the same as a die shrink.

SAAA · Dec 20, 2020

Ajay said:
Neither of those architectures was designed for 14nm. They both needed the increase in xtor density and power reduction to effectively replace Skylake. Rocket Lake is so late because the engineers had to figure out how to squeeze a modified *Cove architecture onto 14nm dice.

DrMrLordX said:
There is no evidence to indicate that is the case. There is scant evidence to indicate that is not the case. Backporting is not the same as a die shrink.

Besides some changes to make it work on 14 nm, I too don't think there's any architectural difference between Sunny and Cypress Cove cores.
They named it differently as it's a new product mixing gen 12 graphics (from Tiger) and gen 11 cores (from Ice), but there's little indication they would suddenly come up with a new architecture when they didn't bother touching Skylake for over, uh, 6 years.
I mean, surely they could have included some tiny small changes in late Coffee and Comet Lakes rather than doing a new one for a single product, that will last maybe 9 months, right?
