Discussion Intel current and future Lakes & Rapids thread


Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
In conclusion, R15 shows a 12% IPC lead for Zen 3 while the newer R20 shows 6.4%. Basically, there will be benchmarks where Cypress Cove performs like Icelake/Willow Cove and there will be cases where it won't. If it clocks to a 4.7GHz all-core turbo it might match the 5800X or very slightly edge it out in some games. It will need a 5.3GHz ST turbo to come close to 5800X ST performance.
If I ever read anything from you about an ES AMD product not being final, a buggy BIOS, slow RAM, or any of the other convenient variables that suggest the final product will perform better, I will conveniently refer you to this post. This is another example of the bias I keep harping about.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
AotS result for Rocket Lake 11700KF 8C/16T with RTX 3080 (no iGPU, base 3.6GHz, 5GHz max turbo)

For comparison, 5600X (6C/12T) with same GPU, same API/preset/resolution/version of the benchmark
 

Gideon

Platinum Member
Nov 27, 2007
2,013
4,992
136
AotS result for Rocket Lake 11700KF 8C/16T with RTX 3080 (no iGPU, base 3.6GHz, 5GHz max turbo)

For comparison, 5600X (6C/12T) with same GPU, same API/preset/resolution/version of the benchmark
Ashes is so ridiculously memory-bound that unless you know the memory speed/timings for both, the comparison is essentially meaningless.
 

shady28

Platinum Member
Apr 11, 2004
2,520
397
126
Out of curiosity I ran Cinebench R23 with my RAM set to SPD instead of XMP. That's 2133; my system runs a 102.5 BCLK, so it is actually at 2180.

Before this I had run it with my RAM at 2933 x 1.025 BCLK = 3006.

With 2133 SPD I scored 7872 multi; with 2933 I scored 8421.

That's a 7% difference just from the RAM going 2133 -> 2933. I'll hazard a guess that 3200 would net at least 8%.

I also ran CPU-Z a few times and the difference was negligible: about 1% on single-core and ~1.5% on multi-core.

This makes sense, as CPU-Z runs a bunch of small benchmarks that can probably reside mostly in cache, while Cinebench works with a larger data set.

2133 Run

[screenshot: Cinebench R23 score, 2133 RAM]

2933 Run



[screenshot: Cinebench R23 score, 2933 RAM]
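
For anyone who wants to sanity-check those percentages, here is a quick back-of-the-envelope calculation; the two scores are the ones quoted above, and the 3200 figure is only a naive linear extrapolation, not a measurement.

Code:
// Quick sanity check of the RAM-scaling numbers quoted above. The two scores
// are the Cinebench R23 multi results from this post; the 3200 number is only
// a naive linear extrapolation, not a measurement.
#include <cstdio>

int main() {
    const double score_2133 = 7872.0;  // SPD (~2180 effective with the 102.5 BCLK)
    const double score_2933 = 8421.0;  // XMP (~3006 effective)

    const double gain_pct = (score_2933 / score_2133 - 1.0) * 100.0;
    std::printf("2133 -> 2933: %.1f%% higher multi score\n", gain_pct);  // ~7.0%

    // Naive linear guess for 3200, purely illustrative.
    const double guess_3200 = gain_pct * (3200.0 - 2133.0) / (2933.0 - 2133.0);
    std::printf("linear guess for 3200: ~%.1f%%\n", guess_3200);         // ~9%
    return 0;
}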
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
And on OG mobile Skylake (HM170), a Lenovo Y700-15ISK:

Cinebench 15 Stock vs Overclocked RAM

[screenshots: stock vs overclocked scores]

Cinebench R20 Stock vs Overclocked RAM

[screenshots: stock vs overclocked scores]


I don't even think 4 cores running at 3.1GHz are going to be as bandwidth-starved as 8 powerful cores, but there's a noticeable improvement in scores.
 
  • Like
Reactions: shady28

shady28

Platinum Member
Apr 11, 2004
2,520
397
126
It looks like more refined ES chips are leaking out now. This just showed up on Ashes:


These scores are sorted from highest FPS to lowest. So for recent benchmarks at crazy_1440P with an RTX 3080 running DirectX 12, the 11700KF has the #3 spot. #1 and #2 are a 9900K and a 5600X; I would assume they are both running OC RAM and probably an OC CPU, possibly an OC GPU as well. OFC we have no idea about the 11700KF setup:

[screenshot: AotS benchmark leaderboard]
 
  • Like
Reactions: lightmanek

Hulk

Diamond Member
Oct 9, 1999
5,118
3,660
136
Intel has been "widening" the front and back end of Core since its inception. Is Willow Cove pretty balanced in this respect, or do you think it would benefit from a wider front or back end?
I realize this is a difficult question with many nuances: what code is running, die size, efficiency constraints, etc.
I guess what I'm asking is: what should Intel's next major revision of Willow Cove concentrate on improving the most?

Seems like Zen 3 is crazy wide on the front and back ends, and a lot of effort has gone into minimizing pipeline stalls. How wide can these things get before the gains become negligible with the current software infrastructure?

Looks like there will be a "race to the finish" as far as process goes. You can only go so small. I'm wondering if there will be a similar IPC race, meaning you can only go so wide and then that's it, the game is over, until a completely new way of doing things from a software and hardware perspective is devised.
 

shady28

Platinum Member
Apr 11, 2004
2,520
397
126
Intel has been "widening" the front and back end of Core since its inception. Is Willow Cove pretty balanced in this respect, or do you think it would benefit from a wider front or back end?
I realize this is a difficult question with many nuances: what code is running, die size, efficiency constraints, etc.
I guess what I'm asking is: what should Intel's next major revision of Willow Cove concentrate on improving the most?

Seems like Zen 3 is crazy wide on the front and back ends, and a lot of effort has gone into minimizing pipeline stalls. How wide can these things get before the gains become negligible with the current software infrastructure?

Looks like there will be a "race to the finish" as far as process goes. You can only go so small. I'm wondering if there will be a similar IPC race, meaning you can only go so wide and then that's it, the game is over, until a completely new way of doing things from a software and hardware perspective is devised.


I remember when new CPUs were introduced back in the '80s and '90s, the previous generation almost instantly became junk. Wait two generations for an upgrade and your old $2000 PC was relegated to a job as a doorstop or footstool.

Now we wait 5 years and argue endlessly about minutiae over a ~20% IPC gain. And we don't even get the full 20%, because the new chips can't clock as high as the old chips. Edit: And that's not even a real IPC increase; it's mostly due to cache rework. To wit, on TPU's 1080p aggregate performance with a 2080 Ti, the difference between a 5800X and an i3-10300 is 5.4%. That is not noticeable to most humans, and all of it can be attributed to the clock speed difference between those two chips (i3 @ 4.4GHz single-core turbo vs 5800X @ 4.7GHz single-core turbo).

It's real clear to me that the future of desktop performance increases is going to be just like it has been for mobile phone SoCs: specialized circuitry with developer libraries & compilers to make use of them for common algorithms. Apple gets that, and Intel is quickly moving that way.

General purpose compute is pretty much done; it's a very 1990s concept at this point.
 
Last edited:

Thunder 57

Diamond Member
Aug 19, 2007
3,805
6,407
136
I remember when new CPUs were introduced back in the '80s and '90s, the previous generation almost instantly became junk. Wait two generations for an upgrade and your old $2000 PC was relegated to a job as a doorstop or footstool.

Now we wait 5 years and argue endlessly about minutiae over a ~20% IPC gain. And we don't even get the full 20%, because the new chips can't clock as high as the old chips. Edit: And that's not even a real IPC increase; it's mostly due to cache rework. To wit, on TPU's 1080p aggregate performance with a 2080 Ti, the difference between a 5800X and an i3-10300 is 5.4%. That is not noticeable to most humans, and all of it can be attributed to the clock speed difference between those two chips (i3 @ 4.4GHz single-core turbo vs 5800X @ 4.7GHz single-core turbo).

It's real clear to me that the future of desktop performance increases is going to be just like it has been for mobile phone SoCs: specialized circuitry with developer libraries & compilers to make use of them for common algorithms. Apple gets that, and Intel is quickly moving that way.

General purpose compute is pretty much done; it's a very 1990s concept at this point.

We all miss Dennard scaling, but it's not coming back. That's why I think AMD is looking to pick up Xilinx, just as Intel did with Altera. Intel dropping 5G had to hurt though, and AMD may be making another ATI mistake. I guess we shall see.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,043
3,831
136
Never miss a chance do you? What if a game developed by nVidia or even (gasp!) wonderful AMD runs better on their respective hardware? Is that "anti-competitive" also?
Way to compare apples to oranges. Since when do NVIDIA and AMD GPUs have the same instruction sets, and when do games specifically disable features/performance for one vendor when both vendors support the exact same instruction? ...Oh wait, that doesn't happen, does it.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
Of course it is! Don't defend anti-competitive behavior.
Why should Intel ensure that their software also runs well on AMD hardware? What incentive do they have to make that happen? This is different from the CPUID-checking Intel compiler shenanigans. Intel spends their own resources to develop high-performance libraries for their own products; they've got no reason to bother optimizing for other vendors' products. It is AMD's prerogative to develop software for their own products, but they take the open-source approach, which, while advantageous in some respects, also means that too often you have to rely on other people with different motivations to do a job that should have been done by AMD in the first place.
Way to compare apples to oranges. Since when do NVIDIA and AMD GPUs have the same instruction sets, and when do games specifically disable features/performance for one vendor when both vendors support the exact same instruction? ...Oh wait, that doesn't happen, does it.
Godfall and Cyberpunk 2077 are examples of ray tracing being a timed exclusive tied to AMD and NVIDIA respectively.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,043
3,831
136
Godfall and Cyberpunk 2077 are examples of ray tracing being a timed exclusive tied to AMD and NVIDIA respectively.
And NV and AMD have completely different instructions within their own ISAs to do ray tracing, so it's still not an apples-to-apples comparison. With GPU drivers being a black box, you 100% have to test every family of GPU to ensure compatibility; with a CPU, you just check at run time whether it supports the instruction set and away you go.
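
To illustrate that last point, this is roughly all the run-time check amounts to on the CPU side. A minimal sketch using the GCC/Clang builtin; the two kernel functions are hypothetical stand-ins for an AVX2 path and a generic fallback in some library.

Code:
// Minimal sketch of a run-time ISA check (GCC/Clang). The kernels are
// hypothetical stand-ins for an AVX2 code path and a generic fallback.
#include <cstdio>

static void kernel_avx2()    { std::puts("using the AVX2 kernel"); }
static void kernel_generic() { std::puts("using the generic kernel"); }

int main() {
    __builtin_cpu_init();  // populate the feature flags queried below
    // Reads CPUID feature bits, not the vendor string, so any x86 CPU that
    // exposes AVX2 takes the fast path.
    if (__builtin_cpu_supports("avx2"))
        kernel_avx2();
    else
        kernel_generic();
    return 0;
}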
 
  • Like
Reactions: Tlh97 and lobz

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
And NV and AMD have completely different instructions within their own ISAs to do ray tracing, so it's still not an apples-to-apples comparison. With GPU drivers being a black box, you 100% have to test every family of GPU to ensure compatibility; with a CPU, you just check at run time whether it supports the instruction set and away you go.
Except that MKL is proprietary software developed by Intel for use with Intel architectures. Even if you disable the CPUID checks on AMD, performance varies wildly depending on the library in use, which means there are specific optimizations developed by Intel for Intel CPUs. Why should Intel spend resources to do the same for AMD CPUs? How can this be anticompetitive?
 

uzzi38

Platinum Member
Oct 16, 2019
2,746
6,653
146
Except that MKL is proprietary software developed by Intel for use with Intel architectures. Even if you disable the CPUID checks on AMD, performance varies wildly depending on the library in use, which means there are specific optimizations developed by Intel for Intel CPUs. Why should Intel spend resources to do the same for AMD CPUs? How can this be anticompetitive?

Why would we care that performance is "wildly varying" when it ranged from no improvement to a massive improvement across all tests? That is to say, as far as I remember there were no major regressions, so what does it matter if performance is "wildly varying"?

Nobody's asking Intel to dedicate resources to maximising performance on Zen-based platforms. They can just disable the CPUID checks and call it a day; nobody would complain.
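
For what it's worth, the workaround most people used for this on older MKL releases was the MKL_DEBUG_CPU_TYPE debug variable, which (as the later posts note) has since been removed. A sketch of how it was typically applied in-process, purely for illustration; exact behaviour depended on the MKL version.

Code:
// Illustration only: the old MKL_DEBUG_CPU_TYPE workaround for older MKL
// releases. Setting it to 5 was widely reported to force the AVX2 dispatch
// path regardless of the CPU vendor string; current MKL ignores it.
#include <cstdlib>
#include <cstdio>

int main() {
    // Must be set before the first MKL call, i.e. before MKL builds its
    // dispatch tables. setenv() is POSIX; on Windows use _putenv_s().
    setenv("MKL_DEBUG_CPU_TYPE", "5", 1);

    // ... MKL calls (e.g. cblas_dgemm) would go here ...
    std::puts("MKL_DEBUG_CPU_TYPE=5 set for this process");
    return 0;
}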
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
Why would we care that performance is "wildly varying" when it ranged from no improvement to a massive improvement across all tests? That is to say, as far as I remember there were no major regressions, so what does it matter if performance is "wildly varying"?

Nobody's asking Intel to dedicate resources to maximising performance on Zen-based platforms. They can just disable the CPUID checks and call it a day; nobody would complain.
You are commenting based on outdated information.
 

RTX

Member
Nov 5, 2020
170
125
116
Why'd the gov give money to a foreign company (TSMC) to build a new fab rather than to a US company (GlobalFoundries or Intel)? Couldn't they build a new one solely for government projects?

 

uzzi38

Platinum Member
Oct 16, 2019
2,746
6,653
146
Why'd the gov give money to a foreign company (TSMC) to build a new fab rather than to a US company (GlobalFoundries or Intel)? Couldn't they build a new one solely for government projects?

Because GF is out of the leading-edge game, and Intel isn't exactly reliable when it comes to delivering leading-edge nodes: 14nm started off bad and took some time to really become good, 10nm slipped for years, and 7nm is slipping.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
Then enlighten me with up-to-date information.
The environment variable workaround is no longer supported; it has been removed. Intel still does a CPUID check, but that can also be bypassed. There are Zen-specific kernels now, though not for all libraries, and the AVX2 kernels are still faster than the Zen kernels. With these workarounds MKL is performant on Zen, but that only applies if you compile your own code; i.e., for commercial software that ships with precompiled libraries, applying the workaround is going to be difficult.
 

RTX

Member
Nov 5, 2020
170
125
116
Because GF is out of the leading-edge game, and Intel isn't exactly reliable when it comes to delivering leading-edge nodes: 14nm started off bad and took some time to really become good, 10nm slipped for years, and 7nm is slipping.
It'd help mitigate some of the supply issues for US companies if they could use their existing fabs for non-gov work and run the gov projects out of new/separate fabs. Perhaps GF could've been granted several billion in R&D money?
 

uzzi38

Platinum Member
Oct 16, 2019
2,746
6,653
146
The environment variable workaround is no longer supported; it has been removed. Intel still does a CPUID check, but that can also be bypassed. There are Zen-specific kernels now, though not for all libraries, and the AVX2 kernels are still faster than the Zen kernels. With these workarounds MKL is performant on Zen, but that only applies if you compile your own code; i.e., for commercial software that ships with precompiled libraries, applying the workaround is going to be difficult.
The environment variable workaround is gone, but it's pretty clear that CPUID-based checks are still done. That very link you provided details them, in fact.

If I'm reading this right, the situation is actually even worse than before, because the previous CPUID checks are still in place, but now you also have the additional step of enforcing which kernels get used instead of the obvious solution of just detecting which ISAs the processor in question supports. Afaik Windows has functions for getting that information, but there are also 3rd-party repos specific to that functionality. For example: GitHub - Mysticial/FeatureDetector: What features does your CPU and OS support?

That being said, I haven't written code to handle similar tasks, I'm a lowly web developer myself. But as far as I can see, there are plenty of ways to enable AVX2 without having to perform specific optimisations for each platform or whatever - nothing that should take anywhere near a significant amount of time or effort.
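
As a concrete sketch of "just detect which ISAs the processor supports": the AVX2 flag lives in CPUID leaf 7, so a library can key its dispatch off that one bit without ever looking at the vendor string. This is a hypothetical example, not how MKL actually does its dispatch.

Code:
// Vendor-neutral feature detection via raw CPUID (x86, GCC/Clang <cpuid.h>).
// AVX2 is reported in CPUID leaf 7, sub-leaf 0, EBX bit 5. A production check
// would also verify OS support for the AVX state (OSXSAVE/XGETBV), omitted here.
#include <cpuid.h>
#include <cstdio>

static bool cpu_has_avx2() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;          // CPUID leaf 7 not available at all
    return (ebx >> 5) & 1u;    // bit 5 = AVX2
}

int main() {
    std::printf("AVX2 kernels %s usable on this CPU\n",
                cpu_has_avx2() ? "are" : "are not");
    return 0;
}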

You're justifying MKL running poorly on non-Intel systems on the grounds that Intel developed the software. From a legal standpoint that holds up just fine, and even from an ethical standpoint nobody should ask Intel to do differently if there were intrinsic optimisations that needed to be done to meaningfully extract more performance. But that's not the case, and the post you just linked shows that perfectly: there were significant gains in performance just from enabling the use of AVX2.