Discussion Intel current and future Lakes & Rapids thread

Hulk · Oct 11, 2021

coercitiv said:
Obviously we can agree to disagree on this, but I hope nobody will act all surprised when it turns out E cores on the desktop are used solely for area efficiency.

Great analysis. At least for me this sentence explains a lot. I mean about the e's being there for area efficiency, not power/compute efficiency.

Zucker2k · Oct 11, 2021

Hulk said:
Great analysis. At least for me this sentence explains a lot. I mean about the e's being there for area efficiency, not power/compute efficiency.

They serve those purposes too. Except for outright single-thread performance, the Gracemont cores should provide more multithreaded performance for the area and power consumed. The latter category is as yet unknown, so don't quote me on that. Hehe

Hulk · Oct 11, 2021

Rethinking this post. I may have made a math error.

I wish benchmarks like CB and CPUz would poll the CPU clocks during the run and report an average clockspeed during the benchmark run. It would make it so much easier to compare IPC (throughput).
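A minimal sketch of what that kind of reporting could look like, assuming the benchmark polls the clock at a fixed interval during the run. The sample values, the score, and the function names here are hypothetical; nothing below comes from a real run or from CB or CPU-Z themselves.

```python
from statistics import mean


def average_clock_ghz(samples_mhz):
    """Average clock over the run, from evenly spaced samples in MHz."""
    if not samples_mhz:
        raise ValueError("need at least one clock sample")
    return mean(samples_mhz) / 1000.0


def score_per_ghz(score, samples_mhz):
    """Score divided by average clock: a rough throughput-per-clock proxy."""
    return score / average_clock_ghz(samples_mhz)


# Hypothetical clock samples, polled once per second during a run.
samples = [4400, 4200, 4000, 4000, 3900]
print(f"average clock: {average_clock_ghz(samples):.2f} GHz")
print(f"score per GHz: {score_per_ghz(4100, samples):.1f}")
```

Dividing by the average rather than the rated turbo clock is the whole point: two chips with the same score but different sustained clocks would then show different per-clock throughput.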

Asterox · Oct 11, 2021

mikk said:
Yes 12400 is more efficient than the 5600x which is also close to 80W package power in other tests. However Intel cannot go as high as Zen 3 with the clock speeds to achieve this efficiency, 4 Ghz vs ~4.5 Ghz in multithread is quite a big difference. Assuming this 12400 test is representative.

It is not that simple, you need more data for that conclusion.

Cinebench performance per watt without precise details (Gamers Nexus) is not enough, especially if you take into account the CPU frequency difference. At the same all-core CPU frequency, the R5 5600X would be even more efficient. An i5 12400 with a 4.4 GHz all-core turbo, well, it would be much less efficient, no doubt.

i5 12400

- 4ghz all core turbo
- 4.4ghz singlecore

R5 5600X

- 4.4-4.5ghz all core turbo
- 4.6ghz singlecore

dullard · Oct 11, 2021

Markfw said:
I reserve all comments until a recognized authority like Anandtech has FULLY reviewed the final product. And actually, I want 3-5 of these before I will comment. First looks say it may be competitive, but at higher power usage. What if it's only in a couple of benchmarks that it wins? What if it fails in most things? What if the power usage is way too high? What if it's a combination of all of these? And at what magnitudes? Let's talk once it's out... And maybe even a week later.

If you want to talk about it later, then why post now? And why post with so many possible negative outcomes for Intel if you want to be neutral and really analyze the data? It seems like you really just wanted to seed bad press about Intel, you wanted to have your say, and then you wanted to try to shut others up claiming that we should wait to talk about it.

Asterox · Oct 11, 2021

Delete post, too much beer.

Zucker2k · Oct 11, 2021

Asterox said:
i5 12400

- 4ghz all core turbo
- 4.4ghz singlecore

R5 5600X

- 4.4-4.5ghz all core turbo
- 4.6ghz singlecore

At the same clock and power consumption, the difference should be even greater than 10%. And how come, all of a sudden, you're now the one talking about Zen 3 inefficiencies at clock frequencies you had no problems with in the past? Oh wait, this is no longer RKL on 14nm++++. What I find funny is that you'd rather take a performance hit in order to claim parity in efficiency. Well, efficiency is all about performance per unit of power consumed, so I don't know why you're grasping at straws.

Abwx · Oct 11, 2021

mikk said:
Yes 12400 is more efficient than the 5600x which is also close to 80W package power in other tests.

5600X is around 70W in Cinebench R20.

Intel Core i9-11900K & i5-11600K "Rocket Lake-S" review: power consumption and overclocking

Core i9-11900K and i5-11600K review: power consumption and overclocking / Power consumption in applications / Power consumption in games

www.computerbase.de

Zucker2k · Oct 11, 2021

Thread Director and W11 scheduler? Not what a lot of naysayers suggested not too long ago.

https://twitter.com/x/status/1446742599193141253

mikk · Oct 11, 2021

Abwx said:
5600X is around 70W in Cinebench R20.

Intel Core i9-11900K & i5-11600K "Rocket Lake-S" review: power consumption and overclocking

Core i9-11900K and i5-11600K review: power consumption and overclocking / Power consumption in applications / Power consumption in games

www.computerbase.de

It's Cinebench R20 not AIDA64 FPU which uses much more power on Intel CPUs. We don't have power consumption numbers from Cinebench but they should be lower. On Zen 3 Cinebench R20 isn't worst case either, Anandtech measured 76W in y-cruncher.

Hitman928 · Oct 11, 2021

Zucker2k said:
Thread Director and W11 scheduler? Not what a lot of naysayers suggested not too long ago.

View attachment 51246

Has anyone compared performance with Rocket or Cometlake between Windows versions?

Abwx · Oct 11, 2021

mikk said:
It's Cinebench R20 not AIDA64 FPU which uses much more power on Intel CPUs. We don't have power consumption numbers from Cinebench but they should be lower. On Zen 3 Cinebench R20 isn't worst case either, Anandtech measured 76W in y-cruncher.

As I said, AIDA FP uses less power than Cinebench. Y-Cruncher uses 256-bit AVX and will consume a little more. At a comparable level there's POV-Ray, since it manages to saturate a core's throughput with a single thread, and above all there's Prime95, which is almost a power virus.

That said, AIDA possibly uses more power than CB if the CPU supports AVX-512.

Zucker2k · Oct 11, 2021

Here's a thought:
Because of the hybrid nature of ADL, any benchmark whose result is not cumulative is not going to look as good compared to previous archs, because of the low clocks of the E cores. In particular, those benches that add up the cores and divide by the same number. Actual work should not reflect this seeming deficiency. I wouldn't pay too much heed to such benchmarks.
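To put rough numbers on that thought, here is a small sketch. The per-core rates are made up (big cores at 1.0, small cores at 0.5, arbitrary units) and the 8+8 layout is only assumed for the example; none of it is measured ADL data.

```python
def cumulative_score(per_core_rates):
    """Total throughput: what actual multithreaded work would see."""
    return sum(per_core_rates)


def per_core_average(per_core_rates):
    """Add up the cores, then divide by the core count."""
    return sum(per_core_rates) / len(per_core_rates)


# Hypothetical per-core throughput in arbitrary units.
homogeneous = [1.0] * 8            # 8 big cores only
hybrid = [1.0] * 8 + [0.5] * 8     # 8 big + 8 small cores

for name, cores in (("8 big", homogeneous), ("8 big + 8 small", hybrid)):
    print(name, cumulative_score(cores), per_core_average(cores))
```

Under these assumed rates the hybrid chip does more total work but shows a lower per-core average, which is the seeming deficiency described above.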

mikk · Oct 11, 2021

Abwx said:
As I said, AIDA FP uses less power than Cinebench. Y-Cruncher uses 256-bit AVX and will consume a little more. At a comparable level there's POV-Ray, since it manages to saturate a core's throughput with a single thread, and above all there's Prime95, which is almost a power virus.

That said, AIDA possibly uses more power than CB if the CPU supports AVX-512.

On my ancient i7-7700K with a fixed 4.2 Ghz which is limited to AVX2...

Cinebench R20= 68W (2100 points)
AIDA64 CPU= 49.5W
AIDA64 FPU= 76W
AIDA64 CPU+FPU= 67-69W

AIDA64 FPU is using more power than Cinebench R20 as I said.
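For anyone who wants the ratios, a quick sketch using only the i7-7700K readings listed above. The 68 W used for CPU+FPU is the midpoint of the quoted 67-69 W range, which is an assumption made for the arithmetic, and the helper names are hypothetical.

```python
# Package power readings from the i7-7700K numbers above, in watts.
power_w = {
    "Cinebench R20": 68.0,
    "AIDA64 CPU": 49.5,
    "AIDA64 FPU": 76.0,
    "AIDA64 CPU+FPU": 68.0,  # assumed midpoint of the quoted 67-69 W range
}


def relative_power(readings, baseline):
    """Each load's power as a ratio of the baseline load's power."""
    base = readings[baseline]
    return {name: watts / base for name, watts in readings.items()}


def points_per_watt(points, watts):
    """Benchmark score per watt of package power."""
    return points / watts


ratios = relative_power(power_w, "Cinebench R20")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x Cinebench R20")
print(f"Cinebench R20: {points_per_watt(2100, 68.0):.1f} points per watt")
```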

insertcarehere · Oct 11, 2021

Abwx said:
As I said, AIDA FP uses less power than Cinebench. Y-Cruncher uses 256-bit AVX and will consume a little more. At a comparable level there's POV-Ray, since it manages to saturate a core's throughput with a single thread, and above all there's Prime95, which is almost a power virus.

Except people have run AIDA64 stress tests on the 5600x, and there's no indication that it either consumes less power there than Cinebench, or that the 5600x consumes significantly less power in that test than what we saw on the 12400 screenshot.

~76w package power (CPU + FPU)

https://ee.ofweek.com/2020-11/ART-8330-2801-30469451_4.html ~98w package power

https://www.expreview.com/76781.html ~85w power

https://news.xfastest.com/review/review-02/87558/amd-ryzen-5000-5950x-5900x-5800x-5600x/ ~75w delta between AIDA64FPU & idle

Makaveli · Oct 11, 2021

On my rig I also have PBO on, so it's not stock.

Aida

Cinebench R20

mikk · Oct 11, 2021

You have to say which AIDA64 stress test you are using; it's meaningless otherwise. AVX stress tests on AMD are not as heavy as on Intel CPUs anyway.

Makaveli · Oct 11, 2021

mikk said:
You have to say which AIDA64 stress test you are using; it's meaningless otherwise. AVX stress tests on AMD are not as heavy as on Intel CPUs anyway.

Zucker2k · Oct 11, 2021

insertcarehere said:
Except people have run AIDA64 stress tests on the 5600x, and there's no indication that it either consumes less power there than Cinebench, or that the 5600x consumes significantly less power in that test than what we saw on the 12400 screenshot.

~76w package power (CPU + FPU)

https://ee.ofweek.com/2020-11/ART-8330-2801-30469451_4.html ~98w package power

https://www.expreview.com/76781.html ~85w power

View attachment 51248

https://news.xfastest.com/review/review-02/87558/amd-ryzen-5000-5950x-5900x-5800x-5600x/ ~75w delta between AIDA64FPU & idle

The 12400 figure is for package power, so 78w vs 98w. This level of efficiency is remarkable.

deasd · Oct 11, 2021

SiSoftware Sandra results keep getting updated by anonymous submitters:

https://twitter.com/x/status/1447502120731643904

Apparently Win11 plays a BIG role here... but I still make the same assumption as before: very likely the small cores didn't work well, if you compare Sandra to other tests... Is the AVX performance of Gracemont crippled compared to older big cores like RKL/CML/Zen 2/3, just for efficiency in legacy code?

And here's a funny read (maybe off-topic), but still very likely related to the earlier Geekbench leaks of ADL: GB now blocks leaks of pre-release hardware.

Geekbench now blocks access to pre-release hardware benchmarks - KitGuru

Over the years, we've seen many, many early benchmark results for unreleased processors all coming f

www.kitguru.net

Zucker2k said:
The 12400 figure is for package power, so 78w vs 98w. This level of efficiency is remarkable.

That 5600X has PBO on, which loaded all cores at 4.6 GHz @ 1.32 V. The same review page also had a 5800X power number of ~123 W, which is in line with @Makaveli's result of ~127 W, also with PBO on.

JoeRambo · Oct 11, 2021

deasd said:
Apparently Win11 plays a BIG role here... but I still make the same assumption as before: very likely the small cores didn't work well, if you compare Sandra to other tests...

It could be as simple as Sandra using the type of workload called "equal parcels": it takes the number of threads, 24 (8 big cores with HT + 8 small), and runs an equally sized computation on each. That works fine on homogeneous architectures, where every thread has the same performance, but fails utterly on a hybrid architecture.
On hybrid, the big cores complete their share first, and what happens next depends on the OS:
1) On a hybrid-unaware OS like W10, the workload is left running on the small cores while the big cores sit idle. This decision looks stupid, but it is in fact optimal on previous architectures: keep the workload wherever it is now to reap the benefits of hot caches, hot branch predictors and so on.
2) On a hybrid-aware OS like W11, the CPU hints to the OS that a small core is currently running a heavy workload that would be a better fit for a big core, so it gets rescheduled there and total "performance" increases because the workload completes sooner.

Of course, due to the inflexible workload, even approach (2) idles the small cores during that final stretch of runtime, and the total potential does not reach 100% like it does in things like CB.
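The scheduling argument above can be sketched as a toy model. Everything here is an assumption for illustration: HT is ignored, there are as many big cores as small ones (so each leftover parcel lands on one freed big core), migration is free, and the 2:1 speed ratio is made up. It is not how Sandra or the Windows scheduler is actually implemented.

```python
def no_migration_time(work, big_speed, small_speed):
    """Case 1: parcels stay put, so the run ends when the slowest core finishes."""
    return work / min(big_speed, small_speed)


def migration_time(work, big_speed, small_speed):
    """Case 2: once big cores finish, each takes over one small core's leftover."""
    t_big_done = work / big_speed
    leftover = work - small_speed * t_big_done
    # Small cores sit idle during this final phase.
    return t_big_done + leftover / big_speed


def ideal_time(work, n_big, n_small, big_speed, small_speed):
    """Perfectly divisible work keeping every core busy until the end."""
    total_work = work * (n_big + n_small)
    return total_work / (n_big * big_speed + n_small * small_speed)


# Hypothetical numbers: 8 big + 8 small cores, big cores twice as fast.
work, big, small = 1.0, 2.0, 1.0
print("hybrid-unaware:", no_migration_time(work, big, small))
print("hybrid-aware:  ", migration_time(work, big, small))
print("ideal:         ", ideal_time(work, 8, 8, big, small))
```

With these assumed speeds the hybrid-aware case finishes sooner than the unaware one, yet still lands short of the ideal time, because the small cores idle while the big cores mop up the leftovers.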

mikk · Oct 11, 2021

Makaveli said:
View attachment 51251

CPU+FPU is lower than FPU only. Untick everything except FPU and check it again.

Makaveli · Oct 11, 2021

mikk said:
CPU+FPU is lower than FPU only. Untick everything except FPU and check it again.

mikk · Oct 11, 2021

Ok thanks, so it's different on AMD. On Zen 3, AIDA64 is not a worst case, but that's not really new. AVX stress tests on Intel CPUs are much heavier.

deasd · Oct 11, 2021

JoeRambo said:
It could be as simple as Sandra using the type of workload called "equal parcels": it takes the number of threads, 24 (8 big cores with HT + 8 small), and runs an equally sized computation on each. That works fine on homogeneous architectures, where every thread has the same performance, but fails utterly on a hybrid architecture.
On hybrid, the big cores complete their share first, and what happens next depends on the OS:
1) On a hybrid-unaware OS like W10, the workload is left running on the small cores while the big cores sit idle. This decision looks stupid, but it is in fact optimal on previous architectures: keep the workload wherever it is now to reap the benefits of hot caches, hot branch predictors and so on.
2) On a hybrid-aware OS like W11, the CPU hints to the OS that a small core is currently running a heavy workload that would be a better fit for a big core, so it gets rescheduled there and total "performance" increases because the workload completes sooner.

Of course, due to the inflexible workload, even approach (2) idles the small cores during that final stretch of runtime, and the total potential does not reach 100% like it does in things like CB.

Sounds like a mess... If true, do Sandra and similar apps have to rewrite their code just for big+little? Win11 isn't enough? Now I see why the 'N+0' i5s are more attractive than anything from i7 upward...
