Discussion Intel current and future Lakes & Rapids thread

IntelUser2000 · Oct 17, 2020

It feels like even with October availability, they rushed out Tigerlake. There are driver issues, and the GPU is not being fully utilized in games.

Mash IT also has a dedicated gaming test for the 9310.

mikk · Oct 17, 2020

His Rocket League test looks completely borked: package power is only 11W and both CPU and GPU are well below 100% while running at 53°C. So why is it running at only 11W? There is no logical heat/power limit. It reminds me of the Ivy Bridge HD 4000 8 years ago: there was a "low clock bug" in some games during the first weeks after launch (even on a desktop SKU), and it was a driver bug.

IntelUser2000 · Oct 17, 2020

Actually mikk, if you look at the graphics settings, it shows that Rocket League is capped at 62 fps. You can see from the gameplay that it's basically hitting that limit most of the time.

For a game that plays so well, it's a way to conserve battery life and reduce heat/noise.
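To illustrate why a frame cap alone can explain sub-100% utilization, here is a rough sketch. Only the 62 fps cap comes from the settings screen; the 90 fps uncapped figure and the function name `busy_fraction` are hypothetical, and the model assumes a constant render time per frame with the GPU idling for the rest of each capped interval.

```python
def busy_fraction(cap_fps: float, uncapped_fps: float) -> float:
    """Approximate fraction of each frame interval spent rendering.

    Simplified model: render time per frame is constant, and the
    hardware idles for whatever is left of the capped frame interval.
    """
    if cap_fps <= 0 or uncapped_fps <= 0:
        raise ValueError("frame rates must be positive")
    render_ms = 1000.0 / uncapped_fps  # time needed to render one frame
    budget_ms = 1000.0 / cap_fps       # time available per frame under the cap
    return min(1.0, render_ms / budget_ms)


CAP_FPS = 62.0                    # from the game's graphics settings
HYPOTHETICAL_UNCAPPED_FPS = 90.0  # assumption for illustration only

print(f"frame budget: {1000.0 / CAP_FPS:.1f} ms")
print(f"busy fraction: {busy_fraction(CAP_FPS, HYPOTHETICAL_UNCAPPED_FPS):.0%}")
```

If the hardware could render faster than the cap, it sits idle for part of every frame, so both utilization and package power end up well below their limits.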

Gideon · Oct 18, 2020

naukkis said:
In those extreme MT loads Zen 3 will operate at its base clock of ~3.4 GHz at TDP power level. It might be that Gracemont cores will operate at a higher frequency in those tasks.

Not really. Average clocks of the 3950X are listed in AnandTech's review. With 16 loaded cores it's almost 3.9 GHz.

And even when you ignore clock speed (that will realistically give AMD at least a 10% advantage):

Zen 3 will have at least a 10% IPC lead, realistically more (unless Gracemont is >10% faster than Skylake)
AMD gains ~40% from SMT in such workloads.

I do agree that even among MT workloads many do not scale well to 16-24+ threads. Most consumer stuff (outside rendering) will not scale very well past 10-12 cores and should thus run very competitively on Alder Lake. I just find it unlikely that an 8-core Alder Lake can beat a 16-core Zen 3 in workloads that do scale to 32 threads.
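As a back-of-the-envelope check of the numbers above, here is a minimal sketch that expresses throughput in "Skylake-core equivalents" and asks how much each big core would have to deliver to break even. The 8 big + 8 small layout, the Gracemont-equals-Skylake IPC figure, and perfect scaling are assumptions (rumors, not confirmed specs); the Zen 3 factors are the rough estimates from this post.

```python
# Rough estimates taken from the post (all relative to one Skylake core).
ZEN3_CORES = 16
ZEN3_IPC_VS_SKYLAKE = 1.10   # "at least a 10% IPC lead"
ZEN3_SMT_GAIN = 1.40         # "~40% from SMT"
ZEN3_CLOCK_ADVANTAGE = 1.10  # "at least a 10% advantage"

# Assumptions for illustration only (rumored, unconfirmed).
SMALL_CORES = 8
SMALL_CORE_VS_SKYLAKE = 1.00  # Gracemont at roughly Skylake IPC, same clock
BIG_CORES = 8


def zen3_throughput() -> float:
    """Total Zen 3 throughput in Skylake-core equivalents (perfect scaling)."""
    return (ZEN3_CORES * ZEN3_IPC_VS_SKYLAKE
            * ZEN3_SMT_GAIN * ZEN3_CLOCK_ADVANTAGE)


def big_core_break_even() -> float:
    """Skylake-core equivalents each big core must deliver, SMT included."""
    small_total = SMALL_CORES * SMALL_CORE_VS_SKYLAKE
    return (zen3_throughput() - small_total) / BIG_CORES


print(f"16-core Zen 3: {zen3_throughput():.1f} Skylake-core equivalents")
print(f"break-even per big core: {big_core_break_even():.2f}")
```

Under these assumptions each big core, with SMT, would need to do the work of well over two Skylake cores to match the 16-core part.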

EDIT: Anyway, sorry for the OT, I won't discuss this from this point forward.

majord · Oct 18, 2020

Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4 GHz?

Not saying it's untrue, just that everyone's talking like it's gospel. It's barely going to be what you'd class as "big.LITTLE". If their IPC and clock frequencies are anywhere near what's being proposed by some people in this thread, just make them all Golden Cove!

jpiniero · Oct 18, 2020

majord said:
Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4 GHz?

4 GHz is a stretch, but Gracemont having near-Skylake IPC is what's been rumored.

Not saying it's untrue, just that everyone's talking like it's gospel. It's barely going to be what you'd class as "big.LITTLE". If their IPC and clock frequencies are anywhere near what's being proposed by some people in this thread, just make them all Golden Cove!

Mobile benefits from having the small cores. That's why they're there.

Zucker2k · Oct 18, 2020

Gideon said:
Not really. Average clocks of the 3950X are listed in AnandTech's review. With 16 loaded cores it's almost 3.9 GHz.

And even when you ignore clock speed (that will realistically give AMD at least a 10% advantage):

Zen 3 will have at least a 10% IPC lead, realistically more (unless Gracemont is >10% faster than Skylake)

AMD gains ~40% from SMT in such workloads.

I do agree that even among MT workloads many do not scale well to 16-24+ threads. Most consumer stuff (outside rendering) will not scale very well past 10-12 cores and should thus run very competitively on Alder Lake. I just find it unlikely that an 8-core Alder Lake can beat a 16-core Zen 3 in workloads that do scale to 32 threads.

EDIT: Anyway sorry for the OT, won't discuss this point forward.

If the average under load is 3.875 GHz, then the heaviest loads (AVX) should be even lower, and that's ignoring the extra AVX unit introduced in Zen 3.

Gideon · Oct 18, 2020

Zucker2k said:
If the average under load is 3.875 GHz, then the heaviest loads (AVX) should be even lower, and that's ignoring the extra AVX unit introduced in Zen 3.

The workload AnandTech used for the graph was POV-Ray, which uses AVX2 extensively. Also, Zen 3 might have improved FP units instead of more of them (which IMO is more likely given the smaller gains in Geekbench).

mikk · Oct 18, 2020

Gideon said:
The workload AnandTech used for the graph was POV-Ray, which uses AVX2 extensively.

How much, do you have a source?

Gideon · Oct 18, 2020

mikk said:
How much, do you have a source?

According to AnandTech's Cannon Lake review, out of the AVX2 workloads POV-Ray reduced the frequency the most (to 2.2 GHz).

AVX-512 workloads pushed the clocks even lower, but out of AnandTech's test suite at the time POV-Ray was the hardest hitter for AVX2, and that was also the case for Kaby Lake.

The Cannon Lake processor loses frequency as the cores are loaded, and severely loses frequency when AVX2/AVX512 is applied based on our testing. Comparing that to the Kaby Lake on Intel’s mature 14nm node, it keeps its turbo and only loses a few hundred MHz with AVX2. This part does not have AVX512, which is a one up for the Cannon Lake.

The biggest discrepancy we observed for AVX2 was in our POV-Ray test.

jpiniero · Oct 18, 2020

AVX doesn't really hurt Zen frequency like it does on Intel, so it doesn't matter that much.

IntelUser2000 · Oct 19, 2020

majord said:
Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4 GHz?

I expect a few % better, if anything.

jpiniero said:
AVX doesn't really hurt Zen frequency like it does on Intel, so it doesn't matter that much.

That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell but not so seriously.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Icelake shows nearly no reduction in clocks in 1xAVX-512 mode. Icelake-SP should improve clocks in AVX-512 too.

If for example the original Zen had 2xAVX-512 it would have had to degrade seriously too.

Gideon · Oct 19, 2020

IntelUser2000 said:
That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell but not so seriously.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Icelake shows nearly no reduction in clocks in 1xAVX-512 mode. Icelake-SP should improve clocks in AVX-512 too.

If for example the original Zen had 2xAVX-512 it would have had to degrade seriously too.

This is also why I think that while Gracemont might have even slightly better integer performance than Skylake, it won't have as much FP performance. It just seems odd to add 2x256-bit vector units to a small core; 2x128-bit or 1x256-bit seems a lot more likely IMO (in which case it would have FP performance more comparable to Zen 1).
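To put rough numbers on the vector-width options mentioned above, here is a minimal sketch of peak single-precision FLOPs per cycle. It assumes every unit can issue one fused multiply-add per cycle (counted as 2 FLOPs per lane), which is a simplification; the function name `peak_flops_per_cycle` is made up for this example and nothing here is a confirmed Gracemont spec.

```python
def peak_flops_per_cycle(units: int, width_bits: int,
                         elem_bits: int = 32, fma: bool = True) -> int:
    """Peak FLOPs per cycle for `units` vector units of `width_bits` each.

    Assumes each unit issues one vector op per cycle; an FMA counts as
    two floating-point operations per lane.
    """
    lanes = width_bits // elem_bits
    return units * lanes * (2 if fma else 1)


# Configurations discussed in the post: (number of units, width in bits).
configs = {
    "2x256-bit": (2, 256),
    "2x128-bit": (2, 128),
    "1x256-bit": (1, 256),
}

for name, (units, width) in configs.items():
    print(f"{name}: {peak_flops_per_cycle(units, width)} SP FLOPs/cycle")
```

Under this model the 2x128-bit and 1x256-bit options both land at half the peak of 2x256-bit, which is the gap being argued about.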

I mean, just look at how much die space increased on Zen 2 vs Zen 1 from doubling the SIMD units (and that's with huge shrinks for SRAM, going from 14nm to 7nm). What would be the purpose of wasting die space on that in cores whose whole point is to be small?

Zen 1: [image]

Zen 2: [image]

Besides, if they already had 2x256-bit, they'd probably try to implement AVX-512 support as well.

majord · Oct 19, 2020

IntelUser2000 said:
I expect few % better if anything.

That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell but not so seriously.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Icelake shows nearly no reduction in clocks in 1xAVX-512 mode. Icelake-SP should improve clocks in AVX-512 too.

If for example the original Zen had 2xAVX-512 it would have had to degrade seriously too.

Yeah, look, to be honest, Skylake IPC, at least in a majority of workloads, isn't far-fetched. The last one I benchmarked was Goldmont+, and it was not bad at all, but it was well out of puff as a perf/watt part at its max turbo of 2.7 GHz.
Tremont, who knows? There are very few benchmarks to back up the claims for it over Goldmont+, and I can't even tell what frequency they're maxing out at. It seems lower than Goldmont (I see 1.8 GHz max quoted for the Tremont cores?).

mikk · Oct 19, 2020

majord said:
I can't even tell what frequency they're maxing out at. It seems lower than Goldmont (I see 1.8 GHz max quoted for the Tremont cores?).

Elkhart Lake clocks up to 3.0 GHz: https://ark.intel.com/content/www/us/en/ark/products/codename/128825/elkhart-lake.html#@nofilter

Jasper Lake clocks up to 3.1 GHz (mobile) and 3.3 GHz (desktop) according to rumors: https://hothardware.com/news/intel-10nm-jasper-lakecpu-lineup-leaks

This is all on the mediocre Icelake 10nm node. Gracemont should clock way higher than this.

IntelUser2000 · Oct 19, 2020

@Gideon Goldmont Plus already supports 128-bit SSE. Actually, a fully pipelined 128-bit vector unit started with its predecessor, Goldmont.

You can see from the die shots that the overall die is tiny, never mind the vector units.

Also, while they could choose to support AVX-512, it still needs extra transistors even with the same width.

Intel doesn't call it big little. They call it big/bigger. Tremont/Gracemont is far closer to their bigger cores than little cores are in ARM.

It supports my speculation that the performance of the so-called "little" cores has to be high enough for a transparent user experience and seamless switching between cores in real-world usage scenarios.

LightningZ71 · Oct 19, 2020

Jasper Lake is looking to be quite the little beast for low end mobile. 48EU of Xe architecture iGPU is absolutely nothing to sneeze at and is a massive upgrade as compared to prior generations of the Atom/Celeron/Pentium product. It should easily beat any of the 3CU AMD parts in that market segment so long as the power envelope is sufficient.

Gideon · Oct 19, 2020

IntelUser2000 said:
@Gideon Goldmont Plus already supports 128-bit SSE. Actually fully pipelined 128-bit vector unit started with the predecessor, Goldmont.

You can see from the die shots the overall die is tiny, nevermind the vector units.

Yeah, that's true, but that's not the 2x256-bit AVX2 units that would be required for Skylake-like floating-point performance. While it might happen, I'm not convinced such an increase is justified.

JoeRambo · Oct 19, 2020

Gideon said:
Besides, if they already had 2x256 bit, they'd probably try to implement AVX-512 support as well

I think with AVX512 the problem is not the execution units "area", but rather the PRF size and the instructions themselves.

1) There is a huge jump in register state: instead of 16x256-bit you get 32x512-bit, requiring at least 4 times the area. And given that the PRF is not limited to 32 entries, but rather ~100 to allow for OoO, that is ~100 x 512 bits ≈ 6.4KB of register file, which is a sizable chunk of area.
2) There are instructions that cross the 256-bit "lanes". A good example is permutes of 512-bit registers: you can't really implement them properly with split 2x256 units, as they read and write from any part of the source/destination 512-bit registers. You really need a 512-bit-capable processing unit for those.
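The register arithmetic in point 1 can be checked with a few lines. This is only a sketch of the raw storage: it ignores the AVX-512 mask registers and the cost of read/write ports, and the ~100-entry physical register file is the rough figure from this post, not a measured value. The helper name `register_bytes` is made up for the example.

```python
def register_bytes(count: int, width_bits: int) -> int:
    """Raw storage in bytes for `count` registers of `width_bits` each."""
    return count * width_bits // 8


# Architectural vector register state.
avx2_arch = register_bytes(16, 256)    # 16 x 256-bit registers -> 512 bytes
avx512_arch = register_bytes(32, 512)  # 32 x 512-bit registers -> 2048 bytes

# Physical register file sized for out-of-order execution
# (~100 entries is the post's rough assumption).
prf_512 = register_bytes(100, 512)     # -> 6400 bytes

print(f"AVX2 architectural state:    {avx2_arch} bytes")
print(f"AVX-512 architectural state: {avx512_arch} bytes")
print(f"ratio:                       {avx512_arch / avx2_arch:.0f}x")
print(f"~100-entry 512-bit PRF:      {prf_512} bytes")
```

The raw byte count understates the real cost, since a register file feeding several execution ports needs many read and write ports per entry.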

And there's a big difference between having two independent 256-bit ports and one "unified" port for 512-bit operations, as the latter halves your FP throughput for AVX/SSE2 workloads.

IntelUser2000 · Oct 19, 2020

@Gideon I think we're there now. With Tigerlake, Celerons and Pentiums are getting AVX2 support. AVX2 is something that was introduced with 22nm Haswell. We're 2 process generations ahead.

It'll also fit their Big/Bigger hybrid strategy much better.

@LightningZ71 Jasper Lake is Gen 11, not Xe. They got that wrong.

@JoeRambo It's larger. Still the largest contributor to area is the doubled execution units. You can see from Skylake-SP. We'll see how much larger it is with Icelake-SP.

jpiniero · Oct 19, 2020

Further Exploring The Intel Tiger Lake Core i7-1165G7 Performance On Ubuntu Linux - Phoronix

www.phoronix.com

Phoronix tested the Tiger Lake XPS with Ubuntu 20.10 and it now runs in ST at the peak frequencies. But it also now hits 90+ C during MT loads, causing the MT frequency to drop, leading to worse performance.

dmens · Oct 19, 2020

IntelUser2000 said:
I expect few % better if anything.

That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell but not so seriously.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Icelake shows nearly no reduction in clocks in 1xAVX-512 mode. Icelake-SP should improve clocks in AVX-512 too.

If for example the original Zen had 2xAVX-512 it would have had to degrade seriously too.

LOL. What are you talking about. The AVX frequency penalty for Intel has nothing to do with process. It is Intel's decision to use an on-die VR, along with mediocre power grid design. What makes you think AMD would have had the same problem?

Hitman928 · Oct 19, 2020

jpiniero said:
Further Exploring The Intel Tiger Lake Core i7-1165G7 Performance On Ubuntu Linux - Phoronix

www.phoronix.com

Phoronix tested the Tiger Lake XPS with Ubuntu 20.10 and it now runs in ST at the peak frequencies. But it also now hits 90+ C during MT loads, causing the MT frequency to drop, leading to worse performance.

Poor laptop just can't win for trying, lol.

Shivansps · Oct 19, 2020

LightningZ71 said:
Jasper Lake is looking to be quite the little beast for low end mobile. 48EU of Xe architecture iGPU is absolutely nothing to sneeze at and is a massive upgrade as compared to prior generations of the Atom/Celeron/Pentium product. It should easily beat any of the 3CU AMD parts in that market segment so long as the power envelope is sufficient.

Not much is needed to beat Vega 3; if it beats an HD 630 by 5% you are already matching it.

Hitman928 · Oct 19, 2020

dmens said:
LOL. What are you talking about. The AVX frequency penalty for Intel has nothing to do with process. It is Intel's decision to use an on-die VR, along with mediocre power grid design. What makes you think AMD would have had the same problem?

So you're saying that AVX frequency penalty is a result of IR drop on the chip and voltage droop from the on-die VR? I always thought it was a power limit thing.
