Discussion Intel current and future Lakes & Rapids thread

Page 302

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
It feels like even with October availability, they rushed out Tiger Lake. Driver issues! The GPU is not being fully utilized in games.

Mash IT also has a dedicated gaming test for the 9310.
 

mikk

Diamond Member
May 15, 2012
4,112
2,108
136
His Rocket League test looks completely borked: package power is only 11W and both CPU and GPU are well below 100% while running at 53°. So why is it running at only 11W? There is no logical heat or power limit. It reminds me of the Ivy Bridge HD 4000 eight years ago: there was a "low clock bug" in some games in the first weeks after launch (even on a desktop SKU), and it was a driver bug.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Actually mikk, if you look at the graphics settings, it shows that Rocket League is capped at 62 fps. You can see from the gameplay that it's basically hitting that limit most of the time.

For a game that plays so well, it's a way to conserve battery life and reduce heat/noise.
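
A rough sketch of why a frame cap can show up as low utilization and low package power. The uncapped frame rate below is a made-up number for illustration, not something measured in the video, and the fixed-cost-per-frame model is a simplification.

```python
def capped_utilization(uncapped_fps: float, cap_fps: float) -> float:
    """Approximate fraction of time the GPU is busy under a frame cap.

    Assumes each frame costs a fixed 1/uncapped_fps seconds of work and the
    GPU idles for the rest of the frame interval (a simplification).
    """
    if uncapped_fps <= 0 or cap_fps <= 0:
        raise ValueError("frame rates must be positive")
    # If the cap is above what the GPU can deliver, it is simply fully busy.
    return min(1.0, cap_fps / uncapped_fps)


# Hypothetical: the game could reach 100 fps uncapped but is limited to 62 fps.
print(f"{capped_utilization(100.0, 62.0):.0%}")  # 62%
```

With idle time in every frame, the chip has no reason to run at its power limit, which would fit the 11W reading.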
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
In those extreme MT loads Zen 3 will operate at its base clock of ~3.4GHz at the TDP power level. It might be that Gracemont cores will operate at a higher frequency in those tasks.
Not really. Average clocks of the 3950X are listed in AnandTech's review. With 16 loaded cores it's almost 3.9 GHz.

[Image: 3950X RHP Freq2_575px.png]


And even when you ignore clock speed (that will realistically give AMD at least a 10% advantage):
  • Zen 3 will have at least a 10% IPC lead, realistically more (unless Gracemont is >10% faster than Skylake)
  • AMD gains ~40% from SMT in such workloads.

I do agree that even among MT workloads many do not scale well to 16-24+ threads. Most consumer stuff (outside rendering) will not scale very well past 10-12 cores and should thus run very competitively on Alder Lake. I just find it unlikely that an 8-core Alder Lake can beat 16 Zen 3 cores in workloads that do scale to 32 threads.
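
Taking the three estimates above at face value, the per-core gap can be multiplied out. This is only a sketch: the factors are the rough guesses from this post, multiplying them assumes they are independent, and the baseline is a hypothetical Gracemont-class core at Skylake IPC without SMT.

```python
def per_core_advantage(clock_ratio: float, ipc_ratio: float, smt_gain: float) -> float:
    """Relative throughput of one SMT core versus one baseline core.

    clock_ratio and ipc_ratio are multipliers (1.10 means 10% higher);
    smt_gain is the extra throughput from the second thread (0.40 means +40%).
    """
    return clock_ratio * ipc_ratio * (1.0 + smt_gain)


adv = per_core_advantage(clock_ratio=1.10, ipc_ratio=1.10, smt_gain=0.40)
print(round(adv, 2))       # 1.69
# 16 such cores, expressed in baseline-core equivalents.
print(round(16 * adv, 1))  # 27.1
```

Under those assumptions, 16 Zen 3 cores would behave like roughly 27 baseline cores in a workload that scales to 32 threads.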

EDIT: Anyway, sorry for the OT, I won't discuss this point further.
 

majord

Senior member
Jul 26, 2015
433
523
136
Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4GHz?

Not saying it's untrue, just everyone's talking like it's gospel. It's barely going to be what you'd class as "big.LITTLE" if their IPC and clock frequencies are anywhere near what's being proposed by some people in this thread. Just make them all Golden Cove!
 

jpiniero

Lifer
Oct 1, 2010
14,510
5,159
136
Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4GHz?

4 GHz is a stretch, but Gracemont having near-Skylake IPC is what's been rumored.

Not saying it's untrue, just everyone's talking like it's gospel. It's barely going to be what you'd class as "big.LITTLE" if their IPC and clock frequencies are anywhere near what's being proposed by some people in this thread. Just make them all Golden Cove!

Mobile benefits from having the small cores, which is why they're there.
 

Zucker2k

Golden Member
Feb 15, 2006
1,810
1,159
136
Not really. Average clocks of the 3950X are listed in AnandTech's review. With 16 loaded cores it's almost 3.9 GHz.

[Image: 3950X RHP Freq2_575px.png]


And even when you ignore clock speed (that will realistically give AMD at least a 10% advantage):
  • Zen 3 will have at least a 10% IPC lead, realistically more (unless Gracemont is >10% faster than Skylake)
  • AMD gains ~40% from SMT in such workloads.

I do agree that even among MT workloads many do not scale well to 16-24+ threads. Most consumer stuff (outside rendering) will not scale very well past 10-12 cores and should thus run very competitively on Alder Lake. I just find it unlikely that an 8-core Alder Lake can beat 16 Zen 3 cores in workloads that do scale to 32 threads.

EDIT: Anyway, sorry for the OT, I won't discuss this point further.
If the average under load is 3.875GHz, then the heaviest loads (AVX) should be even lower, and that's ignoring the extra AVX unit introduced in Zen 3.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
How much, do you have a source?
According to AnandTech's Cannon Lake review, out of the AVX2 workloads POV-Ray reduced the frequency the most (to 2.2 GHz).

[Image: Freq POV-Ray_575px.png]


AVX-512 workloads reduced the clocks even further, but in AnandTech's test suite at the time POV-Ray was the hardest hitter for AVX2, and the same held for Kaby Lake.

The Cannon Lake processor loses frequency as the cores are loaded, and severely loses frequency when AVX2/AVX512 is applied based on our testing. Comparing that to the Kaby Lake on Intel’s mature 14nm node, it keeps its turbo and only loses a few hundred MHz with AVX2. This part does not have AVX512, which is a one up for the Cannon Lake.


The biggest discrepancy we observed for AVX2 was in our POV-Ray test.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Is there any actual evidence of the supposed Skylake IPC from Gracemont? Let alone it scaling to anywhere near 4GHz?

I expect a few % better, if anything.

AVX doesn't really hurt Zen frequency like it does on Intel, so it doesn't matter that much.

That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell, but it was not as severe.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Ice Lake shows nearly no reduction in clocks in 1x AVX-512 mode. Ice Lake-SP should improve clocks in AVX-512 too.

If, for example, the original Zen had 2x AVX-512, it would have had to downclock seriously too.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell, but it was not as severe.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Ice Lake shows nearly no reduction in clocks in 1x AVX-512 mode. Ice Lake-SP should improve clocks in AVX-512 too.

If, for example, the original Zen had 2x AVX-512, it would have had to downclock seriously too.
This is also why I think that while Gracemont might have even slightly better integer performance than Skylake, it won't have as much FP performance. It just seems odd to add 2x 256-bit vector units to a small core;
2x 128-bit or 1x 256-bit seems a lot more likely IMO (which would make its FP performance more comparable to Zen 1).
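
To put rough numbers on those options, here is a sketch of peak single-precision FLOPs per cycle for each vector configuration. It assumes every unit is FMA-capable and counts an FMA as two operations; separate add pipes, port sharing and clock speed are ignored, so this is an upper bound and not a performance prediction.

```python
def peak_fp32_flops_per_cycle(units: int, width_bits: int, fma: bool = True) -> int:
    """Peak FP32 operations per cycle for `units` vector pipes of `width_bits`."""
    lanes = width_bits // 32         # 32-bit elements per vector
    ops_per_lane = 2 if fma else 1   # an FMA is one multiply plus one add
    return units * lanes * ops_per_lane


configs = {"2x 256-bit": (2, 256), "2x 128-bit": (2, 128), "1x 256-bit": (1, 256)}
for name, (units, width) in configs.items():
    print(name, peak_fp32_flops_per_cycle(units, width))
# 2x 256-bit 32
# 2x 128-bit 16
# 1x 256-bit 16
```

Both of the smaller options come out at half the peak of 2x 256-bit, which is the gap being argued about here.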

I mean, just look at how much die space increased on Zen 2 vs Zen 1 from doubling the SIMD units (and that's with huge shrinks for SRAM, going from 14nm to 7nm). What would be the purpose of wasting die space on that for cores whose whole point is to be small?

Zen 1:
[Image: 5017e5_727ef30243c5476bb4804714f2cfa348~mv2.webp]


Zen 2:
[Image: 5017e5_982e0e47d7c04dd6936d28007c655c09~mv2.webp]

Besides, if they already had 2x 256-bit, they'd probably try to implement AVX-512 support as well.
 

majord

Senior member
Jul 26, 2015
433
523
136
I expect a few % better, if anything.



That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell, but it was not as severe.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Ice Lake shows nearly no reduction in clocks in 1x AVX-512 mode. Ice Lake-SP should improve clocks in AVX-512 too.

If, for example, the original Zen had 2x AVX-512, it would have had to downclock seriously too.


Yeah look, to be honest, Skylake IPC, at least in a majority of workloads, isn't far-fetched. The last one I benchmarked was Goldmont+, and it was not bad at all, but it was well out of puff as a perf/watt part at its max turbo of 2.7GHz.
Tremont, who knows; there are very few benchmarks to back up the claims for it over Goldmont+, and I can't even tell what frequency they're maxing out at. It seems lower than Goldmont (I see 1.8GHz max quoted for the Tremont cores?)
 

mikk

Diamond Member
May 15, 2012
4,112
2,108
136
I can't even tell what frequency they're maxing out at. It seems lower than Goldmont (I see 1.8GHz max quoted for the Tremont cores?)


Elkhart Lake clocks up to 3.0 GHz: https://ark.intel.com/content/www/us/en/ark/products/codename/128825/elkhart-lake.html#@nofilter

Jasper Lake clocks up to 3.1 GHz (mobile) and 3.3 GHz (desktop) according to rumors: https://hothardware.com/news/intel-10nm-jasper-lakecpu-lineup-leaks

This is all on the mediocre Ice Lake 10nm node. Gracemont should clock way higher than this.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
@Gideon Goldmont Plus already supports 128-bit SSE. Actually, a fully pipelined 128-bit vector unit started with its predecessor, Goldmont.

You can see from the die shots that the overall die is tiny, never mind the vector units.

Also, while they could choose to support AVX-512, it still needs extra transistors even with the same width.

Intel doesn't call it big.LITTLE. They call it big/bigger. Tremont/Gracemont is far closer to their bigger cores than little cores are in ARM.

It supports my speculation that the performance of the so-called "little" cores has to be high enough for a transparent user experience and seamless switching between cores in real-world usage scenarios.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Jasper Lake is looking to be quite the little beast for low-end mobile. A 48 EU Xe-architecture iGPU is absolutely nothing to sneeze at and is a massive upgrade compared to prior generations of the Atom/Celeron/Pentium products. It should easily beat any of the 3CU AMD parts in that market segment so long as the power envelope is sufficient.
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,573
136
@Gideon Goldmont Plus already supports 128-bit SSE. Actually, a fully pipelined 128-bit vector unit started with its predecessor, Goldmont.

You can see from the die shots that the overall die is tiny, never mind the vector units.
Yeah, that's true, but that's not the 2x 256-bit AVX2 units that would be required for Skylake-like floating-point performance. While it might happen, I'm not convinced such an increase is justified.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Besides, if they already had 2x 256-bit, they'd probably try to implement AVX-512 support as well.

I think with AVX-512 the problem is not the execution units' area, but rather the PRF size and the instructions themselves.

1) There is a huge jump in register count: instead of 16x 256-bit you get 32x 512-bit, requiring at least 4 times the area. And given that the PRF is not limited to 32 entries, but rather ~100 to allow for OoO => ~200KB of register file size, that is a chunk of area.
2) There are instructions that cross the 256-bit "lanes". A good example is permutes of 512-bit registers: you can't really implement them properly with split 2x 256-bit units, as they read and write from any part of the source/destination 512-bit registers. You really need a 512-bit capable processing unit for those.
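
The register arithmetic in point 1 can be checked with a short sketch. The entry counts are the rough figures from this post, not numbers for any real core. Note that the ~200KB figure corresponds to reading "~100" as a multiplier on each of the 32 architectural registers (3,200 entries); if ~100 is instead the total number of physical entries, the file is about 6.4 KB. Both readings are shown.

```python
def regfile_bytes(entries: int, width_bits: int) -> int:
    """Raw storage of a register file, ignoring ports and wiring overhead."""
    return entries * width_bits // 8


avx2_arch = regfile_bytes(16, 256)     # architectural state with AVX2
avx512_arch = regfile_bytes(32, 512)   # architectural state with AVX-512
print(avx2_arch, avx512_arch, avx512_arch // avx2_arch)  # 512 2048 4

# Reading A (assumption): ~100 physical 512-bit entries in total.
print(regfile_bytes(100, 512))         # 6400

# Reading B (assumption): ~100 entries per architectural register, 32 * 100.
print(regfile_bytes(32 * 100, 512))    # 204800
```

Either way the 4x growth in architectural state holds; raw byte counts also understate area, since a heavily multi-ported register file costs far more than plain SRAM of the same capacity.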

And now there's a big difference between having two independent 256-bit ports and one "unified" port for 512-bit operations, as the latter halves your FP throughput for AVX/SSE2 workloads.
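
The lane-crossing issue from point 2 can be shown with a toy model. This is plain Python over lists, not real intrinsics, and the function names are made up for illustration: a full 512-bit permute may pull any of the 16 elements into any position, while two independent 256-bit halves can each only see their own 8 elements.

```python
def full_permute(src, idx):
    """Model of a 512-bit permute over 16 x 32-bit elements: dst[i] = src[idx[i]]."""
    return [src[i] for i in idx]


def split_permute(src, idx, lane=8):
    """The same request executed as two independent 256-bit halves.

    Each half can only read its own `lane` source elements, so indices are
    reduced modulo the lane size and cross-lane requests are silently lost.
    """
    out = []
    for base in range(0, len(src), lane):
        half = src[base:base + lane]
        out += [half[i % lane] for i in idx[base:base + lane]]
    return out


src = list(range(16))
reverse = list(range(15, -1, -1))    # reverse all 16 elements
print(full_permute(src, reverse))    # [15, 14, 13, ..., 1, 0]
print(split_permute(src, reverse))   # [7, 6, ..., 0, 15, 14, ..., 8]
```

The split version only reverses within each half, so getting the full result needs extra cross-lane data movement or a genuinely 512-bit wide unit.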
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
@Gideon I think we're there now. With Tiger Lake, Celerons and Pentiums are getting AVX2 support. AVX2 is something that was introduced with 22nm Haswell. We're 2 process generations ahead.

It'll also fit their Big/Bigger hybrid strategy much better.

@LightningZ71 Jasper Lake is Gen 11, not Xe. They got that wrong.

@JoeRambo It's larger. Still, the largest contributor to area is the doubled execution units. You can see that from Skylake-SP. We'll see how much larger it is with Ice Lake-SP.
 

dmens

Platinum Member
Mar 18, 2005
2,271
917
136
I expect a few % better, if anything.



That's true, but serious downclocking happened on Intel starting with AVX-512. Some downclocking existed on AVX2 with Broadwell, but it was not as severe.

Dual AVX-512 was obviously too much on the 14nm process.

On 10nm, Ice Lake shows nearly no reduction in clocks in 1x AVX-512 mode. Ice Lake-SP should improve clocks in AVX-512 too.

If, for example, the original Zen had 2x AVX-512, it would have had to downclock seriously too.

LOL. What are you talking about? The AVX frequency penalty for Intel has nothing to do with process. It is due to Intel's decision to use an on-die VR, along with a mediocre power grid design. What makes you think AMD would have had the same problem?
 

Shivansps

Diamond Member
Sep 11, 2013
3,835
1,514
136
Jasper Lake is looking to be quite the little beast for low-end mobile. A 48 EU Xe-architecture iGPU is absolutely nothing to sneeze at and is a massive upgrade compared to prior generations of the Atom/Celeron/Pentium products. It should easily beat any of the 3CU AMD parts in that market segment so long as the power envelope is sufficient.

Not much is needed to beat Vega 3; if it beats an HD 630 by 5% you are already matching it.
 

Hitman928

Diamond Member
Apr 15, 2012
5,182
7,633
136
LOL. What are you talking about? The AVX frequency penalty for Intel has nothing to do with process. It is due to Intel's decision to use an on-die VR, along with a mediocre power grid design. What makes you think AMD would have had the same problem?

So you're saying that the AVX frequency penalty is a result of IR drop on the chip and voltage droop from the on-die VR? I always thought it was a power limit thing.