I noticed that, too. More performance on the table with Windows 11?
Windows 11 increases Golden Cove's IPC?
If rumors are to be believed, the IPC increase for Zen 4 is >20%. Moore's Law is Dead heard that the jump from Zen 2 to Zen 4 is similar to that of the original Zen 1. So, if you take the IPC gain for Zen 1 as 52%, and knowing that Zen 2 to Zen 3 is 19%, that implies that Zen 3 to Zen 4 is ~28% (1.52 / 1.19 ≈ 1.28). I don't think it's going to be that high, but I personally think AMD can match Intel's 19% IPC increase, if not outdo them.
That would be extremely disappointing; it would make Zen 4 the weakest improvement since Zen>Zen+, especially after the longest wait time between Zen architectures. I hope it will be more substantial than that.
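The ~28% figure follows from treating IPC gains as multiplicative rather than additive. A minimal sketch of that arithmetic, assuming the rumored 52% total and the 19% Zen 2 to Zen 3 figure quoted in the post (the function name is made up for illustration, and the inputs are rumors, not measurements):

```python
def implied_gain(total_gain: float, known_gain: float) -> float:
    """Return the remaining gain when two generational gains compound.

    (1 + total) = (1 + known) * (1 + remaining), solved for `remaining`.
    """
    return (1.0 + total_gain) / (1.0 + known_gain) - 1.0


# Assumed inputs from the post: Zen 2 -> Zen 4 rumored at 52%, Zen 2 -> Zen 3 at 19%.
zen3_to_zen4 = implied_gain(0.52, 0.19)
print(f"Implied Zen 3 -> Zen 4 IPC gain: {zen3_to_zen4:.1%}")
```

Simply subtracting (52% - 19% = 33%) would overstate the remaining step, because the second gain is applied on top of an already faster core.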
Your point still stands, but the gains in the big cores are not simply additive. The real gain is 1.18 x 1.19, or 40%. So GC is about 31% faster than Gracemont (1.40 / 1.07). The problem with Gracemont is that it lacks hyperthreading and can't clock as high as the big cores.
So if Golden Cove is only 19% faster per clock compared to Sunny Cove, and Sunny Cove is 18% faster than Skylake, and Gracemont is 7% faster than Skylake, Golden Cove is only 31% faster than Gracemont. So much for a "little core". It sips power like a little core and uses die area like a little core, but is much higher performing.
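For anyone who wants to check the chain of per-clock ratios, here is a rough sketch that normalizes everything to Skylake = 1.0. The percentages are the ones quoted in this thread, not independent measurements, and the variable names are only illustrative:

```python
# Per-clock performance relative to Skylake (= 1.0), using the thread's figures.
SKYLAKE = 1.0
sunny_cove = SKYLAKE * 1.18      # Sunny Cove: +18% over Skylake
golden_cove = sunny_cove * 1.19  # Golden Cove: +19% over Sunny Cove
gracemont = SKYLAKE * 1.07       # Gracemont: +7% over Skylake

additive_guess = 0.18 + 0.19                        # naive sum of the two gains
compounded = golden_cove / SKYLAKE - 1.0            # gains multiply, they do not add
gc_over_gracemont = golden_cove / gracemont - 1.0   # ratio of ratios, not a difference

print(f"Additive guess vs Skylake:  {additive_guess:.1%}")
print(f"Compounded gain vs Skylake: {compounded:.1%}")
print(f"Golden Cove over Gracemont: {gc_over_gracemont:.1%}")
```

The last line divides the two Skylake-relative figures instead of subtracting them, which is why the per-clock gap comes out a bit smaller than the difference of the percentages would suggest.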
Spot welding!
FYI, incredibly high power lasers use that incredibly high power only for an incredibly short amount of time as well!
Gracemont versus Skylake
Single thread
- 40% higher performance at same power, or 40% less power at same performance
- Total performance is 7-8% higher
Golden Cove's 19% gains are meh considering the sizable changes.
They probably mean 1000x in some niche like low-precision int/fp matrix operations. That shouldn't be too hard to achieve, given that by 2025 they'll have massive 600W+ MCM GPU derivatives of Xe HP or Xe HPC, and right now they have nothing comparable except a few Xeons with VNNI extensions.
Just in case some haven't seen it, here is the official Architecture Day slide deck:
It has a lot of stuff in it that I didn't see in the couple of review websites that I read. The slide deck covers: Golden Cove, Gracemont, Thread Director, Alder Lake, Xe HPG, Sapphire Rapids, IPU, Oak Springs Canyon, Arrow Creek, Mount Evans, and Ponte Vecchio.
The one that really stuck out like an impossible target is 1000x by 2025 (slides 5 to 11). That is, assuming that 1000x refers to performance.
While possible, I would expect a drop of more than 15% in those tests.
The ones that look like regressions might be AVX-512 tests.
Regardless, it should put them ahead of Zen 3 IPC and (I presume) Intel will keep these clocked slightly higher than Zen 3. So it comes down to how much Zen 3D will actually improve in these small benchmarks for IPC claims.
19% IPC is strong for a generational uplift but disappointing considering that the baseline for that measurement is Cypress Cove, which already displayed lower IPC than Zen 3 (and, mind you, also lower than both Sunny and Willow Cove), and considering the sheer size of some of the core structures compared to Zen 3 as well. I mean, 2x the ROB? Jeez.
Gracemonts are definitely the real star of the show here.
Zen 3 with V-Cache is probably going to be very workload dependent in its gains. For synthetics, SPEC is definitely going to benefit the most; for Cinebench and CPU-Z I expect minor improvements at best.
Regardless, it should put them ahead of Zen 3 IPC and (I presume) Intel will keep these clocked slightly higher than Zen 3. So it comes down to how much Zen 3D will actually improve in these small benchmarks for IPC claims. I think we might see a year where which is better depends entirely on whether your workload prefers more L3 or has more ILP for Golden Cove to work with.
Just in case some haven't seen it, here is the official Architecture day slide deck:
https://download.intel.com/newsroom.../intel-architecture-day-2021-presentation.pdf

If I recall correctly, Charlie was talking about their first-generation 10nm. He was already right, long ago, as they trashed their first-generation 10nm. Even if they kept the name as 10nm ESF.
I just noticed that since Alder Lake is releasing on "Intel 7", Charlie was right that Intel would never release a 10 nm desktop chip!
The Golden Cove performance improvement/architectural changes I expected. If true, the Gracemont power/performance compared to Skylake is kind of mind blowing. ST Gracemont is basically faster, more power efficient, and much smaller than Skylake. This begs the question as to why we even need the Golden Cove cores?
After thinking about it a bit, I believe the reason stems from the fact that ultimately ST performance is still very important. There are many apps, like Handbrake, that don't scale well beyond 8 cores. Throwing 40 Gracemont cores at that app in the same die space as Alder Lake won't show the same performance as 8+8 ADL. That is partly because of the higher clocks of Golden Cove and its higher inherent IPC, and, as I wrote above, because there are still quite a few apps that don't handle high core counts in a linear fashion performance-wise.
Starting to get just a little more excited about ADL...
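The "40 small cores versus 8+8" argument can be illustrated with an Amdahl's-law style estimate. This is only a sketch under loud assumptions: the parallel fraction (0.8) is a made-up example value, the 1.31 per-clock advantage for Golden Cove is the figure quoted earlier in the thread, and clock speed differences and hyperthreading are ignored (both would favor the big cores further):

```python
def run_time(parallel_fraction: float, serial_speed: float, parallel_throughput: float) -> float:
    """Amdahl-style run time: serial part on one core, parallel part spread over all cores.

    Speeds are relative to one Gracemont core = 1.0; the result is relative time.
    """
    serial_part = (1.0 - parallel_fraction) / serial_speed
    parallel_part = parallel_fraction / parallel_throughput
    return serial_part + parallel_part


P = 0.8          # assumed parallel fraction of the workload (hypothetical)
BIG_CORE = 1.31  # Golden Cove per-clock speed relative to Gracemont, per the thread
SMALL_CORE = 1.0

# 8 big + 8 small cores: the serial part runs on a big core.
hybrid = run_time(P, BIG_CORE, 8 * BIG_CORE + 8 * SMALL_CORE)
# 40 small cores: the serial part is stuck on a small core.
all_small = run_time(P, SMALL_CORE, 40 * SMALL_CORE)

print(f"8+8 hybrid relative time: {hybrid:.3f}")
print(f"40 small relative time:   {all_small:.3f}")
```

With these assumed numbers the hybrid layout finishes sooner even though it has far less total throughput, because the serial portion dominates once the parallel part is spread thin. A workload with a higher parallel fraction would tip the balance the other way.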
1) Simple fix to your concern: put a minimum idle limit in the guidance for considering a thread idle. Suppose a game is running at 60 fps. Then each frame takes 16.7 ms. Don't move an idle thread unless it has been idle for more than 16.7 ms. Then your game cannot be impacted at all by the scheduler. No ping-pong possible. Note: I'm not saying 16.7 ms is the ideal number; this is just an example. I'm not saying that this is an easy thing to solve (which is why it took a big change from both Intel and Microsoft). I'm just saying that your example is an easy thing to solve.
Full presentation has some real "gems" about hardware assisted scheduling. I had a good laugh at this slide:
View attachment 49077
Well, that is a great example of throwing performance under the bus. Imagine a game thread, say one of the thread pool members specialized in heavy processing of some physics or AI or whatever data each frame, that has finished its work for the frame and is busy-waiting in a spin loop for the next frame to start.
The L2 cache is hot and full of relevant data. Then morons from Intel and Microsoft arrive and move this recently-busy-but-now-idle thread to a small core. The context switch takes a ton of time, caches are cold, and once the thread is back to work again, it is once again promoted to a big core and the cycle repeats, at the huge cost of cache misses.
So Intel and MS move in to rein in this ping-pong with scheduler tunables and heuristics and so on, and we are back to square zero with schedulers. Don't forget that we are already at a negative starting position, since the scheduler needs to take into account all that rich performance data from hardware to make its decisions, and that also takes cycles and dirties caches.
While it is nice to have less static scheduling that can actually react to changes in the characteristics of threads, that will still come at a cost to both peak performance and performance consistency. Alder Lake will be great at full MT load, great up to 8T of load, and suffer from the scheduler everywhere else, and those problems might not even rear their ugly heads until games start to use just the right amount of threads in the wrong way.
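The minimum-idle guard suggested in the reply above (don't demote a thread until it has been idle longer than one frame) could look roughly like this. It is a toy policy for illustration only: `ThreadState`, `should_demote`, and the thread names are hypothetical, and nothing here reflects how Thread Director or the Windows scheduler actually decides.

```python
from dataclasses import dataclass

# One frame at 60 fps, the example threshold from the post (about 16.7 ms).
FRAME_TIME_MS = 1000.0 / 60.0


@dataclass
class ThreadState:
    name: str
    idle_ms: float     # how long the thread has currently been idle
    on_big_core: bool  # whether it is currently placed on a big core


def should_demote(thread: ThreadState, min_idle_ms: float = FRAME_TIME_MS) -> bool:
    """Hypothetical rule: demote to a small core only after a long enough idle period."""
    return thread.on_big_core and thread.idle_ms > min_idle_ms


# A worker that finished its frame early stays put; a long-idle thread gets moved.
physics = ThreadState("physics-worker", idle_ms=4.0, on_big_core=True)
indexer = ThreadState("background-indexer", idle_ms=250.0, on_big_core=True)

for t in (physics, indexer):
    print(t.name, "-> demote" if should_demote(t) else "-> keep on big core")
```

Whether a single fixed threshold is good enough is exactly what the two posts disagree about. The sketch only shows that the per-frame ping-pong case can be filtered out by a hysteresis-style check.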
Exactly. User interface needs high ST performance for a fast "feel". You could have a billion slow cores, and while it will crunch numbers quickly, it would feel like a dog on the user interface thread. And many applications just physically can't be divided into a large number of threads no matter what the programmer tries. There is always a need for a few very fast ST cores.
After thinking about it a bit, I believe the reason stems from the fact that ultimately ST performance is still very important. There are many apps, like Handbrake, that don't scale well beyond 8 cores.
Actually, he said they cancelled 10 nm. So all 10 nm products. He even admitted he was wrong.
If I recall correctly, Charlie was talking about their first-generation 10nm. He was already right, long ago, as they trashed their first-generation 10nm. Even if they kept the name as 10nm ESF.
Faster per clock, it won't be faster.
We already knew everything that Intel announced today. It is just that a lot of people did not believe the rumors/leaks. I remember getting yelled at for saying that Gracemont would be faster than Skylake…
