
Question How ahead is Intel in CPU design compared to AMD?

Status
Not open for further replies.

Carfax83

Diamond Member
Nov 1, 2010
5,834
524
126
is power draw not part of the design?
Yes, but I was specifically talking about overclocking. The 10980XE review I linked to had all 18 cores running at 4.9GHz, so of course power efficiency is going to go out the window.

That said, the power draw probably wouldn't be that bad for gaming workloads, even when overclocked because no current games will ever come close to utilizing that many cores.

If you're doing rendering though, then you will have a mini nuclear reactor in your house!
 

itsmydamnation

Golden Member
Feb 6, 2011
1,955
1,082
136
Yes, but I was specifically talking about overclocking. The 10980XE review I linked to had all 18 cores running at 4.9GHz, so of course power efficiency is going to go out the window.

That said, the power draw probably wouldn't be that bad for gaming workloads, even when overclocked because no current games will ever come close to utilizing that many cores.

If you're doing rendering though, then you will have a mini nuclear reactor in your house!
In stardock we trust :)

Soontm
hopefully
 

Gideon

Senior member
Nov 27, 2007
677
908
136
Remember I was comparing the 2700x. The 2700x memory latency results are in line with other reviewers. The Zen+ core made improvements to the cache subsystem and memory controller, leading to lower latency all around.
Yes, but:
  1. The 2700x doesn't use chiplets (therefore its latency isn't hindered by them).
  2. Per the slides its memory latency is 11% better than gen 1 (or in reality usually about 8-10ns). The best AIDA scores I've seen for that are at about 58ns. It still really struggles to get below 60ns. So does the 3xxx series, which reaches 61-62ns with 3800MHz RAM (a speed the 2xxx series won't reach) and the most aggressive timings.
  3. This is confirmed by comparing my friend's 2600x with 3200MHz memory vs my 3700x in Geekbench (this is a lower result than the previous one, running 3200 CL16 stock): 82.1ns vs 72.8ns.
  4. Skylake can reach 40ns, even 35ns and below, with aggressive timings.
therefore:
The 2xxx series is, at best, 10ns faster than the 3xxx series. Intel is still at least 20-25ns faster on top of that. This extra 20ns has nothing to do with chiplets and could be improved upon in Ryzen, chiplets or not. On top of that, with improved packaging technologies, even without interposers, a <5ns difference from the 2700x is also possible (instead of the current 8-10ns).
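
As a rough sanity check on the arithmetic above, here is a minimal Python sketch that takes the latencies quoted in this post and works out the gaps between them. The constant names and the `gap_range` helper are made up for illustration, and the inputs are the poster's figures, not independent measurements.

```python
# Latency figures as quoted in the post, in nanoseconds.
# They are the poster's numbers, not independent measurements.
ZEN_PLUS_BEST = (58.0, 58.0)    # 2xxx series, best AIDA score seen
ZEN2_BEST = (61.0, 62.0)        # 3xxx series, 3800MHz RAM, tight timings
SKYLAKE_BEST = (35.0, 40.0)     # Skylake with aggressive timings
STOCK_3200_CL16 = (82.1, 72.8)  # the two stock results from point 3


def gap_range(slower, faster):
    """Return the (min, max) gap between two (low, high) latency ranges."""
    return (slower[0] - faster[1], slower[1] - faster[0])


# Best-case gap between the 3xxx and 2xxx series.
chiplet_gap = gap_range(ZEN2_BEST, ZEN_PLUS_BEST)
# Best-case gap between the 2xxx series and Skylake.
intel_gap = gap_range(ZEN_PLUS_BEST, SKYLAKE_BEST)
# Gap between the two stock 3200 CL16 results.
stock_gap = round(STOCK_3200_CL16[0] - STOCK_3200_CL16[1], 1)

print(f"3xxx vs 2xxx, best case: {chiplet_gap[0]:.0f}-{chiplet_gap[1]:.0f}ns")
print(f"3xxx vs 2xxx, stock 3200 CL16: {stock_gap}ns")
print(f"2xxx vs Skylake, best case: {intel_gap[0]:.0f}-{intel_gap[1]:.0f}ns")
```

With these inputs the best-case gap between the 3xxx and 2xxx series is 3-4ns, the stock comparison gives 9.3ns (in line with the 8-10ns figure), and the remaining gap to Skylake is 18-23ns.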


 

Chicken76

Senior member
Jun 10, 2013
217
17
81
I don't possess enough technical knowledge to answer with definite precision how far ahead (or behind) Intel is compared to AMD. Personally though, as a longtime computer enthusiast and gamer, I have greatly preferred Intel's design decisions over the years, and other than the disastrous 10nm node, I like what they've done and I'm hoping they can get back on their feet soon so I can update my rig. Large monolithic dies with big multi-level caches, high IPC and clock speed, plus an integrated memory controller for low latency and wide vector SIMD, have made PC performance better than ever in my opinion in the applications I care about, i.e. mostly gaming. My 6900K at 4.3GHz is quite old by tech industry standards, yet I can still easily hit and sustain triple digit framerates in many of the most demanding and performance intensive PC games, and the CPU is not bottlenecking my Titan Xp.

Also, I am close to 50ns for memory latency with quad channel DDR4 3400.

I'm not totally convinced about AMD's decision to embrace chiplet based designs. Although the Zen 2 core is strong, the step backward for memory latency is disconcerting to me. Sure, the massive L3 cache helps a lot to hide memory latency, but that can only get you so far. It will be interesting to see what improvements, if any, Zen 3 will have in that area.

If you mostly care about gaming, why does it matter what the memory latencies are? Do the games run fast? Well then who cares how that performance is achieved, by employing low or high memory latencies, sprinkling magic faerie dust or some other method?
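
The point about a large L3 hiding memory latency can be put in numbers with the textbook average memory access time formula: average latency = hit time + miss rate × miss penalty. Below is a minimal Python sketch of that model. The hit time, miss rates, and DRAM penalties are made-up illustrative values, not measurements of any Zen or Skylake part.

```python
def average_access_ns(l3_hit_ns, l3_miss_rate, miss_penalty_ns):
    """Textbook average memory access time for loads that reach L3.

    l3_hit_ns:       time to serve a load from L3
    l3_miss_rate:    fraction of those loads that miss L3 (0.0 to 1.0)
    miss_penalty_ns: extra time to fetch from DRAM after an L3 miss
    """
    if not 0.0 <= l3_miss_rate <= 1.0:
        raise ValueError("l3_miss_rate must be between 0 and 1")
    return l3_hit_ns + l3_miss_rate * miss_penalty_ns


# Hypothetical part A: smaller L3, lower DRAM latency.
small_cache_fast_dram = average_access_ns(10.0, 0.30, 50.0)

# Hypothetical part B: bigger L3 halves the miss rate, but DRAM is slower.
big_cache_slow_dram = average_access_ns(10.0, 0.15, 70.0)

# Same part B on a workload too large for the bigger L3 to help.
big_cache_no_benefit = average_access_ns(10.0, 0.30, 70.0)

print(f"small L3, fast DRAM: {small_cache_fast_dram:.1f}ns")
print(f"big L3, slow DRAM:   {big_cache_slow_dram:.1f}ns")
print(f"big L3, no benefit:  {big_cache_no_benefit:.1f}ns")
```

In this toy model the bigger cache wins only while it actually lowers the miss rate. Once the working set outgrows it, the slower DRAM path shows up in full, which is one way to read both sides of the exchange above.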
 

Chicken76

Senior member
Jun 10, 2013
217
17
81
Hey, do you want to shut down these fora?
Hehe, no, but I could not let this pass unanswered (paraphrasing):
Ryzen 3000 runs games much better than Ryzen 2000, despite higher memory latencies. Do I have other use cases where it affects me? No, but I'm worried. :weary:
 

Carfax83

Diamond Member
Nov 1, 2010
5,834
524
126
Yes, but:
  1. The 2700x doesn't use chiplets (therefore its latency isn't hindered by them).
  2. Per the slides its memory latency is 11% better than gen 1 (or in reality usually about 8-10ns). The best AIDA scores I've seen for that are at about 58ns. It still really struggles to get below 60ns. So does the 3xxx series, which reaches 61-62ns with 3800MHz RAM (a speed the 2xxx series won't reach) and the most aggressive timings.
  3. This is confirmed by comparing my friend's 2600x with 3200MHz memory vs my 3700x in Geekbench (this is a lower result than the previous one, running 3200 CL16 stock): 82.1ns vs 72.8ns.
  4. Skylake can reach 40ns, even 35ns and below, with aggressive timings.
That was the point of what I was getting at. The 2700x doesn't use chiplets, but is still able to have a significantly lower memory latency than Zen 2, which has a stronger core, much better prefetchers, AND a much bigger L3 cache. But perhaps I am missing the argument.

Are chiplet architectures and IMCs incongruous with each other? Because it might not be the inclusion of the chiplet design, but the lack of an IMC which explains Zen 2's higher memory latency.

The 2xxx series is, at best, 10ns faster than the 3xxx series. Intel is still at least 20-25ns faster on top of that. This extra 20ns has nothing to do with chiplets and could be improved upon in Ryzen, chiplets or not. On top of that, with improved packaging technologies, even without interposers, a <5ns difference from the 2700x is also possible (instead of the current 8-10ns).
I guess we'll see when Zen 3 arrives. The unified L3 cache will undoubtedly have some impact.
 

Carfax83

Diamond Member
Nov 1, 2010
5,834
524
126
If you mostly care about gaming, why does it matter what the memory latencies are? Do the games run fast? Well then who cares how that performance is achieved, by employing low or high memory latencies, sprinkling magic faerie dust or some other method?
Because low memory latency is one of the major reasons why Intel is so dominant in gaming. Sure, Zen 2 looks good now, but that's because it's competing against Intel CPUs that are using a much older core on a bigger node. When Intel gets back on their game and gets their 10nm process up to form, it will be a different ballgame. Although I have a preference for Intel CPUs, I do want AMD to remain ultra competitive with Intel and not get destroyed like they were in the past.
 

awesomedeluxe

Junior Member
Feb 12, 2020
14
3
36
To the extent that we can compare Ice Lake and Zen 2 directly, the Sunny Cove cores look really good. We can't do that very easily right now because Ice Lake operates in such a small power range and Zen 2 parts in that range are just coming out, but it should be clearer soon.

One consideration for these comparisons is that AMD had to cut down on Zen 2's huge cache to make Renoir work, and it hurt them a lot. Sometimes it sucks to be married to your machine's primary GPU. Looking in my crystal ball, I think AMD may surprise people by getting more aggressive in the mobile APU space. Renoir's "lag" may really just be that Navi is a problematic part. But AMD has an RDNA2 APU that's spent a lot of time in the wash getting ready for the PS5 and neXtbox. It's possible we could see a revised APU with RDNA2 this year. This APU would still use Zen 2 cores, but it might be able to make room for more cache.

I am less certain what Tiger Lake will bring, but I am more confident than most people Intel will bring those Willow Cove cores and Xe graphics to machines this year even if they have to bend heaven and earth to do so. But a solution for Intel's manufacturing woes that will allow them to bring Cove cores to higher frequencies and more plentiful core configurations does not seem like it will be coming any time soon.
 

naukkis

Senior member
Jun 5, 2002
298
137
116
The 2xxx series is, at best, 10ns faster than the 3xxx series. Intel is still at least 20-25ns faster on top of that. This extra 20ns has nothing to do with chiplets and could be improved upon in Ryzen, chiplets or not. On top of that, with improved packaging technologies, even without interposers, a <5ns difference from the 2700x is also possible (instead of the current 8-10ns).
Those Intel systems with inclusive L3 don't need cache-coherency checks in the memory controller, so their in-page memory hits especially are much faster than with Ryzen and other CPUs with non-inclusive last-level caches. Skylake-X is a better comparison point for the Ryzen memory controller; there's still a difference, but it is much smaller than that 20-25ns.
 

Gideon

Senior member
Nov 27, 2007
677
908
136
Those Intel systems with inclusive L3 don't need cache-coherency checks in the memory controller, so their in-page memory hits especially are much faster than with Ryzen and other CPUs with non-inclusive last-level caches. Skylake-X is a better comparison point for the Ryzen memory controller; there's still a difference, but it is much smaller than that 20-25ns.
We know that Zen 3 has a unified cache per CCD. My bet is that it's also inclusive (and upscaled by the relevant amount). L2 will hopefully be resized to 1MB to counter the added L3 latency (32+MB of unified cache will probably be a bit more than 11ns).
 

maddie

Platinum Member
Jul 18, 2010
2,974
1,556
136
According to this, Intel has warped into the future.

Posted on WCCFTECH, a supposed comparison of Icelake 12C/24T to another Intel 12C/24T CPU. It seems we're getting HT at greater than 100%. In other words the cores execute individual threads faster by using Hyperthreading. Not just overall faster, but on a per thread basis too. Who believes this BS?

I suppose having 4-6 threads/core would explain this, or even if the base clock is higher than the boost :p
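
For what a "greater than 100%" Hyperthreading gain would mean, here is a minimal Python sketch. Given a single-thread score and an all-thread score from the same benchmark, SMT yield is the all-thread score divided by (cores × single-thread score), minus one. The scores below are hypothetical placeholders, not the leaked figures, and the model assumes perfect scaling across cores and identical clocks in both runs.

```python
def smt_yield(single_thread_score, all_thread_score, cores):
    """Fractional gain from SMT over one thread per core.

    Assumes perfect scaling across cores and identical clocks in both
    runs, which real chips rarely deliver.
    """
    return all_thread_score / (cores * single_thread_score) - 1.0


def exceeds_smt_limit(yield_fraction, threads_per_core=2):
    """True if each SMT thread would have to outrun a lone thread."""
    return yield_fraction > threads_per_core - 1


# Hypothetical 12C/24T results, made up for illustration.
plausible = smt_yield(500.0, 7500.0, 12)    # all-thread = 15x single-thread
suspicious = smt_yield(500.0, 13200.0, 12)  # all-thread = 26.4x single-thread

print(f"plausible yield:  {plausible:.0%}")
print(f"suspicious yield: {suspicious:.0%}")
print("suspicious run exceeds the SMT2 limit:", exceeds_smt_limit(suspicious))
```

A yield above 100% on a two-thread core means each of the two threads ran faster than a single thread alone on that core, which is why the result points to a misreported thread count or clock rather than a real gain.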

 
  • Haha
Reactions: Saylick

naukkis

Senior member
Jun 5, 2002
298
137
116
We know that Zen 3 has a unified cache per CCD. My bet is that it's also inclusive (and upscaled by the relevant amount). L2 will hopefully be resized to 1MB to counter the added L3 latency (32+MB of unified cache will probably be a bit more than 11ns).
We do know that Zen 3 can also be configured with multiple chiplets, in which case the memory controller would still need coherency management of its own, so there's not much point in changing inclusion to favor slightly faster memory access times over a more effective total cache. And for an APU, an inclusive cache would need the GPU and CPU to share L3 like Intel CPUs do, which is also highly unlikely to happen.
 
  • Like
Reactions: Gideon

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
19,785
7,161
136
According to this, Intel has warped into the future.

Posted on WCCFTECH, a supposed comparison of Icelake 12C/24T to another Intel 12C/24T CPU. It seems we're getting HT at greater than 100%. In other words the cores execute individual threads faster by using Hyperthreading. Not just overall faster, but on a per thread basis too. Who believes this BS?

I suppose having 4-6 threads/core would explain this, or even if the base clock is higher than the boost :p

All sorts of things appear to be wrong on that pic.... I will just leave it there.
 

CHADBOGA

Golden Member
Mar 31, 2009
1,961
523
136
According to this, Intel has warped into the future.

Posted on WCCFTECH, a supposed comparison of Icelake 12C/24T to another Intel 12C/24T CPU. It seems we're getting HT at greater than 100%. In other words the cores execute individual threads faster by using Hyperthreading. Not just overall faster, but on a per thread basis too. Who believes this BS?

I suppose having 4-6 threads/core would explain this, or even if the base clock is higher than the boost :p

Maybe Intel has implemented SMT4 and it's not being reported on correctly?
 

NostaSeronx

Platinum Member
Sep 18, 2011
2,842
508
126
Maybe Intel has implemented SMT4 and it's not being reported on correctly?
Nah, dawg, no one is going SMT4. No one has the money, know-how, or the capability of doing SMT4. Definitely not going to happen anytime soon at AMD, at Intel, or at VIA. Just got off a call: no SMT4, no architecture in the near future will be SMT4.
 

Insert_Nickname

Diamond Member
May 6, 2012
3,758
446
126
I hear you on that. I had high hopes for MCM mfg. Why not an I/O die for Ryzen 3000-series CPUs, with a tiny little Vega 5/6 iGPU. Maybe not enough for gaming, really, but enough for business desktop usage and presentations and especially, watching internet videos.
For AMD, my dream lineup not going into IPC and clocks, would see :

Ryzen 3 all have small Navi IGP chiplet. Immediately makes them far more attractive for basic office builds.

Ryzen 5 and 7 to have one basic model each with IGP, eg this gen would have been Ryzen 3600G and 3700G.
I was going to suggest just that. Make a small chiplet with a basic 4-6CU IGP, and add that to some of those models which only use a single chiplet. The dual chiplet ones will more than likely be used with discrete graphics anyway. The uncore would still need to support the video outputs of course, but I doubt that is a complex add-on.

Would be awesome, but I doubt they'll do it. The new 4000-series have 8 cores already, and a fair IGP which is more than enough for office work. Those 8 Zen 2 cores, even with reduced L3, should be good enough for office work for the next 10-20 years.

While we're at it, one model I'd like to see would be a double chiplet 8 core with all 64MB L3. One can dream, right?
 
  • Like
Reactions: Arkaign

cytg111

Lifer
Mar 17, 2008
11,890
2,987
136
If we could get both companies to design a CPU core now on the same process and using the same number of transistors, who would be ahead and by how much? I'm assuming Intel would be ahead because Skylake has a similar IPC to Zen 2, but Zen 2 uses a more dense node, and Skylake is from 2015 while Zen 2 is from 2019.

The both companies using the same process scenario is just to illustrate the problem and I know it will never happen, but maybe there is a smart way to compare current CPU designs and find out who is ahead and by how much. Wikichip has die sizes and die pictures of Sunny Cove and Zen 2, but I'm not sure how to interpret and compare them.

Another question: If Intel had to design future cores using the same process and the same die size as Skylake, how much further could the Skylake design be improved? Also assuming it will have to be x86 too and have all the same functions.
I mean *right* now... I'd bet that Intel has the better brains and the better patents... But you never know when something like Spectre creeps up and clips your bleeps.
 

scannall

Golden Member
Jan 1, 2012
1,621
963
136
I don't know that Intel is ahead of AMD. AMD has better SMT, much better security, and slightly better IPC, though that's close enough to call even. Intel has the better memory controller/latency. The next couple of generations should prove interesting. Let's see if Intel gets off their backside now that it's been kicked hard enough.
 
  • Like
Reactions: clemsyn

Insert_Nickname

Diamond Member
May 6, 2012
3,758
446
126
Much easier, just add one zen 2 chiplet to it and voila 16c apu.
Unfortunately, Renoir has a reduced L3 compared to Matisse. So it's not really 1:1. But yeah, if you designed for it, that would absolutely be doable.

Dual renoir chiplet would be a nightmare, double dual imc, dual gpu, dual vce, dual ...
...unless you could get the Renoir die to perform the uncore functions exclusively. Then add one chiplet for an 8C APU. Or two for a 16C APU. It'd be expensive as hell, but maybe just doable.

The Renoir die is only ~150mm², it's not that much larger than the uncore die on Matisse. But you might run out of room on the AM4 socket.
 

moinmoin

Golden Member
Jun 1, 2017
1,402
1,204
106
The 2700x doesn't use chiplets, but is still able to have a significantly lower memory latency than Zen 2, which has a stronger core, much better prefetchers, AND a much bigger L3 cache.
Because low memory latency is one of the major reasons why Intel is so dominant in gaming.
Zen 2 significantly improved gaming performance compared to Zen and Zen+ though, to the point that Intel's advantage is limited to some niche corner cases (low res and very high frequency), so Zen 2's high latencies seem not to be too bad (all while offering an obvious area for further improvements in the future).
One consideration for these comparisons is that AMD had to cut down on Zen 2's huge cache to make Renoir work, and it hurt them a lot.
How has it hurt them?
 

nicalandia

Senior member
Jan 10, 2019
246
147
76
Nah, dawg, no one is going SMT4. No one has the money, know-how, or the capability of doing SMT4. Definitely not going to happen anytime soon at AMD, at Intel, or at VIA. Just got off a call: no SMT4, no architecture in the near future will be SMT4.
You must clarify that: Intel Xeon Phi (SMT4), IBM Power9 (SMT4/SMT8) and Oracle SPARC M8 (SMT8) say otherwise.

Zen is wide enough for SMT4
 

scannall

Golden Member
Jan 1, 2012
1,621
963
136
You must clarify that: Intel Xeon Phi (SMT4), IBM Power9 (SMT4/SMT8) and Oracle SPARC M8 (SMT8) say otherwise.

Zen is wide enough for SMT4
I keep hearing SMT4 speculation all over these forums. And it really doesn't make any sense in a 7nm and smaller world. On 32nm for instance, it's the only way to get enough threads for their specific use cases on a die that is actually small enough to manufacture. Real cores beat extra threads per core hands down. SMT is a nice bonus that takes minimal extra resources, though it can add an extra attack vector for hackers, as Intel has learned the really hard way. SMT2? Pretty unlikely, but not flat out crazy.

AMD and Intel will have to go wider at some point, which pushes core size up. But still, real cores are better than extra threads on one core.
 