
Speculation: Ryzen 4000 series/Zen 3

Page 44 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

Guru

Senior member
May 5, 2017
698
255
106
There are two possible explanations for why AMD changed that to SMT2
You seem pretty confident that SMT4 isn't happening with Zen 3. So how much money would you bet on that? Would you bet your annual income that Zen 3 isn't SMT4? Are you still so confident? Probably not. It's easy to be a hero behind a keyboard without any responsibility.

My bet is 60:40 for a new wider and/or SMT4 core for Zen 3. It's doable, the technology has already been developed by others, the risk is minimal, AMD has great engineers and they had time to do it. All the puzzle pieces show AMD is hiding something about Zen 3. The only really confident people about Zen 3 SMT are AMD engineers. Everybody else who is confident is just a keyboard hero.
Why is this still being discussed? AMD has already said they don't plan SMT4 for Zen 3. I don't think that is where their bottleneck is. I don't think they would gain much by doing that, especially when they have an amazing system for churning out more cores.

We know for a fact that it will have a unified L3 cache, we know it will have faster gates, better branch predictor and smaller latency between cores. They might actually do some sort of L4 cache as well, to improve flow and feed the cores faster.
 

Richie Rich

Member
Jul 28, 2019
80
61
51
All this SMT4 stuff is just stupid. If ILP only averages something like 1.5 on SPEC, why would SMT2 add only ~25% performance given there are 11 pipelines in a Zen core? Because the bottleneck isn't in execution!
Why are you mixing 1.5 CISC ILP with RISC execution units? Keller said IceLake is executing 3-6 instructions at once. Maybe you can explain why Apple moved from A7 (4xALU, 2xLSU) to the wider A11/12/13 (6xALUs with still only 2xLSU). I think they had a pretty good reason to do that (especially when we know there is a massive +58% IPC gain over SkyLake).
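For reference, the arithmetic behind the quoted argument can be sketched as below. It uses only the figures from the quote (average ILP of about 1.5, 11 pipelines, roughly +25% from SMT2); the function name is hypothetical, and treating ILP divided by pipeline count as "utilization" is a simplifying assumption, not a measured quantity.

```python
def pipeline_utilization(avg_ilp: float, pipelines: int) -> float:
    """Fraction of issue slots used per cycle, assuming (simplistically)
    that every pipeline could accept one instruction per cycle."""
    return avg_ilp / pipelines


# Figures taken from the quoted post; none of them are measured here.
AVG_ILP = 1.5        # average instructions per cycle on SPEC, per the quote
PIPELINES = 11       # execution pipelines in a Zen core, per the quote
SMT2_UPLIFT = 0.25   # approximate SMT2 throughput gain, per the quote

single_thread = pipeline_utilization(AVG_ILP, PIPELINES)
with_smt2 = pipeline_utilization(AVG_ILP * (1 + SMT2_UPLIFT), PIPELINES)

print(f"single thread: {single_thread:.1%}")  # 13.6%
print(f"with SMT2:     {with_smt2:.1%}")      # 17.0%
```

Under those assumptions most issue slots stay idle even with SMT2, which is the point of the quoted argument: idle execution resources alone do not show that adding more of them would help.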

The problem that ATV and yourself seem to be discounting is that a leak can contain elements of truth without being wholly truthful, perhaps some of these information dispersals are intentionally planted within companies like AMD to identify leakers when they are suspected to exist - it's what I would do.
The interesting thing is that ATV saw exactly the same slides (graphics), just with SMT4 on them. This is the point. They put it there either to identify leakers or because Zen 3 is SMT4 capable. Could be both.

We know for a fact that it will have a unified L3 cache, we know it will have faster gates, better branch predictor and smaller latency between cores. They might actually do some sort of L4 cache as well, to improve flow and feed the cores faster.
Cache, cache, cache. I feel like I'm in the Tron movie, surrounded by programs caught in an endless cycle. No offense, but it's funny how many people want to speed up code execution without adding execution units. The leaked Zen 3 IPC gain of >8% (others say >10%) cannot be achieved by L3 cache alone.

BTW a comparison of evolution of Apple/Intel cores:
  • 2012 - Intel Ivy Bridge (3xALU)... Apple A6 (2xALU) .... Apple is way behind
  • 2013 - Intel Haswell (4xALU)... Apple A7 (4xALU) .... Apple is on par with Intel
  • 2017 - Intel Coffee Lake (4xALU)... Apple A11 (6xALU) .... Apple became tech leader
Isn't this interesting?
 
  • Like
Reactions: amd6502

Ajay

Diamond Member
Jan 8, 2001
5,824
1,720
136
Another interesting thing regarding the AMD presentation about Milan SMT2 and unified L3 cache. AdoredTV claims that in his earlier leaked version of that slide there was SMT4 originally.

There are two possible explanations for why AMD changed that to SMT2:
  • AMD didn't want to reveal a killer feature like SMT4 at this Zen 2 event (way too early before Zen 3 unveiling)
  • AMD will disable SMT4 for whole Zen 3 generation (due to performance issues due to FPU bottleneck? Zen 4 on 5nm could solve this)


This guy is such a moron with his ‘special’ knowledge. He just cannot drop this SMT4 rumor that he started. He’s impervious to facts.
 

soresu

Senior member
Dec 19, 2014
633
160
116
This guy is such a moron with his ‘special’ knowledge. He just cannot drop this SMT4 rumor that he started. He’s impervious to facts.
At the very least he let a retweet from Lisa Su go to his head.
 

soresu

Senior member
Dec 19, 2014
633
160
116
BTW a comparison of evolution of Apple/Intel cores:
  • 2012 - Intel Ivy Bridge (3xALU)... Apple A6 (2xALU) .... Apple is way behind
  • 2013 - Intel Haswell (4xALU)... Apple A7 (4xALU) .... Apple is on par with Intel
  • 2017 - Intel Coffee Lake (4xALU)... Apple A11 (6xALU) .... Apple became tech leader
Isn't this interesting?
Not even remotely, they don't compete in the same area currently - and the platform that is closest to any competing market (iPad OS) is more or less closed from software freedom, unlike the multitude of platforms that x86 potentially supports (excepting MacOS of course).

Apple Axx would be infinitely more interesting to me if they made iOS a more open system, or in some strange parallel world ran Android or Chrome OS on the latest Axx SoC out of the box.
 

NostaSeronx

Platinum Member
Sep 18, 2011
2,652
410
126
What will be weird is if AMD launches multiple Zen3 SKUs with dropping ASPs.
Ryzen 3000 (new ASP/2x16 MB *2) -> Ryzen 3003 (Zen3 (N7P) - lower ASP/1x32 MB *2) -> Ryzen 3005 (Zen3+ (N6) - even lower ASP/1x32+ MB *2) -> Ryzen 3007 (Zen3++ (N5) - lowest ASP/1x32+ MB *2)
That would slowly make it the budget line. Then, there will be Vermeer on AM6 and Genesis Peak on SP5/TR6. <== DDR5 for capacity and HBM2E/3 for speed, first to N5.
 

Richie Rich

Member
Jul 28, 2019
80
61
51
Not even remotely, they don't compete in the same area currently - and the platform that is closest to any competing market (iPad OS) is more or less closed from software freedom, unlike the multitude of platforms that x86 potentially supports (excepting MacOS of course).

Apple Axx would be infinitely more interesting to me if they made iOS a more open system, or in some strange parallel world ran Android or Chrome OS on the latest Axx SoC out of the box.
Why do you escape to SW stuff? I'm talking about HW core development, and I'd appreciate staying there. Another area is not an excuse for Intel. Moreover, desktop and server CPUs should be at the top of IPC and absolute performance. Obviously, Intel was overtaken by a mobile processor and this is a big shame. I don't understand how somebody can defend Intel's CPU development stagnation.
 
  • Like
Reactions: Lodix

soresu

Senior member
Dec 19, 2014
633
160
116
Why do you escape to SW stuff? I'm talking about HW core development, and I'd appreciate staying there. Another area is not an excuse for Intel. Moreover, desktop and server CPUs should be at the top of IPC and absolute performance. Obviously, Intel was overtaken by a mobile processor and this is a big shame. I don't understand how somebody can defend Intel's CPU development stagnation.
It's not an escape if it is relevant. Everyone praises Apple's hardware to the heavens, but their SW platform restrictions make it useless in my opinion, little more than overpowered paperweights - something I found amusing years ago when a Dolphin (GC/Wii) emulator developer praised Apple's Axx cores and damned ARM for not matching them, never mind that he needed to jailbreak an Apple device merely to test that code, so rather a hollow argument.

As to server CPUs needing to be at the top of IPC, not all servers need that - in a great many server use cases they are simply serving data to a network, rather than doing any significant compute on it which would benefit from that high IPC, such as sending/receiving e-mails, routing video streams, database queries, sending HTML....

These lighter workloads benefit from more cores, more memory and more IO - which AMD is providing along with steady IPC improvement in EPYC.

Either way your pivot to a knock on Intel is just as pointless in this thread as me talking about Apple SW platforms - AMD is advancing not stagnating, and this is a thread about Zen3 after all, I don't care if Intel is slipping into the seventh circle of hell as long as AMD keeps moving forward now that they have momentum.
 
  • Like
Reactions: spursindonesia

soresu

Senior member
Dec 19, 2014
633
160
116
2013 - Intel Haswell (4xALU)... Apple A7 (4xALU) .... Apple is on par with Intel
Also, you seem to be missing parts of the equation there - IPC is not the only part of the problem. How did the SPEC/GB score compare to actual Haswell CPUs at their shipping clocks, rather than at exactly the same frequency?

Typically if I recall, A7 tanked the battery if it ran full whack for more than short bursts - which is probably why Apple introduced the little cores to improve power efficiency later.
 

Richie Rich

Member
Jul 28, 2019
80
61
51
As to server CPUs needing to be at the top of IPC, not all servers need that - in a great many server use cases they are simply serving data to a network, rather than doing any significant compute on it which would benefit from that high IPC, such as sending/receiving e-mails, routing video streams, database queries, sending HTML....
So why doesn't Intel provide a Xeon CPU based on double the number of Atom cores? Why are 2xALU in-order ARM server CPUs failing? Why did 2xALU Bulldozer fail? The answer is that 90% of workloads benefit from high IPC. Moreover, you can cover the remaining 10% (or whatever the percentage is) by implementing advanced techniques such as SMT2 and SMT4. This is the secret power of x86 CPUs today - they are superior on almost all generic code, with no expensive optimization needed. That's why Apple moved to a 6xALU design: they increased the code-crunching window even further. Analogously for Zen 3, IMHO Mike T. Clark is a great man for a core redesign, going wider with ALUs and AGUs, plus SMT4. Such a Zen 3 would be fast on all code, old and new, just like Apple A12 is great in SPEC2006 (12-year-old code). That's my point about why a 4xALU core design is obsolete nowadays, and that's why Zen 3 will most likely have 6xALUs IMHO.
 

soresu

Senior member
Dec 19, 2014
633
160
116
That's my point why 4xALUs core design is obsolete nowadays, and that's why Zen 3 will have most likely 6xALUs IMHO.
Something you keep overlooking is power consumption - sure, A12 has great performance at full whack, but it also drains the battery tout de suite, which is why the little cores are needed.

If Zen3 went 6 wide, it would need much more area than the 20% 7nm+ brings, and far more than the meagre 15% (ideally, not necessarily realistic...) power efficiency improvement, an improvement that would likely not even be enough to cover the increased power consumption from going 6 wide, let alone SMT4 too.
 

NostaSeronx

Platinum Member
Sep 18, 2011
2,652
410
126
which is why the little cores are needed.
The little cores are needed because of DVFS complexity. Why build a single super complex core, when they can build a cheap high IPC core and a cheap EPI core?

AMD doesn't fit in that narrative since Zen2 already has a top-of-the-line 0.3-1.5V sensing AVFS. Higher IPC w/ SMT4 can convert a four core boost into a two core boost.
 

soresu

Senior member
Dec 19, 2014
633
160
116
The answer is that 90% of workloads benefit from high IPC.
A citation or 4 would make your argument less of an opinion.
This is the secret power of x86 CPU today - these are superior in almost every generic code, no expensive optimization needed.
The secret power of x86 is the Wintel collaboration that spread it everywhere but mobile, and even then it's because Intel management was too shortsighted to see the potential when Apple came-a-calling during the iPhone development - now ARM owns the mobile space and grew up to gobble MIPS market because of that complacency.
 

soresu

Senior member
Dec 19, 2014
633
160
116
Higher IPC w/ SMT4 can convert a four core boost into a two core boost.
Am I seeing things or are the last 2 things back to front?
The little cores are needed because DVFS-complexity. Why build a single super complex core, when they can build a cheap high IPC core and a cheap EPI core.
Yeah, it would make Apple's big core even bigger still, not to mention I imagine that AMD and Intel have a lot of patents on that sort of dynamic functionality which Apple would have to either license or do a lot of R&D to find an alternative solution which could likely be inferior.
 

NostaSeronx

Platinum Member
Sep 18, 2011
2,652
410
126
Am I seeing things or are the last 2 things back to front?
No, higher IPC w/ SMT4 means two cores have eight threads, whereas in Zen2 eight threads are spread across four weaker cores. Four core boost in Zen2 is less than its two core boost. Higher IPC and SMT4 doesn't explicitly mean higher energy given that Zen2 is mostly a port. A new architecture with higher IPC would either handle the higher current of 7nm better or be able to use 6-track for an improved frequency/power curve given N7P.
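The thread-count part of that argument is simple arithmetic, sketched below. The function name is hypothetical, and the sketch says nothing about boost clocks or per-thread performance, only how many hardware threads each configuration exposes.

```python
def hardware_threads(cores: int, smt_ways: int) -> int:
    """Hardware threads exposed by `cores` cores with `smt_ways`-way SMT."""
    return cores * smt_ways


# Zen2 as described in the post: eight threads spread across four SMT2 cores.
zen2_threads = hardware_threads(cores=4, smt_ways=2)

# Hypothetical SMT4 design from the post: the same eight threads on two cores.
smt4_threads = hardware_threads(cores=2, smt_ways=4)

print(zen2_threads, smt4_threads)  # 8 8
```

So an eight-thread load that keeps four Zen2 cores active would, on the hypothetical SMT4 part, keep only two cores active, which is what "convert a four core boost into a two core boost" refers to.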

Milan => isn't a new CPU architecture (process-optimization) / K17.4
Vermeer => is a new CPU architecture (inflection) / K19.2

Given the above, the new core is on N5. Given its HPC is 6T and its mobile is 5T. There is a huge area/power shrink to be traversed for the Ryzen 4000 family/Vermeer.
 
Mar 11, 2004
19,312
1,792
126
Something you keep overlooking is power consumption - sure, A12 has great performance at full whack, but it also drains the battery tout de suite, which is why the little cores are needed.

If Zen3 went 6 wide, it would need much more area than the 20% 7nm+ brings, and far more than the meagre 15% (ideally, not necessarily realistic...) power efficiency improvement, an improvement that would likely not even be enough to cover the increased power consumption from going 6 wide, let alone SMT4 too.
Yeah, plus it seems like AMD is pushing efficiency on Zen 3. I think that's a calculated move to get Zen into laptops (and other similarly constrained form factors). It helps them on servers/Threadripper (where they offer improved perf/w and performance via higher core counts; for servers I think it'll be the start towards keeping per CPU power in check so that they can increase core counts some but can also increase sockets as their means of offering higher density per rack/server). For the consumer space it enables them to cram a GPU in. I think that would be an easy sell for OEMs where they could sell smaller form factor stuff.
 

soresu

Senior member
Dec 19, 2014
633
160
116
Higher IPC and SMT4 doesn't explicitly mean higher energy given that Zen2 is mostly a port.
Higher IPC and SMT are likely to require significant extra transistors, those don't come for free unless you decrease power consumption somewhere else.

Anyway, how is Zen2 a port, mostly or otherwise?

It doubled Zen1/1+ FP resources amongst other changes like TAGE branch predictor, I'd hardly call that a mere port/shrink by any standards.
 

soresu

Senior member
Dec 19, 2014
633
160
116
I think that's a calculated move to get Zen into laptops (and other similarly constrained form factors).
Yes, I honestly believe AMD want to get a second VR collaboration too.

Given the last one used the Carrizo SoC (Sulon Q), it would be a huge jump to Zen3 and Navi2/RDNA2 - both for power efficiency and raw performance.
 

soresu

Senior member
Dec 19, 2014
633
160
116
Same physical design as Zen. A new design doesn't use the same macro-tiles. Hence, because it is mostly re-using Zen2 assets, it's mostly a port.
.....

You just said re-using Zen2, after saying same physical design as Zen1.

Perhaps you mean Zen3 is a port of Zen2?

That does make more sense if unified L3 CCD is the only significant change.
 

Richie Rich

Member
Jul 28, 2019
80
61
51
Q3'2019 Lisa Su Q&A said:
We will transition to the 5-nanometer node at the appropriate time and get great benefit from that as well. But we’re doing a lot in architecture. And I would say, that the architecture is where we believe the highest leverage is for our product portfolio going forward.
More proof that AMD is concentrating heavily on architecture improvements for Zen 3. Recall another statement Lisa made after unveiling Zen 2: that she's not leaving AMD for IBM because the best is yet to come. The unified L3 cache is just a tiny bit of what Zen 3 will bring.

Regarding Zen 3 being wider with a 6xALU core: this might not impact area much, as the 4xALU block is one of the smallest areas in the core (the ALUs are 5x smaller than the LSU or 10x smaller than the FPU). Going wide to 6xALU would cost almost nothing in terms of die size (some other parts of the core would need to grow accordingly too, like the scheduler). However, it would cost a lot of brainpower to do that. IMHO that's exactly what Lisa Su is talking about. Just look at the picture of the Zen core: https://en.wikichip.org/w/images/thumb/c/cb/amd_zen_core_(annotated).png/500px-amd_zen_core_(annotated).png
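A rough sketch of the area claim, using only the ratios stated in the post (ALU block 5x smaller than the LSU, 10x smaller than the FPU). The linear scaling of ALU area with ALU count is an assumption made for illustration, and the sketch deliberately ignores the scheduler, register file and bypass network, which the post itself notes would also have to grow.

```python
# Relative areas, normalised so the existing 4xALU block is 1.0.
# The ratios come from the post; they are not measured from a die shot here.
ALU_BLOCK = 1.0          # four ALUs
LSU = 5.0 * ALU_BLOCK    # "ALUs are 5x smaller than LSU"
FPU = 10.0 * ALU_BLOCK   # "or 10x smaller than FPU"


def alu_block_area(num_alus: int, base_alus: int = 4) -> float:
    """ALU block area, assuming it scales linearly with ALU count."""
    return ALU_BLOCK * num_alus / base_alus


before = ALU_BLOCK + LSU + FPU          # 16.0
after = alu_block_area(6) + LSU + FPU   # 16.5
growth = (after - before) / before

print(f"growth across ALU+LSU+FPU: {growth:.1%}")  # 3.1%
```

Under those assumptions the extra two ALUs add about 3% to the combined ALU, LSU and FPU area. The real cost would be higher once the supporting structures are included, so this is a lower bound on the post's own numbers rather than an estimate of the die-size impact.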
 

TheGiant

Senior member
Jun 12, 2017
593
233
86
Haven't we beaten the "wider core" and SMT4 arguments to death already? It's getting repetitive to the point of absurdity.
how is it possible for current x86 models (zen 3k, cfl, next icelake which is better) to reach the IPC of a13 while maintaining like 4GHz freq
is it possible with current tech?
which part of the cores are the bottlenecks?
 
