Speculation: Ryzen 4000 series/Zen 3


jamescox

Senior member
Nov 11, 2009
637
1,103
136
@int64 , @tamz-msc

How do you people come to this conclusion? You have a SOC that has near 100ms longer memory latency compared to skylake, over 200ms to memory is atrocious (ddr4 2400!) . 2 core Cannon lake doesn't have anything special on the cache front ( exact same config as skylake) yet clock for clock has the same performance as skylake. So unless your assertion is the performance impact of near 100ms of extra access latency is 0 then cannonlake core increased IPC and offset the loss of memory system performance.

Note the “ms” is milliseconds, which you should not have as a latency figure unless you are talking about spinning rust random access latency. DRAM latencies are in nanoseconds. In the nanosecond range, even driving a signal through a wire on the chip can be significant. I believe the Pentium 4 actually had two pipeline stages just to drive signals long distances across the chip. This is also why I was surprised that the 4-core cluster went away so soon. I guess they can get low enough latency to do a monolithic cache with 8 cores at 7nm+. I am not sure what the enabler is for that. Perhaps wire length is reduced significantly because the actual area of the cache is much smaller for its size vs. 14 nm.
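To put the unit mix-up in perspective, here is a minimal sketch that converts a latency into core clock cycles. The 4 GHz clock and the function name `cycles` are assumptions chosen for illustration, not figures from the posts above.

```python
NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def cycles(latency_ns: int, clock_mhz: int) -> int:
    """Core clock cycles that elapse during latency_ns at clock_mhz.

    1 MHz is one cycle per microsecond, i.e. one cycle per 1000 ns.
    """
    return latency_ns * clock_mhz // 1000


CLOCK_MHZ = 4000  # assumed 4 GHz core clock, for illustration only

dram_like = cycles(100, CLOCK_MHZ)              # 100 ns of latency
disk_like = cycles(100 * NS_PER_MS, CLOCK_MHZ)  # 100 ms of latency

print(f"100 ns = {dram_like} cycles")
print(f"100 ms = {disk_like} cycles")
```

At the assumed clock, 100 ns costs a few hundred cycles while 100 ms costs hundreds of millions, which is why the two units cannot be swapped casually.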
 

exquisitechar

Senior member
Apr 18, 2017
657
871
136
@int64 , @tamz-msc

How do you people come to this conclusion? You have a SOC that has near 100ms longer memory latency compared to skylake, over 200ms to memory is atrocious (ddr4 2400!) . 2 core Cannon lake doesn't have anything special on the cache front ( exact same config as skylake) yet clock for clock has the same performance as skylake. So unless your assertion is the performance impact of near 100ms of extra access latency is 0 then cannonlake core increased IPC and offset the loss of memory system performance.
Charlie Demerjian also said that the Cannon Lake core increased IPC. It’s just that the implementation we got is underwhelming, as you say.
 

Richie Rich

Senior member
Jul 28, 2019
470
229
76
No, it really isn't. :rolleyes: Too bad we have to keep hearing you and others repeat this over and over until it actually comes out.

Here it is, one more time. I wish I could print out 1000 copies and tape them all over your home until you get it.

View attachment 13532
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen3's breakthrough SMT4 feature during the Zen2 presentation is not a good idea.

What would you put there? You have simply two options:
  1. - leave it empty (this would mean there is something hidden and provoke even bigger speculations)
  2. - put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen3 presentation)
Most people were predicting small changes for Zen3 (something like Zen2+ was expected), which excludes SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen4 and 5. This slide proves that Zen3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.
 

amrnuke

Golden Member
Apr 24, 2019
1,181
1,772
136
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen3's breakthrough SMT4 feature during the Zen2 presentation is not a good idea.

What would you put there? You have simply two options:
  1. - leave it empty (this would mean there is something hidden and provoke even bigger speculations)
  2. - put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen3 presentation)
Most people were predicting small changes for Zen3 (something like Zen2+ was expected), which excludes SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen4 and 5. This slide proves that Zen3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.
So your evidence for SMT4 is that... there is no evidence.

And you're backing up your claim by using Schliemann, which is hilarious. He wasn't the first person to dig at Hisarlik, it was being excavated for centuries before, as early as the 15th century. When he did go there to dig, he destroyed several centuries of ruins to reach what he thought was the Troy of the Iliad, and he was wrong - he actually destroyed the Troy of the Iliad on his way to dig down to ruins thousands of years earlier. Most archaeologists view him as an infamous figure, not for his good works but for his stupidity and for extensively damaging the archaeological record at the site. I went back to Wikipedia to confirm something I remember from Ancient Greek history courses, and found the quote: Schliemann "was not very good at separating fact from interpretation". That you claim you're more of a "Schliemann type of guy" is, well, they're your own words. And they seem accurate insofar as the works are concerned.
 

soresu

Platinum Member
Dec 19, 2014
2,660
1,860
136
he actually destroyed the Troy of the Iliad on his way to dig down to ruins thousands of years earlier.
We're going off topic here, but I'll bite as I do love this stuff.

How do they know he destroyed Iliad era Troy/Troia if he actually destroyed it?

Were there fragments that dated to the appropriate era in his excavated rubble or something?
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
Imagine you are AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclose SMT4 and other significant uarch changes because it is planned to be revealed on spring 2020.

Yeah, because the absolute last thing you would want to do would be to have S/W developers work on making their programs scale on SMT4.

That would be a nightmare if that happened, wouldn't it?

Here, take this, got it for you at the Christmas market:

71GV79NPpZL._UL1163_.jpg
 

HurleyBird

Platinum Member
Apr 22, 2003
2,684
1,268
136
I would take AMD's presentation 32+ MB point to mean that the Zen3 CCD will have more, just that they don't want to say how much yet so early on

I'm taking it to most likely mean that there will be a die with an absurd amount of L3. That, or they will start binning based on the amount of L3 enabled. Marketing speak for "There will be more" is >x, while x+ tends to indicate multiple SKUs.
 

soresu

Platinum Member
Dec 19, 2014
2,660
1,860
136
An interesting question is, would a significant increase in cache be beneficial for AVX512 if Zen3 has it?
 

kapulek

Member
Oct 16, 2010
56
33
91
I'm taking it to most likely mean that there will be a die with an absurd amount of L3. That, or they will start binning based on the amount of L3 enabled. Marketing speak for "There will be more" is >x, while x+ tends to indicate multiple SKUs.
Charlie hinted at different CPU designs back in April; I wonder which Zen generation.
 

Thunder 57

Platinum Member
Aug 19, 2007
2,675
3,801
136
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen3's breakthrough SMT4 feature during the Zen2 presentation is not a good idea.

What would you put there? You have simply two options:
  1. - leave it empty (this would mean there is something hidden and provoke even bigger speculations)
  2. - put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen3 presentation)
Most people were predicting small changes for Zen3 (something like Zen2+ was expected), which excludes SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen4 and 5. This slide proves that Zen3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.

The slide literally said "Max cores / threads". They could have just put "Cores / threads" but did not.

Given what we have, saying that there is an 80% chance of Zen 3 having SMT4 is a very extreme opinion. I would say Zen 3 being SMT2 is about 98% certain, if not more. Never say never, but I think this is about as close to certain as possible at this point.
 

uzzi38

Platinum Member
Oct 16, 2019
2,626
5,927
146
Charlie hinted at different CPU designs back in April; I wonder which Zen generation.

It wasn't completely different designs in that article. In fact, the difference in designs is exactly what the '32+MB' cache on that slide suggests.
 

Panino Manino

Senior member
Jan 28, 2017
821
1,022
136
We're going off topic here, but I'll bite as I do love this stuff.

How do they know he destroyed Iliad era Troy/Troia if he actually destroyed it?

Were there fragments that dated to the appropriate era in his excavated rubble or something?
Basically he "dug too much", right?
He must have reached a stratum formed during a time period older than the one in which Troy should have existed at the site. So, he found Troy, but the "wrong" Troy.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
Yes, given what was shown at the scientific computing conference, it seems SMT4 isn't so likely for Zen3. But this doesn't say anything about 4-way multithreading in general. If such 4-way MT were closer to SMT2 than SMT4, it would pass as SMT2 for all purposes of scientific computing. It would be pretty pointless (and distracting to the talk) to confuse the audience with an obscure feature irrelevant to this particular community.

Neither does it rule out optional SMT4 that either isn't available or isn't set as default on most SKUs, though that seems very, very unlikely.

The likeliest (>50%) outcome still is vanilla SMT2.

There are other ways to save power on wide cores that they might use in this case. Maybe have part of the execution units sleep during lower P-states and when low usage is detected.
 

DrMrLordX

Lifer
Apr 27, 2000
21,627
10,841
136
An interesting question is, would a significant increase in cache be beneficial for AVX512 if Zen3 has it?

Depends on which cache we're talking about here (L2 or L3) and how Zen3 would implement AVX512. If you look at Skylake-X/Skylake-SP, you'll see that Intel modified the cache architecture in two significant ways: L3 became mostly exclusive (and smaller) while L2 got bigger. My general impression is that aggressive use of large-vector-length SIMD heavily favors L1 and L2 since L3 is too slow. Not sure if it's a bandwidth or latency issue there. You want as much of your working set in L2 as possible. If the working set is too large, the time it takes to evict old data from L2 and shuffle in new data might slow down your application to the point that longer-vector-length SIMD is no longer of any particular benefit.
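As a rough illustration of the "working set in L2" point, here is a back-of-the-envelope sketch. The kernel (a three-array elementwise add over doubles) and the helper names are hypothetical. The L2 sizes are the per-core figures for client Skylake (256 KiB) and Skylake-X/SP (1 MiB). It ignores associativity, prefetching, and everything else competing for the cache, so treat it as an estimate only.

```python
KIB = 1024

# Per-core L2 capacities
L2_SKYLAKE_CLIENT = 256 * KIB
L2_SKYLAKE_SP = 1024 * KIB


def working_set_bytes(n_elements: int, bytes_per_element: int, n_arrays: int) -> int:
    """Total bytes touched by a kernel that streams over n_arrays equal-sized arrays."""
    return n_elements * bytes_per_element * n_arrays


def fits(working_set: int, cache_bytes: int) -> bool:
    """Naive capacity check: does the whole working set fit in the cache?"""
    return working_set <= cache_bytes


# Hypothetical kernel: c[i] = a[i] + b[i] over 32768 doubles (three arrays).
ws = working_set_bytes(n_elements=32768, bytes_per_element=8, n_arrays=3)

print(f"working set: {ws // KIB} KiB")
print("fits in 256 KiB L2:", fits(ws, L2_SKYLAKE_CLIENT))
print("fits in 1 MiB L2:  ", fits(ws, L2_SKYLAKE_SP))
```

Under these assumptions the 768 KiB working set spills out of a 256 KiB L2 but stays resident in a 1 MiB one, which is the kind of case where a larger L2 would matter for wide SIMD.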

I'm not exactly sure how Intel's switch from inclusive L3 to (essentially) exclusive L3 affected AVX512 performance, if at all. But it does seem clear that Intel was working within a limited transistor budget (being constrained by 14nm), so they had to make some sacrifices to increase L2 size to accommodate AVX512.

That brings us back to Zen3. Zen3 is not constrained by process - it's moving to 7nm+ with an estimated 20% increase in density. We don't know if AMD will take full advantage of that increase in density. Regardless, they have the option of simply increasing L2 size, assuming doing so does not negatively impact bandwidth/latency. They've already told us what L3 is like, and they certainly aren't making it any smaller. That being said, AMD could "phone it in" on AVX512 performance and just run at 1/2 throughput, sort of the same way Summit Ridge handled AVX2 (op fusion). That helps maintain compatibility with AVX512 applications without requiring 512-bit FMACs (or doubled 256-bit FMACs), larger L2, or anything else required to sustain that level of throughput per core. I am concerned by the different AVX512 sub-ISAs out there. Trying to support too many of those, or trying to support AVX512 at all, opens up a whole new can of worms that Intel never foisted on the x86 world with AVX or AVX2.
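The half-throughput idea can be sketched with a toy model: a 512-bit operation on 16 float32 lanes is executed in passes over a narrower native datapath. This is only an illustration of the general "split a wide op into narrower pieces" approach, not a description of how any AMD core actually does it. The function name `run_vector_op` and the lane counts are assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def run_vector_op(
    a: Sequence[float],
    b: Sequence[float],
    op: Callable[[float, float], float],
    native_lanes: int,
) -> Tuple[List[float], int]:
    """Apply op lane by lane, native_lanes at a time.

    Returns the full-width result and the number of passes the
    (hypothetical) execution unit needed to produce it.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    if native_lanes <= 0:
        raise ValueError("native_lanes must be positive")

    result: List[float] = []
    passes = 0
    for start in range(0, len(a), native_lanes):
        chunk_a = a[start:start + native_lanes]
        chunk_b = b[start:start + native_lanes]
        result.extend(op(x, y) for x, y in zip(chunk_a, chunk_b))
        passes += 1
    return result, passes


# One 512-bit add = 16 float32 lanes.
a = [float(i) for i in range(16)]
b = [1.0] * 16

full, full_passes = run_vector_op(a, b, lambda x, y: x + y, native_lanes=16)
half, half_passes = run_vector_op(a, b, lambda x, y: x + y, native_lanes=8)

print("512-bit datapath passes:", full_passes)
print("256-bit datapath passes:", half_passes)
print("same result:", full == half)
```

The software-visible result is identical either way; only the number of passes changes, which is the sense in which such a design keeps compatibility while giving up peak throughput.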

It would be much nicer if AMD could just support SVE2 instead.