Speculation: Ryzen 4000 series/Zen 3

jamescox · Nov 22, 2019

itsmydamnation said:
@int64 , @tamz-msc

How do you people come to this conclusion? You have an SoC that has nearly 100ms longer memory latency compared to Skylake; over 200ms to memory is atrocious (DDR4-2400!). 2-core Cannon Lake doesn't have anything special on the cache front (exact same config as Skylake), yet clock for clock it has the same performance as Skylake. So unless your assertion is that the performance impact of nearly 100ms of extra access latency is zero, the Cannon Lake core increased IPC and offset the loss of memory system performance.

Note that “ms” is milliseconds, which you should not have as a latency figure unless you are talking about spinning-rust random access latency. DRAM latencies are in nanoseconds. In the nanosecond range, even driving a signal through a wire on the chip can be significant. I believe the Pentium 4 actually had two pipeline stages just to drive signals long distances across the chip. This is also why I was surprised that the 4-core cluster went away so soon. I guess they can get low enough latency to do a monolithic cache with 8 cores at 7nm+. I am not sure what the enabler is for that. Perhaps wire length is reduced significantly because the actual area of the cache is much smaller for its size vs. 14nm.
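To put the ns vs. ms point in numbers, here is a rough sketch that converts a latency into core clock cycles. The 4 GHz clock and the 100 ns / 100 ms figures are assumptions picked only for illustration, not measurements of any particular chip.

```python
# Rough sketch: how many core cycles a given latency costs.
# All figures below are illustrative assumptions, not measurements.

NS_PER_MS = 1_000_000  # one millisecond is a million nanoseconds


def latency_in_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Nanoseconds times GHz gives cycles (1 GHz = 1 cycle per ns)."""
    return latency_ns * clock_ghz


CLOCK_GHZ = 4.0  # assumed core clock

dram_like = latency_in_cycles(100, CLOCK_GHZ)              # 100 ns
disk_like = latency_in_cycles(100 * NS_PER_MS, CLOCK_GHZ)  # 100 ms

print(f"100 ns at {CLOCK_GHZ} GHz: {dram_like:,.0f} cycles")
print(f"100 ms at {CLOCK_GHZ} GHz: {disk_like:,.0f} cycles")
```

At the assumed clock, 100 ns is a few hundred cycles while 100 ms is hundreds of millions, which is why the unit matters so much here.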

DrMrLordX · Nov 23, 2019

itsmydamnation said:
at which point SMT 4 will be in Zen4 because they both have 4's in them.... 100% confirmed!!!!

That won't fly in China, no sir!

Tetraphobia - Wikipedia (en.wikipedia.org)

exquisitechar · Nov 23, 2019

itsmydamnation said:
@int64 , @tamz-msc

How do you people come to this conclusion? You have an SoC that has nearly 100ms longer memory latency compared to Skylake; over 200ms to memory is atrocious (DDR4-2400!). 2-core Cannon Lake doesn't have anything special on the cache front (exact same config as Skylake), yet clock for clock it has the same performance as Skylake. So unless your assertion is that the performance impact of nearly 100ms of extra access latency is zero, the Cannon Lake core increased IPC and offset the loss of memory system performance.

Charlie Demerjian also said that the Cannon Lake core increased IPC. It’s just that the implementation we got is underwhelming, as you say.

Richie Rich · Nov 23, 2019

Thunder 57 said:
No, it really isn't. Too bad we have to keep hearing you and others repeat this over and over until it actually comes out.

Here it is, one more time. I wish I could print out 1000 copies and tape them all over your home until you get it.

View attachment 13532

Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen 3's breakthrough SMT4 feature during a Zen 2 presentation is not a good idea.

What would you put there? You simply have two options:

- leave it empty (this would mean there is something hidden and provoke even bigger speculation)
- put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen 3 presentation)

Most people were predicting small changes for Zen 3 (something like Zen 2+ was expected), which would exclude SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen 3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen 4 and 5. This slide proves that Zen 3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen 3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.

Olikan · Nov 23, 2019

Can disclose CCX unification, but can't SMT4... DUDE...

amrnuke · Nov 23, 2019

Richie Rich said:
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen 3's breakthrough SMT4 feature during a Zen 2 presentation is not a good idea.

What would you put there? You simply have two options:

- leave it empty (this would mean there is something hidden and provoke even bigger speculation)

- put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen 3 presentation)

Most people were predicting small changes for Zen 3 (something like Zen 2+ was expected), which would exclude SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen 3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen 4 and 5. This slide proves that Zen 3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen 3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.

So your evidence for SMT4 is that... there is no evidence.

And you're backing up your claim by using Schliemann, which is hilarious. He wasn't the first person to dig at Hisarlik, it was being excavated for centuries before, as early as the 15th century. When he did go there to dig, he destroyed several centuries of ruins to reach what he thought was the Troy of the Iliad, and he was wrong - he actually destroyed the Troy of the Iliad on his way to dig down to ruins thousands of years earlier. Most archaeologists view him as an infamous figure, not for his good works but for his stupidity and for extensively damaging the archaeological record at the site. I went back to Wikipedia to confirm something I remember from Ancient Greek history courses, and found the quote: Schliemann "was not very good at separating fact from interpretation". That you claim you're more of a "Schliemann type of guy" is, well, they're your own words. And they seem accurate insofar as the works are concerned.

DrMrLordX · Nov 23, 2019

Richie Rich said:
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4.

Wasn't that slide a leak from the UK conference that wasn't meant to go public?

maddie · Nov 23, 2019

DrMrLordX said:
Wasn't that slide a leak from the UK conference that wasn't meant to go public?

Ha. Disinformation man, disinformation. Deception at its best.

DrMrLordX · Nov 23, 2019

maddie said:
Ha. Disinformation man, disinformation. Deception at its best.

Wow, that Dr. Su sure is sneaky.

soresu · Nov 23, 2019

amrnuke said:
he actually destroyed the Troy of the Iliad on his way to dig down to ruins thousands of years earlier.

We're going off topic here, but I'll bite as I do love this stuff.

How do they know he destroyed Iliad era Troy/Troia if he actually destroyed it?

Were there fragments that dated to the appropriate era in his excavated rubble or something?

Atari2600 · Nov 23, 2019

Richie Rich said:
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020.

Yeah, because the absolute last thing you would want to do would be to have software developers work on making their programs scale on SMT4.

That would be a nightmare if that happened wouldn't it?

Here, take this, got it for you at the Christmas market:

HurleyBird · Nov 23, 2019

soresu said:
I would take AMD's presentation 32+ MB point to mean that the Zen3 CCD will have more, just that they don't want to say how much yet so early on

I'm taking it to most likely mean that there will be a die with an absurd amount of L3. That, or they will start binning based on the amount of L3 enabled. Marketing speak for "There will be more" is >x, while x+ tends to indicate multiple SKUs.

soresu · Nov 23, 2019

An interesting question is, would a significant increase in cache be beneficial for AVX512 if Zen3 has it?

kapulek · Nov 23, 2019

HurleyBird said:
I'm taking it to most likely mean that there will be a die with an absurd amount of L3. That, or they will start binning based on the amount of L3 enabled. Marketing speak for "There will be more" is >x, while x+ tends to indicate multiple SKUs.

Charlie hinted about different CPU designs back in April, wonder which ZEN gen.

Thunder 57 · Nov 23, 2019

Richie Rich said:
Imagine you are the AMD employee responsible for creating this slide and you know Zen 3 has SMT4. Your goal is to avoid disclosing SMT4 and other significant uarch changes because they are planned to be revealed in spring 2020. So let's assume that disclosing Zen 3's breakthrough SMT4 feature during a Zen 2 presentation is not a good idea.

What would you put there? You simply have two options:

- leave it empty (this would mean there is something hidden and provoke even bigger speculation)

- put "2X" (knowing an SMT4 CPU can run in SMT2 mode too, so you are not technically lying and you keep uarch secrets hidden until the official Zen 3 presentation)

Most people were predicting small changes for Zen 3 (something like Zen 2+ was expected), which would exclude SMT4 (I agree, that's logical, and I understand why most people didn't believe those SMT4 speculations). However, now that AMD has confirmed Zen 3 is a completely new uarch, things have turned upside down. A new uarch means an opportunity for feature set expansion (6x ALUs, SMT4, half-speed AVX512) as a solid base for future iterations in Zen 4 and 5. This slide proves that Zen 3 will be SMT2 capable, but it does not exclude SMT4. Saying with 100% certainty that Zen 3 has no SMT4 is a very extreme opinion (only AMD engineers can say this; silicon matters, not letters on some slide).

I'm more of a Heinrich Schliemann type of guy. He accepted the myth of Troy as truth and, following it, he actually found the lost city.

The slide literally said "Max cores / threads". They could have just put "Cores / threads" but did not.

Given what we have, saying that there is an 80% chance of Zen 3 having SMT4 is a very extreme opinion. I would say Zen 3 being SMT2 is about 98% certain, if not more. Never say never, but I think this is about as close to certain as possible at this point.

soresu · Nov 23, 2019

kapulek said:
Charlie hinted about different CPU designs back in April, wonder which ZEN gen.

Technically they already are, if you count the differences in IODs between the different segments.

uzzi38 · Nov 23, 2019

kapulek said:
Charlie hinted about different CPU designs back in April, wonder which ZEN gen.

It wasn't completely different designs in that article. In fact, the difference in designs is exactly what the '32+MB' cache on that slide suggests.

Panino Manino · Nov 23, 2019

soresu said:
We're going off topic here, but I'll bite as I do love this stuff.

How do they know he destroyed Iliad era Troy/Troia if he actually destroyed it?

Were there fragments that dated to the appropriate era in his excavated rubble or something?

Basically he "dug too much", right?
He must have reached a stratum that was formed during a time period older than the time Troy should have existed at the place. So, he found Troy, but the "wrong" Troy.

amd6502 · Nov 24, 2019

Yes, given what was shown at the scientific computing conference, it seems SMT4 isn't so likely for Zen3. But this isn't saying anything about general 4-way multithreading. If such 4-way MT was closer to SMT2 than SMT4, it would pass for SMT2 for all purposes of scientific computing. It would be pretty pointless (and distracting to the talk) to confuse the audience with an obscure feature irrelevant to this particular community.

Neither does it rule out optional SMT4 that either isn't available or isn't set as default on most SKUs, though that seems very, very unlikely.

The likeliest (>50%) outcome still is vanilla SMT2.

There are other ways to save power on wide cores that they might use in this case. Maybe have part of the execution units sleep during lower P-states and when low usage is detected.

itsmydamnation · Nov 24, 2019

amd6502 said:
The likeliest (>99.99%) outcome still is vanilla SMT2.

fixed it for you

Thunder 57 · Nov 24, 2019

itsmydamnation said:
fixed it for you

I had said 98% and thought that might have been conservative. I'd probably put it around 99.5%.

DrMrLordX · Nov 24, 2019

soresu said:
An interesting question is, would a significant increase in cache be beneficial for AVX512 if Zen3 has it?

Depends on which cache we're talking about here (L2 or L3) and how Zen3 would implement AVX512. If you look at Skylake-X/Skylake-SP, you'll see that Intel modified the cache architecture in two significant ways: L3 became mostly exclusive (and smaller) while L2 got bigger. My general impression is that aggressive use of long-vector SIMD heavily favors L1 and L2 since L3 is too slow. Not sure if it's a bandwidth or latency issue there. You want as much of your working set in L2 as possible. If the working set is too large, the time it takes to evict old data from L2 and shuffle in new data might slow down your application to the point that longer-vector SIMD is no longer of any particular benefit.

I'm not exactly sure how Intel's switch from inclusive L3 to (essentially) exclusive L3 affected AVX512 performance, if at all. But it does seem clear that Intel was working within a limited transistor budget (being constrained by 14nm), so they had to make some sacrifices to increase L2 size to accommodate AVX512.
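As a back-of-the-envelope illustration of the "keep the working set in L2" point, here is a small sketch that checks whether a set of arrays fits in a given cache size. The kernel (three float32 arrays of 64K elements each) is hypothetical, and the per-core L2 sizes of 256 KiB for client Skylake and 1 MiB for Skylake-SP are from memory, so treat them as assumptions.

```python
# Sketch: does a kernel's working set fit in a per-core L2?
# Array shapes and cache sizes below are assumptions for illustration.

KIB = 1024


def working_set_bytes(n_arrays: int, elems_per_array: int, bytes_per_elem: int) -> int:
    """Total bytes touched if every array is streamed through once."""
    return n_arrays * elems_per_array * bytes_per_elem


def fits(working_set: int, cache_bytes: int) -> bool:
    return working_set <= cache_bytes


L2_CLIENT = 256 * KIB    # assumed client Skylake L2 per core
L2_SERVER = 1024 * KIB   # assumed Skylake-SP L2 per core

# Hypothetical kernel: c = a * b + c over three float32 arrays.
ws = working_set_bytes(n_arrays=3, elems_per_array=64 * 1024, bytes_per_elem=4)

print(f"working set: {ws // KIB} KiB")
print(f"fits in 256 KiB L2:  {fits(ws, L2_CLIENT)}")
print(f"fits in 1024 KiB L2: {fits(ws, L2_SERVER)}")
```

With these numbers the working set is 768 KiB, so it spills out of the smaller L2 but stays resident in the larger one. This ignores associativity, prefetching, and anything else sharing the cache.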

That brings us back to Zen3. Zen3 is not constrained by process - it's moving to 7nm+ with an estimated 20% increase in density. We don't know if AMD will take full advantage of that increase in density. Regardless, they have the option of simply increasing L2 size, assuming doing so does not negatively impact bandwidth/latency. They've already told us what L3 is like, and they certainly aren't making it any smaller. That being said, AMD could "phone it in" on AVX512 performance and just run at 1/2 throughput, sort of the same way Summit Ridge handled AVX2 (op fusion). That helps maintain compatibility with AVX512 applications without requiring 512-bit FMACs (or doubled 256-bit FMACs), larger L2, or anything else required to sustain that level of throughput per core. I am concerned by the different AVX512 sub-ISAs out there. Trying to support too many of those, or trying to support AVX512 at all, opens up a whole new can of worms that Intel never foisted on the x86 world with AVX or AVX2.
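The 1/2 throughput idea can be sketched with simple arithmetic. This assumes the core handles a wide instruction by issuing it as several narrower operations on the existing datapath; the two-pipe count is a hypothetical figure and not a statement about any real Zen core.

```python
import math

# Sketch: peak fp32 lanes per cycle when wide vector instructions run on
# a narrower datapath. Pipe count and widths are hypothetical.


def uops_per_instruction(vector_bits: int, datapath_bits: int) -> int:
    """Narrow operations needed to cover one wide instruction."""
    return math.ceil(vector_bits / datapath_bits)


def fp32_lanes_per_cycle(vector_bits: int, datapath_bits: int, pipes: int) -> int:
    """Each pipe retires at most one datapath-wide operation per cycle."""
    lanes_per_op = min(vector_bits, datapath_bits) // 32
    return lanes_per_op * pipes


PIPES = 2  # hypothetical number of FMA-capable pipes

half_rate = fp32_lanes_per_cycle(512, 256, PIPES)  # 512-bit ops on 256-bit units
full_rate = fp32_lanes_per_cycle(512, 512, PIPES)  # native 512-bit units

print(f"ops per 512-bit instruction on 256-bit units: {uops_per_instruction(512, 256)}")
print(f"fp32 lanes/cycle: half-rate {half_rate}, full-rate {full_rate}")
```

Under these assumptions the half-rate design delivers the same peak lanes per cycle as running 256-bit code, so the benefit would be compatibility rather than throughput.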

It would be much nicer if AMD could just support SVE2 instead.

kapulek · Nov 24, 2019

https://twitter.com/x/status/1198543242989637632

maddie · Nov 24, 2019

Panino Manino · Nov 24, 2019

A layman's curiosity: is SMT3 possible? Three threads, an odd number instead of even?
