Speculation: Ryzen 4000 series/Zen 3

Thunder 57

Platinum Member
Aug 19, 2007
2,647
3,706
136
It does make sense for a game console. SMT increases variability due to resource sharing, making it harder to guarantee you will hit a locked 60fps.

I agree. I would not be surprised at all if the next gen consoles were both 8C/8T.
 
darkswordsman17

Mar 11, 2004
23,031
5,495
146
I agree. I would not be surprised at all if the next gen consoles were both 8C/8T.

I think both Sony and Microsoft have already stated 8C/16T. Perhaps that's still just rumor, but I believe they've outright said as much (Microsoft with regard to the Series X, and Sony the next PlayStation). There was also a report that Microsoft had Xbox dev units that could do 3 threads per core via some special code path. Sounds like there's a push to make games much more multi-threaded.
 

moinmoin

Diamond Member
Jun 1, 2017
4,934
7,619
136
It does make sense for a game console. SMT increases variability due to resource sharing, making it harder to guarantee you will hit a locked 60fps.
Wonder if they could/will do a mix of both: disable SMT on cores for the games (or let developers handle that on a per-core basis) but enable it for cores dedicated to the OS and all asynchronous background activity.
 
  • Like
Reactions: amd6502

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
There was also a report that Microsoft had Xbox dev units that could do 3 threads per core via some special code path.
Hyperscheduler is probably four threaded, btw.
8 cores => 8 logical threads and up to 24 hyperscheduled threads.

It is most likely API-triggered SMT.
Single threaded(8 threads) -> game does blank -> SMT2(16 threads) -> game does blank1 -> SMT3(24 threads) -> game does blank2 -> SMT4(32 threads)

OS/Game Audio core => one os/game audio thread, hs mode0 => +one game os audio thread, hs mode1 => +two game os audio threads, hs mode2 => +three game os audio threads.
 
Last edited:
  • Wow
Reactions: amd6502

Thunder 57

Platinum Member
Aug 19, 2007
2,647
3,706
136
I think both Sony and Microsoft have already stated 8C/16T. Perhaps that's still just rumor, but I believe they've outright said as much (Microsoft with regard to the Series X, and Sony the next PlayStation). There was also a report that Microsoft had Xbox dev units that could do 3 threads per core via some special code path. Sounds like there's a push to make games much more multi-threaded.

I could be very wrong. I don't follow consoles much. Call it a gut feeling. It's not like when I say that Zen 3 will be SMT2 for sure which some people refuse to believe, if you know what I mean :D .
 

Adonisds

Member
Oct 27, 2019
98
33
51
All I'm saying is people invent things out of thin air, no ES bench leaks necessary.

PS: on the flip side I welcome the relative silence too, gives us heaps of time to endlessly debate SMT4 and 6xALU for ALL. /s
Yes! Keep discussing SMT4, folks, and we will write the names of those who did
 
  • Like
Reactions: dorion

Tuna-Fish

Golden Member
Mar 4, 2011
1,324
1,462
136
I agree. I would not be surprised at all if the next gen consoles were both 8C/8T.

They are not. There is no point in limiting the hardware threads, because if any game dev wants less jittery threads, they can just choose not to schedule more than one thread per core.
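
A minimal sketch of that idea, assuming a Linux-style topology in which each logical CPU reports its SMT siblings (as in `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`). Console SDKs have their own affinity APIs, which are not shown here; the helper names and the sample 4C/8T layout below are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a sysfs-style CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(siblings_by_cpu):
    """Pick the lowest-numbered logical CPU of every physical core."""
    chosen = set()
    for text in siblings_by_cpu.values():
        chosen.add(parse_cpu_list(text)[0])
    return sorted(chosen)


# Hypothetical 4-core/8-thread layout: CPU n shares a core with CPU n+4.
sample = {cpu: f"{cpu % 4},{cpu % 4 + 4}" for cpu in range(8)}
primary_cpus = one_cpu_per_core(sample)
print(primary_cpus)
```

On Linux, the resulting set could be handed to `os.sched_setaffinity(0, primary_cpus)` so that latency-sensitive threads never share a core with a sibling thread.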

Hyperscheduler is probably four threaded, btw.
8 cores => 8 logical threads and up to 24 hyperscheduled threads.

It is most likely API-triggered SMT.
Single threaded(8 threads) -> game does blank -> SMT2(16 threads) -> game does blank1 -> SMT3(24 threads) -> game does blank2 -> SMT4(32 threads)

OS/Game Audio core => one os/game audio thread, hs mode0 => +one game os audio thread, hs mode1 => +two game os audio threads, hs mode2 => +three game os audio threads.

Absolutely nothing like this exists on the next-gen consoles. Where do you even get this horse****?
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
Absolutely nothing like this exists on the next-gen consoles.
It was a reply to this:
darkswordsman17: " There was also a report that Microsoft had Xbox dev units that could do 3 threads per core via some special code path."

Which is a reference to the Hyperscheduler_mode=Enabled and SMT_mode=disabled.
Logical processors = 8
Hs_mode0 = 8(logical)+8(hyperscheduled threads)
Hs_mode1 = 8(logical)+16(hyperscheduled threads)
Hs_mode2 = 8(logical)+24(hyperscheduled threads)

A hyperscheduler is an architectural device that supposedly can accelerate context switching and better support concurrent contexts. In the context of the "leak", AMD/Microsoft were using this to improve the consistency of multithreading performance.

Using Zen as an example:
SMT_mode = flat 96(T0) and 96(T1)
HS_mode0 = 192(T0) or 96(T0) and 96(T1) when SMT is needed.
HS_mode1 = " " " or 64(T0) and 64(T1) and 64(T2) when SMT3 is needed.
HS_mode2 = " " " " " or 48(T0) and 48(T1) and 48(T2) and 48(T3) when SMT4 is needed.

=> Fetch windows are tracked in a 64-entry (32 entries in SMT mode) FIFO from fetch until retirement.
=> In SMT mode each thread has 10 dedicated IBQ entries.
=> (Zen) The op cache is organized as an associative cache with 32 sets and 8 ways. At each set-way intersection is an entry containing up to 8 instructions, so the maximum capacity of the op cache is then 2K instructions. The actual limit may be less due to efficiency considerations. Avoid hot code regions that approach this size for a single thread or half this size for two SMT threads.
=> (Zen 2) The op cache is organized as an associative cache with 64 sets and 8 ways. At each set-way intersection is an entry containing up to 8 instructions, so the maximum capacity of the op cache is then 4K ops. The actual limit may be less due to efficiency considerations. Avoid hot code regions that approach this size for a single thread or half this size for two SMT threads.
=> (Zen) The retire queue can hold up to 192 micro ops or 96 per thread in SMT mode.
=> (Zen 2) The retire queue can hold up to 224 micro ops or 112 per thread in SMT mode.
=> It is expensive to transition between single-threaded (1T) mode and dual-threaded (2T) mode and vice versa, so software should restrict the number of transitions. If running in 2T mode, and one thread finishes execution, it may be beneficial to avoid transitioning to 1T mode if the second thread is also about to finish execution.
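
The arithmetic behind the mode list and the retire-queue split above can be written out as a short sketch. It only restates the numbers claimed in this post (8 cores, a 192-entry retire queue divided evenly); it says nothing about whether such hardware exists, and the function names are made up for illustration.

```python
CORES = 8
RETIRE_QUEUE_ENTRIES = 192  # Zen figure quoted in this post


def hyperscheduled_threads(hs_mode, cores=CORES):
    """Extra threads claimed for an Hs mode: one more per core for each mode step."""
    return cores * (hs_mode + 1)


def per_thread_entries(active_threads, total=RETIRE_QUEUE_ENTRIES):
    """Even split of the retire queue across the threads active on one core."""
    return total // active_threads


for mode in range(3):
    threads_per_core = mode + 2  # Hs_mode0 -> SMT2, Hs_mode1 -> SMT3, Hs_mode2 -> SMT4
    print(
        f"Hs_mode{mode}: {CORES} logical + {hyperscheduled_threads(mode)} hyperscheduled, "
        f"{per_thread_entries(threads_per_core)} retire-queue entries per thread "
        f"at SMT{threads_per_core}"
    )
```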
 
Last edited:
  • Like
Reactions: amd6502

tamz_msc

Diamond Member
Jan 5, 2017
3,726
3,554
136
Geekbench 5 for Renoir is now out:
Multi-core seems low for a 6-core part, but I'd not read too much into it.
Compared to Ice Lake in a similar chassis:
Which has faster memory in dual-channel, the Ice Lake part is 15% ahead in integer.

I don't think that Renoir would be up to par with Ice Lake in ST performance in laptops.
 
  • Like
Reactions: lightmanek

amd6502

Senior member
Apr 21, 2017
971
360
136
It was a reply to this:
darkswordsman17: " There was also a report that Microsoft had Xbox dev units that could do 3 threads per core via some special code path."

Which is a reference to the Hyperscheduler_mode=Enabled and SMT_mode=disabled.
Logical processors = 8
Hs_mode0 = 8(logical)+8(hyperscheduled threads)
Hs_mode1 = 8(logical)+16(hyperscheduled threads)
Hs_mode2 = 8(logical)+24(hyperscheduled threads)

Sounds a bit like asymmetric multithreading.

mode0: 1 big + 1 small (aSMT2)
mode1: 1 big + 2 small (aSMT3)
mode2: 1 big + 3 small (aSMT4)

I wonder how credible the rumor still is.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,726
3,554
136
Multi-core doesn't seem low to me... after all, the 4500U only has 6 cores and 6 threads
I checked some 10710U results and you seem to be right. SMT should add another 25%. That said, it's interesting to note that the Geekbench leaks so far are all of non-SMT parts.
 
  • Like
Reactions: Tarkin77

Thunder 57

Platinum Member
Aug 19, 2007
2,647
3,706
136
I certainly would.

They are not. There is no point in limiting the hardware threads, because if any game dev wants less jittery threads, they can just choose not to schedule more than one thread per core.

What I meant is that I don't predict or expect it, but at the same time I wouldn't be surprised. They'll likely be 8/16, but it wouldn't shock me if they were 8/8.

Sounds a bit like asymmetric multithreading.

mode0: 1 big + 1 small (aSMT2)
mode1: 1 big + 2 small (aSMT3)
mode2: 1 big + 3 small (aSMT4)

I wonder how credible the rumor still is.

I'm going to go with not credible at all.
 
  • Like
Reactions: lobz

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
(...)
I don't think that Renoir would be up to par with Ice Lake in ST performance in laptops.
It has long been well known that ICL has higher IPC than both CML and Zen 2. ST performance, however, is not just IPC. We'll see when comparable laptops get tested equally, and by that I mean not the lowest-end Renoir vs the highest-end Ice Lake i7...
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
I think going forward we can be sure those leaks will be drying up. Zen1 and Zen2 benchmarks were leaked months ahead of launch.
AMD will be more reluctant to share details going forward to avoid Osborning themselves.
It's 2 quarters away and there is nothing on the internet for RDNA2/Zen3, in contrast to Zen benchmark leaks almost 9+ months before launch and 7+ months for Zen2.

AMD are in full paranoia mode now. Leaks will only happen when OEMs or reviewers get the chips.

No more UserBenchmark results months in advance.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
That said, it's interesting to note that the Geekbench leaks so far are all of non-SMT parts.
https://browser.geekbench.com/v4/cpu/15154654 - I'll assume you meant non-SMT for GB5 only.

Ice Lake is not a threat for Renoir, Comet Lake may actually end up performing better in [cough] real world benchmarks [/cough] than the shiny new 10nm chips.



The only interesting point in comparing ICL vs. Renoir performance is the upcoming TGL products and their presumably better efficiency and peak performance.
 
  • Like
Reactions: RetroZombie

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Geekbench 5 for Renoir is now out:
Multi-core seems low for a 6-core part, but I'd not read too much into it.
Compared to Ice Lake in a similar chassis:
Which has faster memory in dual-channel, the Ice Lake part is 15% ahead in integer.

I don't think that Renoir would be up to par with Ice Lake in ST performance in laptops.

The Renoir device here has the same memory as the Ice Lake systems: LPDDR4x-3733.

The average 3500X scores about 1200 single and 4800 multi. The average 3600 scores about 7000 multi. SMT yield on Ryzen is higher than that of Intel.
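
As a rough check of that claim, here is the SMT yield implied by the two multi-core averages quoted above. This is only a sketch: it assumes the 3500X (6C/6T) and 3600 (6C/12T) are otherwise comparable, which ignores any clock or binning differences, and `smt_yield` is a name made up for the example.

```python
def smt_yield(multi_with_smt, multi_without_smt):
    """Fractional multi-core gain attributed to SMT."""
    return multi_with_smt / multi_without_smt - 1.0


# Approximate Geekbench 5 multi-core averages quoted in the post.
ryzen_yield = smt_yield(7000, 4800)
print(f"Implied SMT yield on Zen 2: {ryzen_yield:.1%}")
```

That works out to roughly 46%, well above the 25% figure mentioned earlier in the thread.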

And no, it won't be up to par against Ice Lake in ST. What made you think it would? It's got lower IPC and clocks about the same.

It's stronger in other ways.
 

uzzi38

Platinum Member
Oct 16, 2019
2,565
5,575
146
Clocks around 7% higher.
And the IPC difference between Skylake and Ice Lake is 18% on average.

It's not enough to make up that difference when you consider Zen 2 is much, much closer to Skylake than it is to Ice Lake.
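
Put as back-of-the-envelope arithmetic, under two loud assumptions that the post does not state outright: that the 7% clock advantage belongs to the Renoir part, and that Zen 2 IPC is treated as equal to Skylake's (the post only says "much closer"). Single-thread performance is modelled simply as IPC times clock.

```python
# Assumption: Renoir clocks about 7% higher than the Ice Lake part.
CLOCK_ADVANTAGE = 1.07
# From the post: Ice Lake IPC is about 18% above Skylake on average.
ICL_IPC_OVER_SKYLAKE = 1.18
# Assumption: Zen 2 IPC is taken as equal to Skylake for this estimate.
ZEN2_IPC_OVER_SKYLAKE = 1.00

# Relative single-thread performance, modelled as IPC x clock.
renoir_vs_icl = (ZEN2_IPC_OVER_SKYLAKE * CLOCK_ADVANTAGE) / ICL_IPC_OVER_SKYLAKE
print(f"Renoir ST relative to Ice Lake: {renoir_vs_icl:.3f}")
```

Under those assumptions Renoir lands around 9% behind Ice Lake in single-thread, which matches the direction of the argument above.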
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,092
1,065
136
I just want to point out that AMD missed the mark with Zen2. Intel is still #1 in gaming, but AMD wiped the floor with them in everything else. They were expecting better clocks and performance in Zen2. I think Zen3 will put any arguments to rest. The problem: the market was flat, and because of AMD there was growth again. Now the market is saturated. Server side is a different story because Intel has been sticking it to big business for a very long time.

I think that Zen3 will be a bigger step forward than Zen2 was.