Speculation: Ryzen 4000 series/Zen 3


Gideon

Golden Member
Nov 27, 2007
1,608
3,570
136
Gideon, could you please explain those charts? I'm looking at them and don't understand them. What is the base? What kind of data is being compressed/decompressed?
Maybe it's obvious, but I simply don't understand the scale: is it by time or by compressed data size?

These charts list 4 newer decompression algorithms from the same family (from slowest to fastest: Leviathan, Kraken, Mermaid, Selkie). These are compared to various competitors, with zlib (the industry standard) appearing on every graph. The upper 3 bars denote how well a file is packed (more is better); the lower 3 bars show the decompression speed.

All of the algorithms are 2x to 14x faster than zlib and compress at least as well (Selkie) or considerably better (Kraken/Leviathan) at the cost of speed. As zlib is on every graph, it's easy to see that Selkie is about 3.5x faster than Kraken at the cost of compression ratio.

The PS5 uses Kraken for SSD data compression/decompression; we don't know about the Xbox, but it's probably something similar (e.g. zlib). I'd hypothesize Kraken is definitely OK for I/O but might be too heavyweight for memory compression. Selkie might be better, but if not, there are even lighter algorithms (that also compress less).
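The ratio-versus-speed tradeoff described above can be demonstrated with zlib itself; the Oodle codecs are proprietary, so this sketch uses zlib's compression levels as a stand-in, and the payload is made up for illustration:

```python
import time
import zlib

# Text-like payload with some variety, standing in for typical game data.
data = b"".join(
    b"entity %d pos=%d rot=%d\n" % (i, (i * 17) % 1024, i % 360)
    for i in range(20000)
)

for level in (1, 6, 9):  # fast, default, best ratio
    packed = zlib.compress(data, level)
    start = time.perf_counter()
    for _ in range(20):
        zlib.decompress(packed)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(packed)
    print(f"level {level}: ratio {ratio:.1f}:1, "
          f"decompression {20 * len(data) / elapsed / 2**20:.0f} MiB/s")
```

Higher levels buy a better ratio at the cost of time; the Oodle family trades along the same axis, but with different codecs rather than levels.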
 
  • Like
Reactions: RetroZombie

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
These charts list 4 newer decompression algorithms from the same family (from slowest to fastest: Leviathan, Kraken, Mermaid, Selkie). These are compared to various competitors, with zlib (the industry standard) appearing on every graph. The upper 3 bars denote how well a file is packed (more is better); the lower 3 bars show the decompression speed.

All of the algorithms are 2x to 14x faster than zlib and compress at least as well (Selkie) or considerably better (Kraken/Leviathan) at the cost of speed. As zlib is on every graph, it's easy to see that Selkie is about 3.5x faster than Kraken at the cost of compression ratio.

The PS5 uses Kraken for SSD data compression/decompression; we don't know about the Xbox, but it's probably something similar (e.g. zlib). I'd hypothesize Kraken is definitely OK for I/O but might be too heavyweight for memory compression. Selkie might be better, but if not, there are even lighter algorithms (that also compress less).
It should be noted that those compression algorithms are proprietary and specific to game development, since they are developed and sold by RAD Game Tools. For general usage I'd expect an open-source algorithm to be used. https://quixdb.github.io/squash-benchmark/#compression-vs-decompression is a pretty great benchmark tool and site comparing all the newer algorithms. But what's already clear to see is that all of them fall well short of typical RAM bandwidth, so the algorithm used will very likely have to be more specialized, offering some compression at ideally no loss of speed with any kind of data.

Edit: A good academic paper from last year, "Enabling Transparent Memory-Compression for Commodity Memory Systems", discusses pitfalls that may make transparent memory compression less worthwhile than we might imagine.
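The RAM-bandwidth point can be eyeballed with the two general-purpose codecs in Python's standard library (a rough sketch only; the squash benchmark linked above does this properly across many codecs):

```python
import time
import zlib
import lzma

data = bytes(range(256)) * 4096  # 1 MiB of highly compressible data

def decompression_speed(name, compress, decompress):
    packed = compress(data)
    start = time.perf_counter()
    for _ in range(10):
        decompress(packed)
    mib_s = 10 * len(data) / (time.perf_counter() - start) / 2**20
    print(f"{name}: ~{mib_s:.0f} MiB/s decompression")

decompression_speed("zlib", zlib.compress, zlib.decompress)
decompression_speed("lzma", lzma.compress, lzma.decompress)
# Both land far below DDR4 bandwidth (tens of GiB/s), which is why
# transparent memory compression needs dedicated, specialized hardware.
```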
 
Last edited:

Adonisds

Member
Oct 27, 2019
98
33
51
These charts list 4 newer decompression algorithms from the same family (from slowest to fastest: Leviathan, Kraken, Mermaid, Selkie). These are compared to various competitors, with zlib (the industry standard) appearing on every graph. The upper 3 bars denote how well a file is packed (more is better); the lower 3 bars show the decompression speed.

All of the algorithms are 2x to 14x faster than zlib and compress at least as well (Selkie) or considerably better (Kraken/Leviathan) at the cost of speed. As zlib is on every graph, it's easy to see that Selkie is about 3.5x faster than Kraken at the cost of compression ratio.

The PS5 uses Kraken for SSD data compression/decompression; we don't know about the Xbox, but it's probably something similar (e.g. zlib). I'd hypothesize Kraken is definitely OK for I/O but might be too heavyweight for memory compression. Selkie might be better, but if not, there are even lighter algorithms (that also compress less).
What should happen with PC games in the next few years regarding SSD requirements? Will an SSD comparable to the consoles' not be required, because it would be expensive and decompressing data uses a lot of cores?

Reading the DF article on the Xbox, it doesn't seem that it uses custom hardware for decompression. Could that algorithm that Microsoft developed be used in PC games?
 

Gideon

Golden Member
Nov 27, 2007
1,608
3,570
136
What should happen with PC games in the next few years regarding SSD requirements? Will an SSD comparable to the consoles' not be required, because it would be expensive and decompressing data uses a lot of cores?

Reading the DF article on the Xbox, it doesn't seem that it uses custom hardware for decompression. Could that algorithm that Microsoft developed be used in PC games?

The DigitalFoundry article on Eurogamer actually directly mentions a hardware decompression block:
"Our second component is a high-speed hardware decompression block that can deliver over 6GB/s," reveals Andrew Goossen. "This is a dedicated silicon block that offloads decompression work from the CPU and is matched to the SSD so that decompression is never a bottleneck. The decompression hardware supports Zlib for general data and a new compression [system] called BCPack that is tailored to the GPU textures that typically comprise the vast majority of a game's package size."

However, I don't think the situation will be so bad that a hardware block is strictly required. On a PC, a PCIe 4.0 SSD (even without compression) should match the throughput of the Xbox SSD (with compression), so that should be enough. The new DirectStorage API should also reduce some overhead.

On top of that, PCs can also simply have more RAM (which is always faster than an SSD) as an alternative. But yeah, I think you will at the very least need an NVMe drive, possibly a PCIe 4.0 one.
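A back-of-the-envelope check of that claim, using Microsoft's published ~2.4 GB/s raw SSD figure; the ~2x typical compression ratio and the 7 GB/s PC number are rough assumptions for illustration, not specs:

```python
# Rough throughput comparison: console SSD + hardware decompression
# vs. an uncompressed read from a fast PC NVMe drive.
xbox_raw_gbps = 2.4        # published Series X raw SSD throughput
compression_ratio = 2.0    # assumed typical effective ratio
xbox_effective_gbps = xbox_raw_gbps * compression_ratio

pcie4_ssd_gbps = 7.0       # ballpark sequential read, PCIe 4.0 x4 NVMe

print(f"Xbox effective: {xbox_effective_gbps:.1f} GB/s")   # 4.8 GB/s
print(f"PC NVMe raw:    {pcie4_ssd_gbps:.1f} GB/s")
```

Under these assumptions a fast PCIe 4.0 drive out-reads the console's effective rate even before any software decompression on the PC side.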
 
Last edited:

IndyColtsFan

Lifer
Sep 22, 2007
33,656
687
126
So, do you guys think the 4950x will be released in Q1 2021 rather than Q4 of this year? My 8700k is having issues but I don’t want to build a new box now with Zen 3 around the corner. My normal upgrade cycle would fall in November of this year and my plan has always been to go with a 4900x or 4950x. I’m excited for these parts.

The AIO on my 8700k failed and it got pretty hot. I replaced it with a Noctua D15s, but stability has been an issue and it was crashing and freezing a couple of times per week. I've downclocked it some but have still seen some crashes, so I'll probably do a BIOS upgrade next and just leave stock settings. I was planning on giving the 8700k to my brother when I upgrade, so hopefully it will be fine at stock.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
@IndyColtsFan

No idea when a 4950X will come out. It might be in the initial launch in July-Sept of this year (cross your fingers for July).

AMD has had a lot of time to look at what did and didn't work for Matisse, so Vermeer should be a less problematic launch. It doesn't seem they're upping core counts either, so I would expect TDPs and actual power usage to be more reasonable. Hopefully they will have done something to mitigate hotspots by way of the new process. As long as they keep boost targets about the same, they should be able to get all-core clocks up by lowering power/voltage, which the new process permits (assuming they use 7nm+).
 
  • Like
Reactions: IndyColtsFan

IndyColtsFan

Lifer
Sep 22, 2007
33,656
687
126
@IndyColtsFan

No idea when a 4950X will come out. It might be in the initial launch in July-Sept of this year (cross your fingers for July).

AMD has had a lot of time to look at what did and didn't work for Matisse, so Vermeer should be a less problematic launch. It doesn't seem they're upping core counts either, so I would expect TDPs and actual power usage to be more reasonable. Hopefully they will have done something to mitigate hotspots by way of the new process. As long as they keep boost targets about the same, they should be able to get all-core clocks up by lowering power/voltage, which the new process permits (assuming they use 7nm+).

Thanks! I don’t think I’m going to do a manual overclock this time, which is the end of an era for me (started overclocking in 1995 and all my PCs since then have been overclocked). I don’t want to do a custom loop, but I might in order to cool it enough to get higher boosts.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,226
9,990
126
Speaking of upping core counts, was this already discussed: would AMD choose to make the "floor" model of the Ryzen 4000 CPUs an 8C/16T at a minimum, with 12C/24T, 16C/32T, and possibly some models with the same core/thread counts as intermediate models? (Or make the minimum an 8C/8T?)

The thinking I had behind that was: in the 3000 series, they used their APUs for the lower core counts, and did not release any 4C/8T CPUs (that I am aware of).

Now that they have their 4000-series APUs with 8C/16T, it seems that the lower core counts will be supplied by cut-down/binned APUs instead of CPUs.

Although, now that I think of it, I suppose it still makes some sense to have a cut-down CPU SKU that gives them a dumping ground for 8C/16T CPU dies that have a defective core or two in them.

But being monolithic and not requiring an I/O die would seem to make APU-derived products cheaper to produce than the equivalent CPU products with equal core counts moving forward, wouldn't it?

Or am I completely off-base here?

tl;dr: Is the previous 6C/12T a dead configuration for CPUs? Will it instead be filled in by cut-down APUs? I've heard some rumblings, especially with the new consoles, that 8C is going to be a minimum CPU core-count requirement for console ports on PC.
 

Veradun

Senior member
Jul 29, 2016
564
780
136
Although, now that I think of it, I suppose it still makes some sense to have a cut-down CPU SKU that gives them a dumping ground for 8C/16T CPU dies that have a defective core or two in them.

Keep in mind they can use 6c CCDs (i.e. CCDs with two cores disabled for yields) in 12c SKUs.

Also: if CCXs have been expanded to a full 8c, as rumors go, they would also be able to make 10c and 14c SKUs, adding new product lines while dropping the old 6c line.

Before:
6c
8c
12c
16c

After:
8c
10c
12c
14c
16c

tl;dr: there is room to do what you speculate.
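The binning logic above can be sketched by enumerating the core counts reachable from one or two harvested 8-core CCDs (the minimum bin of 5 good cores is an illustrative assumption, not a known AMD policy):

```python
from itertools import combinations_with_replacement

ccd_bins = [5, 6, 7, 8]  # assumed usable cores per harvested 8c CCD

single = sorted(set(ccd_bins))
dual = sorted({a + b for a, b in combinations_with_replacement(ccd_bins, 2)})

print("1-CCD SKUs:", single)  # [5, 6, 7, 8]
print("2-CCD SKUs:", dual)    # [10, 11, 12, 13, 14, 15, 16]
```

The symmetric pairings (5+5, 6+6, 7+7, 8+8) give exactly the 10c/12c/14c/16c line-up listed above.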
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
would AMD choose to make the "floor" model of Ryzen 4000 CPU, an 8C/16T at a minimum,

Maybe. Bear in mind that a 6c/12t Vermeer might be faster in a lot of games/applications than whatever is going into the latest consoles. Relatively speaking.
 

Arzachel

Senior member
Apr 7, 2011
903
76
91
Also: if CCXs have been expanded to a full 8c, as rumors go, they would also be able to make 10c and 14c SKUs, adding new product lines while dropping the old 6c line.

The CCX is still 4 cores, but the L3 is now shared by the two CCXs on the die.

New AAA titles are still going to be built around this console gen for a good while. I would be surprised if AMD did away with 6c/12t budget options, especially with how popular the 3600 ended up being.
 

NTMBK

Lifer
Nov 14, 2011
10,208
4,940
136
Maybe. Bear in mind that a 6c/12t Vermeer might be faster in a lot of games/applications than whatever is going into the latest consoles. Relatively speaking.

Maybe, but remember that the consoles will a) have highly optimized codepaths and b) use dedicated hardware to offload tasks like asset decompression, audio, etc., which reduces CPU load.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,590
5,722
136
I hope with more money AMD can afford to split off gaming, HPC and cloud offerings:
Desktop/Gaming --> N7P HP. Performance focused.
HPC --> N7P HP. Efficiency + performance balance.
Cloud --> N7P HD. Efficiency focused; doesn't need so much of that super-wide vector stuff.
Mobile --> N7P HD. Efficiency focused.

One size fits all is not going to cut it, in my opinion. For managed cloud infrastructure, your load balancer can dispatch vector- and FLOPS-heavy jobs to HPC-like nodes.

In the future, with chiplets and the IF architecture, they could mix and match.
Besides the CDNA2 unified memory model, I keep reading about some of the new things they are working on (at least in their patents) to integrate with their CPUs via IF:
- FPGA
- PIM
- Memory-class storage integration
 
Last edited:
  • Like
Reactions: Gideon

tamz_msc

Diamond Member
Jan 5, 2017
3,708
3,554
136
What is this based on?
I don't think what he's saying is a proper interpretation of the diagram in the "leaked" AMD slide. A CCX by definition shares a pool of L3, and 8 cores sharing a pool means that the CCX also comprises 8 cores.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Maybe, but remember that the consoles will a) have highly optimized codepaths and b) use dedicated hardware to offload tasks like asset decompression, audio, etc., which reduces CPU load.

Naturally. Just remember that we're discussing what desktop parts AMD will sell to the low-end DiY crowd and to OEMs as their baseline for Vermeer. The consoles sort of set the bar for where game development will go on AMD CPUs, so it's somewhat logical to think that if the consoles have 8c/16t CPUs, that the low-end on PC will have to move to 8c/16t. Which I think was @VirtualLarry 's point. But if 6c Vermeer is fast enough to handle console ports to PC due to clockspeed and uarch advantages, then AMD can still sell to that segment instead of having to move everything to 8c.
 
Mar 11, 2004
23,030
5,495
146
Naturally. Just remember that we're discussing what desktop parts AMD will sell to the low-end DiY crowd and to OEMs as their baseline for Vermeer. The consoles sort of set the bar for where game development will go on AMD CPUs, so it's somewhat logical to think that if the consoles have 8c/16t CPUs, that the low-end on PC will have to move to 8c/16t. Which I think was @VirtualLarry 's point. But if 6c Vermeer is fast enough to handle console ports to PC due to clockspeed and uarch advantages, then AMD can still sell to that segment instead of having to move everything to 8c.

I'm not entirely following that logic. The previous/current gen had 8 core CPUs, yet we got dual core Zen chips. I'd like to be wrong, but I'd guess we'll still see even quad core chips.

Guess my point is that I don't think consoles tell us much of anything about what AMD will do outside that market. I mean, Xbox One had eDRAM which we didn't see in any other AMD CPU, and PS4 had unified GDDR5 system memory.

I think it'd be smart for AMD to do that, and I hope we see something like that, I'm just not that hopeful.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
I'm not entirely following that logic. The previous/current gen had 8 core CPUs, yet we got dual core Zen chips.

Previous gen was 8c Jaguar. On desktop, even the 4c/4t R3-1200 could give that CPU a run for its money. Not that you would want to play a lot of PC ports with an R3-1200, but it was doable, and probably at better overall performance than what you could get on an original PS4 or Xbox One. I would point out that among the desktop Ryzen CPUs, 4c/4t was as low as AMD ever went. AMD has ditched quads altogether, so the question is whether there will be an introductory-level R5-4600 6c/12t chip this year or not.
 

eek2121

Platinum Member
Aug 2, 2005
2,904
3,906
136
AMD has done some interesting power optimizations for their mobile (Zen 2, Ryzen 4000) platform. Examining performance per watt in these chips tells me that Zen 3 will be quite an interesting architecture.
 
  • Like
Reactions: lightmanek

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
It's been some time since I heard or did any research on the 4000 series. What are the most likely specs, release date and price? Is there even anything like that out yet?
 

inf64

Diamond Member
Mar 11, 2011
3,684
3,957
136
I think Zen 3 will be a brutal uarchitecture. I expect at least a 20% IPC jump (averaged) and a ~3-5% clock bump. I expect more cores, but nothing radical (unlike the Zen->Zen 2 jump). When will it come to desktop? If nothing changes, then late 2020 (hopefully). Fun times ahead!
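Compounding those guesses (they are speculation, not confirmed figures) gives the implied single-thread uplift:

```python
# IPC and clock gains multiply rather than add.
ipc_gain = 0.20
for clock_gain in (0.03, 0.05):
    total = (1 + ipc_gain) * (1 + clock_gain) - 1
    print(f"clock +{clock_gain:.0%}: total uplift ~{total:.1%}")
# A 20% IPC gain with a 3-5% clock bump implies roughly 23.6-26% overall.
```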
 
  • Love
Reactions: mopardude87

mopardude87

Diamond Member
Oct 22, 2018
3,348
1,575
96
I think Zen 3 will be a brutal uarchitecture. I expect at least a 20% IPC jump (averaged) and a ~3-5% clock bump. I expect more cores, but nothing radical (unlike the Zen->Zen 2 jump). When will it come to desktop? If nothing changes, then late 2020 (hopefully). Fun times ahead!

I hope I can hold out on my 7700k till then. :)