
Question Ryzen 5900X vs 5950X


Justinus

Platinum Member
Oct 10, 2005
2,368
511
136
Carfax83 said:
"It likely comes down mostly to configuration, in combination with Zen's low per-core memory bandwidth and its inter-CCX bandwidth and latency penalties. Handbrake is for casual encoders and uses a bunch of presets to achieve a desired outcome based on quality and speed. Using a lot of cores can increase speed but reduce quality. Encoding/transcoding is complicated, so there is a plethora of settings that can affect scalability and quality.

"This is actually why Intel developed their SVT line of codecs: they allow massive parallelization without affecting quality, so the SVT codecs are eventually going to replace x265 as they continue to get better.

"But if you look at the raw x265 codec, it can certainly scale to a large number of cores, as shown by that in-depth review I linked to. If it couldn't scale well beyond 16 cores, it would be an absolute failure. x265 was used to achieve real-time 4K60 fps 10-bit HDR encoding several years ago by MulticoreWare on a single rack dual-socket Xeon server. If it couldn't scale well beyond 16 cores, it would never have been able to do that."

Source
But you linked a benchmark that uses the x265 encoder as evidence that the 5950X is bottlenecked. The Handbrake configuration tested in our thread also uses x265.

Both that benchmark and our thread show x265 doesn't scale well with increasing core counts.

Just because encoders exist that do scale well doesn't invalidate my point: you're using a benchmark known not to scale well as evidence for your claim that the 5950X is not worth it because it's bottlenecked.
 

Carfax83

Diamond Member
Nov 1, 2010
6,051
850
126
Justinus said:
"But you linked a benchmark that uses the x265 encoder as evidence that the 5950X is bottlenecked. The Handbrake configuration tested in our thread also uses x265.

"Both that benchmark and our thread show x265 doesn't scale well with increasing core counts."

Have you ever encoded anything? I already explained this in my last post. Handbrake uses presets, and those presets can affect the performance and quality of the output file. Handbrake could be limiting the scaling/performance to increase quality. Using too many threads in encoding can reduce quality. That's why hardware-based encoding typically has lower quality and larger file size.

The Handbrake thread on these forums uses the Matroska H.265 MKV 2160p60 preset, which according to Handbrake itself is very slow. That, in combination with the small source file, probably results in problems saturating big multicore CPUs.

Encoding is very complicated. Lots of factors are involved when it comes to performance, quality and file size, so it's not as clear cut as you think.

x265 in and of itself scales very well: linearly up to 32 cores, and sublinearly past that. It's been used commercially to provide real-time broadcasts in 4K 60 FPS HDR on Xeon servers, as have other HEVC codecs, but quality-wise, offline encoding is probably affected by using too many threads.

There are much better codecs now that are being used that can scale even better than x265 without any loss in quality.

Justinus said:
"Just because encoders exist that do scale well doesn't invalidate my point: you're using a benchmark known not to scale well as evidence for your claim that the 5950X is not worth it because it's bottlenecked."

This is my last post on this issue. If you want to believe that x265 scales poorly (even from 8 to 12 to 16 cores) using Handbrake (which has presets and is geared towards casuals), that's fine. Of course, the real-world evidence refutes your assertion.

If x265 had such awful scaling as you claim, it would never have been used commercially in real-time broadcasting. And the in-depth assessment I posted previously shows that it can scale up to 128 cores.
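
As a rough illustration of how scaling can look close to linear at low core counts and then flatten, here is a minimal Python sketch of Amdahl's law. It is only a model: the parallel fraction of 0.99 is a made-up value chosen for illustration, not a measured property of x265 or Handbrake, and the function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Ideal speedup over a single core under Amdahl's law."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fraction, chosen only for illustration.
PARALLEL_FRACTION = 0.99

for n in (8, 12, 16, 32, 64, 128):
    speedup = amdahl_speedup(n, PARALLEL_FRACTION)
    # Efficiency is speedup divided by core count (1.0 would be perfectly linear).
    print(f"{n:>3} cores: {speedup:6.2f}x speedup, {speedup / n:5.1%} efficiency")
```

The model says nothing about where the serial part comes from. A preset, a small source file, or a memory limit could each show up as a lower effective parallel fraction.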
 

Justinus

Platinum Member
Oct 10, 2005
2,368
511
136
The very benchmark you cited from TechPowerUp shows the same failure to scale well beyond 8 cores. If you're too blind to read your own source, I can't help you.

Go find a benchmark where they've used an encoder and configuration proven to scale linearly that shows the 5950x can't scale and maybe your point will be made valid. Until then, you've only cited unrelated sources and a 5950x benchmark that defies the very point you are trying to make.
 

Carfax83

Diamond Member
Nov 1, 2010
6,051
850
126
OK, let's look at something other than encoders, because they are too problematic to demonstrate what I am saying. Renderers are much more parallel and reliant on bandwidth. I got this benchmark from Openbenchmarking.org. If you look at the scaling in this benchmark from the 5600X to the 5800X, it's 22.4% for a 33% increase in cores. If you look at the scaling from a 5800X to a 5900X, you get 32% in performance for a 50% increase in cores. And finally, if you look at the scaling from the 5900X to the 5950X, you get 19% for a 33% increase in cores.

Another thing: look at how close the 3950X is to the 5950X (the margin is 3.4%), while the 5900X is 12% faster than the 3900X. It's because the 5950X is hitting a wall due to lack of memory bandwidth.

And you can't say it's because the application has problems scaling beyond 8 cores or whatever, as Blender will suck in as many cores as you can give it. In fact, if you compare the 32-core Threadripper with quad-channel DDR4 against the dual-channel 3950X, you get a 47% increase in performance for a 100% increase in core count. Then look at the 5950X with dual-channel memory and compare it to the 5800X with dual-channel memory, and you get a 45% increase for 100% more cores.

And this is considering that the Threadripper and 3950X use the older 4-core CCXs, which bring additional inter-CCX bandwidth and latency penalties, rather than the newer 8-core CCXs of the Zen 3 parts.
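
To put those percentages on one scale, here is a minimal Python sketch that divides each quoted performance gain by the matching increase in core count. The performance figures are the ones quoted in this post, the core counts are the parts' published core counts, and the name `scaling_efficiency` is just an illustrative helper. A ratio of 1.0 would be perfectly linear scaling.

```python
def scaling_efficiency(cores_before: int, cores_after: int, perf_gain_pct: float) -> float:
    """Performance gain divided by core-count gain (1.0 means linear scaling)."""
    if cores_after <= cores_before:
        raise ValueError("cores_after must be greater than cores_before")
    core_gain_pct = (cores_after / cores_before - 1.0) * 100.0
    return perf_gain_pct / core_gain_pct


# (label, cores before, cores after, performance gain in % as quoted in the post)
steps = [
    ("5600X -> 5800X", 6, 8, 22.4),
    ("5800X -> 5900X", 8, 12, 32.0),
    ("5900X -> 5950X", 12, 16, 19.0),
    ("3950X -> 32-core Threadripper", 16, 32, 47.0),
    ("5800X -> 5950X", 8, 16, 45.0),
]

for label, before, after, gain in steps:
    eff = scaling_efficiency(before, after, gain)
    print(f"{label:<32} {gain:5.1f}% gain, efficiency {eff:.2f}")
```

The ratio only shows how far each step falls short of linear. It does not say whether memory bandwidth, power limits, or clock speed is the cause.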

 

Justinus

Platinum Member
Oct 10, 2005
2,368
511
136
This is a good example that demonstrates sublinear scaling, which could be attributed to memory bandwidth limitations. This isn't encoding, though, and it's still not evidence that encoding is bandwidth-bottlenecked on a 5950X. I agreed with you initially that the 5950X will be memory-bandwidth-bottlenecked in specific workloads, and that you should know your workload before choosing a platform and CPU.

Still, it doesn't really mean "the 5950x is not worth it because it's severely bottlenecked".

Also there's no evidence the CCX latency penalty affected workloads such as these.
 

Carfax83

Diamond Member
Nov 1, 2010
6,051
850
126
Justinus said:
"This is a good example that demonstrates sublinear scaling, which could be attributed to memory bandwidth limitations. This isn't encoding, though, and it's still not evidence that encoding is bandwidth-bottlenecked on a 5950X. I agreed with you initially that the 5950X will be memory-bandwidth-bottlenecked in specific workloads, and that you should know your workload before choosing a platform and CPU."

Check post #18. I never specified encoding only. I said "bandwidth intensive workloads like compression, encoding, rendering etcetera...." In hindsight I should have used rendering from the beginning, as it is much easier to demonstrate. There are too many issues with using encoding, as it depends on the source file, settings and all kinds of other factors in addition to the codec.

Justinus said:
"Still, it doesn't really mean 'the 5950x is not worth it because it's severely bottlenecked'."

I hope you're not trying to attribute what you wrote in those quotation marks to me, because I never said that the 5950X wasn't worth it. I initially said that I would recommend staying away from it (I think the 5900X is better balanced) as it seemed to be severely bandwidth limited, and then I specified the types of workloads.

Most people don't run those types of workloads, but lots of power users do. If you don't use your PC for any of those workloads, then you probably shouldn't care. Still, I see it as a flaw in the Zen architecture, one that should be resolved with Zen 4, which will support DDR5 and a much better I/O die.

Justinus said:
"Also there's no evidence the CCX latency penalty affected workloads such as these."

Yeah, I'll give you this point. The CCX latency penalty affects latency-sensitive workloads like gaming more than anything. Bandwidth-sensitive apps are minimally affected, if at all.

Too bad I couldn't get any Epyc 2 scores for this particular benchmark, as Epyc has an octa-channel memory controller.
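
For a sense of what channel count means per core, here is a minimal Python sketch of theoretical peak DRAM bandwidth divided across cores. It assumes DDR4-3200 on every platform (the thread does not state memory speeds) and a 64-bit (8-byte) channel. The 64-core count on the octa-channel line is an assumed example, and the function names are hypothetical. These are theoretical peaks, and sustained bandwidth in a real workload will be lower.

```python
def peak_bandwidth_gbs(channels: int, transfer_rate_mts: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    channels * bytes per transfer * mega-transfers per second gives MB/s;
    dividing by 1000 converts to GB/s.
    """
    return channels * bus_bytes * transfer_rate_mts / 1000.0


def per_core_gbs(channels: int, transfer_rate_mts: int, cores: int) -> float:
    """Peak bandwidth split evenly across all cores."""
    return peak_bandwidth_gbs(channels, transfer_rate_mts) / cores


# Assumption: DDR4-3200 for every configuration below.
MTS = 3200

configs = [
    ("dual channel, 8 cores", 2, 8),
    ("dual channel, 12 cores", 2, 12),
    ("dual channel, 16 cores", 2, 16),
    ("quad channel, 32 cores", 4, 32),
    ("octa channel, 64 cores (assumed core count)", 8, 64),
]

for label, channels, cores in configs:
    total = peak_bandwidth_gbs(channels, MTS)
    share = per_core_gbs(channels, MTS, cores)
    print(f"{label:<45} {total:6.1f} GB/s total, {share:4.2f} GB/s per core")
```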
 

lightmanek

Senior member
Feb 19, 2017
275
502
136
This benchmark shows power limits before it can show memory bandwidth limits.

I will test this at a locked frequency when I find 5 minutes.
 

moinmoin

Platinum Member
Jun 1, 2017
2,306
2,792
106
Carfax83 said:
"It's because the 5950x is hitting a wall due to lack of memory bandwidth."

You still need to prove first that memory bandwidth is actually the bottleneck here and not PPT, TDC, EDC or something else altogether.

Look at the following unassuming graph from AT's review:


Why would the 5950X suddenly drop below the 3950X at more than 14 cores if it's clearly above it when using fewer cores? This frequency regression affects all kinds of results that load more than 14 cores, so it also distorts any kind of scalability test beyond that point unless you use a fixed frequency to begin with. Only then can one say what the actual bottleneck is.
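
One rough way to take clocks out of a comparison when a fixed-frequency run is not available is to divide each score by the sustained all-core clock. The Python sketch below assumes throughput scales linearly with clock, which will not hold if the workload really is memory-bound. The 3.4% score gap is the margin quoted earlier in the thread, the clock values are placeholders rather than measurements, and the function names are hypothetical.

```python
def per_ghz_throughput(score: float, sustained_ghz: float) -> float:
    """Benchmark score per GHz of sustained all-core clock."""
    if sustained_ghz <= 0:
        raise ValueError("sustained_ghz must be positive")
    return score / sustained_ghz


def clock_normalized_gain(score_a: float, ghz_a: float,
                          score_b: float, ghz_b: float) -> float:
    """Fractional gain of B over A after dividing out sustained clock."""
    return per_ghz_throughput(score_b, ghz_b) / per_ghz_throughput(score_a, ghz_a) - 1.0


# Scores encode only the 3.4% margin quoted in the thread (A = 3950X, B = 5950X).
SCORE_A, SCORE_B = 100.0, 103.4
# Placeholder sustained clocks in GHz, NOT measurements.
GHZ_A, GHZ_B = 3.9, 3.6

raw_gain = SCORE_B / SCORE_A - 1.0
normalized_gain = clock_normalized_gain(SCORE_A, GHZ_A, SCORE_B, GHZ_B)
print(f"raw gain: {raw_gain:.1%}, clock-normalized gain: {normalized_gain:.1%}")
```

If the normalized gain comes out much larger than the raw gain, the all-core clock deficit explains part of the gap. Running both chips at a locked frequency remains the cleaner test.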
 

Carfax83

Diamond Member
Nov 1, 2010
6,051
850
126
But this is rendering, an embarrassingly parallel workload. Minor clock speed differences between the 5950x and the 3950x shouldn't nullify the 5950x's IPC advantage to that degree if you ask me. But I could be wrong I suppose.
 
