Official AMD Ryzen Benchmarks, Reviews, Prices, and Discussion


Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
What are you talking about? I don't think you follow the development of H.264 encoders as well as you think you do. x264 has already been utilizing AVX2, and its ability to use more threads has been improved several times over the last few years.

x264 is one of, if not the most, worked-on video encoders, and its staying power is pretty immense. x265 has a long way to go to improve on everything x264 has already BEEN doing. There's not a lot of advantage in even using x265 yet, and x265 is SLOW.

Well I'm far from an expert on this subject, but I do read the doom9 forums and that's where I got it from. That said, I NEVER said at any point that x264 didn't utilize AVX2 or couldn't use multithreading. I just said that its ability to utilize them was limited, especially compared to x265. But you don't have to take my word for it; just look at the benchmarks I've posted, which demonstrate it perfectly:

The largest performance increase stems from the AVX2 implementation and its efficiency. I mean, just look at the FX-9590. It gets obliterated under x265 because it lacks AVX2; under x264, though, it's still more or less competitive.

Ryzen, which has half the AVX2 throughput of the Intel CPUs, goes from winning in x264 to falling behind the 6900K and 6950X in x265. And a longer test would magnify these margins even more.

Fact is, x265 even at this stage has much heavier AVX2 integration than x264, and slightly better multithreading support. This will continue to improve as time goes by; relatively speaking, it's still in the early stages of optimization.
[attached benchmark chart]
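To make the AVX2 angle concrete, here is a minimal sketch (not taken from x264 or x265; the function name and 32x32 block shape are illustrative) of the kind of sum-of-absolute-differences primitive these encoders hand-write in assembly. Zen 1 executes each 256-bit operation as two 128-bit micro-ops, which is one plausible reason a heavier AVX2 code path narrows Ryzen's lead relative to Broadwell-E.

```c
/* Hedged sketch, not actual encoder code: a 32x32 SAD kernel using AVX2
   intrinsics, the sort of motion-estimation primitive x264/x265 implement
   in hand-written assembly. Compile with -mavx2. */
#include <immintrin.h>
#include <stdint.h>

uint32_t sad_32x32_avx2(const uint8_t *cur, intptr_t cur_stride,
                        const uint8_t *ref, intptr_t ref_stride)
{
    __m256i acc = _mm256_setzero_si256();
    for (int y = 0; y < 32; y++) {
        __m256i c = _mm256_loadu_si256((const __m256i *)(cur + y * cur_stride));
        __m256i r = _mm256_loadu_si256((const __m256i *)(ref + y * ref_stride));
        /* vpsadbw: absolute differences of 32 byte pairs, summed into
           four 64-bit partial sums per 256-bit register */
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(c, r));
    }
    /* Horizontal reduction of the four 64-bit partial sums. */
    __m128i lo = _mm256_castsi256_si128(acc);
    __m128i hi = _mm256_extracti128_si256(acc, 1);
    __m128i s  = _mm_add_epi64(lo, hi);
    s = _mm_add_epi64(s, _mm_unpackhi_epi64(s, s));
    return (uint32_t)_mm_cvtsi128_si64(s);
}
```

One 256-bit load covers a full 32-pixel row here; an SSE-only version would need twice the instructions for the same work, which is roughly where the per-core AVX2 throughput difference shows up.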
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
True, but as tamz_msc mentioned, the new version of AIDA shows different results. Also, the double-sized L2 compared to Intel gives AMD some advantage. However, the biggest drawback with Zen is its CCX-specific L3 cache, as only 8MB can be used per CCX. If one core needed the whole cache, it would be forced to go over Infinity Fabric to reach the rest of the L3, which is fast, but far slower than a normal cache request. I assume this is also part of the reason why we haven't seen 4+2 or 4+0 CCX combinations yet, e.g. the 1500X has four cores (which could fit in one CCX) but 16MB of L3. I'm still wondering why the 4-core variants with 8MB of L3 don't use only one CCX; maybe it has something to do with Infinity Fabric, yields... only AMD knows.

AIDA64 measures exactly half the L1 bandwidth of the 6900K on the Ryzen 8-core, which is expected. However, the L2 and L3 bandwidth, as measured properly by the newest version, is vastly superior on Ryzen.

Most of the latency issues seem to stem from the inter-CCX penalty more than anything else.

@ sushukka and tamz_msc, I think you've got it wrong for the L2 and L3 cache in Intel CPUs versus the Ryzen processors. From what I've read, the L2 cache in Intel CPUs basically just feeds the L1 instruction cache and helps mitigate cache misses, along with the branch predictors. The L3 cache in Intel CPUs doesn't even run at the speed of the cores; I think Intel uses it for inter-core communication and to buffer the L2 cache. So from this perspective, it makes sense for Intel to have the fastest and most accurate L1 cache they can design to feed their 256-bit operations, and significantly slower but larger L2 and L3 caches.

I don't know if it works the same way in Ryzen CPUs though.
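For anyone who wants to see the cache-level effects outside of AIDA64, here is a minimal pointer-chasing sketch (my own illustration, not AIDA64's actual method) that reports average load-to-use latency for a given working-set size. On a Ryzen 7, sizes up to ~512KB should stay within a core's L2, up to ~8MB within one CCX's L3 slice, and beyond that spill toward DRAM; measuring the inter-CCX penalty itself would additionally require pinning threads to cores on different CCXs, which is outside this sketch.

```c
/* Hedged sketch: walk a randomly shuffled pointer chain of a given size
   and time the average dependent-load latency. Build with e.g. gcc -O2
   on Linux. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static double chase(size_t bytes, size_t iters)
{
    size_t n = bytes / sizeof(void *);
    void **buf = malloc(n * sizeof(void *));
    size_t *idx = malloc(n * sizeof(size_t));
    for (size_t i = 0; i < n; i++) idx[i] = i;
    /* Fisher-Yates shuffle so the hardware prefetcher can't follow us. */
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = idx[i]; idx[i] = idx[j]; idx[j] = t;
    }
    for (size_t i = 0; i < n; i++)
        buf[idx[i]] = &buf[idx[(i + 1) % n]];

    void **p = &buf[idx[0]];
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < iters; i++)
        p = (void **)*p;                /* serialized dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    free(buf); free(idx);
    /* use p so the chase loop can't be optimized away */
    return p ? ns / iters : 0.0;
}

int main(void)
{
    for (size_t kb = 16; kb <= 64 * 1024; kb *= 2)
        printf("%6zu KB: %.2f ns/load\n", kb, chase(kb * 1024, 20 * 1000 * 1000));
    return 0;
}
```

The dependent loads keep out-of-order execution from hiding the latency, so the steps in the output line up roughly with the L1/L2/L3/DRAM boundaries.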
 

SunburstLP

Member
Jun 15, 2014
86
20
81
Fact is, x265 even at this stage has much heavier AVX2 integration than x264, and slightly better multithreading support. This will continue to improve as time goes by; relatively speaking, it's still in the early stages of optimization.

I think your last sentence supports the argument opposing the first two-thirds of your paragraph as well. None of us going into launch should have been much surprised by Ryzen's AVX2 performance, if you had been following the info. Once details came out about how AVX2 would run on it, we could guess how it would perform relative to the rest of the chip. As you said, it's early in the optimization game. Things are subject to change on both sides of the fence.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Was that the latest Prime95?
The review I took the screenshot from did not mention that, though you can see that it has relatively low package power there as well.
Running an AVX workload with 1.35v vcore and Level 3 LLC (Prime95) showed a package power of 125W. Upping LLC to level 1 (which raised vcore to ~1.4v during heavy load) running an AVX workload showed a package power of 145W.
Did you try Linpack? Granted, 145W is indeed a "furnace" in my book, because I use that amount of heat the way I used to use a literal furnace: to warm myself in winter.
 

Blake_86

Junior Member
Mar 13, 2017
21
3
36
I went to college in 1992 and bought a computer with my parents' monetary assistance: $500 from me and $500 from each of them. I was the only one in my dorm with an AMD system, but I could actually game on that thing because AMD represented value - a 386DX/40 and a 1MB video card, while my roommate had a 386SX/16 and a POS video card for the same amount.

I suspect the R5 will be a tremendous value for those who don't want to spend or just don't have piles of cash.

It's nice to have competition back.
I think they will also be great value for those who game on PC, don't need that many cores right now, and want to wait for a more refined 8-core CPU, aiming at Zen 2 or 3.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,770
3,590
136
@ sushukka and tamz_msc, I think you've got it wrong for the L2 and L3 cache in Intel CPUs versus the Ryzen processors. From what I've read, the L2 cache in Intel CPUs basically just feeds the L1 instruction cache and helps mitigate cache misses, along with the branch predictors. The L3 cache in Intel CPUs doesn't even run at the speed of the cores; I think Intel uses it for inter-core communication and to buffer the L2 cache. So from this perspective, it makes sense for Intel to have the fastest and most accurate L1 cache they can design to feed their 256-bit operations, and significantly slower but larger L2 and L3 caches.

I don't know if it works the same way in Ryzen CPUs though.
L1 on Ryzen is half the width of L1 on Haswell, as you pointed out, hence half the bandwidth as measured by AIDA64 is what you'd expect from it. However, Ryzen has twice the L2 per core (512KB vs 256KB), and both L2 and L3 are denser than what Intel can achieve on Skylake - all these details are in the ISSCC presentation. L3 on Ryzen runs on its own clock domain as well. Thus the faster L2 and L3 on Ryzen, in terms of raw bandwidth, are in line with the specs.

There are differences in the implementation of the caches, and you cannot call one implementation better than the other at this moment, given that the details on Ryzen's implementation are sparse.

The only limitation AMD has with regard to the size of the caches has to do with the CCX design.
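As a companion to the latency sketch earlier in the thread, here is a minimal single-threaded read-bandwidth sketch (my own illustration, not how AIDA64 actually measures) that sizes the working set to land roughly in L1, L2, L3, or DRAM. Compile with -O3 so the compiler vectorizes the sum; a scalar build will bottleneck on instruction throughput rather than the cache.

```c
/* Hedged sketch: stream-read a buffer of a given size repeatedly and
   report GB/s. Sizes near 32KB, 512KB, and 8MB roughly correspond to
   L1, L2, and one CCX's L3 on a Ryzen 7. Single-core only. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>

static double read_bw(size_t bytes, int passes)
{
    uint64_t *buf = malloc(bytes);
    size_t n = bytes / sizeof(uint64_t);
    for (size_t i = 0; i < n; i++) buf[i] = i;

    volatile uint64_t sink = 0;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int p = 0; p < passes; p++) {
        uint64_t sum = 0;
        for (size_t i = 0; i < n; i++) sum += buf[i];
        sink += sum;                    /* keep the reads from being elided */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    free(buf);
    return (double)bytes * passes / sec / 1e9;
}

int main(void)
{
    const size_t sizes_kb[] = { 32, 512, 8 * 1024, 64 * 1024 };
    for (int i = 0; i < 4; i++) {
        size_t bytes = sizes_kb[i] * 1024;
        /* aim for ~512MB of traffic per size so the timing is stable */
        int passes = (int)(512ull * 1024 * 1024 / bytes) + 1;
        printf("%6zu KB: %.1f GB/s\n", sizes_kb[i], read_bw(bytes, passes));
    }
    return 0;
}
```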
 
  • Like
Reactions: sushukka

Schmide

Diamond Member
Mar 7, 2002
5,586
718
126
Fact is, x265 even at this stage has much heavier AVX2 integration than x264, and slightly better multithreading support. This will continue to improve as time goes by; relatively speaking, it's still in the early stages of optimization.

https://bitbucket.org/multicoreware...389bda8a63f5f7d/source/common/x86/?at=default

The ASM source here is directly pulled from the x264 project with two
changes:

1 - FENC_STRIDE must be increased to 64 in x86util.asm because of HEVC's
larger CU sizes
2 - Because of #1, we must rebrand the functions with x265_ prefixes in
x86inc.asm (private_prefix) and pixel-a.asm (mangle(x265_pixel_ssd))
3 - We have modified the MMX SSD primitives to use EMMS before returning
4 - We have added some new SATD block sizes for SSE3

Current assembly is based on x264 revision:
configure: Support cygwin64
Diogo Franco (Kovensky) <diogomfranco@gmail.com>
2013-07-23 22:17:44 -0300

It's almost all MMX to SSE3. The biggest change from x264 to x265 is the increase of the block size to 64.

The big caches on Intel most likely lend themselves to this. If any AVX is used, it is used by the compiler in the C++ code.
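To illustrate that last point, here is a minimal sketch (not actual x265 code; the function name and 64x64 shape are illustrative of the larger CU sizes) of a plain-C sum-of-squared-differences over a CU-sized block. Built with something like gcc -O3 -mavx2, the compiler will typically auto-vectorize the inner loop with 256-bit instructions, which is the only way AVX shows up on the C/C++ paths if the hand-written assembly stops at SSE3-era primitives.

```c
/* Hedged sketch: plain-C SSD over a 64x64 block. With -O3 -mavx2 the
   compiler can auto-vectorize this; without that, it runs scalar. */
#include <stdint.h>
#include <stddef.h>

uint64_t ssd_64x64_c(const uint8_t *a, ptrdiff_t stride_a,
                     const uint8_t *b, ptrdiff_t stride_b)
{
    uint64_t ssd = 0;
    for (int y = 0; y < 64; y++) {
        for (int x = 0; x < 64; x++) {
            int d = a[x] - b[x];        /* pixel difference */
            ssd += (uint64_t)(d * d);
        }
        a += stride_a;
        b += stride_b;
    }
    return ssd;
}
```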
 

unseenmorbidity

Golden Member
Nov 27, 2016
1,395
967
96
Strange results pairing Ryzen and Nvidia GPUs.

Two 480s come within a few percent of the 7700K in DX12 RotTR.

Nvidia's drivers might not be working properly with Ryzen.

 
Last edited:

unseenmorbidity

Golden Member
Nov 27, 2016
1,395
967
96
More like strange results pairing nVidia cards and Ryzen, actually; the 480s' results are kind of fine.
True. I would like to say I was shocked that no one else tested this sooner, but given the state of tech reviewers, I am not surprised.

Almost makes you wonder if it was intentional on Nvidia's part. Maybe not direct sabotage, but I could easily see them being indifferent as far as patching is concerned. An AMD that isn't hemorrhaging money on its CPU division is bad news for Nvidia.
 
Last edited:

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
True. I would like to say I was shocked that no one else tested this sooner, but given the state of tech reviewers, I am not surprised.
I mean, tested what?

Almost makes you wonder if it was intentional on nvidia's part.
Intentionally ruining their own Dx12 driver? It's not like it's the first time it has regressed relative to Dx11; it used to do it on Skylake before this, though.
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
True. I would like to say I was shocked that no one else tested this sooner, but given the state of tech reviewers, I am not surprised.

Almost makes you wonder if it was intentional on Nvidia's part. Maybe not direct sabotage, but I could easily see them being indifferent as far as patching is concerned. An AMD that isn't hemorrhaging money on its CPU division is bad news for Nvidia.
The reason for this is different. The consumer Pascal uArchitecture is the same as Maxwell once you strip the high-level schemes out of both. The only truly new Pascal-architecture GPU is GP100, found in the Tesla P100, for example.

You have to distinguish between the name of the GPU and the name of the architecture. GP100's uArchitecture jump is similar to, or bigger than, Kepler->Maxwell was. The Pascal GP10x GPUs are not different at the uArchitecture level from the Maxwell GPUs. They are just... heavily OC'ed, with new, faster memory.

Funniest part: the P40 is 30% faster in compute than the Titan X and the Quadro GPUs based on the GP102 chip, despite having a lower core clock than the Titan X and a similar core count.

P.S. AdoredTV tested the difference between the GTX 1080 and GTX 980 Ti around a year ago. It's worth watching his video on this.
 
  • Like
Reactions: Drazick

Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
I mean, tested what?


Intentionally ruining their own Dx12 driver? It's not like it is the first time it regresses over Dx11, used to do it on Skylakes before that though.

Even in Adored's testing it was regressing on the 7700 as well. In certain areas there were some neutral-to-positive scores in DX12, but a couple showed sizable drops even on Intel, just not as severe as the ones on Ryzen, whereas the 480 CF setup increased performance across the board in DX12. You still don't completely know whether the closeness in the DX12 results is a GPU bottleneck, since the 1080 is so much faster. But it basically invalidates all DX12 benches, and really all game benches, for a CPU review. Remember, CPU bottlenecking is about "future performance". Well, DX12 (and Vulkan) is the future, and DX11 is basically the past as we move through 2017. So what information is really going to be gleaned from DX11 testing?
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,328
4,913
136
So apparently there is a technical issue where AotS performance tanks on 6 and 8 core CPUs (both Intel and AMD) as it displays added detail for higher core counts (article updated to reflect these findings):

http://www.pcgameshardware.de/Ryzen-7-1 ... 24503/#idx

Short summary of important points for those of you who don't read German:
-Also affected i7-6900K framerates, which made them suspect something was off
-The results they saw were weird, so they retested, including SMT on/off. No difference.
-Tested AVX2 by manually downclocking AVX2 on the Intel HEDT chip and on the Kaby Lake 7700K. No difference.
-Finally discovered that fewer than six physical cores (disclaimer: they did not test five physical cores) results in the game removing entire elements, such as enemy defensive-fire particle effects (a sketch of physical- vs. logical-core detection follows below)
-Tested on a simulated 4-core Ryzen "1500X" and saw the same performance uplift; performance versus the 7700K was within margin of error
-Conclusion: In-game performance (not the canned benchmark) is only comparable quad to quad and octa to octa
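Since the article's finding hinges on physical cores rather than SMT threads, here is a minimal Windows sketch (my own illustration, not anything from AotS or PCGH) of how an engine could count physical cores as opposed to logical processors using GetLogicalProcessorInformation. A game gating detail on the physical-core count would behave exactly as described: toggling SMT changes nothing, while dropping below six cores changes the scene.

```c
/* Hedged sketch: count physical cores vs. logical processors on Windows. */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    DWORD len = 0;
    /* First call fails with ERROR_INSUFFICIENT_BUFFER and reports the needed size. */
    GetLogicalProcessorInformation(NULL, &len);
    SYSTEM_LOGICAL_PROCESSOR_INFORMATION *info = malloc(len);
    if (!info || !GetLogicalProcessorInformation(info, &len)) {
        free(info);
        return 1;
    }

    int physical = 0;
    DWORD count = len / sizeof(*info);
    for (DWORD i = 0; i < count; i++)
        if (info[i].Relationship == RelationProcessorCore)
            physical++;                  /* one entry per physical core */

    SYSTEM_INFO si;
    GetSystemInfo(&si);
    printf("physical cores: %d, logical processors: %lu\n",
           physical, si.dwNumberOfProcessors);
    free(info);
    return 0;
}
```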
 
Last edited:

KompuKare

Golden Member
Jul 28, 2009
1,013
924
136
-Conclusion: Benchmarks are only comparable quad to quad and octa to octa.
Slight correction, if I have understood this correctly: the built-in benchmark works fine (it doesn't change the detail depending on the detected core count), but because PCGH built their benchmark from a savegame, it is not possible to compare CPUs with fewer than six cores against those with six or more, as the game will change the details, making the results not like-for-like.
In the regular game, but not in the integrated benchmark, AotS will - on its own and without the player being able to change this - reduce the picture quality [complexity] on CPUs with fewer than six physical cores (SMT does not play a role in this - a hexacore which has had SMT disabled in the UEFI has to calculate the full complexity; it should be pointed out that we were not able to try this with a CPU with five physical cores).

EDIT: Ah, I just read all of IEC's other comments in the other threads, and of course I had to read and reply to the one post which didn't make the distinction between the built-in bench and the normal in-game behaviour.
 
Last edited:
  • Like
Reactions: IEC

Gikaseixas

Platinum Member
Jul 1, 2004
2,836
218
106
Ryzen + AMD GPU = Epic
But reviewers won't give it much of a chance until Vega arrives - and only if Vega performs reasonably well.
 
  • Like
Reactions: inf64

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,328
4,913
136
Slight correction, if I have understood this correctly: the built-in benchmark works fine (it doesn't change the detail depending on the detected core count), but because PCGH built their benchmark from a savegame, it is not possible to compare CPUs with fewer than six cores against those with six or more, as the game will change the details, making the results not like-for-like.

EDIT:
Ah, I just read all of IEC's other comments in the other threads, and of course I had to read and reply to the one post which didn't make the distinction between the built-in bench and the normal in-game behaviour.

Good catch, I edited my comment to clarify that this applies in-game, not to the canned/built-in benchmark.
 
  • Like
Reactions: inf64

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Even in Adored's testing it was regressing on the 7700 as well
Minor regressions are characteristic of nV Dx12 performance, and they are nowhere near as dramatic as they were a few months ago, when the losses in performance were comparable to what Ryzen + nV posts presently.
Where the 480CF increased performance across the board on everything in DX12.
Of course it did; the dim Scot forgot to enable Crossfire in the Dx11 drivers, and as a result he compared a single 480 in Dx11 against two 480s in Dx12. So his comparisons for AMD bear no relevance. At all.

Well DX12 is the future (and Vulkan) and DX11is basically the past through 2017, indicative.
Dx12 is the future? To me it looks like the past, if anything. So no, it doesn't invalidate anything, except for making it clear that Dx12 benches are presently irrelevant.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,770
3,590
136
Of course it did; the dim Scot forgot to enable Crossfire in the Dx11 drivers, and as a result he compared a single 480 in Dx11 against two 480s in Dx12. So his comparisons for AMD bear no relevance. At all.
Proof?

For what it's worth, NV might have just added game-specific optimizations for Hitman DX12 and spun a story about how its DX12 drivers are now better.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Easy, https://www.hardocp.com/image/MTQ2ODIxNjQyNm9BYnNZQkhEdUdfNl8yX2wuZ2lm

Here is how two 480s stack up against a single 1070 in DirectX 11. Yes, 1440p, I know, but since his runs were clearly not CPU-limited, it suffices.

For what its worth NV might have just added game-specific optimizations for Hitman DX12 and spun a story how its DX12 drivers are now better.
For what it's worth nV used to have similar performance losses with DirectX 12 when running Intel CPUs few months ago.