Official AMD Ryzen Benchmarks, Reviews, Prices, and Discussion

Page 161

Shivansps

Diamond Member
Sep 11, 2013
3,835
1,514
136
Hasn't considering the hardware that will execute code sort of always been pretty important to code-monkeys? I'm obviously not a programmer, so there is most likely a disconnect between what I think and reality.

It seems to me that if you're going to invest yourself heavily in a project for months/years, you would try to ensure that your efforts are as performant as possible to as wide of a market as is reasonable. What I'm getting at is that I think it's pretty much a core responsibility to make accommodations to your market within reason.

Cost. If you are going to do that you need to account for ALL CPUs and try to build the software on top of them. Then tomorrow a CPU with two more cores shows up and your code is outdated. The way it's done today is the correct way: let the operating system decide where to place threads. The scheduler also does load balancing, which you can't do yourself... this is the correct and most efficient way. They just need to get rid of the main-thread bottleneck and we are home free.
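Roughly, the contrast is this (a hypothetical Win32 sketch, not from any real engine - the commented-out line is the kind of babysitting AMD seems to be asking for):

Code:
/* Sketch only: "just start threads and let the OS place them" vs. manual
   pinning.  Hypothetical example, not taken from any real engine. */
#include <windows.h>
#include <stdio.h>

static DWORD WINAPI worker(LPVOID arg)
{
    /* ... do some work ... */
    printf("worker %u running on core %u\n",
           (unsigned)(UINT_PTR)arg, (unsigned)GetCurrentProcessorNumber());
    return 0;
}

int main(void)
{
    HANDLE h[4];

    for (int i = 0; i < 4; i++) {
        /* The normal way: just create the thread, the scheduler places it. */
        h[i] = CreateThread(NULL, 0, worker, (LPVOID)(UINT_PTR)i, 0, NULL);

        /* The "babysitting" way: pin each thread to a core (or CCX)
           yourself.  This has to be redone for every new core count and
           topology that ships, which is exactly the problem. */
        /* SetThreadAffinityMask(h[i], 1ull << i); */
    }

    WaitForMultipleObjects(4, h, TRUE, INFINITE);
    return 0;
}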
 

Shivansps

Diamond Member
Sep 11, 2013
3,835
1,514
136
I thought it was pretty clear that AMD wants the scheduler updated to consider the effect of thrashing cache across the CCXs.

Obviously it wasn't clear enough.

Maybe AMD need to break out the crayons next time to explain it...

Sure, then why is it affecting production software and games in different ways?

Why is Lisa Su in damage control, blaming game performance on developers? And saying future games will be optimised for it?
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
AMD wants developers to go back to an old way of programming; devs do not babysit threads anymore, that's the scheduler's job. They want devs to account for every current and future CPU, write proper MT code, and manually set affinity to best utilize each of them? Are they crazy or what? Devs just don't do that anymore; they just start threads and let the scheduler figure out what to do with them, which is also cross-platform friendly.
Once the main bottleneck today, the main thread, is removed, we will all gain from it.

What are you talking about? NUMA awareness is critical to many applications - and isn't quite what is needed for Ryzen, but it would work just fine. This was a giant issue for gaming on the Core 2 Quad.

We're only talking about a 10% performance deficit from expectation, with a few outliers as high as 25%. That is nothing you would ever notice without an FPS overlay.
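For reference, NUMA awareness on Windows boils down to calls like these (a rough sketch of the standard Win32 APIs, nothing Ryzen-specific):

Code:
/* Rough sketch: query the NUMA topology so work and memory can be kept
   on one node.  Standard Win32 calls; nothing here is Ryzen-specific. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    ULONG highest = 0;
    if (!GetNumaHighestNodeNumber(&highest)) {
        printf("GetNumaHighestNodeNumber failed\n");
        return 1;
    }

    for (UCHAR node = 0; node <= (UCHAR)highest; node++) {
        ULONGLONG mask = 0;
        if (GetNumaNodeProcessorMask(node, &mask))
            printf("node %u: processor mask 0x%llx\n", (unsigned)node, mask);
        /* A NUMA-aware app would then allocate memory with
           VirtualAllocExNuma() and keep its threads on the same node. */
    }
    return 0;
}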
 
  • Like
Reactions: Drazick and CatMerc

SunburstLP

Member
Jun 15, 2014
86
20
81
Someone needs to run a heavy, very low prio process that loads up every core and see if gaming performance improves - that should trick the Windows kernel into not load leveling, as it won't find a lesser-used core.

Wasn't this hack used semi-recently for something? I.e., running a background task affected performance positively. I want to say it was to make streaming smoother, but I can't remember...
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Someone needs to run a heavy, very low prio process that loads up every core and see if gaming performance improves - that should trick the Windows kernel into not load leveling, as it won't find a lesser-used core.
Uhm, wouldn't the penalty from having such a process take its toll on the threads used [by the game] easily outweigh any possible benefit from avoiding shuffling? Sounds hard to control for, imho.

Besides, even if a scheduler update solves the shuffling issue, there is still a "small" problem with DRAM effectively being the LLC and DRAM latency being horrid. I guess that remains to be solved for Zen ver. 2, 3, 4? After all, from what I pick up in reviews, Ryzen only really performs like a proper beast in software that is embarrassingly parallel.
 

SunburstLP

Member
Jun 15, 2014
86
20
81
If that's the problem then why is a 6800K faster? And this was on AMD's own slides.
And we're back to inter-CCX communication and cache thrashing being negatively affected (potentially, theoretically) by the scheduler.

ETA: I'm sorry if that reads poorly. We should keep in mind that we're all having a speculative conversation based largely on unsubstantiated information and theorycrafting mixed with a healthy dose of varying levels of expertise and backgrounds.
 
Last edited:

scannall

Golden Member
Jan 1, 2012
1,944
1,638
136
Well, I guess I learned my lesson to not act the goat against a slew of others.

But since I have not been keeping up with the thread the last few days: did anyone actually test what happens when you try to work around the supposed scheduling issue by forcing affinities and comparing the results?

Also, I am surprised to learn that Ryzens are actually binned. That is actually a scary discovery.
I find these results interesting.
Ryzen on Linux
Ryzen on Windows 10

Same clock, same motherboard. Very close in ST. But the Windows 10 MT score is 17% slower. I'd be very curious to know what the Windows 7 MT score is.
 

scannall

Golden Member
Jan 1, 2012
1,944
1,638
136
On closer examination: they make 0 sense. Literally 0. Win10 runs faster memory, but the benchmark results swing heavily between favoring one side or the other.
Linux doesn't seem to have the scheduler issue. Nor does Windows 7. Seems there is free performance still on the table waiting on Microsoft to fix it.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Uhm, wouldn't the penalty from having such a process take its toll on the threads used [by the game] easily outweigh any possible benefit from avoiding shuffling? Sounds hard to control for, imho.

Besides, even if a scheduler update solves the shuffling issue, there is still a "small" problem with DRAM effectively being the LLC and DRAM latency being horrid. I guess that remains to be solved for Zen ver. 2, 3, 4? After all, from what I pick up in reviews, Ryzen only really performs like a proper beast in software that is embarrassingly parallel.

It shouldn't pose much of a problem, actually. I used to run SETI full-time in the background and game. The hit was, at worst, like 2%.

My guess is that the benefit of killing thread shuffling and load balancing will outweigh the costs.

It is, of course, extremely vital that the threads are set to the lowest possible priority.

Yes, DRAM latency and the general design are holding back optimal performance - but that holds back productivity performance just as much... so it's assumed as a constant.
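For anyone who wants to try it, the trick amounts to something like this (my own quick sketch using plain Win32 calls - the idle priority really is the important part):

Code:
/* Quick sketch of the "background load" trick: one idle-priority busy
   thread per logical core, so the scheduler never sees an idle core to
   migrate the game's threads onto.  Run it, then start the game. */
#include <windows.h>

static DWORD WINAPI spin(LPVOID arg)
{
    (void)arg;
    for (;;) { /* burn cycles at the lowest possible priority */ }
}

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    /* Make sure everything in this process yields to anything else. */
    SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS);

    for (DWORD i = 0; i < si.dwNumberOfProcessors; i++) {
        HANDLE h = CreateThread(NULL, 0, spin, NULL, 0, NULL);
        SetThreadPriority(h, THREAD_PRIORITY_IDLE);
    }

    Sleep(INFINITE);   /* keep the load alive until killed */
    return 0;
}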
 
  • Like
Reactions: Drazick

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Yes, all those things could be applied to AMD, and more.
Phenom X6 "ergh, the problem is that you need to make your software more MT", FX-8150 "ergh, the problem is that you need to make your software even more MT", CMT, 754, 939, AM2, AM2+, AM3, AM3+, FM1, FM2, FM2+, AM1, XOP, FMA4, 3DNow!, SSE4a, Polaris "errgh, the problem is you need to make your game DX12/Vulkan", Ryzen "arrrgh, there is no way you are gonna believe me if I say the problem is MT in software for the 3rd time, right?".

Seriously, this is getting old... it's like Intel screaming at developers about why they don't use AVX, every new generation since SB.

Take a reality check:
May I remind you the Zen 1800X is faster than a 6900K in R15 and several other workloads, and does so in a 95 W TDP.
It's an insanely powerful and efficient arch out of the gate.
Of course there are a few things to get fixed.
But no need for the sarcasm about a processor that beats a 6900K in R15. Man. Wake up.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Yes, DRAM latency and the general design are holding back optimal performance - but that holds back productivity performance just as much... so it's assumed as a constant.
But does it? Simplifying, for the sake of the picture: something like Cinebench just spawns a bunch of worker threads and waits for them to finish, maybe even with a plain join. Each worker thread only needs to communicate with the final one once: to report being done. And since Cinebench does not scale that much with memory on any platform, it can be presumed that the memory accesses of individual workers are fairly sparse and are not impacted by a slower IMC.

OTOH, any well-threaded game expects to communicate with its worker threads on every iteration of the main loop.

Wait a minute... Semaphores and general thread-management stuff are done via syscalls, aren't they? Because now I remember that Ryzen had pretty subpar performance in SC2 as well, and neither thread shuffling nor SMT nor a deficit of ST performance can explain it. But an abundance of syscalls in heavy scenarios can.
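To put the simplification above into code, the fork/join pattern is basically this (my own toy sketch with pthreads, obviously not Maxon's actual code):

Code:
/* Toy version of the fork/join pattern described above.  Each worker
   chews through its tiles independently and only "talks" to the main
   thread once, at the join. */
#include <pthread.h>
#include <stdio.h>

#define WORKERS 16

static void *render_tiles(void *arg)
{
    long id = (long)arg;
    /* ... this worker's share of tiles: read-only scene data, mostly
       cache-resident, no locks needed ... */
    return (void *)id;
}

int main(void)
{
    pthread_t t[WORKERS];

    for (long i = 0; i < WORKERS; i++)
        pthread_create(&t[i], NULL, render_tiles, (void *)i);

    /* The only cross-thread communication: the plain join at the end.
       A game's main loop, by contrast, hands out jobs and waits on them
       every single frame. */
    for (long i = 0; i < WORKERS; i++)
        pthread_join(t[i], NULL);

    puts("done");
    return 0;
}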

May I remind you the Zen 1800X is faster than a 6900K in R15 and several other workloads, and does so in a 95 W TDP.
Listen, I like Ryzen, but let's not fool ourselves here and pretend the 1800X is a 95 W TDP CPU.
 
Last edited:
  • Like
Reactions: Conroe

moonbogg

Lifer
Jan 8, 2011
10,635
3,095
136
Disabling SMT helps in games, but it's not enough to fully explain why Ryzen is on par or better in threaded work apps but falls behind in gaming. My guess is that AMD designed the chip for server workloads and the cost-effectiveness of the design process was optimized for those workloads. I think they likely spent their time where the real money is, and game performance was going to end up wherever it ended up. That's the simplest explanation I could think of and it makes a lot of sense, at least to me it does. I think Intel has just had a LOT more time (and money) to design a great server chip while making sure it does as well as possible in consumer apps. AMD hasn't had the luxury of time or money to work out all the compromises. That's what I think at least.
I swear though, if Ryzen 2.0 actually gets fixed, clocks decently, and is on par with or better than Intel chips at a low cost like these current chips are, I will switch to Ryzen right away. Who wouldn't?
 
  • Like
Reactions: Conroe

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
On closer examination: they make 0 sense. Literally 0. Win10 runs faster memory, but the benchmark results swing heavily between favoring one side or the other.

I suspect that will be a result of the single-threaded analyses running on one core.

Then, for multi-threaded, the schedulers do their own thing, with Linux being more NUMA-sympathetic than Windows for certain applications.
 

dogen1

Senior member
Oct 14, 2014
739
40
91
No, people do tests on realistic setups for determining current/TODAY performance. Low res is to see which CPU will bottleneck first in the future (which is an invalid assumption).

Not if you only consider the results applicable for the game you're testing with.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Disabling SMT helps in games, but it's not enough to fully explain why Ryzen is on par or better in threaded work apps but falls behind in gaming. My guess is that AMD designed the chip for server workloads and the cost-effectiveness of the design process was optimized for those workloads. I think they likely spent their time where the real money is, and game performance was going to end up wherever it ended up. That's the simplest explanation I could think of and it makes a lot of sense, at least to me it does. I think Intel has just had a LOT more time (and money) to design a great server chip while making sure it does as well as possible in consumer apps. AMD hasn't had the luxury of time or money to work out all the compromises. That's what I think at least.
I swear though, if Ryzen 2.0 actually gets fixed, clocks decently, and is on par with or better than Intel chips at a low cost like these current chips are, I will switch to Ryzen right away. Who wouldn't?
I think we know the drill, and I agree with you a long way.
Do you game at 120+ Hz, btw?
There are a few relevant new games that need some work for Zen even at 90 fps, but imo it's what is coming in a year or two that matters here. Those few games will get a patch.
The dual CCX is there to stay for the next gens of Zen, and with it the limitations. But RAM latency will be improved, and RAM speed too.
The new Windows scheduler is soon here in its basic form, the most fringe new games will be patched, and then we are one year into Zen and multithreaded games march on.
So Zen+ will look far nicer on the graphs: 150 vs 100 fps. But frankly I don't think it matters except for the minor 144 Hz segment.
In four years we are two BF games newer from DICE, and how does a 4C/8T processor perform then? It will be unplayable. Heck, even the next-gen BF is going to be a stretch for 4C/8T if it's tailored for the next-gen consoles. You can bring even a SKL i7 to its knees today in BF1.

I don't know people's budget for a CPU. But if you play at 144 Hz there is much to be argued for buying a 6800K and OCing it, if you have the dough.
For a slightly lower cost, a non-X 1700 OC'd seems to me like a five-year-safe buy if 60 Hz/75 Hz is used. I think it will outlast even an SB 2500, because the next-gen consoles use the same tech.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Not if you only consider the results applicable for the game you're testing with.
Well, not even there, actually. Lol. New GPUs and driver changes alter the outcome there too.
It's a sad state.

But anyway. Considering how much visual impact graphs have, it's nowhere near the relevance of the information for the casual consumer. I mean, unless you have some outright slow CPU, it's mostly theoretical information.
The effect for 99% of readers is just misinformation.
The energy and time would be better used elsewhere.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
But does it? Simplifying, for the sake of the picture: something like Cinebench just spawns a bunch of worker threads and waits for them to finish, maybe even with a plain join. Each worker thread only needs to communicate with the final one once: to report being done. And since Cinebench does not scale that much with memory on any platform, it can be presumed that the memory accesses of individual workers are fairly sparse and are not impacted by a slower IMC.

Cinebench needs to constantly access the scene data from every thread. That data doesn't change, so there's no lock contention for the scene data, but that's all that is saved over something like a game. Games usually divide up their tasks such that they have little to no lock contention as well and everything happens in a cadence.

It's that cadence that's responsible for the exaggerated drop in performance - it's driven by what is usually the heaviest thread, which is the thread most impacted by load leveling and is usually the most cache-aware and heavily optimized.

Wait a minute... Semaphores and general thread-management stuff are done via syscalls, aren't they? Because now I remember that Ryzen had pretty subpar performance in SC2 as well, and neither thread shuffling nor SMT nor a deficit of ST performance can explain it. But an abundance of syscalls in heavy scenarios can.

Not as much as in the past. Fast syscall, benaphores, user-mode thread management, etc. have greatly reduced the need to actually enter the kernel (the syscall CPU instruction / kernel mode / ring 0 / whatever).

That said, it can't be taken for granted. I'll try to find a good benchmark and run it on each of my systems to test syscall performance. Probably will end up being on a bootable Linux drive, since Windows isn't uniform in my house.
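Something as dumb as this would probably do for a first pass (quick Linux-only sketch; a proper test should also time futex/semaphore round-trips):

Code:
/* Quick-and-dirty syscall overhead test for Linux: times a trivial
   syscall in a tight loop.  Sketch only. */
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#define ITERS 10000000L

int main(void)
{
    struct timespec a, b;

    clock_gettime(CLOCK_MONOTONIC, &a);
    for (long i = 0; i < ITERS; i++)
        syscall(SYS_getpid);           /* forces a real kernel entry */
    clock_gettime(CLOCK_MONOTONIC, &b);

    double ns = (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
    printf("%.1f ns per syscall\n", ns / ITERS);
    return 0;
}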
 
  • Like
Reactions: Drazick and CatMerc

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Not if you only consider the results applicable for the game you're testing with.

Which the reviewers themselves do NOT do. They apply the results to the CPU's overall gaming ability by using unrealistic settings.

Someone buying a $300+ CPU will NOT be running less than 1080p - simple as that.

If the game is putting out 120FPS consistently - then it doesn't matter if another CPU will get you 140FPS... it just isn't anything you'd ever notice... ever.

Ryzen is a perfectly acceptable CPU for gaming - but, just like the 6900K, it trades single-threaded performance (frequency, inter-core bandwidth) for multi-threaded performance. Ryzen just goes a little further than the 6900K. Fair trade for the money.

And if you start looking at the few games that don't push > 100FPS @ 1080p ultra with a GTX 1080 then you need to do a review on why those games are such poor performers and not make that a part of a CPU review.

Not that games don't belong as part of a CPU review - they absolutely do - but the methodology should be to determine if the CPU holds you back in a set of games. It's only not done that way because it is VERY time consuming.
 
  • Like
Reactions: Drazick

Head1985

Golden Member
Jul 8, 2014
1,863
685
136
Disabling SMT helps in games, but it's not enough to fully explain why Ryzen is on par or better in threaded work apps but falls behind in gaming.
The problem is the 2x4 cores and, because of that, the 2x8 MB L3 cache and the slow IMC. If the CPU misses in L1 and L2 then the last save is the L3, which unfortunately is somehow bad, with high latency, and is also not 16 MB but 2x8 MB, so each CCX has only 8 MB. If the CPU misses even in the L3 then it needs to access system memory, and here is another problem: Ryzen has a weak IMC with very HIGH latency.
Work apps don't need fast memory at all. Games do.
This is also why Skylake/Kaby Lake scale so well with faster memory: they have a small 8 MB L3 cache, so they miss a lot and then have to access system memory, but they have an excellent IMC with very low latency, so they can gain 30% performance just from faster memory.

Btw, hardware.fr has done some more testing; the L3 has insane latency at 8 MB.
http://www.hardware.fr/articles/956-24/retour-sous-systeme-memoire-suite.html
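The kind of test hardware.fr runs boils down to pointer chasing through progressively larger buffers; roughly (my own minimal sketch, far less careful than theirs):

Code:
/* Minimal pointer-chase latency test, roughly the kind of measurement
   hardware.fr is doing (theirs is far more careful).  Latency jumps as
   the buffer size crosses L1 -> L2 -> L3 -> DRAM. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    for (size_t kb = 32; kb <= 65536; kb *= 2) {
        size_t n = kb * 1024 / sizeof(size_t);
        size_t *buf = malloc(n * sizeof(size_t));

        /* Sattolo's algorithm: one big random cycle, so hardware
           prefetchers can't help. */
        for (size_t i = 0; i < n; i++) buf[i] = i;
        for (size_t i = n - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = buf[i]; buf[i] = buf[j]; buf[j] = tmp;
        }

        struct timespec a, b;
        size_t idx = 0;
        long hops = 20000000;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (long h = 0; h < hops; h++) idx = buf[idx];  /* dependent loads */
        clock_gettime(CLOCK_MONOTONIC, &b);

        double ns = (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
        printf("%6zu KB: %.2f ns/load (%zu)\n", kb, ns / hops, idx);
        free(buf);
    }
    return 0;
}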