Ryzen: Strictly technical

Page 28
Status
Not open for further replies.

unseenmorbidity

Golden Member
Nov 27, 2016
1,395
967
96
Or, AMD's PR arm isn't talking to the engineers...
Or, MS said that will cost millions, and AMD didn't want to pay for the update, so, now, all is fine, brush it under the rug, nothing new here, carry on!
But they are tied to this for years. A few million shouldn't stop them from fixing it.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,323
4,904
136

unseenmorbidity

Golden Member
Nov 27, 2016
1,395
967
96
I think AMD is preferring to work with the specific game companies whose engines have issues rather than dealing with Wintel.

I've found at least two bugs in Windows related to Ryzen (setting groupsize disables half of Ryzen cores, but acts properly on Intel (Windows 7 - haven't tested with 10, yet) and setting affinity to every other logical core forces workloads to just two cores (Windows 10, about to test on 7)).
I feel like the only way you give that statement is if you know it cannot feasibly be fixed. Kill expectations to minimize future backlash.

Perhaps treating the R7 as 2 discrete CPUs/NUMA nodes would harm performance on a global scale, as it would cause RtR timings to go through the roof.

Best case scenario in that situation would be hoping game devs jump through hoops to give AMD a fighting chance.

Very disappointing...
 

vbored

Junior Member
Sep 7, 2015
12
2
41
AMD supplies the hardware for the current Xbox and the one due this year. I don't fully buy that Microsoft wouldn't lift a finger to help fix it if they could push a solution that wasn't too extreme. If it's not the scheduler, like AMD has said, maybe MS might be able to help with something in that new Game Mode thing that's in beta. Has anyone on the fast ring tried testing Ryzen CPUs in Game Mode with the games that have performance issues?

I know it's a long shot, since the Intel with nVidia/AMD GPU comparisons I've seen don't seem to show much of a performance difference, but it's probably the only place at the OS level where you can expect to see improvements now.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
AMD supplies the hardware for the current Xbox and the one due this year. I don't fully buy that Microsoft wouldn't lift a finger to help fix it if they could push a solution that wasn't too extreme. If it's not the scheduler, like AMD has said, maybe MS might be able to help with something in that new Game Mode thing that's in beta. Has anyone on the fast ring tried testing Ryzen CPUs in Game Mode with the games that have performance issues?

I know it's a long shot, since the Intel with nVidia/AMD GPU comparisons I've seen don't seem to show much of a performance difference, but it's probably the only place at the OS level where you can expect to see improvements now.

It's DEFINITELY the scheduler. But there are other bugs that are much more important related to Windows and Ryzen.

I would not recommend that anyone run Windows 7 on Ryzen. If those are your plans - BUY INTEL... or learn to modify Windows 10 to be more like Windows 7.

So far, Windows 10 is better across the board in my testing using Radeon R9 Fury. Others seem to be finding similar results - the problem seems to be that nVidia + Ryzen + Windows 10 isn't as good as nVidia + Ryzen + Windows 7.

And that same situation may apply to nVidia + Intel + Windows 7... Ryzen may not be a contributing factor in those discoveries.

BF4 Windows 7, Ryzen 1700X Stock, R9 Fury 1050MHz:
Win7_Default_BF4_GPU_Usage.jpg


BF4 Windows 10, Ryzen 1700X Stock, R9 Fury 1050MHz:
Win10_Default_Ryzen_BF4_GPU_Usage.jpg


Same map, same server, same driver version, same everything...

6C results on Windows 10, for kicks:
Win10_HighPerformance_6C_12T_Ryzen_BF4_GPU_Usage.jpg


EDIT:

Not that the game is problematic while running:

Win7_Default_BF4_FPS_Chart.jpg


Oh, and, I'm using the fast ring, Build 1607, all updates for Windows 10. It performs the same as Build 1511.
 

JimmiG

Platinum Member
Feb 24, 2005
2,024
112
106
If you follow the OCN thread you can get it before it gets published to official channels. Elmor works for ASUS ROG R&D and distributed the 0902 BIOS on Friday (when I installed it).

Do you really want to install a beta BIOS from some random Dropbox link in a forum thread on your brand new $1000+ Ryzen build, though? Unless you have serious stability or performance issues, I'd say wait until the BIOSes become officially validated (which takes more than one guy working in the R&D department) before installing. Not worth it for 5% higher memory speeds or whatever...
I think I'm beginning to understand why all those Asus boards suddenly started bricking themselves...
 
  • Like
Reactions: looncraz

iBoMbY

Member
Nov 23, 2016
175
103
86
Do you really want to install a beta BIOS from some random Dropbox link in a forum thread on your brand new $1000+ Ryzen build, though? Unless you have serious stability or performance issues, I'd say wait until the BIOSes become officially validated (which takes more than one guy working in the R&D department) before installing. Not worth it for 5% higher memory speeds or whatever...
I think I'm beginning to understand why all those Asus boards suddenly started bricking themselves...

Yes, they are bricking because every BIOS before 0902 has a serious bug. And if you don't trust a known ASUS developer and overclocker with his uploads, you can of course wait for the official publish, but that takes at least 24-48 hours longer because of formalities.
 
  • Like
Reactions: ZGR

imported_jjj

Senior member
Feb 14, 2009
660
430
136
New AMD blog post https://community.amd.com/community...4/tips-for-building-a-better-amd-ryzen-system

"as part of AMDs ongoing development of the new AM4 platform, AMD will increase support for overclocked memory configurations with higher memory multipliers. We intend to issue updates to motherboard partners in May that will enable them, on whatever products they choose, to support speeds higher than the current DDR4-3200 limit without refclk adjustments."

I wish they had mentioned expectations for 2x16GB too, as the 2666MHz max is a severe limitation today.
Anyway, they seem to have figured out why SMT was problematic in F1 2016.
Edit: It also explains why some tests have the 7700K on top (Techspot) in this game while others (Computerbase) have SKUs with more cores doing better.

  • AMD Ryzen™ 7 1800X (8C16T/3.6-4.0GHz)
  • 16GB G.Skill (2x8) DDR4-3200
    • Clocked to 2133MT/s: 15-15-15-35-1t
    • Clocked to 2933MT/s: 14-14-14-30-1t
  • ASUS Crosshair VI Hero (5704 BIOS)
  • 1x AMD Radeon™ RX 480 GPU (Radeon Software 17.2.1)
  • Windows 10 Anniversary Update (Build 14393.10)
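For anyone comparing the two memory configurations listed above, here is a minimal sketch that converts the first timing number (CAS latency in cycles) into nanoseconds. It assumes the usual DDR relationship of two transfers per clock; the function name is just something picked for this illustration, not anything from AMD's post.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Convert a CAS latency in clock cycles to nanoseconds.

    DDR memory transfers twice per clock, so the memory clock in MHz is half
    the data rate in MT/s and one cycle lasts 2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


if __name__ == "__main__":
    # The two configurations from the test system list: 2133 CL15 and 2933 CL14.
    for rate, cl in ((2133, 15), (2933, 14)):
        print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

So the faster setting is not only higher bandwidth, it also has a noticeably lower absolute CAS latency (roughly 9.5 ns versus roughly 14 ns).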

    Throughout this process we also discovered that F1™ 2016 generates a CPU topology map (hardware_settings_config.xml) when the game is installed. This file tells the game how many cores and threads the system’s processor supports. This settings file is stored in the Steam™ Cloud and appears to get resynced on any PC that installs F1™ 2016 from the same Steam account. Therefore: if a user had a 4-core processor without SMT, then reused that same game install on a new AMD Ryzen™ PC, the game would re-sync with the cloud and believe the new system is also the same old quad core CPU.
    Only a fresh install of the game allowed for a new topology map that better interpreted the architecture of our AMD Ryzen™ processor. Score one for clean computing! But it wasn’t a complete victory. We also discovered that the new and better topology map still viewed Ryzen™ as a 16-core processor, rather than an 8-core processor with 16 threads. Even so, performance was noticeably improved with the updated topology map, and performance went up from there as we threw additional changes into the system.
    As an ultimate maneuver, we asked the question: “Can we edit this file?” The answer is yes! As a final step, we configured F1™ 2016 to use 8 physical CPU cores, rather than the 16 it was detecting by default. Performance went up again! After all was said and done, we gained a whopping 35.53% from our baseline configuration showing how a series of little changes can add up to something big.

    pastedImage_92.png
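To make the "16 cores versus 8 cores with 16 threads" distinction concrete, here is a rough sketch of how a program could count physical cores separately from logical threads. The map layout and function name are assumptions for illustration only; this is not how F1 2016 or its hardware_settings_config.xml actually represents the topology.

```python
from collections import defaultdict


def summarize_topology(logical_to_core):
    """Return (physical_cores, logical_threads) from a logical-CPU -> core-ID map."""
    threads_per_core = defaultdict(int)
    for core_id in logical_to_core.values():
        threads_per_core[core_id] += 1
    return len(threads_per_core), len(logical_to_core)


# Hypothetical 8C/16T layout: logical CPUs 2k and 2k+1 share physical core k.
ryzen_like = {cpu: cpu // 2 for cpu in range(16)}

if __name__ == "__main__":
    cores, threads = summarize_topology(ryzen_like)
    # A naive count of logical CPUs alone would report 16 "cores" here.
    print(f"{cores} physical cores, {threads} logical threads")
```

A detector that only counts logical CPUs sees 16 and may size its worker pool accordingly, which is the kind of mismatch the blog post describes.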
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
After a long couple of days of testing, I can make the following statements with extreme confidence:

Relative Performance:
  • Ryzen has 11% higher IPC than Sandy Bridge
  • Ryzen has a whopping 28% higher multi-threaded performance per clock than Sandy Bridge
  • Ryzen has 52.5% higher IPC than Excavator
  • Ryzen has a gargantuan 82.05% higher multi-threaded performance per clock than Excavator
Memory Sensitivity:
  • Ryzen has some memory latency issues... but they only rarely impact application performance.
  • Multi-threading performance is most sensitive to memory frequency
  • Cinebench is actually memory sensitive on Ryzen!!! Not much, but it's there!
Stability:
  • In my days of testing, I've not had one application crash.
  • Ryzen seems to have built-in safe-guards that may be hiding true over-clocking potential.
Curios:
  • Ryzen employs a self-learning and correction system... it makes the system seem like it has entered an endless boot loop. It follows an exacting pattern: Five full power cycles, two warm reboots, a partial boot, then a normal boot.
  • Memory compatibility issues seem to be almost completely related to not being able to select 2T command rate.
  • Performance seems to scale more than linearly with frequency in some scenarios - meaning a 5% clock speed increase brings a 7% performance increase. I am trying to track down whether this is an aberration or whether it is due to time-based latencies.

I will be testing clock scaling tomorrow as well as verifying a couple of these numbers. I will also work on getting my results online here.
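For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the arithmetic behind "performance per clock" and clock scaling. The benchmark scores in the example are made up purely for illustration and are not results from this thread; only the 5% / 7% pair comes from the post above.

```python
def per_clock(score: float, clock_ghz: float) -> float:
    """Benchmark score normalized by clock speed."""
    return score / clock_ghz


def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of new over old (0.10 means 10% higher)."""
    return new / old - 1.0


def scaling_ratio(perf_gain: float, clock_gain: float) -> float:
    """Performance gain per unit of clock gain; above 1.0 is more than linear."""
    return perf_gain / clock_gain


if __name__ == "__main__":
    # Hypothetical scores: chip A scores 1100 at 4.0 GHz, chip B 750 at 3.0 GHz.
    a = per_clock(1100, 4.0)
    b = per_clock(750, 3.0)
    print(f"A is {relative_gain(a, b):.1%} faster per clock than B")
    # The pair quoted in the post: 5% more clock for 7% more performance.
    print(f"scaling ratio: {scaling_ratio(0.07, 0.05):.2f}")
```

A ratio of 1.0 would be perfectly linear scaling, so the 5% / 7% observation works out to about 1.4, which is why it looks suspicious enough to re-test.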
 

JimmiG

Platinum Member
Feb 24, 2005
2,024
112
106
Yes, they are bricking because every BIOS before 0902 has a serious bug. And if you don't trust a known ASUS developer and overclocker with his uploads, you can of course wait for the official publish, but that takes at least 24-48 hours longer because of formalities.

He might be the janitor sneaking up to unlocked development systems with a USB stick for all we know. Those "formalities" include validating it with a wide range of peripherals, memories etc. to make sure it's stable and won't get stuck in a boot loop (bricking itself) on some configuration. That's worth a 24 hour wait, IMO.
 
  • Like
Reactions: looncraz

imported_jjj

Senior member
Feb 14, 2009
660
430
136
He might be the janitor sneaking up to unlocked development systems with a USB stick for all we know. Those "formalities" include validating it with a wide range of peripherals, memories etc. to make sure it's stable and won't get stuck in a boot loop (bricking itself) on some configuration. That's worth a 24 hour wait, IMO.

Elmor is a well-known overclocker and this is not a new practice by any means. People do it at their own risk and they are well aware of that.
 
  • Like
Reactions: lightmanek

ndtech

Junior Member
Mar 14, 2017
8
2
51
For Ryzen owners.
Please test Ryzen with WinRAR benchmark in single-channel mode and dual-channel mode.
We need WinRAR benchmark results with both modes: Multithreading option off / on.

With dual-channel mode:
Test 1 - WinRAR
Test 2 - WinRAR with Affinity to cores 0-7 (CCX0)
Test 3 - WinRAR with Affinity to cores 8-15 (CCX1)

Then run benchmark with single-channel mode.
You must remove one RAM module so that the system uses only one channel.
Test 4 - single-channel - WinRAR
Test 5 - single-channel - WinRAR with Affinity to cores 0-7 (CCX0)
Test 6 - single-channel - WinRAR with Affinity to cores 8-15 (CCX1)

If the results of the single-channel tests are better, then Ryzen uses separate memory controllers, with one memory controller per CCX.
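To keep the runs straight, here is a small sketch that enumerates the full test matrix (6 tests, each with multithreading off and on) and works out the affinity bitmask for each CPU range. It assumes logical CPUs 0-7 belong to CCX0 and 8-15 to CCX1, as in the list above; the function names are made up for this example.

```python
from itertools import product


def mask_for_range(first: int, last: int) -> int:
    """Bitmask with one bit set per logical CPU in [first, last]."""
    mask = 0
    for cpu in range(first, last + 1):
        mask |= 1 << cpu
    return mask


def build_runs():
    """Every combination of memory mode, affinity, and multithreading option."""
    affinities = [
        ("no affinity", None),
        ("CCX0, CPUs 0-7", mask_for_range(0, 7)),
        ("CCX1, CPUs 8-15", mask_for_range(8, 15)),
    ]
    runs = []
    for memory, (label, mask), multithreaded in product(
        ("dual-channel", "single-channel"), affinities, (False, True)
    ):
        runs.append(
            {"memory": memory, "affinity": label, "mask": mask,
             "multithreading": multithreaded}
        )
    return runs


if __name__ == "__main__":
    for run in build_runs():
        mask = "-" if run["mask"] is None else f"0x{run['mask']:04X}"
        mt = "on" if run["multithreading"] else "off"
        print(f"{run['memory']:<15} {run['affinity']:<16} mask={mask:<7} MT={mt}")
```

The hex values (0x00FF and 0xFF00) are the same kind of mask that `start /affinity` takes on Windows, if you prefer launching the benchmark from a command prompt instead of setting affinity in Task Manager.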
 

Kromaatikse

Member
Mar 4, 2017
83
169
56
Throughout this process we also discovered that F1™ 2016 generates a CPU topology map (hardware_settings_config.xml) when the game is installed. This file tells the game how many cores and threads the system’s processor supports. This settings file is stored in the Steam™ Cloud and appears to get resynced on any PC that installs F1™ 2016 from the same Steam account. Therefore: if a user had a 4-core processor without SMT, then reused that same game install on a new AMD Ryzen™ PC, the game would re-sync with the cloud and believe the new system is also the same old quad core CPU.

*FACEPALM*

It seems that Microsoft isn't the only company capable of making utterly boneheaded design decisions. But the good news is that a game developer has a better chance of fixing them.

As a final step, we configured F1™ 2016 to use 8 physical CPU cores, rather than the 16 it was detecting by default. Performance went up again!

That does seem to confirm that inter-CCX bandwidth and/or latency is a problem for games - as long as this game is setting affinity for its threads to CPUs 0-7, rather than to all the even-numbered CPUs.
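The difference between those two affinity choices can be sketched as follows. This assumes the enumeration discussed earlier in the thread (SMT siblings are adjacent logical CPUs, and each CCX holds four cores), which is an assumption here and not something the game's file confirms.

```python
def ccx_of(cpu: int, threads_per_core: int = 2, cores_per_ccx: int = 4) -> int:
    """CCX index of a logical CPU under the assumed enumeration."""
    return (cpu // threads_per_core) // cores_per_ccx


def describe(cpus, threads_per_core: int = 2):
    """Return (number of physical cores used, sorted list of CCXs touched)."""
    cpus = list(cpus)
    cores = {cpu // threads_per_core for cpu in cpus}
    ccxs = {ccx_of(cpu, threads_per_core) for cpu in cpus}
    return len(cores), sorted(ccxs)


if __name__ == "__main__":
    print("CPUs 0-7:        ", describe(range(8)))         # one CCX, SMT pairs
    print("even-numbered:   ", describe(range(0, 16, 2)))  # one thread per core
```

Under these assumptions, "8 CPUs" pinned to 0-7 means four cores with SMT inside a single CCX, while the even-numbered set means eight physical cores spread over both CCXs. Only the first case would point at inter-CCX traffic as the cause of the earlier slowdown.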

  • Memory compatibility issues seem to be almost completely related to not being able to select 2T command rate.

This is useful information indeed.
 

OrangeKhrush

Senior member
Feb 11, 2017
220
343
96
I feel like the only way you give that statement is if you know it cannot feasibly be fixed. Kill expectations to minimize future backlash.

Perhaps treating the R7 as 2 discrete CPUs/NUMA nodes would harm performance on a global scale, as it would cause RtR timings to go through the roof.

Best case scenario in that situation would be hoping game devs jump through hoops to give AMD a fighting chance.

Very disappointing...

It ultimately depends on where people expected Ryzen to be. I always expected something like Haswell or thereabouts, and if you aggregate the tests it is about on par. Ryzen is step 1, which was necessary, and AMD did well with that. I think step 2 will be harder, as it requires AMD to show progress on a baked-in uarch.

What I have been unhappy about is the motherboard shortages and the BIOS issues, but that is at least fixable in this turn.

Where I think AMD can improve for PR is on the IMC side, obviously, and they need to ditch 14nm LPP like a bad habit: it does not scale frequency well and seems to hit a wall in the mid to low 3GHz range. I think a new node will be able to support truly comparable clock speeds. During this turn 14nm LPP may mature as new steppings arrive, but I don't think it is a node designed for high performance.
 
  • Like
Reactions: Minkoff

Kromaatikse

Member
Mar 4, 2017
83
169
56
I haven't read anything here yet about a crash with some FMA3 instructions on Ryzen (see http://forum.hwbot.org/showthread.php?t=167605). This sounds similar to a problem with Skylake and Prime95 when it was released, where some AVX instructions in Prime95 also caused a crash.

From late in that thread: "Was told this issue will be fixed in a new AGESA code."
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
It ultimately depends on where people expected Ryzen to be. I always expected something like Haswell or thereabouts, and if you aggregate the tests it is about on par. Ryzen is step 1, which was necessary, and AMD did well with that. I think step 2 will be harder, as it requires AMD to show progress on a baked-in uarch.
They still have some interesting options for Zen2, which in part didn't make it into Zen1 due to complexity and available time and other resources.

As I wrote before, they might use Zen1 as a more general purpose core to be used from top to bottom, servers to mobile. But Zen2 might be added to the portfolio as a more specialized core, improving on the remaining weaknesses of Zen1. For example a likely K12 related AMD patent (covering an AArch64 CPU) showed a third AGU. Schedulers, FPRF read ports for FMA, renamer, buffer sizes, SMT partitioning, etc. could still be improved on. Mind you, that Ryzen is where it is with all those trade offs.
 

naukkis

Senior member
Jun 5, 2002
701
569
136
I'm pretty certain the DRAM IP isn't supplied by Rambus.
That's because the IPs used in Steamroller and Excavator weren't and Zeppelin has almost identical controller structure (at interface register level) as SR and XV had.

Rambus has made DDR4 and HBM2 memory controllers for GF 14LPP. I can't imagine any GF customer other than AMD needing one.

And as third party memory controllers are usually DFI-compliant they naturally share that standardized interface register level.
 

CrazyElf

Member
May 28, 2013
88
21
81
Quick questions

  • UCLK memory controller
  • DFICLK and FCLK - both are data fabric clocks? What is the difference?
  • Then the GMI is 4x the FCLK and NOT the DFICLK?
These 3 (UCLK, DFICLK, and FCLK) run at 50% of MEMCLK (RAM speed), although UCLK can be modified to run at 100%.
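Taking those ratios at face value, the relationship can be sketched like this. It assumes "RAM speed" means the DDR4 rating in MT/s (so DDR4-3200 gives 1600 MHz at 50%), which is one reading of the post and not a confirmed spec; the function and parameter names are made up for the example.

```python
def derived_clocks_mhz(ram_speed_mts: float, uclk_full_rate: bool = False) -> dict:
    """Clocks implied by the 50% rule stated above (an unverified assumption)."""
    half = ram_speed_mts / 2
    return {
        # UCLK: memory controller clock, optionally run at 100% per the post.
        "UCLK": ram_speed_mts if uclk_full_rate else half,
        "DFICLK": half,
        "FCLK": half,
    }


if __name__ == "__main__":
    for rating in (2133, 2933, 3200):
        print(f"DDR4-{rating}: {derived_clocks_mhz(rating)}")
```

If the ratios hold, faster RAM raises the fabric clocks along with it, which would fit the memory sensitivity reported earlier in the thread.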
 
  • Like
Reactions: looncraz

CrazyElf

Member
May 28, 2013
88
21
81
They still have some interesting options for Zen2, which in part didn't make it into Zen1 due to complexity and available time and other resources.

As I wrote before, they might use Zen1 as a more general purpose core to be used from top to bottom, servers to mobile. But Zen2 might be added to the portfolio as a more specialized core, improving on the remaining weaknesses of Zen1. For example a likely K12 related AMD patent (covering an AArch64 CPU) showed a third AGU. Schedulers, FPRF read ports for FMA, renamer, buffer sizes, SMT partitioning, etc. could still be improved on. Mind you, that Ryzen is where it is with all those trade offs.



Yeah that would be good.

The SMT performance drops we are seeing would disappear if that were to happen.

I'm thinking:

  • L4 cache
  • If they could get the clocks up to 14LPU, we're looking at a few hundred more MHz - very good if they want to go head to head with Skylake E (out in a few months). We expect Zen+ to be out in the first half of 2018.
  • I really think they need to find a way to reduce memory controller latency
  • Increase buffer sizes as you've discussed
  • Maybe the other improvements (Ex: third AGU)
 