Ryzen: Strictly technical

Discussion in 'CPUs and Overclocking' started by The Stilt, Mar 2, 2017.

  1. dfk7677

    dfk7677 Member

    Joined:
    Sep 6, 2007
    Messages:
    59
    Likes Received:
    20
    @The Stilt What is your explanation of Ryzen not being fully utilized in games, even when there is no GPU bottleneck?
     
    PotatoWithEarsOnSide likes this.
  2. CatMerc

    CatMerc Senior member

    Joined:
    Jul 16, 2016
    Messages:
    466
    Likes Received:
    444
Is it something you've heard, or are you just saying that for posterity? ;)
     
  3. Insert_Nickname

    Insert_Nickname Platinum Member

    Joined:
    May 6, 2012
    Messages:
    2,846
    Likes Received:
    78
    This is a most informative thread, and an impressive piece of work by The Stilt. Thank you...!

Just got back from building and field testing my Ryzen build, so this write-up is much appreciated. I think you did a lot better than most reviews. ;)

That said, the only issue I've had so far is that the Crosshair has a serious dislike of the Crucial memory I got. I simply cannot get it above 2400MHz (rated 2666MHz) no matter the timings and voltage, but it chugs along just fine at 2400/15-15-15-36, completely stable. Well done for a brand new platform. I did have a single crash, but that was due to the dumbs, not the system itself.
     
    inf64 likes this.
  4. TheELF

    TheELF Platinum Member

    Joined:
    Dec 22, 2012
    Messages:
    2,062
    Likes Received:
    77
Games have to respond to your input. This means that no matter how multithreaded the game is, there is one thread that will always run at 100% of the available single-core throughput, holding back the multithreaded part of the game. If you want to see close to full utilization, look at pure GPU benchmarks like The Division or Tomb Raider with the fastest card possible at the lowest resolution possible.
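A toy Amdahl's-law style model makes the point (all millisecond figures are invented, purely to show the scaling shape):

```python
# Toy model of a game frame: a serial main thread plus parallelizable work.
# All numbers are hypothetical; only the scaling behaviour matters.

def frame_time_ms(serial_ms, parallel_ms, cores):
    """Frame time when the parallel work is split evenly across cores."""
    return serial_ms + parallel_ms / cores

def fps(serial_ms, parallel_ms, cores):
    return 1000.0 / frame_time_ms(serial_ms, parallel_ms, cores)

# FPS keeps rising with core count but can never exceed 1000/serial_ms,
# i.e. the serial input/game-logic thread sets the ceiling.
for cores in (1, 2, 4, 8, 16):
    print(cores, "cores ->", round(fps(6.0, 12.0, cores), 1), "fps")
```

With 6 ms of serial work per frame, no number of cores gets past ~166 fps no matter how well the rest parallelizes.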
     
  5. dfk7677

    dfk7677 Member

    Joined:
    Sep 6, 2007
    Messages:
    59
    Likes Received:
    20
That wasn't the case though. No thread went up to 100% in most cases.


Why would the 1700 show less load (on average, and on the most loaded thread in most cases) with no GPU bottleneck, yet produce less output (FPS)?
     
    ButtMagician likes this.
  6. imported_jjj

    imported_jjj Senior member

    Joined:
    Feb 14, 2009
    Messages:
    589
    Likes Received:
    399
    Do you have the means to test the PCIe slot and verify that it isn't the cause of the gaming complications?
     
  7. TheELF

    TheELF Platinum Member

    Joined:
    Dec 22, 2012
    Messages:
    2,062
    Likes Received:
    77
Because there are enough cores to juggle the threads around, so none of them reaches 100% usage in Task Manager/MSI, since those tools measure averages over time.
Run Process Hacker and look at what the actual threads are doing in real time.
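TheELF's averaging point can be shown with a toy simulation (tick and core counts are arbitrary): one thread is pegged at 100%, but because the scheduler migrates it between cores every tick, no single core's average ever looks busy.

```python
# Toy illustration: one thread is busy 100% of the time, but the scheduler
# moves it to a different logical core every tick. A monitor that averages
# per-core busy time over the whole window never shows any core near 100%.

def per_core_utilization(ticks, cores):
    busy = [0] * cores
    for t in range(ticks):
        busy[t % cores] += 1      # the one hot thread runs on core (t mod cores)
    return [b / ticks * 100 for b in busy]

util = per_core_utilization(ticks=160, cores=16)
print(max(util))   # 6.25 -> every core looks nearly idle, yet one
                   # thread was saturated the entire time
```

The thread view (what Process Hacker shows) would report one thread at 100%; the per-core view reports sixteen cores at ~6%.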
     
    looncraz likes this.
  8. PotatoWithEarsOnSide

    Joined:
    Feb 23, 2017
    Messages:
    102
    Likes Received:
    105
    The combination of being:
a) nowhere near full load on the CPU,
b) nowhere near full load on the GPU,
c) using 3x more memory at that particular point, and
d) getting circa 10-15 fps less than the 7700K at the same time...

    ...is what I found strange.

What would cause the memory usage to spike like that?
L1>L2>L3>DRAM, right...?
They both have similar L1 caches, the R7 1700 has double the L2 cache, and the R7 1700 has access to more L3 cache, though split across 2 CCXs.
    The problem has to be with the L3, right?
     
  9. dfk7677

    dfk7677 Member

    Joined:
    Sep 6, 2007
    Messages:
    59
    Likes Received:
    20
    I think the memory refers to VRAM and not system memory.
     
  10. Rngwn

    Rngwn Member

    Joined:
    Dec 17, 2015
    Messages:
    126
    Likes Received:
    18
    My $0.02, this is from a layman so take this with a huge pile of salt:

If the wonky Coreinfo dump on Ryzen really reflects how Windows maps the logical cores to the caches and schedules accordingly (as opposed to being just a wrong presentation), then Ryzen may currently suffer from serious cache thrashing. The L3 is supposed to be seen as 2x8 MB, each slice shared by the cores of one CCX, instead of each logical processor having its own 8 MB of L3 as seen in the dump. This bug likely makes Windows see 16x8 = 128 MB of L3. I could imagine the "excess cache" actually spilling into RAM, or thrashing like crazy within each 8 MB slice.
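The thrashing scenario can be illustrated generically (nothing here is Ryzen-specific; cache and working-set sizes are arbitrary line counts) with a minimal LRU sketch: once the working set exceeds the cache even slightly, a cyclic access pattern collapses the hit rate to zero.

```python
from collections import OrderedDict

def lru_hit_rate(cache_lines, working_set_lines, passes=4):
    """Hit rate of a cyclic scan over `working_set_lines` through an LRU cache."""
    cache = OrderedDict()
    hits = accesses = 0
    for _ in range(passes):
        for line in range(working_set_lines):
            accesses += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)   # mark as most recently used
            else:
                if len(cache) >= cache_lines:
                    cache.popitem(last=False)   # evict least recently used
                cache[line] = True
    return hits / accesses

print(lru_hit_rate(cache_lines=1024, working_set_lines=512))    # 0.75 (fits)
print(lru_hit_rate(cache_lines=1024, working_set_lines=1025))   # 0.0 (one line over)
```

If the OS parks threads on the wrong cores because it believes the cache is bigger or differently shared than it really is, this is the kind of cliff the working set can fall off.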
     
  11. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,216
    Likes Received:
    1,415
The power consumption reported by HWInfo for Zeppelin should be pretty accurate (I haven't validated it fully vs. DCR), however there are a few conditions. The data displayed by HWInfo is based on SVI2 telemetry. The data originates from the VRM controller, however the issue is that it passes through the SMU, where it technically could be altered / skewed in either direction. This is the exact reason why I never use the SVI2 telemetry for AMD, or SVID data for Intel, for power consumption. Additionally, if you use these figures you must not change the RLL (load-line) resistance.

    cTDP caps the total power consumption to a certain value.
    The power consumption will never be exceeded, no matter the workload or number of utilized cores. It works exactly like a rev limiter in engines.
    The performance impact of limiting the power consumption will naturally depend on the number of utilized cores and the workload. Obviously at e.g. 30W you will be able to run a single core close to its maximum XFR ceiling, while the "n" core stress frequency will be more limited.
On Zeppelin the capped figure is the "Package Power" (PP), unlike with all of the previous designs (excl. Carrizo / Bristol Ridge). This means that all of the different domains (e.g. PCIe PHYs, peripherals, etc.) are included in this power limit, not just the CPU cores & northbridge as with designs such as Orochi (PD), Kaveri (SR), etc. It is truly a total package power limit.
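The rev-limiter behaviour can be sketched with a toy model (all constants are invented; real per-SKU power curves are not modelled): a fixed non-core power plus per-core power growing roughly with f³ (a crude proxy for f·V²), clamped so the package total never exceeds the cap.

```python
# Toy "rev limiter" model of a package power cap. Constants are made up;
# only the shape matters: fewer active cores leave more headroom per core.

UNCORE_W = 8.0   # hypothetical non-core (fabric, PHYs, peripherals) power
K = 1.2          # hypothetical per-core power at 1.0 GHz

def package_power(cores, ghz):
    return UNCORE_W + cores * K * ghz ** 3

def max_freq_under_cap(cores, cap_w, step=0.025):
    """Highest frequency (GHz) keeping total package power under cap_w."""
    f = 0.0
    while package_power(cores, f + step) <= cap_w:
        f += step
    return round(f, 3)

# At a 30 W cap, one core can clock far higher than eight cores can.
for n in (1, 4, 8):
    print(n, "cores @ 30W cap ->", max_freq_under_cap(n, 30.0), "GHz")
```

This mirrors the description above: the cap itself is absolute, while the frequency cost of honouring it depends entirely on how many cores the workload keeps busy.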
     
    T1beriu, Drazick and looncraz like this.
  12. PPB

    PPB Golden Member

    Joined:
    Jul 5, 2013
    Messages:
    1,092
    Likes Received:
    135
Stilt, regarding base clock overclocking. I usually avoid it myself on Intel platforms because it's not worth it; you can't really go over 110 BCLK in most cases, and SATA integrity becomes a concern for me. But on AM4, does the base clock also alter SATA, or just PCI-E? Some people are suggesting using the base clock to overclock, say, a 1700 from 100 to 120 and making it work in PCI-E 2.0 mode. Won't the SATA clock generator also run at 120, thus rapidly increasing the chances of data corruption? Base clock seems a good way to avoid the OC mode when overclocking nonetheless.
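Back-of-the-envelope, the derived clocks scale like this (the multiplier is chosen arbitrarily; the 8.0 and 5.0 GT/s figures are the nominal PCIe 3.0/2.0 per-lane rates at 100 MHz BCLK):

```python
# Hypothetical arithmetic for BCLK overclocking: every clock derived from
# the base clock scales by the same ratio unless the platform re-divides it.

def derived_clocks(bclk_mhz, multiplier):
    return {
        "core_mhz":  bclk_mhz * multiplier,
        # PCIe 3.0 nominally runs at 8.0 GT/s per lane at BCLK = 100 MHz
        "pcie3_gts": 8.0 * bclk_mhz / 100.0,
        # PCIe 2.0 mode (5.0 GT/s nominal) leaves more absolute headroom
        "pcie2_gts": 5.0 * bclk_mhz / 100.0,
    }

clocks = derived_clocks(bclk_mhz=120, multiplier=32)
print(clocks)  # core at 3840 MHz; PCIe 3.0 would sit at 9.6 GT/s, 2.0 at 6.0
```

Which explains the PCI-E 2.0 suggestion: an overclocked 6.0 GT/s link is a far smaller absolute stretch than an overclocked 9.6 GT/s one. Whether SATA shares the same reference is exactly the open question.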

Second and last, does Zen play by the AMD book regarding tCTL? Are we still in "these are not real temps, it's just an internal scale made by AMD and you can't go above X arbitrary number" territory (I think it was 72 on Vishera/FX, e.g.)? I take it tCASE became irrelevant this round, as the socket infrastructure isn't as compromised as it was with FX chips on AM3+.
     
  13. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,216
    Likes Received:
    1,415
I'm confident that most of the issues seen are just initial issues, which occur on each and every platform. There are plenty of potential issues, there is no way to deny that. However, considering that the whole software and firmware stack was basically rewritten in less than four months, my personal opinion is that AMD did extremely well, regardless of all the minor issues. If it was up to me, I would have postponed the launch by 1-2 months. This would have given both the ODMs and AMD time to refine their software and firmware to a point where most of these issues would no longer have existed.

Regardless, I am certain that all of the minor issues will be ironed out within the next month or two. I don't believe there are any actual hardware issues in Zeppelin.
Aside from just fixing the actual issues, I'm confident that the performance will somewhat improve as well ;)

    This is my personal point of view on the subject.
     
    #163 The Stilt, Mar 4, 2017
    Last edited: Mar 4, 2017
  14. DisEnchantment

    Joined:
    Mar 3, 2017
    Messages:
    100
    Likes Received:
    91
    I hope so, because I plan to get myself a Ryzen system, since I cannot afford 6900K.
     
  15. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,216
    Likes Received:
    1,415
    SATA is not affected, at least the ones which are located in Promontory (external FCH).
    I'm not certain if Taishan's (internal FCH) SATAs are affected by the BCLK, but there is a chance they are.

tCTL should no longer be on a linearized scale on Zeppelin. AFAIK it is the ROS (alternating, highest sensor reading within the CCXs) on a °C scale, similar to AMD K10 cores or GPUs.
    In my experience the temperatures reported are quite realistic.

    I'm not fully certain about the retail parts, however the tCTLMax should be 95°C on all SKUs.
    Zeppelin supports cHTC so the ODMs might be configuring the limit to a lower figure if they want, however the default is 95°C.

    tCaseMax is the actual maximum package temperature, measured from surface center of the IHS.
This figure is significantly lower (IIRC 71-56°C), as it is an external temperature (a major delta is to be expected).
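The reported-temperature logic described above can be sketched as follows (all sensor values are invented; only the max-across-CCXs behaviour and the 95°C default limit come from the post):

```python
# Sketch of the described behaviour: tCTL follows the hottest sensor
# reading across the CCXs (the "ROS"), and cHTC kicks in once it
# crosses tCTL_max. Sensor values below are invented.

TCTL_MAX_C = 95.0    # default limit per the post; ODMs may configure lower

def tctl(ccx_sensor_readings_c):
    """Highest sensor reading across all CCXs, already on a real degC scale."""
    return max(max(ccx) for ccx in ccx_sensor_readings_c)

def htc_active(ccx_sensor_readings_c, limit_c=TCTL_MAX_C):
    return tctl(ccx_sensor_readings_c) >= limit_c

readings = [[61.5, 58.0, 60.2, 59.9],   # CCX0 sensors
            [63.0, 62.1, 60.8, 61.7]]   # CCX1 sensors
print(tctl(readings), htc_active(readings))   # 63.0 False
```

The tCase figure is a different quantity entirely (IHS surface, externally measured), which is why it sits so much lower than tCTL.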
     
  16. KTE

    KTE Senior member

    Joined:
    May 26, 2016
    Messages:
    477
    Likes Received:
    130
I highly doubt 14nm LPP was what AMD wanted, and it was definitely not suitable for HEDT. It looks like it was just what was available for HVM, and is mostly suited to mobile.

    Sent from HTC 10
    (Opinions are own)
     
  17. DisEnchantment

    Joined:
    Mar 3, 2017
    Messages:
    100
    Likes Received:
    91
@TheStilt, what kind of software fixes would be needed for the OS? Scheduler/kernel patches, or rather driver support (USB etc.) for the integrated chipset?

Also, at the BIOS level, which kinds of patches are applied: timings and circuitry logic, or also microcode?
     
  18. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,216
    Likes Received:
    1,415
    I'm not familiar with the potential issues at OS side.

    I'd say SMU, PMU (DRAM) firmwares and the microcode are the ones which will have most of the potential improvements & refinements.
Unless there are clear configuration errors made by the ODMs themselves, there aren't too many ways they (the ODMs) can affect performance.
     
  19. cytg111

    cytg111 Diamond Member

    Joined:
    Mar 17, 2008
    Messages:
    3,363
    Likes Received:
    271
    This is golden, thank you!
     
  20. thigobr

    thigobr Junior Member

    Joined:
    Sep 4, 2016
    Messages:
    11
    Likes Received:
    2
    Thanks @The Stilt for this amazing work you have done!

    Would be possible to test PCIE bandwidth to the graphics card?
     
  21. cytg111

    cytg111 Diamond Member

    Joined:
    Mar 17, 2008
    Messages:
    3,363
    Likes Received:
    271
    This is the dealmaker as a server chip right?
(Also, why is this review in a forum thread? It should have its own space, IMO.)
     
  22. PPB

    PPB Golden Member

    Joined:
    Jul 5, 2013
    Messages:
    1,092
    Likes Received:
    135
Yes, it is 100x more informative and thorough than the garbage that some sites spew in the form of "reviews".
     
  23. dnavas

    dnavas Member

    Joined:
    Feb 25, 2017
    Messages:
    74
    Likes Received:
    33
    The large difference you're seeing in draw-call performance between Win7 and Win10 does make me wonder what is going on in Win10. I assume you've tried SMT on/off and HPET on/off.

    Given the way memory clocking ties into the rest of Ryzen, I'm thinking that the "best" RAM is going to be the fastest, and if G.Skill's Flare is tested on the ASUS board, maybe the 3200 is going to let the system shine just that little bit extra and is worth paying for? Is there info covering differences in performance due to memory speeds? [Also, maybe GSkill's announcement portends future firmware updates coming from ASUS?]
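For reference, a trivial sketch of why DRAM speed matters beyond bandwidth, assuming (as commonly described for first-gen Ryzen) that the data-fabric clock runs at the memory clock, i.e. half the DDR transfer rate:

```python
# On first-gen Ryzen the data-fabric clock is commonly described as being
# tied to the memory clock (half the DDR transfer rate). This just applies
# that relationship; actual latencies and overheads are not modelled.

def fabric_clock_mhz(ddr_rate):
    """DDR4-xxxx rating -> memory clock == (described) fabric clock, in MHz."""
    return ddr_rate / 2

for ddr in (2133, 2400, 2666, 3200):
    print(f"DDR4-{ddr} -> fabric ~{fabric_clock_mhz(ddr):.0f} MHz")
```

If that coupling holds, DDR4-3200 vs DDR4-2133 is a ~50% faster fabric, which would speed up cross-CCX traffic as well, not just raw DRAM transfers.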
     
    #173 dnavas, Mar 4, 2017
    Last edited: Mar 4, 2017
  24. DisEnchantment

    Joined:
    Mar 3, 2017
    Messages:
    100
    Likes Received:
    91
    @The Stilt , Looking forward to these improvements in the platform.

So the victim L3 is because the Infinity Fabric cannot be used to snoop the L2 of the other CCX?
Is the Infinity Fabric used intra-CCX as well, or only from a CCX to the outside, i.e. the other CCX / chipset / GPU etc.?
I am also curious whether the data fabric is also scalable in bandwidth, like PCIe lanes for example.

    Sorry, I ask too many questions...

    I am starting to understand how scalable this architecture is.
     
  25. PotatoWithEarsOnSide

    Joined:
    Feb 23, 2017
    Messages:
    102
    Likes Received:
    105
    Good spot. Very sloppy of me.
    Even so, that is an awful lot of VRAM in comparison to the 7700k under less load.
    His earlier videos, at 4k, saw VRAM pretty consistent across both systems...as you'd expect.