Ryzen: Strictly technical

Page 6
Status
Not open for further replies.

Ajay

Lifer
Jan 8, 2001
15,431
7,849
136
This will not be good for the 4-6 cores if true, hopefully they can get another stepping that clocks higher out for at least the 4 core parts.

Yep. OT: This will be very important for Raven Ridge. AMD/GLF need better than a 4 GHz 'Turbo' clock. I hope they can work it out.
 

Ajay

Lifer
Jan 8, 2001
15,431
7,849
136
Raven isn't necessarily made on 14nm LPP ;)

:eek:. I'm pretty sure you won't tell, but, what node might it be implemented on?

Edit:

Hmm, hadn't read about 14nm LPU - I've been slacking...
10% higher perf @ iso power (supposedly).
 
Last edited:

imported_jjj

Senior member
Feb 14, 2009
660
430
136
Raven Ridge is notebook-first, and I don't see why they need high clocks on desktop, as it would be aimed at folks who don't buy discrete GPUs or do demanding things on it. Summit Ridge is the workhorse; RR is for less demanding users. It would make a difference for marketing and ASPs, I guess.
 

PPB

Golden Member
Jul 5, 2013
1,118
168
106
In "OC-Mode" there is no need to undervolt any other PState but the P0. This is because P0 PState is the only one which is used in "OC-Mode". Basically Ryzen is overclocked the same way all of the previous chips were, just remember not to increase the voltage immediately.

E.G.
"Normal-Mode" - P0 PState VID = 1.36250V, SMU voltage offset = ~ -120mV, effective voltage = 1.24250V.
"OC-Mode" - P0 PState VID = 1.36250V, SMU voltage offset = ±0mV, effective voltage = 1.36250V
But aren't the other PStates affected by the Normal-Mode voltage offset, too? I thought all of them were affected by the undervolt offset of Normal-Mode, and that this caused more power consumption in all scenarios besides P0.

 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
But aren't the other PStates affected by the Normal-Mode voltage offset, too? I thought all of them were affected by the undervolt offset of Normal-Mode, and that this caused more power consumption in all scenarios besides P0.


They are, but the other PStates are only used in "Normal-Mode", not in "OC-Mode".
The lower PStates could specify 0.000V and it would make no difference, since they're never used in "OC-Mode".
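The voltage arithmetic from the quoted example can be written out as a small sketch. This only models the two numbers given in the thread (a P0 VID of 1.36250V and an SMU offset of roughly -120mV in "Normal-Mode"); the function name and the use of integer microvolts are assumptions made for illustration, not anything the SMU exposes.

```python
# Hypothetical model of the example in this thread, not an SMU interface.
# Voltages are kept as integer microvolts to avoid floating-point rounding.

NORMAL_MODE_OFFSET_UV = -120_000  # approx. -120 mV, as quoted in the thread


def effective_voltage_uv(vid_uv: int, oc_mode: bool,
                         normal_offset_uv: int = NORMAL_MODE_OFFSET_UV) -> int:
    """Return the effective core voltage for a PState VID.

    In "OC-Mode" the SMU offset is 0, so the VID is applied as-is.
    In "Normal-Mode" the SMU offset is added to the VID.
    """
    offset_uv = 0 if oc_mode else normal_offset_uv
    return vid_uv + offset_uv


p0_vid_uv = 1_362_500  # 1.36250 V
print(effective_voltage_uv(p0_vid_uv, oc_mode=False) / 1_000_000)  # 1.2425
print(effective_voltage_uv(p0_vid_uv, oc_mode=True) / 1_000_000)   # 1.3625
```

This is why the advice is not to raise the voltage straight away when switching modes: the same VID already lands about 120mV higher once the offset is gone.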

"OC-Mode" = P0 PState.
 

PPB

Golden Member
Jul 5, 2013
1,118
168
106
They are, but the other PStates are only used in "Normal-Mode", not in "OC-Mode".
The lower PStates could specify 0.000V and it would make no difference, since they're never used in "OC-Mode".

"OC-Mode" = P0 PState.
Hmmm. That IMO seems like a poor decision. I mean, it's almost like Skylake non-K overclocking, where you are only left with P0, and idle and mixed-load power consumption suffer a lot from this.

 

dpnelson

Junior Member
Mar 3, 2017
3
6
51
My apologies if this has been asked and answered already, but does anybody know how the cores are ordered in Windows? On Linux, an 8C/16T device would have its cores ordered as ABCDEFGHabcdefgh, where "A" and "a" map to the same physical core. I haven't done affinity work on Windows, but my understanding is that Intel logical cores would be ordered AaBbCcDdEeFfGgHh. If Ryzen represents them some other way (such as by core complex, ABCDabcdEFGHefgh), that could cause all sorts of problems for applications setting affinity. If an application has one heavy thread (such as the "main" thread in many games) and locks it to one core while blocking other threads from using that core or its SMT-paired core, having the ordering wrong could reduce the performance of the main thread's core when other threads are accidentally sent to its SMT-paired core.
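To make the three orderings concrete, here is a minimal sketch that maps a logical CPU index to its physical core and finds its SMT sibling under each scheme. The layout names ("split", "interleaved", "per_ccx") and the 4-core complex size are assumptions chosen for illustration; the per-CCX layout in particular is the hypothetical case from the question, not a confirmed Ryzen enumeration.

```python
def physical_core(logical: int, cores: int, layout: str, ccx_size: int = 4) -> int:
    """Return the physical core index for a logical CPU index.

    layout "split":       ABCDEFGHabcdefgh (siblings are `cores` apart)
    layout "interleaved": AaBbCcDdEeFfGgHh (siblings are adjacent)
    layout "per_ccx":     ABCDabcdEFGHefgh (hypothetical, grouped per core complex)
    """
    if not 0 <= logical < 2 * cores:
        raise ValueError("logical index out of range")
    if layout == "split":
        return logical % cores
    if layout == "interleaved":
        return logical // 2
    if layout == "per_ccx":
        block, offset = divmod(logical, 2 * ccx_size)
        return block * ccx_size + offset % ccx_size
    raise ValueError(f"unknown layout: {layout}")


def smt_sibling(logical: int, cores: int, layout: str, ccx_size: int = 4) -> int:
    """Return the other logical CPU that shares a physical core with `logical`."""
    core = physical_core(logical, cores, layout, ccx_size)
    return next(
        other for other in range(2 * cores)
        if other != logical and physical_core(other, cores, layout, ccx_size) == core
    )


for layout in ("split", "interleaved", "per_ccx"):
    print(layout, "-> sibling of logical CPU 0 is", smt_sibling(0, 8, layout))
```

An application that hard-codes "logical 0 and 1 are siblings" would only be right under the interleaved layout, which is the failure mode described above.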
 

zir_blazer

Golden Member
Jun 6, 2013
1,164
406
136
Sorry, The Stilt, for hijacking your interesting Thread a bit, but for anyone who is into Passthrough as I am: I got the lspci output from Patrick from ServeTheHome from a Ryzen in an ASUS PRIME B350-PLUS. The PCI topology looks like this. I actually find it rather clean and flexible.
I hear from this video that there seems to be poor isolation, and everything ends up in the same IOMMU Group, or something like that (they don't show log info or anything related, so I can't see how it actually looks). However, that means that they DID enable the IOMMU for the Linux Kernel (otherwise you won't get the IOMMU Group constructs) and the Kernel didn't panic when doing so. Since it's an isolation issue, what seems to be broken is PCIe ACS; AMD-Vi/IOMMU itself works. However, with no further info it is impossible to know what is broken.
For reference, Skylake and Kaby Lake Chipsets also initially had ugly IOMMU Groups because the Chipset has a PCIe ACS-related erratum: the ACS is found at a slight offset compared to where the specification says it should be. Thus, for proper functionality, they require a Linux Kernel that includes the quirk fix so you get proper IOMMU Grouping. Since most people focus on Ubuntu, chances are that if there was some last-minute work or fixes for Ryzen that got included in the latest Linux Kernel, they are missing them. A bleeding-edge distribution like Arch Linux, or a self-compiled Linux Kernel from git, could be more interesting for Ryzen.
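For anyone who wants to check the grouping on their own board instead of relying on a video, a minimal sketch follows. It assumes a Linux system with the IOMMU enabled, where the kernel exposes groups under /sys/kernel/iommu_groups; the function name and the `root` parameter are just illustrative choices.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict:
    """Return {group number: sorted list of device names} from sysfs.

    Returns an empty dict when the directory is missing, which usually
    means the IOMMU is disabled or unsupported.
    """
    base = Path(root)
    groups = {}
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(f"IOMMU Group {number}: {', '.join(devices)}")
```

If everything behind the Chipset lands in one group, that points at ACS isolation rather than at AMD-Vi itself, since the groups would not exist at all without a working IOMMU.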


One thing that really surprises me is Ryzen's performance-per-Watt: AMD not only got competitive with Intel, but can actually beat it if you focus on that metric. I think that Ryzen's weakness is the low Frequency ceiling; 4-4.2 GHz is not enough to put on a good show against high-Frequency Kaby Lakes in classical desktop workloads and gaming, but for Linux Server workloads it is AMAZINGLY IMPRESSIVE. Also, the binning that the 1800X may require puts it extremely close to the "factory overclock" definition, taking away the performance-per-Watt and the fun of overclocking. Ryzen could put on an excellent show if it stays at the 3.3 GHz Critical 1 Point, which is why the 1700 looks so good.
On Server workloads, where performance-per-Watt is more important, it is probable that Ryzen may be dramatically better. I think that ultimately this is what will help AMD the most, especially from a profitability point of view, since AMD's market share in Servers (which have higher profit margins) has been pretty much nonexistent since Sandy Bridge, and Ryzen could help it get back in with force. Desktop workloads don't showcase what it can truly do; they just show its weaknesses. It could have made more sense if they had focused on Server first, like the original AMD K8, where Opterons came about 6 months before the Athlon 64s, but all the present issues show instead that it would have been too immature a product for that...
I believe that Ryzen is "Bulldozer done right". I had the same expectations for Ryzen that I had for Bulldozer (reducing the ST gap with Intel to the "good enough" point, but with dramatically better MT performance at similar price points), just that this time AMD delivered.

BTW, what are the chances that we may see a midterm Ryzen refresh? I remember the Phenom II C2 vs C3 and the Piledriver FX 8320E (which was a more optimized Stepping; I recall a Thread from The Stilt talking about it some years ago). If Ryzen 4C/6C parts get an extra 300-500 MHz of headroom, they will seriously threaten mainstream Kaby Lakes in ST. But at that point, they may have to face Coffee Lake instead...
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136
850 points in Cinebench 15 at 30W, so all 8 CPU cores are active. What is the CPU operating frequency at 30W? :cool:
 

hojnikb

Senior member
Sep 18, 2014
562
45
91
Is it possible for mobo makers to implement this "cTDP" feature ?

Also, if you "lock" the TDP down, does single-threaded or lightly threaded performance also suffer that much, or is it smart enough to boost to over 3 GHz?
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Is it possible for mobo makers to implement this "cTDP" feature ?

Also, if you "lock" the TDP down, does single-threaded or lightly threaded performance also suffer that much, or is it smart enough to boost to over 3 GHz?

It would be very easy for the ODMs to implement, if they had the knowledge to program the SMU outside the standard AGESA functions. Also since the feature isn't officially supported in consumer SKUs, AMD might simply tell the ODMs not to implement it in their bioses. I'm not saying they would, but it is entirely possible.

Regarding the ST: 1800X at default, with turbo & XFR enabled scores 162 in Cinebench 15. With the TDP (PPT) limited to 30W the score is 155.

Also once I figure out the way to safely (without risking the stability) undervolt the CPU in "Normal-Mode", then the performance at lower TDPs will be even more impressive.
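To put the two single-thread scores in proportion, here is a quick worked calculation. It uses only the two Cinebench 15 numbers quoted above (162 at default, 155 with PPT limited to 30W); the helper name is made up for this sketch.

```python
def retained_fraction(limited_score: float, default_score: float) -> float:
    """Fraction of the default score that survives the power limit."""
    if default_score <= 0:
        raise ValueError("default_score must be positive")
    return limited_score / default_score


st_default = 162  # 1800X, turbo and XFR enabled
st_limited = 155  # same chip, PPT limited to 30W

retained = retained_fraction(st_limited, st_default)
print(f"ST score retained at 30W: {retained:.1%}")  # 95.7%
print(f"ST score lost at 30W: {1 - retained:.1%}")  # 4.3%
```

So the 30W limit costs only about 4% in single-threaded Cinebench, which suggests the limit leaves the single-core boost largely intact.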
 

someEEguy

Member
Jun 5, 2013
71
31
91
There are no guarantees that you will be able to do that, regardless of the motherboard.
ASUS PRIME X370-PRO and Crosshair VI Hero are among your best bets.

It seems reviewers have had some issues with early Asus and to a lesser extent MSI boards.
Tim Schiesser (Scorpus) @techspot said:
I will put this here for people wondering about the gaming results:

We had a lot of trouble benchmarking games with Ryzen. It seems we weren't the only ones, and many other reviewers have reported strangely low performance here. Our initial Asus board was plagued with bugs, and we saw some gains simply by switching to a Gigabyte or Asrock board. This really isn't the sort of behavior you'd expect, and AMD even acknowledged there were some issues with some Asus boards.

While we are pretty confident in our application test results, there could be some unresolved early issues with Ryzen and AM4 boards that is leading to strangely low gaming performance. We're not 100% sure what is going on there; Steve and I spent a while discussing what could be up, and we ended up confused more than anything else.

So if you're a gamer that's looking at our gaming results and thinking "that's disappointing", there could be an unresolved story here.

Of course one possible conclusion is simply that Ryzen isn't that amazing in games, but we're just not fully sure that is truly the case if all hardware was working correctly

EDIT: Don't get your hopes up about a potential fix. The results we achieved could be it, and you should make any buying decisions accordingly at this stage. The best thing may be to wait a few weeks just to make sure ;)
http://www.techspot.com/review/1345-amd-ryzen-7-1800x-1700x/

AMD@Gamers Nexus:
https://youtu.be/TBf0lwikXyU?t=405
 

lopri

Elite Member
Jul 27, 2002
13,209
594
126
Thank you, The Stilt, for this thorough and enlightening work. I read a few other "high-profile" reviews that are mostly low-quality (yet high on drama/hyperbole and eager to judge), and none of them gave me the insights The Stilt provided in this focused review. I am still trying to digest information provided in this thread and will most likely have some questions later, but for now I wanted to show my gratitude for The Stilt's high quality data as well as the sensible presentation.
 

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
It would be very easy for the ODMs to implement, if they had the knowledge to program the SMU outside the standard AGESA functions. Also since the feature isn't officially supported in consumer SKUs, AMD might simply tell the ODMs not to implement it in their bioses. I'm not saying they would, but it is entirely possible.

Regarding the ST: 1800X at default, with turbo & XFR enabled scores 162 in Cinebench 15. With the TDP (PPT) limited to 30W the score is 155.

Also once I figure out the way to safely (without risking the stability) undervolt the CPU in "Normal-Mode", then the performance at lower TDPs will be even more impressive.
So you're saying that cTDP limits total power draw over time, but only marginally limits peak turbo speeds. That's very interesting. Might we see mobile SKUs with specs along the lines of the Ivy Bridge/Haswell 17W CPUs, with sub-2GHz base clocks and ~+50% boost clocks (just with 2-4x the cores)? I'd gladly see 4c8t chips moving into the 15-25W mobile space, although with an iGPU thrown into the mix you'd probably need another 10+W of thermal headroom.
 

majord

Senior member
Jul 26, 2015
433
523
136
I have a query or 2!

1. Re SMU data: I notice HWiNFO is reporting core powers and package power. Is this accurate when in 'OC' mode, with a custom Vcore? I ask because it seems to be reading lower than expected at higher Vcores.

2. CPU-NB voltage: does this have a new name? It is still being listed as CPU-NB. And does this plane actually supply the DF?
 

Rngwn

Member
Dec 17, 2015
143
24
36
Glad you asked...

YTN3SVH.png


Same driver version, same everything else.
17.8% faster than Win 10 :rolleyes:

So, this confirms that Ryzen really does support Windows 7, even better than Windows 10. Oh, the irony. :p


Code:
AMD Ryzen: ZD3601BAM88F4_40/36_Y            
AMD64 Family 23 Model 1 Stepping 1, AuthenticAMD
HTT           *   Multicore
HYPERVISOR   -   Hypervisor is present
VMX           -   Supports Intel hardware-assisted virtualization
SVM           *   Supports AMD hardware-assisted virtualization
X64           *   Supports 64-bit mode

SMX           -   Supports Intel trusted execution
SKINIT       *   Supports AMD SKINIT

NX           *   Supports no-execute page protection
SMEP         *   Supports Supervisor Mode Execution Prevention
SMAP         *   Supports Supervisor Mode Access Prevention
PAGE1GB       *   Supports 1 GB large pages
PAE           *   Supports > 32-bit physical addresses
PAT           *   Supports Page Attribute Table
PSE           *   Supports 4 MB pages
PSE36         *   Supports > 32-bit address 4 MB pages
PGE           *   Supports global bit in page tables
SS           -   Supports bus snooping for cache operations
VME           *   Supports Virtual-8086 mode
RDWRFSGSBASE   *   Supports direct GS/FS base access

FPU           *   Implements i387 floating point instructions
MMX           *   Supports MMX instruction set
MMXEXT       *   Implements AMD MMX extensions
3DNOW         -   Supports 3DNow! instructions
3DNOWEXT      -   Supports 3DNow! extension instructions
SSE           *   Supports Streaming SIMD Extensions
SSE2         *   Supports Streaming SIMD Extensions 2
SSE3         *   Supports Streaming SIMD Extensions 3
SSSE3         *   Supports Supplemental SIMD Extensions 3
SSE4a         *   Supports Streaming SIMDR Extensions 4a
SSE4.1       *   Supports Streaming SIMD Extensions 4.1
SSE4.2       *   Supports Streaming SIMD Extensions 4.2

AES           *   Supports AES extensions
AVX           *   Supports AVX intruction extensions
FMA           *   Supports FMA extensions using YMM state
MSR           *   Implements RDMSR/WRMSR instructions
MTRR         *   Supports Memory Type Range Registers
XSAVE         *   Supports XSAVE/XRSTOR instructions
OSXSAVE       *   Supports XSETBV/XGETBV instructions
RDRAND       *   Supports RDRAND instruction
RDSEED       *   Supports RDSEED instruction

CMOV         *   Supports CMOVcc instruction
CLFSH         *   Supports CLFLUSH instruction
CX8           *   Supports compare and exchange 8-byte instructions
CX16         *   Supports CMPXCHG16B instruction
BMI1         *   Supports bit manipulation extensions 1
BMI2         *   Supports bit manipulation extensions 2
ADX           *   Supports ADCX/ADOX instructions
DCA           -   Supports prefetch from memory-mapped device
F16C         *   Supports half-precision instruction
FXSR         *   Supports FXSAVE/FXSTOR instructions
FFXSR         *   Supports optimized FXSAVE/FSRSTOR instruction
MONITOR       *   Supports MONITOR and MWAIT instructions
MOVBE         *   Supports MOVBE instruction
ERMSB         -   Supports Enhanced REP MOVSB/STOSB
PCLMULDQ      *   Supports PCLMULDQ instruction
POPCNT       *   Supports POPCNT instruction
LZCNT         *   Supports LZCNT instruction
SEP           *   Supports fast system call instructions
LAHF-SAHF    *   Supports LAHF/SAHF instructions in 64-bit mode
HLE           -   Supports Hardware Lock Elision instructions
RTM           -   Supports Restricted Transactional Memory instructions

DE           *   Supports I/O breakpoints including CR4.DE
DTES64       -   Can write history of 64-bit branch addresses
DS           -   Implements memory-resident debug buffer
DS-CPL       -   Supports Debug Store feature with CPL
PCID         -   Supports PCIDs and settable CR4.PCIDE
INVPCID       -   Supports INVPCID instruction
PDCM         -   Supports Performance Capabilities MSR
RDTSCP       *   Supports RDTSCP instruction
TSC           *   Supports RDTSC instruction
TSC-DEADLINE   -   Local APIC supports one-shot deadline timer
TSC-INVARIANT   *   TSC runs at constant rate
xTPR         -   Supports disabling task priority messages

EIST         -   Supports Enhanced Intel Speedstep
ACPI         -   Implements MSR for power management
TM           -   Implements thermal monitor circuitry
TM2           -   Implements Thermal Monitor 2 control
APIC         *   Implements software-accessible local APIC
x2APIC       -   Supports x2APIC

CNXT-ID       -   L1 data cache mode adaptive or BIOS

MCE           *   Supports Machine Check, INT18 and CR4.MCE
MCA           *   Implements Machine Check Architecture
PBE           -   Supports use of FERR#/PBE# pin

PSN           -   Implements 96-bit processor serial number

PREFETCHW    *   Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 0000000D (Basic), 8000001F (Extended).

Logical to Physical Processor Map:
**--------------  Physical Processor 0 (Hyperthreaded)
--**------------  Physical Processor 1 (Hyperthreaded)
----**----------  Physical Processor 2 (Hyperthreaded)
------**--------  Physical Processor 3 (Hyperthreaded)
--------**------  Physical Processor 4 (Hyperthreaded)
----------**----  Physical Processor 5 (Hyperthreaded)
------------**--  Physical Processor 6 (Hyperthreaded)
--------------**  Physical Processor 7 (Hyperthreaded)

Logical Processor to Socket Map:
****************  Socket 0

Logical Processor to NUMA Node Map:
****************  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
*---------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*---------------  Instruction Cache   0, Level 1,   64 KB, Assoc   4, LineSize  64
*---------------  Unified Cache       0, Level 2,  512 KB, Assoc   8, LineSize  64
*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*--------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*--------------  Instruction Cache   1, Level 1,   64 KB, Assoc   4, LineSize  64
-*--------------  Unified Cache       2, Level 2,  512 KB, Assoc   8, LineSize  64
-*--------------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
--*-------------  Data Cache          2, Level 1,   32 KB, Assoc   8, LineSize  64
--*-------------  Instruction Cache   2, Level 1,   64 KB, Assoc   4, LineSize  64
--*-------------  Unified Cache       4, Level 2,  512 KB, Assoc   8, LineSize  64
--*-------------  Unified Cache       5, Level 3,   16 MB, Assoc  16, LineSize  64
---*------------  Data Cache          3, Level 1,   32 KB, Assoc   8, LineSize  64
---*------------  Instruction Cache   3, Level 1,   64 KB, Assoc   4, LineSize  64
---*------------  Unified Cache       6, Level 2,  512 KB, Assoc   8, LineSize  64
---*------------  Unified Cache       7, Level 3,   16 MB, Assoc  16, LineSize  64
----*-----------  Data Cache          4, Level 1,   32 KB, Assoc   8, LineSize  64
----*-----------  Instruction Cache   4, Level 1,   64 KB, Assoc   4, LineSize  64
----*-----------  Unified Cache       8, Level 2,  512 KB, Assoc   8, LineSize  64
----*-----------  Unified Cache       9, Level 3,   16 MB, Assoc  16, LineSize  64
-----*----------  Data Cache          5, Level 1,   32 KB, Assoc   8, LineSize  64
-----*----------  Instruction Cache   5, Level 1,   64 KB, Assoc   4, LineSize  64
-----*----------  Unified Cache      10, Level 2,  512 KB, Assoc   8, LineSize  64
-----*----------  Unified Cache      11, Level 3,   16 MB, Assoc  16, LineSize  64
------*---------  Data Cache          6, Level 1,   32 KB, Assoc   8, LineSize  64
------*---------  Instruction Cache   6, Level 1,   64 KB, Assoc   4, LineSize  64
------*---------  Unified Cache      12, Level 2,  512 KB, Assoc   8, LineSize  64
------*---------  Unified Cache      13, Level 3,   16 MB, Assoc  16, LineSize  64
-------*--------  Data Cache          7, Level 1,   32 KB, Assoc   8, LineSize  64
-------*--------  Instruction Cache   7, Level 1,   64 KB, Assoc   4, LineSize  64
-------*--------  Unified Cache      14, Level 2,  512 KB, Assoc   8, LineSize  64
-------*--------  Unified Cache      15, Level 3,   16 MB, Assoc  16, LineSize  64
--------*-------  Data Cache          8, Level 1,   32 KB, Assoc   8, LineSize  64
--------*-------  Instruction Cache   8, Level 1,   64 KB, Assoc   4, LineSize  64
--------*-------  Unified Cache      16, Level 2,  512 KB, Assoc   8, LineSize  64
--------*-------  Unified Cache      17, Level 3,   16 MB, Assoc  16, LineSize  64
---------*------  Data Cache          9, Level 1,   32 KB, Assoc   8, LineSize  64
---------*------  Instruction Cache   9, Level 1,   64 KB, Assoc   4, LineSize  64
---------*------  Unified Cache      18, Level 2,  512 KB, Assoc   8, LineSize  64
---------*------  Unified Cache      19, Level 3,   16 MB, Assoc  16, LineSize  64
----------*-----  Data Cache         10, Level 1,   32 KB, Assoc   8, LineSize  64
----------*-----  Instruction Cache  10, Level 1,   64 KB, Assoc   4, LineSize  64
----------*-----  Unified Cache      20, Level 2,  512 KB, Assoc   8, LineSize  64
----------*-----  Unified Cache      21, Level 3,   16 MB, Assoc  16, LineSize  64
-----------*----  Data Cache         11, Level 1,   32 KB, Assoc   8, LineSize  64
-----------*----  Instruction Cache  11, Level 1,   64 KB, Assoc   4, LineSize  64
-----------*----  Unified Cache      22, Level 2,  512 KB, Assoc   8, LineSize  64
-----------*----  Unified Cache      23, Level 3,   16 MB, Assoc  16, LineSize  64
------------*---  Data Cache         12, Level 1,   32 KB, Assoc   8, LineSize  64
------------*---  Instruction Cache  12, Level 1,   64 KB, Assoc   4, LineSize  64
------------*---  Unified Cache      24, Level 2,  512 KB, Assoc   8, LineSize  64
------------*---  Unified Cache      25, Level 3,   16 MB, Assoc  16, LineSize  64
-------------*--  Data Cache         13, Level 1,   32 KB, Assoc   8, LineSize  64
-------------*--  Instruction Cache  13, Level 1,   64 KB, Assoc   4, LineSize  64
-------------*--  Unified Cache      26, Level 2,  512 KB, Assoc   8, LineSize  64
-------------*--  Unified Cache      27, Level 3,   16 MB, Assoc  16, LineSize  64
--------------*-  Data Cache         14, Level 1,   32 KB, Assoc   8, LineSize  64
--------------*-  Instruction Cache  14, Level 1,   64 KB, Assoc   4, LineSize  64
--------------*-  Unified Cache      28, Level 2,  512 KB, Assoc   8, LineSize  64
--------------*-  Unified Cache      29, Level 3,   16 MB, Assoc  16, LineSize  64
---------------*  Data Cache         15, Level 1,   32 KB, Assoc   8, LineSize  64
---------------*  Instruction Cache  15, Level 1,   64 KB, Assoc   4, LineSize  64
---------------*  Unified Cache      30, Level 2,  512 KB, Assoc   8, LineSize  64
---------------*  Unified Cache      31, Level 3,   16 MB, Assoc  16, LineSize  64

Logical Processor to Group Map:
****************  Group 0

Any idea when this thing is going to get fixed?
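The "Logical to Physical Processor Map" section of the dump above can be read mechanically. Below is a minimal sketch of a parser for lines of that shape; it assumes the format stays exactly as pasted (a mask of `*` and `-` characters followed by "Physical Processor N"), and the function name is made up here.

```python
import re

# One mask character per logical CPU, then the physical processor number.
MAP_LINE = re.compile(r"^([*-]+)\s+Physical Processor (\d+)")


def parse_processor_map(lines):
    """Return {logical CPU index: physical processor number}."""
    mapping = {}
    for line in lines:
        match = MAP_LINE.match(line.strip())
        if not match:
            continue
        mask, core = match.group(1), int(match.group(2))
        for logical, flag in enumerate(mask):
            if flag == "*":
                mapping[logical] = core
    return mapping


sample = [
    "**--------------  Physical Processor 0 (Hyperthreaded)",
    "--**------------  Physical Processor 1 (Hyperthreaded)",
]
print(parse_processor_map(sample))  # {0: 0, 1: 0, 2: 1, 3: 1}
```

In this dump, logical processors 0 and 1 share Physical Processor 0, 2 and 3 share Physical Processor 1, and so on, so on this system the SMT siblings are numbered adjacently.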
 