Ryzen: Strictly technical

Page 8
Status
Not open for further replies.

looncraz

Senior member
Sep 12, 2011
722
1,651
136
The combination of being:
a) nowhere near full load on CPU,
b) nowhere near full load on GPU,
c) using 3x more memory at that particular point, and
d) having circa 10-15 fps lower than the 7700k at the same time...

...is what I found strange.

What would cause the memory usage to spike like that?
L1>L2>L3>DRAM, right...?
They both have similar L1 cache, the R7 1700 has double the L2 cache, and the R7 1700 has access to more L3 cache, though it is split across two CCXes.
The problem has to be with the L3, right?

Windows load-balances the cores, so the heavy-hitter threads are being moved between different cores (but not to the SMT thread on the same core) every 10ms or so (the Windows kernel scheduling interrupt interval). As was mentioned, you're seeing an average over 0.5 seconds or more, so it will appear that no core is being fully utilized - but they are... momentarily.

This process, though, creates a few issues for Ryzen.

1. It effectively prevents 'AI' prefetch adaptation
- so 10~15% of its total performance is lost right there (if AMD is to be believed)

2. It shuffles data across CCXes about 50% of the time.
- This damages data locality and causes new fetches from memory.

3a. A driver may detect this cache behavior and then load up VRAM to the max for better performance...
OR
3b. nVidia is intentionally loading more data on AMD CPUs... for whatever reason.
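
As a rough illustration of the averaging effect and the "about 50%" figure above, here is a toy model in Python. It is not how the Windows scheduler actually picks cores: it assumes one fully busy thread that hops to the next core every 10 ms on an 8-core part with two 4-core CCXes, and it assumes a migration target is chosen uniformly at random. The function names and both assumptions are illustrative only.

```python
from fractions import Fraction

CORES_PER_CCX = 4   # Ryzen 7: two CCXes of four cores each
NUM_CCX = 2
TOTAL_CORES = CORES_PER_CCX * NUM_CCX


def cross_ccx_probability(allow_same_core: bool = True) -> Fraction:
    """Chance that a uniformly random migration lands on the other CCX."""
    targets = TOTAL_CORES if allow_same_core else TOTAL_CORES - 1
    other_ccx_cores = TOTAL_CORES - CORES_PER_CCX
    return Fraction(other_ccx_cores, targets)


def averaged_core_load(window_ms: int = 500, slice_ms: int = 10) -> list[float]:
    """Per-core load as seen by a monitor that averages over window_ms.

    Assumes a single 100%-busy thread hopping to the next core every slice_ms
    (round-robin, which is a simplification).
    """
    busy_ms = [0] * TOTAL_CORES
    for i in range(window_ms // slice_ms):
        busy_ms[i % TOTAL_CORES] += slice_ms
    return [ms / window_ms for ms in busy_ms]


if __name__ == "__main__":
    print("cross-CCX chance:", cross_ccx_probability())
    print("averaged load per core:", averaged_core_load())
```

Under these assumptions no core ever shows more than about 14% load in a 0.5 s average, even though one core is fully busy at any given instant, and half of the migrations cross the CCX boundary (4 in 7 if the thread never stays on the same core).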
 

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
cTDP caps the total power consumption to a certain value.
That limit will never be exceeded, no matter the workload or number of utilized cores. It works exactly like a rev limiter in an engine.
The performance impact of limiting the power consumption will naturally depend on the number of utilized cores and the workload. Obviously at e.g. 30W you will be able to run a single core close to its maximum XFR ceiling, while the "n" core stress frequency will be more limited.
On Zeppelin the capped figure is the "Package Power" (PP), unlike with all of the previous designs (excl. Carrizo / Bristol Ridge). This means that all of the different domains (e.g. PCIe PHYs, peripherals, etc.) are included in this power limit, not just the CPU cores & northbridge as with designs such as Orochi (PD), Kaveri (SR), etc. It is truly a total package power limit.
Interesting. In my eyes that's a change for the better, as it makes the terminology clearer. But you're saying this isn't officially supported? So there's no chance we'll see motherboards allowing us to run, say, a 1700 at 35W for UCFF-style builds? At least it might bode well for later low-power SKUs, I suppose, especially in Raven Ridge.
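
To make the quoted "rev limiter" behaviour a bit more concrete, here is a minimal Python sketch of a package power cap. Everything numeric in it is a made-up assumption for illustration (the fixed uncore cost, the per-core coefficient, the clock ceiling), as is the rough approximation that per-core power grows with the cube of the clock. It is not AMD's algorithm, only a way to see why one core can stay near its ceiling at 30W while an all-core load cannot.

```python
def max_all_core_clock_ghz(
    package_limit_w: float,
    active_cores: int,
    uncore_w: float = 10.0,        # hypothetical fixed cost: PCIe PHYs, fabric, etc.
    watts_per_ghz3: float = 0.25,  # hypothetical per-core coefficient
    ceiling_ghz: float = 4.1,      # hypothetical single-core clock ceiling
) -> float:
    """Highest clock that keeps uncore + cores under the package limit.

    Uses the rough approximation P_core ~ k * f^3. The limit itself is
    never exceeded; the clock is what gives way.
    """
    if active_cores < 1:
        raise ValueError("need at least one active core")
    core_budget_w = max(package_limit_w - uncore_w, 0.0)
    clock = (core_budget_w / (active_cores * watts_per_ghz3)) ** (1.0 / 3.0)
    return min(clock, ceiling_ghz)


if __name__ == "__main__":
    for cores in (1, 4, 8):
        print(cores, "core(s):", round(max_all_core_clock_ghz(30.0, cores), 2), "GHz")
```

With these invented constants a single core at a 30W package limit is still bound by the clock ceiling, while eight cores have to drop to a little over 2 GHz to stay inside the same budget.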
 

Harney

Junior Member
Mar 4, 2017
3
1
36
This is a most informative thread, and an impressive piece of work by The Stilt. Thank you...!

Just got back from building and field testing my Ryzen build, so this write-up is much appreciated. I think you did a lot better than most reviews. ;)

That said, the only issue I've had so far is the Crosshair has a serious dislike of the Crucial memory I got. I simply cannot get it above 2400MHz (rated 2666MHz) no matter what timings and voltage, but it chucks along just fine at 2400/15-15-15-36 completely stable. Well done for a brand new platform. Did have a single crash, but that was due to the dumbs, not the system itself.

Agreed, The Stilt's work is better than most reviews I have seen too.

Thanks, Stilt, for this great work... long live Win 7
 

starheap

Junior Member
Mar 4, 2017
5
0
1
Here is my Coreinfo output on Windows 10 with an 1800X

Code:
Logical to Physical Processor Map:
**--------------  Physical Processor 0 (Hyperthreaded)
--**------------  Physical Processor 1 (Hyperthreaded)
----**----------  Physical Processor 2 (Hyperthreaded)
------**--------  Physical Processor 3 (Hyperthreaded)
--------**------  Physical Processor 4 (Hyperthreaded)
----------**----  Physical Processor 5 (Hyperthreaded)
------------**--  Physical Processor 6 (Hyperthreaded)
--------------**  Physical Processor 7 (Hyperthreaded)

Logical Processor to Socket Map:
****************  Socket 0

Logical Processor to NUMA Node Map:
****************  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
**--------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
**--------------  Instruction Cache   0, Level 1,   64 KB, Assoc   4, LineSize  64
**--------------  Unified Cache       0, Level 2,  512 KB, Assoc   8, LineSize  64
********--------  Unified Cache       1, Level 3,    8 MB, Assoc  16, LineSize  64
--**------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
--**------------  Instruction Cache   1, Level 1,   64 KB, Assoc   4, LineSize  64
--**------------  Unified Cache       2, Level 2,  512 KB, Assoc   8, LineSize  64
----**----------  Data Cache          2, Level 1,   32 KB, Assoc   8, LineSize  64
----**----------  Instruction Cache   2, Level 1,   64 KB, Assoc   4, LineSize  64
----**----------  Unified Cache       3, Level 2,  512 KB, Assoc   8, LineSize  64
------**--------  Data Cache          3, Level 1,   32 KB, Assoc   8, LineSize  64
------**--------  Instruction Cache   3, Level 1,   64 KB, Assoc   4, LineSize  64
------**--------  Unified Cache       4, Level 2,  512 KB, Assoc   8, LineSize  64
--------**------  Data Cache          4, Level 1,   32 KB, Assoc   8, LineSize  64
--------**------  Instruction Cache   4, Level 1,   64 KB, Assoc   4, LineSize  64
--------**------  Unified Cache       5, Level 2,  512 KB, Assoc   8, LineSize  64
--------********  Unified Cache       6, Level 3,    8 MB, Assoc  16, LineSize  64
----------**----  Data Cache          5, Level 1,   32 KB, Assoc   8, LineSize  64
----------**----  Instruction Cache   5, Level 1,   64 KB, Assoc   4, LineSize  64
----------**----  Unified Cache       7, Level 2,  512 KB, Assoc   8, LineSize  64
------------**--  Data Cache          6, Level 1,   32 KB, Assoc   8, LineSize  64
------------**--  Instruction Cache   6, Level 1,   64 KB, Assoc   4, LineSize  64
------------**--  Unified Cache       8, Level 2,  512 KB, Assoc   8, LineSize  64
--------------**  Data Cache          7, Level 1,   32 KB, Assoc   8, LineSize  64
--------------**  Instruction Cache   7, Level 1,   64 KB, Assoc   4, LineSize  64
--------------**  Unified Cache       9, Level 2,  512 KB, Assoc   8, LineSize  64
 

iBoMbY

Member
Nov 23, 2016
175
103
86
Here is my Coreinfo output on Windows 10 with an 1800X

Yes, I got mine today as well, and my Coreinfo looks exactly like yours, and it seems to be correct so far (only maybe a NUMA node per CCX could be nice). Seems like they fixed that cache thing with some microcode update?
 

Mockingbird

Senior member
Feb 12, 2017
733
741
106
Here is my Coreinfo output on Windows 10 with an 1800X


Yours is correct. I wonder why it's different from OP's (The Stilt's).

Did MSFT release a stealth update to Windows 10?

Code:
AMD Ryzen: ZD3601BAM88F4_40/36_Y               
AMD64 Family 23 Model 1 Stepping 1, AuthenticAMD
HTT           *   Multicore
HYPERVISOR   -   Hypervisor is present
VMX           -   Supports Intel hardware-assisted virtualization
SVM           *   Supports AMD hardware-assisted virtualization
X64           *   Supports 64-bit mode

SMX           -   Supports Intel trusted execution
SKINIT       *   Supports AMD SKINIT

NX           *   Supports no-execute page protection
SMEP         *   Supports Supervisor Mode Execution Prevention
SMAP         *   Supports Supervisor Mode Access Prevention
PAGE1GB       *   Supports 1 GB large pages
PAE           *   Supports > 32-bit physical addresses
PAT           *   Supports Page Attribute Table
PSE           *   Supports 4 MB pages
PSE36         *   Supports > 32-bit address 4 MB pages
PGE           *   Supports global bit in page tables
SS           -   Supports bus snooping for cache operations
VME           *   Supports Virtual-8086 mode
RDWRFSGSBASE   *   Supports direct GS/FS base access

FPU           *   Implements i387 floating point instructions
MMX           *   Supports MMX instruction set
MMXEXT       *   Implements AMD MMX extensions
3DNOW         -   Supports 3DNow! instructions
3DNOWEXT      -   Supports 3DNow! extension instructions
SSE           *   Supports Streaming SIMD Extensions
SSE2         *   Supports Streaming SIMD Extensions 2
SSE3         *   Supports Streaming SIMD Extensions 3
SSSE3         *   Supports Supplemental SIMD Extensions 3
SSE4a         *   Supports Streaming SIMDR Extensions 4a
SSE4.1       *   Supports Streaming SIMD Extensions 4.1
SSE4.2       *   Supports Streaming SIMD Extensions 4.2

AES           *   Supports AES extensions
AVX           *   Supports AVX intruction extensions
FMA           *   Supports FMA extensions using YMM state
MSR           *   Implements RDMSR/WRMSR instructions
MTRR         *   Supports Memory Type Range Registers
XSAVE         *   Supports XSAVE/XRSTOR instructions
OSXSAVE       *   Supports XSETBV/XGETBV instructions
RDRAND       *   Supports RDRAND instruction
RDSEED       *   Supports RDSEED instruction

CMOV         *   Supports CMOVcc instruction
CLFSH         *   Supports CLFLUSH instruction
CX8           *   Supports compare and exchange 8-byte instructions
CX16         *   Supports CMPXCHG16B instruction
BMI1         *   Supports bit manipulation extensions 1
BMI2         *   Supports bit manipulation extensions 2
ADX           *   Supports ADCX/ADOX instructions
DCA           -   Supports prefetch from memory-mapped device
F16C         *   Supports half-precision instruction
FXSR         *   Supports FXSAVE/FXSTOR instructions
FFXSR         *   Supports optimized FXSAVE/FSRSTOR instruction
MONITOR       *   Supports MONITOR and MWAIT instructions
MOVBE         *   Supports MOVBE instruction
ERMSB         -   Supports Enhanced REP MOVSB/STOSB
PCLMULDQ      *   Supports PCLMULDQ instruction
POPCNT       *   Supports POPCNT instruction
LZCNT         *   Supports LZCNT instruction
SEP           *   Supports fast system call instructions
LAHF-SAHF    *   Supports LAHF/SAHF instructions in 64-bit mode
HLE           -   Supports Hardware Lock Elision instructions
RTM           -   Supports Restricted Transactional Memory instructions

DE           *   Supports I/O breakpoints including CR4.DE
DTES64       -   Can write history of 64-bit branch addresses
DS           -   Implements memory-resident debug buffer
DS-CPL       -   Supports Debug Store feature with CPL
PCID         -   Supports PCIDs and settable CR4.PCIDE
INVPCID       -   Supports INVPCID instruction
PDCM         -   Supports Performance Capabilities MSR
RDTSCP       *   Supports RDTSCP instruction
TSC           *   Supports RDTSC instruction
TSC-DEADLINE   -   Local APIC supports one-shot deadline timer
TSC-INVARIANT   *   TSC runs at constant rate
xTPR         -   Supports disabling task priority messages

EIST         -   Supports Enhanced Intel Speedstep
ACPI         -   Implements MSR for power management
TM           -   Implements thermal monitor circuitry
TM2           -   Implements Thermal Monitor 2 control
APIC         *   Implements software-accessible local APIC
x2APIC       -   Supports x2APIC

CNXT-ID       -   L1 data cache mode adaptive or BIOS

MCE           *   Supports Machine Check, INT18 and CR4.MCE
MCA           *   Implements Machine Check Architecture
PBE           -   Supports use of FERR#/PBE# pin

PSN           -   Implements 96-bit processor serial number

PREFETCHW    *   Supports PREFETCHW instruction

Maximum implemented CPUID leaves: 0000000D (Basic), 8000001F (Extended).

Logical to Physical Processor Map:
**--------------  Physical Processor 0 (Hyperthreaded)
--**------------  Physical Processor 1 (Hyperthreaded)
----**----------  Physical Processor 2 (Hyperthreaded)
------**--------  Physical Processor 3 (Hyperthreaded)
--------**------  Physical Processor 4 (Hyperthreaded)
----------**----  Physical Processor 5 (Hyperthreaded)
------------**--  Physical Processor 6 (Hyperthreaded)
--------------**  Physical Processor 7 (Hyperthreaded)

Logical Processor to Socket Map:
****************  Socket 0

Logical Processor to NUMA Node Map:
****************  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
*---------------  Data Cache          0, Level 1,   32 KB, Assoc   8, LineSize  64
*---------------  Instruction Cache   0, Level 1,   64 KB, Assoc   4, LineSize  64
*---------------  Unified Cache       0, Level 2,  512 KB, Assoc   8, LineSize  64
*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64
-*--------------  Data Cache          1, Level 1,   32 KB, Assoc   8, LineSize  64
-*--------------  Instruction Cache   1, Level 1,   64 KB, Assoc   4, LineSize  64
-*--------------  Unified Cache       2, Level 2,  512 KB, Assoc   8, LineSize  64
-*--------------  Unified Cache       3, Level 3,   16 MB, Assoc  16, LineSize  64
--*-------------  Data Cache          2, Level 1,   32 KB, Assoc   8, LineSize  64
--*-------------  Instruction Cache   2, Level 1,   64 KB, Assoc   4, LineSize  64
--*-------------  Unified Cache       4, Level 2,  512 KB, Assoc   8, LineSize  64
--*-------------  Unified Cache       5, Level 3,   16 MB, Assoc  16, LineSize  64
---*------------  Data Cache          3, Level 1,   32 KB, Assoc   8, LineSize  64
---*------------  Instruction Cache   3, Level 1,   64 KB, Assoc   4, LineSize  64
---*------------  Unified Cache       6, Level 2,  512 KB, Assoc   8, LineSize  64
---*------------  Unified Cache       7, Level 3,   16 MB, Assoc  16, LineSize  64
----*-----------  Data Cache          4, Level 1,   32 KB, Assoc   8, LineSize  64
----*-----------  Instruction Cache   4, Level 1,   64 KB, Assoc   4, LineSize  64
----*-----------  Unified Cache       8, Level 2,  512 KB, Assoc   8, LineSize  64
----*-----------  Unified Cache       9, Level 3,   16 MB, Assoc  16, LineSize  64
-----*----------  Data Cache          5, Level 1,   32 KB, Assoc   8, LineSize  64
-----*----------  Instruction Cache   5, Level 1,   64 KB, Assoc   4, LineSize  64
-----*----------  Unified Cache      10, Level 2,  512 KB, Assoc   8, LineSize  64
-----*----------  Unified Cache      11, Level 3,   16 MB, Assoc  16, LineSize  64
------*---------  Data Cache          6, Level 1,   32 KB, Assoc   8, LineSize  64
------*---------  Instruction Cache   6, Level 1,   64 KB, Assoc   4, LineSize  64
------*---------  Unified Cache      12, Level 2,  512 KB, Assoc   8, LineSize  64
------*---------  Unified Cache      13, Level 3,   16 MB, Assoc  16, LineSize  64
-------*--------  Data Cache          7, Level 1,   32 KB, Assoc   8, LineSize  64
-------*--------  Instruction Cache   7, Level 1,   64 KB, Assoc   4, LineSize  64
-------*--------  Unified Cache      14, Level 2,  512 KB, Assoc   8, LineSize  64
-------*--------  Unified Cache      15, Level 3,   16 MB, Assoc  16, LineSize  64
--------*-------  Data Cache          8, Level 1,   32 KB, Assoc   8, LineSize  64
--------*-------  Instruction Cache   8, Level 1,   64 KB, Assoc   4, LineSize  64
--------*-------  Unified Cache      16, Level 2,  512 KB, Assoc   8, LineSize  64
--------*-------  Unified Cache      17, Level 3,   16 MB, Assoc  16, LineSize  64
---------*------  Data Cache          9, Level 1,   32 KB, Assoc   8, LineSize  64
---------*------  Instruction Cache   9, Level 1,   64 KB, Assoc   4, LineSize  64
---------*------  Unified Cache      18, Level 2,  512 KB, Assoc   8, LineSize  64
---------*------  Unified Cache      19, Level 3,   16 MB, Assoc  16, LineSize  64
----------*-----  Data Cache         10, Level 1,   32 KB, Assoc   8, LineSize  64
----------*-----  Instruction Cache  10, Level 1,   64 KB, Assoc   4, LineSize  64
----------*-----  Unified Cache      20, Level 2,  512 KB, Assoc   8, LineSize  64
----------*-----  Unified Cache      21, Level 3,   16 MB, Assoc  16, LineSize  64
-----------*----  Data Cache         11, Level 1,   32 KB, Assoc   8, LineSize  64
-----------*----  Instruction Cache  11, Level 1,   64 KB, Assoc   4, LineSize  64
-----------*----  Unified Cache      22, Level 2,  512 KB, Assoc   8, LineSize  64
-----------*----  Unified Cache      23, Level 3,   16 MB, Assoc  16, LineSize  64
------------*---  Data Cache         12, Level 1,   32 KB, Assoc   8, LineSize  64
------------*---  Instruction Cache  12, Level 1,   64 KB, Assoc   4, LineSize  64
------------*---  Unified Cache      24, Level 2,  512 KB, Assoc   8, LineSize  64
------------*---  Unified Cache      25, Level 3,   16 MB, Assoc  16, LineSize  64
-------------*--  Data Cache         13, Level 1,   32 KB, Assoc   8, LineSize  64
-------------*--  Instruction Cache  13, Level 1,   64 KB, Assoc   4, LineSize  64
-------------*--  Unified Cache      26, Level 2,  512 KB, Assoc   8, LineSize  64
-------------*--  Unified Cache      27, Level 3,   16 MB, Assoc  16, LineSize  64
--------------*-  Data Cache         14, Level 1,   32 KB, Assoc   8, LineSize  64
--------------*-  Instruction Cache  14, Level 1,   64 KB, Assoc   4, LineSize  64
--------------*-  Unified Cache      28, Level 2,  512 KB, Assoc   8, LineSize  64
--------------*-  Unified Cache      29, Level 3,   16 MB, Assoc  16, LineSize  64
---------------*  Data Cache         15, Level 1,   32 KB, Assoc   8, LineSize  64
---------------*  Instruction Cache  15, Level 1,   64 KB, Assoc   4, LineSize  64
---------------*  Unified Cache      30, Level 2,  512 KB, Assoc   8, LineSize  64
---------------*  Unified Cache      31, Level 3,   16 MB, Assoc  16, LineSize  64

Logical Processor to Group Map:
****************  Group 0
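
The difference between the two cache maps can also be checked mechanically. The following Python sketch parses Coreinfo-style "Logical Processor to Cache Map" rows and reports which logical processors share each cache. The regular expression is based only on the row layout seen in the dumps in this thread, and the helper names are illustrative; other Coreinfo versions may format rows differently.

```python
import re
from collections import defaultdict

# Matches rows such as:
# "********--------  Unified Cache       1, Level 3,    8 MB, Assoc  16, LineSize  64"
ROW = re.compile(r"^([*-]+)\s+(Data|Instruction|Unified) Cache\s+\d+, Level (\d)")


def sharing_by_level(cache_map_lines):
    """Map cache level -> list of sets of logical CPUs sharing one cache."""
    groups = defaultdict(list)
    for line in cache_map_lines:
        match = ROW.match(line.strip())
        if not match:
            continue  # skip headers and blank lines
        mask, _kind, level = match.groups()
        cpus = frozenset(i for i, ch in enumerate(mask) if ch == "*")
        groups[int(level)].append(cpus)
    return dict(groups)


def l3_matches_ccx_layout(cache_map_lines, cpus_per_ccx=8):
    """True if every L3 row is shared by cpus_per_ccx logical CPUs."""
    l3 = sharing_by_level(cache_map_lines).get(3, [])
    return bool(l3) and all(len(cpus) == cpus_per_ccx for cpus in l3)


if __name__ == "__main__":
    per_ccx = [
        "********--------  Unified Cache       1, Level 3,    8 MB, Assoc  16, LineSize  64",
        "--------********  Unified Cache       6, Level 3,    8 MB, Assoc  16, LineSize  64",
    ]
    per_thread = [
        "*---------------  Unified Cache       1, Level 3,   16 MB, Assoc  16, LineSize  64",
    ]
    print(l3_matches_ccx_layout(per_ccx))     # rows where 8 logical CPUs share an L3
    print(l3_matches_ccx_layout(per_thread))  # rows where each logical CPU has its own L3
```

Fed the full maps, the first layout reports two L3 groups of eight logical processors each (one per CCX), while the second reports sixteen single-processor L3 entries of 16 MB, which is the mapping being questioned here.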
 

Mockingbird

Senior member
Feb 12, 2017
733
741
106
Yes, I got mine today as well, and my Coreinfo looks exactly like yours, and it seems to be correct so far (only maybe a NUMA node per CCX could be nice). Seems like they fixed that cache thing with some microcode update?

Did MSFT release a stealth update to Windows 10 or something? :eek:
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
I did some 3D testing, and even though there is not nearly enough data to confirm it, I'd say the SMT regression is in fact a Windows 10-related issue.
In the 3D testing I did recently on Windows 10, the title that showed the biggest SMT regression was Total War: Warhammer.

All of these were recorded at 3.5GHz, 2133MHz MEMCLK with R9 Nano:

Windows 10 - 1080 Ultra DX11:

8C/16T - 49.39fps (Min), 72.36fps (Avg)
8C/8T - 57.16fps (Min), 72.46fps (Avg)

Windows 7 - 1080 Ultra DX11:

8C/16T - 62.33fps (Min), 78.18fps (Avg)
8C/8T - 62.00fps (Min), 73.22fps (Avg)

At the moment this is pure speculation, as there were variables that could not be isolated.
The Windows 10 figures were recorded using PresentMon (OCAT); on Windows 7, however, it was necessary to use Fraps.
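
For readers who want the relative numbers, the following Python sketch computes the SMT on/off delta from the four results above. The figures are copied from this post; the helper names are illustrative, and the caveat about different capture tools on the two operating systems still applies.

```python
# Figures copied from the post above: (min fps, avg fps)
RESULTS = {
    "Windows 10": {"8C/16T": (49.39, 72.36), "8C/8T": (57.16, 72.46)},
    "Windows 7": {"8C/16T": (62.33, 78.18), "8C/8T": (62.00, 73.22)},
}


def smt_delta_percent(smt_on: float, smt_off: float) -> float:
    """Relative change from enabling SMT, in percent (negative = regression)."""
    return (smt_on - smt_off) / smt_off * 100.0


def summarize(results=RESULTS):
    """Return {os name: (min fps delta %, avg fps delta %)}."""
    summary = {}
    for os_name, runs in results.items():
        on_min, on_avg = runs["8C/16T"]
        off_min, off_avg = runs["8C/8T"]
        summary[os_name] = (
            smt_delta_percent(on_min, off_min),
            smt_delta_percent(on_avg, off_avg),
        )
    return summary


if __name__ == "__main__":
    for os_name, (d_min, d_avg) in summarize().items():
        print(f"{os_name}: min {d_min:+.1f}%, avg {d_avg:+.1f}%")
```

On these numbers, enabling SMT costs roughly 13.6% of the minimum frame rate on Windows 10 while the average is essentially unchanged; on Windows 7 the minimum is flat and the average rises by about 6.8%.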
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
But you're saying this isn't officially supported? So there's no chance we'll see motherboards allowing us to run, say, a 1700 at 35W for UCFF-style builds? At least it might bode well for later low-power SKUs, I suppose, especially in Raven Ridge.

At least at the moment that is the case.
However, since the feature can be easily "added", I wouldn't be too surprised if it became officially available at some point.
 

Kromaatikse

Member
Mar 4, 2017
83
169
56
Interestingly, my Kaveri APU on Win7 also has an incorrect cache mapping according to coreinfo. Can anyone with Kaveri, Carrizo, Bristol Ridge or Vishera confirm this on Win10?

Code:
Coreinfo v3.31 - Dump information on system CPU and memory topology
Copyright (C) 2008-2014 Mark Russinovich
Sysinternals - www.sysinternals.com

AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
AMD64 Family 21 Model 48 Stepping 1, AuthenticAMD

<snip>

Maximum implemented CPUID leaves: 0000000D (Basic), 8000001E (Extended).

Logical to Physical Processor Map:
**--  Physical Processor 0 (Hyperthreaded)
--**  Physical Processor 1 (Hyperthreaded)

Logical Processor to Socket Map:
****  Socket 0

Logical Processor to NUMA Node Map:
****  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
*---  Data Cache          0, Level 1,   16 KB, Assoc   4, LineSize  64
*---  Instruction Cache   0, Level 1,   96 KB, Assoc   3, LineSize  64
*---  Unified Cache       0, Level 2,    2 MB, Assoc  16, LineSize  64
-*--  Data Cache          1, Level 1,   16 KB, Assoc   4, LineSize  64
-*--  Instruction Cache   1, Level 1,   96 KB, Assoc   3, LineSize  64
-*--  Unified Cache       1, Level 2,    2 MB, Assoc  16, LineSize  64
--*-  Data Cache          2, Level 1,   16 KB, Assoc   4, LineSize  64
--*-  Instruction Cache   2, Level 1,   96 KB, Assoc   3, LineSize  64
--*-  Unified Cache       2, Level 2,    2 MB, Assoc  16, LineSize  64
---*  Data Cache          3, Level 1,   16 KB, Assoc   4, LineSize  64
---*  Instruction Cache   3, Level 1,   96 KB, Assoc   3, LineSize  64
---*  Unified Cache       3, Level 2,    2 MB, Assoc  16, LineSize  64

Logical Processor to Group Map:
****  Group 0

The correct layout should be something like:

Code:
Logical Processor to Cache Map:
*---  Data Cache          0, Level 1,   16 KB, Assoc   4, LineSize  64
**--  Instruction Cache   0, Level 1,   96 KB, Assoc   3, LineSize  64
**--  Unified Cache       0, Level 2,    2 MB, Assoc  16, LineSize  64
-*--  Data Cache          1, Level 1,   16 KB, Assoc   4, LineSize  64
--*-  Data Cache          2, Level 1,   16 KB, Assoc   4, LineSize  64
--**  Instruction Cache   2, Level 1,   96 KB, Assoc   3, LineSize  64
--**  Unified Cache       2, Level 2,    2 MB, Assoc  16, LineSize  64
---*  Data Cache          3, Level 1,   16 KB, Assoc   4, LineSize  64
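
The corrected layout above follows a simple rule, which this Python sketch spells out: the L1 data cache is private to each core, while the L1 instruction cache and the L2 are shared by the two cores of a module. The function names are illustrative, and the sketch only reproduces the mask column, not the full Coreinfo row.

```python
def mask_row(first_cpu: int, shared_by: int, total_cpus: int) -> str:
    """Coreinfo-style mask: '*' for each logical CPU sharing the cache."""
    if first_cpu < 0 or shared_by < 1 or first_cpu + shared_by > total_cpus:
        raise ValueError("cache span does not fit in the CPU range")
    return "".join(
        "*" if first_cpu <= cpu < first_cpu + shared_by else "-"
        for cpu in range(total_cpus)
    )


def expected_module_masks(total_cpus: int = 4, cores_per_module: int = 2):
    """Yield (cache kind, mask) rows for a module-based layout.

    Assumes L1D is private per core while L1I and L2 are shared per module.
    """
    for cpu in range(total_cpus):
        yield "L1D", mask_row(cpu, 1, total_cpus)
        if cpu % cores_per_module == 0:
            yield "L1I", mask_row(cpu, cores_per_module, total_cpus)
            yield "L2", mask_row(cpu, cores_per_module, total_cpus)


if __name__ == "__main__":
    for kind, mask in expected_module_masks():
        print(mask, kind)
```

For a two-module, four-core part this yields the same eight masks, in the same order, as the corrected cache map above.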
 

imported_jjj

Senior member
Feb 14, 2009
660
430
136
I did some 3D testing, and even though there is not nearly enough data to confirm it, I'd say the SMT regression is in fact a Windows 10-related issue.
In the 3D testing I did recently on Windows 10, the title that showed the biggest SMT regression was Total War: Warhammer.

All of these were recorded at 3.5GHz, 2133MHz MEMCLK with R9 Nano:

Windows 10 - 1080 Ultra DX11:

8C/16T - 49.39fps (Min), 72.36fps (Avg)
8C/8T - 57.16fps (Min), 72.46fps (Avg)

Windows 7 - 1080 Ultra DX11:

8C/16T - 62.33fps (Min), 78.18fps (Avg)
8C/8T - 62.00fps (Min), 73.22fps (Avg)

At the moment this is pure speculation, as there were variables that could not be isolated.
The Windows 10 figures were recorded using PresentMon (OCAT); on Windows 7, however, it was necessary to use Fraps.

I am gonna report your post, for being awesome lol.

EDIT: you are not getting a higher avg FPS with SMT disabled in Win 10.
It might be better to go to a lower resolution; that increases the difference and makes it easier to quantify.
The memory could be a bottleneck at 2133, considering that the data fabric runs at the same clock, and that might pollute the data.
 

PotatoWithEarsOnSide

Senior member
Feb 23, 2017
664
701
106
Code:
Coreinfo v3.31 - Dump information on system CPU and memory topology
Copyright (C) 2008-2014 Mark Russinovich
Sysinternals - www.sysinternals.com

AMD A10-7300 Radeon R6, 10 Compute Cores 4C+6G
AMD64 Family 21 Model 48 Stepping 1, AuthenticAMD
Microcode signature: 06003106

Maximum implemented CPUID leaves: 0000000D (Basic), 8000001E (Extended).

Logical to Physical Processor Map:
*---  Physical Processor 0
-*--  Physical Processor 1
--*-  Physical Processor 2
---*  Physical Processor 3

Logical Processor to Socket Map:
****  Socket 0

Logical Processor to NUMA Node Map:
****  NUMA Node 0

No NUMA nodes.

Logical Processor to Cache Map:
*---  Data Cache          0, Level 1,   16 KB, Assoc   4, LineSize  64
*---  Instruction Cache   0, Level 1,   96 KB, Assoc   3, LineSize  64
*---  Unified Cache       0, Level 2,    2 MB, Assoc  16, LineSize  64
-*--  Data Cache          1, Level 1,   16 KB, Assoc   4, LineSize  64
-*--  Instruction Cache   1, Level 1,   96 KB, Assoc   3, LineSize  64
-*--  Unified Cache       1, Level 2,    2 MB, Assoc  16, LineSize  64
--*-  Data Cache          2, Level 1,   16 KB, Assoc   4, LineSize  64
--*-  Instruction Cache   2, Level 1,   96 KB, Assoc   3, LineSize  64
--*-  Unified Cache       2, Level 2,    2 MB, Assoc  16, LineSize  64
---*  Data Cache          3, Level 1,   16 KB, Assoc   4, LineSize  64
---*  Instruction Cache   3, Level 1,   96 KB, Assoc   3, LineSize  64
---*  Unified Cache       3, Level 2,    2 MB, Assoc  16, LineSize  64

Logical Processor to Group Map:
****  Group 0


This is Kaveri on Win 10
 

inf64

Diamond Member
Mar 11, 2011
3,685
3,957
136
At least at the moment that is the case.
However, since the feature can be easily "added", I wouldn't be too surprised if it became officially available at some point.
Any chance "Someone" might release a 3rd-party app that could enable this functionality? :)
Or does this have to be done at a low level (firmware)?
 

Jan Olšan

Senior member
Jan 12, 2017
273
276
136
Is it possible that the W10 performance regressions could be related to some of those potential performance problem sources that AMD's reviewer guides mention?
Specifically,
1) use of balanced power profile in W10 (only "best performance" profile is supposed to be optimal)
2) HPET enabled in BIOS (supposedly harms performance too, which I hope eventually gets fixed).

Sorry if this was asked already. I also suspect you probably know these things already, sorry for doubting you.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,149
136
I'm confident that most of the issues seen are just initial issues, which occur on each and every platform. There are plenty of potential issues; there is no way to deny that. However, considering that the whole software and firmware stack was basically rewritten in less than four months, my personal opinion is that AMD did extremely well, regardless of all the minor issues. If it was up to me, I would have postponed the launch by 1-2 months. This would have given both the ODMs and AMD time to refine their software and firmware to a point where most of these issues would no longer have existed.

Regardless, I am certain that all of the minor issues will be ironed out within the next month or two. I don't believe there are any actual hardware issues in Zeppelin.
Aside from just fixing the actual issues, I'm confident that the performance will somewhat improve as well ;)

This is my personal point of view on the subject.
So, you believe that in two months' time or so, the 1800X will perform in games relative to the 6900K like it does in most synthetics?
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Is it possible that the W10 performance regressions could be related to some of those potential performance problem sources that AMD's reviewer guides mention?
Specifically,
1) use of balanced power profile in W10 (only "best performance" profile is supposed to be optimal)
2) HPET enabled in BIOS (supposedly harms performance too, which I hope eventually gets fixed).

Sorry if this was asked already. I also suspect you probably know these things already, sorry for doubting you.

No.
I've checked the performance with both the "Balanced" & "High-Performance" profiles and with both HPET and TSC timing. The minor differences are common to both the SMT On & Off conditions.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
So, you believe that in two months' time or so, the 1800X will perform in games relative to the 6900K like it does in most synthetics?

There will be improvements, but it is impossible to say how large or tiny they might be.
There are various, completely isolated regions where the improvements will occur.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
Any chance "Someone" might release a 3rd-party app that could enable this functionality? :)
Or does this have to be done at a low level (firmware)?

Personally I would like to have it implemented in the BIOS, so that end users would have no need to use 3rd-party tools with ring 0 access.
Time will tell ;)
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
I did some 3D testing, and even though there is not nearly enough data to confirm it, I'd say the SMT regression is in fact a Windows 10-related issue.
In the 3D testing I did recently on Windows 10, the title that showed the biggest SMT regression was Total War: Warhammer.

All of these were recorded at 3.5GHz, 2133MHz MEMCLK with R9 Nano:

Windows 10 - 1080 Ultra DX11:

8C/16T - 49.39fps (Min), 72.36fps (Avg)
8C/8T - 57.16fps (Min), 72.46fps (Avg)

Windows 7 - 1080 Ultra DX11:

8C/16T - 62.33fps (Min), 78.18fps (Avg)
8C/8T - 62.00fps (Min), 73.22fps (Avg)

At the moment this is pure speculation, as there were variables that could not be isolated.
The Windows 10 figures were recorded using PresentMon (OCAT); on Windows 7, however, it was necessary to use Fraps.

Can you see if applying all Windows 10 updates makes any difference in performance?

Maybe try the fast track as well?
 