Ryzen: Strictly technical

Status
Not open for further replies.
This seems to be fake, probably pinned manually. If I create 8 threads on my Ryzen, it still distributes them over all logical cores. It only looks like it is trying to avoid SMT core, but the spikes are all over the place.

Edit: This is what it looks like with 8 Prime95 threads for me:

XXjwGF6.png

This is because you are using high performance mode. Don't do that. You need core parking enabled to park those logical cores when they do more harm than good.
 
Why does the BCLK fluctuate when you OC with the multiplier?

Ryzen (and Excavator before it) will "stretch" the clock to even out power demand over short intervals to improve power usage and stability on lower core voltage at higher clocks. The only way you'll see this is by some jitters in the base clock as the multiplier doesn't change - the stretch can be as much as 7%, but only lasts about a millisecond (it basically reacts to instantaneous Vdroop).

AMD now calls this "Pure Power," I think.
 
Some are saying the update is only part of the latest Fast Ring Windows Insider Build. I'm calling BS on this since there's no documentation anywhere about any changes to the scheduler. Just a forum post on some Chinese forum.

KB4015438 is the latest regular Windows update, and it just replaces KB4013429.



Changing the scheduler to work better with Ryzen would be a "key change", no?
For all we know, Microsoft did not change anything except update some drivers. I vaguely remember AMD stating that the Windows Ryzen driver would be updated a month or so after release.
 
Looks normal to me, basically load being spread on 1 physical core per thread. I would say it is painfully obvious even.

You can't and won't get rid of thread migration in Windows without manual affinity setting, so forget it.

Since power management is SMT aware, maybe MS will do something there that helps. At this point I'm kinda done with the lack of knowledge about the details of Ryzen's cache/DF/memory subsystems. I'll wait till the next stepping of Zeppelin shows up and take a look again (hopefully there will be some tweaks to the IMC and a few other features).
 
For all we know, Microsoft did not change anything except update some drivers. I vaguely remember AMD stating that the Windows Ryzen driver would be updated a month or so after release.

Like an ACPI device driver?

Edit: duh.
 
Some are saying the update is only part of the latest Fast Ring Windows Insider Build. I'm calling BS on this since there's no documentation anywhere about any changes to the scheduler. Just a forum post on some Chinese forum.

KB4015438 is the latest regular Windows update, and it just replaces KB4013429.



Changing the scheduler to work better with Ryzen would be a "key change", no?

I have an old laptop with a Celeron dual core and an Intel 4 Series chipset, and this update cut the time to recover from hibernation to less than half. No other enhancement on my old system, but still noticeable...
 
This is because you are using high performance mode. Don't do that. You need core parking enabled to park those logical cores when they do more harm than good.

Why would anyone want to enable core parking? Core Parking is known to cause all kinds of problems, and has almost no benefit at all.
 
Why would anyone want to enable core parking? Core Parking is known to cause all kinds of problems, and has almost no benefit at all.
Core parking works if/when you want to save power; it's mostly useless otherwise, even for Intel processors. The fact that something like Process Lasso does a better job of handling core parking and unparking, not to mention setting affinities or even I/O and memory priority, than Windows itself is a testament to the shellacking Microsoft has given its OS. The Android governors, on the other hand, are much better at handling multiple cores, thanks in part to big.LITTLE. Remember how octa-cores were ridiculed on phones? Yet Windows is the one that can't handle 8 cores properly.
 
Dresdenboy, what is the bandwidth between CCX?
Have to answer again. Somehow you look familiar, BTW. 😉

iBoMbY made the aforementioned list.
DDR4-2666: 41.7 GB/s
DDR4-3000: 46.9 GB/s
DDR4-3200: 50.0 GB/s
DDR4-3400: 53.1 GB/s
DDR4-3600: 56.3 GB/s
DDR4-4000: 62.5 GB/s
It's not exactly what one gets in GB/s by calculating mem clock [GHz] * 32, because of a GB being 1.074 billion bytes, but it gives a good impression.
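As a rough check, the figures in the list can be reproduced from the "mem clock * 32" rule. This is only a sketch: it assumes the fabric moves 32 bytes per memory clock (half the DDR4 rating) and that the listed values come from dividing MB/s by 1024. The function name is made up for illustration.

```python
def ccx_bandwidth_gb_s(ddr_rating_mt_s: float) -> float:
    """Estimate inter-CCX bandwidth from a DDR4 rating in MT/s.

    Assumptions (taken from the rule of thumb in the post, not from AMD
    documentation): the fabric runs at the memory clock, which is half
    the DDR rating, and transfers 32 bytes per clock.
    """
    mem_clock_mhz = ddr_rating_mt_s / 2   # DDR: two transfers per clock
    mb_per_s = mem_clock_mhz * 32         # 32 bytes per memory clock
    return mb_per_s / 1024                # MB/s -> the "GB/s" unit of the list


for rating in (2666, 3000, 3200, 3400, 3600, 4000):
    print(f"DDR4-{rating}: {ccx_bandwidth_gb_s(rating):.2f} GB/s")
```

For DDR4-3200 the plain product is 1.6 GHz * 32 = 51.2, and dividing by 1.024 gives the 50.0 shown in the list.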
 
Zen/Zeppelin seems obviously geared towards absolute power savings. The decision to tightly couple the IMCs and uncore was a move for efficiency of data transfer, and this appears to be corroborated by Zen's considerably higher actual vs. theoretical sustained transfer compared to Intel's products. I think that just about every piece of that silicon was designed with the goal of meeting or exceeding, with few exceptions, the power characteristics Intel is currently capable of, while still on an inferior node. This has created a chip with great potential in deployments scaling from a workstation to massive clusters, and great potential going forward.
Zen is AMD's Core2/Nehalem in a sense.

My assumptions:
The GMI/IMC (and coupled PCIe/BCLK strap) is clearly the major limiting factor of this design for home users. Zen2 will again not "win" in the home/gamer market unless AMD had planned on significantly revamping the chip's clock domains.
Zen2 certainly won't be on 14LPU (at first), much too late in the design phase. No easy 10% clock improvement there.
2x inter-CCX > 1x IMC speeds may be a pipe dream for Zen F4 stepping, but man would that solve a heck of a lot of latency issues (while skyrocketing uncore power draw)

The Stilt, have you gotten your hands on any other 1800x samples to compare Vmin/Fmax? Would be interesting to see possible ranges of binning for a single SKU.
 
What about Raven Ridge CPUs (APUs)? They will have only one CCX (+ up to 11 GPU CUs). So there will be no "issues" with CCX communication.

So those SKUs with fewer GPU CUs might be a better choice for gaming?
 
Why would anyone want to enable core parking? Core Parking is known to cause all kinds of problems, and has almost no benefit at all.

Core parking is absolutely vital for Windows scheduling with SMT. Windows' scheduler is likely completely unaware of SMT or much of anything else. Everything is done with external processes/daemons simply creating and sorting lists which the scheduler uses.

Core parking will preferentially park logical cores - giving the scheduler the appearance of being SMT aware.
Without it, you'll never see dual core turbo clocks or XFR on Ryzen and you'll see logical cores being used as if they're real cores.

This is why some people are seeing a sudden improvement in a few cases with the new Windows update - core parking was re-enabled and they didn't notice.
 
Windows scheduler seems perfectly aware of Ryzen's SMT.

It's applications doing their own scheduling that aren't.

Applications can't do their own scheduling (with the exception of those rare kernel-on-kernel designs); they spawn threads and apportion their workload amongst those threads. Some apps are simply spawning too many worker threads and seeing negative scaling as a result, because they assume AMD has no SMT and treat it like it has 16 cores.

Some apps may use thread affinity to prevent threads from roaming around, but most don't bother.
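
To illustrate the two points above, here is a minimal sketch of how an app might size its worker pool to physical cores and build an affinity mask that uses one logical CPU per core. It assumes 2-way SMT and that SMT siblings are numbered adjacently (0/1 share core 0, 2/3 share core 1, and so on), which is an assumption about enumeration and not something confirmed in this thread. The function names are hypothetical.

```python
import os


def physical_core_count(logical_cpus: int, smt_ways: int = 2) -> int:
    """Estimate physical cores from logical CPUs, assuming uniform SMT."""
    return max(1, logical_cpus // smt_ways)


def one_thread_per_core_mask(logical_cpus: int, smt_ways: int = 2) -> int:
    """Bitmask selecting the first logical CPU of every physical core.

    Assumes SMT siblings are numbered adjacently (an assumption).
    """
    mask = 0
    for cpu in range(0, logical_cpus, smt_ways):
        mask |= 1 << cpu
    return mask


logical = os.cpu_count() or 1
workers = physical_core_count(logical)
print(f"{logical} logical CPUs -> {workers} worker threads")
print(f"8C/16T mask: {one_thread_per_core_mask(16):#06x}")
```

A mask like this could then be handed to an OS affinity call (on Windows, for example, `SetProcessAffinityMask`); that step is left out here.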
 
Applications can't do their own scheduling (with the exception of those rare kernel-on-kernel designs); they spawn threads and apportion their workload amongst those threads. Some apps are simply spawning too many worker threads and seeing negative scaling as a result, because they assume AMD has no SMT and treat it like it has 16 cores.

Some apps may use thread affinity to prevent threads from roaming around, but most don't bother.

I wonder how many games use User Mode Scheduling.

https://msdn.microsoft.com/en-us/library/windows/desktop/dd627187(v=vs.85).aspx
 
I always forget about UMS. Not sure how commonly it would be used by games - considering how poorly most scale with core count, I'd imagine not that often (since the investment in implementing its use would not be worthwhile unless you are using a task-based processing model).

And it seems to be 64-bit only. I never really bothered to check but how many games actually are 64-bit only nowadays? Ok, probably most but it is a rather new development. Are the game engines pure 64-bit? Don't know.
 
Wouldn't it be fun if AMD had their ARM version of Zen on existing sockets? In both consumer and server.
Would be so much easier to sell ARM, especially in consumer and maybe they could even get Microsoft on board.
TSMC N7 HPC should be ready for tapeouts in June 2017 so a nice 8-12 ARM cores for AM4 next year could be quite interesting.

They wouldn't sell much in consumer but it's important to have ARM as a desktop platform, otherwise ARM in server remains an uphill battle.
Using the existing infrastructure (sockets) would reduce costs for both AMD and customers.
Ofc it would be a substantial financial effort, unless they already need the ARM core on TSMC's N7 HPC for semi-custom.
 