So, basically it still looks SMT aware to me, just that disabling core parking significantly increases the number of jumps between cores.
Full core parking, using my tweaks:
https://www.dropbox.com/s/hle0l0reegxoumr/FullParking.png
Core parking a la Win7, with all physical cores unparked:
https://www.dropbox.com/s/v9q6w9bfeknu89v/CoreOverride.png
Core parking disabled:
https://www.dropbox.com/s/gmp2x7dd45ug3e3/NoParking.png
This is on Steamroller - Windows treats CMT the same way as SMT/HT.
This is what I was going over in my post above yours.
We get about 45ns or so within cores. We get 98ns or so within RAM. 45 ns (from within the CCX) + 98 ns (from the memory latency) = the 143 ns, which is about what PC Perspective is getting.
My point is that *core parking* is SMT aware, but the core-parking algorithm is *not* part of the scheduler.
So, basically it still looks SMT aware to me, just that disabling core parking significantly increases the number of jumps between cores.
I don't think the 142ns figure is from a single cross-CCX access. I think it might be a more complex operation, such as a semaphore, which requires multiple accesses. That's why we need the code performing this test, so we know what it's actually measuring.
If Ryzen had a latency of 142ns every time it tried to read from memory, that would mean it's checking L3 cache then memory first each time and it's truly 98-42ns, since you'd be factoring in the seeking from L3 cache before it actually. Which I don't believe it's that fast, I believe they're attempted simultaneously...
Agreed.
I don't think the 142ns figure is from a single cross-CCX access. I think it might be a more complex operation, such as a semaphore, which requires multiple accesses. That's why we need the code performing this test, so we know what it's actually measuring.
Fair enough.
My point is that *core parking* is SMT aware, but the core-parking algorithm is *not* part of the scheduler.
There is no 1400X
I will wait for some time to be certain. I would like the 1400X instead of a possible 1400. Although if I can believe the news, the next one lower in line will be the 1300.
When I buy, I want the hardware to be stable with a matured BIOS.
I will go for a Gigabyte motherboard again. I have had good experiences with Gigabyte, and it seems Gigabyte has been a good choice for a stable board since the release of Ryzen.
I would also like a different custom cooler, and I still have to do thorough research about what memory is best.
I will also be including an NVMe drive, and that is going to be the most expensive part of the whole new system: a Samsung 960 Pro 512GB.
I will be upgrading from an A10-6700 CPU, so it will be a big win in the end.
I can reuse the RX 480 card that I have, and use the A10-6700 as a home theater system or backup PC.
There is, as the 1400 has 50MHz XFR while the 1500X has 200MHz, so a 450MHz gap.
There is no 1400X
Ryzen 5 1400 : 4/8 @ 3.2~3.4: $169
Ryzen 5 1500X : 4/8 @ 3.5~3.7: $189
There's no gap to place a 1400X..
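For anyone wondering where the 450MHz figure comes from, here is a quick sketch of the arithmetic. The boost clocks are the ones listed above; the XFR headroom values (50MHz and 200MHz) are the ones claimed in this thread, not something I have verified, and the names in the snippet are just for illustration.

```python
# (base MHz, boost MHz, XFR headroom MHz)
# XFR headroom values are the ones claimed in this thread (assumption).
CHIPS = {
    "Ryzen 5 1400": (3200, 3400, 50),
    "Ryzen 5 1500X": (3500, 3700, 200),
}

def max_clock_mhz(name):
    """Boost clock plus XFR headroom for the named chip."""
    _base, boost, xfr = CHIPS[name]
    return boost + xfr

gap = max_clock_mhz("Ryzen 5 1500X") - max_clock_mhz("Ryzen 5 1400")
print(max_clock_mhz("Ryzen 5 1400"), max_clock_mhz("Ryzen 5 1500X"), gap)
```

So the gap is measured between the top XFR clocks rather than between the listed boost clocks, which are only 300MHz apart.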
Yeah, I found that out as well while reading the AnandTech site.
There is no 1400X
Ryzen 5 1400 : 4/8 @ 3.2~3.4: $169
Ryzen 5 1500X : 4/8 @ 3.5~3.7: $189
There's no gap to place a 1400X..
I love the logic, but I think the memory latency figures for Ryzen are (at least partly) inclusive of the CCX and DF latencies. Otherwise increasing core frequency shouldn't reduce it much (if at all) - but I saw a 10ns drop going from a core clock of 3GHz to 3.8GHz and 9GB/s more bandwidth - without touching memory settings.
I might have an answer for you.
Memory latency is 98 ns on Ryzen.
https://www.techpowerup.com/231268/amds-ryzen-cache-analyzed-improvements-improveable-ccx-compromises
Let's think about this.
We get about 45ns or so within cores. We get 98ns or so within RAM. 45 ns (from within the CCX) + 98 ns (from the memory latency) = the 143 ns, which is about what PC Perspective is getting.
So it looks like it is going to DRAM. An L4 cache might be helpful here.
For a comparison, here's Skylake: https://techreport.com/review/31179/intel-core-i7-7700k-kaby-lake-cpu-reviewed/4
About 45ns, which is less than half of AMD's 98 ns.
If AMD could get memory latency down to Skylake levels, that would be awesome, because that would be about the same speed as the 5960X in terms of CCX + memory latency.
We expect a penalty in quad channel latency, but AMD's memory controller is slow - slower than even Bulldozer it seems.
I think it all leads back to how AIDA64 benchmarks the cache.
I love the logic, but I think the memory latency figures for Ryzen are (at least partly) inclusive of the CCX and DF latencies. Otherwise increasing core frequency shouldn't reduce it much (if at all) - but I saw a 10ns drop going from a core clock of 3GHz to 3.8GHz and 9GB/s more bandwidth - without touching memory settings.
How would a program work to isolate just the IMC to RAM latency? This is something I've never explored (well, creating benchmarks at all, actually, is something I've never had cause to do outside of critical program code).
Just benchmarking memcpy() performance and the average time for accesses to return (using the TSC for timing) is all that comes to mind. Calculate how long it takes to access some memory address you haven't dirtied from a single core.
How Intel systems can show latencies of 19ns is beyond me. There's some black magic going on there... that's lower than the time it takes to get data into a core from memory.
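One common way to get at raw load latency is a pointer chase: every access depends on the result of the previous one, and the order is a random single cycle so the prefetcher can't guess ahead. This is a swapped-in technique, not necessarily what AIDA64 does. Below is a minimal sketch of just the access pattern, using Sattolo's algorithm to build the cycle. All names are hypothetical. In Python the interpreter overhead swamps the real memory latency, so the timing number is only a placeholder; an actual test would be a C port timed with the TSC, as suggested above.

```python
import random
import time

def build_chain(n, seed=0):
    """Return a list where chain[i] is the next slot to visit.

    Sattolo's algorithm guarantees one cycle through all n slots,
    so the chase never gets stuck in a short loop.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        chain[i], chain[j] = chain[j], chain[i]
    return chain

def chase(chain, steps):
    """Follow the chain; each index depends on the previous load."""
    idx = 0
    for _ in range(steps):
        idx = chain[idx]
    return idx

def ns_per_access(chain, steps):
    """Average wall-clock time per dependent access (interpreter-bound here)."""
    start = time.perf_counter_ns()
    chase(chain, steps)
    return (time.perf_counter_ns() - start) / steps

chain = build_chain(1 << 16)
print(ns_per_access(chain, len(chain)))
```

Growing the chain past each cache size is what would make the L1/L2/L3/RAM steps show up in a native version.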
The accurate way to test it would be holding a spinlock in one thread to protect a structure that simply stores the TSC value, setting that TSC value and releasing the spinlock from another thread. This release would need to happen at an interval MUCH greater than the largest possible latency, so something like 1ms should be fine. Real-time thread priorities must be used.
I don't think the 142ns figure is from a single cross-CCX access. I think it might be a more complex operation, such as a semaphore, which requires multiple accesses. That's why we need the code performing this test, so we know what it's actually measuring.
XFR was AMD's biggest mistake.
Yeah, I found that out as well while reading the AnandTech site.
My previous information came from WCCFtech.
What does the X stand for?
The 1400 seems to have XFR as well.
It just seems artificially limited.
AMD K15.6, K15.7, K16.6, K17, K17.1 PM2 fan sensor support
Why do folks complain about this? It's pretty much the default way to go about it, as it scales with memory bandwidth needs. Running it faster to reduce latency can be an upside but is less efficient.
All the problem with Ryzen is that "coherent" data fabric. It runs at half the RAM speed.
I really think they should have made a 1400X with the 1500X clocks (4/8 @ 3.5~3.7: $189).
There is no 1400X
Ryzen 5 1400 : 4/8 @ 3.2~3.4: $169
Ryzen 5 1500X : 4/8 @ 3.5~3.7: $189
There's no gap to place a 1400X..
AIDA64 wasn't giving correct numbers for L2 and L3, I know, but they said that L1 and RAM were OK.
I suggest you read that particular page of the Hardware.fr review. They had AIDA64 engineers design a benchmark to test sequential L3 data access for different block sizes for a more detailed analysis. This is not incorporated in the software release of AIDA64, even in the beta that officially enabled support for Ryzen. Performance is comparable to a 6900K, even a bit faster, up to the ~6MB mark.
True enough. Right now we are working with fixed memory multipliers and locked timings. No idea what Ryzen's memory controller is capable of once fully unlocked.
Skylake and Kaby were using 3866MHz CL18 in those TR tests. Ryzen would get to about 60ns with such RAM settings (may be achievable after the May update), and right now it gets as low as 70ns with 3200 CL14, but there is no access to secondary timings yet. After all is said and done and the BIOS settles, 60+ns might be doable with 3200MHz DRAM, close enough to Broadwell-E.
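To put the two kit settings mentioned here on the same scale, CAS latency can be converted from cycles to nanoseconds. This is only the CAS component (DDR does two transfers per clock, so one clock lasts 2000 / data rate in ns), not the full 60-70ns round trip, which also includes the controller, fabric and secondary timings. The function name is made up for this sketch.

```python
def cas_latency_ns(data_rate_mts, cl):
    """CAS latency in ns for a DDR kit.

    data_rate_mts: transfers per second in MT/s (e.g. 3200 for DDR4-3200)
    cl: CAS latency in memory clock cycles
    """
    # DDR moves two transfers per clock, so one clock lasts 2000 / MT/s ns.
    return cl * 2000.0 / data_rate_mts

for rate, cl in [(3200, 14), (3866, 18)]:
    print(f"DDR4-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

3200 CL14 works out to 8.75 ns and 3866 CL18 to about 9.31 ns, so on CAS alone the two setups are close; most of the difference in the measured numbers would have to come from elsewhere.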
Not sure either.
I love the logic, but I think the memory latency figures for Ryzen are (at least partly) inclusive of the CCX and DF latencies. Otherwise increasing core frequency shouldn't reduce it much (if at all) - but I saw a 10ns drop going from a core clock of 3GHz to 3.8GHz and 9GB/s more bandwidth - without touching memory settings.
How would a program work to isolate just the IMC to RAM latency? This is something I've never explored (well, creating benchmarks at all, actually, is something I've never had cause to do outside of critical program code).
Just benchmarking memcpy() performance and the average time for accesses to return (using the TSC for timing) is all that comes to mind. Calculate how long it takes to access some memory address you haven't dirtied from a single core.
How Intel systems can show latencies of 19ns is beyond me. There's some black magic going on there... that's lower than the time it takes to get data into a core from memory.
When a request or a ping is made to L3 in a different CCX, both L3 cache in that second CCX and the unified IMC are pinged/strobed/checked at the same time.
No, this can't be it, because it'd be 98ns to write to memory from the pinging core.
But when it comes to reading.. so say it instantly starts to read, well wouldn't it try to read from memory simultaneously to looking in its L3 cache? It should catch it a moment after it was just written.
If Ryzen had a latency of 142ns every time it tried to read from memory, that would mean it's checking L3 cache then memory first each time and it's truly 98-42ns, since you'd be factoring in the seeking from L3 cache before it actually. Which I don't believe it's that fast, I believe they're attempted simultaneously...
I'm explaining this very poorly..
Basically if the MMU checks L3 before even trying to read memory, that'd mean that the real memory latency is 98-42ns.
So the write speed should have been 56ns. And the read speed on the other core should have been 98ns instead of 142ns. You're missing 44ns.
None of that can be true as far as I see, but I think I'm still wording this poorly.
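The two cases being argued about here can be written down as a toy model. It is only a sketch: the 98ns and 142ns inputs are the figures quoted in this thread, the function names are hypothetical, and real hardware is certainly more complicated than a sum or a max.

```python
def serial_latency(l3_probe_ns, dram_ns):
    """L3 is probed first; DRAM is only queried after the miss."""
    return l3_probe_ns + dram_ns

def parallel_latency(l3_probe_ns, dram_ns):
    """L3 probe and DRAM request start together; the slower path decides."""
    return max(l3_probe_ns, dram_ns)

RAM_NS = 98         # memory latency figure quoted in this thread
CROSS_CCX_NS = 142  # cross-CCX figure quoted in this thread
probe_ns = CROSS_CCX_NS - RAM_NS  # the "missing" 44 ns

print("serial:  ", serial_latency(probe_ns, RAM_NS), "ns")
print("parallel:", parallel_latency(probe_ns, RAM_NS), "ns")
```

With these inputs only the serial model lands on 142ns; if the lookups really were issued at the same time, the model would give about 98ns. That is why it matters what the test behind the 142ns figure actually does.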
https://forums.anandtech.com/threads/ryzen-does-support-ecc-but-mainboards-need-to-enable-it-and.2500613/
Has anyone tried ECC on Ryzen? ASRock's boards should support X370. No idea about the other vendors.
XFR was AMD's biggest mistake.
They should have just marked them as 3.45GHz and 3.9GHz "precision boost". Instead they make them look lower clocked and spread this lie that they overclock as well as your cooling allows.
"X" just stands for "it's better xD"