Ryzen: Strictly technical

Status
Not open for further replies.

dnavas

Senior member
Feb 25, 2017
355
190
116
Lower SKU (probably 12/24): 140W TDP. Higher SKU (probably 16/32): 180W TDP.
[...]
ES's are 3.3 or 3.4 GHz base and 3.7 GHz boost.
The retail SKU is aimed at 3.6 GHz base / 4 GHz boost.
[...]
Announcement: COMPUTEX in Taiwan; sales will start 2-3 weeks after COMPUTEX.

Wow

A 3.6GHz 16-core would be ... phenomenal [I wasn't expecting over 3.3], and getting it inside of 180W would mean they've improved efficiency a bit (assuming they're using a definition of TDP that carries over to the R7s). How much loss should we expect out of an MCM?
 

OrangeKhrush

Senior member
Feb 11, 2017
220
343
96
Will someone buy my R7 1700X?

What does your crystal ball say re B2 revisions? It seems pretty clairvoyant thus far.

These are not my predictions; I don't do those, my name is not Juanrga.

From what I know, B2 is more tuned and stable.

I would buy it but for the import duty and such, but given the market and Ryzen's day-to-day improvement, many people will want it.
 

OrangeKhrush

Senior member
Feb 11, 2017
220
343
96
A 3.6GHz 16-core would be ... phenomenal [I wasn't expecting over 3.3], and getting it inside of 180W would mean they've improved efficiency a bit (assuming they're using a definition of TDP that carries over to the R7s). How much loss should we expect out of an MCM?

There are many here more knowledgeable on that topic who could answer better. I am just a fly on the wall who loves what AMD is up to.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Why does the last core of one CCX take so much longer to talk to the 2nd last core of the opposite CCX? Is that repeatable?

It's the third core in the CCX that nearly always takes a bit longer for whatever reason. It's fully repeatable - even this is an average of three runs... the peak values (minimum latency) chart I made looks the same, LOL!

Is that at 3GHz with DRAM at 2133?

DDR4-2667, 3.9GHz w/ 3.9424GHz TSC (I'm sneaky like that - lets me count individual cycles faster than the core can clock). I will be doing fixed 3GHz runs with DDR4-2133 for this test soon - I wanted to graph these out and see how much consistency I was really getting and if the results actually made sense. It seems they are consistent and do make sense.

So if the DF Latency is about 25ns max., that still doesn't explain the memory latency. Even if we add the L3 Latency + 2x DF Latency + Memory Latency we still would have lower values than the actual readings.

Maybe it partly does - the test results here are already approaching 100ns - and I do remove some of the computational overhead (rdtsc takes time).
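
For anyone following along, converting raw TSC counts to nanoseconds and removing timer overhead could look roughly like the sketch below. This is not the actual benchmark code; the helper names are hypothetical, and the overhead value passed in is an assumption you would have to measure yourself (for example by timing back-to-back rdtsc reads).

```python
# Sketch only: helper names are hypothetical, not taken from the benchmark.
TSC_GHZ = 3.9424  # TSC rate quoted in the post (ticks per nanosecond)


def cycles_to_ns(cycles, tsc_ghz=TSC_GHZ):
    """Convert a TSC tick count to nanoseconds (ticks / ticks-per-ns)."""
    return cycles / tsc_ghz


def corrected_latency_ns(raw_cycles, overhead_cycles, tsc_ghz=TSC_GHZ):
    """Subtract the measured timer overhead, then convert to nanoseconds.

    overhead_cycles is an assumed, separately measured value; the result
    is clamped at zero so noise cannot produce a negative latency.
    """
    return cycles_to_ns(max(raw_cycles - overhead_cycles, 0), tsc_ghz)
```

With a 3.9424GHz TSC, one tick is about 0.25ns, so a reading near 100ns corresponds to roughly 394 ticks after overhead is removed.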

The question is how do Intel CPUs ever show things like 19ns? That's technically impossible - you simply can't get the data from memory to a core that fast when it takes >30ns to get data to a neighboring core.
 

imported_jjj

Senior member
Feb 14, 2009
660
430
136
It seems they are consistent and do make sense.

Not sure about that. How do you explain 2 to 3 vs 3 to 2, for example? 30+ vs 50+. To stay with the first CCX, 1 to 0 and 0 to 1 is even worse. There's a similar oddity in CCX2 for the 2 pairs (4-5 and 6-7).
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Not sure about that. How do you explain 2 to 3 vs 3 to 2, for example? 30+ vs 50+. To stay with the first CCX, 1 to 0 and 0 to 1 is even worse. There's a similar oddity in CCX2 for the 2 pairs (4-5 and 6-7).

Those particular features aren't always present; they just happened to be here in the average of three runs. I am trying to automate about 20 runs so I can do outlier detection and removal. The three benchmark runs took less than a minute, but the output isn't in an easily digestible format, so I am manually entering the data into a spreadsheet for now. Supporting CSV output is a rather simple matter now that I know the layout I need for generating graphs like this one.

There's a reason this is called alpha software :p

Of course, more precise data may still show those features - which will require an explanation at that point.
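
As a rough illustration of the outlier removal and CSV output mentioned above, something like the following would do. The median/MAD rule, the threshold `k`, the function names and the column layout are all assumptions for the sketch, not what the benchmark actually uses.

```python
import csv
import io
import statistics


def drop_outliers(samples, k=3.0):
    """Keep samples within k median-absolute-deviations of the median.

    Median/MAD is an assumed choice here; it is simply less sensitive to
    a single bad run than mean/standard deviation.
    """
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    if mad == 0:
        return list(samples)  # all runs (nearly) identical, nothing to drop
    return [x for x in samples if abs(x - med) <= k * mad]


def to_csv(runs):
    """Write per-core-pair results as CSV text.

    runs maps (src_core, dst_core) -> list of latencies in ns, one per run.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["src_core", "dst_core", "mean_ns", "kept", "dropped"])
    for (src, dst), samples in sorted(runs.items()):
        kept = drop_outliers(samples)
        mean_ns = statistics.mean(kept)
        writer.writerow([src, dst, f"{mean_ns:.1f}", len(kept), len(samples) - len(kept)])
    return buf.getvalue()
```

Calling `to_csv({(2, 3): [40, 41, 39, 40, 42, 95]})` would drop the 95ns run and report the mean of the remaining five, which can then be pasted straight into a spreadsheet for graphing.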
 

KompuKare

Golden Member
Jul 28, 2009
1,012
923
136
It's the third core in the CCX that nearly always takes a bit longer for whatever reason. It's fully repeatable - even this is an average of three runs... the peak values (minimum latency) chart I made looks the same, LOL!
Of course there has to be a logical explanation.
Would be interested if it's just the 3rd core on your individual CPU, or if it's something in all of them?
Maybe a defect in the current mask which forced them to run that connection slower or something? Wonder what it means for the hex cores? Because this makes it look like just disabling one random core per CCX to get 3+3 might deliver different performance than disabling the 3rd one in each CCX.

Does your motherboard allow you to disable exactly that 3rd one? If it does, that would be something to test at some stage.
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
Do they? Lowest I remember is more like 30ns with DDR4-4266C12, i.e. APEX+B-die on suicide run.

My 2600k reliably showed 19ns with DDR3-2133 CL9 memory in most memory latency tests.

Of course there has to be a logical explanation.
Would be interested if it's just the 3rd core on your individual CPU, or if it's something in all of them?
Maybe a defect in the current mask which forced them to run that connection slower or something? Wonder what it means for the hex cores? Because this makes it look like just disabling one random core per CCX to get 3+3 might deliver different performance than disabling the 3rd one in each CCX.

Does your motherboard allow you to disable exactly that 3rd one? If it does, that would be something to test at some stage.

I should be doing 1W1R tests later today as well as making the output more easily digestible. I will do this at a fixed frequency of 3GHz and with the slowest and fastest DDR4 speeds I can manage (DDR4-1600 and DDR4-2933). That should be accurate enough to see if there's a neighboring-core advantage and whether or not that third core in each CCX is really showing a relative penalty.
 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
I thought Threadripper was the specific technology. What donkey-pit would decide to use it for a product?
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
I thought Threadripper was the specific technology. What donkey-pit would decide to use it for a product?

I think it's just a name like Orochi - for the specific product footprint. I doubt it will be used in official marketing.
 

DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
Excuse my ignorance but which solution is better?

Might want to read this:

http://sinhardware.com/index.php/vrm-articles/82-vrm-guide

From the guide:

Doubler: We will talk about the doubler a little as they are commonly used today in the phase wars. They basically are like a multiplexer which takes a single PWM signal and divides that signal into two, however it also reduces the maximum switching frequency by half. There are also some quadruplers; these are interesting as I have only seen one which is in production (IR3599), and it takes a single input and basically divides it into 4 by dividing into 2 twice, and you get an output which is ¼ the switching frequency of the input. Nowadays we see single doubler chips, or we see them integrated into the driver, however in the past sometimes motherboard makers used analog switches, which would take in 4 PWM signals and output 8. The only place you see this done now is with some X58/Z77/Z68 MSI motherboards.

and

Switching frequency: This is the frequency at which the VRM switches or pulses. Increased switching frequency means that the current will move more quickly throughout the VRM, however while increased switching frequency helps with transient response and helps to decrease ripple, it also reduces efficiency and increases temperature. With a low duty cycle the switching frequency is also limited. The switching frequency determines how often the phases switch or pulse. So if the switching frequency is 300 kHz then the phases pulse 300,000 times per second, or once every 3.33 microseconds (usec) or 3.33e-6 seconds. Switching frequency also needs to be constant along phases, so if you double the phase count you should also cut the frequency in half for each of the doubled phases; if you don’t do this, then the effective switching frequency of the entire VRM can double and lead to issues unless already accounted for.

My takeaway from the discussions I've seen of VRM layouts with doublers is that they can reduce switching frequency, which is bad when it comes to stability when overclocking. Typically higher overclocks result in conditions where the CPU has far less tolerance for ripple/voltage fluctuation. It was a big deal on FM2+ where Kaveri was ultra-sensitive to that sort of thing.
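
To put numbers on the quoted passages, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the model is the simple one the guide describes (each doubler stage halves the per-phase frequency); real controllers may behave differently.

```python
def pulse_period_us(freq_khz):
    """Time between pulses in microseconds for a given switching frequency.

    1 kHz is one pulse per 1000 us, so period_us = 1000 / freq_khz.
    """
    return 1000.0 / freq_khz


def per_phase_freq_khz(pwm_freq_khz, doubler_stages=1):
    """Per-phase switching frequency behind doublers.

    Assumes each doubler stage splits one PWM signal in two and halves
    the frequency seen by each output phase (a quadrupler = 2 stages).
    """
    return pwm_freq_khz / (2 ** doubler_stages)
```

For the guide's 300 kHz example, `pulse_period_us(300)` gives about 3.33 usec, and `per_phase_freq_khz(300, 1)` gives 150 kHz per doubled phase (75 kHz behind a quadrupler).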

The X370-Pro would appear (on the surface) to have better switching frequency than the Crosshair VI Hero. How it stacks up against boards like the Taichi that are 12-phase (6 + doubler) is unknown to me.

Also I'm not sold on these boards with external clockgen chips. The one on my Taichi is not providing any kind of stability at advanced bclk settings. It may be that I am simply not supplying the right voltages.
 

OrangeKhrush

Senior member
Feb 11, 2017
220
343
96
[Attached image: syyDOFFK.jpg]
 

OrangeKhrush

Senior member
Feb 11, 2017
220
343
96
I would like to point out that the information on the X399 was not from my source; it was an internet leak. When I put it to my source, the 64 PCI-e lanes figure was disputed, and the 2500 score on Cinebench is also way under.

I just did some rough math on what it could be:

1601 * 16/8 * 3.5/3.7 ≈ 3029, which can vary. That is based on a guess that the all-core clock will be 3.5GHz and on the 1800X scoring 1601 at 3.7GHz. That score is insane.
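
The same back-of-the-envelope estimate as a small sketch, in case anyone wants to plug in other clocks. It assumes perfectly linear scaling with core count and frequency, which is an optimistic assumption (it ignores MCM overhead and memory limits), and the function name is made up.

```python
def scaled_score(base_score, base_cores, base_ghz, cores, ghz):
    """Estimate a multi-threaded score by linear core and clock scaling.

    Assumption: the score scales 1:1 with both core count and frequency,
    which real chips will not quite reach.
    """
    return base_score * (cores / base_cores) * (ghz / base_ghz)


# 1800X reference from the post: 1601 points, 8 cores, 3.7GHz all-core.
estimate = scaled_score(1601, base_cores=8, base_ghz=3.7, cores=16, ghz=3.5)
```

Swapping in a lower all-core guess such as 3.3GHz only changes the last argument, so it is easy to see how sensitive the number is to the clock assumption.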
 

blublub

Member
Jul 19, 2016
135
61
101
I would like to point out that the information on the X399 was not from my source; it was an internet leak. When I put it to my source, the 64 PCI-e lanes figure was disputed, and the 2500 score on Cinebench is also way under.

I just did some rough math on what it could be:

1601 * 16/8 * 3.5/3.7 ≈ 3029, which can vary. That is based on a guess that the all-core clock will be 3.5GHz and on the 1800X scoring 1601 at 3.7GHz. That score is insane.
I would have bought such a chip in an instant and now I am stuck with AM4 and the big boy gets a new socket.... Argh!

Sent from my SM-G930F using Tapatalk
 