New Zen microarchitecture details


KTE

Senior member
May 26, 2016
478
130
76
What is questionable is your ability to take the numbers for what they are once they are at odds with your IPC "estimations". They could have used Cinebench 11.5 and it would have displayed the same figures; it's just that it would have given more info to their competitor. Why should they do so when the latter didn't even disclose their SKL uarch before the last HC, that is, one year after it was released? So much for transparency...
Whenever AMD has withheld such [CPU] data within the past decade, it's been a complete disaster on release, and FAR opposite to the forum hype from the year before.

I can understand the "if they leak" bit, but the same excuse kept being given for Barcelona and Bulldozer, and look what happened.

The default position is to doubt until full, complete evidence -- not the other way round. And it should be encouraged as part of open thinking.

Sent from HTC 10
(Opinions are own)
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,687
1,222
136
How do you know that? The Stilt found that AVX/AVX2 support made no difference in Blender performance on Intel processors.
I went looking and it isn't an issue with AVX2. For the Blender generic build, the 2.77 release, etc., __m256-style intrinsics aren't used; the intrinsics are __m128 instead. If the switch were made we could see up to 30% improvement on the Broadwell-E processor, which is technically two full refreshes or one new architecture ahead of AMD.

imho, they should get rid of FMA code for AVX2.
VIA Isaiah 2 doesn't support it. Intel & AMD largely execute it slower than Mul+Add.
 

DrMrLordX

Lifer
Apr 27, 2000
21,710
10,986
136
I went looking and it isn't an issue with AVX2. For the Blender generic build, the 2.77 release, etc., __m256-style intrinsics aren't used; the intrinsics are __m128 instead. If the switch were made we could see up to 30% improvement on the Broadwell-E processor, which is technically two full refreshes or one new architecture ahead of AMD.

Hmm. Interesting. Have you been able to test this yourself? I would if I had any Intel chips around, but I don't. Nor do I have Summit Ridge (heh).

imho, they should get rid of FMA code for AVX2.
VIA Isaiah 2 doesn't support it. Intel & AMD largely execute it slower than Mul+Add.

I'm sure there's some reason to support FMA in AVX2. Probably for those corner cases where it can be faster than Mul+Add. VIA uarchs supporting one thing or another will not factor into many code decisions (sorry VIA fans, it's true).
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Hmm. Interesting. Have you been able to test this yourself? I would if I had any Intel chips around, but I don't. Nor do I have Summit Ridge (heh).


No need to test it: the operands (not instructions) are __m128 in the source code, so 128-bit-wide vector code will get generated, either SSE2 with some SSE4 (dot products) or AVX (with some 128-bit FMA3 code). Not sure why they are not using wider vectors, but it could be due to inherent limitations in their data structures.

Still, the point stands: it is a perfect test case to extract the maximum from AMD's 4 mixed FPU pipes, while not stressing data paths beyond 128 bits and leaving half of Intel's vector hw idle. Still, I am very impressed that they are able to run code like that at Broadwell speeds at the same clock. Intel used to have an insane advantage in such code, and I know from my work on the vectorization efforts for Skyrim's SkyBoost mod that Sandy Bridge was able to chew through code, shrugging off penalties for unaligned loads and turning in great results.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,418
8,370
126
What is questionable is your ability to take the numbers for what they are once they are at odds with your IPC "estimations". They could have used Cinebench 11.5 and it would have displayed the same figures; it's just that it would have given more info to their competitor. Why should they do so when the latter didn't even disclose their SKL uarch before the last HC, that is, one year after it was released? So much for transparency...

Intel disclosed the SKL micro architecture back at IDF 2015.

That's not the subject, but they actually disclosed it only at HC 2016, hence the recent articles on the thing. hardware.fr states that they asked for info that never came despite promises made one year earlier. So much for your 2015 date.


http://www.hardware.fr/news/14761/hot-chips-m1-sve-parker-info-skylake.html

What do you see on the slide below, HC 2015..?..

[Image: slide, 1-630.2498363346.png]


https://www.computerbase.de/2016-08/kaby-lake-prozessor-intel/


Of course one does need much effort to imagine what did prompt them to do so..


Nothing that resembles AMD's exhaustive explanation of their uarch. Frankly, that's poor; no wonder they came back at HC one year later.

To get back on topic, you can see on the slide above that SKL has two 128b/256b FP ports while Zen has four 128b ports. If anything, this should be more efficient in legacy SSE, which is the norm, since there's no way that AVX or even FMA can be used for more than small parts of the code.
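
A rough way to put numbers on that port comparison, as a minimal sketch only. The port counts and widths are the ones stated in the post above; the helper name `peak_float_lanes` is hypothetical, and the model assumes every FP port can accept any vector op every cycle and that a 256-bit op on a 128-bit port is simply split in two. Real cores are more restricted than that, so treat the results as upper bounds, not predictions.

```python
# Upper bound on single-precision lanes processed per cycle.
# Assumption: any port can take any vector op each cycle (not true on real hardware).

FLOAT_BITS = 32

def peak_float_lanes(ports, port_width_bits, code_width_bits):
    """Lanes per cycle when code of a given vector width runs on the given ports."""
    # A port cannot use more bits than the code supplies, nor more than its own width.
    usable_bits = min(port_width_bits, code_width_bits)
    return ports * usable_bits // FLOAT_BITS

# Port counts and widths as quoted in the thread, not measured values.
CORES = {
    "SKL": {"ports": 2, "port_width_bits": 256},
    "Zen": {"ports": 4, "port_width_bits": 128},
}

for name, core in CORES.items():
    for code_width in (128, 256):
        lanes = peak_float_lanes(core["ports"], core["port_width_bits"], code_width)
        print(f"{name}, {code_width}-bit code: up to {lanes} float lanes/cycle")
```

Under these assumptions, 128-bit code leaves half of each 256-bit port unused, while four 128-bit ports are fully occupied by the same code.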

http://www.intel.com/content/www/us...-ia-32-architectures-optimization-manual.html

More detailed info in the optimization guide that's been out since last year.

Intel's presentation at HC this year was a joke. It's very disappointing they let a pure marketing presentation happen :(



what's questionable is people's inability to keep their usual AMD vs. Intel/nvidia garbage out of this thread.

look, we know, AMD does not have a good history in being forthcoming with their products' weaknesses. we don't need to be reminded of it every page of this thread. nor do we need to bury our heads in the sand.

but i will remind you all that not a single one of us has committed a dollar of cold hard cash to buying this thing, and aside from a few diehards for whom the benchmarks don't matter, we're all waiting for actual wide ranging benchmarks and investigation from the review sites.

this is one of the few threads that has been able to steer clear of most of the garbage baggage many of you carry around, and i'd like to see it stay that way.
 

Nothingness

Platinum Member
Jul 3, 2013
2,503
901
136
what's questionable is people's inability to keep their usual AMD vs. Intel/nvidia garbage out of this thread.
Even though I admit having been off topic in my comment regarding Intel HC presentation, I'm certainly no hater/fan of either AMD or Intel, and being accused of that upsets me, but I'll live with that :D

FWIW, I'm part of the crowd that is waiting for more benchmark results for Zen before drawing any conclusion, even if my experience in processor design makes me think they'll need several iterations before being really competitive against Intel's higher-end CPUs.

But in the end all that matters is that x86 dies :p
 

leoneazzurro

Senior member
Jul 26, 2016
952
1,516
136
What AMD needs for the moment is not to beat Intel on pure performance, but to be competitive enough on the price/performance/power relationship. If they come within a reasonable distance of Intel's offerings and price the parts adequately, they could gain back some market share. That is, I expect the ST performance to be lower than Broadwell, but if it is not much lower, AND they place their 8-core at a similar/slightly lower price than Intel's 4-core parts, and 4-core dies in competition with Intel's two cores, they will have good price/perf in MT applications (I could dare to say an edge, but it's too early to say anything at this moment), while Intel will still rule in, e.g., gaming or poorly threaded applications. If they want to add too much of a premium, then it will be a flop. But, nevertheless, I think Zen will definitely improve the situation compared to the present desktop scenario (but that's easy, I suppose).
 

DrMrLordX

Lifer
Apr 27, 2000
21,710
10,986
136
No need to test it: the operands (not instructions) are __m128 in the source code, so 128-bit-wide vector code will get generated, either SSE2 with some SSE4 (dot products) or AVX (with some 128-bit FMA3 code). Not sure why they are not using wider vectors, but it could be due to inherent limitations in their data structures.

Fair 'nuff.

Still, the point stands: it is a perfect test case to extract the maximum from AMD's 4 mixed FPU pipes, while not stressing data paths beyond 128 bits and leaving half of Intel's vector hw idle. Still, I am very impressed that they are able to run code like that at Broadwell speeds at the same clock. Intel used to have an insane advantage in such code, and I know from my work on the vectorization efforts for Skyrim's SkyBoost mod that Sandy Bridge was able to chew through code, shrugging off penalties for unaligned loads and turning in great results.

Intel did have quite a lead. I would like to see Summit Ridge run something like Cinebench R10 or some other legacy fp benchmark. It would probably do quite well.

But in the end all that matters is that x86 dies :p

Boo!
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
I went looking and it isn't an issue with AVX2. For the Blender generic build, the 2.77 release, etc., __m256-style intrinsics aren't used; the intrinsics are __m128 instead. If the switch were made we could see up to 30% improvement on the Broadwell-E processor, which is technically two full refreshes or one new architecture ahead of AMD.

imho, they should get rid of FMA code for AVX2.
VIA Isaiah 2 doesn't support it. Intel & AMD largely execute it slower than Mul+Add.
If Blender still uses single precision, those 128 bit vectors fit quite nicely, as 4 vector elements are a good base for the typical coordinate transformations using 4x4 matrices. Colors could be represented as RGBA as well.

So going from 1 vector (128b, 4 floats) to 2 parallel vectors (256b, 8 floats) might just make the code more complicated rather than bring significant benefits.
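
To illustrate why 4-float vectors map so naturally onto 4x4 transforms, here is a minimal sketch in plain Python. It is not Blender's code; the function names and the column-major layout are assumptions chosen for illustration. Each list of four floats stands in for one 128-bit register, and each helper corresponds to one 4-lane multiply or add.

```python
# Each 4-element list stands in for one 128-bit register holding 4 floats.

def mul_scalar(vec4, s):
    """Multiply all four lanes by the same scalar (one 4-lane multiply)."""
    return [x * s for x in vec4]

def add(a, b):
    """Add two 4-lane vectors lane by lane (one 4-lane add)."""
    return [x + y for x, y in zip(a, b)]

def transform(columns, v):
    """Apply a column-major 4x4 matrix, given as four columns, to a 4-vector."""
    result = mul_scalar(columns[0], v[0])
    for j in range(1, 4):
        result = add(result, mul_scalar(columns[j], v[j]))
    return result

# Translation by (10, 20, 30), stored column by column.
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]

point = [1.0, 2.0, 3.0, 1.0]
print(transform(translate, point))  # [11.0, 22.0, 33.0, 1.0]
```

The whole transform is four 4-lane multiplies and three 4-lane adds. An 8-lane version would need two points packed side by side, which is the extra complexity mentioned above.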

Also, going to double precision to preserve 4-wide vectors would cost bandwidth and capacity at other places in a processor (e.g. data caches are effectively halved, and so is mem BW in vectors/clock).
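
The "effectively halved" point is simple arithmetic, sketched below. The 32 KiB cache size is a hypothetical figure picked for the example, not a claim about any particular chip, and the helper names are made up.

```python
# How many 4-element vectors fit in a cache, for float vs. double elements.
# CACHE_BYTES is a hypothetical size used only for illustration.

CACHE_BYTES = 32 * 1024
LANES = 4

def vector_bytes(element_bytes):
    """Size of one 4-element vector in bytes."""
    return LANES * element_bytes

def vectors_in_cache(cache_bytes, element_bytes):
    """Number of whole 4-element vectors that fit in the cache."""
    return cache_bytes // vector_bytes(element_bytes)

single = vectors_in_cache(CACHE_BYTES, 4)  # 4-byte floats, 16-byte (128-bit) vectors
double = vectors_in_cache(CACHE_BYTES, 8)  # 8-byte doubles, 32-byte (256-bit) vectors

print(single, double)  # 2048 1024
```

The same factor of two applies to vectors moved per clock over a fixed-width memory path.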

This is also an interesting topic for 3D and physics engines in games.
 

KTE

Senior member
May 26, 2016
478
130
76
Has this been posted before? Is it reliable?

https://twitter.com/BitsAndChipsEng

"Zen SMT it's more advanced than Skylake SMT, according to several software developers. More info during next weeks. Stay tuned!"

"Zen L3 Cache will be a lot faster than Broadwell-EP/EX L3 Cache!"

"Zen ES currently in the wind are just EVT (Engineering Validation Test) CPUs. The frequencies of final ES will be a lot higher, as we said."

“The key from the beginning was achieving the performance gains while keeping power consumption down,” Clark said. “With Zen, energy efficiency was going to be just as important as performance”. During the first interview about Zen, Jim Keller said: “We know how to do small dense cores and we know how to do high frequency. What I've asked the team to do is take the DNA from both”.

But, as Clark says, this work is not exempt from risks: “For me, it wasn't the first time, but you don't do it often because it is so daunting. It's going to take a lot of [effort] and time … It comes with a lot of risk”. However, Clark is a veteran. He has worked at AMD since the K5 project.

Last, but not the least, now we know why this new uArch is called Zen: “As the lead [on the core development], I picked 'Zen' as the [codename] because zen is a balance. We needed to balance the whole thing to make it work”.
 
Mar 10, 2006
11,715
2,012
126
Has this been posted before? Is it reliable?

https://twitter.com/BitsAndChipsEng

"Zen SMT it's more advanced than Skylake SMT, according to several software developers. More info during next weeks. Stay tuned!"

"Zen L3 Cache will be a lot faster than Broadwell-EP/EX L3 Cache!"

"Zen ES currently in the wind are just EVT (Engineering Validation Test) CPUs. The frequencies of final ES will be a lot higher, as we said."

“The key from the beginning was achieving the performance gains while keeping power consumption down,” Clark said. “With Zen, energy efficiency was going to be just as important as performance”. During the first interview about Zen, Jim Keller said: “We know how to do small dense cores and we know how to do high frequency. What I've asked the team to do is take the DNA from both”.

But, as Clark says, this work is not exempt from risks: “For me, it wasn't the first time, but you don't do it often because it is so daunting. It's going to take a lot of [effort] and time … It comes with a lot of risk”. However, Clark is a veteran. He has worked at AMD since the K5 project.

Last, but not the least, now we know why this new uArch is called Zen: “As the lead [on the core development], I picked 'Zen' as the [codename] because zen is a balance. We needed to balance the whole thing to make it work”.

The Bits and Chips guy is not a reliable source at all; honestly, he seems like an attention-seeking AMD fan. Listen to The Stilt on these forums, he is the real deal.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
What AMD needs for the moment is not to beat Intel on pure performance, but to be competitive enough on the price/performance/power relationship. If they come within a reasonable distance of Intel's offerings and price the parts adequately, they could gain back some market share. That is, I expect the ST performance to be lower than Broadwell, but if it is not much lower, AND they place their 8-core at a similar/slightly lower price than Intel's 4-core parts, and 4-core dies in competition with Intel's two cores, they will have good price/perf in MT applications (I could dare to say an edge, but it's too early to say anything at this moment), while Intel will still rule in, e.g., gaming or poorly threaded applications. If they want to add too much of a premium, then it will be a flop. But, nevertheless, I think Zen will definitely improve the situation compared to the present desktop scenario (but that's easy, I suppose).

Relying on price has been AMD's problem. There's a saying in sales - "He who lives by price dies by price".

I don't know how anyone expects AMD to stay in business selling 8 cores at the price Intel sells 4 cores at. Even Lisa Su has publicly stated she wants AMD to stop being the cheap chip company.
 
Mar 10, 2006
11,715
2,012
126
Relying on price has been AMD's problem. There's a saying in sales - "He who lives by price dies by price".

I don't know how anyone expects AMD to stay in business selling 8 cores at the price Intel sells 4 cores at. Even Lisa Su has publicly stated she wants AMD to stop being the cheap chip company.

I don't think AMD will be able to charge more than around $349 for 8C/16T Zen given the perf/clock and clock speeds. 8C/16T Zen will likely be what AMD positions as the direct competition to 4C/8T Kaby Lake-S. AMD will likely try to market it on multi-threaded competitiveness relative to the 7700K.
 

SpaceBeer

Senior member
Apr 2, 2016
307
100
116
They started with that strategy when they released the RX 480: it's slower (sometimes), has a higher TDP, and yet is the same price as the GTX 1060.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I don't think AMD will be able to charge more than around $349 for 8C/16T Zen given the perf/clock and clock speeds. 8C/16T Zen will likely be what AMD positions as the direct competition to 4C/8T Kaby Lake-S. AMD will likely try to market it on multi-threaded competitiveness relative to the 7700K.

I don't believe 8C/16T Zen will compete against Socket 1151 and the Core i7 7700K; there will be 4C/8T and 6C/12T SKUs for that segment.
 

DrMrLordX

Lifer
Apr 27, 2000
21,710
10,986
136
Has this been posted before? Is it reliable?

Haven't seen that before, and I have no way of knowing that it's reliable. I think hype-based stuff should be taken with a hefty dose of skepticism. It's interesting though, and statements of that nature should be examined carefully once more benchmarks are released.
 

superstition

Platinum Member
Feb 2, 2008
2,219
221
101
Phynaz said:
Relying on price has been AMD's problem. There's a saying in sales - "He who lives by price dies by price".
I guess the folks saying that aren't familiar with the way gourmet is greatly outclassed, in terms of profits, by fast food. Fine wine is outclassed among even adults by soda sales. The Honda Fit sells a lot more and makes a lot more money for its company than a Rolls.

AMD looks to have purposefully gone for an inexpensive-to-produce product with Zen. Small die. No iGPU complexity. No expensive process. By keeping TDP targets low for AM4, as well, it shaves off motherboard cost.

I don't think we should be alarmed if AMD prices Zen quite affordably. The strategy looks to be similar to that of Polaris. AMD ceded the high end to Nvidia but is getting back market share. This is definitely not the first time, either, that AMD/ATI has chosen to produce a less expensive-to-produce product. With GPUs this has often been the strategy it has taken.

I would love for AMD to max out the die size (and not with "moar cores" but with huge caches including L4 as well as industry-leading SIMD/FPU) and use a high-power process without a dense library. But, AMD is going for profit not impressing enthusiasts. I just hope they won't be cheesy and use polymer TIM.
 

MajinCry

Platinum Member
Jul 28, 2015
2,495
571
136
Didn't Lisa Su state that AMD wasn't going to be the discount-bargain king? Y'know, that they're going to actually make some decent returns on their CPUs.

Of course, anything is better than the current situation, with the top FX processor being sold for ~£100. They'll price it in accordance with how it performs against the competition, as always. Just now they'll have competent enough CPUs that they might make it past the £200 mark.
 

dark zero

Platinum Member
Jun 2, 2015
2,655
138
106
I don't think AMD will be able to charge more than around $349 for 8C/16T Zen given the perf/clock and clock speeds. 8C/16T Zen will likely be what AMD positions as the direct competition to 4C/8T Kaby Lake-S. AMD will likely try to market it on multi-threaded competitiveness relative to the 7700K.
Remember that the chip will be useful for other things like virtual machines, giving it an edge there. It could go up to 500 dollars at worst and 700 at best.
 

CentroX

Senior member
Apr 3, 2016
351
152
116
I am hearing rumors that Zen is on par with Kaby Lake performance. Wow, AMD does not kid around any longer.
 

Doom2pro

Senior member
Apr 2, 2016
587
619
106
That is supposed to be Zen+, not that Zen. Also, after Zen+, they will go with Starship.

Starship... Is that the 7nm part that was rumored? Come to think of it, the new AMD-GloFo WSA gives that rumor more credibility...