New Zen microarchitecture details

Discussion in 'CPUs and Overclocking' started by Dresdenboy, Mar 1, 2016.

  1. KTE

    KTE Senior member

    Joined:
    May 26, 2016
    Messages:
    471
    Likes Received:
    130
    So typical AMD marketing means advertising best case highly multi-threaded result vs. the 'current' flagship core...

    1st major neg.

    Even 2x in CB15 vs. Piledriver 8370 would mean pretty modest IPC/average performance gain on a more than doubling of cores (plus SMT). Reminds me of AMD ads comparing Barcelona vs. K8 lol

    BTW, for any of us who were active back then and remember; Phenom was a very big leap for AMD. From K8. But it was too late, power hungry and too low in clocks to be competitive. And then it never OC'd either.

    Nothing I've seen indicates any better breakthrough with Zen. However, daily, what we're seeing indicates Phenom 9500 and its buildup.

    So, on majord's graph, I'm putting the release Zeppelin flagship model at the 100-115% 6700K mark in heavily ILP code, and around the 2700K mark on average in most other serially limited code.

    With Turbo, that is. I also can't see release frequencies being high either. Max base 2.8-3.0GHz for Zeppelin all cores loaded.

    As for the argument of "why release a poor product?" My God. Try following HPE, IBM, MoD, or phone MFGs. That's all many have done for 4-5 years. Because corporate internal politics is huge, a vulture culture, and is what dictates, not the end product or its suitability. Because guys are sitting in lofty positions on fat checks with absolutely zero bother or competence. Because everyone watches their own back and deflects the problem onto someone else within corps. Because most internally do not, ever, work together, but as my deadline vs your deadline, as department vs department. Multi-billion dollar companies pay millions in expenses and contracts a year for piss poor services and outdated 80-90s products (a la mainframes). Just because none can be bothered to change it and take that additional workload/responsibility. You might think this cannot possibly happen with such clever people. Dear Lord! If you ever saw the figures...

    It's never engineering vs engineering. Geeks have this very simplistic outlook of best design will surface and flourish. Never so. It's about projects, teams, budgets, internal reputation and fat cat bonuses.

     
  2. DrMrLordX

    DrMrLordX Diamond Member

    Joined:
    Apr 27, 2000
    Messages:
    7,893
    Likes Received:
    282
    If you had actually been paying attention to Zen all along, you'd be more worried about 8c/16t Summit Ridge's throughput versus a hypothetical 8m/16t Excavator product in multithreaded scenarios. It's looking like Zen will be quite nice compared to previous AMD offerings in sparsely-threaded apps but maybe not so stellar in the throughput department. Which is a real shame, honestly.

    That AMD is choosing to advertise Zen's mt performance at all is more encouraging than anything else. It's meant to be a server/datacenter CPU.
     
  3. Dresdenboy

    Dresdenboy Golden Member

    Joined:
    Jul 28, 2003
    Messages:
    1,672
    Likes Received:
    405
    I don't know... having seen the FinFET impact on actually low power archs and also some Zen/K12 related AMD patents describing what they did to get the targeted cycle times, I think we can't constrain the speculated clock frequencies at iso power that much at this point.

    For example they have fast and slow operands for instructions, where the latter get additional time to be ready. Never seen that before (but might exist somewhere).
     
    #1478 Dresdenboy, May 27, 2016
    Last edited: May 27, 2016
  4. KTE

    KTE Senior member

    Joined:
    May 26, 2016
    Messages:
    471
    Likes Received:
    130
    I'm not convinced...

    A server CPU advertisement is typically when the ST/SC performance falls short. A la Magny-Cours.

     
  5. ElFenix

    ElFenix Super Moderator and Elite Member
    Super Moderator

    Joined:
    Mar 20, 2000
    Messages:
    96,375
    Likes Received:
    232
    well orochi is the x1xx series. vishera is the core name for the x3xx series.
     
  6. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,198
    Likes Received:
    1,321
    Nope. Both Bulldozer (DT marketing name Zambezi) and Piledriver (DT marketing name Vishera) are based on the Orochi die. Orochi Rev. B (OR-Bx, Bulldozer) = Zambezi, Orochi Rev. C (OR-C0) = Vishera.
     
  7. Ajay

    Ajay Platinum Member

    Joined:
    Jan 8, 2001
    Messages:
    2,848
    Likes Received:
    63
    Is there any concrete info on this? TSMC's 16FF+ appears to be built for higher FMAX or lower power consumption over their 16FF process. It's the reason most customers waited for the better process.

    With all the delays, I've kind of lost track of where 14LPP is at for GLF.
     
  8. Phynaz

    Phynaz Diamond Member

    Joined:
    Mar 13, 2006
    Messages:
    9,331
    Likes Received:
    234
    So was Bulldozer. We know how that turned out
     
  9. AtenRa

    AtenRa Lifer

    Joined:
    Feb 2, 2009
    Messages:
    12,222
    Likes Received:
    1,010


    [Images: four slides, not preserved]
     
  10. ElFenix

    ElFenix Super Moderator and Elite Member
    Super Moderator

    Joined:
    Mar 20, 2000
    Messages:
    96,375
    Likes Received:
    232
    Ah. Well, the slide would probably need to be changed as orochi could be either of two very differently performing products.
     
  11. DrMrLordX

    DrMrLordX Diamond Member

    Joined:
    Apr 27, 2000
    Messages:
    7,893
    Likes Received:
    282
    With all due respect, that's a cheap shot. AMD has produced server-oriented processors before with success. Just because they had BD doesn't mean everything they'll produce forevermore will be just as bad.

    They also produced Magny Cours which was pretty damn awesome considering that their backs were against the wall when they made it. They took struggling Agena and turned it into something amazing.

    Oh, and KTE . . . Magny Cours was exceptional considering their product lineup of the time, so your point . . . ?
     
  12. SAAA

    SAAA Senior member

    Joined:
    May 14, 2014
    Messages:
    371
    Likes Received:
    12
    AtenRa I hope you see how those supposed clock/power advantages are measured at mW of power and 1-2GHz speeds.
    Yes at 1, 1.5 and 2GHz you might need half the power but then if you hit a wall above 3 it's pretty useless: there won't be that advantage at 100W and 3.5-4GHz, the same way there was a clock regression with FinFETs at high speed for Intel's 22nm.
    They target server and the 2-3GHz range at best with Zen (especially the 32-core multi-die), desktop will be the outlier with ~3.5GHz turbo on single core loads.

    Being honest, the 32nm processes were the best for high speed, both SOI and bulk, making 5GHz possible. Maybe in an ideal world adding FinFETs to them might lead to even better performance (if they can reduce power on a larger node at these speeds) but that will never happen.
     
  13. Ajay

    Ajay Platinum Member

    Joined:
    Jan 8, 2001
    Messages:
    2,848
    Likes Received:
    63
    Well, first a thanks to AtenRa for the slides. I do agree with the above quote - I noticed that none of the results were for HVT cells - which is what I would think AMD & GLF would use to clock Zen up high enough to be competitive. The challenge remains that FinFET power curves get very flat at higher VTs. The fact that an 8C/16T server chip has a max TDP of around 95W seems to argue for a lower Fmax than would be necessary to compete with Intel (often at 140W or higher for top end server CPUs and HEDT).

    I expect Zen to easily outperform Kaveri - just still in the dark on how well it will do against its main rivals - time will tell.
     
    #1488 Ajay, May 27, 2016
    Last edited: May 27, 2016
  14. deasd

    deasd Member

    Joined:
    Dec 31, 2013
    Messages:
    198
    Likes Received:
    14
    IIRC 'Orochi' is a die-shrink codename; Zambezi/Vishera/Interlagos are the same Orochi die.
     
  15. Dresdenboy

    Dresdenboy Golden Member

    Joined:
    Jul 28, 2003
    Messages:
    1,672
    Likes Received:
    405
    I learned that it's the name of the 8C die. Or do you mean it to be the shrink from the cancelled 45nm variant?
     
  16. deasd

    deasd Member

    Joined:
    Dec 31, 2013
    Messages:
    198
    Likes Received:
    14
    I can only remember Orochi being a die codename, not a dedicated name of an architecture or SKU, because I don't remember (and can't find via Google) CPU-Z recognizing any AMD CPUs as Orochi.
    It's funny that AMD used 'Orochi' in its latest slide, considering it was only used when Bulldozer (Zambezi) was still on paper, and now the slide has been modified immediately. I think it could mean something, maybe the staff who made the slide realized they were leaking too much. o_O
     
    #1491 deasd, May 28, 2016
    Last edited: May 28, 2016
  17. USER8000

    USER8000 Golden Member

    Joined:
    Jun 23, 2012
    Messages:
    1,281
    Likes Received:
    386
    But you are making the assumption AMD is comparing one CMT core to one Zen core with HT. What if the comparison is one module to one Zen core with SMT??

    Even then SMT doesn't scale 100%, so let's say around 30% higher versus a single core.

    So an 8C/16T Zen CPU having twice the performance of an 8C Bulldozer chip would still mean most of the performance would be down to single threaded IPC and clockspeed increases.

    Let's assume AMD are comparing one Bulldozer core to one Zen core with SMT.

    That would still mean that an 8C/16T Zen CPU would have at least a 70% to 75% improvement in single threaded scores, assuming a 30% improvement to multi-threaded performance with SMT.
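
The SMT arithmetic in this post can be checked with a short sketch. It assumes the figures quoted in the thread (2x total throughput, about 30% SMT uplift) and equal core counts on both chips; the function name is made up for illustration. Treating the SMT uplift as a multiplier leaves roughly a 54% per-core gain, while simply subtracting the 30% gives the 70% figure quoted above.

```python
def single_thread_gain(total_speedup: float, smt_uplift: float) -> float:
    """Per-core speedup left once the SMT uplift is divided out (multiplicative model)."""
    return total_speedup / (1.0 + smt_uplift)


# Assumed inputs taken from the post: 2x total throughput, ~30% SMT uplift.
TOTAL_SPEEDUP = 2.0
SMT_UPLIFT = 0.30

# Multiplicative model: total = per_core_gain * (1 + smt_uplift)
multiplicative = single_thread_gain(TOTAL_SPEEDUP, SMT_UPLIFT)

# Additive shortcut: subtract the SMT share directly from the total
additive = TOTAL_SPEEDUP - SMT_UPLIFT

print(f"multiplicative model: +{(multiplicative - 1) * 100:.0f}% per core")
print(f"additive shortcut:    +{(additive - 1) * 100:.0f}% per core")
```

Which model is closer depends on how the SMT uplift was measured; the multiplicative one is the more conservative reading of the same numbers.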
     
    #1492 USER8000, May 28, 2016
    Last edited: May 28, 2016
  18. AtenRa

    AtenRa Lifer

    Joined:
    Feb 2, 2009
    Messages:
    12,222
    Likes Received:
    1,010
    You really believe DT ZEN at 14nm FF LPP will only have a 3.5GHz turbo on single core when BristolRidge at 28nm HDL Bulk can reach 4GHz ??

    If that was the design/mArch target for ZEN, then IPC would be way higher than 40% over Excavator.

    I strongly believe Turbo will reach 4GHz and may even be higher.

    one more slide.
    [Image: slide, not preserved]
     
  19. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,198
    Likes Received:
    1,321
    It's not. They've said "over current generation core" several times IIRC. According to AMD a 15h CU is two cores, not one.

    EDIT: It's still in the investor presentation slide.

    "Over 40% improvement in IPC over current AMD CPU core".

    Also the image in other slides, displaying the generational improvement, says "core".
     
    #1494 The Stilt, May 28, 2016
    Last edited: May 28, 2016
  20. USER8000

    USER8000 Golden Member

    Joined:
    Jun 23, 2012
    Messages:
    1,281
    Likes Received:
    386
    If you actually looked at the post I was replying to he said AMD were comparing a 16C/32T part with an 8C Bulldozer chip.

    Hence the worst case scenario would have been 1C Bulldozer core vs 1C/2T Zen core.

    Even Intel with its large cores tends not to see much more than a 30% uplift with SMT on average. It means if an 8C/16T Zen is double the performance of an 8C Bulldozer chip, most of the uplift is down to single threaded performance improvements.

    Assuming that clockspeeds are similar that is easily Ivy Bridge or Haswell level performance and if clockspeeds are lower it indicates even higher IPC.

    Remember this news which was reported on recently:

    http://www.itwire.com/it-industry-n...ll-favourably-compete-with-intel-skylake.html

    AMD might have issues hitting the high clockspeeds Intel does, but it wouldn't surprise me if IPC is actually not too bad at all.
     
    #1495 USER8000, May 28, 2016
    Last edited: May 28, 2016
  21. KTE

    KTE Senior member

    Joined:
    May 26, 2016
    Messages:
    471
    Likes Received:
    130
    It's true. I haven't followed computing since I left the field circa 2010. But I've been working back in computing for the past year. Bear with me whilst I refresh, but I do remember some basics.

    Throwing more cores at a problem like AMD's at the time is never a solution, only a temporary band-aid - a stop gap. I had plenty of private chats with John Fruehe at the time (explaining this).

    Let's talk about today and the future. Right now, all the big businesses have moved or are in the middle of migrating to the cloud models; IaaS and PaaS mainly (that I'm seeing). Licensing is per Core+GB of RAM and OS dependent. Wintel is FAR cheaper than the rest, I mean 1/50 to 1/3 of some others. The major hosting companies right now use 4-8 cores. Above that is rare, except when you really need AIX boxes or HP-UX. Even if you ask for more cores (VMs) from cloud providers, they won't offer you more than 4 on Wintel platforms. Something that'll be in the contracts.

    AMD needs that 90%+ market Wintel has fortified. In corps, people in purchasing and infrastructure right now don't even know of AMD being available in this [server] market.

    New process is always finicky, hence why MFGs tend to use it with matured archs first: To iron out the problems. New process + new complex arch far more so. It's rarely achieved without problems (clocking and power primarily). It's just part of the process learning curve.

    But any 8C/SMT chip will, if properly gated and modulated, present a major thermal budget for n-4/8 cores to tap into. I suppose the process performance at such MHz will primarily dictate that due to PVT sensitivities, but clock jitter, skews and distribution become the limiting factors in conjunction with the delays. It's not so simple to keep multiple high clocking and voltage domains, on large chips, in sync with stability, even if you possess a vast thermal budget. Gosh, even hot spots wreak havoc on gate and wire delays, which directly affect the clocking.

    Anyone know the type of clock distribution BD/PD uses? Or how many drivers and buffers are used?

    What about FO4 delays for any of AMD's recent archs?

     
  22. SAAA

    SAAA Senior member

    Joined:
    May 14, 2014
    Messages:
    371
    Likes Received:
    12
    My bad, I didn't specify that was for the 8-core model, and yes I do believe 3.5GHz will be more than enough, especially if the TDP is really 95W. What if Bristol Ridge is 4? A Zen at 3.5 would be like a 5GHz Excavator so no real problem there.
    Anyway yes 6-cores might easily have higher turbos up to 4GHz, exactly like dual and three module Piledriver had higher clocks. Unless they decide to make the top bin an extreme in all senses so you have an FX-9590 successor...
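
The 3.5GHz versus 5GHz comparison follows from the 40% IPC claim quoted earlier in the thread, and can be sketched roughly as below. It assumes performance scales as IPC times clock and that nothing else changes between the two cores; both function names are hypothetical.

```python
def excavator_equivalent_ghz(zen_clock_ghz: float, ipc_gain: float = 0.40) -> float:
    """Clock an Excavator core would need to match a Zen core (perf = IPC * clock)."""
    return zen_clock_ghz * (1.0 + ipc_gain)


def zen_clock_to_match_ghz(excavator_clock_ghz: float, ipc_gain: float = 0.40) -> float:
    """Clock a Zen core would need to match an Excavator core at the given clock."""
    return excavator_clock_ghz / (1.0 + ipc_gain)


# 3.5GHz Zen turbo expressed as an Excavator clock, using the assumed 40% IPC gain
print(f"Zen at 3.5GHz ~ Excavator at {excavator_equivalent_ghz(3.5):.1f}GHz")

# Clock Zen would need just to equal a 4.0GHz Bristol Ridge core under the same assumption
print(f"Zen matches a 4.0GHz Excavator at {zen_clock_to_match_ghz(4.0):.2f}GHz")
```

With these assumptions a 3.5GHz Zen lands at about 4.9GHz Excavator-equivalent, close to the 5GHz figure in the post.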
     
  23. CentroX

    CentroX Senior member

    Joined:
    Apr 3, 2016
    Messages:
    328
    Likes Received:
    137
    Since AMD is skipping Polaris at Computex, will they at least showcase Zen? Or will they be a complete no-show?
     
  24. ShintaiDK

    ShintaiDK Lifer

    Joined:
    Apr 22, 2012
    Messages:
    20,392
    Likes Received:
    120
    They are only showing Bristol Ridge, aka Carrizo for desktop.
     
  25. The Stilt

    The Stilt Golden Member

    Joined:
    Dec 5, 2015
    Messages:
    1,198
    Likes Received:
    1,321
    Stoney also.