The problem with this thread is that so many people (particularly on the negative side) don't even look at the microarchitecture and think before going off to war. AMD's SMT isn't going to give CMT-style second-thread "performance" unless your workload gets bad cache hit rates, and that's the same for Intel's SMT. The reason is simple: a CON core can do 2 loads / 1 store per cycle, and so can Zen and Intel. Assuming the stack engine does its own simple address generation, load/store is the obvious bottleneck, given that on average roughly 50% of x86 ops have a memory component.
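To put rough numbers on the load/store argument, here is a back-of-the-envelope sketch. The 50% memory-op fraction and the 2 loads + 1 store per cycle come from the paragraph above; the function names and the single-thread IPC figure are made up for illustration, and the model assumes perfect cache hits and ignores every other limit (issue width, AGU count, bank conflicts, the actual load/store mix).

```python
def ls_ipc_ceiling(mem_fraction, mem_ops_per_cycle):
    """Best-case ops/cycle for the whole core if load/store ports are the only limit.

    mem_fraction: share of ops that touch memory (0.5 in the post above).
    mem_ops_per_cycle: peak loads + stores the core can start each cycle.
    """
    if not 0.0 < mem_fraction <= 1.0:
        raise ValueError("mem_fraction must be in (0, 1]")
    return mem_ops_per_cycle / mem_fraction


def smt_headroom(single_thread_ipc, mem_fraction, mem_ops_per_cycle):
    """Ops/cycle left for a second thread before the shared L/S ports saturate."""
    ceiling = ls_ipc_ceiling(mem_fraction, mem_ops_per_cycle)
    return max(0.0, ceiling - single_thread_ipc)


if __name__ == "__main__":
    MEM_FRACTION = 0.5          # from the post: ~50% of x86 ops have a memory component
    PEAK_MEM_OPS = 3            # from the post: 2 loads + 1 store per cycle
    HYPOTHETICAL_ST_IPC = 2.0   # assumption for illustration only, not a measured figure

    print("L/S ceiling (ops/cycle):", ls_ipc_ceiling(MEM_FRACTION, PEAK_MEM_OPS))
    print("headroom for thread 2:",
          smt_headroom(HYPOTHETICAL_ST_IPC, MEM_FRACTION, PEAK_MEM_OPS))
```

The ceiling is per core, not per thread, and that is the point: both SMT threads draw on the same load/store ports, so whatever the first thread already uses is not available to the second. Lower the sustainable memory ops per cycle (for example, if address generation rather than the ports is the limit) and the headroom shrinks accordingly.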
To me it's always been the same people pushing a negative agenda without being able to back anything up. Look at all my posts: none of them have any of the marketing numbers in them. Those numbers are far too simplistic, and people pick and choose how to interpret them, so it's pointless.
Look at the architecture.
Everything in terms of the presented architecture puts Zen at a Broadwell/Skylake level.
Much improved L1D (write-back), with increased associativity.
Not detailed by AMD, but L2 latency will be much lower.
L3 is to have 5 times the bandwidth (we will have to wait and see about latency), but that puts L3 performance in the same ballpark (
http://i.imgur.com/IAkGdxV.png,
https://www.aida64.com/sites/default/files/shot3_cachemem_skylake.png)
Fetch looks pretty similar: decoupled prediction, larger TLBs than BD, comparable to but different from Intel.
Decode looks to be the same, but now with an additional uop cache, so again around Skylake/Broadwell.
Issue/dispatch is bigger, comparable to Skylake.
PRFs are comparable to Broadwell (slightly behind Skylake).
I would call execution the same as Haswell onwards (Haswell, Broadwell and Skylake are all the same here). All the unit counting is silly; potentially AMD can get one more FP ADD/MUL per cycle.
The load/store system is the hardest to gauge, as no one, especially Intel, talks about it. No idea about improvements to memory disambiguation.*
Finally, SMT: almost every structure is shared. According to Mr Kanter, retire on Skylake isn't shared, and it isn't on Zen either, so it looks very comparable.
One big unknown is memory controller performance. Given that they are licensing a Rambus controller, hopefully it's not terrible .....lol
Now you can take the position that Intel will squeeze a little more performance out of each unit, but that's still not going to radically change things. So again it comes back to this: why can't Zen perform per clock at a Broadwell/Skylake level (outside 256-bit ops)?
*I went looking to see if AMD had said anything about memory disambiguation for Zen; all I found was my own post from March saying the exact same thing as I am now.....lol
https://forums.anandtech.com/threads/new-zen-microarchitecture-details.2465645/page-12#post-38097201 good to see the discussion has moved on......