- Jul 28, 2019
Good catch! However, that poor scaling is a SW problem of that test suite. Don't you think that cloud providers like Amazon, Google or MS Azure will just put twice as many customers on it and so effectively solve that scaling issue?
Going from 64c/128t AMD Epyc (7742) to 128c/256t Epyc (2x7742) results in a 10% average performance increase.
Going from 8c/16t AMD Epyc (7262) to 16c/32t Epyc (2x7262) results in a 17% average performance increase.
Going from 16c/32t AMD Epyc (7F52) to lower clocked 64c/128t Epyc (7742) results in a 4% average performance increase.
How in the world do you estimate 128c/128t Altra will be 2* 64c/64t Graviton2 in average performance in that test suite?
I assumed linear scaling, of course, to keep it simple. Some SW, like raytracing engines, scales linearly. I guess nobody is gonna use a 128-core Altra for SW that cannot scale beyond 8 threads.
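To put numbers on the gap between linear scaling and what that test suite showed, here is a rough sketch using only the two core-doubling figures quoted above. The function name and the "efficiency" metric (observed speedup divided by ideal linear speedup) are my own choices for illustration, not something from the benchmark.

```python
def scaling_efficiency(cores_before, cores_after, perf_gain):
    """Return observed speedup as a fraction of ideal linear speedup.

    perf_gain is the average performance increase as a fraction,
    e.g. 0.10 for the quoted "10% increase".
    """
    ideal_speedup = cores_after / cores_before   # linear-scaling assumption
    observed_speedup = 1.0 + perf_gain           # what the suite measured
    return observed_speedup / ideal_speedup


# Figures quoted in the thread: (cores before, cores after, average gain)
cases = {
    "7742 -> 2x7742": (64, 128, 0.10),
    "7262 -> 2x7262": (8, 16, 0.17),
}

for name, (before, after, gain) in cases.items():
    eff = scaling_efficiency(before, after, gain)
    print(f"{name}: {eff:.3f} of linear scaling")
```

With linear scaling the result would be 1.0 in both cases. The suite lands well below that, which is why an estimate built on the linear assumption only applies to SW that really does scale with core count.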
Jeeez Christ, do not scare me with the Bulldozer horror, please. You didn't get my point there, because it's the opposite way around. My idea was that one super wide core with 6xALU+2xBranch, 4xAGU, 4xFPU + SMT2 would be more powerful than 2x cores with 3xALU+1xBranch, 2xAGU, 2xFPU and no SMT (Cortex A76). Transistor-wise they are similar, but performance-wise the wide core can use all 6xALU and 4xFPU in an ST load. This was also the idea behind SMT4 in Zen3: if AMD created Zen3 twice as wide as Zen2, it would consist of 6 or 8xALU, 4xAGU, 4xFPU (8xpipes). Applying SMT4 to that double-wide core wouldn't hurt performance per thread at all. Even some significant gain is possible.