Are you guys referring to the same thing (integer, floating point, etc.)? From what I've read online, opinions are divided: some say it's Haswell-level IPC, others say it's Broadwell level (especially in floating point), and others say it's between Ivy Bridge and Haswell but closer to Ivy. Is there no consensus yet?
By the way, which is Ryzen's strong point, integer or floating point?
There is only one muppet who claims IB-level IPC. You have to be careful and look at what the workload is bound by. You might compare an IB with 1866 DDR3 memory to a Zen review running 2133 DDR4 and go "look at the IVB IPC, ROFL fail", while ignoring that middle-of-the-road 1866 (9-9-9-27) has an access latency of 9.65 ns while the 2133 DDR4 has an access latency of 13.2 ns.
Now give the Zen SoC 3200 MHz memory (15-15-15-X), which is 9.38 ns, and compare again. You get a double uplift, because the faster memory also brings up the clock rate of the entire Infinity Fabric.
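The access latency figures above come from simple arithmetic, sketched below. It treats the quoted speed as the DDR data rate, so the real memory clock is half of it and one clock period is 2000 / speed in ns. The function name `cas_latency_ns` is just made up for illustration, and it only counts the CAS (first timing) delay, not the whole trip to memory.

```python
def cas_latency_ns(data_rate: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds for a given DDR speed and CL value.

    DDR moves data twice per clock, so the clock in MHz is data_rate / 2
    and one clock period lasts 2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate


# The two kits from the post whose CAS timing is given.
for name, rate, cl in [("DDR3-1866 CL9", 1866, 9), ("DDR4-3200 CL15", 3200, 15)]:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

Same idea works for any kit: plug in the rated speed and the first timing number.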
I did this test on IVB using Dragon Age:
Code:
Speed  Timings           Memory latency (ns)  AIDA latency (ns)  FPS
1333   7-7-7-24-T1       10.5                 62.2               122
1333   8-8-8-27-T1       12                   66.1               116
1333   9-9-9-30-T1       13.5                 72.8               74
1333   10-10-10-33-T1    15                   75.4               70
1600   8-8-8-27-T1       10                   60.5               122
1600   9-9-9-30-T1       11.25                64.4               106
1600   10-10-10-27-T1    12.5                 61.5               107
1600   12-12-12-39-T1    15                   76.1               75
1866   9-10-9-33-T2*     9.6                  60.2               108
1866   10-11-10-36-T2    10.7                 58.1               105
1866   12-13-12-39-T2    12.8                 66.9               97
1866   15-16-15-40-T2    16                   FAIL TO POST
1866   11-12-11-36-T1    11.7                 62.9               111
Here it is again, sorted by memory latency:
Code:
Speed  Timings           Memory latency (ns)  AIDA latency (ns)  FPS
1866   9-10-9-33-T2*     9.6                  60.2               108
1600   8-8-8-27-T1       10                   60.5               122
1333   7-7-7-24-T1       10.5                 62.2               122
1866   10-11-10-36-T2    10.7                 58.1               105
1600   9-9-9-30-T1       11.25                64.4               106
1866   11-12-11-36-T1    11.7                 62.9               111
1333   8-8-8-27-T1       12                   66.1               116
1600   10-10-10-27-T1    12.5                 61.5               107
1866   12-13-12-39-T2    12.8                 66.9               97
1333   9-9-9-30-T1       13.5                 72.8               74
1333   10-10-10-33-T1    15                   75.4               70
1600   12-12-12-39-T1    15                   76.1               75
1866   15-16-15-40-T2    16                   FAIL TO POST
*9-9-9-30 was unstable, so I ran 9-10-9-33 instead, which may not be working well either.
The latency trend is clear.
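If you want to put a rough number on that trend, here is a quick sketch that computes the Pearson correlation between latency and FPS for the runs in the table. The row that failed to POST is left out, the CAS latency is recomputed from speed and CL rather than copied from the rounded column, and the helper name `pearson` is only for illustration. With 12 runs from one game on one machine this is a sanity check, not a proper statistical result.

```python
from math import sqrt

# (speed, CAS cycles, AIDA latency in ns, FPS), copied from the table;
# the 1866 15-16-15-40 run is omitted because it failed to POST.
RUNS = [
    (1333, 7, 62.2, 122),
    (1333, 8, 66.1, 116),
    (1333, 9, 72.8, 74),
    (1333, 10, 75.4, 70),
    (1600, 8, 60.5, 122),
    (1600, 9, 64.4, 106),
    (1600, 10, 61.5, 107),
    (1600, 12, 76.1, 75),
    (1866, 9, 60.2, 108),
    (1866, 10, 58.1, 105),
    (1866, 12, 66.9, 97),
    (1866, 11, 62.9, 111),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# CAS latency in ns: CL cycles times the clock period (2000 / speed).
cas_ns = [cl * 2000.0 / speed for speed, cl, _, _ in RUNS]
aida_ns = [aida for _, _, aida, _ in RUNS]
fps = [f for _, _, _, f in RUNS]

r_cas = pearson(cas_ns, fps)
r_aida = pearson(aida_ns, fps)
print(f"CAS latency vs FPS:  r = {r_cas:.2f}")
print(f"AIDA latency vs FPS: r = {r_aida:.2f}")
```

A negative r means FPS drops as latency goes up, which is the direction the table suggests.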
Anything that has serial sections of code where maximum ILP can't be extracted (even if it can run lots of threads) gets good gains from reducing memory latency.