We have source code that defines the work to be done. The machine-level instructions don't matter, as there are multiple possible ways to translate that source code even for the same ISA. The whole point of SPEC is to offer a source-code-based benchmark that can be translated freely into CPU-specific instructions.
Score/GHz is that benchmark's IPC. There's zero point in a machine-instruction-level comparison: in that race, just adding the desired number of nops to the instruction stream raises your IPC as high as you want, provided the CPU hardware is built to execute nops fast.
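The nop-padding point can be made concrete with some toy arithmetic (all numbers here are hypothetical, including the assumed 4-nops-per-cycle retire rate):

```python
# Toy illustration: padding a workload with nops inflates measured IPC
# without doing any more useful work. All figures are made up.

def ipc(instructions, cycles):
    """Instructions retired per clock cycle."""
    return instructions / cycles

# Baseline workload: 1e9 "real" instructions at an honest 2.0 IPC.
real_insns = 1_000_000_000
real_cycles = real_insns / 2.0

# Pad with 4e9 nops; assume the core retires 4 nops per cycle.
nops = 4_000_000_000
nop_cycles = nops / 4.0

padded_ipc = ipc(real_insns + nops, real_cycles + nop_cycles)
print(padded_ipc)  # ~3.33: measured "IPC" rose, useful work did not
```

A benchmark score per GHz sidesteps this, since nops add instructions but no score.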
1) Agree, pure IPC evaluation is purely academic and has little application to the real world, where raw SPECint, SPECfp, and other benchmark results are all we care about, and normalizing to GHz doesn't matter one lick. But again, I didn't make the IPC claims; someone else did. I'm a curious person, so I want to verify them, and so far I haven't been able to, and the people making them haven't backed them up with proof. That's all this is.
2) I agree that the instructions generated for SPECint exist to achieve a certain task, but there are many steps along the way that can introduce variance into the results. So when someone says SPEC/GHz is IPC, I want to see the proof. Specifically:
- Shouldn't we account for the purported (but unverified) 8-10% difference in instructions retired between the two ISAs?
- Shouldn't we consider compiler-dependent differences in results? Do we actually know how Clang/LLVM as shipped with Xcode differs from Clang/LLVM on Ubuntu on the same machine, and whether that helps or hurts? I ask because, as I understand it, Xcode compiles in a hardware/device-specific fashion to optimize the application for that device, whereas to the best of my knowledge Clang/LLVM on Ubuntu and Windows doesn't necessarily do so. Isn't this a potential source of variance?
- Specific to SPEC/GHz, shouldn't we also account for the gap between the rated boost clock and the average clock actually observed during the run?
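A quick sketch of why the last two bullets matter, with entirely hypothetical numbers, the clock you normalize by and the instruction-count gap each shift a Score/GHz figure by about the margins being debated:

```python
# Hypothetical numbers: how clock normalization and instruction-count
# differences shift a Score/GHz figure used as an IPC proxy.

score = 50.0       # made-up SPEC score
boost_ghz = 3.2    # rated boost clock
avg_ghz = 2.9      # average clock actually observed during the run

per_boost = score / boost_ghz   # ~15.6
per_avg = score / avg_ghz       # ~17.2, roughly 10% higher

print(per_boost, per_avg)

# And if one ISA retires ~10% fewer instructions for the same work,
# equal Score/GHz would still imply unequal true IPC:
insn_ratio = 0.90
print(per_avg * insn_ratio)  # the "IPC" proxy shifts by another ~10%
```

Each effect alone is around 10% here, which is the same order as the differences being claimed, so leaving them uncontrolled matters.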
This is not a comprehensive list, and again, these are just questions for those who claim there is no real difference between SPEC/GHz and IPC. Individually these differences are small, but they compound. Introduce enough small (and easily dismissed, it seems) sources of error, and we end up with the unverified (and possibly wrong) assumption that IPC scales with SPEC score.
Please understand, this is all just me being inquisitive. I get the sense that the A13's IPC lead can't be anything but substantial; I'm just curious how big it is. And that requires showing that the factors above have been controlled for, which, as best I can tell, hasn't been done.
And when people started making very specific claims ("+83%", "+80%", etc.) about how big the IPC lead is, I got excited and intrigued, thinking they actually had the proof, but it seems they don't actually have the data.