> They suggest it's converting scalar code into vector code. I assume Intel didn't have someone do that by hand (what a waste that would be).

I doubt they are doing it by hand for all the code.
> I doubt they are doing it by hand for all the code.

Who knows.
> Who knows.

But they got busted really bad. Really should've tried to buy John Poole off 😀
All I can say is that it's been proven we can't trust what Intel marketing says about this. They're being very selective.
Wow. Someone's going to get in trouble at Intel...

> Wow. Someone's going to get in trouble at Intel...

Yup, I smell a promotion.
> Yup, I smell a promotion.

Technically they're safe, because there's no change in instruction set; it's not like an AVX2-only processor becomes an AVX-512 one. They avoided it cleverly.
> They avoided it cleverly.

They are not going to feel very clever if, after seeing how much vectorization is possible, John gets to work and releases a GB 6.8 that is massively vectorized and helps AMD even more. He will want to do that to discourage Intel, because he does not want them optimizing every version and leaving him extra work before every GB release. He also does not want AMD or review sites to stop using his benchmark because it can be cheated.
> They are not going to feel very clever if, after seeing how much vectorization is possible, John gets to work and releases a GB 6.8 that is massively vectorized and helps AMD even more. He will want to do that to discourage Intel, because he does not want them optimizing every version and leaving him extra work before every GB release.

Well yeah, but I doubt anyone can outdo Intel at x86-64 software optimization. They increased the score this much just with iBOT; imagine how much more they could do if they wrote the code properly in the first place.
> They are not going to feel very clever if, after seeing how much vectorization is possible, John gets to work and releases a GB 6.8 that is massively vectorized and helps AMD even more.

He won't do that, or shouldn't. It defeats the point of a cross-platform benchmark. We know that hand-tuned code can get very nice results on wide vector engines; we've known this since Cray. But GB, to be a useful consumer benchmark, should try to reflect real-world applications and the optimizations they use. Autovec is reasonable, maybe, but real-world applications seldom use it because it often provides small gains while limiting who can run your binary.
> He won't do that, or shouldn't. It defeats the point of a cross-platform benchmark.

He could change the scoring method and split the scores into INT and FP ones, while noting that the FP ones could be massively optimized by a vendor and may not reflect real-world optimizations in applications. But seriously, Intel has opened a whole can of worms. They need to make iBOT vendor-agnostic and apply the optimizations indiscriminately to whatever CPU if they want the worms to disappear.
> Well yeah, but I doubt anyone can outdo Intel at x86-64 software optimization. They increased the score this much just with iBOT; imagine how much more they could do if they wrote the code properly in the first place.

At this point, we can assume Intel is cheating until proven otherwise.
> He already invalidated iBOT runs, which is fine and should be that way.

I think the upcoming 6.7 release is only there to get around iBOT.
> I think the upcoming 6.7 release is only there to get around iBOT.

Yeah.
They suggest it's converting scalar code into vector code. I assume Intel didn't have someone do that by hand (what a waste that would be).
> thoughts?

Currently the lack of transparency makes it hard to say how clever this is. Is it a breakthrough that will let us speed up legacy software en masse, or a cheap trick that will fade away once they get a CPU that can compete without this hassle?
I think the steps are actually something like this:
1. Lift x64 binaries to LLVM IR with something like remill.
2. Use llc to recompile this using PGO, autovec, and -march for the target architecture. Store the checksum and corresponding new binary on Intel's benchmark-busting servers for that chip family.
3. The iBOT client has a whitelist of known binary checksums for which it should fetch replacement binaries on start, on supported chip families.
4. Replace the original binary. It's actually hard for me to understand the additional start-up time even after the initial download of new binaries; it shouldn't take two seconds to load an alternate GB binary. But maybe they're doing something clever, like applying this entire process only to certain functions.
But Intel doesn't want to say what they're actually doing. I wonder why. If I'm right, all the tools to do this are open source and very general, just combined in a new way here. This would work on any chip not running natively tuned builds, as long as LLVM has reasonable architectural optimizations for it. But you would probably need to review the results of steps 1/2, which explains the whitelist.
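The whitelist lookup in steps 3 and 4 could be sketched like this. To be clear, this is purely my guess at the logic, not anything Intel has documented: `WHITELIST`, `maybe_replace`, and the digest-to-path format are all inventions for illustration.

```python
# Hypothetical sketch of steps 3-4: recognize a known benchmark binary
# by checksum and swap in a pre-optimized replacement. The whitelist
# format and function names are assumptions, not Intel's actual design.
import hashlib
import shutil
from pathlib import Path

# Assumed format: SHA-256 of the original binary -> path of the tuned build
WHITELIST = {}

def sha256_of(path: Path) -> str:
    """Checksum the binary so we only ever touch exact known builds."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def maybe_replace(binary: Path) -> bool:
    """If this exact binary is whitelisted, overwrite it with the tuned one."""
    replacement = WHITELIST.get(sha256_of(binary))
    if replacement is None:
        return False  # unknown build: leave it alone
    shutil.copyfile(replacement, binary)
    return True
```

Keying on an exact checksum rather than a file name would explain why a new Geekbench release sidesteps the whole mechanism: any rebuild changes the digest, and the client falls back to running the original binary untouched.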
> custom drivers to make their product have the best competitive edge? Yes. But is it unfair? Likely not?

In real-world apps, not unfair. It's good.
> If that's all they're doing there's no reason it couldn't be done locally since it only has to be done once. It has to be something more than that.

Step 1 can go wrong. They can fix it up. I think that's all it is.
> thoughts?

Load of horse, since they can always contribute optimizations to Geekbench directly.
View attachment 141092
> In real-world apps, not unfair. It's good.

So using the GPU driver analogy, if a GPU driver makes a GPU look better in a gaming benchmark, is that bad?
In benchmarks, bad.