I think it's very irresponsible for reviewers to include synthetic benchmarks (like memory bandwidth, ALU throughput, etc). These benchmarks are extremely sensitive to microarchitecture details, and often you'll find that the code has been optimized for some processors but not others.
Originally posted by: Fox5
More likely laziness than intent. At the time the benchmark was made, it was probably easiest to decide on optimizations (such as SSE) based on CPU identification. Perhaps all Intel CPUs sold at the same time supported SSE2, whereas AMD was limited to SSE/3DNow!, and everything else wasn't worth bothering with.
That has never been the case. The CPUID instruction returns feature bits that software is supposed to use to determine whether or not a given instruction set extension (e.g. SSE3) is supported by the processor.
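To make the feature-bit point concrete, here is a minimal sketch in C of decoding the SSE-family bits from CPUID leaf 1. The bit positions (SSE = EDX bit 25, SSE2 = EDX bit 26, SSE3 = ECX bit 0) are the documented ones; the struct and function names are made up for illustration, and the function takes the register values as arguments so it doesn't depend on any particular compiler's CPUID intrinsic.

```c
#include <stdint.h>

/* CPUID leaf 1 feature bits, as documented by Intel and AMD. */
#define CPUID1_EDX_SSE   (1u << 25)
#define CPUID1_EDX_SSE2  (1u << 26)
#define CPUID1_ECX_SSE3  (1u << 0)

/* Hypothetical container for the decoded flags (illustration only). */
typedef struct {
    int sse;
    int sse2;
    int sse3;
} simd_features;

/* Decode the ECX/EDX values returned by CPUID with EAX = 1.
 * Note that the vendor string plays no part in this decision. */
simd_features decode_leaf1(uint32_t ecx, uint32_t edx)
{
    simd_features f;
    f.sse  = (edx & CPUID1_EDX_SSE)  != 0;
    f.sse2 = (edx & CPUID1_EDX_SSE2) != 0;
    f.sse3 = (ecx & CPUID1_ECX_SSE3) != 0;
    return f;
}
```

On GCC or Clang targeting x86, the register values could come from `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` in `<cpuid.h>`; other compilers have their own intrinsics.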
That said, there is still a possibility that this wasn't malicious - the PCMark developers might have written codepaths that were optimized for each vendor's architecture regardless of the supported instructions. Each code sequence has different performance characteristics on each architecture, so it may be that AMD and Intel are given different SSE2 codepaths... or that one vendor's implementation of SSE3 doesn't buy as much speedup, so older instructions are used instead. In this case, we would see a particularly strong effect because the performance characteristics of Nano are drastically different from those of previous Via processors, and much more similar to those of AMD/Intel processors (since Nano finally supports out-of-order execution). Code optimized for narrow-issue in-order processors is not going to be optimal compared to code optimized for a wide out-of-order processor, regardless of the vendor / microarchitectural details.
This particular case may be different from the Intel compiler maliciousness that was seen a few years ago, where non-Intel CPUs never got an SSE codepath. In this case, the PCMark developers may have optimized for existing Via CPUs, but not yet optimized for Nano; in the Intel compiler's case, the code was simply deoptimized 100% whenever a non-Intel CPU was used. It's also possible that Via was getting some generic codepath that still used SSE instructions but wasn't specifically optimized for any architecture, and it turns out that Nano looks enough like the AMD/Intel microarchitectures that code optimized for either of them runs pretty fast.
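The difference between the two dispatch policies can be sketched in a few lines of C. This is purely illustrative and is not taken from PCMark or from any compiler's runtime: the `cpu_info` struct, the `codepath` enum and both function names are hypothetical. The vendor strings are the real CPUID ones ("GenuineIntel", "AuthenticAMD", "CentaurHauls" for Via).

```c
#include <string.h>

typedef enum { PATH_GENERIC, PATH_SSE2 } codepath;

/* Hypothetical summary of what CPUID reported (illustration only). */
typedef struct {
    const char *vendor;   /* 12-character CPUID vendor string */
    int has_sse2;         /* SSE2 feature bit from CPUID leaf 1 */
} cpu_info;

/* Vendor-gated dispatch: the fast path is only taken when the vendor
 * string matches, even if the CPU reports the feature. */
codepath pick_by_vendor(const cpu_info *c)
{
    if (strcmp(c->vendor, "GenuineIntel") == 0 && c->has_sse2)
        return PATH_SSE2;
    return PATH_GENERIC;
}

/* Feature-gated dispatch: any CPU that reports SSE2 gets the SSE2 path. */
codepath pick_by_feature(const cpu_info *c)
{
    return c->has_sse2 ? PATH_SSE2 : PATH_GENERIC;
}
```

With a vendor string of "CentaurHauls" and the SSE2 bit set, the first function falls back to the generic path while the second picks the SSE2 path, which is the kind of gap being discussed here.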
This is primarily a problem with "highly optimized" code - it's highly optimized for a given microarchitecture and may perform poorly on a different microarchitecture, even though the second microarchitecture may be "better" in general. K8 can trash Core 2 on a particular code sequence if that sequence of instructions is perfectly optimized for K8's available resources. It would be simple to write a small program that shows Core 2 in a good light against K8; it's probably possible to write something that even makes Via Nano look very good (Nano's FP execution latency is very low, so a dependent chain of FP operations would probably be relatively fast on Nano clock-for-clock).
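As a rough sketch of the kind of small program meant here, the two C functions below compute the same sum in two ways. The first is one long dependent chain of FP adds, so it is bound by FP add latency; the second uses four independent accumulators, so a wide out-of-order core can overlap the adds and it is bound more by throughput. The function names are made up, and no timing numbers are claimed; which CPU "wins" would depend on which loop you choose to time.

```c
#include <stddef.h>

/* Latency-bound: every add has to wait for the previous add's result.
 * Without -ffast-math the compiler may not reassociate these adds. */
double sum_dependent(const double *x, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Throughput-bound: four independent chains that can execute in parallel
 * on a core with enough FP issue width. */
double sum_independent(const double *x, size_t n)
{
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

Note that the two versions add in a different order, so for arbitrary inputs the results can differ in the last bits; they agree exactly when every partial sum is exactly representable (small integers, for instance).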
Did anybody else notice that nobody compared Nano to mainstream processors? Did Via force reviewers not to do that?