SPEC CPU 2006 : Skylake Core M vs Broadwell Core M vs Apple A9X

imported_ats · Jan 23, 2016

Thala said:
The differences are mostly in the back-end. However, there are important optimizations at the front-end and IR level, for instance IPO and loop transformations for auto-vectorization, which are identical. Also, things like aliasing hints are handled consistently by gcc. That being said, I am convinced that the gcc code generator/back-end is better optimized for x86 due to AArch64 being relatively new.

One could just as easily argue that the x86 backends are filled with significantly more legacy cruft (and hence are significantly worse/slower) than the new and shiny v8 backends.

jhu · Jan 23, 2016

Arachnotronic said:
I think you are right that console emulators would be a very good measure of CPU performance.

I think consideration should be given for the purpose of benchmarks. The way I see it, there are really only two real reasons for benchmarking: 1) to ascertain theoretical peak performance (eg how many actual FLOPs can this CPU do?) and 2) use as a proxy for performance for a particular work load (eg I need a new database server; how many transactions/sec can this system I want to buy actually do?).

Benchmarks like SPEC and Antutu are good for #1, but absolutely useless for #2. While I find benchmarks for #1 interesting, I would take more stock in benchmarks aimed at #2. Is SPEC going to tell me how smooth gameplay on Star Wars Battlefront is going to be?

72Threads · Jan 23, 2016

Intel can win half an intel-sponsored benchmark suite on an intel-sponsored site with an intel-biased editor, all the while running intel-biased compiler code, thus intel is clearly the best cpu design company. I read it on anandtech!

Nothingness · Jan 23, 2016

jhu said:
Could you supply a download link? I can't seem to find one.

SPEC isn't freely available. You can buy it; it's a few hundred dollars.

As far as icc goes, I've been testing it for years and I'm always surprised it isn't able to generate better code than gcc for any of my code or any of the code I'm interested in, none of which is vector or SIMD friendly. Completely useless as far as I'm concerned... And it's not only a SPEC compiler, it's also an AnTuTu compiler.

Exophase · Jan 23, 2016

ViRGE said:
The problem with emulators is then you've just created a proxy test for JIT compiler performance. Which is not to say that they aren't useful as an application benchmark, just that it's probably not a good architecture benchmark.

I can't really say a lot about other emulators, but that's not an accurate statement for mine. The benchmarking I have covers time spent on 2D, 3D, geometry, audio, and everything else (including CPU) by disabling those subsections. It varies a lot from game to game, but because it uses software rendering it's common for the time spent on 3D to be similar to the time spent on everything else. 2D, 3D, and geometry all use SIMD heavily (at least on ARM) but they are also pretty big testers of other things like various parts of the memory subsystem and, in places, branch prediction. The rendering also scales with core counts, up to 3 threads anyway (or 2 for 2D).

Even the "everything else" part is not entirely spent in translated code. There's time spent managing events (on DS lots of state changes happen during the screen update), switching between the two CPU cores, etc. And unlike programs on PC, DS games (and a lot of other console games) spend a significant amount of time accessing peripherals directly through memory-mapped registers. Typical numbers can be 5% of reads and 10% of writes. So how this is handled makes a big impact and is also separate from translated code.
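
As a rough illustration of the memory-mapped register point above, here is a minimal C sketch of a store handler that separates plain RAM writes from I/O register writes. It is not taken from any real emulator: the memory map (`RAM_BASE`, `IO_BASE`, and both sizes), the `machine` struct, and the `io_write32` helper are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical memory map, for illustration only. */
#define RAM_BASE 0x02000000u
#define RAM_SIZE 0x00010000u   /* 64 KiB of RAM in this sketch */
#define IO_BASE  0x04000000u
#define IO_SIZE  0x00001000u   /* 4 KiB of I/O register space */

typedef struct {
    uint8_t  ram[RAM_SIZE];
    uint32_t io_regs[IO_SIZE / 4];
    unsigned io_writes;          /* how many stores took the slow I/O path */
} machine;

/* Slow path: a real emulator would trigger side effects here
   (timers, DMA, video state changes); this sketch only records the value. */
static void io_write32(machine *m, uint32_t offset, uint32_t value)
{
    m->io_regs[offset / 4] = value;
    m->io_writes++;
}

/* Store handler for aligned 32-bit writes issued by emulated code. */
void store32(machine *m, uint32_t addr, uint32_t value)
{
    assert((addr & 3u) == 0);    /* this sketch only handles aligned stores */

    /* Unsigned subtraction wraps, so one compare checks both bounds. */
    if (addr - RAM_BASE < RAM_SIZE) {
        memcpy(&m->ram[addr - RAM_BASE], &value, sizeof value);  /* fast path */
    } else if (addr - IO_BASE < IO_SIZE) {
        io_write32(m, addr - IO_BASE, value);                    /* slow path */
    }
    /* Stores to any other region are ignored in this sketch. */
}
```

If something like 10% of writes land in the I/O range, the cost of the slow path and of the range checks in front of it is paid on every one of those stores, regardless of how good the translated code around it is.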

If I look at a profile I can generally see several dozen functions in the top 80+% of execution time, and the actual translated code and the store handler it uses generally accounts for < 20% of runtime except in very CPU heavy or pathological games. It's a major part of what the test is about but it doesn't make the whole thing a proxy for it. And "JIT performance" itself isn't really a single thing; part of it will be instrumenting the emulator's translation facilities, but another part of it will be instrumenting the behavior of whatever it's actually emulating, which is highly variable.

On the other hand, I would guess that the Dolphin Povray test currently done on AT is a lot closer to spending most of its time in translated code since I doubt much is going on with 3D or audio there, and the stuff it's running is probably a much simpler/more limited test case than a real game.

ViRGE · Jan 23, 2016

72Threads said:
Intel can win half an intel-sponsored benchmark suite

Uhh...

Current SPEC members:
Acer Inc.
Action S.A.
Advanced Micro Devices
Amazon Web Services, Inc.
Apple Inc.
ARM
ASUSTeK Computer Inc.
Avere Systems
Bull S.A.
Cavium Inc.
Ciara Technologies Inc.
Cisco Systems, Inc.
Dell, Inc.
E4 Computer Engineering SPA
EMC
Fujitsu
Gartner, Inc.
Google, Inc.
Hitachi Data Systems
Hitachi Ltd.
Hewlett Packard Enterprise
HP Inc.
Huawei Technologies Co. Ltd.
IBM
Inspur Corporation
Intel
Lenovo
Micron Technology, Inc.
Microsoft
NEC - Japan
NetApp
NVIDIA
Oracle
Panasas
Pathscale
Primary Data
Principled Technologies
Qualcomm Technologies Inc.
Quanta Computer Inc.
Red Hat
Samsung
SAP AG
Seagate
SGI
Sugon
Super Micro Computer, Inc.
SUSE
Symantec Corporation
Unisys
Via Technologies
VMware
Wipro Ltd.
ZTE Corporation

Exophase · Jan 23, 2016

What does a SPEC member mean, that they get to vote on what's included in the suite? How many of those members were around when SPEC2006 was being decided? What facilities, if any, does SPEC have to remove a test that's too easy to break? Do they actually care? My impression is that they don't.

I guess we'll see what the next SPEC looks like. It's long overdue compared to the previous release cadence. AFAIK they were actually going to release a mobile flavored variant.

ViRGE · Jan 23, 2016

Exophase said:
What does a SPEC member mean, that they get to vote on what's included in the suite?

Checking their policies, yes. It looks like a pretty typical committee structure; members participate in working groups and committees to develop benchmarks, and every participant gets a vote. Meanwhile even non-participating members get a say during the review and comment process.

How many of those members were around when SPEC2006 was being decided?

A quick check of the Internet Archive says that all of the majors except ARM were members in 2006, including Apple. The organization is older than about half the posters here (went NPO in the 80s), so some of these guys would have been members for a very long time.

What facilities, if any, does SPEC have to remove a test that's too easy to break? Do they actually care? My impression is that they don't.

They have a retirement process; ultimately it seems to hinge on the Open Systems Steering Committee agreeing that certain parameters (what they call a checklist) have been met. As for whether they care, their policies would seem to indicate that if they don't care they shouldn't be serving on the committee.

Vesku · Jan 23, 2016

Intel does optimize heavily for Spec performance, regardless of who is on the committee. Trading blows with an upstart Apple chip is not a great place for them to be.

Zodiark1593 · Jan 24, 2016

While it's easy enough to obtain the clockspeeds of the Macbook and the ultrabooks used in Anandtech's test, what about the clockspeed of the A9X during the run? Did it (the A9X) maintain its base speed throughout, or did it too throttle back, and if so, by how much?

Arachnotronic · Jan 24, 2016

Vesku said:
Intel does optimize heavily for Spec performance, regardless of who is on the committee. Trading blows with an upstart Apple chip is not a great place for them to be.

There's nothing "upstart" about Apple's chip efforts...

Vesku · Jan 24, 2016

Arachnotronic said:
There's nothing "upstart" about Apple's chip efforts...

Only being around for a few years vs 30+. Competing head to head with Intel in performance after just a few generations of custom ARM SoCs.

Exophase · Jan 24, 2016

Vesku said:
Only being around for a few years vs 30+. Competing head to head with Intel in performance after just a few generations of custom ARM SoCs.

But that doesn't really fall into the equation here. Apple simply never had and probably still doesn't have the incentive to optimize for SPEC. Intel does, because Intel is (and especially has been in the past) selling in server spaces where they're competing against vendors with different archs and SPEC scores have always been a major metric in advertising to customers.

Apple on the other hand is selling in a place where almost no one cares about SPEC and where there are several other more relevant factors that people do care about. Maybe now that Anandtech has dragged out this SPECint2006 comparison with iPad Pro they'll care slightly more, but I doubt it.

When push comes to shove, the engineering efforts Apple would need to maintain a fork of llvm/Clang that implements whatever style benchmarking breakers Intel uses just for the libquantum and maybe hmmer tests are probably not that insurmountable. If they cared I'm sure it could have been done years ago and incorporated into Xcode. This isn't to say they'd now have a compiler that does everything ICC does and just as well for all cases, I'm just talking about a few hacks to heavily improve a couple of very specific benchmarks.

Vesku · Jan 24, 2016

Exophase said:
But that doesn't really fall into the equation here. Apple simply never had and probably still doesn't have the incentive to optimize for SPEC. Intel does, because Intel is (and especially has been in the past) selling in server spaces where they're competing against vendors with different archs and SPEC scores have always been a major metric in advertising to customers.

Apple on the other hand is selling in a place where almost no one cares about SPEC and where there are several other more relevant factors that people do care about. Maybe now that Anandtech has dragged out this SPECint2006 comparison with iPad Pro they'll care slightly more, but I doubt it.

When push comes to shove, the engineering efforts Apple would need to maintain a fork of llvm/Clang that implements whatever style benchmarking breakers Intel uses just for the libquantum and maybe hmmer tests are probably not that insurmountable. If they cared I'm sure it could have been done years ago and incorporated into Xcode. This isn't to say they'd now have a compiler that does everything ICC does and just as well for all cases, I'm just talking about a few hacks to heavily improve a couple of very specific benchmarks.

I've been agreeing. It's not "Oh look, Apple's chip only wins half in Spec 2006." Instead it's "Wow, Apple's chip wins half in Spec 2006, a benchsuite that Intel puts a lot of time and effort into."

ShintaiDK · Jan 24, 2016

Vesku said:
Only being around for a few years vs 30+. Competing head to head with Intel in performance after just a few generations of custom ARM SoCs.

Remember it's a 147 mm² chip as well, with double the memory bandwidth and bigger caches that should favour it. Yet Skylake is still in front. And the Skylake in question may even be lower TDP.

imported_ats · Jan 24, 2016

Vesku said:
Only being around for a few years vs 30+. Competing head to head with Intel in performance after just a few generations of custom ARM SoCs.

Apple's design team is staffed with numerous veterans from DEC/Alpha along with other storied design teams/houses. Apple isn't really an upstart by any meaning of the term. They purchased a very well respected design team started by Dan Dobberpuhl back before they made CPUs and have added to it since then. They have acquired many of the best and brightest from numerous design teams over the years. It would honestly be quite shocking if they weren't competitive with Intel.

imported_ats · Jan 24, 2016

Exophase said:
When push comes to shove, the engineering efforts Apple would need to maintain a fork of llvm/Clang that implements whatever style benchmarking breakers Intel uses just for the libquantum and maybe hmmer tests are probably not that insurmountable. If they cared I'm sure it could have been done years ago and incorporated into Xcode. This isn't to say they'd now have a compiler that does everything ICC does and just as well for all cases, I'm just talking about a few hacks to heavily improve a couple of very specific benchmarks.

There's actually a presentation out there of exactly what LLVM/Clang need. They really aren't benchmark breakers and have real general purpose value. The major thing that LLVM/Clang lacks wrt libquantum is AoS<>SoA conversion. That conversion capability is what opens up all the parallelism in libquantum and applies to whole classes of code as well. Many of the things that ICC is doing are just very complex transforms of code to unlock/unblock things (another one is loop fusion, which merges multiple loops into one loop).
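
To make the AoS<>SoA point concrete, here is a small C sketch of the same loop written against an array-of-structures layout and a structure-of-arrays layout. It is only loosely modelled on a quantum-register style workload; the type names (`node_aos`, `reg_soa`) and functions (`aos_to_soa`, `cnot_aos`, `cnot_soa`) are hypothetical and are not libquantum's or ICC's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Array of structures (AoS): each element interleaves an amplitude with a
   state word, so a loop that only touches `state` skips over the amplitudes. */
typedef struct {
    float    amp_re;
    float    amp_im;
    uint64_t state;
} node_aos;

/* Structure of arrays (SoA): each field is stored contiguously. */
typedef struct {
    float    *amp_re;
    float    *amp_im;
    uint64_t *state;
    size_t    n;
} reg_soa;

/* Copy an AoS register into caller-provided SoA arrays of length n. */
void aos_to_soa(const node_aos *nodes, size_t n, reg_soa *out)
{
    for (size_t i = 0; i < n; i++) {
        out->amp_re[i] = nodes[i].amp_re;
        out->amp_im[i] = nodes[i].amp_im;
        out->state[i]  = nodes[i].state;
    }
    out->n = n;
}

/* AoS version: flip bit `target` wherever bit `control` is set. */
void cnot_aos(node_aos *nodes, size_t n, unsigned control, unsigned target)
{
    assert(control < 64 && target < 64);
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].state & (UINT64_C(1) << control))
            nodes[i].state ^= UINT64_C(1) << target;
    }
}

/* SoA version: unit-stride and branch-free, a shape that auto-vectorizers
   generally handle more readily than the strided AoS loop. */
void cnot_soa(reg_soa *r, unsigned control, unsigned target)
{
    assert(control < 64 && target < 64);
    uint64_t *s = r->state;
    for (size_t i = 0; i < r->n; i++) {
        uint64_t ctrl = (s[i] >> control) & 1u;  /* 1 if control bit is set */
        s[i] ^= ctrl << target;
    }
}
```

A compiler that performs the layout change automatically has to prove that nothing else in the program depends on the original struct layout, which is part of why this kind of transform typically needs whole-program (IPO) visibility.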

Thala · Jan 24, 2016

Remember it's a 147 mm² chip as well, with double the memory bandwidth and bigger caches that should favour it. Yet Skylake is still in front.

You still do not get it. It is not Skylake which is in front, but ICC with the Intel provided compiler options which are used in this particular test. Did you skip the whole thread?
Or are you expecting the A9X running scalar code to beat Skylake running parallel code due to bigger caches?

ShintaiDK · Jan 24, 2016

Thala said:
You still do not get it. It is not Skylake which is in front, but ICC with the Intel provided compiler options which are used in this particular test. Did you skip the whole thread?

Let me guess, GeekBench is the perfect test suite? :sneaky:

Thala · Jan 24, 2016

Let me guess, GeekBench is the perfect test suite?

And again, it is not necessarily the test suite which is lacking, but how things are compiled. I have no particular argument against the SPEC suite itself.
Geekbench is by no means perfect. But at least they do not do this blunder of comparing parallel vs. non-parallel code or allowing one compiler to use ipo (e.g. building a database of all compilation units to drive inlining) or allowing one compiler to get aliasing hints.
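
For readers unfamiliar with what an "aliasing hint" buys a compiler, here is a minimal C99 sketch using the standard `restrict` qualifier. The function names are made up for the example, and this is only one form such a hint can take; it is not a claim about which hints were used in the benchmark run discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* No hint: the compiler must assume dst may overlap src or scale, so a store
   to dst[i] could change *scale. It has to reload *scale every iteration or
   emit runtime overlap checks before vectorizing. */
void scale_add(float *dst, const float *src, const float *scale, size_t n)
{
    assert(n == 0 || (dst && src && scale));
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * *scale;
}

/* With restrict the caller promises the three pointers do not alias, so the
   compiler is free to keep *scale in a register and vectorize the loop.
   Calling this with overlapping arrays is undefined behaviour. */
void scale_add_restrict(float *restrict dst, const float *restrict src,
                        const float *restrict scale, size_t n)
{
    assert(n == 0 || (dst && src && scale));
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * *scale;
}
```

Both functions compute the same result on non-overlapping inputs; the difference is only in what the optimizer is allowed to assume, which is why giving such hints to one compiler and not the other skews a comparison.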

What makes SPEC itself problematic is that ICC applies optimizations when given the SPEC code that it typically does not apply when given your own code, which apparently deviates from SPEC. This is at least an indication that ICC is optimized based on code found in SPEC in the first place. For my typical use cases there is not much difference to gcc, but it beats gcc easily when running SPEC.

ShintaiDK · Jan 24, 2016

If you think SPEC is bad, GB is 10x worse. So what's the next bench?

Thala · Jan 24, 2016

If you think SPEC is bad, GB is 10x worse. So what's the next bench?

How come you manage to misinterpret me, despite me having mentioned several times in this very thread that SPEC is fine if you use the same compiler with same options?
At least we currently have no indication that something with SPEC is flawed to the same extent that Antutu was flawed.

stuff_me_good · Jan 24, 2016

Thala said:
And again, it is not necessarily the test suite which is lacking, but how things are compiled. I have no particular argument against the SPEC suite itself.
Geekbench is by no means perfect. But at least they do not do this blunder of comparing parallel vs. non-parallel code or allowing one compiler to use ipo (e.g. building a database of all compilation units to drive inlining) or allowing one compiler to get aliasing hints.

What makes SPEC itself problematic is that ICC applies optimizations when given the SPEC code that it typically does not apply when given your own code, which apparently deviates from SPEC. This is at least an indication that ICC is optimized based on code found in SPEC in the first place. For my typical use cases there is not much difference to gcc, but it beats gcc easily when running SPEC.

Don't worry mate, I hear you. Some people just don't know how to use their brain. Don't bother to feed the Intel biased trolls.

Arachnotronic · Jan 24, 2016

imported_ats said:
Apple's design team is staffed with numerous veterans from DEC/Alpha along with other storied design teams/houses. Apple isn't really an upstart by any meaning of the term. They purchased a very well respected design team started by Dan Dobberpuhl back before they made CPUs and have added to it since then. They have acquired many of the best and brightest from numerous design teams over the years. It would honestly be quite shocking if they weren't competitive with Intel.

Good post because it's 100% true. AFAICT, Apple raided AMD, they got a lot of Intel's Atom folks (Silvermont/Airmont lead architect for example) and even some big shots on Core (former Intel Fellow Per Hammarlund, Haswell lead architect), nabbed a bunch of IBM guys, etc.

People like to see Apple's competitiveness as a sign of weakness for Intel when they should be viewing it as Apple doing an excellent job.

PPB · Jan 24, 2016

ShintaiDK said:
Let me guess, GeekBench is the perfect test suite? :sneaky:

This is the problem with fanboys who are also totally clueless about the topic discussed. They think the guy in front of them is also a fanboy and needlessly bring the perceived "competitor" (from their biased POV) into the discussion when it is not relevant, and thus no point is made by mentioning it at all.

Geekbench has some bad tests, but ICC is specifically targeting the SPEC tests' code. Where we come from, we call that breaking the test. It is also disturbing that ICC is used more on benchmarking suites than on the real-world software whose workloads they should represent. The undeniable conclusion is that people should avoid SPEC2006 for benchmarking when it involves Intel processors.

Arachnotronic said:
Good post because it's 100% true. AFAICT, Apple raided AMD, they got a lot of Intel's Atom folks (Silvermont/Airmont lead architect for example) and even some big shots on Core (former Intel Fellow Per Hammarlund, Haswell lead architect), nabbed a bunch of IBM guys, etc.

People like to see Apple's competitiveness as a sign of weakness for Intel when they should be viewing it as Apple doing an excellent job.

Intel is doing something wrong if they let any company catch up to their performance this quickly, because Intel is no small company and can also grab all the talent they want with their deep pockets. This also shows that Intel's internal competition between their two bigger design teams isn't really useful if they can only bring 5-7% (averaged between ticks and tocks) IPC increases between new CPUs while having sloppy developments like removing FIVR from Skylake and bringing it back for Cannonlake, or linking Core and Uncore speeds on Sandy Bridge and rolling that back from Haswell onwards.

CPU design is a much bigger commitment for Intel than it is for Apple, at least right now. If Apple starts being ahead of Intel in performance and perf/watt with their Ax SoCs, they will see fewer and fewer reasons to even use Intel's products. As deep as their pockets are for bringing in CPU talent, they are just as deep for funding a fast transition to an ARM-based OS X ecosystem.