Apple A12 benchmarks

french toast · May 7, 2018

Andrei also thinks it's pretty decent; I happen to think it is the best mobile benchmark.
I read a good article about a month ago explaining why it performs so well on Apple compared to the rest. I'll try to dig it up.

Headfoot · May 7, 2018

I don't doubt it's a fine benchmark, but since when is literally just 1 benchmark sufficient... seems like the Apple reality distortion field is in effect. Readers of AnandTech demand a broad spectrum of benchmarks for literally every other product...

Thala · May 7, 2018

Headfoot said:
I don't doubt it's a fine benchmark, but since when is literally just 1 benchmark sufficient... seems like the Apple reality distortion field is in effect. Readers of AnandTech demand a broad spectrum of benchmarks for literally every other product...

That is why Geekbench is already a collection of workloads/benchmarks. It is not just a single kernel.

french toast said:
Andrei also thinks it's pretty decent, I happen to think it is the best mobile benchmark.

It is not a "mobile benchmark" but a "cross-platform-benchmark". This holds in particular since the data-set-size unification.

Different results with the same CPU are related to the compiler toolchain:

Platform    Compiler
Android     Clang 3.8 NDK r12
iOS         Xcode 7.3
Linux       Clang 3.8
OS X        Xcode 7.3
Windows     Visual Studio 2015
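
Since the toolchain differs per platform, one way to keep cross-platform results honest is to tag every result with the compiler and OS that produced the binary. The following is only a sketch using the compilers' predefined macros; `compiler_tag`, `platform_tag`, and `build_tag` are hypothetical helper names, not anything Geekbench itself exposes.

```cpp
#include <string>

// Hypothetical helper: names the compiler that built this binary.
// clang is checked first because clang-cl also defines _MSC_VER.
std::string compiler_tag() {
#if defined(__clang__)
    return "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__);
#elif defined(_MSC_VER)
    return "msvc " + std::to_string(_MSC_VER);
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__);
#else
    return "unknown-compiler";
#endif
}

// Hypothetical helper: names the target OS.
// Android is checked before Linux because Android toolchains define both.
std::string platform_tag() {
#if defined(_WIN32)
    return "windows";
#elif defined(__ANDROID__)
    return "android";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown-os";
#endif
}

// Combined tag, e.g. to store next to a score.
std::string build_tag() { return platform_tag() + "/" + compiler_tag(); }
```

Two scores would then only be compared directly when their `build_tag()` strings match, and treated as "whole software stack" comparisons otherwise.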

HurleyBird · May 7, 2018

ksec said:
Sigh.

I think you should go and read up on what each GB test does before you make a judgement.

Yes, GB is a collection of tests. That puts some kind of limit onto how ridiculously you can optimise for it, but do you think that somehow prevents mobile SOC designers from running a cost/benefit analysis on individual components? Of course not.

Again, look at 3DMark. It's not testing one single thing like polygon throughput or texturing performance. It hits the entire rendering pipeline. In fact, it's far easier to design a generally applicable GPU benchmark than it is a CPU benchmark since all rendering loads approximating a 3D video game have a huge amount of commonality while CPU loads can be basically anything.

In an alternate world where consumers judge the speed of GPUs almost entirely on 3DMark scores, do you think that 3DMark hitting the entire rendering pipeline would prevent GPU vendors from abusing a lack of benchmark variety? Obviously it wouldn't. We know so from history -- just look at R600 which was only incidentally a 3DMark monster while sucking fumes everywhere else. If vendors actually set out to specifically target 3DMark, we'd likely see GPUs that are, conservatively, over twice as fast in it than they are currently. And it would be at the expense of real world performance.

I'm not saying that Apple actually does this. Just that in the current climate it would be smart to, and there isn't anything preventing them. The real question may not be "if" but "to what degree," since the thought of Apple/Samsung/Qualcomm flat out ignoring GB is fairly ludicrous. Without an actual suite of benchmarks it's hard to know. If it looks too good to be true though...

Eug · May 7, 2018

HurleyBird said:
I'm not saying that Apple actually does this. Just that in the current climate it would be smart to, and there isn't anything preventing them. The real question may not be "if" but "to what degree," since the thought of Apple/Samsung/Qualcomm flat out ignoring GB is fairly ludicrous. Without an actual suite of benchmarks it's hard to know. If it looks too good to be true though...

I think Apple flat out ignores Geekbench, at least as far as marketing is concerned.

My main issue with Geekbench is the tests are relatively short, so they may emphasize Turbo Boost more than perhaps they should. So in this case some might wonder if Intel is actually getting an advantage here.
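
The short-test concern can be checked with a rough experiment: run the same fixed chunk of work back to back and compare the throughput of the first window with the later ones. This is only a sketch; the xorshift kernel, the window sizes, and all function names are assumptions chosen for illustration, not how Geekbench measures anything.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical integer kernel; any fixed chunk of work would do.
std::uint64_t spin_work(std::uint64_t iters, std::uint64_t seed) {
    std::uint64_t x = seed | 1;  // xorshift state must be non-zero
    for (std::uint64_t i = 0; i < iters; ++i) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

// Runs the kernel `windows` times back to back and records
// iterations per microsecond for each window.
std::vector<double> throughput_per_window(int windows, std::uint64_t iters) {
    using clock = std::chrono::steady_clock;
    std::vector<double> out;
    volatile std::uint64_t sink = 0;  // keeps the work from being optimized away
    for (int w = 0; w < windows; ++w) {
        const auto t0 = clock::now();
        sink = spin_work(iters, sink + static_cast<std::uint64_t>(w));
        const auto t1 = clock::now();
        const double us =
            std::chrono::duration<double, std::micro>(t1 - t0).count();
        out.push_back(us > 0.0 ? static_cast<double>(iters) / us : 0.0);
    }
    return out;
}

// First-window throughput divided by the mean of the second half of the run.
// Values well above 1 suggest the early windows ran at a higher clock.
double burst_to_sustained(const std::vector<double>& tp) {
    if (tp.size() < 2) return 1.0;
    const std::size_t start = tp.size() / 2;
    double sum = 0.0;
    for (std::size_t i = start; i < tp.size(); ++i) sum += tp[i];
    const double sustained = sum / static_cast<double>(tp.size() - start);
    return sustained > 0.0 ? tp.front() / sustained : 0.0;
}
```

To see anything meaningful the windows would need to add up to tens of seconds, e.g. `burst_to_sustained(throughput_per_window(60, 500000000))`, and the first window can also be skewed by cold caches or by the clock still ramping up.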

french toast · May 7, 2018

Antutu is also a collection of tests, I'm just saying... Most people coming to these forums are well rounded enough to know that a complex component like a CPU really needs to be judged over a wide selection of apps testing all of its features and capabilities. For instance, how many AVX2 workloads are in Geekbench? How long are they, and how much data do they use outside of L2? How representative are these tests of real-world work?
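
The "how much data outside of L2" question can at least be probed from the outside by timing the same loop over growing working sets. A minimal sketch follows, with hypothetical names (`SweepPoint`, `time_working_set`); it reads sequentially, so it mostly shows bandwidth steps, and a pointer-chasing loop would be needed to expose latency.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct SweepPoint {
    std::size_t bytes;        // working-set size that was swept
    double ns_per_element;    // average time per 32-bit element read
    std::uint64_t checksum;   // keeps the reads observable
};

// Sums a buffer of the given size `passes` times and reports the
// average cost per element.
SweepPoint time_working_set(std::size_t bytes, int passes) {
    assert(bytes >= sizeof(std::uint32_t) && passes > 0);
    std::vector<std::uint32_t> buf(bytes / sizeof(std::uint32_t));
    std::iota(buf.begin(), buf.end(), 0u);  // fill with 0, 1, 2, ...

    std::uint64_t sum = 0;
    const auto t0 = std::chrono::steady_clock::now();
    for (int p = 0; p < passes; ++p)
        for (std::uint32_t v : buf) sum += v;
    const auto t1 = std::chrono::steady_clock::now();

    const double ns =
        std::chrono::duration<double, std::nano>(t1 - t0).count();
    return {bytes, ns / (static_cast<double>(passes) * buf.size()), sum};
}
```

Calling it for sizes from a few KiB up to a few hundred MiB and plotting `ns_per_element` should show steps near each cache capacity, assuming the compiler does not collapse the repeated passes.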

Geekbench is considered a mobile benchmark, even if it does have validity imo. You don't always see Geekbench in PC reviews, unlike benchmarks like Cinebench, Corona, SPEC int/fp, Dolphin, gaming tests/draw calls, etc.
Geekbench 4 is legit, especially when using it to compare generations of the same product or different products on the same compiler, but across different architectures and OSes?
I'm not convinced. I just wish I could find that article I read.

Thala · May 7, 2018

HurleyBird said:
Yes, GB is a collection of tests....

Wow, just wow. Geekbench is actually one of the better tests for x86. It can get worse.

Example:

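#include <windows.h>

// Shared counter that the two threads hand back and forth.
volatile __int64 var = 0;

// Thread body. *pparam is this thread's starting value
// (presumably 0 for one thread and 1 for the other).
DWORD _cdecl worker(void* pparam) {
    __int64 param = *(__int64*)pparam;
    while (param < 0x20000000)
    {
        while (var != param);   // spin until it is this thread's turn
        var = param + 1;        // hand the turn to the other thread
        param = param + 2;      // this thread's next turn
    }
    return 0;
}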

That's integer code, right? I am running this code on 2 cores, so it's even multithreaded integer code.

I am compiling the code above as a 64-bit release build for both ARM64 and x64 using Visual Studio 2017.

My i7-6700K @ 4 GHz finishes the code in about 25 seconds.
My Snapdragon 835 (Cortex A73) @ 2.5 GHz finishes the code in about 12 seconds.

In both cases 2 cores are loaded 100%.

So can we now conclude that Qualcomm knows my code and is specifically optimizing for it, because the result is too good to be true?
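
For anyone who wants to try this outside Visual Studio, here is a portable restatement of the same loop. It is a sketch under two assumptions not spelled out in the post: the two threads start at 0 and 1, and `std::atomic` is swapped in for the `volatile __int64` so that the hand-off is well defined in standard C++. `run_ping_pong` is a hypothetical name.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Two threads alternate incrementing a shared counter until it reaches
// `limit`. Each thread only advances when the counter equals its own
// expected value, so the cache line holding `var` bounces between cores.
std::int64_t run_ping_pong(std::int64_t limit) {
    std::atomic<std::int64_t> var{0};

    auto worker = [&var, limit](std::int64_t param) {
        while (param < limit) {
            // Spin until it is this thread's turn.
            while (var.load(std::memory_order_acquire) != param) {
            }
            // Hand the turn to the other thread.
            var.store(param + 1, std::memory_order_release);
            param += 2;
        }
    };

    std::thread a(worker, 0);  // takes the even values
    std::thread b(worker, 1);  // takes the odd values
    a.join();
    b.join();
    return var.load();  // ends at `limit` once both threads are done
}
```

Wrapping `run_ping_pong(0x20000000)` in two `std::chrono::steady_clock::now()` calls gives a comparable wall-clock figure. The result measures core-to-core hand-off latency far more than integer throughput.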

HurleyBird · May 7, 2018

So you can create micro benchmarks that are more performant on a specific architecture? Wow. News at 11.

Yes, you can find a scenario that is good/bad for a random x86 core and/or compiler. And yes, you can do the same for some random ARM core. This literally has nothing to do with anything. It's totally orthogonal to what we're discussing.

GeekBench isn't a micro benchmark. It is entirely possible to optimise towards a benchmark even when it tries to approximate general purpose usage. It's also possible for an architecture to perform abnormally in a general purpose benchmark entirely incidentally. No one is saying that Apple's GB scores must be untrue only because they look too good, just that a high degree of scepticism is warranted when you have such a narrow scope of benchmarks combined with a high amount of platform noise.

Thala · May 7, 2018

HurleyBird said:
Yes, you can find a scenario that is good/bad for a random x86 core and/or compiler. And yes, you can do the same for some random ARM core.

Example please, or just blowing hot air?

Oh, and this is very much on topic. If someone optimized for a particular benchmark, then most other code, which is not contained in the benchmark, needs to perform worse.

So according to Geekbench the i7-6700K is about 3 times faster than the Cortex A73. If Qualcomm has specifically optimized for Geekbench most other code needs to have a bigger factor than 3 in favor of the i7-6700K.
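
The check described here can be written down as a small calculation: take the speed ratio the suite reports, take the ratios measured on other code, and see whether the others are systematically larger. This is a sketch only; the function names are hypothetical, the geometric mean is one common way to average ratios rather than anything the post prescribes, and any numbers fed in are placeholders unless actually measured.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of a set of speed ratios (fast chip time / slow chip time).
double geometric_mean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios) {
        assert(r > 0.0);
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}

// True when other code shows a bigger gap than the suite does, which is
// what suite-specific tuning of the slower chip would look like.
bool suite_flatters_slower_chip(double suite_ratio,
                                const std::vector<double>& other_ratios) {
    return geometric_mean(other_ratios) > suite_ratio;
}
```

For example, `suite_flatters_slower_chip(3.0, {2.5, 3.0, 2.8})` comes out false with these illustrative ratios, which would mean the suite is not making the slower chip look better than other code does.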

IntelUser2000 · May 7, 2018

oak8292 said:
How much power or IPC improvement could Intel make if they ditched 32 bit compatibility? How much of a drag is the ability to run the 10 year old legacy software that is a requirement of so many Windows users?

This argument about whether the x86 ISA is a hindrance or not pops up every now and then. Regardless of the ISA the fact of the matter is that Apple may just have a team that's better at executing than anyone else. This is true in all aspects of life. Also the total market Apple participates in is many times bigger than x86. The sheer amount of development may have enabled a CPU uArch winner to arise.*

Intel makes $30 billion a year just selling x86 PC chips. Various vendors' attempts to enter the space have proven that the market for Intel-based PCs values compatibility. Intel has its own niche, an entrenched space no one can enter because of this. There's no reason they should give this up for a performance advantage of maybe 20% (a made-up high number).

Apple, and the other mobile OS vendors carved up their own niche. And despite the years of x86 vs rest of the world mantra, the two markets are in co-existence. And I think they'll continue to do so for a while even if Intel falls completely flat on its face. And AMD will do their part too.

*Similarly, in the memory world, Micron admits that despite years of DRAM replacements being proposed, the fact that the vast majority of real-world deployments are based on DRAM has kept so-called "DRAM-killers" in check.

In lithography, it's similar with EUV. EUV is technically superior, but the traditional DUV isn't standing still, and many refinements kept EUV from being a real advantage over DUV.

HurleyBird · May 7, 2018

Thala said:
Example please, or just blowing hot air?

I have way better uses of my time than to create a bunch of micro benchmarks and test them on various platforms to appease some guy on a forum, but this should be self evident and obvious to just about everyone. Fitting or not fitting within a certain cache size and the degree and type of branchiness alone has an enormous impact. Do you actually believe that you can't find the reverse of your example given that the 6700K is substantially faster than the 835 in practically every real world metric?
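
To make the branchiness point concrete, here is a sketch of the same reduction written with and without a data-dependent branch. Both functions are hypothetical illustrations and return identical results; how far apart they run depends on the input pattern, the core's branch predictor, and whether the compiler already turns the branchy version into a conditional move or vector code.

```cpp
#include <cstdint>
#include <vector>

// Sums the elements that are >= threshold using a data-dependent branch.
// On random input this branch is hard to predict.
std::uint64_t sum_branchy(const std::vector<std::uint8_t>& data,
                          std::uint8_t threshold) {
    std::uint64_t sum = 0;
    for (std::uint8_t v : data) {
        if (v >= threshold) sum += v;
    }
    return sum;
}

// Same result, but the comparison is used as a 0/1 multiplier,
// so there is no branch in the loop body to mispredict.
std::uint64_t sum_branchless(const std::vector<std::uint8_t>& data,
                             std::uint8_t threshold) {
    std::uint64_t sum = 0;
    for (std::uint8_t v : data) {
        sum += static_cast<std::uint64_t>(v) *
               static_cast<std::uint64_t>(v >= threshold);
    }
    return sum;
}
```

Timing both on sorted versus shuffled input of the same size is the usual way to see how much a given core depends on branch prediction for this kind of loop.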

Thala · May 7, 2018

HurleyBird said:
I have way better uses of my time than to create a bunch of micro benchmarks and test them on various platforms to appease some guy on a forum, but this should be self evident and obvious to just about everyone.

So hot air... nothing to support your conspiracy theories...

HurleyBird · May 7, 2018

/sigh

IntelUser2000 · May 7, 2018

Actually, there is a case of a vendor optimizing for Geekbench, and it working quite well: Samsung's own Exynos 9810. Of course that argument doesn't work for Apple, because their chips perform well in other benchmarks too, unlike the Exynos.

It may be possible that mobile CPUs are built to benefit the bursty load nature of mobile usage. But at this point I think it's quite undeniable that the uarch leader is now Apple. Good CPUs are faster not only in the real world but also in synthetics, and go past any driver/compiler optimizations. ATI used to keep falling behind Nvidia when the latter would come out with optimized drivers. Then ATI introduced the R300 and was so far ahead that no amount of driver changes would put it in favor of Nvidia.

Thala · May 7, 2018

HurleyBird said:
Do you actually believe that you can't find the reverse of your example given that the 6700K is substantially faster than the 835 in practically every real world metric?

I do not know of any particular weakness of the Cortex A73 similar to the example above. That's why I was asking you for examples; it would have been sufficient to just describe them.

Most other code (real-world code, mind you, not artificial examples like the one above) I ever compiled for both architectures supports that a factor of 3 for integer code is pretty much spot on. For compute-bound code, more often than not the factor is closer to 2.5 and almost never larger than 3. (I do not take SSE code paths into consideration when I do not have NEON available.)
So currently I cannot find any evidence that Geekbench is overly optimistic for ARM due to vendor optimizations, and from the looks of it you cannot provide evidence either.

Besides, my example from above tests the performance of the cache coherency protocol implementation, which might be interesting for some multithreaded algorithms with fine-grained communication. Such a test is apparently not part of Geekbench.

What Samsung did with the Exynos 9810 needs some further analysis, though.

geoxile · May 7, 2018

Any 3D mark physics tests?

Systems analyst · May 8, 2018

ARM have stated that it is common practice to run the SoC design in emulation, with not only the o/s, but also applications, to check for a good design with the intended workload. In the light of this, it is stretching credulity to think that a SoC designer would design to a benchmark, rather than the real-world applications.

ksec · May 8, 2018

Eug said:
I know nothing about programming this stuff, but I found it interesting that even Linus doesn't dislike Geekbench 4, and he hates everything!

https://www.realworldtech.com/forum/?threadid=159853&curpostid=159862

Seriously though, Linus thought Geekbench 3 was utter garbage, but said GB4 is much better.

Because not only does he write code, good code, but also very low-level code, AND he worked at a CPU company in his career (Transmeta). And he knows what GB4 is testing, and that it provides a very solid overview of how good the "basics" of an ISA, or an implementation of that ISA, can get.

And if someone thinks CPU designers have time to design specifically for one set of benchmarks, then I really have nothing more to say.

Nothingness · May 8, 2018

HurleyBird said:
Yes, GB is a collection of tests. That puts some kind of limit onto how ridiculously you can optimise for it, but do you think that somehow prevents mobile SOC designers from running a cost/benefit analysis on individual components? Of course not.

Again, look at 3DMark. It's not testing one single thing like polygon throughput or texturing performance. It hits the entire rendering pipeline. In fact, it's far easier to design a generally applicable GPU benchmark than it is a CPU benchmark since all rendering loads approximating a 3D video game have a huge amount of commonality while CPU loads can be basically anything.

In an alternate world where consumers judge the speed of GPUs almost entirely on 3DMark scores, do you think that 3DMark hitting the entire rendering pipeline would prevent GPU vendors from abusing a lack of benchmark variety? Obviously it wouldn't. We know so from history -- just look at R600 which was only incidentally a 3DMark monster while sucking fumes everywhere else. If vendors actually set out to specifically target 3DMark, we'd likely see GPUs that are, conservatively, over twice as fast in it than they are currently. And it would be at the expense of real world performance.

I'm not saying that Apple actually does this. Just that in the current climate it would be smart to, and there isn't anything preventing them. The real question may not be "if" but "to what degree," since the thought of Apple/Samsung/Qualcomm flat out ignoring GB is fairly ludicrous. Without an actual suite of benchmarks it's hard to know. If it looks too good to be true though...

There's a big difference here: GPU benchmarks always go through a software driver that can do a lot of tweaks if it detects a benchmark. Geekbench is a binary benchmark that directly runs on the CPU so the only tricks that can be applied are increased frequency and specifically tuned libraries (though only two or three of the tests are really using libraries).

This does not mean that CPU designers are not looking at Geekbench, but they certainly are not only looking at it, and you can bet most, if not all, tweaks made to improve GB speed will also benefit other software.

Thala said:
What Samsung did with Exynos 9810 needs some further analysis though

I guess it's an example of frequency cheating.

HurleyBird · May 8, 2018

Nothingness said:
There's a big difference here: GPU benchmarks always go through a software driver that can do a lot of tweaks if it detects a benchmark. Geekbench is a binary benchmark that directly runs on the CPU so the only tricks that can be applied are increased frequency and specifically tuned libraries (though only two or three of the tests are really using libraries).

Granted, that counteracts a bit the fact that 3DMark hits the entire rendering pipeline while no CPU benchmark can hit every kind of CPU workload. And of course GPU vendors have been guilty of a lot of shady driver tricks in the past. That said, you only need to look as far as R600 to find an architecture, not a driver, that excelled in 3DMark while failing in real world games. And it's not like ATI was even trying to make a "3DMark card" at the time. It just happened incidentally. If a vendor actually designed an architecture around 3DMark, you would likely find an even larger divide between 3DMark performance and everything else.

Nothingness said:
You can bet most, if not all, tweaks made to improve GB speed will also benefit other software.

No. Decisions made for the purpose of improving GB scores have to come at the expense of something else. TINSTAAFL. Again, imagine if AMD/Nvidia were to put more architectural effort into driving up 3DMark scores. It wouldn't help real world games unless a game happened to be incidentally very 3DMark-like. You'd end up with a more lopsided architecture that wouldn't perform as well in the real world.

Nothingness said:
I guess it's an example of frequency cheating.

I mean, I'm sure that's one part of it. But even when you liberally adjust for frequency, the 9810's GB scores are way higher than they "should" be given its performance level in real world apps. It's possible we're looking at a fluke, but I wouldn't be surprised at all if over-optimising for GB is exactly what Samsung did. Maybe they could have gotten away with it if their overall performance was higher and it wasn't so easy to do cross platform comparisons on Android.

Nothingness · May 8, 2018

HurleyBird said:
No. Decisions made for the purpose of improving GB scores have to come at the expense of something else. TINSTAAFL. Again, imagine if AMD/Nvidia were to put more architectural effort into driving up 3DMark scores. It wouldn't help real world games unless a game happened to be incidentally very 3DMark-like. You'd end up with a more lopsided architecture that wouldn't perform as well in the real world.

The software base of Geekbench is wide enough that any tweak would benefit many other pieces of software. For instance adding an FMA unit will increase GB SGEMM score, but also will improve every software that makes use of FMA.
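
To illustrate the FMA point: the inner loop of a single-precision matrix multiply is one multiply-add per element, which is exactly what an FMA unit executes as a single operation. Below is a naive sketch, not Geekbench's SGEMM implementation; `sgemm_naive` is a hypothetical name, and whether `std::fma` maps to a hardware instruction depends on the target and compiler flags.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// C = A * B for row-major n x n float matrices.
// The innermost statement is a single fused multiply-add per element.
std::vector<float> sgemm_naive(const std::vector<float>& a,
                               const std::vector<float>& b,
                               std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                // c[i][j] = aik * b[k][j] + c[i][j], rounded once
                c[i * n + j] = std::fma(aik, b[k * n + j], c[i * n + j]);
            }
        }
    }
    return c;
}
```

The same multiply-add pattern shows up in dot products, convolutions, and filters, which is why a faster FMA path helps far more than one benchmark.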

The only thing that would really be a useless cheat is if compiler tricks were used to detect GB code and apply specific optimizations (similar to what Intel does with icc for SPEC). gcc and LLVM don't have such a thing, and as far as I know neither does Apple (the generated code for GB4 looks similar to others).

Mopetar · May 8, 2018

Eug said:
I think Apple flat out ignores Geekbench, at least as far as marketing is concerned.

Apple doesn't even market their processor speed or the amount of RAM in their phones. If you don't believe me, go look at the tech specs section for any of their phones. They might not even talk about it during the product launch either. Instead they just find some single benchmark that's significantly better and provide the usual "up to 50% faster" claim.

Apple typically does a good job with their chip design, and they really rule the roost when it comes to performance, but they almost never talk about just the chip itself, or at least not in any technical detail. To them, it's all about the whole phone as a product.

slashy16 · May 8, 2018

I hope these benchmarks are accurate, because I have been eagerly awaiting a MacBook with OS X based on their ARM chips.

Jan Olšan · May 8, 2018

ksec said:
Because not only does he write code, good code, but also very low-level code, AND he worked at a CPU company in his career (Transmeta). And he knows what GB4 is testing, and that it provides a very solid overview of how good the "basics" of an ISA, or an implementation of that ISA, can get.

I don't think Linus ever praised GB. He might have said that GB4 is not as bad as GB3, which was utter garbage; that is true. But trying to spin that as praise or even approval is a stretch. IIRC he still doesn't consider it good, though I can't quickly find a more recent post.

CatMerc · May 8, 2018

The validity of Geekbench as a cross platform benchmark for me goes out the toilet when the simple act of switching to Linux gains you thousands of GB score on Ryzen. Seriously go to the database and search 2700X by highest score. Everything Linux as far as the eye can see. And MacOS on Hackintosh machines sometimes, before you see Windows.

The OS and compiler play such a huge role you cannot use it as a valid comparison for architectures. It's only a valid comparison for the entire chain of software and hardware that get it running.
