Apple A12 benchmarks

Thala · May 12, 2018

If AMD gets anything close to GloFo's stated 55% reduction in power and improves their architecture somewhat at 7nm, they will be closer. A Zen 2 SoC could provide around the same performance at "7.5w" package TDP as the A12 does in GB4.

I consider it close to impossible that any x86/x64 based design getting close to the efficiency as what is possible with ARMv8. Unless Apple is incompetent this not going to happen.
What you will see in the coming years, that even stock ARM cores will overtake AMD Ryzen in IPC at much lower power. The new Cortex A76 is supposed to increase IPC by 30%.

The point was that even among x86 cpus, the GB4 results between AMD and Intel are wildly different than everyday uses.

That's because, similar like SPEC, Geekbench is a CPU benchmark, which doesn't measure everyday-use application performance, which impacted by much more components of a device like HDD, GPU, I/O. For pure compute Geekbench is a good reference.

Kenmitch · May 12, 2018

scannall said:
Apple has just been killing it at CPU design for a while now. Though that seems to make some people angry. Not sure why, cool new tech is always good.

CPU tech is cool....Apple isn't anymore. Apple was at one time somewhat of a status symbol to many....These days even the homeless sport their devices.

Death before Apple is how I roll.

moinmoin · May 12, 2018

Eug said:
I'm not in the industry but I've read lots of posts here and there that Apple's server group pissed off a lot of potential customers. While the hardware was well respected, they didn't like Apple's lack of flexibility and sub-par on-site service support. Furthermore, they did not like the fact that Apple was so secretive. There was no way to plan for future hardware and software upgrades because nobody ever knew what the roadmap was.

Basically it seemed the main market for their servers was small to medium sized business, and Apple did not seem to want to cater to large corporate clients that can be much more demanding than businesses with 50 employees.

I may be totally off on this, but that's the gist of what I got when Apple servers were still a thing... and then Apple simply killed off the entire division.

All this, and Apple is currently redefining the purpose of macOS Server, phasing out a lot of server tools in favor of a new focus on management of devices on the network. This doesn't look like a preparation of a re-entry into the server market at all.

jpiniero · May 12, 2018

moinmoin said:
All this, and Apple is currently redefining the purpose of macOS Server, phasing out a lot of server tools in favor of a new focus on management of devices on the network. This doesn't look like a preparation of a re-entry into the server market at all.

Doesn't mean they couldn't build a cloud service like AWS. Just use Linux.

moinmoin · May 12, 2018

jpiniero said:
Doesn't mean they couldn't build a cloud service like AWS. Just use Linux.

They likely already do for their internal iCloud etc. server farms. I was obviously talking about products for the market, not internal stuff.

jpiniero · May 12, 2018

moinmoin said:
They likely already do for their internal iCloud etc. server farms. I was obviously talking about products for the market, not internal stuff.

They don't use their own processors though, and it's not available to the general public like AWS is. Like they don't need to sell the hardware, just offer something like AWS.

moinmoin · May 12, 2018

jpiniero said:
They don't use their own processors though, and it's not available to the general public like AWS is.

You don't know the first part, and the latter was my point.

jpiniero · May 12, 2018

moinmoin said:
You don't know the first part, and the latter was my point.

IIRC, Apple originally was using HP servers but that was when they started. Either way there is no evidence I've seen that they are using anything but typical Xeons from a vendor like HP.

name99 · May 12, 2018

Eug said:
Well, the quote is (if Google Translate is right), the power consumption is higher than expected for 7 nm. It didn't actually say it was higher than A11 though.

The quote is so vague who knows WTF it means. But it's premature to conclude that the power consumption is either a "real" problem, or reflects 7nm issues.

Assume, for example, that, based on the problems with the A9, A10, A11 drawing more power than old batteries can support, Apple added an overall max-current-draw micro-controller to the SoC. Such a micro-controller would have the job of monitoring all parts of the SoC, looking at various collected statistics, and GRADUALLY dialing down performance as needed to not exceed a certain current draw (as opposed to the current SW solution of always force a low frequency for the CPU once the battery is too old).

Now imagine the characteristics of such a design. In particular it would take some time to figure out the optimal mapping from the various statistics available to future current draw, and thus how to program this micro-controller. THIS (unoptimized micro-controller) could be the problematic item that's resulting in (under vaguely defined circumstances) "23% too high" power.

I'm not saying this is the issue; I'm simply saying that there's fsckall in the comment to actually latch onto as something worth trying to understand.

name99 · May 12, 2018

ksec said:
He actually mentions it somewhere in the comment it is closer to 25%. Making this 25% would be 3Ghz. The latter is what I calculate to be around 3Ghz, although he did mention performance coming from better branch prediction, and no mention of clock speed.

Assuming that part of the claim is correct (as dubious as everything else...) the issue is unlikely to be better branch prediction per se. Apple already has the best branch prediction in the world, somewhat ahead of Intel. (You can see this by looking at the branch-tough GB4 benchmarks like Lua.) Intel uses TAGE, Apple probably uses TAGE and VTAGE, perhaps with a larger table or better updating. Although there are tweaks to TAGE to get it slightly better, there's nothing that really moves the needle.

What probably IS being done is better fetch. If you only predict one branch per cycle, and have constraints on your load from the I-cache (like the loads must be 8-instruction aligned) it's exceedingly tough to maintain 6 instruction fetch per cycle. So you improve fetch. It starts by predicting two branches per cycle (if the first is fall-through) and you can push that even higher. (POWER does up to 8 assuming the first 7 are fall-through, though this seems to me utterly pointless.) Next you allow a fetch width of say up to 8 instructions with ANY starting alignment. Or you can allow 8 wide fetch to straddle a cache line. Going really crazy you can allow fetch from both current PC AND the predicted jump-to PC, pulling in instructions from both cache lines. There's some discussion of this in the 90's Alpha literature, though it might have been in the context of the last one they talked about (but never shipped) before the project was cancelled.

Point of all this is --- it's not better branch prediction, it's better FETCH, which is an entire subsystem of which branch prediction is but a small (though very sexy) part.

name99 · May 12, 2018

rgba said:
I don't think it matches the 8700K. Here are the scores for my i7-8700 under Windows, macOS and Linux. Results seem to vary quite a bit between different Geekbench versions.

The Windows result may well reflect MS vs LLVM compiler.
The (somewhat similar) Linux vs Apple result may reflect slight differences in when turbo-ing goes active, or just the expected amount of noise given background chatter in the OSs.

If you look at a large number of results on the browser page for a given CPU you tend to see some bullshit fake results at the top, then a large number of very similar results which seem to realistically reflect best case results for that CPU. THOSE sort of results (not exactly an average, but a crowd-sourced best case --- but common --- scenario) are what you should be using for comparisons, not any single idiosyncratic result.

name99 · May 12, 2018

HurleyBird said:
The third option is that Geekbench isn't crap in and of itself, but when one benchmark becomes overridingly important people will, of course, optimise for it. Geekbench isn't an important metric at all when it comes to marketing the speed of PCs, but is by far the biggest metric when it comes to marketing the speed of mobile devices.

You want a variety of benchmarks not just so that you get a varied picture on things, but so that this kind of thing never happens. Right now 3DMark counts for, maybe, at most 5% of the perceived speed of a GPU with the performance of real games accounting for the remaining 95%. Imagine if this was like the mobile device world, where a synthetic benchmark like 3DMark became all that anyone cared about. GPUs would quickly become much better at 3DMark, and would not advance as quickly in actual games as they otherwise would.

That said, the wide variance you see in GB based on OS does point towards the benchmark not being ideal in any case.

A variety of benchmarks DOES exist; it's just people don't want to accept what they all say...

Along with GB4 we also have the widely published browser results which give the same sort of numbers.

There are also my extensive (and published, though generally ignored) comparisons of Mathematica on Mac to Wolfram Player on iOS which show that for most purposes (basically "generic code") the A10 at 2.4GHz is equivalent to a Haswell at 3.6GHz. (Wolfram Player
- has non-optimized bignums
- non-opimized BLAS
- very little support for multi-core
If you ignore those issues and concentrate on generic issues --- basically single-threaded non-bignum arithmetic and a wide variety of pattern matching and searching --- you see the results I describe.)

ksec · May 13, 2018

name99 said:
Assuming that part of the claim is correct (as dubious as everything else...) the issue is unlikely to be better branch prediction per se. Apple already has the best branch prediction in the world, somewhat ahead of Intel. (You can see this by looking at the branch-tough GB4 benchmarks like Lua.) Intel uses TAGE, Apple probably uses TAGE and VTAGE, perhaps with a larger table or better updating. Although there are tweaks to TAGE to get it slightly better, there's nothing that really moves the needle.

What probably IS being done is better fetch. If you only predict one branch per cycle, and have constraints on your load from the I-cache (like the loads must be 8-instruction aligned) it's exceedingly tough to maintain 6 instruction fetch per cycle. So you improve fetch. It starts by predicting two branches per cycle (if the first is fall-through) and you can push that even higher. (POWER does up to 8 assuming the first 7 are fall-through, though this seems to me utterly pointless.) Next you allow a fetch width of say up to 8 instructions with ANY starting alignment. Or you can allow 8 wide fetch to straddle a cache line. Going really crazy you can allow fetch from both current PC AND the predicted jump-to PC, pulling in instructions from both cache lines. There's some discussion of this in the 90's Alpha literature, though it might have been in the context of the last one they talked about (but never shipped) before the project was cancelled.

Point of all this is --- it's not better branch prediction, it's better FETCH, which is an entire subsystem of which branch prediction is but a small (though very sexy) part.

Yes that is why I said only through scaling in the first reply. And not branch prediction, another reason is I don't know if that Chinese Word actually means "branch prediction"at all, It is closely related and it could means something else. ( Although It doesn't translate to fetch either, so I have no idea )

I also think there will need to be a whole lot of rebalancing if they improve fetch or branch prediction. From cache system to memory. So a simple 7nm die shrink ( or not so simple ) and higher clock speed were my views.

ksec · May 13, 2018

name99 said:
A variety of benchmarks DOES exist; it's just people don't want to accept what they all say...

Along with GB4 we also have the widely published browser results which give the same sort of numbers.

There are also my extensive (and published, though generally ignored) comparisons of Mathematica on Mac to Wolfram Player on iOS which show that for most purposes (basically "generic code") the A10 at 2.4GHz is equivalent to a Haswell at 3.6GHz. (Wolfram Player
- has non-optimized bignums
- non-opimized BLAS
- very little support for multi-core
If you ignore those issues and concentrate on generic issues --- basically single-threaded non-bignum arithmetic and a wide variety of pattern matching and searching --- you see the results I describe.)

Did you test on A11?

name99 · May 13, 2018

ksec said:
Did you test on A11?

No.
The business of trying to do serious benchmarking meant I needed a large screen for Wolfram Player. And I only own an iPhone 7, not an iPhone 8.

IF I can find access to an A11X based iPad Pro (when/if that comes out...), I will try again. But it's not a completely trivial exercise. Wolfram Player is not set up to simply "run a script" the way Mathematica is, it's set up to play animations and visualizations. So every benchmark has to be embedded in such a visualization.
It also has a number of UI bugs in it (it was only released about six months ago) including things like it hangs if a computation runs too long (I forget how long, but maybe a minute?) so you have to be careful to scale all computations to be as large as possible (to approach that limit) but not exceed it. And you can't (or at least couldn't at the time) embed all the computations in a single animation.

Wolfram Player is being updated (it's had about two minor dot dot updates since the first release) and Wolfram are generally really good about fixing bugs and bringing uniformity to the platform. Meaning when I do this again, I'll do the whole thing from scratch again, to look into which of the missing functionality (as I said bignums, vectorized parallelized BLAS, parallel directives, etc) is now working properly.

If you're interested you can see my final summary of what I learned here:
https://www.dropbox.com/s/zkh7i61sg5rgjme/Mathematica on iPad.xlsx?dl=0

The computations are generally extracts from Mathematica's built-in benchmark, designed to test a wide range of performance functionality, but with the order of the tests/results modified to show different groups of functionality that are or are not yet optimized for AArch64 and iOS.

(Looking at that I see I was comparing against my Ivy Bridge machine not my Haswell machine. Haswell, as I mentioned, picked up TAGE, so has better branch prediction, which really helps much of Mathematica. One can perhaps, handwavingly, treat the overall IPC boosts on the Apple side from A10 to A11 as more or less equivalent to the big jump on the Intel side of Ivy to Haswell [TAGE] and then the various minor jumps to *lake; meaning that, "more or less", the numbers would match up with a hypothetical 2.4GHz A11X run against a 3.6GHz Coffee Lake or whatever we're up to now on 14nm++++++)

Zucker2k · May 16, 2018

FIVR said:
You should call intel and tell them about this finding, they could use it as their next great slogan now that "intel inside" is dead.

Something like "Yes, our SoCs suck. Especially when compared to Apple. But do not worry: this small company with no marketshare to speak of in our most important market sucks more."

Perhaps it could be shortened to Intel: "It could be worse!"

How do you get away with so much crap in these threads? Seriously, I wanna know.

Zucker2k · May 16, 2018

frozentundra123456 said:
Maybe you should call them, it seems you have taken on as your sole mission in life to deride intel whenever possible and excuse AMD. The point was that even among x86 cpus, the GB4 results between AMD and Intel are wildly different than everyday uses.

If anyone else tried the opposite, they'll probably be off these forums by now.

FIVR · May 16, 2018

Zucker2k said:
If anyone else tried the opposite, they'll probably be off these forums by now.

The poster you just quoted was quoting AMD geek bench in a conversation about A12 and intel. It was totally irrelevant information he was using to obscure the fact that intel was performing extremely poorly, which btw is ironic considering he then goes on to claim I "make excuses" for AMD, when he is doing that exact thing for intel... blatantly.

All of that is ok though, because at least he is discussing the topic at hand. You, on the other hand, do nothing but bait and personally attack other members. You don't offer any information, and you don't discuss the topic of the thread at all.

YOU are the one trolling here, and the only one not discussing anything related to A12. YOU should be off the forums, if ANYONE should be.

Enough is enough.

Both you and Zucker2k need to stop
this back and forth arguing and accusations.
If it continues, there will be consequences.

AT Mod Usandthem

Zucker2k · May 17, 2018

FIVR said:
The poster you just quoted was quoting AMD geek bench in a conversation about A12 and intel. It was totally irrelevant information he was using to obscure the fact that intel was performing extremely poorly, which btw is ironic considering he then goes on to claim I "make excuses" for AMD, when he is doing that exact thing for intel... blatantly.

All of that is ok though, because at least he is discussing the topic at hand. You, on the other hand, do nothing but bait and personally attack other members. You don't offer any information, and you don't discuss the topic of the thread at all.

YOU are the one trolling here, and the only one not discussing anything related to A12. YOU should be off the forums, if ANYONE should be.

At least, I keep off from making baiting/trolling comments when I have nothing relevant to contribute. Just check your posts in this thread alone. Is that what you call offering information? And it's not just this thread. It's your bread and butter on this forums. Unlike you, I don't want you or any other member off this forums. I don't gain anything from that, but hey, don't let me sit on your joy.

Enough is enough.

Both you and FIVR need to stop
this back and forth arguing and accusations.
If it continues, there will be consequences.

AT Mod Usandthem

ksec · May 17, 2018

moinmoin said:
All this, and Apple is currently redefining the purpose of macOS Server, phasing out a lot of server tools in favor of a new focus on management of devices on the network. This doesn't look like a preparation of a re-entry into the server market at all.

Qualcomm has taught us it isn't even easy at all. I don't want to derail the topic too much, but if you look at Apple's direction in Server and DC, they are still heavily relying on AWS, Azure and Google. Playing the three cloud services to its maximum advantage without having itself investing too much into cloud.

name99 · May 17, 2018

ksec said:
Qualcomm has taught us it isn't even easy at all. I don't want to derail the topic too much, but if you look at Apple's direction in Server and DC, they are still heavily relying on AWS, Azure and Google. Playing the three cloud services to its maximum advantage without having itself investing too much into cloud.

"Cloud services" is a large category!
It's certainly true that Apple has told us that they utilize third party cloud services.
It is ALSO true that Apple have told us that they have their own large data centers, including a few pictures of the rows of racks, and, they are building more of these.

Which raises two issues.
The first is that there is scope for Apple to use their own "server" infrastructure (custom chips and/or hardware and/or software) in whatever those racks are doing today. Obviously there can be mix and match here --- if Linux/Linaro serves their data center needs better, they can use that on top of Apple custom CPUs placed on sledges that are essentially Facebook Open Compute designs.

Now you can argue this is crazy, and it might be, except consider the next step, which might be for Apple to offer cloud compute services to complement their current cloud storage services. These could take a variety of forms. For example suppose Apple offered "compilation in the cloud" for small shops. Specialized cloud computation of this sort IS coming; for example Wolfram offer Mathematica computation in the cloud for people who have the occasional need/desire for a large computation, but it makes no sense for them to a buy a large computer.

Now before you say that that's a dumb idea, that compiling is easily done on a small 6 or 8 core Mac, open your mind to what warehouse compiling could allow. For example: vast amounts of optimization in compiling are either search problems (ie super-optimization, easily parallelized across many many CPUs) or pattern recognition problems (ie train a neural net on various alternative ways to restructure a loop, let it learn the pros and cons of each restructuring, and have some of your compile passes now be neural net passes...
There are obvious extensions to this, like running validation suits after every change, or running large performance suites and, again, looking at patterns...

Beyond selling compute services as compilation, Apple could move on to selling compute services to developers on behalf of customers. Right now any developer can use iCloud storage where it makes sense, both to store program info that's specific to me, and to store aggregate info specific to everyone's use of the product. But only Apple can offload computation from my device. Apple could provide services that allow any developer to offload computation, something that right now developers have to arrange on their own, either through AWS or Azure, or by creating their own infrastructure if they are large enough.

Apple selling servers as boxes may make little sense.
Apple selling servers as a service may make a LOT of sense.

Eug · May 22, 2018

Bloomberg: TSMC now in mass production of 7 nm Apple A12 chips for next iPhone

Apple Inc. manufacturing partner Taiwan Semiconductor Manufacturing Co. has started mass production of next-generation processors for new iPhones launching later this year, according to people familiar with the matter.

The processor, likely to be called the A12 chip, will use a 7-nanometer design that can be smaller, faster and more efficient than the 10-nanometer chips in current Apple devices like the iPhone 8 and iPhone X, the people said.

StinkyPinky · May 23, 2018

If the A12 performance leap is similar as recent ones the performance could be astounding. No wonder Apple are moving towards putting their own chips in their computers.

CatMerc · May 23, 2018

StinkyPinky said:
If the A12 performance leap is similar as recent ones the performance could be astounding. No wonder Apple are moving towards putting their own chips in their computers.

Aside from my gripes with Geekbench, the designs don't necessarily scale up. It is very likely that these CPU's are purpose built for a specific TDP and performance level. Much of processor design is finding the sweet spot between complexity (IPC) and clockspeed for a given node. It is entirely likely that Apple's designs would need to trade off IPC to clock up to the level of desktop parts. They also don't need as much of a complex and power hungry uncore as desktop CPU's, since they need to handle less memory, less expansion, etc' etc', and they don't need to have powerful fabrics that allow for high multi-core efficiency and scaling. The uncore on desktop CPU's alone takes more power than entire mobile SoC's.

I expect 7nm and 10nm x86 CPU's to be a significant jump far beyond what we're used to. That's if Intel can manufacture 10nm eventually

And I also wouldn't expect miracles from a desktop class Apple CPU. They'll be good for sure, but I think people are overhyping their potential.

Nothingness · May 23, 2018

CatMerc said:
I expect 7nm and 10nm x86 CPU's to be a significant jump far beyond what we're used to.

Really? I don't expect much if anything...

Apple A12 benchmarks

Golden Member

Diamond Member

Diamond Member

Lifer

Diamond Member

Lifer

Diamond Member

Lifer

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Golden Member

Golden Member

Diamond Member

Golden Member

Senior member

Senior member

Lifer

Diamond Member

Golden Member

Platinum Member