Maybe the iPad uses the GPU for video transcoding and photo editing?
SPEC2006 is "standard" C code in the sense that it is garbage code riddled with bugs and undefined behavior, like most C code.
Of particular relevance is that 400.perlbench has at least two pointer overflow bugs...
https://blog.regehr.org/archives/1395
It's possible that LLVM is now good enough to figure out for itself that those overflows occur, and refuse to compile; or it may be some other bug in the code (code that's willing to tolerate pointer overflows is likely to tolerate a lot of other crap).
(Just so there's no misunderstanding, I think SPEC2006 does its job of pushing various aspects of the CPU hard, particularly testing code with a large data footprint and with aggressive memory bandwidth or latency demands [though it does a poor job of testing code with a large instruction footprint].
But it's important not to valorize it. Much of the code is lousy quality, and we should be calling out crappy code WHEREVER we see it, not making excuses for it. It CERTAINLY should not be considered as exemplar code for newbies to emulate and learn from.)
I have seen much worse "pro" code. And many real programs are badly coded, so that just mimics reality.
I'd prefer to switch to CPU 2017 but I'm not sure Anandtech/Andrei have access to it.
Indeed. And here Intel for sure is really great, because they have to run what one would consider obsolete, bad code at good speed.

A CPU's ability to handle bad code is a good indicator of how good the machine is. If your code needs to be massaged and tuned to get the most out of it, you haven't built a very robust machine.
The video encode likely uses Apple's video encode block.
Maybe the iPad uses the GPU for video transcoding and photo editing?
I doubt it. Adobe Lightroom is the world's most popular photo DAM manager. I would hope that Adobe would spend some time optimizing it for Intel platforms.

Obviously. It's ultimately a software issue, and I bet Apple paid Adobe to have them develop GPU acceleration and Intel didn't.
Maybe the iPad uses the GPU for video transcoding and photo editing?
This.

Apple doesn't necessarily have to have paid for it in the typical sense. Rather, as part of developing the app, Adobe just took the time to build in the extra capability, or that's simply how you code for that setup: Apple put in the work to let apps use the GPU and/or specialized processing blocks for certain tasks, so that if you're doing video processing, the OS defaults to offloading it to the GPU or dedicated hardware.
If it is, then they need to do further analysis to compare the quality. From what I've gathered, GPU transcoding is very fast but significantly worse in quality. If Apple can make the quality equal (or even better) with the acceleration, whether it comes from the GPU or from specialized hardware, on top of being faster, then awesome. Also worth asking is whether the Intel machines were using Quick Sync, as that would probably be the better comparison, although again you need to compare quality. Ideally you'd compare several options: best quality of each, fastest of each, and then an optimal quality/time setting, where you fix a suitable quality target (or more than one, for different users) and compare the time each takes to achieve that result.
This
Quality is worse on the GPU. That is why people are using CPUs now. The test isn't quality-matched.
As for the calculation test, that result only shows that, in single-threaded or lightly threaded performance, Geekbench doesn't report results that are way off. Seriously, Apple has a 5 W CPU with the performance of a 50 W desktop CPU of this age.
That is a general f..k up for Intel, but for AMD it's even bigger, as they are behind in both frequency and IPC.
I never found an Excel calculation comparison on the web. With that much RAM in the new iPhone or iPad, lighter calculations could be tested against an x86 CPU, like TechSpot does.
Discussion with Anand Shimpi (!) and Phil Schiller about the A12X on Ars Technica:
https://arstechnica.com/gadgets/2018/11/apple-walks-ars-through-the-ipad-pros-a12x-system-on-a-chip/
Heh. It's ironic AnandTech didn't get the Anand interview. Anyhoo, glad to see him out front and center now.
“We've got our own custom-designed performance controller that lets you use all eight at the same time,” Shimpi told Ars. “And so when you're running these heavily-threaded workloads, things that you might find in pro workflows and pro applications, that's where you see the up to 90 percent improvement over A10X.”
---
This performance is unprecedented in anything like this form factor. In addition to the ability to engage all the cores simultaneously, there's reason to believe that cache sizes in the A12, and likely therefore the A12X, are a substantial factor driving this performance.
You could also make the case that the A12X's performance in general is partly so strong because Apple's architecture is a master class in optimized heterogeneous computing—that is, smartly using well-architected, specialized types of processors for matching specialized tasks. Though the A12X is of course related to ARM's big.LITTLE architecture, Apple has done a lot of work here to get results that others haven't.
---
"You typically only see this kind of performance in bigger machines—bigger machines with fans," Shimpi claimed. "You can deliver it in this 5.9 millimeter thin iPad Pro because we've built such a good, such a very efficient architecture."
---
Shimpi offered a pitch for the GPU. "It's our first 7-core implementation of our own custom-designed GPU," he said. "Each one of these cores is both faster and more efficient than what we had in the A10X and the result is, that's how you get to the 2x improved graphics performance. It's unheard of in this form factor, this is really an Xbox One S class GPU. And again, it's in a completely fanless design."
---
Generally, this GPU has a huge lead in the mobile space, but it's not encroaching on laptop territory the same way the CPU is—at least, not in these sorts of benchmarks. The advantage over other mobile devices is significant, though. There aren't any other devices in this category that come close. As for performance gains relative to the iPhone XS and its A12, Shimpi said memory bandwidth is one part of that.
"The implementation is the same," he clarified. "But you do have much more memory bandwidth so there may be cases where it's actually faster than what you get on the phone if you do have a workload that has taken advantage of the fact that you have twice as big of a memory subsystem."
This impacts not just 3D graphics in games but a lot of the UI effects in iOS itself. Shimpi noted it's not just about peak memory bandwidth, but delivering bits efficiently. "Having that dynamic range is very important because there are times when you want to operate at a lower performance point in order to get efficiency and battery life," he said.
Mobile device comparisons aside, the laptop and desktop are more or less the ultimate target. "We’ll actually take content from the desktop, profile it, and use it to drive our GPU architectures. This is one of the things that you usually don't see in a lot of mobile GPU benchmarks," Shimpi explained.
---
Typically when you get this type of CPU and GPU performance, a combination of the two, you have a discrete memory system. So the CPU has its own set of memory and the GPU has its own set of memory, and for a lot of media workloads or pro workflows where you actually want both working on the same data set, you copy back and forth, generally over a very narrow slow bus, and so developers tend to not create their applications that way, because you don't want to copy back and forth.
We don't have any of those problems. We have the unified architecture, the CPU, the GPU, the ISP, the Neural Engine—everything sits behind the exact same memory interface, and you have one pool of memory.
On top of that, this is the only type of memory interface that iOS knows. You don't have the problem of, well, sometimes the unified pool may be a discrete pool, sometimes it may not. iOS, our frameworks, this is all it’s ever known, and so as a result developers benefit from that. By default, this is what they're optimized for, whereas in other ecosystems you might have to worry about, well, OK, sometimes I have to treat the two things as discrete; sometimes they share.
Great that the SoC is getting more powerful, but that hasn't been the issue for me in a long time. The SoC has been powerful enough to make a good laptop replacement for a couple of years. But a touch UI only works in the consumption role for me.
Until there is proper cursor support for smoother input, I am out.
On top of that, this is the only type of memory interface that iOS knows. You don't have the problem of, well, sometimes the unified pool may be a discrete pool, sometimes it may not. iOS, our frameworks, this is all it’s ever known, and so as a result developers benefit from that.
"At the end of the day we want to make sure that whatever vision we have set out for the thing, if it requires custom silicon, that we're there to deliver. For a given form factor, for a given industrial design in this thermal envelope, no one should be able to build a better, more performant chip."
That’s why I highlighted that passage from The Boss. They are profiling desktop workloads to help plan their chip designs.
We are on the cusp of a very big change in the Apple-verse. 2019 or 2020.
I actually originally guessed he would be doing something else.

Just wanted to say I am so glad that Anand Shimpi could finally talk about what he's been up to since leaving the site. I think most of us always assumed that he had joined the silicon team, but it's still great to hear confirmation that that's where he's working, and that he's doing really well.
The chip design team is probably my favorite team working at Apple right now. Anand is practically working alongside Silicon Gods!
No info about desktop/mobile replacement timing, etc., but lots of technical detail. We'll have to see what it does once you add SATA, PCIe, USB, and the other things we take for granted to the chip, and run a desktop/laptop operating system that isn't optimized to the max the way mobile is.