Mediatek MOAR COARS!

Page 4

jpiniero

Lifer
Oct 1, 2010
16,830
7,279
136
Qualcomm only used big.LITTLE until they got their own custom cores ready. Apple doesn't use big.LITTLE at all. So even Qualcomm tells you that big.LITTLE is subpar. (Not to mention a disaster software-support-wise.)

I looked and couldn't find any confirmation of the Snapdragon 820's specs beyond them fabbing it at SS/GF 14 nm. It's possible it is using big.LITTLE or some variant thereof.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
Care to be more specific about those issues, and do you have any data indicating to what extent that actually impacts performance?

You could try reading the actually pretty well done article that AnandTech did on the 20nm Samsung b.L chip, which showed that not using b.L and shutting down the A57s resulted in better performance.

Also, if it were such a problem and a bad solution, then how come ARM, Samsung, and Qualcomm are all using b.L?

Because ARM's built-in power management technology is archaic. No one who has a choice would ever really design something like b.L. Note that Apple doesn't use b.L, nor does Qualcomm use b.L except for their "oops" chips. And when/if Samsung ever gets their own core, it's unlikely that they'll use b.L either. b.L is a poor solution all around. It simply is much too complicated and has too much latency.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
You have to qualify that statement: are you saying that a higher-IPC 8-core will lose to a lower-IPC dual? I intentionally twisted that to prove a point.

Given the a priori obvious thermal constraints, I'd wager that yes, a lower-IPC dual core will beat a higher-IPC 8-core in actual use.

Also, don't let legacy prevent you from moving forward; going forward, a lot of algorithms might have to be redesigned to scale with core count.

Stop, just STOP! It isn't legacy. It's never been about legacy. It's been about the fact that very few things actually have viable concurrency to begin with, and those that do are quite difficult to program in parallel. We've had widespread use of multi-context hardware for a decade, and significant use going back well over two decades.

Making actually useful programs parallel is not easy; in fact, in most cases it is impractically hard. Those that have been made parallel have invariably had either significant control concurrency or significant data concurrency. Neither applies to the vast majority, the vast bulk, of applications.
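(As a toy illustration of this point, with assumed numbers not taken from the thread: Amdahl's law is the usual way to show why "moar cores" buys little when only part of a program can run concurrently.)

```python
# Amdahl's law: upper bound on speedup when only a fraction of the
# runtime can be spread across multiple cores.
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speedup with `cores` cores when only
    `parallel_fraction` of the serial runtime is parallelizable."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# An app whose runtime is 30% parallelizable (an assumed figure) tops
# out around 1.43x no matter how many cores you throw at it:
speedups = {n: round(amdahl_speedup(0.3, n), 2) for n in (2, 4, 8, 64)}
```

Even going from 8 to 64 cores barely moves the needle for such a workload, which is the substance of the "very few things have viable concurrency" argument.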

Also note that ST perf has stumbled on a roadblock due to clock speeds hitting a wall; IPC and core count are really the only ways to achieve more throughput.

Throughput is important for servers; for client workloads, pretty much not at all. Client workloads care about latencies, not throughput. And we continue to get rather substantial ST performance gains.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You could try reading the actually pretty well done article that AnandTech did on the 20nm Samsung b.L chip, which showed that not using b.L and shutting down the A57s resulted in better performance.

I think you need to read that article again if you actually think permanently disabling the A57 cluster increased performance.
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
We need to remember that big.LITTLE in its current incarnations is largely geared towards 4-threaded performance. Some benchmarks may be able to push all 8 cores to a degree, but the normal usage mode is switching between LITTLE cores and big cores to maximize power efficiency. You see this in all of Samsung's big.LITTLE SoCs as well as Qualcomm's.

This is another reason why I think this "consumers go for moar cores" theory is wrong: the octa-core SoCs from Mediatek and Kirin(?) are also based on this 4-threaded performance concept. They do not act like full-blown 8-core parts in the majority of operations. They have, for the most part, been using two clusters of LITTLE cores at different frequencies instead of actual big cores (at least until this 10-core monster was announced).

I am guessing the cheapness of ARM's LITTLE cores (both financially and electrically) as well as varying yields of LITTLE cores per revision play a role here. It is instructive that we now see some newer versions of LITTLE cores clocking upwards of 2.0 GHz, which was previously thought to be impractical. It naturally follows that not all LITTLE cores are created equal, and that may well be another motivating factor for the OEMs to create little.LITTLE SoCs.

Nonetheless, the bottom line here is that these SoCs are still quad performers. The OEMs' advertising of these as octa-cores is, while technically accurate, not based on any consumer demand but on technical and financial reality. So again, I reiterate that the "consumers demand moar cores in a smartphone" myth is, well, a myth.

Given the apriori obvious thermal constraints, I'd wager that yes, a lower ipc dual core will beat a higher ipc 8c in actual use.

I have one counter-example here, albeit an anecdotal one (from a credible source, though). I was curious why we did not see more 2+2 configurations, which should be more power-friendly than typical 4+4 configurations. One of the reasons I have heard is that for typical Android operation two LITTLE cores are not nearly adequate for a smooth user experience. That causes more big-core access and scheduler overhead in 2+2 configurations than in 2+4 or 4+4 configurations, to the point that whatever theoretical power savings a 2+2 configuration offers are not realized in the OS environment.

An assumption that a lower-IPC dual-core beats a higher-IPC octa-core is not sound in theory or in practice.

P.S. Apple's A8X was clearly designed to be a quad-core, as seen in the floor plan. Apple must have disabled one of them for power reasons, yield reasons (TSMC 20nm..), or possibly even financial reasons. It is difficult to imagine that Apple engineers designed a quad-core in order to purposefully fuse off one of the cores eventually (while saving its cache..). So yes, Apple have already moved towards 4-threaded performance. I predict that this is where things will settle for the foreseeable future, just as they have on desktop. So please stop the nonsense that Apple will stay with dual-cores forever thanks to Cyclone's high IPC. (but as poofyhairguy pointed out they REALLY need to add more RAM to their devices first. A7 + 1GB RAM on my iPad Air is a miserable experience)
 

kpkp

Senior member
Oct 11, 2012
468
0
76
Moar details about moar cores:
[attached slide image]
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
It is instructive that we now see some newer versions of LITTLE cores clocking upwards of 2.0 GHz, which was previously thought to be impractical. It naturally follows that not all LITTLE cores are created equal, and that may well be another motivating factor for the OEMs to create little.LITTLE SoCs.

Just because they're doing it doesn't mean it isn't still impractical. We've seen Cortex-A53s at 1.7GHz already, and we've seen Cortex-A7s at 2GHz in the past.

There are of course knobs you can turn to make an implementation reach higher clock speeds than usual. We even heard of demos of Cortex-A9s on older processes reaching upwards of 3GHz.

I have one counter-example here, albeit an anecdotal one (from a credible source, though). I was curious why we did not see more 2+2 configurations, which should be more power-friendly than typical 4+4 configurations.

There's this strange idea that just having the cores there makes the SoC more power hungry. Power gated cores use very, very little power.

If more cores use more power for a similar amount of work accomplished, then the scheduling is bad.
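(A back-of-envelope sketch of Exophase's point, with assumed power figures rather than measurements: a power-gated core leaks orders of magnitude less than an active one, so the unused cores of a 4+4 SoC barely affect the energy of a 2-thread workload.)

```python
# Assumed per-core power figures, chosen only for illustration.
ACTIVE_MW = 750.0   # a big core under load
GATED_MW = 1.0      # a power-gated (dark) core

def soc_power(active_cores, total_cores):
    """Total core power when `active_cores` run and the rest are gated."""
    idle = total_cores - active_cores
    return active_cores * ACTIVE_MW + idle * GATED_MW

two_of_four = soc_power(2, 4)    # 2 threads on a 2+2-style SoC
two_of_eight = soc_power(2, 8)   # same 2 threads on a 4+4-style SoC
# The six extra gated cores add only a few mW on ~1.5 W: well under 1%.
overhead = (two_of_eight - two_of_four) / two_of_four
```

Under these assumptions, "just having the cores there" costs almost nothing; only a scheduler that needlessly wakes them would.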
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
If running lots of weaker cores at higher clock rates was a good idea, then AMD would own the desktop and server world...
 

DrMrLordX

Lifer
Apr 27, 2000
22,937
13,023
136
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.

It works because they have very little choice anyway in terms of the silicon area that ends up being allocated to dark silicon in the design.

Why have dark silicon take down the performance of your big cores? Just make them a little less big (i.e., smaller) and create lots of little cores all around the die. Then you cluster them and prevent multi-cluster activation.

Dark silicon issue solved.

[slide 13 of "Exploring Emerging Technologies in the HPC Co-Design Space"]


(for some light reading, see here :p)
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.

Moar gearz is the master word....

Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

This limitation is the cause of the PC decline: why bother with a big device that doesn't manage the mandatory order-of-magnitude better performance to stay relevant?

Phone SoCs like this one should ring a bell for the PC industry's actors: current offerings are late with respect to historical trends.
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
There's this strange idea that just having the cores there makes the SoC more power hungry. Power gated cores use very, very little power.

If more cores use more power for a similar amount of work accomplished, then the scheduling is bad.

Notwithstanding that (bad) scheduling overhead is real in big.LITTLE, the scenario I suggested has nothing to do with it. I am unsure why you selectively quoted what I said out of context.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Notwithstanding a (bad) scheduling overhead is real in big.LITTLE, the scenario I suggested has nothing to do with such. I am unsure why you selectively quote what I said out of context.

Because you were curious why 2+2 wasn't more power efficient in the first place, and a lot of other people seem to share this curiosity - which makes no sense to me.
 

DrMrLordX

Lifer
Apr 27, 2000
22,937
13,023
136
Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

I know right? Damn that Linus Torvalds!!!

Oh wait, that's not who you meant, was it . . .
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
Because you were curious why 2+2 wasn't more power efficient in the first place, and a lot of other people seem to share this curiosity - which makes no sense to me.

You talk as if Global Task Scheduling was there with big.LITTLE from the beginning. Cluster Migration and Core Migration were also part of the plan (and reality), and the number of cores certainly did have a bearing on power consumption.
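(For readers unfamiliar with the distinction, a simplified sketch of two of these software models — my own toy model with an assumed load threshold, not ARM's actual scheduler. Threads are rated by load from 0.0 to 1.0.)

```python
def cluster_migration(thread_loads, threshold=0.6):
    """Early model: the whole SoC runs on either the big or the LITTLE
    cluster, so one heavy thread drags everything onto the big cores."""
    cluster = "big" if any(t > threshold for t in thread_loads) else "LITTLE"
    return {i: cluster for i in range(len(thread_loads))}

def global_task_scheduling(thread_loads, threshold=0.6):
    """Later model: all cores are visible to the OS at once, so each
    thread is placed on a big or LITTLE core individually."""
    return {i: ("big" if t > threshold else "LITTLE")
            for i, t in enumerate(thread_loads)}

loads = [0.9, 0.1, 0.2]  # one heavy thread, two light ones
# Cluster migration wakes the big cluster for all three threads;
# GTS keeps the two light threads on LITTLE cores.
```

Under cluster migration, the number of cores per cluster directly shapes the power bill whenever any one thread is heavy, which is the point being made above.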

Just because they're doing it doesn't mean it isn't still impractical. We've seen Cortex-A53s at 1.7GHz already, and we've seen Cortex-A7s at 2GHz in the past.

There are of course knobs you can turn to make an implementation reach higher clock speeds than usual. We even heard of demos of Cortex-A9s on older processes reaching upwards of 3GHz.

I frankly did not understand this response either, because it seems to largely agree with what I said (which you quoted). Now I get the sense that you objected to my use of the word "impractical." Let me confess that I did not mean much by that word.

No need to get hung up on the past tense.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You talk as if Global Task Scheduling was there with big.LITTLE from the beginning. Cluster Migration and Core Migration were also part of the plan (and reality), and the number of cores certainly did have a bearing on power consumption.

I don't see what this has to do with efficiency of 2+2 vs 4+4 cores, could you explain your line of reasoning to me? Or do you mean that 2+2 using global task switching can be more efficient than 4+4 using cluster migration?

I frankly did not understand this response either, because it seems to largely agree with what I said (which you quoted). Now I get the sense that you objected to my use of the word "impractical." Let me confess that I did not mean much by that word.

No need to get hung up on the past tense.

I'm saying this because I've seen a lot of comments treating upcoming little.LITTLE chips as unusual/exotic, or even as having specially modified A53 cores. I blame marketing for this.
 
Apr 30, 2015
131
10
81
There is an ARM white paper by Peter Greenhalgh on big.LITTLE:

big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7.

It may be of interest to some readers.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
Moar gearz is the master word....

Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

For consumer workloads, more cores generally result in worse delivered performance. AKA, you are quite far from both reality and theory.