Mediatek MOAR COARS!

Page 4

jpiniero

Lifer
Oct 1, 2010
16,830
7,279
136
Qualcomm only used big.LITTLE until they got their own custom cores ready. Apple doesn't use big.LITTLE at all. So even Qualcomm tells you that big.LITTLE is subpar. (Not to mention a disaster software-support-wise.)

I looked and couldn't find any confirmation of the Snapdragon 820's specs beyond them fabbing it at SS/GF 14 nm. It's possible it is using big.LITTLE or some variant thereof.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
Care to be more specific about those issues, and do you have any data indicating to what extent that actually impacts performance?

You could try reading the actually pretty well done article that AnandTech did on the 20nm Samsung b.L chip, which showed that not using b.L and shutting down the A57s resulted in better performance.

Also, if it were such a problem and a bad solution, then how come ARM, Samsung, and Qualcomm are all using b.L?

Because ARM's built-in power management technology is archaic. No one who has a choice would ever really design something like b.L. Note that Apple doesn't use b.L, nor does Qualcomm use b.L except for their "oops" chips. And when/if Samsung ever gets their own core, it's unlikely that they'll use b.L either. b.L is a poor solution all around. It simply is much too complicated and has too much latency.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
You have to qualify that statement: are you saying that a higher-IPC 8-core will lose to a lower-IPC dual? I intentionally twisted that to prove a point.

Given the a priori obvious thermal constraints, I'd wager that yes, a lower-IPC dual core will beat a higher-IPC 8-core in actual use.

Also, don't let legacy prevent you from moving forward; going forward, a lot of algorithms might have to be redesigned to scale with core count.

Stop, just STOP! It isn't legacy. It's never been about legacy. It's been about the fact that very few things actually have viable concurrency to begin with, and those that do are quite difficult to program in parallel. We've had widespread use of multi-context hardware for a decade, and significant use going back well over two decades.

Making actually useful programs parallel is not easy; in fact, in most cases it is impractically hard. Those that have been made parallel have invariably had either significant control concurrency or significant data concurrency. Neither applies to the vast majority, the vast bulk, of applications.
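(As a toy illustration of this point, with assumed numbers not taken from the thread: Amdahl's law is the usual way to show why "moar cores" buys little when only part of a program can run concurrently.)

```python
# Amdahl's law: upper bound on speedup when only a fraction of the
# runtime can be spread across multiple cores.
def amdahl_speedup(parallel_fraction, cores):
    """Best-case speedup with `cores` cores when only
    `parallel_fraction` of the serial runtime is parallelizable."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# An app whose runtime is 30% parallelizable (an assumed figure) tops
# out around 1.43x no matter how many cores you throw at it:
speedups = {n: round(amdahl_speedup(0.3, n), 2) for n in (2, 4, 8, 64)}
```

Even going from 8 to 64 cores barely moves the needle for such a workload, which is the substance of the "very few things have viable concurrency" argument.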

Also note that ST perf has stumbled on a roadblock due to clock speeds hitting a wall; IPC and core count are really the only ways to achieve more throughput.

Throughput is important for servers; for client workloads, pretty much not at all. Client workloads care about latencies, not throughput. And we continue to get rather substantial ST performance gains.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You could try reading the actually pretty well done article that AnandTech did on the 20nm Samsung b.L chip, which showed that not using b.L and shutting down the A57s resulted in better performance.

I think you need to read that article again if you actually think permanently disabling the A57 cluster increased performance.
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
We need to remember that big.LITTLE in its current incarnations is largely geared towards 4-threaded performance. Some benchmarks may be able to push all 8 cores to a degree, but the normal usage mode is switching between LITTLE cores and big cores to maximize power efficiency. You see this in all of Samsung's big.LITTLE SoCs as well as Qualcomm's.

This is another reason why I think this "consumers go for moar cores" theory is wrong: the octa-core SoCs from Mediatek and Kirin(?) are also based on this 4-threaded performance concept. They do not act like full-blown 8-core parts in the majority of operations. They have, for the most part, been using two clusters of LITTLE cores at different frequencies instead of actual big cores (at least until this 10-core monster was announced).

I am guessing the cheapness of ARM's LITTLE cores (both financially and electrically) as well as varying yields of LITTLE cores per revision play a role here. It is instructive that we now see some newer versions of LITTLE cores clocking upwards of 2.0 GHz, which was previously thought to be impractical. It naturally follows that not all LITTLE cores are created equal, and that may well be another motivating factor for the OEMs to create little.LITTLE SoCs.

Nonetheless, the bottom line here is that these SoCs are still quad performers. The OEMs' advertising of these as octa-cores is, while technically accurate, not based on any consumer demand but on technical and financial reality. So again, I reiterate that the "consumers demand moar cores in a smartphone" myth is, well, a myth.

Given the apriori obvious thermal constraints, I'd wager that yes, a lower ipc dual core will beat a higher ipc 8c in actual use.

I have one counter-example here, albeit an anecdotal one (from a credible source, though). I was curious why we did not see more 2+2 configurations, which should be more power-friendly than typical 4+4 configurations. One of the reasons I have heard is that for typical Android operation two LITTLE cores are not nearly adequate for a smooth user experience. That causes more big-core access and scheduler overhead in 2+2 configurations than in 2+4 or 4+4 configurations, to the point that whatever theoretical power savings a 2+2 configuration offers are not realized in the OS environment.

An assumption that a lower-IPC dual-core beats a higher-IPC octa-core is not sound in theory or in practice.

P.S. Apple's A8X was clearly designed to be a quad-core, as seen in the floor plan. Apple must have disabled one of them for power reasons, yield reasons (TSMC 20nm..), or possibly even financial reasons. It is difficult to imagine that Apple engineers designed a quad-core in order to purposefully fuse off one of the cores eventually (while saving its cache..). So yes, Apple have already moved towards 4-threaded performance. I predict that this is where things will settle for the foreseeable future, just as they have on desktop. So please stop the nonsense that Apple will stay with dual-cores forever thanks to Cyclone's high IPC. (but as poofyhairguy pointed out they REALLY need to add more RAM to their devices first. A7 + 1GB RAM on my iPad Air is a miserable experience)
 

kpkp

Senior member
Oct 11, 2012
468
0
76
Moar details about moar cores:
[attached slide image]
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
It is instructive that we now see some newer versions of LITTLE cores clocking upwards of 2.0 GHz, which was previously thought to be impractical. It naturally follows that not all LITTLE cores are created equal, and that may well be another motivating factor for the OEMs to create little.LITTLE SoCs.

Just because they're doing it doesn't mean it isn't still impractical. We've seen Cortex-A53s at 1.7GHz already, and we've seen Cortex-A7s at 2GHz in the past.

There are of course knobs you can turn to make an implementation reach higher clock speeds than usual. We even heard of demos of Cortex-A9s on older processes reaching upwards of 3GHz.

I have one counter-example here, albeit an anecdotal one (from a credible source, though). I was curious why we did not see more 2+2 configurations, which should be more power-friendly than typical 4+4 configurations.

There's this strange idea that just having the cores there makes the SoC more power hungry. Power gated cores use very, very little power.

If more cores use more power for a similar amount of work accomplished, then the scheduling is bad.
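(A back-of-envelope sketch of Exophase's point, with assumed power figures rather than measurements: a power-gated core leaks orders of magnitude less than an active one, so the unused cores of a 4+4 SoC barely affect the energy of a 2-thread workload.)

```python
# Assumed per-core power figures, chosen only for illustration.
ACTIVE_MW = 750.0   # a big core under load
GATED_MW = 1.0      # a power-gated (dark) core

def soc_power(active_cores, total_cores):
    """Total core power when `active_cores` run and the rest are gated."""
    idle = total_cores - active_cores
    return active_cores * ACTIVE_MW + idle * GATED_MW

two_of_four = soc_power(2, 4)    # 2 threads on a 2+2-style SoC
two_of_eight = soc_power(2, 8)   # same 2 threads on a 4+4-style SoC
# The six extra gated cores add only a few mW on ~1.5 W: well under 1%.
overhead = (two_of_eight - two_of_four) / two_of_four
```

Under these assumptions, "just having the cores there" costs almost nothing; only a scheduler that needlessly wakes them would.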
 

scannall

Golden Member
Jan 1, 2012
1,960
1,678
136
If running lots of weaker cores at higher clock rates was a good idea, then AMD would own the desktop and server world...
 

DrMrLordX

Lifer
Apr 27, 2000
22,937
13,023
136
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.

It works because they have very little choice anyway in terms of the silicon area that ends up being allocated to dark silicon in the design.

Why have dark silicon take down the performance of your big cores? Just make them a little less big (i.e., smaller) and create lots of little cores all around the die. Then you cluster them and prevent multi-cluster activation.

Dark silicon issue solved.

[slide 13 of "Exploring Emerging Technologies in the HPC Co-Design Space"]


(for some light reading, see here :p)
 

Abwx

Lifer
Apr 2, 2011
11,885
4,873
136
Hahah I like the cars. So tacky. Interesting take on how the thing's going to work . . . never engage more than one cluster at once eh. Well if they can fit all that into one package and if they can't find any better way to use the transistors, then so be it.

Moar gearz is the master word....

Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

This limitation is the cause of the PC decline: why bother with a big device that doesn't manage the mandatory order-of-magnitude better performance to stay relevant?

Phone SoCs like this one should ring a bell for the PC industry's actors: current offerings are late with respect to historical trends.
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
There's this strange idea that just having the cores there makes the SoC more power hungry. Power gated cores use very, very little power.

If more cores use more power for a similar amount of work accomplished, then the scheduling is bad.

Notwithstanding that (bad) scheduling overhead is real in big.LITTLE, the scenario I suggested has nothing to do with it. I am unsure why you selectively quoted what I said out of context.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Notwithstanding a (bad) scheduling overhead is real in big.LITTLE, the scenario I suggested has nothing to do with such. I am unsure why you selectively quote what I said out of context.

Because you were curious why 2+2 wasn't more power efficient in the first place, and a lot of other people seem to share this curiosity - which makes no sense to me.
 

DrMrLordX

Lifer
Apr 27, 2000
22,937
13,023
136
Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

I know right? Damn that Linus Torvalds!!!

Oh wait, that's not who you meant, was it . . .
 

lopri

Elite Member
Jul 27, 2002
13,314
690
126
Because you were curious why 2+2 wasn't more power efficient in the first place, and a lot of other people seem to share this curiosity - which makes no sense to me.

You talk as if Global Task Scheduling was there with big.LITTLE from the beginning. Cluster Migration and Core Migration were also part of the plan (and reality), and the number of cores certainly did have a bearing on power consumption.
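(For readers unfamiliar with the distinction, a simplified sketch of two of these software models — my own toy model with an assumed load threshold, not ARM's actual scheduler. Threads are rated by load from 0.0 to 1.0.)

```python
def cluster_migration(thread_loads, threshold=0.6):
    """Early model: the whole SoC runs on either the big or the LITTLE
    cluster, so one heavy thread drags everything onto the big cores."""
    cluster = "big" if any(t > threshold for t in thread_loads) else "LITTLE"
    return {i: cluster for i in range(len(thread_loads))}

def global_task_scheduling(thread_loads, threshold=0.6):
    """Later model: all cores are visible to the OS at once, so each
    thread is placed on a big or LITTLE core individually."""
    return {i: ("big" if t > threshold else "LITTLE")
            for i, t in enumerate(thread_loads)}

loads = [0.9, 0.1, 0.2]  # one heavy thread, two light ones
# Cluster migration wakes the big cluster for all three threads;
# GTS keeps the two light threads on LITTLE cores.
```

Under cluster migration, the number of cores per cluster directly shapes the power bill whenever any one thread is heavy, which is the point being made above.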

Just because they're doing it doesn't mean it isn't still impractical. We've seen Cortex-A53s at 1.7GHz already, and we've seen Cortex-A7s at 2GHz in the past.

There are of course knobs you can turn to make an implementation reach higher clock speeds than usual. We even heard of demos of Cortex-A9s on older processes reaching upwards of 3GHz.

I frankly did not understand this response either, because it seems to largely agree with what I said (which you quoted). Now I get the sense that you objected to my use of the word "impractical." Let me confess that I did not mean much by that word.

No need to get hung up on the past tense.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You talk as if Global Task Scheduling was there with big.LITTLE from the beginning. Cluster Migration and Core Migration were also part of the plan (and reality), and the number of cores certainly did have a bearing on power consumption.

I don't see what this has to do with efficiency of 2+2 vs 4+4 cores, could you explain your line of reasoning to me? Or do you mean that 2+2 using global task switching can be more efficient than 4+4 using cluster migration?

I frankly did not understand this response either, because it seems to largely agree with what I said (which you quoted). Now I get the sense that you objected to my use of the word "impractical." Let me confess that I did not mean much by that word.

No need to get hung up on the past tense.

I'm saying this because I've seen a lot of comments treating upcoming little.LITTLE chips as unusual/exotic, or even as having specially modified A53 cores. I blame marketing for this.
 
Apr 30, 2015
131
10
81
There is an ARM white paper by Peter Greenhalgh on big.LITTLE:

big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7.

It may be of interest to some readers.
 

imported_ats

Senior member
Mar 21, 2008
422
64
86
Moar gearz is the master word....

Anyway, this more-cores strategy is the good one for the customer; only more cores make performance scale linearly with power. If it wasn't for the market domination of you-know-who, PCs would currently use 16-core CPUs.

For consumer workloads, more cores generally result in worse delivered performance. AKA, you are quite far from both reality and theory.