Linus Torvalds: Too many cores = too much BS


DrMrLordX

Lifer
Apr 27, 2000
22,921
12,993
136
Linus can complain all he likes. AMD is pushing GCN compute while Intel is introducing compute capability with Gen8 graphics on Broadwell. The resources for massively parallel processing are there/will be there in consumer-level hardware. The winners will be the ones who figure out how to make use of it all.
 
Dec 30, 2004
12,553
2
76
I saw some email exchanges from him and it mostly went along the lines of:

"You're an idiot and this is why you're an idiot..."
<bunch of technical facts>
"So yes, you're an idiot."

the word idiot honestly means nothing to me anymore. it means "I'm wrong, and I should have thought of these reasons why"

it's only personal if you make it personal

you just gotta stop identifying with your intellect
 

Mushkins

Golden Member
Feb 11, 2013
1,631
0
0
Linus's argument was not "more cores are never useful." It's that they have a cost, directly in the form of die size and transistor count -- in most cases, for most people, it is better to spend those transistors elsewhere, or simply omit them and save on cost.

I don't know why his argument is being so frequently, so tremendously misunderstood. He's not trying to take your cores away... for those of you that actually make use of them.

Because time and time again the man fails to make an objective argument while sticking to facts and presentable evidence. Instead, he goes into full-rant-mode, talks down to people, and acts a fool. The guy seems to thrive off of turning technical discussion into meaningless emotional argument. He's trolling, whether he realizes it or not, and he does it all the time.

He's a smart, accomplished guy. But instead of writing scholarly articles and white papers, he makes rants and slanted forum posts. That makes it hard to take him seriously most of the time, and he's catering to the crowd that's going to get baited into his emotional bickering instead of sticking to the technical facts.

He also makes a lot of flawed, generic statements and doesn't watch the gap between what he means and what he's actually saying. I think anyone looking objectively at the OP's link will be able to catch that Linus is simply saying that Moar Cores does not = Moar Better, and that we shouldn't fall for the marketing hype everyone's slinging, while still acknowledging that for some jobs, more cores *is* the right tool. Unfortunately, his writing doesn't say that word for word; instead he words it as if it's a universal truth, as if parallel programming will *never* advance further than it currently has and we'll *never* benefit any further in general computing from more cores.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Linus can complain all he likes. AMD is pushing GCN compute while Intel is introducing compute capability with Gen8 graphics on Broadwell. The resources for massively parallel processing are there/will be there in consumer-level hardware. The winners will be the ones who figure out how to make use of it all.

That still doesn't invalidate anything he said. If anything, it supports it.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Linus can complain all he likes. AMD is pushing GCN compute while Intel is introducing compute capability with Gen8 graphics on Broadwell. The resources for massively parallel processing are there/will be there in consumer-level hardware. The winners will be the ones who figure out how to make use of it all.

That has nothing to do with what Linus said.
 

DrMrLordX

Lifer
Apr 27, 2000
22,921
12,993
136
That still doesn't invalidate anything he said. If anything, it supports it.

Are you so sure?

Torvalds wants more IPC. Who doesn't? Fact is, you can't just wave a magic wand and make IPC appear.

The argument for or against mass-parallelism in consumer-level software is a red herring. The real issue is: how can we make computer processors faster? The answer from Intel and AMD is: more parallel computing resources. Torvalds doesn't like that answer, but he appears to have misapprehended the problem being addressed by the introduction of such hardware.

Sure, there are problems with expecting every rank-and-file coder to make use of GCN cores or HDwhatever graphics. That's . . . inevitable. The march towards many-core consumer hardware carries on. People like Torvalds, no matter how influential, aren't going to stop it.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
Are you so sure?

Torvalds wants more IPC. Who doesn't? Fact is, you can't just wave a magic wand and make IPC appear.

The argument for or against mass-parallelism in consumer-level software is a red herring. The real issue is: how can we make computer processors faster? The answer from Intel and AMD is: more parallel computing resources. Torvalds doesn't like that answer, but he appears to have misapprehended the problem being addressed by the introduction of such hardware.

Sure, there are problems with expecting every rank-and-file coder to make use of GCN cores or HDwhatever graphics. That's . . . inevitable. The march towards many-core consumer hardware carries on. People like Torvalds, no matter how influential, aren't going to stop it.

GPGPU compute is still a tiny niche and will most likely continue to be so. It's hard to see much actual progress into new segments since the first CUDA release. The only reason you get more compute resources is that they're tied to graphics performance.

And you should reread what Linus writes. It seems you got it all wrong.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Are you so sure?

Torvalds wants more IPC. Who doesn't? Fact is, you can't just wave a magic wand and make IPC appear.

The argument for or against mass-parallelism in consumer-level software is a red herring. The real issue is: how can we make computer processors faster? The answer from Intel and AMD is: more parallel computing resources. Torvalds doesn't like that answer, but he appears to have misapprehended the problem being addressed by the introduction of such hardware.

Sure, there are problems with expecting every rank-and-file coder to make use of GCN cores or HDwhatever graphics. That's . . . inevitable. The march towards many-core consumer hardware carries on. People like Torvalds, no matter how influential, aren't going to stop it.
All of that is beside the point. GCN, CUDA, OpenCL, etc., do nothing for general-purpose computing. Neither, however, do weaker cores in abundance. There are still those out there expecting the next big ultra-wide computer with a magical new compiler to change that, which, while it can't be said to be impossible, cannot be expected to work out. Meanwhile, faster clock speeds, lower latencies, better speculation, etc., keep improving real performance, if only slightly.

GCN, and other specialized hardware, is a side item, and not an issue. Torvalds is even rather reasonable about that, and all for suitable tasks not wasting the CPU's time.
 

Madpacket

Platinum Member
Nov 15, 2005
2,068
326
126
If you read between the lines, better OoO is what he really wants, but this is becoming harder and harder to achieve. He complains about moar cores because the industry is struggling over which direction to move in. This is a natural progression during a period of uncertainty, but the market will eventually work itself out.
 

piasabird

Lifer
Feb 6, 2002
17,168
60
91
When it comes to gaming, the best way to show the benefit is to run the benchmarks and prove it is faster: larger cache, faster read and write times, better throughput, faster RAM, etc.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,114
136
If you read between the lines, better OoO is what he really wants, but this is becoming harder and harder to achieve. He complains about moar cores because the industry is struggling over which direction to move in. This is a natural progression during a period of uncertainty, but the market will eventually work itself out.

Right, since he's a software engineer, it's logical for him to look for single-threaded gains. Heck, if we were getting ~25% more throughput per generation we would worry less about multicore - but those days are over. The market is choosing multicore - just look at what the CPU/SoC vendors are shipping. ISVs' products are just a lagging indicator.
 

Ramses

Platinum Member
Apr 26, 2000
2,871
4
81
My Grandmother would have told Linus to hold out both hands, wish in one and crap in the other, see which one gets filled first.

Then she would have made some amazing biscuits.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Right, since he's a software engineer, it's logical for him to look for single-threaded gains. Heck, if we were getting ~25% more throughput per generation we would worry less about multicore - but those days are over. The market is choosing multicore - just look at what the CPU/SoC vendors are shipping. ISVs' products are just a lagging indicator.

To be fair, on the flip side the hardware guys have a conflict of interest when it comes to pursuing multi-core MPU development versus increasingly larger core development.

It's much less costly and less time-consuming to task your R&D team with making a single tiny core with some modest IPC improvements and then cutting and pasting 6 or 18 copies of that small core next to each other on the die layout schematic, versus allocating the exact same silicon footprint and xtor budget as those 6 cores consume but tasking your R&D team with coming up with a single or dual large-core design.

Hardware guys are like, "meh, let's just make it easier on ourselves: design a single small core, cookie-cutter the hell out of our SoC, and tell the world we have no choice because the software guys keep claiming parallelism is a chicken-and-egg problem. Well, we did our part to solve that problem - hard part done - now the software guys just need to get off their lazy asses and deliver on the programming."
 

DrMrLordX

Lifer
Apr 27, 2000
22,921
12,993
136
And you should reread what Linus writes. It seems you got it all wrong.

I did read what he wrote. Nobody is pushing dozens of small, weak cores on anyone in actual retail products. Do you really think he went through all that trouble to complain about Xeon Phi? AMD's octa-core products have relatively small market penetration, and Intel has segmented their hexa-core and octa-core products into a high-priced enthusiast niche. It is not rational to conclude that he thinks such products represent the future of consumer computing devices.

GCN/Gen8 is what is or will be sold to the consumer. What else is a GPU but a bunch of underpowered little cores working on highly-parallelized workloads?

GCN, and other specialized hardware, is a side item, and not an issue.

It will be an issue when Intel begins promoting it. That is not meant to be a snide remark, either; once Intel gets behind stuff like Broadwell-K and (more likely) Skylake, expect things to change more rapidly. HSA just hasn't been a major market force yet, and it may never be.
 

mindbomb

Senior member
May 30, 2013
363
0
0
I did read what he wrote. Nobody is pushing dozens of small, weak cores on anyone in actual retail products. Do you really think he went through all that trouble to complain about Xeon Phi? AMD's octa-core products have relatively small market penetration, and Intel has segmented their hexa-core and octa-core products into a high-priced enthusiast niche. It is not rational to conclude that he thinks such products represent the future of consumer computing devices.

yea, he also complained about the lack of out-of-order execution, which isn't an issue on modern x86 processors (they're all OoO these days), so it seemed to me he was referring specifically to the marketing of CUDA on Nvidia GPUs when he said parallelization isn't gonna happen on a large enough scale to be popular outside its niche.

And he chooses quad core as an example of a sane number of cores, which isn't a huge contrast with the six in a high-end 5820K.
 

njdevilsfan87

Platinum Member
Apr 19, 2007
2,342
265
126
yea, he also complained about the lack of out-of-order execution, which isn't an issue on modern x86 processors (they're all OoO these days), so it seemed to me he was referring specifically to the marketing of CUDA on Nvidia GPUs when he said parallelization isn't gonna happen on a large enough scale to be popular outside its niche.

I think it could, and actually should, happen in video games. Each game entity can still be coded serially; processing each entity's code will finish at unpredictable times because of user actions, physics, what the entity actually does, and so on. But developers can try to minimize that variance, then split the load among as many cores as possible and watch the performance scale. It won't scale perfectly, but there would be a notable added benefit (rough sketch below).

1) More IPC = allows for more complex, accurate, realistic code to be processed in a timely manner.
2) More cores = allows many instances of #1 to be run on the same system.

MMOs are something that would benefit a lot from figuring this out. In the end we need both.
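
To make the entity-splitting idea concrete, here's a minimal sketch; it's not taken from any real engine, and Entity, update(), and the chunking scheme are all hypothetical, just enough to show per-entity code staying serial while the entity list is divided across worker threads:

```cpp
// Minimal sketch, not taken from any real engine: per-entity update code stays
// serial, but the entity list is split across a small pool of worker threads.
// Entity, update(), and the chunking scheme are all hypothetical.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct Entity {
    float x = 0.0f, vx = 1.0f;
    void update(float dt) { x += vx * dt; } // each entity's own logic is serial
};

void update_all(std::vector<Entity>& entities, float dt, unsigned workers) {
    workers = std::max(1u, workers);
    const std::size_t chunk = (entities.size() + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = w * chunk;
        const std::size_t end = std::min(entities.size(), begin + chunk);
        if (begin >= end) break;
        // Each worker owns an independent slice; if entities read or write each
        // other's state during update(), this needs extra synchronization.
        pool.emplace_back([&entities, begin, end, dt] {
            for (std::size_t i = begin; i < end; ++i) entities[i].update(dt);
        });
    }
    for (auto& t : pool) t.join();
}

int main() {
    std::vector<Entity> entities(100000);
    update_all(entities, 1.0f / 60.0f, std::thread::hardware_concurrency());
    return 0;
}
```

The catch is the comment in the middle: the moment entities interact with each other mid-update, you need synchronization or a multi-pass design, which is exactly where the "won't scale perfectly" part comes from.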
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I did read what he wrote. Nobody is pushing dozens of small, weak cores on anyone in actual retail products.
Not now, no, because most of them learned. However, that is exactly what the prior console generation was, and the Toshiba/Sony part of it was allegedly going to revolutionize everything.

It will be an issue when Intel begins promoting it. That is not meant to be a snide remark, either; once Intel gets behind stuff like Broadwell-K and (more likely) Skylake, expect things to change more rapidly. HSA just hasn't been a major market force yet, and it may never be.
What is Skylake going to do that's any different? Skylake is another fat core design, doing even more fat core things. I've seen nothing about it morphing into a Xeon Phi with a GPU attached, or anything (and even a Xeon Phi is evidence for the argument: they went with much stronger cores as it developed, instead of cramming the maximum amount of ALUs and SIMD units on the die).

If you read it, it should be obvious: if it takes a great deal of effort to parallelize 10% of the work for some arbitrary speedup, you're still taking >90% of the original time, so why not speed everything up by a smaller amount instead of getting such a paltry speedup for one thing? And, if it's common enough, then fully offload it to special hardware and be done, instead of getting weak CPU cores for the job.
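
To put rough numbers on that (my own illustrative figures, nothing from Linus's post): Amdahl's law says the overall speedup from accelerating a fraction p of the work by a factor s is 1 / ((1 - p) + p/s).

```cpp
// Back-of-the-envelope Amdahl's law: parallelizing a small fraction of the
// work buys very little, no matter how many cores you throw at it.
#include <cstdio>

// Overall speedup when a fraction p of the runtime is sped up by factor s.
double amdahl(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}

int main() {
    std::printf("10%% of the work on 8 cores   : %.3fx\n", amdahl(0.10, 8.0));    // ~1.096x
    std::printf("10%% of the work on 1000 cores: %.3fx\n", amdahl(0.10, 1000.0)); // ~1.111x
    std::printf("everything sped up by 15%%    : %.3fx\n", amdahl(1.00, 1.15));   // 1.150x
    return 0;
}
```

Even with a thousand cores, the 10%-parallel case tops out around 1.11x, while a modest 15% across-the-board improvement beats it.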
 

DrMrLordX

Lifer
Apr 27, 2000
22,921
12,993
136
Not now, no, because most of them learned. However, that is exactly what the prior console generation was, and the Toshiba/Sony part of it was allegedly going to revolutionize everything.

While Cell failed, current-gen console parts have essentially done the same thing all over again, only worse: instead of one "general purpose" core with several (6-8, depending on how many were fused off) SPEs, now we have 8 "general purpose" cores with, what, 1024 shader units? Those shader units can be used for computation just as easily as they can be used for graphics, too. Whatever lesson Cell was meant to teach the industry as a whole, the one that firms like AMD and Intel don't seem to have learned is "parallelism in consumer hardware is bad, mmmkay".

That being said, those parts aren't necessarily the defining hardware in what will be the next generation of processors. Cell was meant to be used in darn near everything, while those custom AMD parts are pretty niche.

What is Skylake going to do that's any different? Skylake is another fat core design, doing even more fat core things.

Skylake is going to have Gen9 graphics, which will be just another step in the evolution towards APU-like behavior by mainstream Intel products. Sure, it'll have "fat cores" for the same reason that Cell and AMD's custom Jaguar chip (among others) have them. Fact is that a Skylake owner who runs software capable of GPGPU on Gen8/9 graphics will get much more out of their CPU than someone who does not . . . sort of like a Kaveri owner today.

What you will also see are more forum users on sites like this complaining that the IPC improvements for Skylake aren't so great. It remains to be seen what Intel can do to goose up "fat core" performance as they move forward. Judging by the early Broadwell results (yeah, I know, it's still early and that's a low-power part), I'm not holding my breath waiting for +20% IPC over Haswell or anything like that.

If you read it, it should be obvious: if it takes a great deal of effort to parallelize 10% of the work for some arbitrary speedup, you're still taking >90% of the original time, so why not speed everything up by a smaller amount instead of getting such a paltry speedup for one thing? And, if it's common enough, then fully offload it to special hardware and be done, instead of getting weak CPU cores for the job.

See, there's the rub: "offloading it to special hardware" is exactly what AMD and (eventually) Intel are going to want people to do. It's not like they're asking people to use add-on cards exclusively here (in Intel's case, they only want/expect a very small niche to use their Phi products). Intel is spending perfectly good die real-estate on Gen8/Gen9 iGPUs that are more than just an afterthought or convenience to the user.

If you really want to know why Intel is going this route (aside from the possibility that they just want to irritate Torvalds), go ask them. I will tell you this much, though: if anyone can provide the necessary software/compiler tools to make GPGPU accessible to everyday coders, it'll be Intel. Not so sure that their drivers will be up to snuff, but the compilers? Yeah, Intel will do great there.
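
For a sense of what "GPGPU accessible to everyday coders" could look like on the tooling side, here's a hedged sketch using directive-based offload (OpenMP's target construct, which several mainstream compilers support); the arrays and sizes are made up, and whether Intel's toolchain ends up pushing exactly this route is pure speculation on my part:

```cpp
// Illustrative only: a vector add offloaded to an accelerator with one pragma.
// Assumes a compiler built with OpenMP 4.x target-offload support; without it,
// the loop simply runs on the host CPU.
#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    float* pa = a.data();
    float* pb = b.data();
    float* pc = c.data();

    // The compiler generates the device kernel and the host<->device copies.
    #pragma omp target teams distribute parallel for \
        map(to: pa[0:n], pb[0:n]) map(from: pc[0:n])
    for (int i = 0; i < n; ++i)
        pc[i] = pa[i] + pb[i];

    std::printf("c[0] = %.1f\n", c[0]); // expect 3.0
    return 0;
}
```

The point isn't the syntax; it's that the heavy lifting (kernel generation, data movement) lands on the compiler and runtime rather than on the rank-and-file coder.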
 

seitur

Senior member
Jul 12, 2013
383
1
81
Right, since he's a software engineer, it's logical for him to look for single-threaded gains. Heck, if we were getting ~25% more throughput per generation we would worry less about multicore - but those days are over. The market is choosing multicore - just look at what the CPU/SoC vendors are shipping. ISVs' products are just a lagging indicator.
Core count is at its limit, though. Small-core products don't really benefit much from more than 4 general-purpose cores. Sure, for some applications up to 8 cores may be viable, but those are and will be a minority. 4 small and 4 smaller general-purpose cores in a big.LITTLE SoC is the absolute max that would be viable in mobile devices, and for non-mobile such a thing is worthless. Core count in small-core CPUs/SoCs is already at its limit. Unless you think vendors will start pushing 12-general-purpose-core smartphone SoCs? ;p

What you're looking at in the future of both x86 and ARM small cores is demand for higher IPC rather than further demand for even more small general-purpose cores in one CPU/SoC.


So Linus is right, imo. Recent history supports his view too. Intel demolished AMD by having superior per-core performance. The ARM field? Similar trend. The biggest players, Qualcomm and Apple, also went for strong per-core performance (vs. other ARM SoCs) in their chips.


For general-purpose cores, per-core performance will continue to be the most important factor. The future of computation lies in higher IPC, GPGPU, specialized cores, and the search for a silicon replacement (for further per-core performance increases as silicon approaches its limit), rather than in increasing the number of small general-purpose cores.
 

NTMBK

Lifer
Nov 14, 2011
10,444
5,813
136
To be fair, on the flip side the hardware guys have a conflict of interest when it comes to pursuing multi-core MPU development versus increasingly larger core development.

It's much less costly and less time-consuming to task your R&D team with making a single tiny core with some modest IPC improvements and then cutting and pasting 6 or 18 copies of that small core next to each other on the die layout schematic, versus allocating the exact same silicon footprint and xtor budget as those 6 cores consume but tasking your R&D team with coming up with a single or dual large-core design.

Hardware guys are like, "meh, let's just make it easier on ourselves: design a single small core, cookie-cutter the hell out of our SoC, and tell the world we have no choice because the software guys keep claiming parallelism is a chicken-and-egg problem. Well, we did our part to solve that problem - hard part done - now the software guys just need to get off their lazy asses and deliver on the programming."

Doesn't going multicore just make the interconnect design more difficult? Keeping 18 cache-coherent cores efficiently fed with data can't be an easy job!
 

Nothingness

Diamond Member
Jul 3, 2013
3,301
2,374
136
Doesn't going multicore just make the interconnect design more difficult? Keeping 18 cache-coherent cores efficiently fed with data can't be an easy job!
Yes, and place and route can get very difficult depending on the topology of your interconnect.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Doesn't going multicore just make the interconnect design more difficult? Keeping 18 cache-coherent cores efficiently fed with data can't be an easy job!

I would say that was part of the problem AMD had with Zambezi/Piledriver. They needed a better interconnect to cut down a lot of the excess cache; part of the problem was that each module had 2 MB of exclusive cache, meaning a shared L3 was needed for performance, bloating the die.
 

SunnyD

Belgian Waffler
Jan 2, 2001
32,675
146
106
www.neftastic.com
That's exactly the issue. Why write software for non-existent hardware? But we've also seen where ambitious hardware aspirations have gone, too (Itanium being the latest victim).

And I'm not advocating that anybody do that at this point in time. I do advocate that research continue on both the hardware and software fronts to achieve better general parallelism, making for better general-purpose scaling gains across the board. That will be far more meaningful to computing than what we have now.

What I am pointing out, though, is that Torvalds's argument is too limited in scope without proper context, which is what I'm trying to bring to the conversation. At least you and a few others see that. It's a shame that Torvalds himself doesn't. Or at least I don't think he does, given the way he talks.

It will be a slow road until some sort of breakthrough is discovered.