Has he not heard about job-based queue systems, which may prove to be more scalable than traditional threaded code?
Strange that something decades old is not traditional, today.

IoW, I'm sure he has, and that has nothing to do with it. Amdahl's Law may not strictly apply to splitting a multitude of partially-dependent tasks into subtasks that may run concurrently on parallel processors (in that one can't produce a neat little "X% can be parallelized" figure), but the net result is similar: extra work, diminishing returns, then hard limits. Putting parts of ready tasks in queues is nothing new, and doesn't magically solve any problems. It sensibly allows for stable, well-performing solutions to already-solved problems, at the cost of rebuilding the code base from scratch.
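The diminishing-returns curve is easy to see with a rough back-of-envelope sketch (the 90% parallel fraction below is an illustrative assumption, not a measurement of any real program):

```python
# Rough Amdahl's Law sketch: even a generously 90%-parallel workload
# hits diminishing returns quickly as core counts climb.

def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup for a workload with the given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for cores in (2, 4, 8, 16, 50):
    print(cores, round(amdahl_speedup(0.90, cores), 2))
```

At 50 cores the speedup is still under 8.5x, which is the "hard limits" part of the argument.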
You can't magically make software that isn't embarrassingly parallel scale out to, say, 50 CPU cores, and most programs will not have the dev resources to make it even to 4-8, like games do. OTOH, as long as gaming stays a big industry, game engines are likely to scale out ever better, and will get close to their limits within genres (big multiplayer FPS engines might make it to 12+ cores, while sim games might finally make real use of 3+, in a few generations).
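For reference, the job-queue pattern mentioned above is just this — workers pulling ready tasks off a shared queue (a minimal sketch; the squaring stands in for real, already-independent work, which is exactly the part the pattern doesn't create for you):

```python
import queue
import threading

# Minimal job-queue sketch: worker threads pull independent tasks from a
# shared queue. This only helps when tasks are already divisible; it does
# not make dependent work parallel.

jobs = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        n = jobs.get()
        if n is None:          # sentinel: shut this worker down
            jobs.task_done()
            break
        results.put(n * n)     # stand-in for real, independent work
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for n in range(10):
    jobs.put(n)
for _ in threads:
    jobs.put(None)
jobs.join()
for t in threads:
    t.join()

print(sorted(results.queue))
```

The queue is the easy part; carving the program into tasks that can sit in it is the rebuild-from-scratch cost.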
In all reality, this is not a problem. The context is dead ideas for general computing; a dead horse. Parallel-geared coprocessors: yes. Many weaker cores for parallel processing in lieu of our hundreds-of-millions-of-transistors big cores: no (at least not unless an affordable silicon replacement is made, in which case all bets are off). It's also not a problem in that, up to several cores, we can take advantage of process- and thread-level concurrency that does not require each process to itself scale out, but merely not to block another one at that time.
Depending on the task, parallelism can be pointless.
However, a properly designed OS can easily keep a ton of small lightweight cores busy just by handling background tasking, while freeing up the heavyweight cores for the more important, user-demanded heavyweight operations that keep the system responsive. This, imho, would be the holy grail of "parallelism" at the OS level for a multi/many-core CPU at this point. Fewer places for the OS to get bottlenecked or backlogged equals a much more responsive system. It's not exactly traditional parallelism, though.
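As a toy illustration of that routing idea — background chores to a "small core" pool, latency-sensitive work to a "big core" pool (the pool sizes and the background/foreground split here are invented for illustration; a real OS does this with scheduling classes and core affinity, not thread pools):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of big/little task routing. The max_workers values are
# made up; they stand in for counts of fast vs. efficiency cores.
big_pool = ThreadPoolExecutor(max_workers=2)    # pretend: fast cores
small_pool = ThreadPoolExecutor(max_workers=6)  # pretend: efficiency cores

def submit(task, background=False):
    """Route identified background work away from the big cores."""
    pool = small_pool if background else big_pool
    return pool.submit(task)

# Usage: indexing runs "on the little cores", the UI task on the big ones.
index_job = submit(lambda: sum(range(1000)), background=True)
ui_job = submit(lambda: "frame rendered")
print(index_job.result(), ui_job.result())

big_pool.shutdown()
small_pool.shutdown()
```

The hard part, as noted below, is the `background=True` decision itself, not the routing.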
But exactly which tasks are those? Identifying them is the problem, and if you give devs the choice, they'll want theirs on the big cores every single time. If the choice is wrong, the user basically has a slower computer than they should have. Meanwhile, giving identified background processes less time on the big CPU cores offers the same net result without the added complications. We didn't go from mainframes to single-CPU to SMP for nothing...