So you're back to questioning whether it is possible to write parallelized SW?
I don't think he's questioning whether it's possible, simply pointing out that most code that isn't "embarrassingly parallel" (like video encoding or synthetic benchmarks) usually butts heads with Amdahl's Law, whilst highly parallelizable specialist programs are typically GPU based (OpenCL / CUDA, etc).
"MOAR CORES" for general usage (web, office, media playback, casual gaming, etc) has ended up just as much "pushing on a string" as "chicken vs egg". For media playback, we're talking 3 W for H.265 UHD via fixed-function decoding hardware. CPU "software" decoding has never been less relevant than it is today. Even for video encoding we've reached the point where instead of power-hungry 8-core laptops or 32-core desktops, people are just using Quick Sync or ShadowPlay at 1/4 of the power, and still ending up with video whose quality loss is below what YouTube's re-encode will degrade it to anyway, regardless of how pristine the source is. Same with "throwaway" video (web / video conferencing, Skype, etc): it's all about mobile / power efficiency these days. Given the sheer economies of scale for mobile devices, I can see more effort being put into improving the quality of sub-10 W fixed-function encoders than into demanding 16-core CPUs for software x264 / x265 encodes.
Ripping a CD (audio encoding) is so fast that the bottleneck is the optical drive, not the CPU, even on a Celeron. On laptops and most "off the shelf" pre-built desktops the bottleneck is the HDD, not the CPU. Office & web browsers run on Celerons (the bottleneck for opening / reading lots of 4 KB files in a web cache is again the mechanical HDD). Low-end "mainstream" photo editing is usually trivial enough (crop, resize, red-eye removal, etc) that it's fast on even the slowest CPU (and is done even on tablets). High-end professional photo editing can be GPU accelerated faster than the fastest CPU. Even AAA gaming still shows i3s beating FX-9590s in 2015 after the 8th annual "this will be the year 8-core FX chips render i5s obsolete" failed prediction... Casual / low-end indie gaming runs on a potato. What's left for the average person (that doesn't include CAD, professional / research applications, etc)? Not much. You can buy an 18-core / 36-thread Intel CPU right now. But it's not "mainstream" for a reason.
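To put rough numbers on the Amdahl's Law point, here's a minimal Python sketch. The function name `amdahl_speedup` and the parallel fractions (50%, 90%, 99%) are my own illustrative assumptions, not measurements of any real program. The formula is the standard one: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical best-case speedup under Amdahl's Law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    cores: number of cores the parallel part is spread across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions only (assumed, not measured from real software).
    for fraction in (0.5, 0.9, 0.99):
        row = ", ".join(
            f"{cores} cores: {amdahl_speedup(fraction, cores):.2f}x"
            for cores in (2, 4, 8, 18, 36)
        )
        print(f"{fraction:.0%} parallel -> {row}")
```

The thing to notice is the ceiling: a program that is only half parallel can never get past 2x no matter how many cores you throw at it, which is why the serial part ends up dominating for most everyday software.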