Intel is removing a great deal of the i7's capabilities to get their UP parts into the mobile sector.
Haswell and its future ilk look awfully powerful to me. The next Atom we'll have to wait and see about (if Silvermont is indeed OOOE, was Intel going to do it anyway, or should we thank Bobcat? Or is it all just a rumor?).
nVidia, AMD and Intel are all dropping huge die budgets into pure compute logic (for Intel, not in their mainstream parts, but in the Xeon Phi).
What else should they do with it (the CPUs are already big enough as it is, and more cache won't help desktops much)? Very few of us have any use for more than 4 cores, yet most people can use a good GPU today. For the most part, the parallel-compute bits are icing if you happen to be able to use them. In a few years I hope that will have changed, though, since software companies should have more motivation once pretty much everybody has the necessary hardware.
The most profitable CPU AMD has designed was an in-order core part.
I don't get what you mean, unless you're making a sarcastic jab at their typical financial performance. They didn't make enough money from their ARM division, so it can't be that, and their last strictly in-order CPU was the K5, which came out 15 years ago. The MediaGX-based Geodes, while technically newer, were something they bought and improved upon, not something they fundamentally designed. The K6 and everything newer have all performed instruction re-ordering, and you could even argue the K6 wasn't a pure AMD design. The 29K died for the sake of x86, so I can't imagine it was more profitable than what followed, either. OTOH, the K8, Llano, and Bobcat have done quite well for AMD, and none of them are in-order.
It was designed for an embedded device by Sony? Do you consider the EE to be a failure?
The EE was never marketed by anyone as anything but the PS2 chip, so I don't see where it had a chance to succeed or fail on its own. The market chooses consoles by games and network, and the games are chosen by publishers. The PS3 was going to sell, and did, on game selection and console pricing. It could have been as weak as the Wii and still sold plenty (not as much, granted), or it could have been 2-3x better and not sold any more at all. The market success or failure of the Cell rests on users buying Cell products because they have a Cell, like the supercomputer(s) made with PS3s (Sony's loss was their gain, and then Sony removed the Linux option) or the IBM systems, and on 3rd parties choosing to integrate the Cell into devices they're designing, where they could choose something else.
When the Cell was supposed, around launch, to be followed by more configurations and future versions, and then everything but Toshiba licensing single SPEs pretty much fell flat... that's a failure, for a design that was intended and marketed to flourish and keep evolving into more mainstream uses, which never appeared. It went into a few high-end Sony and Toshiba TVs as part of the early push, and then not much else happened. IBM made some blades with DP-optimized versions, and then didn't follow those up, despite having new versions planned.
I already linked an article earlier pointing out that the report you just linked was wrong. What IBM has stated is that Cell is getting put into the main POWER line (I believe the POWER8 revision). It makes a lot of sense for them: they can use the mainline POWER8 core along with as many SPUs as needed for a given configuration.
Do you have a link for that? All I've seen is that the 32 series died, and then they made typically vague statements. Even so, that tells me the blades and workstations must have sold like crap, because there's no way they could get Power money for them, and I have a hard time believing potential customers would pay Power prices, either, when Teslas are almost affordable today.
nVidia went in the direction of Cell.
They went in a direction the Cell didn't: an evolving architecture growing up to have the features of other modern computer parts, and making deals with major companies to help them take advantage of the hardware. CUDA, its SDK, and the move toward more sane memory models have helped make them successful in the market, because they have allowed good programmers to go ahead and extract value quickly, and great programmers to make people marvel over what can be done. They could also show people getting actual value from their hardware in reasonable amounts of time, rather than just hypothetical demos. Some of that was technical, but most of it was JHH having more good business sense than all the suits at IBM and Sony combined.
Using clusters of relatively simple vector units to produce massive computational power per mm² or watt. The rise of GPGPU has more to do with the possibility of Cell's removal from mainstream use than anything else.
IMO, if a Cell 2 had come around in 2007-8 with bigger local stores, faster buses, a better PPE, etc., and a Cell 3 had come out in 2010-11 with more of that and a reduced or removed need for the semi-coherent explicit DMA push/pull mess, with all the associated marketing each time, NVidia might not have been the only strong player (software support for the Cell has improved a lot). I mean, I get all the limitations of a 1st-gen part, but only researchers and DoD contractors are generally willing to invest in what looks like it could be a dead end. One of the several things that has kept NVidia's products compelling is that you can be reasonably assured there will be even better hardware and software from them every 2-3 years.
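For anyone who hasn't touched it, this is roughly what that push/pull boils down to on an SPE: every buffer has to be explicitly DMAed between main memory and the 256KB local store, tagged, and waited on before you can even read it. The snippet below is just a from-memory sketch using the spu_mfcio.h intrinsics from IBM's SDK, so treat the exact names and arguments as approximate rather than gospel:

    #include <spu_mfcio.h>   /* MFC DMA intrinsics from IBM's Cell SDK */

    #define CHUNK 4096       /* bytes per transfer; DMA buffers must be aligned */

    /* Buffer living in the SPE's 256KB local store. */
    static char local_buf[CHUNK] __attribute__((aligned(128)));

    /* Pull a chunk from main memory, work on it, push it back out. */
    void process_chunk(unsigned long long ea)   /* ea = effective address in main memory */
    {
        const unsigned int tag = 1;
        int i;

        /* "Pull": explicit DMA get from main memory into local store. */
        mfc_get(local_buf, ea, CHUNK, tag, 0, 0);

        /* Block on the tag group -- the data isn't usable until this completes. */
        mfc_write_tag_mask(1 << tag);
        mfc_read_tag_status_all();

        for (i = 0; i < CHUNK; i++)
            local_buf[i] ^= 0x5a;               /* stand-in for real work */

        /* "Push": explicit DMA put back to main memory, then wait again. */
        mfc_put(local_buf, ea, CHUNK, tag, 0, 0);
        mfc_write_tag_mask(1 << tag);
        mfc_read_tag_status_all();
    }

Compare that to dereferencing a plain pointer on a cache-coherent core (or even a CUDA kernel working against flat device memory), and it's easy to see why I'd have wanted that need reduced or gone in a Cell 2/3.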
Now, if you need compute power you can get it without using an exotic CPU design. Even Intel has dropped billions trying to get into the compute market; they failed horribly with Larrabee, but Xeon Phi may work well in the HPC space.
Intel will succeed, if only by force. Even if their success ends up only modest, they have too much to lose by letting JHH and others corner the market, and they are the only other company with a technology stack (x86 + libs + compilers) that can come out and rival or best CUDA.