Necessity is the mother of invention. When the cheap and easy performance gains are tapped out, the next level of slightly less cheap and slightly less easy performance enablers will be tapped.
IMO, though, the code in the area of HPC apps is already pretty well squeezed. There is only so much you can do when preparing generic code that needs to be compiled and run on a whole spectrum of hardware configurations.
I don't mind you asking, and I'd be happy to answer to the best of my abilities.
FWIW, I post on a forex forum (forum.mql4.com to be specific) under the username 1005phillip. As you can imagine from my CPU-related threads here, my posts over there are "enthusiast" rated (albeit towards forex rather than CPUs).
At any rate, my algos are not HFT (high-frequency trading); I don't build scalper trading strategies. I design trend followers, reversion-to-the-mean arbitrage models, and other channel models.
It is decidedly pedestrian stuff, because the business of foreign currency trade has been rather stymied by the low national interest rates in the major countries since the 2008 banking meltdown. That's all about to change, though.
It is really fun though. I thought process node development was my dream job, but this is definitely more fun. A different brain teaser every day.