OK, so the big question: how do they intend to change data structures to make better use of memory bus time, given memory's high latency relative to the ALUs, when branches depend on data values that aren't known yet? There's a lot that can be done on large data sets with map, filter, and sort implementations going highly parallel, but special compilers that drop OoOE for simplicity's sake have to try to predict an unknown future, and that invariably shifts the burden from the calculation hardware to memory. Most potentially viable improvements researchers have simulated ought to work just as well for improving OoOE as for in-order designs, rather than making OoOE obsolete. The idea of denser, weaker processors had some promise, and may still in some niches, but it failed at scale because the much higher latencies of weaker cores made the bigger, badder servers worth it. Trying to replace hardware OoOE with software speculation could mean an increase in instruction count of multiple orders of magnitude in some cases, chewing up off-chip RAM time along with on-chip die space and power. That, or an abject failure outside of a couple of niches, if they expect the compiler to actually predict memory accesses well rather than implement parallel branching as a transparent part of the ISA.
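To make the instruction-count point concrete, here's a toy sketch (my own example, nothing from their materials) of if-conversion, about the simplest form of compiler-side speculation. The compiler executes both arms of the branch every iteration and selects the result afterward, so the work per branch roughly doubles and both memory streams get touched whether they're needed or not; nest a few of these and the inflation compounds.

```c
#include <stddef.h>

/* Branchy version: branch prediction plus OoOE means only the work
   on the path actually taken gets done (modulo misprediction cost). */
void scale_branchy(float *out, const float *a, const float *b,
                   const int *cond, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cond[i])
            out[i] = a[i] * 2.0f;
        else
            out[i] = b[i] + 1.0f;
    }
}

/* If-converted ("software speculation") version: both arms execute
   every iteration and a select picks the winner.  Both a[i] and b[i]
   are loaded every time, so memory traffic goes up along with the
   instruction count. */
void scale_speculative(float *out, const float *a, const float *b,
                       const int *cond, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float t0 = a[i] * 2.0f;      /* speculated "then" arm */
        float t1 = b[i] + 1.0f;      /* speculated "else" arm */
        out[i] = cond[i] ? t0 : t1;  /* typically a select/cmov */
    }
}
```

One level of this is cheap; the orders-of-magnitude case shows up when branches nest or when the speculated arms contain their own loads that miss in cache.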
Doing it in software, whether ahead of time or in the chip (e.g., Crusoe, Denver), could reduce power consumption, and explicit branching in the ISA might remove some of the overhead of the software-in-the-chip approach. But we've already seen that the market at large won't accept that minor improvement in overall throughput and density when it comes with much higher latencies, even on highly parallel tasks. I have a feeling their marketed "data center workloads" are actually an extremely small niche.
Having said that, I could see a different type of processor like this performing well for machine learning and AI, if it were geared toward that style of work from the start. Systems built for certain niches, or specialized coprocessors, may well have a bright future.
As an aside, their note about the brain project does seem reasonable, regardless of how anything else works out. IBM was able to simulate small animal brains on what amounted to G5 chips, one per neuron, IIRC, some years ago.