Doesn't really seem to be much of an issue. There are going to be optimal ways to code for each architecture, and who better to tell the devs how to get the best performance from the parts than the makers?
I'm wondering though, and maybe zlatan or someone with (way) more knowledge than me can help: would this be a simple task, like just specifying some options (maybe when you compile or something), or would devs need to write significantly different codebases for the different architectures?
I guess what I mean is, would these different optimisations be fairly trivial to accomplish, or would you need to write loads of extra code for each architecture separately?
If you want to build a super-optimized engine for the PC, then it is important to consider some architecture-specific paths. This shouldn't be a problem if the engine is structured properly, but it will take additional time and resources. We already spend an awful lot of time trying to understand the drivers, but with the new explicit APIs this will change a lot, because the kernel driver won't affect performance. In that case all of the code can be profiled and we will be able to solve performance issues more easily. This model is well known from the consoles. In the end this will make our lives easier, even with some architecture-specific paths.
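To give a rough idea of what an "architecture-specific path" could look like in an engine, here is a minimal sketch. It assumes the engine picks a code path once at startup from the adapter's PCI vendor ID. The vendor IDs are the standard ones; every other name (GpuVendor, RenderPathConfig, select_render_path, group_count) and the per-vendor values are hypothetical placeholders for illustration, not tuning advice from any vendor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical vendor tag used by the engine to pick a code path.
enum class GpuVendor { Amd, Intel, Nvidia, Unknown };

// Map a PCI vendor ID (as reported by the graphics API for the adapter)
// to the engine's vendor tag.
constexpr GpuVendor vendor_from_pci_id(std::uint32_t id) {
    switch (id) {
        case 0x1002: return GpuVendor::Amd;
        case 0x8086: return GpuVendor::Intel;
        case 0x10DE: return GpuVendor::Nvidia;
        default:     return GpuVendor::Unknown;
    }
}

// Hypothetical per-architecture settings. A real engine would carry far
// more than this (shader variants, memory layouts, barrier strategy...).
struct RenderPathConfig {
    const char*   name;
    std::uint32_t threads_per_group;  // compute thread-group size
};

// Choose the path once; the rest of the engine only sees the config.
// 64 and 32 match the GCN wavefront and Nvidia warp widths; the other
// values are placeholders.
constexpr RenderPathConfig select_render_path(GpuVendor vendor) {
    switch (vendor) {
        case GpuVendor::Amd:    return {"gcn",     64};
        case GpuVendor::Nvidia: return {"nvidia",  32};
        case GpuVendor::Intel:  return {"intel",   32};  // placeholder
        default:                return {"generic", 64};  // placeholder
    }
}

// Shared code stays vendor-neutral: it just consumes the selected config.
inline std::uint32_t group_count(std::uint32_t total_threads,
                                 const RenderPathConfig& cfg) {
    assert(cfg.threads_per_group > 0);
    return (total_threads + cfg.threads_per_group - 1) / cfg.threads_per_group;
}
```

The point of structuring it this way is that the vendor check happens in one place, so adding or retuning a path means touching a small table rather than the whole codebase. For example, group_count(1000, select_render_path(vendor_from_pci_id(0x10DE))) would dispatch 32 groups on the hypothetical Nvidia path and 16 on the GCN one.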
This post from Nvidia is a good thing for the devs. I know GCN really well, primarily because of the consoles, and I know how things work on Intel, because they also provide documents. But I don't really know how to optimize for Nvidia. This is a stepping stone from them. It is not much compared to what we can get from AMD and Intel, but at least it's a start.