David Kanter dissects Haswell


Tuna-Fish

Golden Member
Mar 4, 2011
hrmmm .. one of these genius compiler guys should invent the binary compiler. Take a binary, say targeted at 386, and recompile it toward a new arch.
Should be doable.

These things exist, but they are horribly slow and generally cannot really improve anything. Compilers extract speed by relying on constraints and invariants provided in the source; a binary provides almost none, so you really cannot do much better than the original.

(TSX)
TSX is also supported in VS2012 and the Intel compilers (v13), and in GCC from 4.8 onward.

Proper use of RTM is not just a recompile away. We need a lot of software infrastructure that isn't there yet to make full use of it.
 

Cerb

Elite Member
Aug 26, 2000
- You don't have to wait for your favorite software vendor to benefit from your new arch. Your software vendor may never get around to it, or may not even be in business anymore. There's a ton of scenarios where this makes sense, IMO.
We have libre software, and languages based on IRs, both of which are quite efficient ways to do it.

With AVX(2), there is the option of a performance boost just from recompiling vector SSE2 code (scalar code won't gain anything). That's not the norm: getting any meaningful performance boost out of an existing binary will not be easy, and may be close to impossible, without taking the time to analyze all of the code's side effects and determine which ones are and are not actually relied upon.

The information you really want, to make it work well, is some form of AST and/or object file set worked out from the source code. Most of that is thrown away as part of making the final binary. Most of it can still be made use of with an abstract virtual machine language designed to describe the program's activity, not some other machine's activity. The JVM and .NET CLR are the premier examples of this today, and LLVM is getting pervasive in odd places you wouldn't normally look (drivers, for instance). These pseudo-machine languages are very much made to keep enough semantic information for a compiler, while being far enough along from the source that the CPU time to compile won't need to be too much. Binaries for actual machines, however, have all their quirks wrapped up in them, and use every little trick the compiler (or programmer, in highly-optimized cases) could figure out to speed them up. In the case of whole new sets of instructions, the code would need to be reverse-engineered, refactored, and re-optimized.

Now, for compatibility, I'm all for it: making a program that breaks in 64-bit work fine in 64-bit (I really do wish MS supported thunking all the way down, because I occasionally need to read files from ancient programs). But for performance, it may not even be possible to improve things that way, and if it is, it would take an insane amount of work.
 