With all the instruction sets that have been added in recent years (MMX, 3DNow!, SSE, SSE2, etc.), this seems like a REALLY big hole to me.
It's not only useful for the obvious ntohl/htonl calls you make every time you resolve a DNS name to an IP address. It's also incredibly useful for writing a high-speed strcmp: if the character order is changed to match the native endian byte order, you can compare 4 chars at a time (or 8 on AMD64). This occurs a lot more than you might think - consider how lists of things get sorted, for instance, or how a DB index works.
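To make the 4-chars-at-a-time idea concrete, here is a minimal sketch. The names load_be32 and compare4 are made up for illustration, and it assumes both buffers have at least 4 readable bytes. The shifts are written portably; on a little-endian x86 an optimizing compiler will usually turn them into a plain load plus a byte swap, but that is up to the compiler.

```cpp
#include <cstdint>

// Hypothetical helper: pack 4 chars so the FIRST char ends up in the most
// significant byte. An unsigned integer compare then agrees with a
// byte-by-byte (memcmp-style) comparison of those 4 chars.
inline std::uint32_t load_be32(const char* p) {
    const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
    return (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
           (std::uint32_t(b[2]) << 8)  |  std::uint32_t(b[3]);
}

// Hypothetical helper: compare 4 chars with a single integer compare.
// Returns a negative, zero, or positive value, like memcmp.
// Assumes both pointers reference at least 4 readable bytes.
inline int compare4(const char* a, const char* b) {
    const std::uint32_t x = load_be32(a);
    const std::uint32_t y = load_be32(b);
    return (x > y) - (x < y);
}
```

Note this only handles a fixed 4-byte chunk; a real strcmp replacement would also have to deal with the terminating NUL and with alignment.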
I use this technique in a sorting program I wrote. It caches the first 8 bytes (in reverse order) of each line being sorted as an __int64, so when figuring out where each line goes in the priority_queue, a 64-bit integer compare can be done rather than calling strcmp. A normal string comparison is only needed if the first 8 characters of the two strings are an exact match.
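Roughly, the idea looks like the sketch below. This is not the actual program, just an illustration under a few assumptions: the names KeyedLine, prefix_key, line_less and sorted_lines are invented here, lines are NUL-terminated with no embedded NULs, short lines are padded with zero bytes, and the key is an unsigned 64-bit integer so that characters above 0x7F order the same way strcmp orders them.

```cpp
#include <cstdint>
#include <cstring>
#include <queue>
#include <string>
#include <vector>

// Hypothetical record: the cached 8-byte prefix plus a pointer to the line.
struct KeyedLine {
    std::uint64_t key;   // first 8 chars, first char in the top byte
    const char* text;    // NUL-terminated line (not owned)
};

// Build the key; lines shorter than 8 chars are padded with zero bytes.
inline std::uint64_t prefix_key(const char* s) {
    std::uint64_t k = 0;
    for (int i = 0; i < 8; ++i) {
        k <<= 8;
        if (*s != '\0') k |= static_cast<unsigned char>(*s++);
    }
    return k;
}

// Integer compare first; fall back to strcmp only when the prefixes tie.
inline bool line_less(const KeyedLine& a, const KeyedLine& b) {
    if (a.key != b.key) return a.key < b.key;
    return std::strcmp(a.text, b.text) < 0;
}

// priority_queue pops the "largest" element, so invert to get smallest first.
struct LineGreater {
    bool operator()(const KeyedLine& a, const KeyedLine& b) const {
        return line_less(b, a);
    }
};

// Sort lines by pushing them through a priority_queue.
// The input vector must outlive the call, since only pointers are queued.
inline std::vector<std::string> sorted_lines(const std::vector<std::string>& lines) {
    std::priority_queue<KeyedLine, std::vector<KeyedLine>, LineGreater> pq;
    for (const std::string& l : lines)
        pq.push(KeyedLine{prefix_key(l.c_str()), l.c_str()});
    std::vector<std::string> out;
    while (!pq.empty()) {
        out.push_back(pq.top().text);
        pq.pop();
    }
    return out;
}
```

The win comes from most comparisons being settled by the cached key without ever touching the string data; how big that win is depends on how often lines share their first 8 characters.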
Better yet, when AMD added the 64-bit extensions, they should have made the new modes mixed-endian, like a PowerPC. Those chips prefer big endian by default, but also natively support little endian because the designers had emulators in mind. It would be nice for a Hammer to be mixed-endian so Mac emulation would still be faster than a comparably priced Mac, even after the price drops that came with the G5 introduction.