By dedicated hardware, I mean die space in the part of the chip where textures come in from memory and are buffered and translated for use by the rest of the hardware. That die space is spent on decompressing texture blocks, which makes compression in the common formats a free bandwidth saver for both the developer and the user. Compressed blocks are small: the most common ones cover 4x4 pixels and take 64 bits (DXT1) or 128 bits (DXT3, DXT5), which is smaller than a single DDR burst (512 bits for a 64-bit channel). So it's easy enough to read in the whole block and then serve out all the needed pixels from that block to the shaders that need them, or simply apply them to a surface. DXT1-5, for instance, are going to be free on almost any GPU from the last 10 years.
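
To make the "read the whole block, then serve out pixels" point concrete, here is a rough software sketch of what the decode amounts to for a single DXT1 (BC1) block. It is only an illustration, not how any particular GPU implements it: the function names are made up for this post, the 5/6-bit to 8-bit expansion and the integer rounding are one common convention (real hardware may round slightly differently), and `DDR_BURST_BITS` is just the 512-bit figure from above.

```python
import struct

# Assumption: 64-bit channel with burst length 8, i.e. the 512 bits quoted above.
DDR_BURST_BITS = 64 * 8


def rgb565_to_rgb888(c):
    """Expand a packed RGB565 value to 8 bits per channel by bit replication."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_dxt1_block(block):
    """Decode one 8-byte DXT1 block into a 4x4 grid of (R, G, B, A) tuples."""
    if len(block) != 8:
        raise ValueError("a DXT1 block is exactly 8 bytes (64 bits)")

    # Two RGB565 endpoint colors, then sixteen 2-bit palette indices.
    c0, c1, indices = struct.unpack("<HHI", block)
    p0, p1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)

    if c0 > c1:
        # Four-color mode: two interpolated colors between the endpoints.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), p3 + (255,)]
    else:
        # Three-color mode: one midpoint color plus transparent black.
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        palette = [p0 + (255,), p1 + (255,), p2 + (255,), (0, 0, 0, 0)]

    # Pixel (x, y) uses bits [2*(4*y + x), 2*(4*y + x) + 1] of the index word.
    return [
        [palette[(indices >> (2 * (4 * y + x))) & 0x3] for x in range(4)]
        for y in range(4)
    ]


# Example: red and blue endpoints, first row cycling through all four palette entries.
example_block = struct.pack("<HHI", 0xF800, 0x001F, 0x000000E4)
example_pixels = decode_dxt1_block(example_block)
```

The whole block is 64 bits, so up to eight of them fit in one 512-bit burst, and every one of the 16 pixels is a table lookup away once the four palette entries are computed. That's why doing this in fixed-function hardware next to the texture fetch path is cheap.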