Ok, this is gonna be an explanation in lay terms.
Every program a CPU runs is code + data. The CPU is an electronic device with a set of hardcoded functions it can perform on pieces of data. That data is binary, zeros and ones, and for Athlons, P4s and P3s each chunk is 32 bits in length, meaning 32 digits, each either zero or one. Examples of these operations are multiplication, addition, division, and logical operations such as AND, OR, and XOR. The hardcoded operations a CPU supports are called instructions. The Athlon64 is a 64-bit CPU and operates on 64-bit chunks of data at a time.
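To make the "chunks of 32 bits" idea concrete, here is a small Python sketch of the kind of operations described above. It is only an illustration: Python numbers are not limited to 32 bits, so the MASK32 constant and the add32 helper are my own made-up names used to imitate a 32-bit register, not anything a real CPU exposes.

```python
# Illustration only: Python integers have no fixed size, so we mask
# results to 32 bits to imitate the chunks a 32-bit CPU works with.
MASK32 = 0xFFFFFFFF  # 32 binary ones


def add32(a, b):
    """Add two values and keep only the low 32 bits, like a 32-bit register."""
    return (a + b) & MASK32


a = 0b1100
b = 0b1010

and_result = a & b  # a bit is 1 only where both inputs have a 1
or_result = a | b   # a bit is 1 where either input has a 1
xor_result = a ^ b  # a bit is 1 where the inputs differ

# Adding 1 to the largest 32-bit value wraps around to zero.
wrapped = add32(0xFFFFFFFF, 1)
```

Each of those lines corresponds roughly to one instruction on a real CPU. The wraparound in the last line is what you get when the answer needs more digits than the chunk has room for.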
A few years ago someone had the bright idea of speeding up multimedia apps (sound and video applications such as video playback, encoding, games, etc.), because these types of applications ask the CPU to do the same operation over and over, just with different pieces of data. So they came up with SIMD instructions: Single Instruction, Multiple Data. The basic idea is that one instruction gets run on multiple pieces of data at once, cutting down the time the CPU takes to run the application. As far as I know, though, not every instruction in these sets works on multiple pieces of data. Some of them are simply new operations that were never before supported in the CPU's hardware, and that took many clock cycles to accomplish before they were introduced as part of MMX/SSE/3DNow!/SSE2.
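Here is a rough Python sketch of the SIMD idea: four small numbers are packed into one 32-bit chunk and all four are added in a single pass, instead of one at a time. This is a software imitation, not how the hardware does it. MMX has a real instruction for this (PADDB, which adds eight bytes packed into a 64-bit register), but the function names below (pack4, unpack4, scalar_add, packed_add) are made up for the example.

```python
def pack4(values):
    """Pack four 8-bit values into one 32-bit chunk (first value in the lowest byte)."""
    chunk = 0
    for i, v in enumerate(values):
        chunk |= (v & 0xFF) << (8 * i)
    return chunk


def unpack4(chunk):
    """Split a 32-bit chunk back into its four 8-bit values."""
    return [(chunk >> (8 * i)) & 0xFF for i in range(4)]


def scalar_add(xs, ys):
    """The old way: one addition per pair of values, four separate operations."""
    return [(x + y) & 0xFF for x, y in zip(xs, ys)]


def packed_add(a, b):
    """The SIMD way: add all four 8-bit lanes of two chunks in one pass."""
    low7 = 0x7F7F7F7F  # the low 7 bits of every lane
    high = 0x80808080  # the top bit of every lane
    # Add the low 7 bits of each lane; the carry lands in that lane's top
    # bit and cannot spill into the neighbouring lane.
    partial = (a & low7) + (b & low7)
    # Fold the top bits back in with XOR, dropping the carry out of each lane.
    return partial ^ ((a ^ b) & high)


xs = [1, 2, 3, 250]
ys = [10, 20, 30, 10]

one_at_a_time = scalar_add(xs, ys)
all_at_once = unpack4(packed_add(pack4(xs), pack4(ys)))
```

Both ways give the same answers, and each value wraps around on its own when it goes past 255. The difference is that the packed version touches all four values with the same handful of operations, which is the whole point of SIMD.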
MMX was the first wave of these SIMD instructions to hit mainstream processors. It was added to the original Pentium, along with more cache, and thus was born the Pentium MMX. Then AMD added 3DNow! to the mix, then Intel came out with SSE and SSE2, and from what I understand SSE3 will come out soon.
When someone says "this CPU has MMX", they mean that the CPU supports a certain set of SIMD instructions which together make up MMX. Same thing with SSE and 3DNow!.
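If you want to see what "has MMX" looks like in practice, on Linux the kernel lists the instruction sets a CPU supports as short names (mmx, sse, sse2, 3dnow) on the "flags" line of /proc/cpuinfo. The sketch below just picks names out of such a line. The supported_sets function and the sample line are made up for illustration, and the sample is not output from any real machine.

```python
def supported_sets(flags_line, wanted=("mmx", "sse", "sse2", "3dnow")):
    """Report which of the wanted instruction sets appear in a cpuinfo-style flags line."""
    # Everything after the first colon is a space-separated list of flag names.
    _, _, flag_text = flags_line.partition(":")
    flags = set(flag_text.split())
    return {name: name in flags for name in wanted}


# Hypothetical flags line, shortened for the example.
sample = "flags : fpu tsc mmx fxsr sse sse2"
report = supported_sets(sample)
```

Programs that use these instructions usually do a check like this at startup (by asking the CPU directly rather than reading a file) and fall back to the ordinary instructions if the set they want is missing.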