The human ear, like any listening device, senses the instantaneous sum of all the sound waves hitting the eardrum. If you sound an 800 Hz tone at the same time as a 1300 Hz tone (and let's assume they are equal volume for now), the ear will hear 800 Hz and 1300 Hz, and it can also pick up a 500 Hz difference tone and a 2100 Hz sum tone. The sum and difference tones come from slight nonlinearity in the ear (or in the equipment), so they are much lower in volume, but they can be present and audible. They are sometimes loosely called "beat frequencies," though strictly a beat is the slow throbbing you hear when two tones are only a few Hz apart. If you connected a microphone to an oscilloscope and looked at the resulting waveform, you would see only one waveform, but it would have irregularities all over it where the two tones reinforce and cancel each other from instant to instant, producing peaks of many different heights and spacings.
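If you want to poke at the 800 Hz plus 1300 Hz example yourself, here is a rough Python sketch. The sample rate, the 0.1 second length, and the `y = x + k*x^2` "slightly bent" model are all my own assumptions for illustration; nobody is claiming a real ear behaves exactly like that formula.

```python
import math

SAMPLE_RATE = 48_000   # samples per second (assumed)
N = 4_800              # 0.1 s, so every frequency below fits a whole number of cycles

def two_tones(f1=800.0, f2=1300.0):
    """Instantaneous sum of two equal-volume sine tones: one single waveform."""
    return [math.sin(2 * math.pi * f1 * n / SAMPLE_RATE)
            + math.sin(2 * math.pi * f2 * n / SAMPLE_RATE)
            for n in range(N)]

def level(signal, freq):
    """Amplitude of the component at `freq`, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def slightly_nonlinear(signal, k=0.1):
    """Toy model (an assumption) of a not-quite-linear ear or amplifier."""
    return [x + k * x * x for x in signal]

clean = two_tones()
bent = slightly_nonlinear(clean)

for f in (500, 800, 1300, 2100):
    print(f"{f:>5} Hz  clean={level(clean, f):.3f}  bent={level(bent, f):.3f}")
```

The clean sum should show a level of about 1.0 at 800 Hz and 1300 Hz and essentially nothing at 500 Hz or 2100 Hz. After the small bend, 500 Hz and 2100 Hz should each show up at about 0.1, which is the "lower volume, but present" part.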
Take the sound from an orchestra and look at it on that same oscilloscope; your waveform display will be MUCH more complex, but the human ear and brain can still discriminate between all the individual instruments.
Sound recordings store that same combined waveform. When you play it back through a speaker, there is only one waveform going to it. The speaker cone tries its best to move in sync with the ups and downs and frequency changes, and if it does a good job, you will hear all the subtleties of the original. If the waveform is poorly recorded or poorly amplified, or the speaker just can't keep up with the fast changes at high frequencies, then you would say it sounds "crappy" or "distorted," but you would still be able to distinguish the sounds.
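To get a feel for a speaker that "can't keep up," here is a very crude sketch. It treats the speaker as a one-pole low-pass filter, which is purely my simplifying assumption (real drivers are far messier), and the 5000 Hz cutoff and the function names are made up for the example.

```python
import math

SAMPLE_RATE = 48_000   # samples per second (assumed)
N = 4_800              # 0.1 s of signal

def tone(freq):
    """A plain sine tone at full volume."""
    return [math.sin(2 * math.pi * freq * n / SAMPLE_RATE) for n in range(N)]

def sluggish_speaker(signal, cutoff_hz=5_000.0):
    """One-pole low-pass: each sample, the cone moves only part of the
    way toward where the input says it should be (hypothetical model)."""
    a = 1.0 - math.exp(-2 * math.pi * cutoff_hz / SAMPLE_RATE)
    out, y = [], 0.0
    for x in signal:
        y += a * (x - y)
        out.append(y)
    return out

def level(signal, freq):
    """Amplitude of the component at `freq`, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def speaker_gain(freq):
    """How much of a full-volume tone survives the sluggish speaker."""
    return level(sluggish_speaker(tone(freq)), freq)

for f in (800, 1300, 10_000):
    print(f"{f:>6} Hz  gain={speaker_gain(f):.2f}")
```

With these numbers, 800 Hz and 1300 Hz should come through at close to full level, while 10,000 Hz should lose roughly half its amplitude. The low stuff survives and the fast wiggles get smoothed over, which is why a poor speaker sounds dull but the sounds are still recognizable.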
While the human ear can hear up to 20,000 Hz on a good day, us old farts don't hear quite that well, so we miss some of the subtle high-frequency detail. We make up for it with beer. 🙂