A little backgrounder on Demo programming for Windows - by me.
When using DirectX or OpenGL, the programmer only has to provide simple data to the computer in order to generate fairly complex images and models. For example, you can draw an amazing-looking robot by using a few vertex-shaded polygons. You can have OpenGL interpolate the color from one vertex to another, giving you a nice gradient color - with a simple function call.
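To make the gradient idea concrete, here is a minimal sketch in plain Python of the kind of per-vertex color interpolation the graphics API does for you. It is not OpenGL code: `lerp_color` and `gradient` are made-up helper names for illustration. In legacy OpenGL you would just set a color per vertex and leave smooth shading (`glShadeModel(GL_SMOOTH)`) to do the blending.

```python
def lerp_color(c0, c1, t):
    """Linearly blend two RGB colors; t=0 gives c0, t=1 gives c1."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))


def gradient(c0, c1, steps):
    """Colors at `steps` evenly spaced points between two vertices (steps >= 2)."""
    return [lerp_color(c0, c1, i / (steps - 1)) for i in range(steps)]


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

# Five samples along an edge that runs from a red vertex to a blue vertex.
row = gradient(red, blue, 5)
for color in row:
    print(color)
```

The program only stores two colors; every in-between shade is computed at draw time, which is why the data stays so small.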
Demos don't "cheat" by "calling Demo() functions". They simply provide geometry data like "Draw sphere with center at (X,Y)". This is possible thanks to the huge number of functions already implemented by DirectX/OpenGL and exposed to the programmer via the corresponding API.
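As a rough illustration of how little data "draw a sphere here" really is, the sketch below expands a center and a radius into a list of vertices. The function name `sphere_vertices` and the `rings`/`segments` parameters are my own assumptions, not part of DirectX or OpenGL; a real demo would hand this kind of work to the API or to its own small generator.

```python
import math


def sphere_vertices(center, radius, rings=8, segments=16):
    """Generate points on a sphere from just a center and a radius.

    Uses a simple latitude/longitude grid, so the two pole points are
    repeated once per segment. Good enough for a sketch.
    """
    cx, cy, cz = center
    verts = []
    for i in range(rings + 1):
        phi = math.pi * i / rings              # 0 at the top pole, pi at the bottom
        for j in range(segments):
            theta = 2.0 * math.pi * j / segments
            verts.append((
                cx + radius * math.sin(phi) * math.cos(theta),
                cy + radius * math.cos(phi),
                cz + radius * math.sin(phi) * math.sin(theta),
            ))
    return verts


# Four numbers in (center x, y, z and radius), many vertices out.
verts = sphere_vertices((0.0, 0.0, 0.0), 1.0)
print(len(verts), "vertices generated")
```

The stored input is four numbers; the vertex list only exists while the program runs.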
DOS demos would be a lot bigger (unless written in assembler and packed with a compressor like Zip), because they had to implement all those drawing functions by hand.
Consider Doom2 demos. You could record such a "demo" of a long play session, and the file would be tiny compared with a video of the same session. That's because it records not actual screenshots but just the action data, that is, the player's inputs, which the game replays to recreate everything else. This is pretty much how those amazing demos do it.
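Here is a toy sketch of that record-and-replay idea. It assumes a made-up game whose whole state is an (x, y) position and whose input is a (dx, dy) step per tick, packed into two bytes; the 320x200 one-byte-per-pixel frame size is only an assumed figure for comparison. None of this is Doom's actual file format.

```python
import struct


def step(state, cmd):
    """Advance the toy game state (x, y) by one input command (dx, dy)."""
    x, y = state
    dx, dy = cmd
    return (x + dx, y + dy)


def record(commands):
    """Pack each (dx, dy) command into two signed bytes."""
    return b"".join(struct.pack("bb", dx, dy) for dx, dy in commands)


def replay(blob, start=(0, 0)):
    """Re-run the simulation using only the recorded inputs."""
    state = start
    for dx, dy in struct.iter_unpack("bb", blob):
        state = step(state, (dx, dy))
    return state


commands = [(1, 0)] * 600 + [(0, -1)] * 400    # 1000 ticks of input
blob = record(commands)

final_state = replay(blob)
frame_bytes = 320 * 200                        # one raw frame, assumed size
video_bytes = frame_bytes * len(commands)      # one frame saved per tick

print(len(blob), "bytes of input vs", video_bytes, "bytes of raw frames")
```

The replay only works because the game logic is deterministic: the same inputs always produce the same result, so nothing else needs to be saved.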
It is very different from encoding motion video - there, you actually need to save frames of video, and audio as well.
But yes, demos are great. Search on Google.com for "64k demo".