I understand memory isn't doubled when running in SLI due to latency issues with the bridge, but why do the cards need to have the same amount of memory in order to work? Why can't they just make SLI edition cards with no memory and sell them cheap, just to use as extra GPU power? I thought the only memory being used was on the card that your monitor is plugged into.
Memory bandwidth is not infinite, and memory latency matters. To do what you propose, memory bandwidth would need to be nearly doubled, so that a single pool of memory could feed both GPUs.
PCI-e can't handle that kind of bandwidth, and that bandwidth, in terms of traces on the board, is a non-trivial part of the cost of a video card (mainly because more traces may necessitate boards with more layers and more area). Even with a custom interconnect, you'd still have the cost of the RAM bandwidth to be concerned with (more RAM chips, more traces, more solder pads).
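To put rough numbers on that gap, here is a small sketch. The VRAM figure is a made-up but plausible value for a mid-range card, not a measurement, and the PCI-e figure is the approximate one-direction throughput of a PCI-e 3.0 x16 slot; `shortfall_factor` is just a helper name for this example.

```python
# Illustrative figures only (assumptions, not measurements of a specific card).
LOCAL_VRAM_GBPS = 192.0   # hypothetical bandwidth of one GPU to its own RAM, GB/s
PCIE3_X16_GBPS = 15.75    # approximate PCI-e 3.0 x16 throughput, one direction, GB/s


def shortfall_factor(needed_gbps: float, link_gbps: float) -> float:
    """Return how many times more bandwidth is needed than the link provides."""
    return needed_gbps / link_gbps


# A memory-less second GPU would need roughly its full VRAM bandwidth
# delivered over the slot instead.
print(round(shortfall_factor(LOCAL_VRAM_GBPS, PCIE3_X16_GBPS), 1))
```

With these assumed numbers the slot falls short by roughly an order of magnitude, before latency is even considered.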
So the easiest way to handle it is to duplicate the contents of the RAM on each card, except for the buffers, since moving those between cards is tolerant of timing issues (well, in most games) and uses a very small portion of the available bandwidth.
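As a rough check on the "very small portion" claim, this sketch estimates the traffic from copying finished frames between cards. It assumes uncompressed 32-bit frames at 1920x1080 and 60 frames per second, and treats every frame as transferred, which is an upper bound; `framebuffer_gbps` is a hypothetical helper, not part of any driver API.

```python
def framebuffer_gbps(width: int, height: int, bytes_per_pixel: int, fps: int) -> float:
    """Bandwidth in GB/s needed to copy one uncompressed frame per displayed frame."""
    return width * height * bytes_per_pixel * fps / 1e9


# Assumed display settings: 1920x1080, 4 bytes per pixel, 60 frames per second.
print(framebuffer_gbps(1920, 1080, 4, 60))
```

Under those assumptions the copy needs about half a GB/s, far less than the memory traffic a GPU generates while rendering.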