kylef,
Etherchannel and link aggregation do exactly as you describe. It really is a layer 2 function (well, some switches and cards make the distribution decision based on layer3 address).
A few quick things that might make it make sense:
1) Yes, one mac address for both cards. Otherwise you would have arp problems. The switch treats it as one layer 2 link that just happens to be two layer 1 links.
2) To the operating system it is indeed one network card. The load balancing is handled by the driver.
3) The OS needs to transmit a frame and presents it to the driver to place on the wire. The nic then performs an XOR on the last 3 bits of the source and destination mac addresses and uses the result to decide which "wire" (nic) to put it on. The important part here is that if the conversation is between a single src/dst mac pair, then the whole conversation happens on a single nic.
4) OK, now that we've covered the nic transmitting, look at the other end: the switch transmitting. The switch looks at its bridge table and needs to place the frame on port 1. Port 1 is actually a link aggregation or etherchannel port composed of ports 1, 2, 3 and 4. The switch performs an XOR of the last 3 bits of the src/dst mac addresses (or IP addresses, depending on switch capabilities) and places the frame on the member port that the result selects.
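To make steps 3 and 4 concrete, here is a rough Python sketch of the XOR decision. The function names are made up for illustration, and folding the 3-bit result onto the links with a modulo is my assumption - real drivers and switches may map the hash values to member ports differently.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Parse 'aa:bb:cc:dd:ee:ff' (or dash-separated) into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def select_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick an egress link index by XORing the low 3 bits of both MACs."""
    if not 1 <= n_links <= 8:
        raise ValueError("this sketch handles 1 to 8 links")
    src_low = mac_to_bytes(src_mac)[-1] & 0b111  # last 3 bits of source
    dst_low = mac_to_bytes(dst_mac)[-1] & 0b111  # last 3 bits of destination
    # Assumption: modulo folds the 3-bit hash onto the available links.
    return (src_low ^ dst_low) % n_links


# Same src/dst pair always lands on the same link, so a single
# conversation never gets split across wires.
link = select_link("00:11:22:33:44:01", "00:11:22:33:44:06", 4)
```

Because the result depends only on the address pair, there is no per-frame state to keep and frames within one conversation stay in order.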
That's about it. It works really well with multiple connections (just like any load balancing does). The downside is that you have to watch how the balancing is done. In most networks the server is only talking to its gateway (router), so every frame carries the same src/dst mac pair and layer 2 distribution isn't all that effective - you need to balance on layer 3 addresses.
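A quick Python sketch of why the gateway case hurts. All the addresses here are invented, and the hash (XOR of the low 3 bits, modulo the link count) is the same simplifying assumption as in the description above, not any vendor's exact algorithm.

```python
import ipaddress


def xor_low_bits(a: int, b: int, n_links: int) -> int:
    """XOR the low 3 bits of two addresses and fold onto n_links."""
    return ((a ^ b) & 0b111) % n_links


def l2_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Layer 2 distribution: hash on the mac addresses."""
    a = int(src_mac.replace(":", ""), 16)
    b = int(dst_mac.replace(":", ""), 16)
    return xor_low_bits(a, b, n_links)


def l3_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Layer 3 distribution: hash on the IP addresses."""
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    return xor_low_bits(a, b, n_links)


# Hypothetical server talking to 8 remote clients through one gateway.
SERVER_MAC = "00:11:22:33:44:01"
GATEWAY_MAC = "00:11:22:33:44:fe"
SERVER_IP = "10.0.0.10"
CLIENT_IPS = [f"192.168.1.{i}" for i in range(1, 9)]

# Every frame goes to the gateway's mac, so layer 2 hashing picks one link.
l2_used = {l2_link(SERVER_MAC, GATEWAY_MAC, 4) for _ in CLIENT_IPS}
# The destination IP differs per client, so layer 3 hashing spreads out.
l3_used = {l3_link(SERVER_IP, ip, 4) for ip in CLIENT_IPS}

print("links used with layer 2 hashing:", sorted(l2_used))
print("links used with layer 3 hashing:", sorted(l3_used))
```

With these made-up addresses the layer 2 hash pins everything to a single link, while the layer 3 hash touches all four.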
So there is no real interleaving and reassembly involved - it's reduced to: which egress port do I place this frame on?
good link...
http://www.cisco.com/en/US/tech/tk389/tk213/technologies_tech_note09186a0080094714.shtml
-edit- to answer your other question. The beauty of it is that the OS and TCP/IP stack are completely unaware of the separate network cards. It is all handled seamlessly at the driver level. The stack says to the driver "send this packet", the driver frames it up and sends it on whichever nic is chosen by the algorithm described above. It really is beautiful and the answer to many a net engineer's prayer. I begged for this in the mid 90s.
What's even better? You can do this with 4 gigabit cards as well. Cisco supports up to 8 physical links in a single channel.