I'm not sure if they still do it, but SiS and VIA chipsets allowed bank interleaving (depending on the BIOS brand too, I think). With that, performance could be higher with more than one module because the CPU could access different banks in sequence, rather than waiting for each bank to return its data before moving on to the next access.
http://www.rojakpot.com/default.aspx?location=4&var1=116 -- that explains it in detail. Whether one stick or two makes any difference depends on how many banks each module has. Two modules with one bank each won't be any better than one module with two banks. Two modules with two banks each will be. A single module with two banks can also be configured for 2-bank interleaving, which gives slightly better performance even with just the one module.
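If it helps to see why the total bank count is what matters, here's a toy sketch in Python. It is not how any real chipset schedules accesses: the function name total_cycles, the 4-cycle bank busy time, and the one-command-per-cycle rule are all made up for illustration. It only shows sequential accesses finishing sooner when they can rotate across more banks.

```python
def total_cycles(num_accesses, num_banks, bank_busy_cycles=4):
    """Toy model: cycles needed to finish sequential accesses that are
    spread round-robin across the available banks.

    Assumptions (illustrative only): a bank is busy for bank_busy_cycles
    after an access starts, and one new access can be issued per cycle.
    """
    bank_free_at = [0] * num_banks   # cycle at which each bank is idle again
    clock = 0                        # earliest cycle the next access can issue
    for i in range(num_accesses):
        bank = i % num_banks                     # interleave: rotate banks
        start = max(clock, bank_free_at[bank])   # wait if the bank is busy
        bank_free_at[bank] = start + bank_busy_cycles
        clock = start + 1
    return max(bank_free_at)


# Total banks = banks per module x number of modules.
print(total_cycles(8, 1))  # one bank, no interleaving   -> 32
print(total_cycles(8, 2))  # 2 banks (1x2 or 2x1)        -> 17
print(total_cycles(8, 4))  # 4 banks (two 2-bank modules) -> 11
```

In this model only the total number of banks counts, which is why two single-bank modules come out the same as one two-bank module.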
But I don't think most chipsets support it now, at least not the popular ones. If yours does, you can consider using two modules. However, if you're trying to overclock or use high-speed "value" memory, the issue of stability with more than one module comes into play. If your memory or chipset can't handle two modules at the speeds you want, then being able to interleave isn't going to be any use.
The comparison to a doorway isn't a valid analogy for single-channel memory. Since each module is only 64 bits wide, and only one module can be read or written at a time (except for address commands, as with interleaving), the 512MB of data can only be transferred at a certain rate no matter how many modules are used. Two modules without interleaving won't be any better than one module without interleaving.
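A rough back-of-the-envelope sketch of that point in Python. The DDR400 figure (400 million transfers per second) and the helper name single_channel_seconds are my own assumptions for the example; the 64-bit width and the 512MB come from the discussion above. It ignores latency and refresh, so treat the result as a best case only.

```python
MB = 2 ** 20  # bytes in one megabyte, as memory sizes are usually counted


def single_channel_seconds(module_sizes_bytes, mega_transfers_per_s, bus_bits=64):
    """Best-case time to move all the data over one shared 64-bit channel.

    Only one module drives the bus at a time, so the per-module transfer
    times simply add up; splitting the data over more modules gains nothing.
    """
    bytes_per_s = mega_transfers_per_s * 1_000_000 * (bus_bits // 8)
    return sum(size / bytes_per_s for size in module_sizes_bytes)


one_stick = single_channel_seconds([512 * MB], 400)             # 1 x 512MB
two_sticks = single_channel_seconds([256 * MB, 256 * MB], 400)  # 2 x 256MB
print(one_stick, two_sticks)  # both the same, roughly 0.17 seconds
```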
Dual-channel memory, on the other hand, fits the doorway analogy exactly. Each channel is a doorway for the data to pass through, and since two "people" can go through at the same time, it takes half as long to transfer the same amount of data. However, the CPU bus can limit how much difference this makes. The Athlon line has a bus that is easily fed by single-channel memory, as long as the memory runs at the same clock speed as the bus. Dual-channel gives more bandwidth, but the CPU can't actually use the extra. There can still be a small performance increase if devices on the PCI bus or built into the southbridge do a lot of transfers to memory in DMA mode (which means the CPU bus isn't involved), or if your AGP card needs system memory for texturing, which doesn't really happen much at all. In those cases dual-channel lets the other devices have all the bandwidth they want without taking any away from the CPU's bandwidth to memory. Cutting the memory speed in half and using dual-channel would, on paper, provide the bandwidth the CPU needs (and allow cheaper memory to be used), but because of synchronization problems between the memory and the CPU bus it would actually reduce performance.
For Intel processors, the CPU bus is twice as fast as the best DDR (800MHz currently, with 400MHz being the fastest DDR memory available). So dual-channel lets memory at 400MHz feed the P4 bus exactly the amount of bandwidth it needs. Single-channel provides only half the needed bandwidth.
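Here's the same arithmetic as a small Python sketch. The function names are hypothetical, and I'm assuming a 64-bit (8-byte) data path for both the CPU bus and each memory channel, plus a 400MHz effective bus for the Athlon case; only the 800MHz P4 bus and 400MHz DDR figures come from the text above. These are peak numbers, not measured throughput.

```python
BYTES_PER_TRANSFER = 8  # assumed 64-bit data path on CPU bus and memory channel


def peak_mb_per_s(effective_mhz, channels=1):
    """Peak bandwidth in MB/s for a bus at the given effective clock."""
    return effective_mhz * BYTES_PER_TRANSFER * channels


def usable_mb_per_s(cpu_bus_mhz, memory_mhz, channels):
    """The CPU can use no more than the slower of its bus and the memory."""
    return min(peak_mb_per_s(cpu_bus_mhz), peak_mb_per_s(memory_mhz, channels))


# P4 with an 800MHz bus and 400MHz DDR
print(usable_mb_per_s(800, 400, channels=1))  # 3200, half of what the bus wants
print(usable_mb_per_s(800, 400, channels=2))  # 6400, bus fully fed

# Athlon with an (assumed) 400MHz bus and 400MHz DDR
print(usable_mb_per_s(400, 400, channels=1))  # 3200
print(usable_mb_per_s(400, 400, channels=2))  # 3200, the extra goes unused by the CPU
```

On the Athlon the second channel's bandwidth is still there for DMA devices, it just doesn't show up on the CPU side.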