This whole network thing is just a big onion. Layers ... things inside of things.
The core of the onion is the application (stuff like telnet, ftp, IRC, torrents).
As you go further out, you get closer to the physical media and the signaling used to communicate the data from here to there.
Layer 1 is the outside layer of the onion: things like copper, fiber, RF, and light (frequently with some secondary protocols involved, like a T1/DS1).
Layer 2 (the Data Link layer, home of Media Access Control, or MAC) concerns itself with traffic "between the routers", inside the broadcast domain. Layer 2 addresses do not pass through routers (except using tricks like a tunnel). A Layer 2 address is something like an Ethernet MAC address, usually written (in the case of Ethernet) as six groups of two hex digits (like 00:20:5c:01:02:03).
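To make the "six groups of hex" notation concrete, here is a minimal Python sketch that turns the text form of a MAC address into its six raw bytes and back. The function names (parse_mac, format_mac) are made up for this example; they are not from any library.

```python
def parse_mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 groups, got {len(parts)}")
    # Each group is two hex digits, i.e. one byte (0-255).
    return bytes(int(p, 16) for p in parts)


def format_mac(raw: bytes) -> str:
    """Convert 6 raw bytes back into colon-separated hex."""
    return ":".join(f"{b:02x}" for b in raw)


mac = parse_mac("00:20:5c:01:02:03")
# The first three bytes identify the vendor (the OUI);
# the last three are assigned by that vendor.
oui, device = mac[:3], mac[3:]
```

On the wire, the frame carries those six raw bytes; the colon-separated text is only for humans.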
Layer 3 is the Network Layer. Network addresses permit data to be sent beyond the broadcast domain and on to other networks. IP addresses are split into two parts (not necessarily equal halves; the subnet mask decides where the split falls): a network address and a host address. The host address is where the host lives within the broadcast domain; the network address describes where the broadcast domain is within the realm of all the broadcast domains visible on the network.
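The network/host split is easy to see with Python's standard ipaddress module. This is just a sketch; the address 192.168.10.37/24 is an arbitrary example picked for illustration, not something from a real network.

```python
import ipaddress

# An example host address with a /24 mask (assumed for illustration).
iface = ipaddress.ip_interface("192.168.10.37/24")
network = iface.network

# The netmask selects the network part, the hostmask selects the host part.
network_part = int(iface.ip) & int(network.netmask)
host_part = int(iface.ip) & int(network.hostmask)

print(network)                               # the broadcast domain's network
print(ipaddress.ip_address(network_part))    # network address as dotted quad
print(host_part)                             # the host's number inside that network
```

With a /24 mask the first three octets are the network and the last octet is the host; a different mask moves the dividing line.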
There are other network protocols besides TCP/IP; it just happens to be the one that "won" the popularity contest.
When you create VLANs, you are establishing discrete broadcast domains (Layer 2 logic and addressing). Each VLAN appears and operates independently, even though it may reside in the same physical box (usually a switch) as other VLANs. Since it "looks" like a totally separate switch, the same rules apply: to get traffic from one segment/network (represented by a switch) over to another, you need a router to accept traffic from one network and send it to another network. The router (or Layer 3 switch) keeps track of where all the other networks are, or in some cases where to send traffic when it's addressed to a network the router doesn't know about (a "Default Gateway" or "Gateway of Last Resort").
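Here's a rough sketch of the router's decision described above: pick the most specific known network that contains the destination, and fall back to the default route when nothing else matches. The route table, the VLAN subnets, and the interface names are all invented for the example; a real router's table and lookup are far more involved.

```python
import ipaddress

# Hypothetical route table: (network, where to send it).
ROUTES = [
    (ipaddress.ip_network("10.0.10.0/24"), "vlan10"),
    (ipaddress.ip_network("10.0.20.0/24"), "vlan20"),
    # 0.0.0.0/0 contains every address: the "Gateway of Last Resort".
    (ipaddress.ip_network("0.0.0.0/0"), "default-gateway"),
]


def next_hop(dest: str) -> str:
    """Return the exit for dest using longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The longest prefix is the most specific match.
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


print(next_hop("10.0.20.5"))   # a host on the second VLAN
print(next_hop("8.8.8.8"))     # an unknown network, so the default route
```

A host on vlan10 talking to a host on vlan20 has to go through this lookup even if both VLANs live in the same physical switch.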
Take Telnet, for example (Telnet is an Application layer thing ... it interacts with a user). The data typed by the user passes down the stack (works its way out from the core to the skin of the onion). After passing some formatting processes, it gets "segmented" at Layer 4 to a size of MSS (Maximum Segment Size) or less. In this case the segment is labeled for TCP, not UDP, because Telnet requires a session, and the TCP header carries a destination port number of 23 (23 = Telnet). Each segment is wrapped in an IP packet at Layer 3; packets are kept to the MTU (Maximum Transmission Unit) size or less, and can be chopped up (fragmented) if they are too big. The packet is handed to Layer 2, which encapsulates it into (in this case) an Ethernet frame, padding short data so the frame is no less than 64 bytes; a standard frame carries no more than 1500 bytes of payload (1518 bytes in total, counting the header and checksum). The frame has a type field set to indicate that the enclosed packet is IP, and it's passed to Layer 1, where it's converted into electrical or optical signaling and put on the wire, fiber, or air (in the case of RF or free-space optical).
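A stripped-down sketch of two of those steps, segmenting at Layer 4 and framing at Layer 2, might look like the following. It leaves out the real TCP and IP headers and the Ethernet checksum entirely, and the MSS of 1460 is an assumption (a 1500-byte MTU minus 20 bytes each for minimal IP and TCP headers). The function names are made up for this example.

```python
import struct
from typing import List

MSS = 1460               # assumed: 1500 MTU - 20 (IP header) - 20 (TCP header)
ETHERTYPE_IPV4 = 0x0800  # type field value meaning "the payload is IPv4"
MIN_PAYLOAD = 46         # 64-byte minimum frame - 14 header - 4 checksum


def segment(data: bytes, mss: int = MSS) -> List[bytes]:
    """Chop application data into pieces of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def ethernet_frame(dst: bytes, src: bytes, payload: bytes) -> bytes:
    """Build header + padded payload (the trailing checksum is omitted)."""
    header = struct.pack("!6s6sH", dst, src, ETHERTYPE_IPV4)
    padded = payload.ljust(MIN_PAYLOAD, b"\x00")  # pad short payloads
    return header + padded


pieces = segment(b"x" * 3000)
frame = ethernet_frame(b"\xff" * 6, bytes.fromhex("00205c010203"), b"hi")
```

Note that the padding only kicks in for tiny payloads; anything 46 bytes or longer goes into the frame as-is.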
So, at the other side, the pulses (Layer 1 info) get converted into a logical frame (Layer 2, Ethernet in this case), which encloses an IP packet (Layer 3), which carries a segment (Layer 4). At Layer 4, TCP verifies that each segment was received and puts them back in the proper order.
The reassembled data is formatted as it passes through the Presentation Layer (Layer 6) and then displayed to the user by way of the Application Interface, whatever it might be (terminal screen, sound, printer, whatever).
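The "received, and in the proper order" part can be sketched as below. This is a toy: each segment is a (sequence number, payload) pair where the sequence number is the byte offset of the payload, and any gap is treated as an error. Real TCP also deals with retransmission, duplicates, and sequence number wraparound, none of which is modeled here. The function name is hypothetical.

```python
def reassemble(segments):
    """segments: iterable of (sequence_number, payload), in any order."""
    ordered = sorted(segments, key=lambda s: s[0])
    expected = ordered[0][0] if ordered else 0
    out = bytearray()
    for seq, payload in ordered:
        if seq != expected:
            # Something is missing or overlapping; real TCP would wait
            # for a retransmission instead of giving up.
            raise ValueError(f"gap or overlap at sequence {expected}")
        out += payload
        expected += len(payload)
    return bytes(out)


# Segments arriving out of order still come out as the original stream.
data = reassemble([(5, b" world"), (0, b"hello")])
```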
Telnet is actually a really sucky example, because Telnet sends one character at a time (one gets sent, one gets received and echoed back to the user in a standard configuration).
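To put a number on the one-character-at-a-time behavior, here's a back-of-the-envelope calculation of how big the Ethernet frame is for a given amount of user data. It assumes minimal 20-byte TCP and IP headers (no options) and a standard untagged Ethernet frame; the function name is made up for the example.

```python
TCP_HEADER = 20       # minimal TCP header, no options (assumed)
IP_HEADER = 20        # minimal IPv4 header, no options (assumed)
ETH_HEADER = 14       # destination MAC + source MAC + type field
ETH_FCS = 4           # trailing checksum
ETH_MIN_PAYLOAD = 46  # payload is padded up to this size


def frame_size(user_bytes: int) -> int:
    """Total Ethernet frame size needed to carry user_bytes of data."""
    ip_packet = user_bytes + TCP_HEADER + IP_HEADER
    eth_payload = max(ip_packet, ETH_MIN_PAYLOAD)
    return ETH_HEADER + eth_payload + ETH_FCS


one_keystroke = frame_size(1)
full_segment = frame_size(1460)
```

Under those assumptions a single typed character rides in a minimum-size 64-byte frame, while a full 1460-byte segment fills a 1518-byte frame, so the keystroke case is almost all overhead.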
Here's a link to Cisco's Online Inter-networking Guide, which is a free version of a very large, very expensive book that does a very good job of explaining the various protocols and layers & such.
http://www.cisco.com/en/US/doc.../handbook/ito_doc.html
Standard disclaimer applies: Some of the description above bends, folds, spindles, and mutilates the facts to accommodate a general description that will fit the screen and space provided (and suits my laziness). To get the complete and accurate version, read the friggin' text in the supplied link, or any of the other zillion or so links available on the Internet. "It's technical" and doesn't lend itself to short, simple answers.