Quick lesson on entropy:
Entropy is a logarithmic count of the microstates of a system. A microstate is one particular way of arranging all of the elements of the system, so that the system's state is completely specified. Count up the microstates, take the natural log, multiply by Boltzmann's constant, and you have entropy.
Let's imagine our system is 4 coins. Each coin has 2 states, heads or tails, so there are 2^4 = 16 microstates:
HHHH
HHHT
HHTH
HTHH
THHH
HHTT
HTHT
HTTH
TTHH
THTH
THHT
TTTH
TTHT
THTT
HTTT
TTTT
Notice how they are grouped: 1, 4, 6, 4, 1. Now, each microstate is equally probable. That is, you are just as likely to get HTTH as you are to get TTTT. However, we tend to view MACROstates. That is, we just look at how many heads and tails there are and disregard order. This means we have a 6/16 = 3/8 chance of getting a 50/50 H/T split. Thus, it is most likely that we find the system in the 50/50 macrostate, even though each of its microstates is just as probable (or improbable) as any other. In fact, in a dynamic system, you would find the system evolving from one microstate to the next, spending a proportion of time in each macrostate equal to its relative probability.
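If you want to check this counting and the 3/8 figure, here's a quick Python sketch that enumerates all 16 microstates and groups them into macrostates by number of heads:

    from itertools import product
    from collections import Counter

    # Enumerate all 2^4 = 16 microstates of the 4-coin system.
    microstates = list(product("HT", repeat=4))

    # Group microstates into macrostates by how many heads they contain.
    macrostates = Counter(state.count("H") for state in microstates)

    for heads in sorted(macrostates, reverse=True):
        count = macrostates[heads]
        print(f"{heads} heads: {count} microstates, probability {count}/{len(microstates)}")

The 2-heads macrostate claims 6 of the 16 microstates, which is the 3/8 chance of a 50/50 split.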
Now let's look at some entropies.
S = k ln(Omega)
S = entropy
k = Boltzmann's constant (just a constant that sets the units)
Omega = # of microstates in a given macrostate
We can see that ln(6) > ln(4) > ln(1), so one would expect that if we started the system in the HHHH state, it would, through random dynamical processes, evolve into one of the 50/50-split microstates.
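Setting k = 1 (so entropy comes out in units of Boltzmann's constant), a small sketch of those entropies:

    import math

    # Omega = number of microstates in each macrostate of the 4-coin system.
    omega = {"4 heads": 1, "3 heads": 4, "2 heads": 6, "1 head": 4, "0 heads": 1}

    # S = k * ln(Omega); with k = 1, S is in units of Boltzmann's constant.
    for macro, count in omega.items():
        print(f"{macro}: Omega = {count}, S/k = ln({count}) = {math.log(count):.3f}")

The 50/50 macrostate comes out on top with S/k = ln(6) ≈ 1.79, while the all-heads and all-tails macrostates sit at ln(1) = 0.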
And that is sort of true. In fact, if we just let the system evolve randomly, it would eventually re-order itself back into the HHHH state after some characteristic time, and of course it would then move out of that state again. If we let this run for 16 minutes, say, we could expect the system to spend about 1 minute of that time in the HHHH state. Same with the TTTT state.
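You can watch that "fraction of time" claim play out with a toy simulation. Assuming the simplest random dynamics, flipping one randomly chosen coin per step, the fraction of steps spent in the HHHH microstate should settle near 1/16:

    import random

    # Start in HHHH and flip one randomly chosen coin each time step,
    # counting the fraction of steps spent in the HHHH microstate.
    coins = ["H"] * 4
    steps = 1_000_000
    time_in_hhhh = 0

    for _ in range(steps):
        i = random.randrange(4)
        coins[i] = "T" if coins[i] == "H" else "H"
        if coins == ["H"] * 4:
            time_in_hhhh += 1

    print(f"fraction of time in HHHH: {time_in_hhhh / steps:.4f} (expected 1/16 = 0.0625)")

Over 16 minutes of that kind of shuffling, 1/16 of the time is the roughly 1 minute in HHHH mentioned above.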
So why don't we ever see a box of air molecules randomly and spontaneously condense into a perfect crystal with the GUT etched into its surface? After all, it is just as probable for the air molecules in the box to arrange themselves that way as it is for them to be in any other microstate. Well, if you calculate how many microstates there are in such a box, the number is insanely high. Each particle carries 6 pieces of information (3 position coordinates and 3 momentum components) if you ignore rotations (so I guess we're dealing with argon here). Take 10^23 particles and that's a lot of microstates. The fraction of time the air particles would spend in any one microstate is minuscule. I'm not 100% sure of the numbers since I haven't redone the calculation, but I seem to recall that you would have to wait something like a billion times the age of the universe to observe a macrostate consisting of a single microstate.
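To get a feel for how brutally this scales, here's a sketch that sticks with the two-state coin toy model (real gas particles carry continuous positions and momenta, so the real situation is far worse), showing the fraction of time an N-coin system spends in a single-microstate macrostate like all-heads:

    import math

    # N two-state coins have 2^N microstates, so a macrostate containing a
    # single microstate (e.g. all heads) gets a fraction 2^-N of the time.
    for n in [4, 100, 1_000, 10**23]:
        # Work in log space so 2^-N never has to be computed directly.
        log10_fraction = -n * math.log10(2)
        print(f"N = {n:g}: fraction of time ~ 10^({log10_fraction:.3g})")

Even 1,000 coins puts that fraction around 10^-301; with ~10^23 particles the exponent itself is of order 10^22, which is where waiting times vastly longer than the age of the universe come from.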
Thermodynamics and statistical mechanics work so well because of the huge numbers involved.
edit: I'll add a summary here.
So why do we say systems tend toward larger entropies? Well, look at the coins. The macrostate we would actually find the system in is the one with the largest entropy. Here we are comparing ln(6) to ln(1), which doesn't seem so convincing. But in the real world we're comparing something like ln(10^(10^20)) to ln(1), and since ln(10^(10^20)) = 10^20 * ln(10), that's a number of order 10^20 versus, well, 0.