I'm bored enough to reply to this very vague question, but don't take my word as gospel; Google, Wikipedia, and HowStuffWorks are handy. I'm a software guy with a taste for hardware, so here's what I know.
1) Electricity. Computers require voltage and current, like any circuit; electricity is just electrons flowing through a conductor. In the modern case, the juice flows through doped regions of silicon.
2) Transistors. My first C++ prof, an EE guy, told us on day one that a transistor is basically a switch that goes on and off. The interesting thing here is that Power = Voltage * Current, but when the switch is open, the current through it is 0, and when the switch is closed, the voltage drop across it is (ideally) 0. Either way the product is 0, so under textbook conditions a transistor sitting in either state consumes no power. In reality there is resistance, leakage, and the energy it takes to charge and discharge the wires every time it switches, all of which waste power as heat. But if it were all ideal (superconducting, no leakage), it would drain no power.
Transistors form the basis of binary logic: 1 or 0, on or off. They are grouped together to form logic "gates" which implement the logical operations "and," "or," and "not," plus combinations like "nand." By linking such gates together you can do things like add 2 binary numbers.
3) Clock frequency. Everything in a computer runs off a high-speed clock. The CPU syncs to that clock, while slower parts of the system run at lower speeds. Each tick of the CPU's clock is a "cycle," and some operations take the CPU more than one cycle to complete.
4) Computer architecture. This is a whole senior-level class in college. The main thing a CPU does is interface with memory and process the data it gets from there. Nowadays the ALU is built onto the chip, so math is done there too. The most essential pieces are the registers. A register is where a number loaded from RAM goes, and registers are X bits wide. A Pentium 4 has 32-bit registers and an Opteron has 64-bit registers. That width is the biggest number the hardware can represent in one register, and it becomes the hardware limitation. You can't count to infinity with 32 bits; you top out at around 4 billion. With 64 bits, the limit is 4 billion squared. If you want to count higher, you have to get clever in software, by using more data and, ultimately, more registers.
Most modern CPUs have at least a dozen registers; an Itanium has 128, IIRC. Some are used for very specific things. One holds the address in RAM of the instruction the CPU is currently executing, and that register tells it where the next instruction is coming from. Your computer works because the software loaded into RAM (technically, I should say "memory") is executed by the CPU: it reads the instruction code, decodes it, and performs the task the code describes, like adding two numbers. Depending on how the ISA (Instruction Set Architecture) is designed, that could take 3 instructions: load the first number into register 1, load the second into register 2, then add them and put the result into register 1. Because you can add, you can multiply; because you can represent negative numbers, you can subtract; and because you can multiply, you can divide. At bottom, a CPU is a complicated adding machine that directs traffic into and out of a memory system.
4a) Pipelines. My favorite part of comp arch was learning how pipelines work. A pipeline is just like an assembly line. Say you need to perform one task, like building one car or executing one add instruction. Both take a set amount of time; adding 2 numbers may take 4 steps: fetch the instruction, decode it, execute it, write out the result. Done one at a time, every add would take 4 CPU cycles. But if we break the work into 4 overlapping stages and the CPU has lots of add instructions in a row, a new add can start every cycle, so in the steady state we finish adds about 4 times as fast. If building a car takes 100 steps, an assembly line finishes one car per step once the line is full, so with enough resources and enough cars to build, throughput goes up by a factor of about 100.
But computer programs cause a problem, because their control flow depends on the data. The programmer can tell the CPU: if register 1 = 0, branch to instruction X; otherwise, don't branch. This changes the flow of program execution, so branching can mean your pipeline has the wrong code queued up, and it has to waste cycles fetching the right code. With a 4-stage pipe, you could waste 3 cycles because you didn't have the right code in the pipe.
Electrically and logically, it is easier to increase clock speed (MHz) by lengthening the pipeline. If you break your car assembly line from 100 steps into 1000, each step is simpler, so the line can run that much faster. But with a pipeline that long, if you, say, change car models on the line, you have to stall, and 999 steps' worth of work gets thrown out or held up. A Pentium 4 Northwood has a 20-stage pipeline and a Prescott has 31. Intel did this to push the GHz as far as they could go, but in some benchmarks they pay a penalty, because their pipeline stalls cost more than on the 12- or 13-stage pipelines more common in things like AMD chips. CPUs are made fast by predicting whether or not code will branch, to avoid these kinds of stalls; the hardware that does this is called a branch prediction engine.
Most modern CPUs have multiple pipelines to increase parallelism. If you know you are going to have a lot of data, like in graphics, adding more pipelines to a GPU to split up the work increases overall performance. CPUs also have separate pipelines for crunching floating-point numbers, which can take multiple cycles to multiply or divide.
I just felt like spilling what I know. That's the basics. Intel guys feel free to hammer on me and point out where I went wrong or misled.