Since someone was kind enough to start this thread, let me share some other items with you that you may not have considered.
1) I test laptops/notebooks, not desktop systems. While a GTX 660 might scream through the vast majority of our gaming tests at 1080p, a GTX 660 is also about as fast as the top mobile GPUs.
2) Updating the gaming suite more than once a year makes comparisons with older hardware virtually impossible, particularly on the mobile side of things. Ryan can go and retest all the old GPUs in his benchmarking testbed (and it will take days of work, if not weeks, to go through each card at the various settings); for notebooks, most systems are back with their manufacturer within a month of the review posting -- and often even before then.
3) We have two primary reviewers for notebooks (Dustin and me), which means we now have to coordinate all of the testing parameters between two people. Vivek and Anand occasionally use the mobile test suite as well. Guess what that does if you try to add and change games regularly.
4) For those that say, "An old game getting 150 FPS says nothing about new titles," I respectfully disagree with the general sentiment. If we're totally CPU-limited at 150 FPS, then it's pretty useless, but if the game is simply less demanding and hits higher frame rates for that reason, its results will still line up with general performance expectations in other titles. However, what you also need to consider is that every game tested only provides a piece of the picture, and there is no "best" title or set of titles to test with. Dropping one game to replace it with a newer game just means we lose information about, e.g., Batman and gain information about Crysis 2 or whatever.
5) Modded games are out. Sorry. Someone else already said some of this, but basically you're opening a can of worms, and when you use a mod you run a major risk of unfairly penalizing companies. NVIDIA or AMD might work hard to get Skyrim to run acceptably at max settings, and then you add a mod that pushes things so far that it might overflow the RAM on a 2GB card but not on a 3GB or 4GB card. Now it looks like the 2GB card sucks at Skyrim, but the 4GB card still runs well. The reason the game didn't ship in a state where it would use more than 2GB is that the developers already looked at that and said, "That doesn't make sense."
6) Right now, testing all of the games and other benchmarks on a notebook requires a pretty solid 20-25 hours of testing and writing, just for the basic suite. If you then have to retest a bunch of titles, you add a minimum of six hours -- I've done this multiple times with the P170EM reviews. Now, if you get paid a flat rate per article, what happens when you suddenly take twice as long? Let's just use a rate of $250 for a review as a baseline. At 25 hours you would make $10 per hour. If you then take an extra six hours, you're down to $8.06 per hour, and if you add games and end up requiring 40 hours of testing you're at a rate of $6.25. If any of you are willing to benchmark and then write a quality article for $6.25 per hour, please send me an email and I'm sure we can find a use for you!

But seriously, this is a full-time job for most of us and we only have so many hours in a day.
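For anyone who wants to check the math in item 6, here's a quick sketch in Python. The function name hourly_rate and the scenario labels are just mine for illustration, and the $250 flat rate is the hypothetical baseline used above, not an actual pay figure.

```python
def hourly_rate(flat_fee, hours):
    """Effective hourly pay when an article earns a flat fee."""
    return flat_fee / hours


# Hypothetical baseline from item 6: a flat $250 per review.
FLAT_FEE = 250.0

# Hours spent in each scenario described above.
scenarios = [
    ("basic suite", 25),
    ("basic suite + six hours of retesting", 25 + 6),
    ("expanded suite", 40),
]

for label, hours in scenarios:
    print(f"{label}: {hours} hours -> ${hourly_rate(FLAT_FEE, hours):.2f} per hour")
```

That prints $10.00, $8.06, and $6.25 per hour respectively, matching the figures above.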
Just to conclude, we will be looking at revamping our benchmark suite in the coming months. Windows 8 will already throw out many of our previous results, so very likely I'll change up the battery life and general application tests for Win8, with the gaming suite up for change by the end of the holiday shopping season. Ideally, I'd love to be able to drop every single game in our current list of seven titles (on notebooks) and replace them with seven popular yet GPU-intensive titles for 2013. So when looking at the games we should add, please consider the stuff coming out in the next two months as a higher priority than stuff that was released this summer or earlier.
If someone wants to send me an email once you reach a consensus, please do so; otherwise, I'll try to keep an eye on this thread for when it's time to really mix up the games. Very likely, we'll start in 2013 and use about seven or eight games for the whole year (and occasionally look at other games as warranted). We like to have a wide selection of titles as well, so FPS/Action, RPG, Strategy, Driving/Simulation, and MMO should all be present if possible, with multiple titles for the most popular categories (usually FPS).
Until the list is finalized, thanks for the input! I'll point Ryan at this as well, since most of you seem to be talking more about his realm of testing (desktop GPUs).