FAH Update: Stats Will be Done a Bit Differently

CupCak3

Golden Member
Nov 11, 2005
We've had a pretty busy few days. Unfortunately, on the day of the PS3 launch, the network went down throughout the whole Stanford Medical School, taking FAH servers off the internet. That unfortunately led to a major outage last Thursday, but that was resolved and FAH is now running smoothly. The PS3 machines are isolated from the rest of FAH in that they have their own AS and data servers, and all seems to be running well there too.

The main issue over the last few days has been slow stats and web access. We've rewritten how the stats work to improve performance. There are a couple of new changes:

1) We've changed how the daily_*_summary.txt files get updated, and we can now update them more frequently than before the PS3 launch (every 3 hours instead of every 6). Note that our bandwidth scripts check for IPs which download these and other files too often, so to avoid getting caught by that script, keep the downloads of each of these files to under 10 per day. Since there are only 24/3 = 8 updates per day, this should hopefully not be a problem.

2) We've instituted a new policy where we update the stats db every hour with new WUs, but turn off the cgi web pages that read the stats during the update. This will avoid some of the very long updates seen in the past. The main downside is that the stats are down every hour for about 10 minutes (roughly from 10 to 20 minutes past each hour). We are considering ways to improve this, including updating the stats every 2 hours (leading to less downtime).
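
For anyone running their own stats scripts, here is a rough Python sketch of how the two rules above might be respected on the client side. It is only an illustration, not anything FAH provides: the function names are made up, the half-open 10 to 20 minute window is an assumption based on the "roughly" in the post, and the 3 hour spacing between fetches is simply chosen to match the update interval.

```python
from datetime import datetime, timedelta

# Numbers taken from the post; everything else here is a hypothetical helper.
UPDATE_INTERVAL_HOURS = 3    # summary files are refreshed every 3 hours
MAX_DOWNLOADS_PER_DAY = 10   # stay *under* this many fetches per file per day
DOWNTIME_START_MINUTE = 10   # approximate start of the hourly stats db update
DOWNTIME_END_MINUTE = 20     # approximate end of the hourly stats db update


def updates_per_day(interval_hours=UPDATE_INTERVAL_HOURS):
    """Number of summary file refreshes in 24 hours."""
    return 24 // interval_hours


def in_stats_downtime(now):
    """True if the cgi stats pages are probably switched off at this time.

    Assumes the window covers minutes 10 through 19 of every hour.
    """
    return DOWNTIME_START_MINUTE <= now.minute < DOWNTIME_END_MINUTE


def may_download(previous_downloads, now):
    """Decide whether fetching one summary file again is reasonable.

    previous_downloads is a list of datetimes of earlier fetches of that file.
    """
    day_ago = now - timedelta(hours=24)
    recent = [t for t in previous_downloads if t > day_ago]
    # One more fetch must still leave us under the daily limit.
    if len(recent) + 1 >= MAX_DOWNLOADS_PER_DAY:
        return False
    # No point fetching again before the file could have been refreshed.
    if recent and now - max(recent) < timedelta(hours=UPDATE_INTERVAL_HOURS):
        return False
    return True
```

A script would call may_download() before fetching a daily_*_summary.txt file, and in_stats_downtime() before hitting the cgi stats pages. Spacing fetches 3 hours apart already keeps a script at 8 per day, so the daily count is just a safety net.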

So, with the new changes, it looks like FAH is back to running smoothly. The PS3 clients bring a great new capability to our scientific research and so we're excited about what we'll be able to do now. It's important for us to stress that the other clients still play a key role, as the PS3 client (like the GPU clients) is limited in what it can do (although what it does do, it does fast). In particular, we are getting wonderful results and throughput from the SMP client and we expect that to play a very important role for years to come.