Rosetta memory usage

Fardringle

Diamond Member
Oct 23, 2000
9,200
765
126
I decided that my stats were slipping too much so I started running Rosetta again on my quad Xeon server. However, I may have to stop it if I can't resolve an annoying problem.

Task Manager reports that each Rosetta process is using right around 100 MB of RAM. However, the actual memory used by each process (the difference in commit charge between running a process and having BOINC suspended) is over 500MB. The server can handle Rosetta using 400-500MB of RAM in total without any problems, but I can't have it using over 2GB of RAM, as that causes significant performance problems for my users when they try to access files and programs on the server.
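
The per-process figure here comes from simple arithmetic on the commit charge. A minimal sketch, assuming hypothetical commit-charge readings (the 3200 and 1200 below are illustrative values, not measurements from this server, and `per_process_commit_mb` is a made-up helper name):

```python
def per_process_commit_mb(commit_running_mb, commit_suspended_mb, n_processes):
    """Average commit charge (in MB) attributable to each science process.

    Computed as the drop in total commit charge when BOINC is suspended,
    divided by the number of processes that were running.
    """
    if n_processes <= 0:
        raise ValueError("n_processes must be positive")
    return (commit_running_mb - commit_suspended_mb) / n_processes


# Hypothetical readings: 3200 MB with four tasks running, 1200 MB suspended.
print(per_process_commit_mb(3200, 1200, 4))  # 500.0
```

This only gives an average; it cannot tell which individual process holds the memory.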

I set BOINC preferences on the machine to never use more than 1% of total memory, but it still takes 2GB while running four Rosetta processes.


Edit: I just ran some tests and my home PC (running XP Pro) doesn't do this so I imagine it may just be the way Small Business Server 2003 handles the BOINC/Rosetta operations, but if anyone knows of a way to get Rosetta to play nice on the server I'd really appreciate it!
 

Rattledagger

Elite Member
Feb 5, 2001
2,994
19
81
Rosetta@home has at least 2 different types of WUs. The "small" WUs use around 100 MB of "real" memory and 200-300 MB of swap space.

The "big" WUs seem to use up to 200 MB of "real" memory, but grab a hefty 800 MB of swap space...

Which type of WU you get is basically random, except that if you don't have at least 800 MB of memory, you won't get the "big" ones. The scheduler should also respect your memory settings, so if your memory limit is lower than 800 MB, you shouldn't be assigned these WUs...

As for setting it to 1%, this is not a good idea, since the tasks will either be stuck forever in "waiting for memory" or, if they use too much, get aborted.


But anyway, the most important point is that the WUs "look" like they're using a lot of memory, but most of this is just assigned swap space, and I'm not even sure if they actually use all of it or not... The actual memory usage AFAIK peaks around 200 MB.

Now, on a 4-way this can still be too much actual memory used. If so, try limiting memory usage to 700 MB or so and see if you only get "small" WUs... If they haven't changed the WU parameters, you should not get the "big" WUs.
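
As a rough illustration of what those figures add up to on a multi-core box, here is a small sketch. The per-WU numbers are the approximate ones quoted in this post (forum estimates, not official project figures), and the names `WU_FOOTPRINT_MB` and `estimate_footprint` are made up for the example:

```python
# Approximate per-task footprints in MB as (low, high) ranges, taken from the
# estimates in this post. "big" real memory is an "up to" figure.
WU_FOOTPRINT_MB = {
    "small": {"real": (100, 100), "swap": (200, 300)},
    "big": {"real": (200, 200), "swap": (800, 800)},
}


def estimate_footprint(kind, n_tasks):
    """Scale the per-task (low, high) ranges by the number of tasks."""
    footprint = WU_FOOTPRINT_MB[kind]
    return {
        name: (low * n_tasks, high * n_tasks)
        for name, (low, high) in footprint.items()
    }


print(estimate_footprint("small", 4))
print(estimate_footprint("big", 4))
```

On a 4-way machine this suggests roughly 400 MB of real memory for four "small" tasks, versus about 800 MB real plus a much larger swap reservation for four "big" ones.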

 

JC

Diamond Member
Feb 1, 2000
5,843
67
91
Maybe the new beta will be better:

Dec 20, 2007 Rosetta has been updated to version 5.90. Changes include a reduction in virtual memory usage and a better exploration of proteins with complex symmetries.


I had a string of botched WUs a while back, seems to have settled down now.
 

Fardringle

That's interesting news, JC. I'll keep an eye on the server to see if it makes a difference.

I only set memory usage to 1% previously to test and see if it would make any difference. I also tried limiting it to 250MB, 500MB, 750MB, and 1GB to see if those would help (I can handle BOINC using up to 1GB now and then, but 2GB causes problems). None of the settings had any effect at all so I put it back to 30% for now. If the new beta version doesn't calm things down I guess I'll just force the server to only run two work units at a time instead of four...
 

Rattledagger

Originally posted by: Fardringle
I only set memory usage to 1% previously to test and see if it would make any difference. I also tried limiting it to 250MB, 500MB, 750MB, and 1GB to see if those would help (I can handle BOINC using up to 1GB now and then, but 2GB causes problems). None of the settings had any effect at all so I put it back to 30% for now. If the new beta version doesn't calm things down I guess I'll just force the server to only run two work units at a time instead of four...
Hmm, if you set memory usage to only 1%, you shouldn't have gotten any Rosetta@home work at all, unless you have at least 10 GB of memory, that is...

Also, since the "small" WUs seem to use a little over 100 MB, they should have been aborted for using too much memory...

Are you sure you set the memory preferences correctly? Remember there are 2 settings: "Use at most N% of memory when computer is in use" and "Use at most N% of memory when computer is not in use". Assignment of WUs (and aborts) uses the higher of the two, regardless of which one that is.

Also, setting the preferences on the web doesn't work if you're using a local override, i.e. if you have set your preferences in the client.

Oh, and there's also "Use at most N% of page file (swap space)"; I'm not sure if this works correctly or not... At least it will not limit which WUs are downloaded...


Example, when limited to 5% on a 1 GB computer:
21.12.2007 18:17:50|rosetta@home|Message from server: No work sent
21.12.2007 18:17:50|rosetta@home|Message from server: Your preferences limit memory usage to 51 MB, and 95 MB is needed
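
The server message above is consistent with a plain percentage calculation. A minimal sketch, assuming "1 GB" means 1024 MB and that the scheduler simply compares the resulting limit with the memory a WU needs (the function names are hypothetical, and the real scheduler logic may differ):

```python
def memory_limit_mb(total_mb, percent):
    """Memory limit in MB implied by a 'use at most N% of memory' preference."""
    return total_mb * percent / 100.0


def can_get_work(total_mb, percent, needed_mb):
    """True if the preference limit is large enough for a WU needing needed_mb."""
    return memory_limit_mb(total_mb, percent) >= needed_mb


limit = memory_limit_mb(1024, 5)
print(int(limit))                   # 51
print(can_get_work(1024, 5, 95))    # False
print(can_get_work(1024, 10, 95))   # True
```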

To see if the preferences are set correctly, choose "Advanced/Read local prefs file" and look at the Messages tab. You'll see something like:
21.12.2007 18:30:54||General prefs: from SETI@home (last modified 24-Oct-2007 18:05:27)
21.12.2007 18:30:54||Host location: home
21.12.2007 18:30:54||General prefs: using separate prefs for home
21.12.2007 18:30:54||Reading preferences override file
21.12.2007 18:30:54||Preferences limit memory usage when active to 911.98MB
21.12.2007 18:30:54||Preferences limit memory usage when idle to 962.65MB
21.12.2007 18:30:54||Preferences limit disk usage to 0.68GB

If there's no "Reading preferences override file" message in the Messages tab, you don't have any local override, and the client is using the web-based preferences.
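
If the messages are copied out to text, the same check can be scripted. A minimal sketch that only assumes the line format shown in the sample above; `summarize_prefs` is a hypothetical helper, not part of BOINC:

```python
import re

# Matches lines like:
#   "...||Preferences limit memory usage when active to 911.98MB"
LIMIT_RE = re.compile(
    r"Preferences limit memory usage when (active|idle) to ([0-9.]+)MB"
)


def summarize_prefs(log_text):
    """Report whether a local override was read and which memory limits apply."""
    override = False
    limits_mb = {}
    for line in log_text.splitlines():
        if "Reading preferences override file" in line:
            override = True
        match = LIMIT_RE.search(line)
        if match:
            limits_mb[match.group(1)] = float(match.group(2))
    return {"override": override, "limits_mb": limits_mb}


sample = """\
21.12.2007 18:30:54||Reading preferences override file
21.12.2007 18:30:54||Preferences limit memory usage when active to 911.98MB
21.12.2007 18:30:54||Preferences limit memory usage when idle to 962.65MB
"""
summary = summarize_prefs(sample)
print(summary)
# Work assignment is said to use the higher of the two limits.
print(max(summary["limits_mb"].values()))  # 962.65
```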

 

Fardringle

I set it using the local preferences on the machine, set every memory option including swap file to 1%, and reset the project completely. It downloaded new work and immediately started using 2GB of RAM (for four processes) again.
 

Rattledagger

Originally posted by: Fardringle
I set it using the local preferences on the machine, set every memory option including swap file to 1%, and reset the project completely. It downloaded new work and immediately started using 2GB of RAM (for four processes) again.
After setting the preferences, what does the Messages tab say the memory limits are?

 

Fardringle

At 1%, it says the memory limit is 60 MB (including swap file there's 6GB total memory).

At 30%, it says the limit is 1800 MB. This is close to the 2GB that Rosetta is actually using and it would be understandable for Rosetta to use that much on this setting. The problem is that it doesn't matter what I set the limit to, it still uses 2GB with the four processes running. For now I've set it back to only run two copies of Rosetta at a time so it doesn't use more than 1GB total memory.
 

Rattledagger

Originally posted by: Fardringle
At 1%, it says the memory limit is 60 MB (including swap file there's 6GB total memory).

At 30%, it says the limit is 1800 MB. This is close to the 2GB that Rosetta is actually using and it would be understandable for Rosetta to use that much on this setting. The problem is that it doesn't matter what I set the limit to, it still uses 2GB with the four processes running. For now I've set it back to only run two copies of Rosetta at a time so it doesn't use more than 1GB total memory.
Hmm, if 1% is only 60 MB of usable memory, it should not give you any Rosetta WUs at all, since AFAIK the smallest WUs need at least 96 MB of memory. The client will still try to run any already-downloaded work.

2% and higher will give you work, and since the scheduling server doesn't take the number of CPUs into account, it will happily give you a full set to keep all cores busy.

At 1%, the moment total memory usage goes over 60 MB, one or more of the tasks should be suspended with "waiting for memory". But if a task has not checkpointed since it started, it can still reside in memory... If a single task uses more than 60 MB of memory, it will be aborted.

Note that the memory limits apply to "real" physical memory usage; whatever swap space is used is not counted, and I'm not sure if the % of page file used is checked or not...
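
The rule described here can be sketched in a few lines. This is only an illustration of the behaviour as described in this post, not the client's actual code, and `classify_tasks` is a made-up name:

```python
def classify_tasks(task_mb, limit_mb):
    """Classify tasks by real-memory use against a total limit.

    A task larger than the whole limit is aborted. Otherwise tasks run in
    order while the running total fits; the rest wait for memory.
    """
    states = []
    used_mb = 0.0
    for mb in task_mb:
        if mb > limit_mb:
            states.append("abort")
        elif used_mb + mb <= limit_mb:
            used_mb += mb
            states.append("run")
        else:
            states.append("waiting for memory")
    return states


# Four ~100 MB tasks against a 60 MB limit: each one exceeds the limit alone.
print(classify_tasks([100, 100, 100, 100], 60))
# The same tasks against a 250 MB limit: two fit, two wait.
print(classify_tasks([100, 100, 100, 100], 250))
```

The sketch ignores the checkpoint exception mentioned above, where a task that has not yet checkpointed may stay resident anyway.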


If you limit memory usage to 500 MB or so, you should only get the "low-memory" WUs, and 4 of them would likely use around 400-500 MB of "real" memory. Now, according to Task Manager the "commit charge" may be 1 GB - 1.2 GB or so, but everything past 500 MB will only be swap space, and shouldn't normally be a problem.

 

Fardringle

After some more investigation, it looks like the problem is that any time BOINC goes into suspended mode (when the server console is active), it leaves the Rosetta client in memory even though the manager is set not to do that. Then, when the server console is inactive again and the BOINC manager comes out of suspended mode, it starts a new work unit instead of the one it was working on previously, so that there are now two copies of the Rosetta application in memory for each CPU. Then, if BOINC is suspended again, it does the same thing again, leaving three copies in memory.

This only happens on my server running Small Business Server 2003 so I suspect it is something odd with the way BOINC and SBS interact.
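
One way to track how many copies pile up is to count them from the output of `tasklist /FO CSV /NH`. A minimal sketch that parses such output from a string; the process names and numbers in the sample are hypothetical, and the memory column format ("101,232 K") is assumed to be the English-locale one:

```python
import csv
import io


def rosetta_processes(tasklist_csv, name_prefix="rosetta"):
    """Return (image name, PID, memory in MB) for matching processes.

    Expects CSV rows of: image name, PID, session name, session number,
    memory usage, as printed by `tasklist /FO CSV /NH`.
    """
    found = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) < 5:
            continue  # skip blank or malformed rows
        image, pid, _session, _session_num, mem = row[:5]
        if image.lower().startswith(name_prefix):
            kb = int(mem.replace(",", "").replace("K", "").strip())
            found.append((image, int(pid), kb / 1024.0))
    return found


# Hypothetical sample output; the image names are illustrative only.
sample_tasklist = (
    '"boinc.exe","1200","Console","0","8,192 K"\n'
    '"rosetta_example.exe","2301","Console","0","102,400 K"\n'
    '"rosetta_example.exe","2318","Console","0","204,800 K"\n'
)
procs = rosetta_processes(sample_tasklist)
print(len(procs))                      # 2
print(sum(mb for _, _, mb in procs))   # 300.0
```

Note that the memory column is the working set, so it will not show the larger commit charge discussed earlier in the thread.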
 

Rattledagger

If a task hasn't checkpointed since it was started, it will be kept in memory even if the memory preference is set otherwise. This is so there will at least be some progress despite pauses, as long as you don't shut down BOINC or the computer before it checkpoints.

As for starting multiple tasks even though some have already been started: are by chance one or more of them showing "Running, High Priority"? Is the deadline on them the same as well? There's a bug in v5.10.xx where it switches to a non-running task with the same deadline if the client is in deadline trouble, but if I'm not mistaken this is fixed in v5.10.30, so try upgrading if you're not already running v5.10.30...



 

Fardringle

The jobs that are being left in memory have anywhere from a few minutes of processing time up to near completion. It doesn't seem to be an initial checkpoint issue unless BOINC isn't marking a checkpoint until after two hours of processing... :)

None of the work units are anywhere near the reporting deadlines and they aren't running at high priority so that particular item shouldn't be a problem either. The server is running 5.10.28 right now so I'll try installing 5.10.30 and see if it makes any difference.
 

Fardringle

Thank you for the tip on version 5.10.30. The Rosetta application is still staying in memory if it isn't allowed to run long enough to reach a checkpoint before it is suspended, but the BOINC manager is starting up the same work units again instead of leaving the previous ones in memory and starting up new ones as well. As a side benefit, this version is only using about 200MB of RAM per active process instead of 500MB so as long as it continues to only have two copies in memory at a time the server should be happy.

I'll let it run this way for a while and if it plays nice I may start up the other two processes as well. The way Xeons handle hyperthreading, it really doesn't make a big difference (20-30 more points per day running 4 threads instead of 2) but those points do add up over time. :)
 

Fardringle

I guess I spoke too soon. After a day, I have six copies of the rosetta app stuck in memory on the server. That's really unfortunate. The server was adding enough work to push me back into the top 100 overall for the project. :(

I'm going to suspend Rosetta on the machine and run a different project for a day or two to see if other project applications do the same thing or if it's just Rosetta that doesn't like the server operating system. I'll post back later. ;)
 

Rattledagger

Originally posted by: Fardringle
I guess I spoke too soon. After a day, I have six copies of the rosetta app stuck in memory on the server. That's really unfortunate. The server was adding enough work to push me back into the top 100 overall for the project. :(

I'm going to suspend Rosetta on the machine and run a different project for a day or two to see if other project applications do the same thing or if it's just Rosetta that doesn't like the server operating system. I'll post back later. ;)
Well, my experience is mostly with SETI@home. If the client downloads a bunch of work there and some of it is in deadline trouble, many tasks can be started as the files are downloaded, due to their different deadlines. But once enough tasks have been crunched that it's no longer in deadline trouble, all tasks that have already been partially crunched are finished before it starts on any new work... This is with v5.10.30.

I'm not using "pause when active", so it is possible there is a bug somewhere... Nor am I hitting any memory-preference limit...


But anyway, if the server has a permanent internet connection, one workaround is to set the cache size to 0.01 days and "additional days" to zero. You can also increase Rosetta's run-time preference to 24 hours. With these settings the client will most of the time only have 1 task per CPU, and therefore can't start any extra copies...

Hmm, another method that will likely work is to set the cache size to 5 days, since with Rosetta@home's 10-day deadline this means all Rosetta work is run in earliest-deadline mode... But please note that the client won't ask for more Rosetta work again unless a CPU is idle, so it should still have a permanent internet connection...
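
To see why the small cache keeps it to one task per CPU, here is a deliberately simplified model of how many tasks are needed to fill the cache. It assumes every task runs for exactly the run-time preference and ignores everything else the client's work-fetch logic considers, so treat it as a rough estimate only (`tasks_to_fill_cache` is a made-up name):

```python
import math


def tasks_to_fill_cache(cache_days, run_time_hours, n_cpus):
    """Rough count of tasks needed to cover the cache on all CPUs.

    Never fewer than one task per CPU, since idle CPUs always want work.
    """
    cpu_hours_wanted = cache_days * 24 * n_cpus
    return max(n_cpus, math.ceil(cpu_hours_wanted / run_time_hours))


print(tasks_to_fill_cache(0.01, 24, 4))  # 4, i.e. one task per CPU
print(tasks_to_fill_cache(5, 12, 4))     # 40
```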
 

Fardringle

Sadly, it does appear to just be a problem with Rosetta. Over the past several days I have let a handful of other projects run on the server and there have not been any problems at all with memory usage or copies of the client being left in memory.

The office where this server is located has a decent speed DSL connection. I do want this computer to run Rosetta so I'll try setting cache size to .01 days and run time to 24 hours to see what happens. It was at 5 day cache and 12 hour run time previously.


I need to leave it set to suspend when the computer is active, though. The rare times when the server console is actually being used, I need to have CPU and memory usage as low as possible. There's a long story behind it that I don't feel like typing out, but suffice it to say that I can't change that option.. :p
 

Assimilator1

Elite Member
Nov 4, 1999
24,151
516
126
Originally posted by: Fardringle
I do want this computer to run Rosetta so I'll try setting cache size to .01 days and run time to 24 hours to see what happens. It was at 5 day cache and 12 hour run time previously.

That didn't work then?
 

Fardringle

Unfortunately, no. It's still leaving extra copies of the Rosetta application stuck in memory when the Server desktop becomes active and the BOINC service is suspended.
 

Fardringle

I haven't posted in the Rosetta forums yet. I'll probably do that tomorrow.

I just set up completely new preferences for this server separate from all of my other machines. I told it to switch between apps once per day :p and to have a target processing time of 8 hours, to use a max of 10% of memory while the computer is in use and 75% when not in use, keep .1 days of work queued, use up to 4 processors, not work while the computer is in use, and to not leave applications in memory.
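
For reference, the same settings could in principle be written to a local override file instead of the web preferences. The sketch below builds such a file's contents; the tag names are as I recall them from BOINC's `global_prefs_override.xml` and should be treated as assumptions to verify against the client documentation for your version. The 8-hour target run time is a Rosetta project preference and is not part of this file.

```python
import xml.etree.ElementTree as ET

# Settings from the post, mapped to override-file tag names.
# ASSUMPTION: tag names are recalled from memory, not checked against 5.10.x.
SETTINGS = {
    "run_if_user_active": "0",                # do not work while in use
    "leave_apps_in_memory": "0",              # do not leave apps in memory
    "cpu_scheduling_period_minutes": "1440",  # switch apps once per day
    "ram_max_used_busy_pct": "10",            # memory when computer in use
    "ram_max_used_idle_pct": "75",            # memory when not in use
    "work_buf_min_days": "0.1",               # days of work to keep queued
    "max_cpus": "4",                          # use up to 4 processors
}


def build_override_xml(settings):
    """Serialize the settings as a <global_preferences> XML document."""
    root = ET.Element("global_preferences")
    for tag, value in settings.items():
        ET.SubElement(root, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(build_override_xml(SETTINGS))
```

After placing such a file in the BOINC data directory, "Advanced/Read local prefs file" should report "Reading preferences override file" in the messages if it was picked up.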

I reset the project on the server to update it with these changes and it immediately downloaded 8 new work units (processing time of 3 hours each) and put all of them into memory using 150-250MB each, and started processing one of them with a "waiting for memory" message on the others in the Task list. It's interesting that the client is partially following the "only use 10% of memory when the computer is in use" setting but completely disregarding every other configuration option.
 

Strebor

Member
Dec 2, 2006
132
0
0
Had a similar problem a few weeks ago (Rosetta gobbling up 8 gigs of swap space) on my quad. Went away with an updated BOINC client and Rosetta apps. As for settings not sticking, have you tried completely uninstalling BOINC and deleting any leftovers in the folder? After getting back to school I reformatted a few of my machines and started from scratch (was moving hardware around). But my quad and two other comps were left as they were, and they refuse to get 3 days worth of work as I have them set up to. I updated the clients on them but they still sit with only 12 hours of work. I'm going to try this later today and I'll let you know what happens.
 

Fardringle

I completely uninstalled and deleted all traces of BOINC from this server last week and installed the newest client on the machine with no improvement.

It doesn't seem to be a problem with settings not being saved. They are showing exactly the way I set them when I go back into the preferences, the client is just ignoring them when running Rosetta.

BOINC ran other projects properly while testing for the past several weeks, so I really think it's a problem with the way the Rosetta application interacts with the Small Business Server 2003 operating system.
 

Strebor

Well that's no good. Clearing out the folder and reinstalling got my other comps to pick up their three days of work, but now they want to run them for 8 hours...