Linux OOM Killer ...

Armitage · Mar 18, 2002

What does it kill?
As I understand it, the OOM killer was introduced in some of the 2.4 kernels. Its job is to look for and kill certain processes when the system completely runs out of memory (physical + swap).

So, my question is, does this feature exist in the 2.4.17 kernel, and if so, is the identity of the OOM killer's victims logged anywhere?

I've got a situation here where an important process is dying unexpectedly when the machine is under very heavy memory load. I suspect the OOM killer.

Sometimes 1GB of RAM just isn't enough...

Nothinman · Mar 18, 2002

Yes, it's still around, and yes, it should log the process name and PID in the kernel logs (/var/log/kern.log on Debian systems). If the kernel log isn't going to a file, dmesg will print the recent kernel messages.
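
For what it's worth, here is a rough Python sketch of pulling the victims out of kernel log text. It assumes the kill is logged as a line containing "Killed process <pid> (<name>)", which is an assumption about the message format; the exact wording differs between kernel versions, so check a real line from your own dmesg first. The function name find_oom_victims is made up for this example.

```python
import re

# Assumed message shape, e.g. "Out of Memory: Killed process 1234 (myapp)."
# The wording varies by kernel version, so adjust the pattern to your logs.
KILL_LINE = re.compile(r"killed process (\d+) \(([^)]*)\)", re.IGNORECASE)


def find_oom_victims(log_text):
    """Return a list of (pid, name) tuples, one per matching log line."""
    victims = []
    for line in log_text.splitlines():
        match = KILL_LINE.search(line)
        if match:
            victims.append((int(match.group(1)), match.group(2)))
    return victims
```

Passing it the text that dmesg prints (or the contents of the kernel log file, if you can read it) should give you the PID and name of each process that was killed.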

Armitage · Mar 18, 2002

Thanks, that's it.
I don't have root on the machine in question, but grepping dmesg confirmed that my process is falling victim to the OOM killer.

Nothinman · Mar 18, 2002

If you can talk to the maintainer of the box, you may want to try 2.4.18 or even the -rmap patches. I believe Rik has made the OOM killer a good bit smarter in the -rmap patches.

Armitage · Mar 18, 2002

Yeah, we may have an "opportunity" to rebuild this machine soon anyway. But I doubt a new kernel will help this situation too much. We're going to have to balance the load out better, and maybe rework the memory model of the app somewhat. But it may be worth trying the new kernel.

Nothinman · Mar 19, 2002

"We're going to have to balance the load out better, and maybe rework the memory model of the app somewhat. But it may be worth trying the new kernel."

That's really the better solution. If your app is starving the system of memory, you need to either fix the app or add more memory/swap.

Armitage · Mar 19, 2002

It's already got 1GB RAM and 2GB swap. And Alpha RAM isn't cheap!
The situation hasn't been good for a while. But recently two big cron jobs that used to run sequentially have started to overlap due to increased run size and other loads on the machine. That's what spiked it. So, spacing out the cron jobs a bit will solve the problem in the short term, but we really have a long-term scalability problem.
Off to look at the code...
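
Besides spacing the jobs out by time, one possible way to keep two jobs from overlapping is to have each take the same advisory file lock before doing its heavy work. This is only a sketch of that idea, not something already in place here: it assumes both jobs can be wrapped in Python, and the helper name exclusive_lock and the lock file path are made up for illustration.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks here if another job already holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)
```

Each job would then wrap its memory-hungry section in "with exclusive_lock(some_shared_path):", so whichever starts second simply waits for the first to finish instead of running alongside it.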
