Linux Kernel Oops???

BCYL

Diamond Member
Jun 7, 2000
7,803
0
71
I am running Red Hat 8.0 with the 2.4.18 kernel... Sometimes I get a kernel oops, and the system doesn't function properly afterwards...

Is there a way to make the system automatically reboot after a kernel oops? I tried doing:

echo 1 > /proc/sys/kernel/panic

But that didn't help... seems like this will only reboot the system after a kernel panic at the interrupt level...

Any help would be very much appreciated...
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
An oops isn't a panic so it makes sense that /proc/sys/kernel/panic wouldn't do anything.
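
For reference, here is a minimal sketch of reading that setting back, assuming the usual meaning of /proc/sys/kernel/panic: the number of seconds the kernel waits before rebooting after a panic, with 0 meaning it never reboots on its own. The function name describe_panic_timeout and the optional path argument are made up for illustration. Nothing in it reacts to an oops; it only reports what the panic setting currently says.

```shell
#!/bin/sh
# Hypothetical helper: describe the panic reboot timeout stored in a file.
# $1 is the file to read; it defaults to /proc/sys/kernel/panic.
describe_panic_timeout() {
    file="${1:-/proc/sys/kernel/panic}"
    value=$(cat "$file") || return 1
    if [ "$value" -gt 0 ]; then
        echo "panic: reboot after ${value}s"
    elif [ "$value" -eq 0 ]; then
        echo "panic: no automatic reboot"
    else
        # Negative values are not covered by this sketch.
        echo "panic: negative timeout, check your kernel docs"
    fi
}
```

Running describe_panic_timeout with no argument after the echo above should report a 1 second timeout, which still only applies to a real panic.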

There's something in the kdb patches that might help; coupled with the lkcd patches, it seems to cause a crash dump and reboot on an oops or panic. I haven't used it though, so I have no idea how well it really works.
 

Buddha Bart

Diamond Member
Oct 11, 1999
3,064
0
0
Sign up for your free demo Red Hat Network account by running "rhn_register".

From there, you can use "up2date" to update any package in the entire Red Hat distribution that has had a bugfix, security fix, or even the rare feature enhancement. For instance, here's a list of updates that are available for Red Hat 8.0: https://rhn.redhat.com/errata/rh8-errata.html
This includes 5 security fix releases and two bugfix releases of the kernel.

By default, up2date will not update kernels, as a 'just in case' measure, but you can edit the config file to change this.
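
As a rough sketch of what to look for, assuming the config file is /etc/sysconfig/rhn/up2date and the relevant setting is its pkgSkipList line (which I believe lists kernel* on a stock install). The function name skips_kernel is made up for illustration, and both the path and the key are assumptions you should verify on your own box before editing anything.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the up2date config still skips kernel packages.
# $1 is the config file; the default path is an assumption.
skips_kernel() {
    conf="${1:-/etc/sysconfig/rhn/up2date}"
    # Look only at the pkgSkipList setting, then check whether it names kernels.
    grep '^pkgSkipList=' "$conf" | grep -q 'kernel'
}
```

If skips_kernel succeeds, the kernel entry is still in the skip list and would need to be taken out of that line before up2date offers kernel updates.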

Once you're running the latest and theoretically bug-free kernel, let us know if you still get your Oops.
 

BCYL

Well, I believe the oops is coming from a 3rd party kernel module which I need, so updating the Red Hat kernel wouldn't help much...

After the oops, the 3rd party module usually hangs and cannot be killed... thus I have to do a reboot to clean it up... Therefore I need a way to do this automatically without having someone go in and do it manually...

Actually I'm not that clear what's the difference between a kernel panic and kernel oops? What does a kernel panic look like?
 

Nothinman

Originally posted by: BCYL
Actually I'm not that clear what's the difference between a kernel panic and kernel oops? What does a kernel panic look like?

A panic is something so bad that the kernel can't figure out what to do next, so it halts the system. An oops is when something in the kernel did something bad, like trying to access a NULL pointer. It's bad, but usually not bad enough to halt the system.

A kernel panic just halts the system, I believe. Nothing can continue after a kernel panic.

Actually, the proc file he mentioned should cause the box to reboot instead of hanging, but even if that's not true it wouldn't be terribly hard to add a reboot to the end of the panic() function if you really wanted to.

Originally posted by: BCYL
After the oops, the 3rd party module usually hangs and cannot be killed... thus I have to do a reboot to clean it up... Therefore I need a way to do this automatically without having someone go in and do it manually...

Contact the 3rd party and get them to fix their sh!t.
 

BCYL

I did contact the 3rd party, but no fix is available yet, and I am on a deadline... so basically now I am looking for a way to automatically recover from the oops... this way the system is at least still functional after the reboot and hopefully fixes itself...
 

Nothinman

I don't think there is one. An oops isn't always bad enough to cause a panic, so an automated reboot normally won't happen. The only thing I can think of is the kdb/lkcd patches I mentioned earlier, but I'm not even 100% sure they'll help you, because the 3rd party module might not play well with them or you might not even be able to use it with a custom kernel.
 

BCYL

Well, thanks anyway... I'll look into the patches you mentioned and see if they help... If they don't, I guess I will just have to find another way to monitor for the oops...
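
One way such a monitor could look, as a minimal sketch: a function that reads kernel log text on stdin and checks it for the usual oops signatures. The patterns ("Oops:" and "Unable to handle kernel") are what 2.4-era i386 oops reports typically print, but treating them as a complete list is an assumption, and the function name has_oops is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel log text on stdin contains
# an oops signature. The pattern list is an assumption, not exhaustive.
has_oops() {
    grep -q -E 'Oops:|Unable to handle kernel'
}

# Example use from cron (not run here):
#   dmesg | has_oops && /sbin/shutdown -r now
```

Feeding it dmesg output rather than /var/log/messages is deliberate: the kernel ring buffer starts fresh on each boot, so an old oops should not trigger a reboot loop. It also assumes the box is still healthy enough after the oops to run cron and shutdown, which may not hold if the module hang is bad enough.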