Fork problems :)

Quaggoth

Senior member
Jun 23, 2000
800
0
0
OK. I know there isn't much info here, but I should be able to give you more shortly. I just got a call from a friend at another company, and he says that one of their customers who has a Unix server is getting the error "Cannot Fork"... He is getting the rest of the error for me, and I will post it ASAP. He also says that they are getting an error relating to the "Mail Daemon". I'll be honest with you, I don't know much about Unix, but I DO know quite a bit about hardware. He seems to think it's some sort of a hardware problem. Anyway, if someone could offer some insight, I'd appreciate it.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

with this little information, this is a complete guess, but: if the system is unable to fork a new process, it may be running out of resources. check free memory and swap space with the "free" command.
 

Quaggoth

Senior member
Jun 23, 2000
800
0
0
OK, here is the rest of the error message.

"Could not fork to run/tcb/lib/initcond : resource temporarily unavailable"
then something about error 11.

Also,

"Could not fork to users shell : resource temporarily unavailable"
then something about error 11

 

Quaggoth

Senior member
Jun 23, 2000
800
0
0
Stuck - That sounds logical. Now, for a really stooopid question: how do I check free memory, and how do I check how much space is free on the swap partition? (I am guessing it has one because Linux does.) Thank you.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

in most *nix systems, you can check swap usage with "swapon" and an option...on digital unix it's -s...don't know for others and can't remember for linux...do a "man swapon". umm..some nixes also have the "vmstat" command for seeing virtual memory statistics. linux has the "free" command for listing free memory (both physical and swap, if i remember right); it might be in other nix flavors too. try "man -k free" and "man -k memory" and "man -k swap" and/or "man -k page" for clues as to what that particular system has.

are there any other clues in the console messages? tell him to look back at the last few hundred lines to see if he can pinpoint when this started.
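
if the box turns out to be linux, the numbers that "free" prints come from /proc/meminfo, so they can also be read directly. here is a rough python sketch of that, assuming the "Name: value kB" layout of that file. the function names are made up for illustration, and a non-linux unix will not have this file at all.

```python
def parse_meminfo(text):
    """Return {field name: first number on the line} for /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_fraction(info):
    """Fraction of swap in use, or None when no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return None
    return (total - info.get("SwapFree", 0)) / total


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        # Assumption: only linux-style systems provide this file.
        print("no /proc/meminfo here; try free, vmstat or swapon instead")
    else:
        print("MemFree (kB):", info.get("MemFree"))
        print("swap used fraction:", swap_used_fraction(info))
```

a swap fraction close to 1.0 together with very little free memory would fit the "running out of resources" guess.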
 

Quaggoth

Senior member
Jun 23, 2000
800
0
0
OK Stuck, I am probably annoying the heck out of you, but how do I check the console messages? (I am guessing that is something like the event log in NT?)

I promise I will RTFM later, but right now I do not have the time. :(

Hey, great sig BTW.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

hehe, yers too

as far as console messages go, it's different for every system. what flavor of unix is this? however, the one thing that is usually the same is that the messages and other log files live under the /var directory. in redhat it's /var/log/messages and the other log files are in /var/log . in digital unix (now known as Tru64 UNIX) they used to be in /var/adm/messages. it may be useful to examine the other relevant log files too.

try "find /var -name messages -print" to find the file. then go there and use "tail -100 messages" to see the last 100 lines. -200 would show the last 200, etc. you'll probably actually want "tail -100 messages | more" to get one screen of output at a time.
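
if "tail" is not handy, or you want to do this from a script, the same "last N lines" idea can be sketched in python. this is only an illustration: the function name is made up, and the default path is the redhat location mentioned above, which may not match the customer's system.

```python
import sys
from collections import deque


def last_lines(path, count=100):
    """Return the last `count` lines of a text file, like `tail -100 file`."""
    with open(path, errors="replace") as f:
        # A bounded deque keeps only the newest `count` lines while reading.
        return list(deque(f, maxlen=count))


if __name__ == "__main__":
    # Assumption: redhat-style log location unless a path is given.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    try:
        for line in last_lines(path):
            print(line, end="")
    except OSError as err:
        print("could not read", path, "-", err)
```

the file is read from the start, which is fine for a log of ordinary size.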

if it is an option, he may be able to temporarily clear up the problem with a reboot...give himself some time to look for the problem, while getting services running again...until it runs out of resources again....
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

um..also, it may just be that he has hit his process limit. most unixes set a limit on the number of processes that can be run. this is to prevent denial of service if 1000 processes get run and bring the machine to its knees. "ps -aux" should show all processes (on system v style unixes it's "ps -ef"). he should look for anything out of the ordinary that is sucking up cpu, like a few runaway cgis or something.
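
for what it's worth, on systems that expose a per-user process limit as RLIMIT_NPROC (linux and the BSDs do; i can't say whether the customer's unix does), the current limit can be read from python's standard "resource" module. a small sketch with made-up helper names. note this is the per-user limit only; the system-wide maximum is a kernel setting that differs per unix.

```python
import resource


def describe_limit(value):
    """Render a limit value, spelling out the 'no limit' marker."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def process_limits():
    """Return the (soft, hard) per-user process limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    soft, hard = process_limits()
    print("soft limit:", describe_limit(soft))
    print("hard limit:", describe_limit(hard))
    # Raising the soft limit up to the hard limit would be done with
    # resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard)).
```

from a shell, the same limits are usually visible through the shell's own "ulimit" builtin.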
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

hmm..sorry for the double post. the anandtech forum cluster burped there and i wasn't sure it got posted.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

also, look at the ps output for a buttload of occurrences of the same process, like a cgi or whatever. if there are like 100 processes all with the same name, he should kill them. it would indicate some sort of fork loop or bad program that keeps getting started but never dies. if this is the case, he should disable that program until he figures out the problem.
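
counting by eye gets old fast, so here is a python sketch of that check. it assumes a ps that accepts the POSIX style "ps -e -o comm=" (one command name per line, no header), which an older unix may not support; the function names and the thresholds are made up for illustration.

```python
import subprocess
from collections import Counter


def count_commands(ps_output):
    """Count how many times each command name appears, one name per line."""
    names = (line.strip() for line in ps_output.splitlines())
    return Counter(name for name in names if name)


def suspicious(counts, threshold=100):
    """Return (name, count) pairs at or above the threshold, biggest first."""
    return [(name, n) for name, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    try:
        # Assumption: this ps understands -e and -o comm=.
        result = subprocess.run(["ps", "-e", "-o", "comm="],
                                capture_output=True, text=True, check=False)
    except OSError as err:
        print("could not run ps:", err)
    else:
        for name, n in suspicious(count_commands(result.stdout), threshold=20):
            print(n, name)
```

anything that shows up dozens of times and is not a known server process deserves a closer look.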
 

Quaggoth

Senior member
Jun 23, 2000
800
0
0
He is getting the flavor for me as we type. Is there a way to tell Unix to allow like 1500 processes? If this is hardware, what would be your best guess? Mine would be either add memory, or maybe a HDD went down. The thing that points me to the HDD is just the error that states

"Could not fork to run/tcb/lib/initcond : resource temporarily unavailable"
then something about error 11.

That seems to make me think that it is expecting that drive to be swapped or something.

Do you have any idea what error 11 is, or what man page to look at?

He said that the first time he went over there, the drive that the error is referring to was unmapped, and he remapped it.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76

"mapped"? you mean mounted? he can check what drives are mounted with "mount".

where is he seeing the error message anyway? on the console? can he get the full text of it?

the reason why i'm leaning toward resources is that in unix, "fork" is a system call that a process makes to start a new copy of itself. this is how everything used to work before there were threads. whereas today you start a new thread, you used to instead fork a new copy of yourself to do things, and then communicate between the processes with pipes or shared memory. lots of programs still use forking instead of multithreading. when the kernel can't create that copy, because the process table is full or there isn't enough memory or swap for it, the fork fails, and "resource temporarily unavailable" is the standard message for that kind of failure.

you could be right, however: the message may actually mean that it can't fork a new process with the working directory as the one listed, because it's not there (ie the drive isn't mounted).

if the problem were that the process limit has been reached, you would hardly be able to do anything on the machine. you wouldn't be able to log in to any new sessions, because it wouldn't be able to start a shell. you would only be able to use shells that were already started, and from them, only use builtin commands. basically, it means you wouldn't be able to start any new programs, period.

i wouldn't worry about how to fix this yet. at this point, just diagnose the problem, then worry about fixing it ;) (that refers to your question on how to up the process limit).
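
to make the fork idea concrete, here is a minimal python sketch (unix only; the function name is made up). it forks once, lets the child exit right away, and has the parent collect the exit code. it also catches the failure case. on linux, error number 11 is EAGAIN, whose message text is "Resource temporarily unavailable"; that matches the wording of the error quoted earlier, though i am only assuming the customer's unix numbers its errors the same way.

```python
import errno
import os


def run_in_child(exit_code=0):
    """Fork once and return the child's exit code.

    Returns None if fork itself fails with EAGAIN.
    """
    try:
        pid = os.fork()
    except OSError as err:
        if err.errno == errno.EAGAIN:
            # The kernel refused to create another process.
            print("fork failed:", os.strerror(err.errno))
            return None
        raise
    if pid == 0:
        # Child: a real program would do its work or exec another program here.
        os._exit(exit_code)
    # Parent: wait for the child so it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("child exited with", run_in_child(3))
    print("EAGAIN is error number", errno.EAGAIN, "on this system")
```

the parent waiting on the child matters: a program that forks and never waits leaves dead entries in the process table, which is one way a machine ends up unable to fork.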

is the machine acting normally otherwise? what process is generating this error? how is he reading this error? is he sitting at the console? we need more info...

to see what drives are currently mounted, use "mount"; to see what is normally mounted at startup, use "cat /etc/fstab" (that is the linux name; some other unixes keep this table in a differently named file).
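
comparing those two by hand is easy to get wrong, so here is a python sketch that lists filesystems named in an fstab but missing from the mount output. it assumes the linux fstab layout (device, mount point, type, options) and linux style mount lines ("device on /mountpoint ..."); other unixes format both differently. the function names are made up, and the device names in any sample you feed it are your own.

```python
def fstab_mount_points(fstab_text):
    """Mount points an fstab asks to mount at startup, in file order."""
    points = []
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 3 or not fields[1].startswith("/"):
            continue  # blank line, swap entry, or malformed line
        options = fields[3].split(",") if len(fields) > 3 else []
        if "noauto" in options:
            continue  # not mounted automatically at startup
        points.append(fields[1])
    return points


def mounted_points(mount_output):
    """Mount points found in linux style `mount` output."""
    points = set()
    for line in mount_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "on":
            points.add(fields[2])
    return points


def missing_mounts(fstab_text, mount_output):
    """Mount points expected by the fstab that are not currently mounted."""
    mounted = mounted_points(mount_output)
    return [p for p in fstab_mount_points(fstab_text) if p not in mounted]
```

feed it the text of /etc/fstab and the output of "mount"; an empty list means everything expected at startup is mounted.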
 

Quaggoth

Senior member
Jun 23, 2000
800
0
0
OK, I am going over with him tomorrow to take a look at the machine. I will get all the info I can then. PLEASE check back tomorrow. I will create a new thread at about 7:30PM GMT (12:30 PM here in AZ). I will call it fork problems 2. Thank you for all of your help.
 

StuckMojo

Golden Member
Oct 28, 1999
1,069
1
76
no problem, good luck.

i have to go help my GF move now anyway ;)

PS, a good idea would be to copy the messages file to a disk or ftp it to your machine so you have it to cut and paste into this thread.

[edit] fixed speggleing erroz :p [/edit]