How does the OS pass unexpected data to a process?

chrstrbrts

Senior member
Aug 12, 2014
522
3
81
Hello everyone,

Here's what I mean:

I wrote an assembly program running on Linux that simply puts the string "Hello!" on the screen in the terminal window from where the program is run.

I compiled, linked, and ran it with no problems.

Then, I added this function to the source code:

crash:
push 5
ret

and called this function right before making the screen-write software interrupt.

I compiled and linked it with no problems.

When I ran it, the words 'segmentation fault' popped up on the screen and the process terminated.

This is to be expected; it's what I intended to do.

But, here's my question:

How did the OS pass this data to the process when the process wasn't expecting it?

I asked a question a couple of weeks ago about attaching event handlers to processes at the machine level.

The answer I received was that processes poll the OS and check for messages left in a queue for that process and then take appropriate action when a message has been detected.

But, you have to tell a process to check for messages.

In my example, there was no instruction to tell the process to check for messages from the OS.

So, how did the OS pass that segmentation fault data to my process for display?

Unless, it wasn't my process that put that string on the screen.

Maybe, my process was terminated by the OS before the data was placed on the screen, and the OS passed a message to the terminal window which is itself a process running in user space.

I would expect the terminal window process to check for messages from the OS as its sole purpose is communication with the OS.

What do you guys think?

Thanks.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,515
4,370
75
Unless, it wasn't my process that put that string on the screen.

Maybe, my process was terminated by the OS before the data was placed on the screen, and the OS passed a message to the terminal window which is itself a process running in user space.

There you go. The OS caught a segmentation fault, killed the offending process, and reported the error to the terminal.
 

exdeath

Lifer
Jan 29, 2004
13,679
10
81
There is also a mechanism called exception handling. An application using exception handling can install exception handlers that catch errors like divide by zero, memory faults, etc.

Ultimately, a hardware fault such as an invalid memory access, a divide by zero, or an invalid opcode causes a trap into kernel mode, which invokes an OS-owned interrupt handler for that type of fault and gives the OS ultimate control. If the kernel sees that the faulting process has an exception handler installed to catch that type of fault, it can play along and pass the exception on to the process, restarting it in its exception handler and interrupting the normal flow of execution. From the application's point of view, your code appears to stop suddenly at:

try { a = b / 0; }

and to resume just as suddenly at catch (DivideByZeroException e) { // oops... }, which gives the application a second chance to deal gracefully with its mistake and prompt the user.

Exception handlers are registered with the OS through syscalls, like any other user/kernel interaction. With a supporting compiler, the registration code sits between the process entry point (by default, a 32-bit Windows exe is loaded at 0x00400000) and main(). It is put there by the C runtime stub (crt0.s?), which sets up the stack, heap, main args, etc. before calling main(), and which also undoes all this and calls ExitProcess after main returns. Most of the data structures that handle exceptions in the application live in the C runtime and on the user stack of the process, and are mostly hidden from the programmer.

What you see when a process does not use exception handling, or does not catch the type of exception being thrown, is the OS invoking its default kernel-mode exception handler, which simply terminates the process and logs an app crash.

I'm rusty on *nix, but a segmentation fault (SIGSEGV) is delivered as a POSIX-style exception called a "signal", which is essentially the same mechanism.

As for the exact reason you are crashing here, in case you aren't aware: in x86 terminology it is interrupt 0x0E, the page fault exception.

By an entirely arbitrary convention that aids debugging (pointers are commonly set to NULL or 0 so that bad accesses are easy to identify), the first entry in the page table of a user process, which maps virtual addresses 0x00000000 through 0x00000FFF, is purposely left empty and flagged not present in every user-mode memory map. The point is to cause a fault on purpose if you access a NULL or zeroed pointer.

push 5
ret

sets CS:EIP = 0x1b:0x00000005 (0x1b is the user-mode code selector on 32-bit Windows; Linux uses a different selector value), which then attempts to load the next instruction from a non-present page. It's important to note that the fault doesn't occur on the ret; it occurs on the instruction fetch after the ret. The ret itself would only fault if ESP pointed to invalid memory (assuming an interrupt didn't occur first):

mov esp, 0
ret (or any implied [esp] access including push/pop would fault)

This is entirely on purpose by convention. An OS designer can put whatever they want at page 0.

There are several kilobytes of process initialization code that set up numerous things you may have taken for granted: the stack; the heap, which means both reserving heap memory from the OS (VirtualAlloc) and setting up the C malloc/free data structures in it; handles to stdin and stdout for printf; registration of exception handler callbacks; calls to global constructors; parsing of command line arguments; etc.

The OS doesn't care about any of this stuff. It simply maps the process into memory after the shell calls CreateProcess with your executable path, and calls the new process's entry point inside the image, which loads at 0x00400000 by default. Absolutely everything else from there on is the responsibility of the process. You may have noticed while debugging your applications that main() usually starts somewhere like 0x00430000. There is quite a bit of behind-the-scenes stuff happening before and after main() is actually called, including syscalls to the OS to tell it where to call back into the process to deliver exceptions, and it's all handled for you by crt0.s and perhaps dozens of other default runtime libraries and OS DLLs included in your exe automatically. Ever wonder why "hello world" takes up several tens or hundreds of kilobytes in the final .exe?

In Windows, setting the exception handler callback address is actually simple: it lives at fs:[0], the first dword of the thread information block for the current thread. Of course, the value stored there leads to a function pointer (it points to a handler record that holds one), and that pointer has to refer to valid code. You didn't write that code, so where is it? It's also part of the CRT, linked automatically into your process by the compiler.

What is fs? It's just an old x86 convention: another segment descriptor is aliased on top of your process's existing segment so you can conveniently access thread info from a 0 base with a simple segment selector prefix, without having to know where this structure sits in your process's flat (ds=es=ss) address space. That is to say, fs:[0] and ds:[ptr_to_thread_info_block] refer to the same exact location, which, as said, is the start of the thread info block, the first DWORD of which is the pointer to the user process exception handler chain. This segment nonsense is a whole 'nother can of zombie worms from the 1970s you're better off ignoring for now :D

I'm more familiar with Windows, but the APIs (CreateProcess, VirtualAlloc, etc.) and concepts have close equivalents in Linux, for example fork/execve and mmap.
 

chrstrbrts

Exception handlers are registered with the OS through syscalls like any other user <> kernel interaction. In a supported compiler it would be installed in the code between the process entry point at 0x00400000 and main() by the C runtime stub (crt0.s?) that sets up the stack, heap, main args, etc before calling main() (which also undoes all this and calls ExitProcess after main returns). Most of the data structures to handle exceptions in the application are contained in the C runtime and embedded in the user stack of the process and are mostly hidden from the programmer.

Wait. Are you saying that the C compiler has dynamic run-time aspects?

I thought that the C compiler, any compiler really, took your ASCII file and converted it into a relocatable object file to be linked and converted into an executable object file.

That is, I thought calls to compiler provided functions, which are usually just wrappers for OS API functions, were converted into machine code and assembled statically leaving you with a single, self-contained executable file that is loaded and run without reliance upon any other software entity except for the OS which provides the loading and memory management as the process is executed.

But, you seem to imply that these compiler functions are instead called out to as the program is running and that the compiler has some dynamic linked library quality to it.
 

Merad

Platinum Member
May 31, 2010
2,586
19
81
When you link to a library (the C runtime or anything else) you choose whether to link statically (library contents are part of your executable) or dynamically (they are loaded at run time).
 

exdeath

Wait. Are you saying that the C compiler has dynamic run-time aspects?

I thought that the C compiler, any compiler really, took your ASCII file and converted it into a relocatable object file to be linked and converted into an executable object file.

That is, I thought calls to compiler provided functions, which are usually just wrappers for OS API functions, were converted into machine code and assembled statically leaving you with a single, self-contained executable file that is loaded and run without reliance upon any other software entity except for the OS which provides the loading and memory management as the process is executed.

But, you seem to imply that these compiler functions are instead called out to as the program is running and that the compiler has some dynamic linked library quality to it.

I'm saying your C program does not actually start at main(). The compiler/linker puts a whole bunch of initialization code at the true exe entry point, and that code runs before main is even called. main() is not the process entry point.

The C runtime is not "compiler functions"; it is the C library functions and housekeeping code that are linked into your exe. If you call malloc or fopen, you are using what is collectively referred to as the C runtime. It includes the crt0.s entry point stub that initializes the process before calling main(), which among other things initializes the structured exception handling chain and registers it with the OS.

In a modern OS like Windows there are numerous things that are linked dynamically via DLLs, namely user32.dll, kernel32.dll, and ntdll.dll, which the C runtime "wrappers", as you accurately describe them, depend upon. For example, fopen ultimately calls CreateFile, which calls NtCreateFile, which ends in ntdll.dll performing a syscall (historically via software interrupt int 2e, on newer systems via the sysenter/syscall instructions).

The C language compiler and the C standard library functions, written in C in the target platform's SDK, are two different things. All a compiler does is turn a .c into a .o. It's the linker that does most of the dirty work.
 