Originally posted by: sao123
Quite an interesting assignment you got there.... As a one-time instructor in C++ and Data Structures, with a particular interest in Assembly Language and System Architecture, I'd be interested to see the solution to it. :sun:
Right. When the timeslice ends, the kernel saves all the registers. Among the registers it saves are EIP and ESP. It can see whether the instruction pointer is in the stack.
Read over your explanation of buffer overflows. The exploit code starts out on the stack. To get that code to execute, you set the return address to point to your code on the stack.
That's not what I said...
The instruction pointer (EIP) never points to an address in the stack (in the range EBP, ESP). What I said was... the return instruction pointer contents (an address) are pushed onto the stack, and then popped off the stack.
In all of my solutions to the assignment I linked to, my exploit code most certainly resides on the stack. The return address is overwritten with the expected location of the exploit code - somewhere on the stack. When the vulnerable function returns, EIP is pointing to the stack.
When a function is called, EIP is advanced to the next instruction, and then the contents of EIP (the address of that next instruction) are pushed (stored) as DATA on the stack; this eventually becomes the return address when it is popped. If this DATA (the return address) is overwritten by a different valid address, then when the return instruction executes, the return address is popped from the stack back into EIP, and code will continue to execute at the new return address.
Right.
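The call/return mechanics agreed on above can be modeled in a few lines of C. This is only a toy sketch: ToyCpu, toy_call, toy_ret and the 64-byte stack are hypothetical names and sizes made up for illustration, not how any real CPU exposes its registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64u

/* Hypothetical toy machine: the two registers discussed in the thread,
   plus a small byte array standing in for stack memory. */
typedef struct {
    uint32_t eip;                 /* instruction pointer */
    uint32_t esp;                 /* stack pointer, as an offset into stack[] */
    uint8_t  stack[STACK_BYTES];
} ToyCpu;

void toy_init(ToyCpu *cpu, uint32_t entry)
{
    memset(cpu, 0, sizeof *cpu);
    cpu->eip = entry;
    cpu->esp = STACK_BYTES;       /* empty stack; it grows toward lower offsets */
}

/* push: make room for 4 bytes, then store the value as plain data */
void toy_push(ToyCpu *cpu, uint32_t value)
{
    assert(cpu->esp >= 4);
    cpu->esp -= 4;
    memcpy(&cpu->stack[cpu->esp], &value, 4);
}

/* pop: read the 4 bytes on top of the stack and release them */
uint32_t toy_pop(ToyCpu *cpu)
{
    uint32_t value;
    assert(cpu->esp + 4 <= STACK_BYTES);
    memcpy(&value, &cpu->stack[cpu->esp], 4);
    cpu->esp += 4;
    return value;
}

/* call: push the address of the instruction after the call, then jump */
void toy_call(ToyCpu *cpu, uint32_t target, uint32_t call_len)
{
    toy_push(cpu, cpu->eip + call_len);
    cpu->eip = target;
}

/* ret: pop whatever is on top of the stack into EIP, with no validation */
void toy_ret(ToyCpu *cpu)
{
    cpu->eip = toy_pop(cpu);
}
```

If the 4 bytes at cpu.stack[cpu.esp] are overwritten between toy_call and toy_ret, EIP ends up holding whatever was written there, which is the whole trick.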
No, that would cause an overflow, which would either be caught or ignored. If your data type is specified as int, no matter what you do, the CPU isn't going to generate more than 32 bits of data.
What you're suggesting to me here is that buffer overflow usually must take place on unformatted data... interesting. Now I see: arrays of any type, though formatted by a particular type, are very much subject to overflow because there is no true boundary... I missed that linking concept last time I studied this. Thanks for the insight.
Pass by value parameters are passed in array form inside the stack which facilitates buffer overflow during function calling.
Buffer overflows happen with arrays of anything. If the array is a fixed-size local variable (e.g. char mystring[10]), it will be on the stack, and thus a buffer overflow can result in manipulation of the return address. If the array is dynamically allocated (e.g. char* mystring = (char*) malloc(10*sizeof(char))), it will live on the heap, where an overflow overwrites neighboring heap data rather than a return address.
edit:
Let's say we have the following function:
void crackMe()
{
    char myBuf[12];
    printf("enter your name\n");
    Gets(myBuf);
}
Right after crackMe is called, the stack looks like:
eip - where crackMe's caller will return to
ebp - saved ebp
...
caller's local variables
...
eip - where crackMe should return to
ebp - saved ebp
The next step is to create room for the buffer on the stack:
eip - where crackMe's caller will return to
ebp - saved ebp
...
caller's local variables
...
eip - where crackMe should return to
ebp - saved ebp
_ _ _ _
_ _ _ _
_ _ _ _
Note that we now have a 12-byte buffer on the stack.
Ignore the printf.
Now we're at the Gets. We push the address of that buffer and call our Gets function. Let's say the user enters 11 characters.
The stack when Gets returns will be:
eip - where crackMe's caller will return to
ebp - saved ebp
...
caller's local variables
...
eip - where crackMe should return to
ebp - saved ebp
h e r 0 - note terminating zero.
s t o p
C h r i
All is well. What if, instead, they enter something longer?
eip - where crackMe's caller will return to
ebp - saved ebp
...
caller's local variables
...
n j 0 0 - where crackMe should return to
x I p w - saved ebp
3 t h a
m y l 3
F e a r
Now the return address has been overwritten. In this case, the function will return to address 0x30306A6E (the ASCII codes for n j 0 0, read as a little-endian value), and the program will most likely crash (segfault, protection violation, illegal instruction, whatever).
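To check which bytes land where without actually smashing a real stack, the frame can be modeled as a flat 20-byte array. This is a sketch under assumptions: ToyFrame, toy_gets and toy_return_address are hypothetical names, the layout (12 bytes of myBuf, 4 of saved ebp, 4 of return address) is taken from the diagrams above, and the copy is clamped to the model so the demo itself never writes out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_BYTES = 12, RET_OFFSET = 16, FRAME_BYTES = 20 };

/* Hypothetical model of crackMe's frame, lowest address first:
   [0..11] myBuf, [12..15] saved ebp, [16..19] return address. */
typedef struct {
    uint8_t bytes[FRAME_BYTES];
} ToyFrame;

/* Mimic an unchecked read: copy the input and its terminating zero
   starting at myBuf, paying no attention to BUF_BYTES. The copy is
   clamped to FRAME_BYTES only to keep this model in bounds.
   Returns the number of bytes written. */
size_t toy_gets(ToyFrame *frame, const char *input)
{
    size_t n;
    assert(frame != NULL && input != NULL);
    n = strlen(input) + 1;
    if (n > FRAME_BYTES)
        n = FRAME_BYTES;
    memcpy(frame->bytes, input, n);
    return n;
}

/* Read the 4 return-address bytes as a little-endian 32-bit value,
   which is how x86 interprets them on return. */
uint32_t toy_return_address(const ToyFrame *frame)
{
    const uint8_t *p;
    assert(frame != NULL);
    p = &frame->bytes[RET_OFFSET];
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Feeding it the 11-character name leaves the return-address bytes untouched, while the 20-character string replaces them with the bytes for n j 0 0.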
Instead, let us enter a string which contains some machine language, with the 5th set of 4 bytes (the one that lands on the return address) specially crafted to point somewhere on the stack:
eip - where crackMe's caller will return to
ebp - saved ebp
...
caller's local variables
...
22 a3 ff bf - where crackMe should return to
# # # # - saved ebp
# # # #
# # # #
# # # # - let us assume that this is location 0xbfffa322, for convenience
Obviously that return address is somewhere on the stack in Linux... it would have to be crafted to fit the program you're exploiting. The #s represent machine language I don't feel like generating for this post. Now, when crackMe returns, it will find that return address (0xbfffa322) and jump there - now it is executing our exploit code off the stack. If the code is complex enough, or we get lucky timing, a context switch will occur while we're executing exploit code off the stack. The kernel sees EIP pointing into the stack, and uh oh, we know we've been exploited.
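The byte order in that diagram (22 a3 ff bf for 0xbfffa322) follows from x86 being little-endian. A minimal sketch of laying out such an input, assuming the 20-byte layout from the diagram; toy_build_input is a hypothetical helper, and the '#' bytes are the same placeholders as in the post, not real machine code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FILLER_BYTES = 16, PAYLOAD_BYTES = 20 };

/* Lay out the input from the diagram: 16 placeholder bytes covering
   myBuf and the saved ebp slot, then the chosen return address stored
   least significant byte first (little-endian). */
void toy_build_input(uint8_t out[PAYLOAD_BYTES], uint32_t buffer_address)
{
    assert(out != NULL);
    memset(out, '#', FILLER_BYTES);
    out[16] = (uint8_t)(buffer_address & 0xffu);
    out[17] = (uint8_t)((buffer_address >> 8) & 0xffu);
    out[18] = (uint8_t)((buffer_address >> 16) & 0xffu);
    out[19] = (uint8_t)((buffer_address >> 24) & 0xffu);
}
```

The address has to match where the buffer really sits in the target process, which is why it must be crafted per program.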
Exploits that jump to existing code (let's say IIS had a downloadAndExecuteFile function somewhere) could escape detection most of the time of course, because they'd just jump to an allowed location, but if your exploit code itself does anything fancy, it could be caught.
This is what my prof said when I asked:
Chris,
This would help with that particular exploit. However, some
OS's such as Linux actually depend on being able to run
user code on the stack in order to process signal handlers,
so somehow you'd have to allow for this. Also, there are
other exploits that execute existing code and others that
work by overwriting heap and .data buffers that would slip by.
edit2: I won't post my exploits because I know at least a few schools use this same assignment... but if you want to talk about it, pm me or AIM/ICQ me.