-------------------------------------------
Stack Shield technical info file v0.6
-------------------------------------------

Index:

  1.  Introduction
  2.  The Linux process memory model
  3.  Functions
  4.  The "stack smashing" technique
  5.  The Stack Shield protection system
  6.  Where to get more detailed info

1.  Introduction
The purpose of this document is to describe the "stack smashing" technique and
how Stack Shield protects programs against it.
This document refers to the Intel 386 processor class and the Linux operating
system.
It also assumes that you know the fundamentals of Intel 386 class assembler,
ANSI C, and the Linux operating system.
Before explaining the "stack smashing" technique, the next two sections of this
document cover the process memory model under Linux and the assembler
representation of the functions in a C program (compiled with GCC).
These are key concepts for understanding the "stack smashing" technique and the
Stack Shield protection system.

2.  The Linux process memory model
When a Linux process is started a memory area is allocated for it. The process
can access only this memory area and some special memory areas. It cannot
access the areas of other processes.
The memory area is divided into 3 segments: the TEXT segment, the DATA segment
and the STACK segment.
The TEXT segment is the first in the process memory area and contains the
process binary code and constant data. After the process is started it cannot
be written: any attempt to write to this segment, or to the memory area of
another process, causes the kernel to kill the process with a segmentation
fault error.
The DATA segment comes right after the TEXT segment and contains some of the
process data. In a C (GCC compiled) program this segment contains the global
variables and the local static variables. Either kind may or may not have a
predefined (initial) value: those that do are stored at the beginning of the
segment, the others are stored after them. After the variables there is a free
space of variable size.
The STACK segment is used to hold various types of data. It is a LIFO
structure: the last data pushed on the stack is the first data popped. On an
Intel 80x86 processor the stack grows in the opposite direction to memory
addresses. The stack starts at the last word (4 bytes) of the memory area. When
a value is pushed on the stack, the stack grows backwards using the free space
of the DATA segment and the new value is placed just before the last pushed
value. When a value is popped from the stack, the memory used for it is
released.
In a C (GCC compiled) program the stack is used to perform function calls and
returns, to pass parameters to functions and to store local auto variables.
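
The push and pop behaviour described above can be sketched in C. This is only a
simulation under stated assumptions: the memory area is a small array of 4-byte
words, and the names sim_stack, sim_init, sim_push and sim_pop are made up for
this example. The stack pointer starts just past the last word and moves
towards lower indexes on each push.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 8   /* size of the simulated memory area, in 4-byte words */

/* Simulated memory area (hypothetical); the stack lives at its high end. */
struct sim_stack {
    uint32_t mem[STACK_WORDS];
    int sp;             /* index of the last pushed word, STACK_WORDS if empty */
};

void sim_init(struct sim_stack *s)
{
    s->sp = STACK_WORDS;
}

/* Push: move the stack pointer towards lower addresses, then store. */
void sim_push(struct sim_stack *s, uint32_t value)
{
    assert(s->sp > 0);              /* no free space left below the stack */
    s->sp--;
    s->mem[s->sp] = value;
}

/* Pop: load the last pushed word, then release its slot. */
uint32_t sim_pop(struct sim_stack *s)
{
    assert(s->sp < STACK_WORDS);    /* the stack must not be empty */
    return s->mem[s->sp++];
}
```

Pushing 10 and then 20 leaves 20 at the lower index, and the first pop returns
20: the last value pushed is the first one popped.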

3.  Functions
Functions (sometimes called procedures) are a fundamental construct of most
modern programming languages, since they are the basis of structured
programming.
To make them easier to implement in assembler, processors provide specific
instructions. In the Intel 80x86 processor series the two main instructions for
implementing functions are CALL and RET.
To call a function, a C (GCC compiled) program pushes all the function
parameters on the stack and issues the CALL instruction followed by the
function address. This instruction pushes the value of the EIP register (the
address of the instruction after the CALL) on the stack and loads EIP with the
address of the function, causing the processor to execute the first instruction
of the function.
When a function has to return to its caller it issues the RET instruction,
which pops the value pushed by the CALL instruction back into EIP.
The stack location where that value is stored for a function is usually called
RET.
The CALL and RET instructions provide the basics for implementing functions.
However, the compiler adds fixed pieces of code at the beginning and at the end
of each function to perform other necessary tasks. These pieces of code are
called the prolog and the epilog respectively.
As we said before, C functions use the stack to store their local auto
variables. To refer to them they do not use an absolute address but a relative
offset: a negative number that is added to the EBP register value to determine
their address. The function prolog pushes the EBP value on the stack and then
sets EBP to the ESP value, so that it points to the stack top.
When the function pushes other data on the stack the ESP register changes but
the EBP register remains constant, so it can be used as the base address for
local auto variables. In fact, just after executing the prolog the function
decrements the ESP register to allocate space for them.
When a function has to return, before executing the RET instruction it executes
the epilog. The epilog resets the ESP register to the EBP value (freeing all
local auto variables and other pushed data), then restores the caller's EBP
value by popping it from the stack.
The location where the EBP value is stored on the stack for a function is
usually called SFP (Saved Frame Pointer).
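
The CALL, prolog, epilog and RET steps described in this section can be traced
with a small C simulation. It is a sketch under stated assumptions, not GCC
output: the stack is an array of 4-byte words, ESP and EBP are word indexes
instead of byte addresses, and the names cpu, do_call, do_prolog, do_epilog and
do_ret are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 32

/* Hypothetical machine state: a word-indexed stack and three registers. */
struct cpu {
    uint32_t mem[MEM_WORDS]; /* simulated stack, one element per 4-byte word */
    uint32_t esp;            /* index of the stack top (grows towards 0) */
    uint32_t ebp;            /* frame base of the running function */
    uint32_t eip;            /* address of the next instruction */
};

void push(struct cpu *c, uint32_t v)
{
    assert(c->esp > 0);
    c->mem[--c->esp] = v;
}

uint32_t pop(struct cpu *c)
{
    assert(c->esp < MEM_WORDS);
    return c->mem[c->esp++];
}

/* CALL: push EIP (assumed to point after the CALL), then jump. */
void do_call(struct cpu *c, uint32_t func_addr)
{
    push(c, c->eip);         /* this slot is the RET */
    c->eip = func_addr;
}

/* Prolog: save the caller's EBP (SFP), set EBP = ESP, reserve the locals. */
void do_prolog(struct cpu *c, uint32_t local_words)
{
    push(c, c->ebp);         /* this slot is the SFP */
    c->ebp = c->esp;
    assert(c->esp >= local_words);
    c->esp -= local_words;
}

/* Epilog: free the locals (ESP = EBP), then restore the caller's EBP. */
void do_epilog(struct cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}

/* RET: pop the saved return address back into EIP. */
void do_ret(struct cpu *c)
{
    c->eip = pop(c);
}
```

After the prolog, mem[ebp] holds the SFP, mem[ebp + 1] the RET and mem[ebp + 2]
the first parameter, while the local auto variables sit at the indexes below
ebp.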

4.  The "stack smashing" technique
"Stack smashing" is a technique used in some exploits to violate system
security by causing a process to execute arbitrary assembler code contained
in data sent to it as input. It is the most common technique used in exploits.
It relies on missing input data validation checks in programs.
Unlike other languages, C has a very flexible data type system that allows
low level access to memory, but as a result the program developer has to take
care of data validation checks that other languages perform automatically.
Specifically, C has no string type but represents strings as arrays of
characters terminated by a NUL (0) character. Arrays (and buffers) always have
a defined size.
The size may be fixed or determined when the buffer is allocated. For
dynamically allocated buffers it can be changed with an explicit call to the C
function realloc(), but when data is written to a buffer there is no automatic
check to prevent overflows.
Implementing these checks is up to the programmer.
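
A check of this kind can be as small as the following sketch. The function
name copy_checked and its return convention are assumptions made for this
example, not part of the C library. It refuses to copy a string that does not
fit into the destination buffer instead of writing past its end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the string src into dst, which has room for dst_size bytes.
 * Returns 0 on success, or -1 (leaving dst as an empty string) if src,
 * including its terminating 0 byte, does not fit.
 */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t needed = strlen(src) + 1;  /* characters plus the terminating 0 */
    if (needed > dst_size) {
        dst[0] = '\0';
        return -1;                    /* reject instead of overflowing */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

With an 8 byte buffer, a 7 character string is accepted and an 8 character
string is rejected, because the terminating 0 byte needs a byte of its own.
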
When a local auto buffer overflows, the excess data overwrites the local auto
variable allocated before it. If that variable is too small to hold the excess
data, the variable allocated before that one is overwritten too, and so on.
When the first allocated local auto variable is overflowed, the excess data
overwrites the SFP first, then the RET and finally the function parameters,
from the first to the last.
The "stack smashing" technique consists in sending a program input data that
exploits missing input validation checks to cause a buffer overflow. The data
size is calculated so that it overflows the RET (and sometimes the function
parameters), writing into it an address that usually points to a memory
location somewhere in the buffer. When the function returns, the RET
instruction pops this address into the EIP register, causing the execution of
the code placed in the buffer.
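
The overwrite order (buffer, then SFP, then RET) can be shown without touching
a real stack. The sketch below simulates one stack frame as a plain byte array.
The layout (a 16 byte buffer followed by a 4 byte SFP and a 4 byte RET) and the
names unchecked_copy, read_word and write_word are assumptions chosen for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout, from low to high addresses. */
enum {
    BUF_OFF = 0, BUF_SIZE = 16,  /* local auto buffer */
    SFP_OFF = 16,                /* saved frame pointer, 4 bytes */
    RET_OFF = 20,                /* return address, 4 bytes */
    FRAME_SIZE = 24
};

uint32_t read_word(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

void write_word(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/*
 * Copy input into the buffer the way a careless function would: nothing
 * compares len with BUF_SIZE. The assert only keeps this simulation
 * inside its own array; a real overflow has no such limit.
 */
void unchecked_copy(unsigned char frame[FRAME_SIZE],
                    const unsigned char *input, size_t len)
{
    assert(len <= FRAME_SIZE);
    memcpy(frame + BUF_OFF, input, len);
}
```

Copying BUF_SIZE bytes leaves the SFP and the RET untouched. Copying FRAME_SIZE
bytes replaces both, so whatever the input holds at offset RET_OFF becomes the
address the function returns to.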

5.  The Stack Shield protection system
In the current version (0.6 beta) the Stack Shield protection system supports
one primary protection method plus a secondary one for special attacks. The
primary method adds assembler code to the prolog and epilog of the program's
functions; this code uses a sort of separate stack to store the RET address.
The declaration of a global array of longs and of two global pointers to long
is added to the assembler file. By default the array has 256 elements, but this
number can be changed using the -l option. The first pointer points to the
beginning of the array, while the other points just after its last element.
In the function prolog the first pointer (called retptr) is compared with the
second one (called rettop). If retptr is greater than or equal to rettop,
retptr is just incremented by one element. If it is less, the RET address is
copied to the memory location retptr points to and then retptr is incremented.
In the function epilog retptr is decremented and compared again with rettop.
If retptr is greater than or equal to rettop the function just executes the
RET instruction. If it is less, the value at the memory location retptr points
to is copied into the RET before the RET instruction is executed.
So when a function is called its RET address is placed in the array and the
retptr pointer is incremented to point to the next free element of the array.
Before this, however, retptr is compared with rettop to ensure that it points
somewhere inside the array. If retptr is greater than or equal to rettop, it is
not pointing into the array because the array is full, so the RET address is
not copied. The pointer retptr is incremented anyway, to keep count of the
number of RET addresses that were not copied into the array.
In the epilog retptr is decremented and compared again with rettop. If retptr
is greater than or equal to rettop, the RET address was not copied into the
array by the prolog, so the function just returns. If retptr is less than
rettop, the RET address was copied by the prolog, so it is restored from the
array before the RET instruction is executed.
So if an overflow in a function overwrites its RET address, but the address was
copied into the array by the prolog, the epilog restores it before the function
returns, preventing the execution of the code placed in the overflowed buffer.
At present no comparison is made between the RET address on the stack and its
copy in the array, so "stack smashing" attacks are blocked but not detected. If
there is an overflow in a function whose RET address was not copied into the
array because it was full, the attack is not blocked. So the number of nested
calls (including the call to the main function) that can be protected is the
number of elements in the array. The memory used by Stack Shield, in bytes, is
the number of elements x 4 + 8.
As we said before, if the number of nested calls is greater than the number of
elements in the array, the functions in excess have no protection from the
"stack smashing" technique but still work correctly. This is true as long as
retptr does not exceed 4294967295 (the largest 32-bit value). If it grows
beyond 4294967295, the functions will return abnormally, causing unrecoverable
errors in the program. This is a very rare condition; however, the -c flag
disables the protection system when it occurs, leaving the program unprotected
but still working.
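
The primary method can be modelled in C as follows. This is a sketch of the
logic described above, not the assembler code Stack Shield actually emits. It
uses an array index where Stack Shield uses the retptr pointer, the array has
4 elements instead of 256 to keep the example short, and the names shield,
shield_prolog, shield_epilog and shield_bytes are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define RET_ELEMS 4  /* 256 by default in Stack Shield; small for the demo */

/* Hypothetical model of the separate RET store. */
struct shield {
    uint32_t retarray[RET_ELEMS]; /* saved copies of the RET addresses */
    uint32_t retptr;              /* index form of retptr; rettop is RET_ELEMS */
};

/* Prolog: save a copy of RET if there is room, and always count the call. */
void shield_prolog(struct shield *s, uint32_t ret_on_stack)
{
    if (s->retptr < RET_ELEMS)
        s->retarray[s->retptr] = ret_on_stack;
    s->retptr++;
}

/*
 * Epilog: returns the value the RET slot holds when the RET instruction
 * runs. ret_on_stack is the current, possibly overwritten, stack value.
 */
uint32_t shield_epilog(struct shield *s, uint32_t ret_on_stack)
{
    assert(s->retptr > 0);
    s->retptr--;
    if (s->retptr < RET_ELEMS)
        return s->retarray[s->retptr]; /* restore the saved copy */
    return ret_on_stack;               /* array was full: not protected */
}

/* Memory used: 4 bytes per element plus two 4-byte pointers. */
uint32_t shield_bytes(uint32_t elems)
{
    return elems * 4 + 8;
}
```

If the stack copy of RET is overwritten between prolog and epilog, the epilog
still returns the saved value, so the attack is blocked without being detected.
With RET_ELEMS set to 4, a fifth nested call is counted but not saved, and its
epilog hands back whatever is on the stack. For the default 256 elements
shield_bytes gives 256 x 4 + 8 = 1032 bytes.
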
The secondary protection method handles the function pointer overwrite class of
exploits. When a buffer overflow overwrites a function pointer with an
arbitrary address (usually of some location in the buffer) and the function
pointer is then called, the program executes the attacker's code without being
stopped by the primary method, since the RET address has not been modified.
Also, the execution of the shell code may take place before the execution of
the function epilog.
The secondary method adds a portion of code at the beginning of the asm file
and before each function call with a non-constant parameter (call target). The
header declares a variable in the DATA segment. The part inserted before each
call checks that the parameter value does not lie in the DATA or in the STACK
segment. This is done by comparing the parameter with the address of the
previously declared variable. If the parameter is greater, it lies in the DATA
or in the STACK segment (or outside the process memory space). In this case the
program is terminated via an exit() system call, returning a nonzero value.
This method can cause errors in programs that normally execute asm code in the
DATA or in the STACK segment. If you experience unexpected program terminations
that are not caused by attack attempts, use the Stack Shield -f flag to disable
this protection method.
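
The comparison made before each indirect call can be written as a single test.
The sketch below works on plain 32-bit numbers standing in for addresses; the
function name call_target_allowed and the sample addresses in the note after
the code are hypothetical and only follow the TEXT, DATA, STACK order described
in section 2.

```c
#include <assert.h>
#include <stdint.h>

/*
 * target:      the address about to be called through a function pointer
 * data_marker: the address of the variable the header declares in DATA
 *
 * Returns 1 if the call may proceed, 0 if the program should be
 * terminated because target lies above the marker (in the DATA or
 * STACK segment, or outside the process memory space).
 */
int call_target_allowed(uint32_t target, uint32_t data_marker)
{
    return target <= data_marker;
}
```

Assuming a marker at 0x08049f00, a target of 0x08048500 (in TEXT) is allowed,
while 0x0804a010 (further into DATA) and 0xbffff7e0 (in STACK) are refused.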

6.  Where to get more detailed info
This technical info file is primarily based on the document "Smashing the stack
for fun and profit", Phrack 49 File 14, by Aleph One, which describes the
"stack smashing" technique in detail. I think it can be found on BugTraq
(http://www.geek-girl.com/bugtraq).
Other useful info about this topic can be found on the Immunix Stack Guard site
(http://www.cse.ogi.edu/DISC/projects/immunix/StackGuard/). Stack Guard is a
GCC patch that, like Stack Shield, protects programs for Intel 386 class
processors and Linux against the "stack smashing" technique, but using a
different protection method, the canary protection method (which could be a bit
faster and has no nested call limitation but is less secure, especially when
using the "terminator" canary).

I apologize for any errors in this document and for my English. Please report
any errors to vendicator@usa.net
