GNU C Compiler Internals/Return Address Defense 4 1

Return Address Defense

A prerequisite to understanding the proposed extension is the following paper:

Tzi-Cker Chiueh, Fu-Hau Hsu,"RAD: A Compile-time Solution to Buffer Overflow Attacks," 21st IEEE International Conference on Distributed Computing Systems (ICDCS), Phoenix, Arizona, USA, April 2001.

http://www.ecsl.cs.sunysb.edu/tr/TR96.pdf

The paper proposed to instrument function prolog and epilog. The return address is saved in a well-protected buffer so that overwriting stack frame does not hurt it. The copy of the address is cross-checked with the one on the stack to detect a possible attack.

Implementing the proposed method has a number of difficulties. Different architectures have different stack frame layout. For example, the x86 prolog has the code to save clobbered registers which is generated in the machine-specific component of GCC. Therefore, instrumenting prolog at the RTL level will not take into account those registers because their number is unknown at that time. There are two approaches to this problem. The first is to instrument prolog at the machine level which is not portable across different architectures. The second approach is to instrument the call-site instead.

The idea of the proposed implementation is to find the approximation of the return address using the address of a code label immediately following the call instruction. It is also possible to find the address on the stack where the return address is written using the number of actual arguments that are pushed on the stack. In case of x86 frame layout, the space is reserved at the frame creation time, therefore the stack pointer not modified when a function is called. The return address approximation and the stack pointer are saved in a well-protected buffer, similarly to the original RAD approach. The following code snippet illustrates the idea:

func1() {
     ...
     jmp L2
L1:  // the return address is on stack
     push $esp // we also need stack pointer
     rad_prolog() // save the two arguments in a safe buffer
     add $esp, 8 // make sure no traces left
     func2(); // the original call-site
     jmp L3
L2:  call L1
L3:  ....

When the instrumented code is executed, the control flow is directed to a label after the original call instruction. At that point, a call instruction saves the current EIP value and directs the control flow back to label L1. Therefore, the return address approximation argument is already on stack. We also need to push the ESP as the second argument. Function rad_prolog adds the pair to a safe buffer. The arguments are popped and the original function is called. The added call instruction is jumped over upon return.

We also need to instrument function epilog. GIMPLE level works in this case because there is a TRY_FINALLY_EXPR code. The most recent pair of return address and its location in the well-protected buffer are used to validate the return address on the stack. An alarm is generated in case the two values mismatch.

This extension does not use any machine-specific information. Therefore, it is portable across different architectures, even though we were unable to test it under anything except x86.