x86 Disassembly/The Stack

From Wikibooks, open books for an open world

The Stack

[Figure: Data stack]

Generally speaking, a stack is a data structure that stores data values contiguously in memory. Unlike an array, however, you access (read or write) data only at the "top" of the stack. Reading from the stack is called "popping", and writing to the stack is called "pushing". A stack is also known as a LIFO (Last In, First Out) structure, since values are popped from the stack in the reverse of the order in which they were pushed onto it (think of how you pile up plates on a table). Popped data is removed from the stack.
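
The LIFO behaviour described above can be tried out with a short sketch. It uses a Python list as the stack, which is only an illustration of the abstract data structure and not of how the x86 hardware stack is implemented; the names stack and popped are chosen for this example.

```python
# A Python list used as a LIFO stack: append() pushes, pop() pops.
stack = []
for value in (1, 2, 3):
    stack.append(value)                     # push 1, then 2, then 3

popped = [stack.pop() for _ in range(3)]    # pop three times
print(popped)                               # [3, 2, 1]
```

The values come back in the reverse of the order in which they were pushed, and the stack is empty afterwards.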

All x86 architectures use a stack as a temporary storage area in RAM that allows the processor to quickly store and retrieve data in memory. The current top of the stack is pointed to by the esp register. The stack "grows" downward, from high to low memory addresses, so the most recently pushed value is located at the address held in esp, and values pushed earlier are located at higher addresses. No register specifically points to the bottom of the stack, although most operating systems monitor the stack bounds to detect both "underflow" (popping an empty stack) and "overflow" (pushing too much information onto the stack) conditions.

When a value is popped off the stack, the value remains sitting in memory until overwritten. However, you should never rely on the content of memory addresses below esp, because later pushes and other functions may overwrite these values without your knowledge.
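
The downward growth of the stack, and the way a popped value lingers below esp, can be illustrated with a minimal model. This is a sketch under stated assumptions, not real x86 behaviour in every detail: the 64-byte memory, the helper functions push and pop, and the variable names are all hypothetical and exist only for this example.

```python
import struct

MEM_SIZE = 64                    # hypothetical tiny address space
memory = bytearray(MEM_SIZE)
esp = MEM_SIZE                   # empty stack: esp starts at the highest address

def push(value):
    """Model of push: move esp down 4 bytes, then store a DWORD there."""
    global esp
    esp -= 4
    struct.pack_into("<I", memory, esp, value)

def pop():
    """Model of pop: load the DWORD at esp, then move esp up 4 bytes."""
    global esp
    (value,) = struct.unpack_from("<I", memory, esp)
    esp += 4
    return value

push(0x11111111)                 # esp: 64 -> 60
push(0x22222222)                 # esp: 60 -> 56
top_address = esp                # address of the most recently pushed value
value = pop()                    # returns 0x22222222; esp is back at 60

# The popped DWORD is still in memory, just below esp ...
(stale,) = struct.unpack_from("<I", memory, top_address)
# ... until the next push reuses the same slot.
push(0x33333333)
(reused,) = struct.unpack_from("<I", memory, top_address)
```

In this model, stale still holds 0x22222222 after the pop, while reused holds 0x33333333 once the next push has overwritten the slot.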

Users of Windows ME, 98, 95, 3.1 (and earlier) may fondly remember the infamous "Blue Screen of Death", which was sometimes caused by a stack overflow exception. A stack overflow occurs when too much data is written to the stack, and the stack "grows" beyond its limits. Modern operating systems use better bounds-checking and error recovery to reduce the occurrence of stack overflows, and to maintain system stability after one has occurred.

Push and Pop

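The following instruction sequences are basically equivalent (one difference is that sub and add modify the flags register, while push and pop do not):

push eax

is equivalent to

sub esp, 4
mov DWORD PTR SS:[esp], eax

and

pop eax

is equivalent to

mov eax, DWORD PTR SS:[esp]
add esp, 4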

The single instruction is more compact than the two-instruction sequence and is generally at least as fast. In the figures below, the stack grows from right to left, and esp decreases as the stack grows in size.

[Figures: Push (ReverseEngineeringPush.JPG) and Pop (ReverseEngineeringPop.JPG)]

ESP In Action

Let's say we want to quickly discard 3 items we pushed earlier onto the stack, without saving the values (in other words "clean" the stack). The following works:

pop eax
pop eax
pop eax

However, there is a faster method that also leaves eax untouched. We can simply perform some basic arithmetic on esp to move the pointer "above" the data items (to a higher address). The items are then no longer part of the stack, and they will be overwritten by the next round of push instructions.

add esp, 12  ; 12 is 3 DWORDs (4 bytes * 3)
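
The arithmetic can be checked with a small sketch that tracks only the value of esp. The starting address 0x1000 and the variable names are assumptions made for this example.

```python
esp = 0x1000 - 12            # hypothetical esp after three DWORD pushes from 0x1000

# Three pops: each one moves esp up by 4 bytes (the loaded values are ignored here).
esp_after_pops = esp
for _ in range(3):
    esp_after_pops += 4

# One addition, modelling: add esp, 12
esp_after_add = esp + 3 * 4

print(hex(esp_after_pops), hex(esp_after_add))   # 0x1000 0x1000
```

Both approaches leave esp at the same address; the single add simply gets there in one step and does not load anything into a register.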

Likewise, if we want to reserve room on the stack for an item bigger than a DWORD, we can subtract from esp to move it downward past the space we need. We can then access our reserved memory directly through a pointer, or indirectly as an offset from esp itself.

Say we wanted to create an array of byte values on the stack, 100 items long. We want to store the pointer to the base of this array in edi. How do we do it? Here is an example:

sub esp, 100  ; num of bytes in our array
mov edi, esp  ; copy address of 100 bytes area to edi

To destroy that array, we simply write the instruction:

add esp, 100
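
The reserve, use, and release steps can be modelled as follows. This is a rough sketch rather than real machine behaviour: the 256-byte memory, the assumption that the stack starts out empty, and the names saved_esp, first, and last are all hypothetical.

```python
MEM_SIZE = 256                      # hypothetical tiny address space
memory = bytearray(MEM_SIZE)
esp = MEM_SIZE                      # assume an empty stack to start with
saved_esp = esp

esp -= 100                          # sub esp, 100
edi = esp                           # mov edi, esp

# Fill the array through edi: element i lives at address edi + i.
for i in range(100):
    memory[edi + i] = i

first, last = memory[edi], memory[edi + 99]

esp += 100                          # add esp, 100 (the array is released)
```

Because the stack grows downward, edi points at the lowest address of the reserved area, and the array elements are reached with positive offsets from it.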

Reading Without Popping

To read values on the stack without popping them off the stack, esp can be used with an offset. For instance, to read the 3 DWORD values nearest the top of the stack into eax, one after another (but without using a pop instruction), we would use the instructions:

mov eax, DWORD PTR SS:[esp]
mov eax, DWORD PTR SS:[esp + 4]
mov eax, DWORD PTR SS:[esp + 8]

Remember, since esp moves downward as the stack grows, data on the stack can be accessed with a positive offset. A negative offset should never be used, because data beyond the top of the stack (at addresses lower than esp) cannot be counted on to stay the way you left it. The operation of reading from the stack without popping is often referred to as "peeking", but since this isn't the official term for it, this wikibook won't use it.
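
Reading at positive offsets from esp can be sketched with the same kind of toy model. The 32-byte memory, the pushed values, and the helper read_dword are assumptions made for this example.

```python
import struct

memory = bytearray(32)              # hypothetical tiny address space
esp = 32
for value in (10, 20, 30):          # push 10, then 20, then 30
    esp -= 4
    struct.pack_into("<I", memory, esp, value)

def read_dword(offset):
    """Model of: mov eax, DWORD PTR SS:[esp + offset]"""
    return struct.unpack_from("<I", memory, esp + offset)[0]

values_read = [read_dword(0), read_dword(4), read_dword(8)]
print(values_read, esp)             # [30, 20, 10] 20
```

The most recently pushed value is found at offset 0, older values at larger offsets, and esp is unchanged by the reads.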

Data Allocation

There are two areas in the computer memory where a program can store data as it runs. The first, the one that we have been talking about, is the stack. It is a linear LIFO buffer that allows fast allocations and deallocations, but has a limited size. The second is the heap, a typically non-linear data storage area that is usually implemented using linked lists, binary trees, or other more exotic methods. Heaps are slightly more difficult to interface with and to maintain than a stack, and allocations and deallocations are performed more slowly. However, heaps can grow as the data grows, and new heaps can be allocated when data quantities become too large.

As we shall see, explicitly declared local variables are typically allocated on the stack. Stack variables are finite in number, and have a definite size. Heap variables can vary in number and in size. We will discuss these topics in more detail later.