x86 Disassembly/Branches
Branching
[edit | edit source]Computer science professors tell their students to avoid jumps and goto instructions, to avoid the proverbial "spaghetti code." Unfortunately, assembly only has jump instructions to control program flow. This chapter will explore the subject that many people avoid like the plague, and will attempt to show how the spaghetti of assembly can be translated into the more familiar control structures of high-level language. Specifically, this chapter will focus on If-Then-Else and Switch branching instructions.
If-Then
[edit | edit source]Let's consider a generic if statement in pseudo-code followed by its equivalent form using jumps:
if (condition) then do_action; if not (condition) then goto end; do_action; end: |
What does this code do? In English, the code checks the condition and performs a jump only if it is false. With that in mind, let's compare some actual C code and its Assembly translation:
if(x == 0)
{
x = 1;
}
x++;
|
mov eax, $x
cmp eax, 0
jne end
mov eax, 1
end:
inc eax
mov $x, eax
|
Note that when we translate to assembly, we need to negate the condition of the jump because--like we said above--we only jump if the condition is false. To recreate the high-level code, simply negate the condition once again.
Negating a comparison may be tricky if you're not paying attention. Here are the correct dual forms:
Instruction | Meaning |
---|---|
JNE | Jump if not equal |
JE | Jump if equal |
JG | Jump if greater |
JLE | Jump if less than or equal |
JL | Jump if less than |
JGE | Jump if greater or equal |
And here are some examples.
mov eax, $x //move x into eax
cmp eax, $y //compare eax with y
jg end //jump if greater than
inc eax
mov $x, eax //increment x
end:
...
Is produced by these C statements:
if(x <= y)
{
x++;
}
As you can see, x is incremented only if it is less than or equal to y. Thus, if it is greater than y, it will not be incremented as in the assembler code. Similarly, the C code
if(x < y)
{
x++;
}
produces this assembler code:
mov eax, $x //move x into eax
cmp eax, $y //compare eax with y
jge end //jump if greater than or equal to
inc eax
mov $x, eax //increment x
end:
...
X is incremented in the C code only if it is less than y, so the assembler code now jumps if it's greater than or equal to y. This kind of thing takes practice, so we will try to include lots of examples in this section.
If-Then-Else
[edit | edit source]Let us now look at a more complicated case: the If-Then-Else instruction.
if (condition) then do_action else do_alternative_action; if not (condition) goto else; do_action; goto end; else: do_alternative_action; end: |
Now, what happens here? Like before, the if statement only jumps to the else clause when the condition is false. However, we must also install an unconditional jump at the end of the "then" clause, so we don't perform the else clause directly afterwards.
Now, here is an example of a real C If-Then-Else:
if(x == 10)
{
x = 0;
}
else
{
x++;
}
Which gets translated into the following assembly code:
mov eax, $x
cmp eax, 0x0A ;0x0A = 10
jne else
mov eax, 0
jmp end
else:
inc eax
end:
mov $x, eax
As you can see, the addition of a single unconditional jump can add an entire extra option to our conditional.
Switch-Case
[edit | edit source]Switch-Case structures can be very complicated when viewed in assembly language, so we will examine a few examples. First, keep in mind that in C, there are several keywords that are commonly used in a switch statement. Here is a recap:
- Switch
- This keyword tests the argument, and starts the switch structure
- Case
- This creates a label that execution will switch to, depending on the value of the argument.
- Break
- This statement jumps to the end of the switch block
- Default
- This is the label that execution jumps to if and only if it doesn't match up to any other conditions
Lets say we have a general switch statement, but with an extra label at the end, as such:
switch (x)
{
//body of switch statement
}
end_of_switch:
Now, every break statement will be immediately replaced with the statement
jmp end_of_switch
But what do the rest of the statements get changed to? The case statements can each resolve to any number of arbitrary integer values. How do we test for that? The answer is that we use a "Switch Table". Here is a simple, C example:
int main(int argc, char **argv)
{ //line 10
switch(argc)
{
case 1:
MyFunction(1);
break;
case 2:
MyFunction(2);
break;
case 3:
MyFunction(3);
break;
case 4:
MyFunction(4);
break;
default:
MyFunction(5);
}
return 0;
}
And when we compile this with cl.exe, we can generate the following listing file:
tv64 = -4 ; size = 4
_argc$ = 8 ; size = 4
_argv$ = 12 ; size = 4
_main PROC NEAR
; Line 10
push ebp
mov ebp, esp
push ecx
; Line 11
mov eax, DWORD PTR _argc$[ebp]
mov DWORD PTR tv64[ebp], eax
mov ecx, DWORD PTR tv64[ebp]
sub ecx, 1
mov DWORD PTR tv64[ebp], ecx
cmp DWORD PTR tv64[ebp], 3
ja SHORT $L810
mov edx, DWORD PTR tv64[ebp]
jmp DWORD PTR $L818[edx*4]
$L806:
; Line 14
push 1
call _MyFunction
add esp, 4
; Line 15
jmp SHORT $L803
$L807:
; Line 17
push 2
call _MyFunction
add esp, 4
; Line 18
jmp SHORT $L803
$L808:
; Line 19
push 3
call _MyFunction
add esp, 4
; Line 20
jmp SHORT $L803
$L809:
; Line 22
push 4
call _MyFunction
add esp, 4
; Line 23
jmp SHORT $L803
$L810:
; Line 25
push 5
call _MyFunction
add esp, 4
$L803:
; Line 27
xor eax, eax
; Line 28
mov esp, ebp
pop ebp
ret 0
$L818:
DD $L806
DD $L807
DD $L808
DD $L809
_main ENDP
Lets work our way through this. First, we see that line 10 sets up our standard stack frame, and it also saves ecx. Why does it save ecx? Scanning through the function, we never see a corresponding "pop ecx" instruction, so it seems that the value is never restored at all. In fact, the compiler isn't saving ecx at all, but is instead simply reserving space on the stack: it's creating a local variable. The original C code didn't have any local variables, however, so perhaps the compiler just needed some extra scratch space to store intermediate values. Why doesn't the compiler execute the more familiar "sub esp, 4" command to create the local variable? push ecx is just a faster instruction that does the same thing. This "scratch space" is being referenced by a negative offset from ebp. tv64 was defined in the beginning of the listing as having the value -4, so every call to "tv64[ebp]" is a call to this scratch space.
There are a few things that we need to notice about the function in general:
- Label $L803 is the end_of_switch label. Therefore, every "jmp SHORT $L803" statement is a break. This is verifiable by comparing with the C code line-by-line.
- Label $L818 contains a list of hard-coded memory addresses, which here are labels in the code section! Remember, labels resolve to the memory address of the instruction. This must be an important part of our puzzle.
To solve this puzzle, we will take an in-depth look at line 11:
mov eax, DWORD PTR _argc$[ebp]
mov DWORD PTR tv64[ebp], eax
mov ecx, DWORD PTR tv64[ebp]
sub ecx, 1
mov DWORD PTR tv64[ebp], ecx
cmp DWORD PTR tv64[ebp], 3
ja SHORT $L810
mov edx, DWORD PTR tv64[ebp]
jmp DWORD PTR $L818[edx*4]
This sequence performs the following pseudo-C operation:
if( argc - 1 >= 4 ) { goto $L810; /* the default */ } label *L818[] = { $L806, $L807, $L808, $L809 }; /* define a table of jumps, one per each case */ // goto L818[argc - 1]; /* use the address from the table to jump to the correct case */
Here's why...
The Setup
[edit | edit source] mov eax, DWORD PTR _argc$[ebp]
mov DWORD PTR tv64[ebp], eax
mov ecx, DWORD PTR tv64[ebp]
sub ecx, 1
mov DWORD PTR tv64[ebp], ecx
The value of argc is moved into eax. The value of eax is then immediately moved to the scratch space. The value of the scratch space is then moved into ecx. Sounds like an awfully convoluted way to get the same value into so many different locations, but remember: I turned off the optimizations. The value of ecx is then decremented by 1. Why didn't the compiler use a dec instruction instead? Perhaps the statement is a general statement, that in this case just happens to have an argument of 1. We don't know why exactly, all we know is this:
- eax = "scratch pad"
- ecx = eax - 1
Finally, the last line moves the new, decremented value of ecx back into the scratch pad. Very inefficient.
The Compare and Jumps
[edit | edit source] cmp DWORD PTR tv64[ebp], 3
ja SHORT $L810
The value of the scratch pad is compared with the value 3, and if the unsigned value is above 3 (4 or more), execution jumps to label $L810. How do I know the value is unsigned? I know because ja is an unsigned conditional jump. Let's look back at the original C code switch:
switch(argc)
{
case 1:
MyFunction(1);
break;
case 2:
MyFunction(2);
break;
case 3:
MyFunction(3);
break;
case 4:
MyFunction(4);
break;
default:
MyFunction(5);
}
Remember, the scratch pad contains the value (argc - 1), which means that this condition is only triggered when argc > 4. What happens when argc is greater than 4? The function goes to the default condition. Now, let's look at the next two lines:
mov edx, DWORD PTR tv64[ebp]
jmp DWORD PTR $L818[edx*4]
edx gets the value of the scratch pad (argc - 1), and then there is a very weird jump that takes place: execution jumps to a location pointed to by the value (edx * 4 + $L818). What is $L818? We will examine that right now.
The Switch Table
[edit | edit source] $L818:
DD $L806
DD $L807
DD $L808
DD $L809
$L818 is a pointer, in the code section, to a list of other code section pointers. These pointers are all 32bit values (DD is a DWORD). Let's look back at our jump statement:
jmp DWORD PTR $L818[edx*4]
In this jump, $L818 isn't the offset, it's the base, edx*4 is the offset. As we said earlier, edx contains the value of (argc - 1). If argc == 1, we jump to [$L818 + 0] which is $L806. If argc == 2, we jump to [$L818 + 4], which is $L807. Get the picture? A quick look at labels $L806, $L807, $L808, and $L809 shows us exactly what we expect to see: the bodies of the case statements from the original C code, above. Each one of the case statements calls the function "MyFunction", then breaks, and then jumps to the end of the switch block.
Ternary Operator ?:
[edit | edit source]Again, the best way to learn is by doing. Therefore we will go through a mini example to explain the ternary operator. Consider the following C code program:
int main(int argc, char **argv)
{
return (argc > 1)?(5):(0);
}
cl.exe produces the following assembly listing file:
_argc$ = 8 ; size = 4
_argv$ = 12 ; size = 4
_main PROC NEAR
; File c:\documents and settings\andrew\desktop\test2.c
; Line 2
push ebp
mov ebp, esp
; Line 3
xor eax, eax
cmp DWORD PTR _argc$[ebp], 1
setle al
dec eax
and eax, 5
; Line 4
pop ebp
ret 0
_main ENDP
Line 2 sets up a stack frame, and line 4 is a standard exit sequence. There are no local variables. It is clear that Line 3 is where we want to look.
The instruction "xor eax, eax" simply sets eax to 0. For more information on that line, see the chapter on unintuitive instructions. The cmp instruction tests the condition of the ternary operator. The setle function is one of a set of x86 functions that works like a conditional move: al gets the value 1 if argc <= 1. Isn't that the exact opposite of what we wanted? In this case, it is. Let's look at what happens when argc = 0: al gets the value 1. al is decremented (al = 0), and then eax is logically anded with 5. 5 & 0 = 0. When argc == 2 (greater than 1), the setle instruction doesn't do anything, and eax still is zero. eax is then decremented, which means that eax == -1. What is -1?
In x86 processors, negative numbers are stored in two's-complement format. For instance, let's look at the following C code:
BYTE x;
x = -1;
At the end of this C code, x will have the value 11111111: all ones!
When argc is greater than 1, setle sets al to zero. Decrementing this value sets every bit in eax to a logical 1. Now, when we perform the logical and function we get:
...11111111 &...00000101 ;101 is 5 in binary ------------ ...00000101
eax gets the value 5. In this case, it's a roundabout method of doing it, but as a reverser, this is the stuff you need to worry about.
For reference, here is the GCC assembly output of the same ternary operator from above:
_main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
xorl %eax, %eax
andl $-16, %esp
call __alloca
call ___main
xorl %edx, %edx
cmpl $2, 8(%ebp)
setge %dl
leal (%edx,%edx,4), %eax
leave
ret
Notice that GCC produces slightly different code than cl.exe produces. However, the stack frame is set up the same way. Notice also that GCC doesn't give us line numbers, or other hints in the code. The ternary operator line occurs after the instruction "call __main". Let's highlight that section here:
xorl %edx, %edx
cmpl $2, 8(%ebp)
setge %dl
leal (%edx,%edx,4), %eax
Again, xor is used to set edx to 0 quickly. Argc is tested against 2 (instead of 1), and dl is set if argc is greater than or equal. If dl gets set to 1, the leal instruction directly thereafter will move the value of 5 into eax (because lea (edx,edx,4) means edx + edx * 4, i.e. edx * 5).