x86 Assembly/Interfacing with Linux

System calls

System calls are the interface between user programs and the Linux kernel. They are used to let the kernel perform various system tasks, such as file access, process management and networking. In the C programming language, you would normally call a wrapper function which executes all required steps or even use high-level features such as the standard IO library.

On Linux, there are several ways to make a system call. This page focuses on two of them: raising the software interrupt 0x80 with int $0x80, and executing the dedicated syscall instruction. Both are easy and intuitive ways of making system calls in assembly-only programs.

Making a system call

To make a system call, you pass all required information to the kernel by copying it into general-purpose registers (GPRs).

Each system call has a fixed number. Linux persistently guarantees backward compatibility, thus once a number was assigned to a system call it will never change. Ever.

Warning: The numbers differ between int $0x80 and syscall!

You specify the system call by writing the number into the eax/rax register.

Most system calls take parameters to perform their task. Those parameters are passed by writing them into the appropriate registers before making the actual call. Each parameter position has a specific register; see the tables in the subsections below, as the mapping differs between int $0x80 and syscall. Parameters are passed in the order they appear in the function signature of the corresponding C wrapper function. You can find the system call functions and their signatures in the Linux documentation, for example in section 2 of the reference manual (type man 2 open to see the signature of the open system call).

After everything is set up correctly, you execute int $0x80 or syscall and the kernel performs the task.

The return / error value of a system call is written to eax/rax.

The kernel uses its own stack to perform the actions. The user stack is not touched in any way.

Via interrupt

On both Linux x86 and Linux x86_64 systems you can make a system call by raising interrupt $0x80 with the int instruction. Parameters are passed by setting the general-purpose registers as follows:

Register mapping for system call invocation using int $0x80

purpose              register
system call number   eax
1st parameter        ebx
2nd parameter        ecx
3rd parameter        edx
4th parameter        esi
5th parameter        edi
6th parameter        ebp
result               eax

The system call numbers are listed in the generated Linux kernel header $build/arch/x86/include/generated/uapi/asm/unistd_32.h or $build/usr/include/asm/unistd_32.h. The latter may also be installed on your Linux system; in that case, omit the $build prefix from the path.

All registers are preserved during a system call with int $0x80 except eax, where the return value is stored.

Via dedicated system call invocation instruction

The x86_64 architecture introduced a dedicated instruction to make a system call: syscall. It does not access the interrupt descriptor table and is faster. Parameters are passed by setting the GPRs as follows:

Register mapping for system call invocation using syscall

purpose              register
system call number   rax
1st parameter        rdi
2nd parameter        rsi
3rd parameter        rdx
4th parameter        r10
5th parameter        r8
6th parameter        r9
result               rax

The syscall numbers are listed in the generated Linux kernel header $build/usr/include/asm/unistd_64.h. This file may also be installed on your Linux system; in that case, omit the $build prefix from the path.

All registers, except rcx and r11 (and the return value, rax), are preserved during the system call with syscall.

Choice

To achieve maximum compatibility, Linux on 64-bit platforms clips the input and output of system calls made using the interrupt method. For instance, you can neither pass nor receive complete 64-bit pointers on an x86-64 platform using int $0x80, because the upper 32 bits of all arguments and of the result are zeroed. In practice this restriction rarely hurts, since syscall is generally preferred on x86-64 anyway: it is faster than an interrupt.

Library call

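When calling C library functions on x86-64 Linux, the first six integer or pointer parameters are passed in registers (the sixth in r9). Any further parameters are pushed onto the stack in reverse order, so that the seventh parameter ends up on top of the stack. Note that the fourth parameter is passed in rcx here, whereas the syscall instruction uses r10.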

Register mapping for library call

purpose              register
1st parameter        rdi
2nd parameter        rsi
3rd parameter        rdx
4th parameter        rcx
5th parameter        r8
6th parameter        r9

The caller can expect to find the return value of the subroutine in the register rax.

Examples

To summarize and clarify the information, let's have a look at a very simple example: the hello world program. It will write the text "Hello World" to stdout using the write syscall and quit the program using the _exit syscall.

Syscall signatures:

ssize_t write(int fd, const void *buf, size_t count);
void _exit(int status);

This is the C program which is implemented in assembly below:

#include <unistd.h>

int main(int argc, char *argv[])
{
    write(1, "Hello World\n", 12); /* write "Hello World" to stdout */
    _exit(0);                      /* exit with error code 0 (no error) */
}

Both examples start alike: a string stored in the data segment and _start as a global symbol.

.data
msg: .ascii "Hello World\n"

.text
.global _start

int $0x80

As defined in $build/usr/include/asm/unistd_32.h, the syscall numbers for write and _exit are:

#define __NR_exit 1
#define __NR_write 4

The parameters are passed exactly as one would in a C program, using the correct registers. After everything is set up, the syscall is made using int $0x80.

_start:
    movl $4, %eax   # use the `write` [interrupt-flavor] system call
    movl $1, %ebx   # write to stdout
    movl $msg, %ecx # use string "Hello World"
    movl $12, %edx  # write 12 characters
    int $0x80       # make system call

    movl $1, %eax   # use the `_exit` [interrupt-flavor] system call
    movl $0, %ebx   # error code 0
    int $0x80       # make system call

syscall

In $build/usr/include/asm/unistd_64.h, the syscall numbers are defined as follows:

#define __NR_write 1
#define __NR_exit 60

Parameters are passed just like in the int $0x80 example, except that different registers are used and the system call numbers differ. The system call is made using the syscall instruction.

_start:
    movq $1, %rax   # use the `write` [fast] syscall
    movq $1, %rdi   # write to stdout
    movq $msg, %rsi # use string "Hello World"
    movq $12, %rdx  # write 12 characters
    syscall         # make syscall

    movq $60, %rax  # use the `_exit` [fast] syscall
    movq $0, %rdi   # error code 0
    syscall         # make syscall

Library call

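Here is the C prototype of an example library function, XCreateWindow from Xlib:

Window XCreateWindow(Display *display, Window parent, int x, int y,
                     unsigned int width, unsigned int height,
                     unsigned int border_width, int depth,
                     unsigned int class, Visual *visual,
                     unsigned long valuemask,
                     XSetWindowAttributes *attributes);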

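The first six parameters are passed in the registers rdi, rsi, rdx, rcx, r8 and r9, in that order. The remaining six parameters are pushed onto the stack.

The library function is declared as external at the beginning of the source file, and the library itself is named when linking. The code below uses NASM (Intel) syntax, unlike the GNU assembler examples above.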

extern XCreateWindow
    mov rdi, [xserver_pdisplay]     ; ARG 1
    mov rsi, [xwin_parent]          ; ARG 2
    mov rdx, [xwin_x]               ; ARG 3
    mov rcx, [xwin_y]               ; ARG 4
    mov r8, [xwin_width]            ; ARG 5
    mov r9, [xwin_height]           ; ARG 6
    mov rax, attributes             ; address of the attributes structure
    push rax                        ; ARG 12
    sub rax, rax                    ; clear rax before the 32-bit load
    mov eax, [xwin_valuemask]
    push rax                        ; ARG 11
    mov rax, [xwin_visual]
    push rax                        ; ARG 10
    mov rax, [xwin_class]
    push rax                        ; ARG 9
    mov rax, [xwin_depth]
    push rax                        ; ARG 8
    mov rax, [xwin_border_width]
    push rax                        ; ARG 7
    call XCreateWindow
    add rsp, 48                     ; caller removes the six stack arguments (6 * 8 bytes)
    mov [xwin_window], rax          ; return value: the new window

Note that the last parameters of the function, those passed on the stack, are pushed in reverse order.