Operating System Fundamentals

From Wikibooks, open books for an open world
Jump to navigation Jump to search


What is an Operating System[edit]

The Unix Shell[edit]

The Unix shell is a command interpreter program that serves as the primary interface between uses and the OS in a command line environment, e.g. a terminal or terminal simulator. A shell is an essential (often preferred) tool, but it is just a ordinary user program that uses system calls to get most of the work done - so its just a "shell".

Functions of a Shell[edit]

A shell presents its users a command prompt (e.g. $), waits for a command, and executes the command. executes user commands The primary duty of a shell

Creating a Shell[edit]

The overall structure of a shell can be:

 repeat forever
   read one line 
   parse the command into a list of arguments
   if the line starts with a command name (e.g. cd and exit)
   then 
     perform the function (if it's exit, break out of the loop)
   else (it invokes a program, e.g. ls and cat) 
     execute the program 

To read a command we read one line at a time and tokenize the line into tokens.

To execute a program the following needs to take place: Use fork() system call to duplicate the current process:

use fork() to clone a process[edit]

/* modified from fork.c in Advanced Linux Programming (page 49) */
/* http://www.makelinux.net/alp/024 */

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(){
  pid_t child_pid;
  
  printf("the main program process ID is %d\n", (int) getpid());

  child_pid = fork();
  if(child_pid != 0){
    printf("this is the parent process, with id %d\n", (int)getpid());
    printf("child_pid=%d\n", child_pid);
  }else{
    printf("this is the child  process, with id %d\n", (int)getpid());
    printf("child_pid=%d\n", child_pid);
  }
}

This example shows how to create a new process by forking the current process. Note that the fork() function call (system call) is invoked once but return twice because when the call finishes there are two processes executing the same code.

use execvp() to run a new program in the background[edit]

/* from Advanced Linux Programming (page 51) */
/* http://www.makelinux.net/alp/024 */
#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <unistd.h> 

/* Spawn a child process running a new program. PROGRAM is the name 
   of the program to run; the path will be searched for this program. 
   ARG_LIST is a NULL-terminated list of character strings to be 
   passed as the program's argument list. Returns  the process ID of 
   the spawned process.  */ 

int spawn (char* program, char** arg_list) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork (); 

  if (child_pid != 0){
    /* This is the parent process. */ 
    return child_pid; 
  }else {
    /* Now execute PROGRAM, searching for it in the path.  */ 
    execvp (program,  arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf (stderr,  "an error occurred in execvp\n"); 
    abort (); 
  } 
} 

int main () {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. Ignore the 
     returned child process ID.  */ 
  spawn ("ls", arg_list); 
  printf ("done with main program\n"); 
  return 0; 
}

use execvp() to run a new program in the foreground[edit]

/* from Advanced Linux Programming (page 51) */
#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <sys/wait.h>
#include <unistd.h> 

/* Spawn a child process running a new program. PROGRAM is the name 
   of the program to run; the path will be searched for this program. 
   ARG_LIST is a NULL-terminated list of character strings to be 
   passed as the program's argument list. Returns  the process ID of 
   the spawned process.  */ 

int spawn (char* program, char** arg_list) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork (); 

  if (child_pid != 0){
    /* This is the parent process. */ 
    return child_pid; 
  }else {
    /* Now execute PROGRAM, searching for it in the path.  */ 
    execvp (program,  arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf (stderr,  "an error occurred in execvp\n"); 
    abort (); 
  } 
} 

int main () {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. Ignore the 
     returned child process ID.  */ 
  pid_t pid = spawn ("ls", arg_list); 

  /* Wait for the child process to complete.  */ 
  int child_status;
  waitpid (pid, &child_status, 0); 
  
  if (WIFEXITED (child_status)){
    printf ("the child process exited normally, with exit code %d\n", 
	     WEXITSTATUS (child_status)); 
  }else{ 
    printf ("the child process exited abnormally\n"); 
  }
  
  return  0; 
}

Note that the argument list must contain the program name as the first argument and MUST end with a NULL, which indicates the end of the list. Otherwise, execvp() won't function correctly.

use dup2() to redirect standard output[edit]

#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
 
int main(){
  int overwrite = 0;
  int fd;
  if(overwrite){
    printf("open test.txt to overwrite.\n");
    fd =  open("test.txt", O_WRONLY | O_CREAT, S_IRWXU);
  }else{
    printf("open test.txt to append.\n");
    fd =  open("test.txt", O_WRONLY | O_APPEND | O_CREAT, S_IRWXU);
  }
  dup2 (fd, STDOUT_FILENO);
  printf("hello world!");
}

This example opens a file ("test.txt") for write and makes the standard output synonymous to the open file - bytes sent to the standard output will go to the file. The "O_WRONLY | O_CREAT" flags cause the file to be opened for write and the file to be created if it doesn't already exist.

Information about each open file is recorded in a table managed by the OS. Each entry corresponds to an open file and the index of each entry is an integer (file descriptor), which is returned to the process opening the file as the return value of the open() system call. The entries for standard input, standard output, and standard error are reserved with predefined indices: STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO (defined in <unistd.h>). The dup2(index1, index2) function will copy the entry content at index1 to the entry at index2, which make the file descriptor index2 synonymous to the file descriptor index1.

redirect standard output in a child process[edit]

The "dup2 (fd, STDOUT_FILENO)" line (in the previous example) redirects the standard output of the current (main) process to the open file. If you want to redirect the standard output of a child process to a file, you will need to wait till the child process is created - after the fork() function call. The following example demonstrate the idea. You will see that the "hello" message from the parent process still goes to the standard output, but the standard output from the child process gets redirected to the file.

#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <sys/wait.h>
#include <unistd.h> 
#include <sys/stat.h>
#include <fcntl.h>

int main () {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "/", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 

  /* Spawn a child process running the "ls" command. */
  pid_t child_pid = fork (); 

  if (child_pid == 0){
    /* This is the child process. */ 
    char * filename = "test.txt";
    int outfile = open(filename, O_CREAT | O_WRONLY, S_IRWXU);
    if (outfile == -1){
      fprintf(stderr, "Error: failed to create file %s\n", filename);
    }else{
      /* redirect the standard output from this process to the file. */
      if(dup2(outfile, STDOUT_FILENO) != STDOUT_FILENO){
        fprintf(stderr, "Error: failed to redirect standard output\n");
      }

      /* Now execute PROGRAM, searching for it in the path.  */ 
      execvp (arg_list[0],  arg_list); 
      /* The execvp  function returns only if an error occurs.  */ 
      fprintf (stderr,  "an error occurred in execvp\n"); 
      abort (); 
    }
  }
  /* only the parent process executes the following code. */
  fprintf(stdout, "Hello from the parent process.\n");

  /* Wait for the child process to complete.  */ 
  int child_status;
  waitpid (child_pid, &child_status, 0); 
  
  if (WIFEXITED (child_status)){
    printf ("the child process exited normally, with exit code %d\n", 
	     WEXITSTATUS (child_status)); 
  }else{ 
    printf ("the child process exited abnormally\n"); 
  }
  
  return  0; 
}

use pipe() and dup2() to create pipes[edit]

/* from Advanced Linux Programming (page 113) */
/* http://www.makelinux.net/alp/038 */
#include <stdio.h> 
#include <sys/types.h> 
#include <sys/wait.h> 
#include <unistd.h> 

int main () {
  int fds[2]; 
  pid_t pid; 

  /* Create a pipe. File descriptors for the two ends of the pipe are 
     placed in fds.  */ 
  pipe (fds); 

  printf("fds[0]=%d, fds[1]=%d\n", fds[0], fds[1]);

  /* Fork a child process.  */ 
  pid = fork (); 

  if (pid == (pid_t) 0) {
    /* This is the child process. Close our copy of the write end of 
       the file descriptor.  */ 
    close (fds[1]); 

    /* Connect the read end of the pipe to standard input.  */ 
    dup2 (fds[0], STDIN_FILENO); 

    /* Replace the child process with the "sort" program.  */ 
    execlp ("sort", "sort", NULL); 
  } else {
    /* This is the parent process.  */ 
    FILE* stream; 

    /* Close our copy of the read end of the file descriptor.  */ 
    close (fds[0]); 

    /* Connect the write end of the pipe to standard out, and write 
       to it.  */ 
    dup2(fds[1], STDOUT_FILENO);
    printf ("This is a test.\n"); 
    printf ("Hello, world.\n"); 
    printf ("My dog has fleas.\n"); 
    printf ("This program is great.\n"); 
    printf ("One fish, two fish.\n"); 
    fflush(stdout);
    close(fds[1]);
    close(STDOUT_FILENO);

    /* Wait for the child process to finish.  */ 
    waitpid (pid, NULL, 0); 
    printf("parent process terminated.\n");
  } 
  return 0; 
}

An example pipe connecting two commands[edit]

#include <stdio.h> 
#include <stdlib.h> 
#include <sys/types.h> 
#include <unistd.h> 

int run_command(char** arg_list, int rd, int wd) {
  pid_t child_pid; 

  /* Duplicate this process. */ 
  child_pid = fork(); 

  if (child_pid != 0){
    /* This is the parent process. */ 
    return child_pid; 
  }else {
    if (rd != STDIN_FILENO){
      if(dup2(rd, STDIN_FILENO) != STDIN_FILENO){
	fprintf(stderr, "Error: failed to redirect standard input\n");
        return -1;
      }
    }

    if (wd != STDOUT_FILENO){
      printf("redirect stdout to %d.", wd);
      if(dup2(wd, STDOUT_FILENO) != STDOUT_FILENO){
        fprintf(stderr, "Error: failed to redirect standard output\n");
        return -1;
      }
    }
    /* Now execute PROGRAM, searching for it in the path.  */ 
    execvp (arg_list[0], arg_list); 
    /* The execvp  function returns only if an error occurs.  */ 
    fprintf (stderr,  "an error occurred in execvp\n"); 
    abort (); 
  } 
} 

int main () {
  /*  The argument list to pass to the "ls" command.  */ 
  char* arg_list[] = {
    "ls",     /* argv[0], the name of the program.  */ 
    "-l", 
    "|",      /* the pipe symbol is at index 2 */
    "wc",
    "-l", 
    NULL      /* The argument list must end with a NULL.  */ 
  }; 
 
  int pipe_index = 2;
  int rd = STDIN_FILENO;
  int wd = STDOUT_FILENO;
  int fds[2];
  if (pipe(fds) != 0) {
    fprintf(stderr, "Error: unable to pipe command '%s'\n", arg_list1[0]);
    return -1;
  }
  
  wd = fds[1]; /*file descriptor for the write end of the pipe*/

  // delete the pipe symbol and insert a null to terminate the
  // first command's argument list
  args[pipe_index] = NULL;

  // run first command: read from STDIN and write to the pipe
  run_command(arg_list, rd, wd);
  close(fds[1]);

  rd = fds[0];
  wd = STDOUT_FILENO;

  // run the second command: read from the pipe and write to STDOUT
  // the argument for this command starts at pipe_index+1
  run_command(arg_list+pipe_index+1, rd, wd);

  fprintf (stderr, "done with main program\n"); 
  return 0; 
}

In this example, the original list of arguments is broken into two list at the pipe symbol, which is replaced by a null value. This allows us to use the two argument lists to run the two separate commands/programs connected by the pipe.

A process is a program in execution. It is a metaphor for a entity managed by the OS. A process has its own address space and other information in OS managed data structures.

Input and Output[edit]

The Von Neumann Architecture.
The Von Neumann Architecture Scheme

File System[edit]

File System Concepts[edit]

file abstraction - a byte stream, meaning is imposed by file system users file system users - end users (humans) and direct users (programs, e.g. an application or the shell) user perspective - a collection of system calls, such as creat, open, close, seek, delete ... file attributes - meta data: owner, size, timestamps ... directory abstraction - a list of files and directory, map the name of a file/directory onto the information needed to locate the data. absolute path and relative path directory operations - create, delete, open, close, rename, link, unlink

File System Implementations[edit]

layout: partition, boot block, superblock, ... disk block allocation:

 contiguous allocation
 linked list allocation: the first word in each block is used as a pointer to the next one.
 file allocation table: linked list allocation using a table in memory. 
 i-node (index-node): an i-node is in memory when the corresponding file is open. With a i-node design we can calculate the largest possible file size.

directory implementation: i-node, long file names file sharing: symbolic v.s. hard link disk block size: tradeoff and compromise, wasting disk space v.s. performance (data rate) keeping track of free blocks: linked list v.s. bitmap system backup: caching: steps in looking up a file under a path


Resources[edit]