A-level Computing/CIE/Advanced Theory/System software

Specification link

Purposes of an operating system

  • show understanding of how an OS can maximise the use of resources
  • describe the ways in which the user interface hides the complexities of the hardware from the user
  • show understanding of processor management: multiprogramming, including:
    • the concept of multiprogramming and a process
    • the process states: running, ready and blocked
    • the need for low-level scheduling and high-level scheduling
    • the concept of an interrupt
    • how the kernel of the OS acts as the interrupt handler and how interrupt handling is used to manage low-level scheduling
  • show understanding of paging for memory management including:
    • the concepts of paging and virtual memory
    • the need for paging
    • how pages can be replaced
    • show how disk thrashing can occur

Virtual machine

  • show understanding of the concept of a virtual machine
  • give examples of the role of virtual machines
  • show understanding of the benefits and limitations of virtual machines

Translation software

  • show awareness of the need for different types of translation software
  • show understanding of how an interpreter can execute programs without producing a translated version
  • show understanding of the various stages in the compilation of a program: lexical analysis, syntax analysis, code generation and optimisation
  • show understanding of how the grammar of a language can be expressed using syntax diagrams or Backus-Naur Form (BNF) notation
  • show understanding of how Reverse Polish notation (RPN) can be used to carry out the evaluation of expressions

Purposes of an Operating System

OS Resources

The OS manages three main resources: the CPU, the memory, and the input/output (I/O) system. I/O operations take significantly longer than CPU operations, so the OS must balance the use of these resources so that the CPU is not left idle while waiting for I/O to complete.

User Interfaces

The user interface is how the user interacts with the computer and the OS. There are two main types of user interface: the command line interface (CLI) and the graphical user interface (GUI).

Examples of a command line interface (CLI) and a graphical user interface (GUI)

Command Line Interface

A command line interface is a text-only interface that was common until around the 1990s. To use a CLI, the user needs to know the commands required to do what they want, which makes it less intuitive for new users.

Graphical User Interface

A graphical user interface is a more visual user interface. GUIs typically involve a system of Windows, Icons, Menus, and Pointers (WIMP). GUIs are generally considered to be more intuitive and user-friendly than CLIs.

Processor Management

Processes

A process is a program in memory that is currently being executed or is waiting to be executed. The OS keeps track of each process using a process control block (PCB).

PCBs keep track of:

Process ID
A unique identifier for the process.
State
Whether the process is running, ready, or blocked.
Privileges
What data, memory, and system functions the process has access to.
Register states
The values of each of the registers in the CPU as the process is being executed.
Priority
Whether the process is important enough to have priority in the scheduling algorithm.
Memory information
The amount of memory allocated for the process.
I/O information
A list of the I/O devices associated with the process.
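
As a rough illustration, the fields above could be grouped into a single record. The following is a minimal sketch only; the class name, field names, and example values are assumptions made for illustration and do not reflect how any particular OS lays out its PCBs.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"


@dataclass
class ProcessControlBlock:
    pid: int                                          # process ID
    state: State = State.READY                        # running, ready, or blocked
    privileges: set = field(default_factory=set)      # what the process may access
    registers: dict = field(default_factory=dict)     # saved register values
    priority: int = 0                                 # used by the scheduler
    memory_allocated: int = 0                         # memory information (bytes)
    io_devices: list = field(default_factory=list)    # I/O information


# Hypothetical process: save its program counter, then mark it as blocked.
pcb = ProcessControlBlock(pid=1, priority=2, memory_allocated=4096)
pcb.registers["PC"] = 0x0040
pcb.state = State.BLOCKED
```

Saving the register values in the PCB is what allows the OS to resume a process later from exactly where it stopped.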

Threads

A thread is a part of a process that is being executed. Processes can be single-threaded or multi-threaded. Multi-threaded processes are typically more efficient to execute, as different threads can run independently.

Threads can be run at the user level or at the kernel level. User-level threads are scheduled at the user level, so the kernel is not involved in thread scheduling. Kernel-level threads are scheduled at the kernel level, which makes it easier to access privileged system functions.

Scheduling

Scheduling is the allocation of time to executing each process in the order that optimises the use of resources. A scheduling algorithm is an algorithm that determines which processes are run in what order.

Each process is in one of three states: ready, running, or blocked. On a single processor core, only one process can be running at a time. New processes always begin in the ready state.

To explain the scheduling algorithms, we will use an example of several processes being run. The first process, A, takes 10 cycles, the second process, B, takes 15 cycles, and the third process, C, takes 5 cycles. B and C arrive after A.



First Come First Served

First Come First Served (FCFS) is a scheduling algorithm in which processes are done in the order that they arrived in the Ready queue.

In the example, this would mean process A is done first, process B next, and finally process C.
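
The FCFS order for the example can be checked with a short simulation. This is a minimal sketch: the text only says that B and C arrive after A, so the arrival times 0, 1 and 2 used here are assumptions, as is the function name `fcfs`.

```python
def fcfs(processes):
    """processes: list of (name, arrival, burst). Returns [(name, start, finish)]."""
    schedule = []
    clock = 0
    # Serve processes strictly in order of arrival.
    for name, arrival, burst in sorted(processes, key=lambda p: p[1]):
        start = max(clock, arrival)      # wait if the CPU is still busy
        finish = start + burst
        schedule.append((name, start, finish))
        clock = finish
    return schedule


# Assumed arrival times: A at cycle 0, B at cycle 1, C at cycle 2.
schedule = fcfs([("A", 0, 10), ("B", 1, 15), ("C", 2, 5)])
order = [name for name, _, _ in schedule]
```

Under these assumptions C, the shortest job, does not start until cycle 25, which shows how short jobs can be held up behind long ones in FCFS.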

Shortest Job First

Shortest Job First (SJF) is a scheduling algorithm which prioritises the shortest job in the queue.

In the example, process A would still go first, because B and C have not arrived, but it will be followed by process C, then process B.
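
A non-pre-emptive version of SJF can be sketched as follows. The arrival times (0, 1 and 2) and the function name `sjf` are assumptions for illustration; the text only states that B and C arrive after A.

```python
def sjf(processes):
    """processes: list of (name, arrival, burst). Returns [(name, finish_time)]."""
    remaining = list(processes)
    clock = 0
    finished = []
    while remaining:
        # Only processes that have already arrived can be chosen.
        ready = [p for p in remaining if p[1] <= clock]
        if not ready:
            clock = min(p[1] for p in remaining)   # idle until the next arrival
            continue
        job = min(ready, key=lambda p: p[2])       # shortest burst first
        remaining.remove(job)
        clock += job[2]                            # run the job to completion
        finished.append((job[0], clock))
    return finished


finished = sjf([("A", 0, 10), ("B", 1, 15), ("C", 2, 5)])
order = [name for name, _ in finished]
```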

Round Robin

A Round Robin algorithm allocates each time slice to processes in a repeating sequential order.

In the example with time slices of length 5 cycles, the time slices would go A, B, C, A, B, B.
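
The sequence of time slices can be reproduced with a queue. This sketch assumes that all three processes are already in the ready queue, in the order A, B, C, when the first slice begins; the function name `round_robin` is hypothetical.

```python
from collections import deque


def round_robin(bursts, quantum):
    """bursts: {name: cycles needed}. Returns the process run in each time slice."""
    queue = deque(bursts.items())        # (name, remaining cycles)
    slices = []
    while queue:
        name, remaining = queue.popleft()
        slices.append(name)              # this process gets the next slice
        remaining -= quantum
        if remaining > 0:
            queue.append((name, remaining))   # unfinished: back of the queue
    return slices


slices = round_robin({"A": 10, "B": 15, "C": 5}, quantum=5)
```

Process B receives the last two slices in a row because it is the only process left in the queue by then.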

Highest Response Ratio Next

With the highest response ratio next scheduling algorithm, the OS chooses the process with the highest response ratio. The response ratio is defined by $R = \frac{W + S}{S}$, where W is the time the process has been waiting and S is the time the process will take to run.
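
The choice made by this algorithm can be sketched in a few lines, assuming the usual definition of the response ratio, (W + S) / S. The arrival times (B at cycle 1, C at cycle 2) and the function names are assumptions; the decision is taken at cycle 10, when process A from the example has just finished.

```python
def response_ratio(waiting, service):
    # Assumed definition: (W + S) / S.
    return (waiting + service) / service


def pick_next(ready, clock):
    """ready: list of (name, arrival, burst). Returns the process to run next."""
    return max(ready, key=lambda p: response_ratio(clock - p[1], p[2]))


ready = [("B", 1, 15), ("C", 2, 5)]
ratio_b = response_ratio(10 - 1, 15)   # (9 + 15) / 15
ratio_c = response_ratio(10 - 2, 5)    # (8 + 5) / 5
chosen = pick_next(ready, clock=10)
```

Because the ratio grows as a process waits, long jobs are not starved indefinitely, even though short jobs are favoured at first.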

Priority-based

A priority-based scheduling algorithm chooses the next process to be done based on priority criteria. If a process is more important, it will be executed earlier.
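
One common way to implement this is with a priority queue. The sketch below is illustrative only: the priority values are invented, and the convention that a lower number means a more important process is an assumption.

```python
import heapq


def run_by_priority(processes):
    """processes: list of (name, priority). Lower number = more important (assumed)."""
    # The index breaks ties so that equal priorities run in arrival order.
    heap = [(priority, index, name) for index, (name, priority) in enumerate(processes)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)   # most important process first
        order.append(name)
    return order


order = run_by_priority([("A", 2), ("B", 3), ("C", 1)])
```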

Interrupts

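An interrupt is a signal sent to the processor to tell it to suspend the current process and deal with a different task. The kernel of the OS acts as the interrupt handler: it saves the state of the current process and decides what to run next.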

Interrupts are used to enforce time slices in pre-emptive scheduling algorithms.

Memory Management

The OS is not only responsible for determining how processes are to be given time, but also how processes are to be given memory. Thus, the OS needs some kind of method to determine how memory is to be allocated amongst processes.

Dynamic Partitioning

If a partition between two partitions is released, the free space is now discontinuous
External Fragmentation

Dynamic partitioning is where the main memory is split into contiguous blocks which are exactly the right size for the process. This has the advantage of avoiding internal fragmentation but causes external fragmentation.
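
External fragmentation can be shown with a small calculation. In this sketch, memory is 100 units long and holds three partitions; the sizes, names, and the helper `free_holes` are all assumptions made for illustration.

```python
def free_holes(size, allocations):
    """allocations: {name: (start, length)}. Returns the free gaps as (start, length)."""
    holes = []
    cursor = 0
    for start, length in sorted(allocations.values()):
        if start > cursor:
            holes.append((cursor, start - cursor))   # gap before this partition
        cursor = start + length
    if cursor < size:
        holes.append((cursor, size - cursor))        # gap at the end of memory
    return holes


memory_size = 100
allocations = {"A": (0, 30), "B": (30, 40), "C": (70, 30)}
del allocations["A"]     # process A finishes
del allocations["C"]     # process C finishes

holes = free_holes(memory_size, allocations)
total_free = sum(length for _, length in holes)
largest_hole = max(length for _, length in holes)
can_fit_50 = largest_hole >= 50
```

Here 60 units are free in total, but a new process needing 50 contiguous units cannot be loaded because the free space is split into two holes of 30 units each.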

Paging

Paging is where the OS divides memory into equal-sized pages. Each process uses a certain number of these pages.

The advantage of paging is that it can be used with virtual memory in order to increase the amount of memory that can be used at a given time. To do so, the main memory is divided into page-sized spaces called page frames. Pages can be loaded from the disk into these page frames. When a process is complete or not being executed, its pages can be written back to the disk.

To make sure that pages are being loaded in and out of memory efficiently, a page replacement algorithm needs to be used. A common page replacement algorithm is least recently used (LRU), which moves the least recently used page back to the disk to make room for new pages.
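
Replacing the least recently used page can be sketched with an ordered dictionary. The reference string and the number of page frames below are invented for illustration, and `lru_page_faults` is a hypothetical helper rather than part of any OS.

```python
from collections import OrderedDict


def lru_page_faults(references, frame_count):
    """Count page faults for a sequence of page numbers using LRU replacement."""
    frames = OrderedDict()       # keys ordered from least to most recently used
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)          # page was just used again
        else:
            faults += 1                       # page must be loaded from disk
            if len(frames) == frame_count:
                frames.popitem(last=False)    # evict the least recently used page
            frames[page] = True
    return faults


faults = lru_page_faults([1, 2, 3, 1, 4, 2], frame_count=3)
```

In this run, page 2 is evicted when page 4 is loaded, because page 1 was used more recently, so the final access to page 2 causes another fault.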

Segmentation

Segmentation is like paging, except the memory is divided into blocks of varying, rather than fixed, size. Each segment is mapped to a section of main memory, similar to how pages are mapped to page frames in a paging system.

Virtual Memory

Virtual memory is where paging is used to expand the amount of memory available. In a virtual memory system, pages are loaded in and out of main memory as needed. Pages are given logical addresses which are mapped to page frames using a page map (also called a page table).
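
The mapping from a logical address to a physical address can be sketched as follows. The page size of 1024 bytes, the contents of the page map, and the function name `translate` are assumptions chosen for illustration.

```python
PAGE_SIZE = 1024   # assumed page size in bytes

# Hypothetical page map: page number -> page frame (None = page is on disk).
page_map = {0: 5, 1: 2, 2: None}


def translate(logical_address):
    """Convert a logical address into a physical address using the page map."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    frame = page_map.get(page)
    if frame is None:
        # The OS would now load the page from disk into a free page frame.
        raise LookupError(f"page fault: page {page} is not in main memory")
    return frame * PAGE_SIZE + offset


physical = translate(1030)   # page 1, offset 6 -> frame 2, offset 6
```

The offset within the page is unchanged by translation; only the page number is replaced by a frame number.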

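The main problem of virtual memory is that it can lead to disk thrashing. Disk thrashing is where pages are swapped between main memory and the disk so often that the system spends more time moving pages than executing processes. This typically happens when there is too little main memory for the pages in active use. Thrashing severely degrades performance, and the constant disk activity may also wear out the disk.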
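
Thrashing tends to appear when a process cycles through more pages than there are page frames. The sketch below counts page faults under least-recently-used replacement for an invented access pattern; the helper `count_faults` and the numbers used are assumptions for illustration.

```python
def count_faults(references, frame_count):
    """Count page faults using least-recently-used replacement."""
    frames = []        # least recently used page first
    faults = 0
    for page in references:
        if page in frames:
            frames.remove(page)          # will be re-added as most recent
        else:
            faults += 1
            if len(frames) == frame_count:
                frames.pop(0)            # evict the least recently used page
        frames.append(page)
    return faults


references = [1, 2, 3, 4] * 5            # 20 accesses cycling through 4 pages
faults_with_3_frames = count_faults(references, 3)
faults_with_4_frames = count_faults(references, 4)
```

With only three page frames every one of the 20 accesses causes a page fault, whereas with four frames only the first four accesses do.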

Virtual Machine

An example of a virtual machine
VirtualBox is an example of a virtual machine manager

A virtual machine is a way of emulating an operating system or process within a different operating system. To do this, the host OS interfaces with the virtual machine, while the user interfaces with the host OS.

Virtual machines are either system virtual machines or process virtual machines. System virtual machines emulate an entire computer system, so that a complete guest operating system can run on them, whereas process virtual machines provide an environment for a single program to run within a different operating system (the Java Virtual Machine is one example). Process virtual machines can run inside of system virtual machines.

Translation Software

Types of Translation software

Compilers

A compiler is a program that converts the program source code into an executable file, which can then be distributed.

Interpreters

An interpreter is a program which reads the source code line-by-line and executes it without producing an executable file.

Assemblers

An assembler is a program which converts assembly code into an executable file.

The translation process

Lexical analysis

Lexical analysis is the process of breaking the source code into its lexemes. A lexeme is a word or symbol that appears in the program.

e.g. x = y + 1 consists of five lexemes: x, =, y, +, and 1.

Lexical analysis also examines what type of token each lexeme is. Tokens can be of the following types:

Identifier
A variable, like x.
Keyword
A word that is used to control the program, like if or while. Keywords cannot be used as identifiers.
Operator
Symbols that act on the variables, such as + and =.
Literal
Literal values that are used in the program, such as "Hello" or 1.
Separator
Symbols that separate one statement from another, such as semicolons or line breaks.
Comment
A note by the programmer that clarifies what the code does. Comments are ignored by the compiler.

The end goal of lexical analysis is to produce a list of tokens which can be parsed in syntax analysis.

e.g. The code x = y + 1 may be converted by lexical analysis to [(identifier,x),(operator,=),(identifier,y),(operator,+),(literal,1)]. This array of tokens is then parsed by syntax analysis.
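
A very small lexical analyser for statements like the one above can be sketched with regular expressions. The token patterns, the keyword set, and the function name `tokenize` are assumptions for illustration; a real language would need many more rules.

```python
import re

KEYWORDS = {"if", "while"}

# Each pair is (token type, regular expression for its lexemes).
TOKEN_SPEC = [
    ("literal",    r'\d+|"[^"]*"'),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator",   r"[+\-*/=]"),
    ("separator",  r";"),
    ("skip",       r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def tokenize(source):
    """Break the source code into a list of (token type, lexeme) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        kind, lexeme = match.lastgroup, match.group()
        position = match.end()
        if kind == "skip":
            continue                     # whitespace produces no token
        if kind == "identifier" and lexeme in KEYWORDS:
            kind = "keyword"             # keywords cannot be identifiers
        tokens.append((kind, lexeme))
    return tokens


tokens = tokenize("x = y + 1")
```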

Syntax analysis

Syntax analysis is the process of converting the string of tokens from lexical analysis into a parse tree. A parse tree is a data structure which stores the tokens according to the laws of precedence.

A parse tree, as would be produced from syntax analysis
This parse tree results from the expression (x+y)*x-z*y/(x+x)

The laws of precedence, also called the order of operations, determine which operations in an expression are applied first. For instance, multiplication is applied before addition.

The full order of operations will typically resemble this:

  1. Function calls, array access, field access
  2. Unary operators
  3. Multiplication & Division
  4. Addition & Subtraction
  5. Comparisons
  6. Bitwise and, xor, then or
  7. Logical and, then or
  8. Assignment

Because operations with a lower precedence are applied later, operations with higher precedence tend to be at the bottom of the parse tree.

Syntax diagrams which define digits, constants, variables, factors, terms, and expressions in a certain system
Syntax diagrams are another way of representing the grammar of a language.

Backus-Naur Form (BNF)

Backus-Naur Form is a way of expressing the grammar of a programming language. BNF does so using a system of placeholders and alternatives.

Every placeholder is defined as consisting of one of several alternatives. The symbol ::= means "is defined as". The symbol | means "or".

e.g. The following BNF definitions can be constructed based on the syntax diagrams above.

<digit> ::= 0|1|2|3|4|5|6|7|8|9
<variable> ::= x|y|z
<constant> ::= <digit>|<constant><digit>
<factor> ::= <constant>|<variable>|(<expression>)
<term> ::= <factor>|<term>*<factor>
<expression> ::= <term>|<expression>+<term>

Note that some definitions, such as the definition for <constant>, are recursive. This is useful as it allows us to define constants of any length using a finite number of rules.

The advantage of using BNF as opposed to a syntax diagram is that BNF can be entered into and interpreted by a computer, whereas a syntax diagram can't be interpreted in the same way.
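
To illustrate how a grammar written in BNF can be used by a program, the sketch below is a small recursive descent parser for the six definitions above. It is one possible approach, not the only one: the left-recursive rules for <term> and <expression> are handled with loops, and the tree is returned as nested tuples. The function names are assumptions for illustration.

```python
def parse(source):
    """Parse an expression using the grammar above; return a nested-tuple tree."""
    tokens = list(source.replace(" ", ""))   # every lexeme is one character
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        position += 1
        return tokens[position - 1]

    def expression():                 # <expression> ::= <term>|<expression>+<term>
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                       # <term> ::= <factor>|<term>*<factor>
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                     # <factor> ::= <constant>|<variable>|(<expression>)
        token = peek()
        if token == "(":
            advance()
            node = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token is not None and token in "xyz":      # <variable>
            return advance()
        if token is not None and token.isdigit():     # <constant>: one or more digits
            digits = ""
            while peek() is not None and peek().isdigit():
                digits += advance()
            return digits
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse("x+12*(y+z)")
```

Because <term> is parsed inside <expression>, multiplication ends up lower in the tree than addition, which matches the laws of precedence.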

Optimisation

Optimisation is the process of making code more efficient. This means removing unnecessary steps and rephrasing complicated expressions.

e.g. The expression z = x^2 + 2*x*y + y^2 can be simplified to z = (x+y)^2

Optimisation is typically done in the back-end of the compiler, as it converts the program to assembly code and then an executable file.

Optimisation is beneficial because:

  • It reduces the amount of time that the program will take to run.
  • It reduces the amount of space that the program will take up in memory.
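
One simple optimisation, not described above but commonly used, is constant folding: any operation on two literal values is carried out once at compile time instead of every time the program runs. The sketch below applies it to Python source using the standard `ast` module (`ast.unparse` needs Python 3.9 or later); the class and function names are assumptions for illustration.

```python
import ast
import operator

# Operators that this sketch knows how to evaluate at compile time.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)                       # fold the operands first
        left, right = node.left, node.right
        if (isinstance(left, ast.Constant) and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))
                and type(node.op) in FOLDABLE):
            value = FOLDABLE[type(node.op)](left.value, right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(source):
    """Return the source with constant sub-expressions evaluated."""
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


optimised = fold("z = 2 * 3 + x")
```

The multiplication 2 * 3 is replaced by the literal 6, so the optimised statement performs one operation at run time instead of two.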

Reverse Polish Notation (RPN)

Reverse Polish Notation is a way of representing expressions such that it is easier to evaluate them with a stack. It is to be contrasted against the more common infix notation.

In infix notation, the operator is placed between its two operands (e.g. 2+2). In RPN, the operands are written first and the operator is placed after them (e.g. 2 2 +).

Converting from Infix notation to RPN with a parse tree

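To convert an expression, such as (2+x)^(3-y), to RPN, it can be useful to make a parse tree. Traversing the tree in post-order (left subtree, then right subtree, then the operator) gives the RPN form, which for this expression is 2 x + 3 y - ^.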
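
The conversion can be sketched as a post-order traversal of the parse tree. Here the tree is written by hand as nested tuples of the form (operator, left, right), which is a representation assumed for illustration.

```python
def to_rpn(node):
    """Post-order traversal: left subtree, right subtree, then the operator."""
    if isinstance(node, tuple):
        operator, left, right = node
        return to_rpn(left) + to_rpn(right) + [operator]
    return [node]                      # a leaf is a number or a variable


# Parse tree for (2+x)^(3-y)
tree = ("^", ("+", "2", "x"), ("-", "3", "y"))
rpn = " ".join(to_rpn(tree))
```

Note that the brackets from the infix expression are not needed in RPN, because the order of the tokens already fixes the order of evaluation.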

A parse tree for the expression (2+x)^(3-y)

Evaluating RPN using a stack

Evaluating the RPN expression 3 10 5 + * with a stack

When evaluating an RPN expression with a stack, we need to follow a few rules:

  1. When we encounter a variable or number, we push its value onto the stack.
  2. When we encounter an operator, we pop the top two values from the stack, apply the operation to them, and push the result back onto the stack.
  3. When the evaluation is complete, there should be one item left in the stack, which is the answer.
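
These rules can be sketched as a short function. It assumes that tokens are separated by spaces, that every operator takes two operands, and that values for any variables are supplied in a dictionary; the function name `evaluate_rpn` is hypothetical.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def evaluate_rpn(expression, variables=None):
    """Evaluate a space-separated RPN expression using a stack."""
    variables = variables or {}
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()                  # the top of the stack is the
            left = stack.pop()                   # right-hand operand
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(variables[token])       # rule 1: push a variable's value
        else:
            stack.append(float(token))           # rule 1: push a number
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]                              # rule 3: one item left


result = evaluate_rpn("3 10 5 + *")
power = evaluate_rpn("2 x + 3 y - ^", {"x": 1, "y": 1})
```

For 3 10 5 + *, the stack holds 3, 10, 5, then 3, 15 after the addition, and finally 45 after the multiplication.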