Introduction to Numerical Methods/System of Linear Equations

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Objectives

  • define vector and matrix
  • add, subtract, and multiply matrices
  • find the transpose of a square matrix
  • define the inverse of a matrix
  • setup simultaneous linear equations in matrix form

Resources

Vectors and Matrices[edit]

A matrix is a rectangular array of things, such as numbers, symbols, or expressions. Matrices are commonly used to express linear transformations and system of linear equations.

A triangular matrix is a special type of square matrices. If all entries of A below the main diagonal are zero, A is called an upper triangular matrix.

for all

Similarly if all entries of A above the main diagonal are zero, A is called a lower triangular matrix.

for all

If all entries outside the main diagonal are zero, A is called a diagonal matrix.

for all

This matrix

is upper triangular and this matrix

is lower triangular.

The identity matrix of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0.

Matrix Multiplication[edit]

Matrix multiplication diagram 2.svg

Matrix multiplication takes two matrices and produces another matrix. The rule for the operation is illustrated as follows:

The values at the intersections marked with circles are:

Transpose of a Matrix[edit]

The transpose of a matrix is defined as follows: The transpose of a matrix A is another matrix AT created by any one of the following equivalent actions:

  • reflect A over its main diagonal (which runs from top-left to bottom-right) to obtain AT
  • write the rows of A as the columns of AT
  • write the columns of A as the rows of AT:

The following figure illustrates the idea:

The transpose AT of a matrix A can be obtained by reflecting the elements along its main diagonal. Repeating the process on the transposed matrix returns the elements to their original position.

Inverse of a Matrix[edit]

An n-by-n square matrix A is called invertible if there exists an n-by-n square matrix B such that

where In denotes the n-by-n identity matrix and the multiplication used is ordinary matrix multiplication. B is called the inverse of A, denoted by A−1.

Represent System of Linear Equations in Matrix Form[edit]

We can represent this system of linear equations in matrix form as shown below:

Then we can use matrix multiplication to separate the variables.

The general matrix form for system of linear equations is as follows:

Matrix A is the coefficient matrix. X is the solution vector (matrix) and C is the right hand side vector. If we multiply the inverse of A on bother sides we can see the solution is closely related to the inverse of A.

Gaussian Elimination[edit]

Sources:

Gaussian elimination is an algorithm for solving a system of linear equations, which is similar to finding the inverse of a invertible square matrix. The algorithm consists of a sequence of row reduction operations performed on the associated matrix of coefficients. There are three types of elementary row operations:

  • swapping two rows
  • multiplying a row by a non-zero constant
  • adding a multiple of one row to another row.

For example the first linear equation (L1, pivot equation) can be used to eliminate from the next two equations: :

Then L2 (pivot equation) can be used to eliminate from L3. This process is called forward elimination. Now L3 can be solved with one unknown , which can be used to substitute the in L2 to solve . This process is called backward substitution.

The algorithm for the Gaussian Elimination method can be implemented as follows:

''' 
x = gauss_elimin(a, b)
Solves [a][x] = [b] by Gauss elimination.
'''
from numpy import dot, array

def gauss_elimin(a, b):
  (rows, cols) = a.shape
  # elimination phase
  for row in range(0, rows-1): # pivot equation/row
    for i in range(row+1, rows):
      if a[i, row] != 0.0:
        factor = a[i, row]/a[row, row]
        a[i, row+1:rows] = a[i, row+1:rows] - factor*a[row, row+1:rows]
        b[i] = b[i] - factor*b[row]
  # back substitution 
  for k in range(rows-1,-1,-1):
    b[k] = (b[k] - dot(a[k, k+1:rows],b[k+1:rows]))/a[k, k]
  return b

a = array([[3, 2.0], [-6, 6]])
b = array([7, 6])
print gauss_elimin(a, b)

Gaussian Elimination with Partial Pivoting[edit]

Gaussian elimination method is prone to round-off errors especially when a large number equations allows the error to accumulate. Another problem is the division by zero error. For example the following matrix will break the method because the second pivot element is zero and the method requires the pivot elements to be non-zero.

Gaussian elimination with partial pivoting is similar to the Gaussian elimination except that in the th forward elimination step the maximum of is found and the row that contains the maximum will be swapped with the pivot row. The previous coefficient matrix would look as follows after the swaps.

Gauss-Seidel Method[edit]

Gaussian elimination is a direct (straightforward) method that transforms the original equations to equivalent ones that are easier to solve. Some systems of equations have no solution because for example the number of equations is less than the number of unknowns or one equation contradicts another equation.

Gauss-Seidel method is an iterative (or indirect) method that starts with a guess at the solution and repeatedly refine the guess till it converges (the convergence criterion is met). This method is advantageous because it is more computationally efficient when the coefficient matrix is large and sparse and the round-off errors can be control by adjusting the error tolerance.

The algorithm consists the following steps:

  1. rewrite each equation for the corresponding unknowns, for example, the first equation is written with on the left-hand side and the second equation with on the left-hand side:
  1. find initial guess for the 's.
  2. use the rewritten equations to calculate the new estimates - always using the most recent estimates.
  3. repeat the previous step till the largest absolute relative approximate error among all 's is less than the specified tolerance.

The drawback of most iterative method is that it may not converge when the system of equations has a coefficient matrix that is not diagonally dominant. A system of equations where the coefficient matrix is diagonally dominant does always converge. The matrix A is diagonally dominant if

and

An implementation in Python using numpy simply iterates to produce the solution vector.

LU Decomposition[edit]

LU decomposition factors the coefficient matrix A to the product of a lower triangular matrix and an upper triangular matrix: A = LU. The process of deriving L and U from A is called LU decomposition or LU factorization, which is similar to Gaussian elimination method.

Once LU decomposition is done, we can use the system of linear equations represented by A using forward and back substitutions:

Let , which means and can be solved by forward elimination. From we can solve X using backward elimination.

One method of LU decomposition is similar to Gaussian elimination. For a :

From the last matrix we can see to eliminate we need to multiple the first row of A by and similarly to eliminate by multiplying the first row by . To calculate L we can simply record the multipliers used in the forward elimination steps and the matrix U is the same as the resultant upper triangular matrix from forward elimination.

Sample code in Python

Find the Inverse of a Matrix[edit]

Finding the inverse of a matrix is a process related to solving the system of linear equations. If the inverse of a coefficient matrix is found then the system of equations represented by the coefficient matrix is solved.

One way to find the inverse of a matrix is to find each column of the inverse matrix separately by solving a system of linear equations. If and

then

.

We can calculate the first column of by solving a system of equations using the LU decomposition method. The rest of the columns can be calculated in a similar fashion. As you can see the LU decomposition method is advantageous the LU decomposition is performed once and the result is used many times, which lowers the computational complexity of the whole solution.