Linear Algebra/Eigenvalues and eigenvectors


Eigenvalues and eigenvectors are related to fundamental properties of matrices.

The word eigenvalue comes from the German Eigenwert, which means "proper or characteristic value."

Motivations

Large matrices can be computationally costly to use, and may have to be applied hundreds or thousands of times in a calculation. Moreover, the behavior of matrices would be hard to explore without the right mathematical tools. One such tool, with applications not only in linear algebra but in differential equations, calculus, and many other areas, is the concept of eigenvalues and eigenvectors. Eigenvalues and eigenvectors capture a common behavior of linear systems. Let's look at an example.

Let

A=\begin{pmatrix} 
1 & 2  \\
0 & -2 \\
\end{pmatrix}

and

\mathbf{x}=\begin{pmatrix} -2 \\ 3 \\ \end{pmatrix},\quad \mathbf{y}=\begin{pmatrix} 1 \\ 0 \\ \end{pmatrix}.

What happens to x and y if they are transformed by A? Well,

A\mathbf{x}=\begin{pmatrix} 4 \\ -6 \\ \end{pmatrix}
A\mathbf{y}=\begin{pmatrix} 1 \\ 0 \\ \end{pmatrix}

But what is remarkable is that

A\mathbf{x}=(-2)\begin{pmatrix} -2 \\ 3 \\ \end{pmatrix}=-2\mathbf{x}
A\mathbf{y}=(1)\begin{pmatrix} 1 \\ 0 \\ \end{pmatrix}=(1)\mathbf{y}

So when we operate on the vector x with the matrix A, instead of getting a different vector (as we would normally do), we get the same vector x multiplied by some constant. And the same goes for vector y.

We call the values 1 and -2 the eigenvalues of the matrix A, and the vectors x and y are called eigenvectors for the matrix A.
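We can verify this behaviour numerically. Here is a minimal sketch using Python with NumPy (our choice of tool, not part of the text above):

import numpy as np

A = np.array([[1, 2],
              [0, -2]])
x = np.array([-2, 3])
y = np.array([1, 0])

print(A @ x)  # [ 4 -6], which equals -2 * x
print(A @ y)  # [ 1  0], which equals  1 * y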

Definitions

We now generalize this concept of a matrix/vector product being the same as a product by a scalar, as above: given an n×n matrix A, we seek the solutions in v (the eigenvectors) and in λ (the eigenvalues) of the equation

A\mathbf{v}=\lambda\mathbf{v}

How are we to do this? Let us rearrange the equation

A\mathbf{v}-\lambda\mathbf{v}=\mathbf{0}
(A-\lambda I)\mathbf{v}=\mathbf{0} (note we must multiply the scalar λ by the identity matrix, since otherwise A-λ makes no sense)

But (A-λI) is a matrix, so we are trying to solve Bv=0 where B=(A-λI), and this solution is merely the kernel of B, ker B. So the eigenvectors are in ker (A-λI), where λ is an eigenvalue. But how do we find the eigenvalues?

B\mathbf{v}=\mathbf{0} has a nonzero solution if and only if |B| = det(B) is zero. So to find the eigenvalues, we set |A-λI|=0 and solve for λ. This yields a polynomial equation over the complex numbers (eigenvalues can be complex), known as the characteristic equation. The roots of the characteristic equation are the eigenvalues.

Note that we exclude \mathbf{0} as an eigenvector, because it is trivially a solution to A\mathbf{v}=\lambda\mathbf{v} and is not interesting to consider. Moreover, if the zero vector were allowed, every λ would be an eigenvalue, since A\mathbf{0}=\lambda\mathbf{0} for any value of λ.

If we have an eigenvalue λ of a matrix A, together with a corresponding eigenvector x, then any nonzero multiple of x is also an eigenvector for the same eigenvalue. To see that kx is also an eigenvector, follow this argument: if A\mathbf{x}=\lambda\mathbf{x}, then A(k\mathbf{x})=kA\mathbf{x}=k\lambda\mathbf{x}=\lambda(k\mathbf{x}). (Here k may be any nonzero scalar.) Thus, every nonzero multiple of an eigenvector is also an eigenvector.

Note the asymmetry here: the eigenvalues of a matrix are uniquely determined, while each eigenvalue has infinitely many eigenvectors.
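To make the recipe above concrete for the 2×2 case, here is a sketch in Python with NumPy; the helper eigenvalues_2x2 is our own illustrative name, and it simply finds the roots of the characteristic polynomial |A-λI| = λ² - tr(A)λ + det(A).

import numpy as np

def eigenvalues_2x2(A):
    # For a 2x2 matrix, |A - lambda I| = lambda^2 - tr(A) lambda + det(A).
    tr = A[0, 0] + A[1, 1]
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    return np.roots([1, -tr, det])  # roots may be complex

A = np.array([[1.0, 2.0],
              [0.0, -2.0]])
print(eigenvalues_2x2(A))  # the eigenvalues -2 and 1 (order may vary)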

Finding eigenvalues and eigenvectors

Here are some examples of finding eigenvalues and eigenvectors using our definitions.

Let

A = \begin{pmatrix} 3 & 0 \\
                           -1 & 2 \end{pmatrix}

Firstly, we expand |A-λI|=0 to find the eigenvalues:

\left| \begin{pmatrix} 3 & 0 \\  
                                                -1 & 2 \end{pmatrix}-
                \begin{pmatrix} \lambda & 0 \\ 0 & \lambda\end{pmatrix}
               \right|=0


\begin{vmatrix} 3 - \lambda & 0 \\
                                 -1 & 2 - \lambda \end{vmatrix}=0
 (3-\lambda)(2-\lambda)-(0)(-1)=0
 (3-\lambda)(2-\lambda)=0

Now, elementary algebra tells us the roots of this equation are 3 and 2, and thus these are our eigenvalues.

(Exercise: prove that in a 2×2 triangular matrix the eigenvalues are on the principal diagonal. Harder: generalize this result)

Now we can find our eigenvectors. Consider the first eigenvalue, λ=3. To find our first eigenvector, we compute

 \mbox{ker} (A-3I) =  \mbox{ker} 
            \begin{pmatrix} 3-3 & 0 \\
                           -1 & 2-3 \end{pmatrix}
                            =  \mbox{ker} 
            \begin{pmatrix} 0 & 0 \\
                           -1 & -1 \end{pmatrix}

At this point we could row-reduce and back-substitute, but since our matrix is small and has linearly dependent columns, it usually suffices to guess the kernel. Now, observe:

 \begin{pmatrix}
                            0 & 0 \\
                           -1 & -1 \end{pmatrix} 
\begin{pmatrix} a \\ -a \end{pmatrix}=\mathbf{0}

So, for any nonzero scalar a, the vector

\begin{pmatrix} a \\ -a \end{pmatrix} is an eigenvector. Stated another way, the eigenvectors of A for λ=3 are exactly the nonzero vectors in \mbox{span} \{\begin{pmatrix} 1 \\ -1 \end{pmatrix}\}. In the plane, this span is a line of slope -1 through the origin.

As noted above the eigenvalues of a matrix are uniquely determined, but for each eigenvalue there are many eigenvectors. We usually choose an eigenvector for some convenience such as "most whole number entries", "first entry is 1", or "length of the eigenvector is 1". Most Computer Algebra Systems choose unit vectors for eigenvectors.

So here we may take \begin{pmatrix} 1 \\ -1 \end{pmatrix} to be the eigenvector, for example.

Similarly for our second eigenvalue λ=2, to find our second eigenvector:

 \mbox{ker} (A-2I) =  \mbox{ker} 
            \begin{pmatrix}1 & 0 \\
                           -1 & 0 \end{pmatrix}=
\mbox{span} \{\begin{pmatrix} 0 \\ 1 \end{pmatrix}\}

And so, our second eigenvector is chosen as

\begin{pmatrix} 0 \\ 1 \end{pmatrix}.

Our eigenvalues then are λ=2,3, with eigenvectors \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, as may be checked by multiplying each by the given matrix.

(We could also choose \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} as an eigenvector for the eigenvalue λ=3. Check this.)
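As noted above, most computer algebra systems return unit eigenvectors. A small check with NumPy (a sketch; np.linalg.eig returns the eigenvalues together with unit-length eigenvector columns):

import numpy as np

A = np.array([[3.0, 0.0],
              [-1.0, 2.0]])
vals, vecs = np.linalg.eig(A)
print(vals)   # [3. 2.] (order may vary)
print(vecs)   # columns are unit eigenvectors, e.g. (1,-1)/sqrt(2) and (0,1)

# confirm A v = lambda v for each eigenpair
for lam, v in zip(vals, vecs.T):
    assert np.allclose(A @ v, lam * v)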

Problem set

Given the above, find the eigenvalues and eigenvectors of the following matrices (answers follow):

  1. \begin{pmatrix} 3 & 0 \\ -4 & 5 \end{pmatrix}
  2. \begin{pmatrix} 1 & 1 \\ 3 & -1 \end{pmatrix}
  3. \begin{pmatrix} -2 & 0 & 3 \\ 2 & 4 & 0 \\ 1 & 0 & 0 \end{pmatrix}
(Harder. Hint: one eigenvalue is 4.)

Answers

  1. eigenvalues: 3, 5; eigenvectors: \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix}  0\\1  \end{pmatrix}
  2. eigenvalues: -2, 2; eigenvectors: \begin{pmatrix} -1 \\ 3 \end{pmatrix}, \begin{pmatrix}  1\\1  \end{pmatrix}
  3. eigenvalues: -3, 1, 4; eigenvectors: \begin{pmatrix}21\\-6\\-7\end{pmatrix},\begin{pmatrix}3\\-2\\3\end{pmatrix},\begin{pmatrix}0\\1\\0\end{pmatrix}

Applications

Eigenvalues and eigenvectors are not merely pretty facts about matrices; they have relevant and important applications.

Matrix powers

Let us first examine a certain class of matrices known as diagonal matrices: these are matrices of the form

\begin{pmatrix} 
a_1 & 0 & 0 & \ldots & 0 \\
0 & a_2 & 0 & \ldots & 0 \\
0 & 0 & a_3 & \ldots & 0 \\
0 & 0 & 0 & \ldots & a_n \end{pmatrix}

Now, observe that

\begin{pmatrix} 
a_1 & 0 & 0 & \ldots & 0 \\
0 & a_2 & 0 & \ldots & 0 \\
0 & 0 & a_3 & \ldots & 0 \\
0 & 0 & 0 & \ldots & a_n \end{pmatrix}^k=
\begin{pmatrix} 
a_1^k & 0 & 0 & \ldots & 0 \\
0 & a_2^k & 0 & \ldots & 0 \\
0 & 0 & a_3^k & \ldots & 0 \\
0 & 0 & 0 & \ldots & a_n^k \end{pmatrix}
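A quick numerical check of this identity, as a sketch in NumPy with arbitrary diagonal entries:

import numpy as np
from numpy.linalg import matrix_power

a = np.array([2.0, -1.0, 3.0])   # arbitrary diagonal entries
D = np.diag(a)
k = 5
# the k-th power of a diagonal matrix is diagonal with entries a_i^k
assert np.allclose(matrix_power(D, k), np.diag(a ** k))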

This is a very useful property! However, the number of matrices to which we can apply this fact is clearly limited, so we ask ourselves whether we can transform a given matrix into a diagonal matrix.

The answer to this question is "sometimes", but for the moment, we will only look at matrices for which this answer is "yes".

What we seek is a matrix P such that

P^{-1}AP=D

where D is diagonal.

If such a matrix P exists, we say that A is diagonalizable. (A map of the form A \mapsto P^{-1}AP is often called a similarity transformation.)

Then

P^{-1}AP=D
AP=PD

by multiplying on the left by P, then

A=PDP^{-1}

by multiplying on the right by P^{-1}.

Now, we have

A^k=(PDP^{-1})^k
=(PDP^{-1})(PDP^{-1})\cdots(PDP^{-1}) (k times)
=PD(P^{-1}P)D(P^{-1}P)D\cdots P^{-1}

The P^{-1}P terms cancel to give

=PDD\cdots DP^{-1} (with k factors of D)
=PD^kP^{-1}

We can calculate D^k easily, so we need only find P.

It turns out (the full proof is quite difficult) that we simply form P by taking the linearly independent eigenvectors of A as its columns.

D, then, is the diagonal matrix whose main diagonal contains the corresponding eigenvalues: the eigenvalue in the i-th diagonal entry corresponds to the eigenvector in the i-th column of P.
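The whole recipe fits in a few lines of code. Here is a hedged sketch in NumPy (power_by_diagonalization is our own illustrative name; it assumes A is diagonalizable, i.e. that the eigenvector matrix P is invertible):

import numpy as np

def power_by_diagonalization(A, k):
    vals, P = np.linalg.eig(A)      # columns of P are eigenvectors
    Dk = np.diag(vals ** k)         # D^k is computed entrywise
    return P @ Dk @ np.linalg.inv(P)

A = np.array([[3.0, 1.0],
              [4.0, 0.0]])
# agrees with repeated multiplication A @ A @ A
print(np.round(power_by_diagonalization(A, 3)))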

Example

Let's work through an example to show these ideas.

A=\begin{pmatrix} 
3 & 1 \\
4 & 0 \\
\end{pmatrix}

So what do we do if we want to find A^{14}? Let's use the method we've just described.

Find the eigenvalues:

|A-\lambda I|=0
(3-\lambda)(-\lambda)-4=0
\lambda^2-3\lambda-4=0
\lambda=-1,\ 4

Find the eigenvectors:

for λ=-1 
\mbox{ker}\begin{pmatrix} 
4 & 1 \\
4 & 1 \\
\end{pmatrix}=\mbox{span}\{\begin{pmatrix}-1 \\ 4\end{pmatrix}\}
for λ=4 
\mbox{ker}\begin{pmatrix} 
-1 & 1 \\
4 & -4 \\
\end{pmatrix}=\mbox{span}\{\begin{pmatrix}1 \\ 1\end{pmatrix}\}

The eigenvectors are then

\begin{pmatrix}-1 \\ 4\end{pmatrix}, \begin{pmatrix}1 \\ 1\end{pmatrix}

so we put the eigenvectors together as the columns of the matrix P

P=\begin{pmatrix}-1 & 1 \\ 4 & 1\end{pmatrix}

Now -1 generated the eigenvector in the first column, and 4 generated the eigenvector in the second column, so form D in this way:

D=\begin{pmatrix} 
-1 & 0 \\
0 & 4 \\
\end{pmatrix}

We can easily calculate (-1)^{14}=1, so we get

D^{14}=\begin{pmatrix} 
1 & 0 \\
0 & 4^{14} \\
\end{pmatrix}

and, using the standard formula for the inverse of a 2×2 matrix, we have

P^{-1} = -\frac{1}{5}\begin{pmatrix}1 & -1 \\
                             -4 &-1 \end{pmatrix}

So now we can directly multiply out A^{14}=PD^{14}P^{-1}:

\begin{pmatrix}
-1 & 1 \\ 
4 & 1\end{pmatrix}
\begin{pmatrix} 
1 & 0 \\
0 & 4^{14} \end{pmatrix}
\left(-\frac{1}{5}\right)
\begin{pmatrix}
1 & -1 \\ 
-4 & -1\end{pmatrix}

Simplifying we get


\begin{pmatrix}
 214748365 & 53687091 \\
 214748364 & 53687092 \end{pmatrix}
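We can confirm this result by brute force with integer matrix powers, e.g. in NumPy:

import numpy as np

A = np.array([[3, 1],
              [4, 0]])
print(np.linalg.matrix_power(A, 14))
# [[214748365  53687091]
#  [214748364  53687092]]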

Problem set

Given the above, find the following matrix powers (Answers follow to even-numbered questions):

  1. \begin{pmatrix} 1 & 0 \\ -2 & 5 \end{pmatrix}^6
  2. \begin{pmatrix} 1 & 3 \\ 3 & -7 \end{pmatrix}^5
  3. \begin{pmatrix} -1 & 9 \\ -10 & 16 \end{pmatrix}^4
  4. \begin{pmatrix} -2 & 0 & 3 \\ 0 & 4 & 0 \\ 0 & 0 & 2 \end{pmatrix}^3
(More tedious; only slightly easier because this matrix is upper triangular, so its eigenvalues lie on the diagonal)
Answers
  2. \begin{pmatrix} -3248 & 9840 \\ 9840 & -29488\end{pmatrix}
  4. \frac{1}{4}\begin{pmatrix} -32 & 0 & 48\\  0 & 256 & 0\\ 0 & 0 & 32\end{pmatrix}

Coupled ordinary differential equations

We can use the method of diagonalisation to solve systems of coupled ordinary differential equations. For example, let x(t) and y(t) be differentiable functions and x' and y' their derivatives. The following differential equations look relatively difficult to solve directly:

x' = 4x - y
y' = 2x + y

but

u' = ku, for a constant k, is easy to solve:

it has solution

u = Ae^{kt}, where A is a constant.

Remembering this fact, we translate the ODEs into matrix form:


\begin{pmatrix}
x'\\
y'
\end{pmatrix}
=
\begin{pmatrix}
4&-1\\
2&1
\end{pmatrix}
\begin{pmatrix}
x\\
y
\end{pmatrix}

Diagonalising the square matrix, we get:


\begin{pmatrix}
x'\\
y'
\end{pmatrix}
=
\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}
\begin{pmatrix}
3&0\\
0&2
\end{pmatrix}
\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}^{-1}
\begin{pmatrix}
x\\
y
\end{pmatrix}

we put


\begin{pmatrix}
u\\
v
\end{pmatrix}
=
\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}^{-1}
\begin{pmatrix}
x\\
y
\end{pmatrix}

then it follows that


\begin{pmatrix}
u'\\
v'
\end{pmatrix}
=
\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}^{-1}
\begin{pmatrix}
x'\\
y'
\end{pmatrix}

thus


\begin{pmatrix}
u'\\
v'
\end{pmatrix}
=
\begin{pmatrix}
3&0\\
0&2
\end{pmatrix}
\begin{pmatrix}
u\\
v
\end{pmatrix}

As discussed above, these equations are easy to solve. We have

u = Ce^{3t}
v = De^{2t}

for some constants C and D. Since


\begin{pmatrix}
u\\
v
\end{pmatrix}
=
\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}^{-1}
\begin{pmatrix}
x\\
y
\end{pmatrix}

we get



\begin{pmatrix}
1&1\\
1&2
\end{pmatrix}
\begin{pmatrix}
u\\
v
\end{pmatrix}
=
\begin{pmatrix}
x\\
y
\end{pmatrix}

and so

x = Ce^{3t} + De^{2t}
y = Ce^{3t} + 2De^{2t}
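As a sanity check, we can confirm numerically that these closed forms satisfy the original system, for arbitrarily chosen constants (a sketch in NumPy):

import numpy as np

C, D = 1.0, 2.0                    # arbitrary constants
t = np.linspace(0.0, 1.0, 5)

x = C * np.exp(3 * t) + D * np.exp(2 * t)
y = C * np.exp(3 * t) + 2 * D * np.exp(2 * t)

# exact derivatives of the closed forms
dx = 3 * C * np.exp(3 * t) + 2 * D * np.exp(2 * t)
dy = 3 * C * np.exp(3 * t) + 4 * D * np.exp(2 * t)

assert np.allclose(dx, 4 * x - y)  # x' = 4x - y
assert np.allclose(dy, 2 * x + y)  # y' = 2x + y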

This method generalises well to higher dimensions.

Coupled differential equations

Matrices, strangely enough, are of great use in calculus for solving coupled differential equations, where one differential equation involves a function that appears in another. For example:

D y = 3y + x
D x = y + 3x

Without going any further, the solution to these differential equations looks very difficult! However, if we formulate the problem in terms of matrices, it becomes a little easier to analyze.

Example

Let's take the above example, so

D y(t) = 3y + x
D x(t) = y + 3x

Now form a vector:

 \mathbf{v}(t) = \begin{pmatrix} y \\ x \end{pmatrix}

Then

 D \mathbf{v}(t) = \begin{pmatrix} D y \\ D x \end{pmatrix}

Now the problem becomes

 D(\mathbf{v}) = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}\mathbf{v}

This is reminiscent of the differential equation we have already encountered in calculus, that of

D y = ky

in which the solution is y = ce^{kt}. We can make a wild guess, then, that the solution to the matrix equation above has a similar form.

So let's try a solution \mathbf{v} = \mathbf{w}e^{\lambda t}. Then D\mathbf{v} = \lambda\mathbf{w}e^{\lambda t}.

Let us then substitute this guessed solution into our equation:

\lambda\mathbf{w}e^{\lambda t} = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}\mathbf{w}e^{\lambda t}

If we let

 A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}

we see that the equation above becomes, on dividing through by e^{\lambda t} (since it is never zero)

 A\mathbf{w} = \lambda\mathbf{w}

But wait - this is the eigenvalue equation we met before - so the guess \mathbf{v} = \mathbf{w}e^{\lambda t} is a solution if and only if λ is an eigenvalue of A and \mathbf{w} is a corresponding eigenvector.

The eigenvalues are 4, 2, with eigenvectors

\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \end{pmatrix}

respectively.

So we have two solutions

\begin{pmatrix} 1 \\ 1 \end{pmatrix}e^{4 t}

and

\begin{pmatrix} -1 \\ 1 \end{pmatrix}e^{2 t}

Note that if we have two solutions to the differential equation D\mathbf{v} = A\mathbf{v}, then any linear combination of them is also a solution. So we have the general solution:

\mathbf{v}=\begin{pmatrix} y(t) \\ x(t) \end{pmatrix}=j\begin{pmatrix} 1 \\ 1 \end{pmatrix}e^{4 t}+k\begin{pmatrix} -1 \\ 1 \end{pmatrix}e^{2 t}

Separating into the first and second components, we get our two solutions

 y(t) = je^{4 t}-ke^{2 t}, x(t) = je^{4 t}+ k e^{2 t}.
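Again we can verify that these satisfy the original coupled equations, for arbitrary constants (a NumPy sketch):

import numpy as np

j, k = 2.0, -1.0                   # arbitrary constants
t = np.linspace(0.0, 1.0, 5)

y = j * np.exp(4 * t) - k * np.exp(2 * t)
x = j * np.exp(4 * t) + k * np.exp(2 * t)

# exact derivatives of the closed forms
dy = 4 * j * np.exp(4 * t) - 2 * k * np.exp(2 * t)
dx = 4 * j * np.exp(4 * t) + 2 * k * np.exp(2 * t)

assert np.allclose(dy, 3 * y + x)  # D y = 3y + x
assert np.allclose(dx, y + 3 * x)  # D x = y + 3x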

Problem set

Given the above, solve the following problems (Answers follow to even-numbered questions):

  1. Find y(t) and x(t) where D y(t)=3x(t)+6y(t) and D x(t)=x(t)+4y(t)
  2. Find y(t) and x(t) where D y(t)=2x(t)+2y(t) and D x(t)=y(t)-2x(t)
Answers

For problem 2, form the matrix

A=\begin{pmatrix} 2 & 2 \\ 1 & -2 \end{pmatrix}.

The eigenvalues of this matrix are

\pm\sqrt{6}

and the eigenvectors are

\begin{pmatrix} 2-\sqrt{6} \\ 1 \end{pmatrix}, \begin{pmatrix} 2+\sqrt{6} \\ 1 \end{pmatrix}

So now

\begin{pmatrix} y(t) \\ x(t) \end{pmatrix}=\alpha\begin{pmatrix} 2-\sqrt{6} \\ 1 \end{pmatrix}e^{-\sqrt{6}t}+\beta\begin{pmatrix} 2+\sqrt{6} \\ 1 \end{pmatrix}e^{\sqrt{6}t}

and y(t) and x(t) can be read off by inspection.
