Commutative Algebra/The Cayley–Hamilton theorem and Nakayama's lemma

From Wikibooks, open books for an open world

Determinants within a commutative ring

We shall now derive the notion of a determinant in the setting of a commutative ring.

Definition 7.1 (Determinant):

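Let $R$ be a commutative ring, and let $n \in \mathbb{N}$. A determinant is a function $\det\colon R^{n \times n} \to R$ satisfying the following three axioms:

  1. $\det(I_n) = 1$, where $I_n$ is the $n \times n$ identity matrix.
  2. If $A$ is a matrix such that two adjacent columns are equal, then $\det(A) = 0$.
  3. For each $j \in \{1, \ldots, n\}$ we have $\det(a_1, \ldots, a_{j-1}, \lambda a_j + \mu b_j, a_{j+1}, \ldots, a_n) = \lambda \det(a_1, \ldots, a_j, \ldots, a_n) + \mu \det(a_1, \ldots, b_j, \ldots, a_n)$, where $a_1, \ldots, a_n, b_j \in R^n$ are columns and $\lambda, \mu \in R$.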

We shall later see that there exists exactly one determinant.

Theorem 7.2 (Properties of a (the) determinant):

  1. If $A$ has a column consisting entirely of zeroes, then $\det(A) = 0$.
  2. If $A$ is a matrix, and one adds a multiple of one column to an adjacent column, then $\det(A)$ does not change.
  3. If two adjacent columns of $A$ are exchanged, then $\det(A)$ is multiplied by $-1$.
  4. If any two columns of a matrix $A$ are exchanged, then $\det(A)$ is multiplied by $-1$.
  5. If $A$ is a matrix, and one adds a multiple of one column to any other column, then $\det(A)$ does not change.
  6. If $A$ is a matrix that has two equal columns, then $\det(A) = 0$.
  7. Let $\sigma \in S_n$ be a permutation, where $S_n$ is the $n$-th symmetric group. If $A = (a_1, \ldots, a_n)$, then $\det(a_{\sigma(1)}, \ldots, a_{\sigma(n)}) = \operatorname{sgn}(\sigma) \det(a_1, \ldots, a_n)$.


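Proof:

1. Let $A = (a_1, \ldots, a_n)$, where the $j$-th column $a_j$ is the zero vector. Then by axiom 3 for the determinant, setting $\lambda = \mu = 0$ and $b_j = a_j$,

$$\det(A) = \det(a_1, \ldots, 0 \cdot a_j + 0 \cdot a_j, \ldots, a_n) = 0 \cdot \det(A) + 0 \cdot \det(A) = 0.$$

Alternatively, we may also set $\lambda = \mu = 1$ and $b_j = a_j = 0$ to obtain

$$\det(A) = \det(a_1, \ldots, a_j + a_j, \ldots, a_n) = \det(A) + \det(A),$$

from which the claim follows by subtracting $\det(A)$ from both sides.

These proofs correspond to the proofs of $f(0) = 0$ for a linear map $f$ (in whatever context).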

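2. If we set $b_j = a_{j-1}$ or $b_j = a_{j+1}$ (depending on whether we add the column to the left or the column to the right of the current column), then axiom 3 with $\lambda = 1$ gives us

$$\det(a_1, \ldots, a_{j-1}, a_j + \mu b_j, a_{j+1}, \ldots, a_n) = \det(a_1, \ldots, a_n) + \mu \det(a_1, \ldots, a_{j-1}, b_j, a_{j+1}, \ldots, a_n),$$

where the latter determinant is zero because we have two adjacent equal columns.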

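3. Consider the two matrices $(a_1, \ldots, a_j, a_{j+1}, \ldots, a_n)$ and $(a_1, \ldots, a_{j+1}, a_j, \ldots, a_n)$. By 7.2, 2. and axiom 3 for determinants, we have (writing only the $j$-th and $(j+1)$-th columns)

$$\det(\ldots, a_j, a_{j+1}, \ldots) = \det(\ldots, a_j, a_{j+1} + a_j, \ldots) = \det(\ldots, -a_{j+1}, a_{j+1} + a_j, \ldots) = \det(\ldots, -a_{j+1}, a_j, \ldots) = -\det(\ldots, a_{j+1}, a_j, \ldots).$$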


4. We exchange the $j$-th and $k$-th column, where $j < k$, by first moving the $j$-th column successively to spot $k$ (using $k - j$ swaps of adjacent columns) and then moving the former $k$-th column, which is now one step closer to the $j$-th spot, to spot $j$ using $k - j - 1$ swaps. In total, we used $2(k - j) - 1$ swaps, which is an odd number, and all the other columns are in the same place since they moved once to the left and once to the right. Hence, 4. follows from applying 3. to each swap.

5. Let's say we want to add $\lambda a_k$ to the $j$-th column. Then we first use 4. to put the $j$-th column adjacent to $a_k$, then use 2. to do the addition without change to the determinant, and then use 4. again to put the $j$-th column back in its place. In total, the only change our determinant has suffered was multiplication by $-1$ twice, which cancels even in a general ring.

6. Let's say that the $j$-th column and the $k$-th column are equal, $j \neq k$. Then we subtract column $j$ from column $k$ (or, indeed, the other way round) without change to the determinant, obtain a matrix with a zero column and apply 1.

7. Split $\sigma$ into swaps, use 4. repeatedly and use further that $\operatorname{sgn}\colon S_n \to \{1, -1\}$ is a group homomorphism.

Note that we have only used axioms 2 & 3 for the preceding proof.

The following lemma will allow us to prove the uniqueness of the determinant, and also the formula $\det(AB) = \det(A) \det(B)$.

Lemma 7.3:

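Let $A = (a_{i,j})$ and $B = (b_{i,j})$ be two $n \times n$ matrices with entries in a commutative ring $R$, and let $\det$ be a determinant. Then

$$\det(AB) = \det(A) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n b_{\sigma(j), j}.$$

Proof:

The matrix $AB$ has $k$-th column $\sum_{j=1}^n b_{j,k} a_j$. Hence, by axiom 3 for determinants and theorem 7.2, 7. and 6., we obtain, denoting $A = (a_1, \ldots, a_n)$:

$$\det(AB) = \sum_{j_1, \ldots, j_n = 1}^n b_{j_1, 1} \cdots b_{j_n, n} \det(a_{j_1}, \ldots, a_{j_n}) = \sum_{\sigma \in S_n} b_{\sigma(1), 1} \cdots b_{\sigma(n), n} \operatorname{sgn}(\sigma) \det(A).$$

Here all summands in which two of the indices $j_1, \ldots, j_n$ coincide vanish by 6., and the remaining summands are rewritten using 7.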

Theorem 7.4 (Uniqueness of the determinant):

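For each commutative ring, there is at most one determinant, and if it exists, it equals

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma(j), j}.$$

Proof:

Let $A$ be an arbitrary matrix, and apply lemma 7.3 with the identity matrix $I_n$ in place of $A$ and with $A$ in place of $B$. Then we obtain by axiom 1 for determinants (the first time we use that axiom)

$$\det(A) = \det(I_n A) = \det(I_n) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma(j), j} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma(j), j}.$$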


Theorem 7.5 (Multiplicativity of the determinant):

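If $\det$ is a determinant, then

$$\det(AB) = \det(A) \det(B).$$

Proof:

From lemma 7.3 and theorem 7.4 we may infer

$$\det(AB) = \det(A) \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n b_{\sigma(j), j} = \det(A) \det(B).$$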
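The multiplicativity statement can be checked numerically in a ring with zero divisors. The following is a minimal sketch, assuming the ring $\mathbb{Z}/6\mathbb{Z}$ and the explicit formula from theorem 7.4; the helper names `sign`, `det` and `matmul` and the sample matrices are illustrative choices, not part of the text.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Formula of theorem 7.4: sum of sgn(s) * prod_j a[s(j)][j], reduced modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def matmul(a, b, mod):
    """Matrix product with entries reduced modulo `mod`."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % mod for j in range(n)]
            for i in range(n)]

MOD = 6  # Z/6Z is commutative but not a field (2 * 3 = 0)
A = [[1, 2, 3], [4, 5, 0], [2, 1, 5]]
B = [[2, 0, 1], [3, 3, 5], [1, 4, 2]]

lhs = det(matmul(A, B, MOD), MOD)
rhs = (det(A, MOD) * det(B, MOD)) % MOD
print(lhs == rhs)
```

A single example proves nothing, of course; it only illustrates that the identity does not rely on division.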


Theorem 7.6 (Existence of the determinant):

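Let $R$ be a commutative ring. Then

$$\det\colon R^{n \times n} \to R, \quad \det(A) := \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma(j), j}$$

is a determinant.

Proof:

First of all, $I_n$ has zero entries everywhere except on the diagonal. Hence, if $I_n = (\delta_{i,j})$, then $\prod_{j=1}^n \delta_{\sigma(j), j}$ vanishes except when $\sigma(j) = j$ for all $j$, i.e. $\sigma$ is the identity. Hence $\det(I_n) = 1$.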

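Let now $A$ be a matrix whose $j$-th and $(j+1)$-th columns are equal, and let $\tau \in S_n$ be the swap of $j$ and $j+1$. The function

$$S_n \to S_n, \quad \sigma \mapsto \sigma \circ \tau$$

is bijective, since the inverse is given by itself. Furthermore, since $\sigma \mapsto \sigma \circ \tau$ amounts to composing $\sigma$ with another swap, it is sign reversing. Hence, we have

$$\det(A) = \sum_{\operatorname{sgn}(\sigma) = 1} \left( \prod_{k=1}^n a_{\sigma(k), k} - \prod_{k=1}^n a_{(\sigma \circ \tau)(k), k} \right).$$

Now since the $j$-th and $(j+1)$-th column of $A$ are identical, $\prod_{k=1}^n a_{\sigma(k), k} = \prod_{k=1}^n a_{(\sigma \circ \tau)(k), k}$. Hence $\det(A) = 0$.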

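Linearity follows from the linearity of each summand:

$$\operatorname{sgn}(\sigma) (\lambda a_{\sigma(j), j} + \mu b_{\sigma(j), j}) \prod_{k \neq j} a_{\sigma(k), k} = \lambda \operatorname{sgn}(\sigma) \prod_{k=1}^n a_{\sigma(k), k} + \mu \operatorname{sgn}(\sigma) b_{\sigma(j), j} \prod_{k \neq j} a_{\sigma(k), k}.$$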
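The three axioms of definition 7.1 can also be spot-checked for the sum-over-permutations formula on concrete data. The sketch below assumes the ring $\mathbb{Z}/6\mathbb{Z}$; the helpers `sign`, `det` and `with_column` and all sample values are hypothetical names and numbers chosen for illustration.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Sum over all permutations s of sgn(s) * prod_j a[s(j)][j], modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def with_column(a, j, col):
    """Copy of `a` whose j-th column (0-based) is replaced by `col`."""
    n = len(a)
    return [[col[i] if k == j else a[i][k] for k in range(n)] for i in range(n)]

MOD, n = 6, 3

# Axiom 1: the identity matrix has determinant 1.
identity = [[1 if i == k else 0 for k in range(n)] for i in range(n)]
axiom1 = det(identity, MOD) == 1

# Axiom 2: two equal adjacent columns (here columns 0 and 1) give 0.
axiom2 = det([[1, 1, 5], [2, 2, 3], [4, 4, 0]], MOD) == 0

# Axiom 3: linearity in column j.
A = [[1, 2, 3], [4, 5, 0], [2, 1, 5]]
u, v, lam, mu, j = [3, 1, 2], [5, 0, 4], 2, 5, 1
combo = [(lam * u[i] + mu * v[i]) % MOD for i in range(n)]
lhs = det(with_column(A, j, combo), MOD)
rhs = (lam * det(with_column(A, j, u), MOD) + mu * det(with_column(A, j, v), MOD)) % MOD
axiom3 = lhs == rhs

print(axiom1, axiom2, axiom3)
```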


Theorem 7.7:

The determinant of any matrix equals the determinant of the transpose of that matrix.


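Proof:

Observe that inversion is a bijection on $S_n$, the inverse of which is given by inversion ($(\sigma^{-1})^{-1} = \sigma$). Further observe that $\operatorname{sgn}(\sigma^{-1}) = \operatorname{sgn}(\sigma)$, since we just apply all the transpositions in reverse order. Hence,

$$\det(A^T) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{j, \sigma(j)} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma^{-1}) \prod_{j=1}^n a_{\sigma^{-1}(j), j} = \det(A).$$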
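For a small ring the transpose statement can be verified exhaustively. The following sketch assumes $\mathbb{Z}/4\mathbb{Z}$ and runs through all $2 \times 2$ matrices; `sign`, `det` and `transpose` are illustrative helper names.

```python
from itertools import permutations, product

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Sum over all permutations s of sgn(s) * prod_j a[s(j)][j], modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def transpose(a):
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]

MOD = 4
# Compare det(A) and det(A^T) for all 256 matrices in (Z/4Z)^{2x2}.
all_equal = all(
    det([[a, b], [c, d]], MOD) == det(transpose([[a, b], [c, d]]), MOD)
    for a, b, c, d in product(range(MOD), repeat=4)
)
print(all_equal)
```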


Theorem 7.8 (column expansion):

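Let $A = (a_{i,j})$ be an $n \times n$ matrix over a commutative ring $R$. For $i, j \in \{1, \ldots, n\}$ define $A_{i,j}$ to be the $(n-1) \times (n-1)$ matrix obtained by crossing out the $i$-th row and $j$-th column from $A$. Then for any $j \in \{1, \ldots, n\}$ we have

$$\det(A) = \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}).$$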


Proof 1:

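We prove the theorem from the formula for the determinant given by theorems 7.4 and 7.6.

Let $j \in \{1, \ldots, n\}$ be fixed. For each $i \in \{1, \ldots, n\}$, we define

$$S_n^{(i)} := \{\sigma \in S_n \mid \sigma(j) = i\}.$$

These sets partition $S_n$, so that

$$\det(A) = \sum_{i=1}^n a_{i,j} \sum_{\sigma \in S_n^{(i)}} \operatorname{sgn}(\sigma) \prod_{k \neq j} a_{\sigma(k), k}.$$

Each $\sigma \in S_n^{(i)}$ restricts to a bijection from the columns other than $j$ to the rows other than $i$ and thus corresponds to exactly one summand of $\det(A_{i,j})$; comparing signs, which differ by the factor $(-1)^{i+j}$, shows that the inner sum equals $(-1)^{i+j} \det(A_{i,j})$.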



Proof 2:

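We note that all of the above derivations could have been done with rows instead of columns (which amounts to nothing more than exchanging $a_{\sigma(j), j}$ with $a_{j, \sigma(j)}$ each time), and would have ended up with the same formula for the determinant since

$$\sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{\sigma(j), j} = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{j=1}^n a_{j, \sigma(j)}$$

as argued in theorem 7.7.

Hence, we prove that the function given by the formula $A \mapsto \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j})$ satisfies 1 - 3 of 7.1 with rows instead of columns, and then apply theorem 7.4 with rows instead of columns.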


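1. Set $A = I_n$ to obtain

$$\sum_{i=1}^n (-1)^{i+j} \delta_{i,j} \det((I_n)_{i,j}) = (-1)^{2j} \det(I_{n-1}) = 1.$$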



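2. Let $A$ have two equal adjacent rows, the $k$-th and $(k+1)$-th, say. Then

$$\sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}) = (-1)^{k+j} a_{k,j} \det(A_{k,j}) + (-1)^{k+1+j} a_{k+1,j} \det(A_{k+1,j}) = 0,$$

since each of the $A_{i,j}$ has two equal adjacent rows except for possibly $i = k$ and $i = k+1$, which is why, by theorem 7.6, the determinant is zero in all those cases, and further $a_{k,j} = a_{k+1,j}$ and $A_{k,j} = A_{k+1,j}$, since in both we deleted "the same" row.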


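3. Let $k \in \{1, \ldots, n\}$ and $\lambda, \mu \in R$, let $B = (b_{i,l})$ be a matrix that agrees with $A$ outside the $k$-th row, and define $C = (c_{i,l})$ as the matrix whose $k$-th row is $\lambda$ times the $k$-th row of $A$ plus $\mu$ times the $k$-th row of $B$ and which agrees with $A$ elsewhere. For each $i$ define $B_{i,j}$ and $C_{i,j}$ as the matrices obtained by crossing out the $i$-th row and the $j$-th column from the matrices $B$ and $C$ respectively. Then by theorem 7.6 and axiom 3 for the determinant,

$$\sum_{i=1}^n (-1)^{i+j} c_{i,j} \det(C_{i,j}) = (-1)^{k+j} (\lambda a_{k,j} + \mu b_{k,j}) \det(A_{k,j}) + \sum_{i \neq k} (-1)^{i+j} a_{i,j} \left( \lambda \det(A_{i,j}) + \mu \det(B_{i,j}) \right) = \lambda \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det(A_{i,j}) + \mu \sum_{i=1}^n (-1)^{i+j} b_{i,j} \det(B_{i,j}),$$

where we used $C_{k,j} = A_{k,j} = B_{k,j}$ and $a_{i,j} = b_{i,j}$ for $i \neq k$.

Hence follows linearity by rows.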
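The column expansion can be compared with the sum-over-permutations formula on a concrete matrix. This is a sketch under the assumption that we work in $\mathbb{Z}/6\mathbb{Z}$; `minor` and `expand_along_column` are hypothetical helper names, and indices are 0-based, which leaves the parity of $i + j$ unchanged.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Sum over all permutations s of sgn(s) * prod_j a[s(j)][j], modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def minor(a, i, j):
    """Matrix obtained by crossing out row i and column j (0-based)."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def expand_along_column(a, j, mod):
    """Right-hand side of theorem 7.8 for the fixed column j."""
    n = len(a)
    total = sum((-1) ** (i + j) * a[i][j] * det(minor(a, i, j), mod) for i in range(n))
    return total % mod

MOD = 6
A = [[1, 2, 3], [4, 5, 0], [2, 1, 5]]
expansions = [expand_along_column(A, j, MOD) for j in range(len(A))]
print(det(A, MOD), expansions)
```

Every column yields the same value, as the theorem predicts.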

For the sake of completeness, we also note the following lemma:

Lemma 7.9:

Let $A \in R^{n \times n}$ be an invertible matrix. Then $\det(A)$ is invertible.

Proof:

Indeed, $\det(A) \det(A^{-1}) = \det(A A^{-1}) = \det(I_n) = 1$ due to the multiplicativity of the determinant.

The converse is also true and will be proven in the next subsection.


  • Exercise 7.1.1: Argue that the determinant, seen as a map from the set of all matrices (where scalars are $1 \times 1$-matrices), is idempotent.

Cramer's rule in the general case

Theorem 7.10 (Cramer's rule, solution of linear equations):

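Let $R$ be a commutative ring, let $A \in R^{n \times n}$ be a matrix with entries in $R$ and let $b \in R^n$ be a vector. If $A$ is invertible, the unique solution $x = (x_1, \ldots, x_n)^T$ to $Ax = b$ is given by

$$x_j = \det(A)^{-1} \det(A_j), \quad j \in \{1, \ldots, n\},$$

where $A_j$ is obtained by replacing the $j$-th column of $A$ by $b$.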

Proof 1:

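Let $j \in \{1, \ldots, n\}$ be arbitrary but fixed. The determinant of $A$ is linear in the $j$-th column, and hence constitutes a linear map $\varphi_j\colon R^n \to R$ mapping any vector to the determinant of $A$ with the $j$-th column replaced by that vector. If $a_j$ is the $j$-th column of $A$, then $\varphi_j(a_j) = \det(A)$. Furthermore, if we insert a different column $a_k$, $k \neq j$, into $\varphi_j$, we obtain zero, since we obtain the determinant of a matrix where the column $a_k$ appears twice. We now consider the system of equations

$$\begin{aligned} a_{1,1} x_1 + \cdots + a_{1,n} x_n &= b_1 \\ &\;\;\vdots \\ a_{n,1} x_1 + \cdots + a_{n,n} x_n &= b_n, \end{aligned}$$

where $x$ is the unique solution of the system $Ax = b$, which exists since it is given by $A^{-1} b$ since $A$ is invertible. Since $\varphi_j$ is linear, we find a $1 \times n$ matrix $(c_1, \ldots, c_n)$ such that for all $v \in R^n$

$$\varphi_j(v) = c_1 v_1 + \cdots + c_n v_n;$$

in fact, due to theorem 7.8, $c_i = (-1)^{i+j} \det(A_{i,j})$. We now add up the lines of the linear equation system above in the following way: We take $c_1$ times the first row, add $c_2$ times the second row and so on. Due to our considerations, this yields the result

$$\det(A) x_j = \varphi_j(a_1) x_1 + \cdots + \varphi_j(a_n) x_n = \varphi_j(b) = \det(A_j).$$

Due to lemma 7.9, $\det(A)$ is invertible. Hence, we get

$$x_j = \det(A)^{-1} \det(A_j)$$

and hence the theorem.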

Proof 2:

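For all $j \in \{1, \ldots, n\}$, we define the matrix

$$X_j := (e_1, \ldots, e_{j-1}, x, e_{j+1}, \ldots, e_n);$$

this matrix shall represent a unit matrix, where the $j$-th column is replaced by the vector $x$. By expanding the $j$-th column, we find that the determinant of this matrix is given by $\det(X_j) = x_j$.

We now note that if $Ax = b$, then $A X_j = A_j$. Hence

$$x_j = \det(X_j) = \det(A^{-1} A_j) = \det(A^{-1}) \det(A_j) = \det(A)^{-1} \det(A_j),$$

where the last equality follows as in lemma 7.9.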
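Cramer's rule needs only the inverse of $\det(A)$, so it works over rings that are not fields. The sketch below assumes $R = \mathbb{Z}/10\mathbb{Z}$ and a sample matrix whose determinant is a unit there; `replace_column` and `cramer_solve` are hypothetical helper names, and `pow(d, -1, mod)` requires Python 3.8 or later.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Sum over all permutations s of sgn(s) * prod_j a[s(j)][j], modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def replace_column(a, j, col):
    """The matrix A_j: copy of `a` with the j-th column replaced by `col`."""
    n = len(a)
    return [[col[i] if k == j else a[i][k] for k in range(n)] for i in range(n)]

def cramer_solve(a, b, mod):
    """x_j = det(A)^(-1) * det(A_j); raises ValueError if det(A) is not a unit."""
    d_inv = pow(det(a, mod), -1, mod)
    return [(d_inv * det(replace_column(a, j, b), mod)) % mod for j in range(len(a))]

MOD = 10
A = [[3, 6, 9], [0, 1, 4], [5, 6, 0]]  # determinant 3 over Z, a unit modulo 10
b = [1, 2, 3]

x = cramer_solve(A, b, MOD)
# Residual A x - b, which should vanish modulo 10.
residual = [(sum(A[i][k] * x[k] for k in range(3)) - b[i]) % MOD for i in range(3)]
print(x, residual)
```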

Theorem 7.11 (Cramer's rule, matrix inversion):

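Let $A$ be an $n \times n$ matrix with entries in a ring $R$. We recall that the cofactor matrix $\operatorname{cof}(A)$ of $A$ is the matrix with $(i,j)$-th entry

$$\operatorname{cof}(A)_{i,j} := (-1)^{i+j} \det(A_{i,j}),$$

where $A_{i,j}$ is obtained from $A$ by crossing out the $i$-th row and $j$-th column. We further recall that the adjugate matrix $\operatorname{adj}(A)$ was given by

$$\operatorname{adj}(A) := \operatorname{cof}(A)^T.$$

With this definition, we have

$$\operatorname{adj}(A) A = \det(A) I_n.$$

In particular, if $\det(A)$ is a unit within $R$, then $A$ is invertible and

$$A^{-1} = \det(A)^{-1} \operatorname{adj}(A).$$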



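Proof:

For $j \in \{1, \ldots, n\}$, we set $e_j := (0, \ldots, 0, 1, 0, \ldots, 0)^T$, where the one is at the $j$-th place. Further, we set $\varphi_j$ to be the linear function from proof 1 of theorem 7.10 (mapping a vector to the determinant of $A$ with the $j$-th column replaced by that vector), and $C_j$ its $1 \times n$ matrix. Then $C_j$ is given by

$$C_j = (\varphi_j(e_1), \ldots, \varphi_j(e_n)) = \left( (-1)^{1+j} \det(A_{1,j}), \ldots, (-1)^{n+j} \det(A_{n,j}) \right)$$

due to theorem 7.8; that is, $C_j$ is the $j$-th row of $\operatorname{adj}(A)$. Hence, denoting the columns of $A$ by $a_1, \ldots, a_n$, the $(j,k)$-th entry of $\operatorname{adj}(A) A$ is

$$C_j a_k = \varphi_j(a_k) = \begin{cases} \det(A) & j = k \\ 0 & j \neq k, \end{cases}$$

where we used the properties of $\varphi_j$ established in proof 1 of theorem 7.10.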
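The identity $\operatorname{adj}(A) A = \det(A) I_n$ holds whether or not $\det(A)$ is a unit. The following sketch checks it in $\mathbb{Z}/6\mathbb{Z}$ for a sample matrix that is not invertible there; the helper names `minor`, `adjugate` and `matmul` are assumptions made for this illustration.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def det(a, mod):
    """Sum over all permutations s of sgn(s) * prod_j a[s(j)][j], modulo `mod`."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[perm[j]][j]
        total += term
    return total % mod

def minor(a, i, j):
    """Matrix obtained by crossing out row i and column j (0-based)."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c != j] for r in range(n) if r != i]

def adjugate(a, mod):
    """Transpose of the cofactor matrix: adj[j][i] = (-1)^(i+j) * det(A_{i,j})."""
    n = len(a)
    return [[((-1) ** (i + j) * det(minor(a, i, j), mod)) % mod for i in range(n)]
            for j in range(n)]

def matmul(a, b, mod):
    """Matrix product with entries reduced modulo `mod`."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % mod for j in range(n)]
            for i in range(n)]

MOD = 6
A = [[1, 2, 3], [4, 5, 0], [2, 1, 5]]  # det(A) = 3 in Z/6Z, which is not a unit
d = det(A, MOD)
product = matmul(adjugate(A, MOD), A, MOD)
expected = [[d if i == j else 0 for j in range(3)] for i in range(3)]
print(product == expected)
```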

The theorems

Now we may finally apply the machinery we have set up to prove the following two fundamental theorems.

Theorem 7.12 (the Cayley–Hamilton theorem):

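Let $M$ be a finitely generated $R$-module, let $\phi\colon M \to M$ be a module morphism and let $I$ be an ideal of $R$ such that $\phi(M) \subseteq IM$. Then there exist $n \in \mathbb{N}$ and $a_0, \ldots, a_{n-1} \in I$ such that

$$\phi^n + a_{n-1} \phi^{n-1} + \cdots + a_1 \phi + a_0 = 0;$$

this equation is to be read as

$$\forall m \in M: \quad \phi^n(m) + a_{n-1} \phi^{n-1}(m) + \cdots + a_1 \phi(m) + a_0 m = 0,$$

where $\phi^k$ means applying $\phi$ to $m$ $k$ times.

Note that the polynomial in $\phi$ is monic, that is, the leading coefficient is $1$, the unit of the ring in question.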

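Proof: Assume that $\{m_1, \ldots, m_n\}$ is a generating set for $M$. Since $\phi(M) \subseteq IM$, we may write

$$\phi(m_j) = \sum_{k=1}^n c_{j,k} m_k, \quad j \in \{1, \ldots, n\}, \qquad (*)$$

where $c_{j,k} \in I$ for each $j, k \in \{1, \ldots, n\}$. We now define a new commutative ring as follows:

$$\tilde{R} := \{ r_d \phi^d + \cdots + r_1 \phi + r_0 \mid d \in \mathbb{N}_0, \ r_0, \ldots, r_d \in R \},$$

where we regard each element $r$ of $R$ as the endomorphism $m \mapsto rm$ on $M$. That is, $\tilde{R}$ is a subring of the endomorphism ring of $M$ (that is, multiplication is given by composition). Since $\phi$ is $R$-linear, $\tilde{R}$ is commutative.

Now to every $n \times n$ matrix $A = (A_{j,k})$ with entries in $\tilde{R}$ we may associate a function

$$f_A\colon M^n \to M^n, \quad (x_1, \ldots, x_n)^T \mapsto \left( \sum_{k=1}^n A_{1,k}(x_k), \ldots, \sum_{k=1}^n A_{n,k}(x_k) \right)^T.$$

By exploiting the linearities of all functions involved, it is easy to see that for another matrix with entries in $\tilde{R}$ called $B$, the associated function of $AB$ equals the composition of the associated functions of $A$ and $B$; that is, $f_{AB} = f_A \circ f_B$.

Now with this in mind, we may rewrite the system (*) as follows:

$$f_C((m_1, \ldots, m_n)^T) = 0,$$

where $C$ has $(j,k)$-th entry $\delta_{j,k} \phi - c_{j,k}$. Now define $\Delta := \det(C) \in \tilde{R}$. From Cramer's rule (theorem 7.11) we obtain that

$$\operatorname{adj}(C) C = \Delta I_n,$$

which is why

$$(\Delta(m_1), \ldots, \Delta(m_n))^T = f_{\Delta I_n}((m_1, \ldots, m_n)^T) = f_{\operatorname{adj}(C)}\left( f_C((m_1, \ldots, m_n)^T) \right) = 0$$, the zero vector.

Hence, $\Delta$ is the zero mapping, since it sends all generators to zero. Now further, as can be seen e.g. from the representation given in theorem 7.4, it has the form

$$\Delta = \phi^n + a_{n-1} \phi^{n-1} + \cdots + a_1 \phi + a_0$$

for suitable $a_0, \ldots, a_{n-1} \in I$.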
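In the classical special case $M = R^n$, $I = R$ and $\phi(m) = Am$ for a square matrix $A$, the determinant constructed in the proof is the characteristic polynomial $\det(xI_n - A)$ evaluated at $\phi$. The following sketch assumes $R = \mathbb{Z}$ and represents polynomials as coefficient lists with the lowest degree first; all function names are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for k in range(i + 1, n) if perm[i] > perm[k])
    return -1 if inversions % 2 else 1

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for k, qk in enumerate(q):
            out[i + k] += pi * qk
    return out

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(size)]

def char_poly(a):
    """Coefficients of det(x*I - A), computed with the formula of theorem 7.4."""
    n = len(a)
    total = [0]
    for perm in permutations(range(n)):
        term = [sign(perm)]
        for j in range(n):
            i = perm[j]
            # Entry (i, j) of x*I - A as a polynomial in x.
            entry = [-a[i][j], 1] if i == j else [-a[i][j]]
            term = poly_mul(term, entry)
        total = poly_add(total, term)
    return total

def matmul(a, b):
    """Integer matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def evaluate_at_matrix(coeffs, a):
    """Compute sum_k coeffs[k] * A^k."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0
    for c in coeffs:
        result = [[result[i][j] + c * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, a)
    return result

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
p = char_poly(A)
value = evaluate_at_matrix(p, A)
print(p, value)
```

The last coefficient of `p` is the leading one, so the polynomial is monic, and `value` is the zero matrix.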

Theorem 7.13 (Nakayama's lemma):

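Let $R$ be a ring, $M$ a finitely generated $R$-module and $I \le R$ an ideal such that $IM = M$. Then there exists an $a \in I$ such that $(1 + a) M = 0$.

Proof:

Choose $\phi = \operatorname{id}_M$ in theorem 7.12, which is possible since $\operatorname{id}_M(M) = M = IM$, to obtain for $m \in M$ that

$$0 = (1 + a_{n-1} + \cdots + a_1 + a_0) m$$

for suitable $a_0, \ldots, a_{n-1} \in I$, since the identity is idempotent. Setting $a := a_{n-1} + \cdots + a_0 \in I$ gives the claim.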
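A small instance makes the statement concrete. The sketch below assumes $R = \mathbb{Z}$, $M = \mathbb{Z}/15\mathbb{Z}$ and $I = 4\mathbb{Z}$ (illustrative choices, not taken from the text); it first confirms $IM = M$ and then searches for a positive $a \in I$ with $(1 + a)M = 0$.

```python
N = 15    # M = Z/15Z, viewed as a module over R = Z
GEN = 4   # I = 4Z, the ideal generated by 4

M = set(range(N))

# Since I is principal and M is cyclic, the products (4k) * m already fill IM here.
IM = {(GEN * k * m) % N for k in range(N) for m in M}
hypothesis_holds = IM == M

# Search the positive multiples of 4 for an element a with (1 + a) * m = 0 for all m.
a = next(
    GEN * k
    for k in range(1, N + 1)
    if all(((1 + GEN * k) * m) % N == 0 for m in M)
)
print(hypothesis_holds, a, 1 + a)
```

Here the search returns $a = 44$, and $1 + a = 45$ is a multiple of $15$, so it annihilates $M$.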