Objects and morphisms

Basics

Definition 1.1 (categories):

A category ${\mathcal {C}}$ is a collection of objects together with morphisms, which go from an object $a\in {\mathcal {C}}$ to an object $b\in {\mathcal {C}}$ each (where $a$ is called the domain and $b$ the codomain), such that

Any morphism $f:a\to b$ can be composed with any morphism $g:b\to c$ , and the composition of the two is a morphism $g\circ f:a\to c$ .
Composition is associative: $h\circ (g\circ f)=(h\circ g)\circ f$ whenever these compositions are defined.
For each $a\in {\mathcal {C}}$ , there exists a morphism $1_{a}:a\to a$ such that for any morphism $f:a\to b$ we have $f\circ 1_{a}=f$ and for any morphism $g:b\to a$ we have $1_{a}\circ g=g$ .

Examples 1.2:

The collection of all groups together with group homomorphisms as morphisms is a category.
The collection of all rings together with ring homomorphisms is a category.
Sets together with ordinary functions form the category of sets.

To every category we may associate an opposite category:

Definition 1.3 (opposite categories):

Let ${\mathcal {C}}$ be a category. The opposite category of ${\mathcal {C}}$ is the category consisting of the objects of ${\mathcal {C}}$ , but with all morphisms inverted: the domain of each morphism of the opposite category is defined to be the codomain of the original morphism, and its codomain is defined to be the domain of the original morphism.

For instance, within the opposite category of sets, a function $f:S\to T$ (where $S$ , $T$ are sets) is a morphism $T\to S$ .

Algebraic objects within category theory

A category is such a general object that some important algebraic structures arise as special cases. For instance, consider a category with one object. Then the morphisms of this category form a monoid with composition as its operation (the identity morphism is the neutral element). On the other hand, if we are given an arbitrary monoid, we can define the elements of that monoid to be the morphisms from a single object to itself, with composition given by the monoid operation, and thus have found a representation of that monoid as a category with one object.

If we are given a category with one object, and the morphisms all happen to be invertible, then we have in fact a group structure. And further, just as described for monoids, we can turn every group into a category.
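The correspondence just described can be tried out on a small example. The following is only a sketch under our own assumptions: we pick the additive group $\mathbb {Z} /4\mathbb {Z}$ as the monoid, and the names `MORPHISMS`, `compose`, `is_category` and `all_invertible` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Assumption of this sketch: the monoid (Z/4Z, +) viewed as a category
# with a single object "*", whose morphisms are the monoid elements.
MORPHISMS = [0, 1, 2, 3]   # every morphism goes from "*" to "*"
IDENTITY = 0               # the identity morphism 1_* is the neutral element


def compose(g, f):
    """Composition g . f is the monoid operation (addition mod 4)."""
    return (g + f) % 4


def is_category():
    """Check the identity and associativity laws for all morphisms."""
    units = all(
        compose(f, IDENTITY) == f == compose(IDENTITY, f) for f in MORPHISMS
    )
    assoc = all(
        compose(h, compose(g, f)) == compose(compose(h, g), f)
        for f, g, h in product(MORPHISMS, repeat=3)
    )
    return units and assoc


def all_invertible():
    """The one-object category is a group iff every morphism has an inverse."""
    return all(
        any(compose(g, f) == IDENTITY == compose(f, g) for g in MORPHISMS)
        for f in MORPHISMS
    )
```

Calling `is_category()` checks the category axioms, and `all_invertible()` checks the additional condition that turns the monoid into a group.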

Special types of morphisms

The following notions in category theory may have been inspired by phenomena that occur within the category of sets and similar categories.

In the category of sets, we have surjective functions and injective functions. We may characterise those as follows:

Theorem 1.4:

Let $X,Y$ be sets and $f:X\to Y$ be a function. Then:

$f$ is surjective if and only if for all sets $Z$ and functions $g,h:Y\to Z$ , $g\circ f=h\circ f$ implies $g=h$ .
$f$ is injective if and only if for all sets $W$ and functions $g,h:W\to X$ , $f\circ g=f\circ h$ implies $g=h$ .

Proof:

We begin with the characterisation of surjectivity.

$\Rightarrow$ : Let $f$ be surjective, and let $g\circ f=h\circ f$ . Let $y\in Y$ be arbitrary. Since $f$ is surjective, we may choose $x\in X$ such that $f(x)=y$ . Then we have $g(y)=g(f(x))=h(f(x))=h(y)$ . Since $y\in Y$ was arbitrary, $g=h$ .

$\Leftarrow$ : Assume that for all sets $Z$ and functions $g,h:Y\to Z$ $g\circ f=h\circ f$ implies $g=h$ . Assume for contradiction that $f$ isn't surjective. Then there exists $y_{0}\in Y$ outside the image of $f$ . Let $Z=\{1,2\}$ . We define $g,h:Y\to Z$ as follows:

g(y)=1\forall y\in Y

,

h(y)={\begin{cases}1&y\neq y_{0}\\2&y=y_{0}\end{cases}}

.

Then $g\circ f=h\circ f=1$ (since $y_{0}$ , the only place where the second function might be $2$ , is never hit by $f$ ), but $g\neq h$ .

Now we prove the characterisation of injectivity.

$\Rightarrow$ : Let $f$ be injective, let $W$ be another set and let $g,h:W\to X$ be two functions such that $f\circ g=f\circ h$ . Assume that $g(w)\neq h(w)$ for a certain $w\in W$ . Then $f(g(w))\neq f(h(w))$ due to the injectivity of $f$ , contradiction.

$\Leftarrow$ : Assume that for all sets $W$ and functions $g,h:W\to X$ , $f\circ g=f\circ h$ implies $g=h$ . Let $x,y\in X$ be arbitrary such that $f(x)=f(y)$ . Take $W=\{1\}$ and $g(1)=x,h(1)=y$ . Then $f\circ g=f\circ h$ , hence $g=h$ by assumption, so that $x=g(1)=h(1)=y$ ; this proves injectivity. $\Box$
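For small finite sets, theorem 1.4 can be confirmed by brute force. The sketch below is not a proof; it only enumerates all functions between two sets of our own choosing and compares the element-wise properties with the cancellation properties. As in the proof, the test sets $Z=\{1,2\}$ and $W=\{1\}$ suffice. All function names are hypothetical helpers.

```python
from itertools import product


def functions(domain, codomain):
    """All functions domain -> codomain, each encoded as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]


def compose(g, f):
    """Return g . f as a dict (apply f first, then g)."""
    return {x: g[f[x]] for x in f}


def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)


def is_injective(f):
    return len(set(f.values())) == len(f)


def right_cancellable(f, codomain, test_set):
    """g . f == h . f implies g == h, for all g, h: codomain -> test_set."""
    maps = functions(codomain, test_set)
    return all(g == h for g in maps for h in maps
               if compose(g, f) == compose(h, f))


def left_cancellable(f, domain, test_set):
    """f . g == f . h implies g == h, for all g, h: test_set -> domain."""
    maps = functions(test_set, domain)
    return all(g == h for g in maps for h in maps
               if compose(f, g) == compose(f, h))


X, Y = [0, 1, 2], ["a", "b"]   # sample sets, chosen arbitrarily
Z, W = [1, 2], [1]             # the test sets used in the proof

agrees = all(
    is_surjective(f, Y) == right_cancellable(f, Y, Z)
    and is_injective(f) == left_cancellable(f, X, W)
    for f in functions(X, Y)
)
```

Here `agrees` records whether both characterisations coincide for every function $X\to Y$ in this sample.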

It is interesting that passing from surjectivity to injectivity swapped the use of indirect proof from the $\Leftarrow$ -direction to the $\Rightarrow$ -direction.

Since in the characterisation of injectivity and surjectivity given by the last theorem there is no mention of elements of sets any more, we may generalise those concepts to category theory.

Definition 1.5:

Let ${\mathcal {C}}$ be a category, and let $f$ be a morphism of ${\mathcal {C}}$ . We say that

$f:X\to Y$ is an epimorphism if and only if for all objects $Z$ of ${\mathcal {C}}$ and all morphisms $g,h:Y\to Z$ , $g\circ f=h\circ f\Rightarrow g=h$ , and
$f:X\to Y$ is a monomorphism if and only if for all objects $W$ of ${\mathcal {C}}$ and all morphisms $g,h:W\to X$ , $f\circ g=f\circ h\Rightarrow g=h$ .

Exercises

Exercise 1.3.1: Come up with a category ${\mathcal {C}}$ , where the objects are some finitely many sets, such that there exists an epimorphism that is not surjective, and a monomorphism that is not injective (Hint: Include few morphisms).

Terminal, initial and zero objects and zero morphisms

Within many categories, such as groups, rings, modules,... (but not fields), there exist some sort of "trivial" objects which are the simplest possible; for instance, in the category of groups, there is the trivial group, consisting only of the identity. Indeed, within the category of groups, the trivial group has the following property:

Theorem 1.6:

Let $|G|=1$ and let $H$ be another group. Then there exists exactly one homomorphism $f:H\to G$ and exactly one homomorphism $g:G\to H$ .

Furthermore, if ${\tilde {G}}$ is any other group with the property that for every other group $H$ , there exists exactly one homomorphism ${\tilde {G}}\to H$ and exactly one homomorphism $H\to {\tilde {G}}$ , then $|{\tilde {G}}|=1$ .

Proof: We begin with the first part. Let $f:H\to G$ be a homomorphism, where $|G|=1$ . Then $f$ must take the value of the one element of $G$ everywhere and is thus uniquely determined. If furthermore $g:G\to H$ is a homomorphism and $\iota$ denotes the single element of $G$ (which is its identity), then by the homomorphism property we must have $g(\iota )=1_{H}$ (otherwise obtain a contradiction by taking a power of $\iota$ ), so that $g$ is uniquely determined as well.

Assume now that $|{\tilde {G}}|>1$ , and let $\tau$ be an element within ${\tilde {G}}$ that does not equal the identity. Let $n:={\text{ord}}\tau$ (if $\tau$ has infinite order, replace $Z_{n}$ by $\mathbb {Z}$ in what follows). We define a homomorphism $f:Z_{n}\to {\tilde {G}}$ by $f(k):=\tau ^{k}$ . In addition to that homomorphism, we also have the trivial homomorphism $Z_{n}\to {\tilde {G}}$ , which differs from $f$ since $f(1)=\tau$ is not the identity. Hence, we don't have uniqueness. $\Box$

Using the characterisation given by theorem 1.6, we may generalise this concept into the language of category theory.

Definition 1.7:

Let ${\mathcal {C}}$ be a category. A zero object of ${\mathcal {C}}$ is an object $Z$ of ${\mathcal {C}}$ such that for all other objects $X,Y$ of ${\mathcal {C}}$ there exist unique morphisms $f:X\to Z$ and $g:Z\to Y$ .

Within many usual categories, such as groups (as shown above), but also rings and modules, there exist zero objects. However, not so within the category of sets. Indeed, let $S$ be an arbitrary set. If $|S|\geq 2$ , then from any nonempty set there exist at least 2 morphisms with codomain $S$ , namely two different constant functions. If $|S|=1$ , we may pick a set $T$ with $|T|\geq 2$ and obtain at least two morphisms from $S$ to $T$ . If $S=\emptyset$ , then there does not exist a function $T\to S$ for any nonempty set $T$ .

But, if we split definition 1.7 in half, each half can be found within the category of sets.

Definition 1.8:

Let ${\mathcal {C}}$ be a category. An object $X$ of ${\mathcal {C}}$ is called

terminal iff for every other object $Y$ of ${\mathcal {C}}$ there exists exactly one morphism $Y\to X$ ;
initial iff for every other object $Y$ of ${\mathcal {C}}$ there exists exactly one morphism $X\to Y$ .

In the category of sets, there exists one initial object and infinitely many terminal objects. The initial object is the empty set; the argument following definition 1.7 shows that this is the only remaining option, and it is a valid one because the only morphism from the empty set to any other set is the empty function. Furthermore, every set with exactly one element is a terminal object, since every morphism mapping to that set is the constant function with value the single element of that set. Hence, by generalizing the concept of a zero object in two different directions, we have obtained a fine description for the symmetry breaking at the level of sets.
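The counting behind this discussion can be replayed for a few finite sets. The following sketch rests on an assumption of ours: instead of all sets, we only test against the small family `TEST_SETS`, so it illustrates the argument rather than proving it. The helper names are hypothetical.

```python
from itertools import product


def count_functions(domain, codomain):
    """Number of functions domain -> codomain, by explicit enumeration."""
    return sum(1 for _ in product(list(codomain), repeat=len(domain)))


# A small, arbitrarily chosen family of finite test sets.
TEST_SETS = [set(), {0}, {0, 1}, {"x", "y", "z"}]


def is_initial(s):
    """Exactly one function from s into every test set."""
    return all(count_functions(s, t) == 1 for t in TEST_SETS)


def is_terminal(s):
    """Exactly one function from every test set into s."""
    return all(count_functions(t, s) == 1 for t in TEST_SETS)


def is_zero(s):
    """A zero object would have to be initial and terminal at once."""
    return is_initial(s) and is_terminal(s)
```

With these helpers, `is_initial(set())` and `is_terminal({0})` hold, while `is_zero` fails for every member of `TEST_SETS`.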

Now returning to the category of groups, between any two groups there also exists a particularly trivial homomorphism, namely the zero homomorphism. We shall also elevate this concept to the level of categories. The following theorem is immediate:

Theorem 1.9:

Let $T$ be the trivial group, and let $H$ and $G$ be any two groups. If $f:H\to T$ and $g:T\to G$ are homomorphisms, then $g\circ f$ is the trivial homomorphism.

Now we may proceed to the categorical definition of a zero morphism. It is only defined for categories that have a zero object. (There exists a more general definition, but it shall be of no use to us during the course of this book.)

Definition 1.10:

Let ${\mathcal {C}}$ be a category with a zero object $Z$ , and let $X,Y$ be objects of that category. Then the zero morphism from $X$ to $Y$ is defined as the composition of the two unique morphisms $X\to Z$ and $Z\to Y$ .

Functors, natural transformations, universal arrows

Functors

Definitions

There are two types of functors, covariant functors and contravariant functors. Often, a covariant functor is simply called a functor.

Definition 2.1:

Let ${\mathcal {C}},{\mathcal {D}}$ be two categories. A covariant functor $F:{\mathcal {C}}\to {\mathcal {D}}$ associates

to each object $A$ of ${\mathcal {C}}$ an object $F(A)$ of ${\mathcal {D}}$ , and
to each morphism $f:A\to B$ in ${\mathcal {C}}$ a morphism $F(f):F(A)\to F(B)$ ,

such that the following rules are satisfied:

For all objects $A$ of ${\mathcal {C}}$ we have $F(1_{A})=1_{F(A)}$ , and
for all morphisms $f:A\to B$ and $g:B\to C$ of ${\mathcal {C}}$ we have $F(g\circ f)=F(g)\circ F(f)$ .
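A programming analogue may help to see the two rules at work. In the sketch below we assume, purely for illustration, that Python types play the role of objects and Python functions the role of morphisms; the "list functor" sends a type to lists over it and a function to its element-wise application. The check is run on one sample only and is not a proof.

```python
def fmap(f):
    """Action of the (hypothetical) list functor on a morphism f: A -> B."""
    return lambda xs: [f(x) for x in xs]


def identity(x):
    return x


def compose(g, f):
    """Return g . f (apply f first, then g)."""
    return lambda x: g(f(x))


def length(s):    # a morphism f: str -> int
    return len(s)


def double(n):    # a morphism g: int -> int
    return 2 * n


sample = ["a", "bcd", ""]

# Rule 1: F(1_A) = 1_{F(A)}
preserves_identity = fmap(identity)(sample) == identity(sample)

# Rule 2: F(g . f) = F(g) . F(f)
preserves_composition = (
    fmap(compose(double, length))(sample)
    == compose(fmap(double), fmap(length))(sample)
)
```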

Definition 2.2:

Let ${\mathcal {C}},{\mathcal {D}}$ be two categories. A contravariant functor $F:{\mathcal {C}}\to {\mathcal {D}}$ associates

to each object $A$ of ${\mathcal {C}}$ an object $F(A)$ of ${\mathcal {D}}$ , and
to each morphism $f:A\to B$ in ${\mathcal {C}}$ a morphism $F(f):F(B)\to F(A)$ ,

such that the following rules are satisfied:

For all objects $A$ of ${\mathcal {C}}$ we have $F(1_{A})=1_{F(A)}$ , and
for all morphisms $f:A\to B$ and $g:B\to C$ of ${\mathcal {C}}$ we have $F(g\circ f)=F(f)\circ F(g)$ .
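The reversal of arrows can be illustrated in the same programming analogy (again an assumption of ours, not part of the definition): send a type $A$ to the predicates on $A$ , and a function $f:A\to B$ to precomposition with $f$ , which turns predicates on $B$ into predicates on $A$ . The names below are hypothetical.

```python
def pre(f):
    """Contravariant action: f: A -> B becomes F(f): (B -> bool) -> (A -> bool)."""
    return lambda predicate: (lambda a: predicate(f(a)))


def compose(g, f):
    """Return g . f (apply f first, then g)."""
    return lambda x: g(f(x))


def length(s):    # f: str -> int
    return len(s)


def double(n):    # g: int -> int
    return 2 * n


def is_big(n):    # a predicate on int, that is, an element of F(int)
    return n > 3


samples = ["", "a", "bcd"]

# F(g . f) applied to is_big, evaluated on the samples
lhs = [pre(compose(double, length))(is_big)(s) for s in samples]
# (F(f) . F(g)) applied to is_big, evaluated on the samples
rhs = [compose(pre(length), pre(double))(is_big)(s) for s in samples]
```

Both lists agree, reflecting $F(g\circ f)=F(f)\circ F(g)$ on this sample.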

Forgetful functors

I'm not sure if there is a precise definition of a forgetful functor, but in fact, believe it or not, the notion is easily explained in terms of a few examples.

Example 2.3:

Consider the category of groups with homomorphisms as morphisms. We may define a functor sending each group to its underlying set and each homomorphism to itself as a function. This is a functor from the category of groups to the category of sets. Since the target objects of that functor lack the group structure, the group structure has been forgotten, and hence we are dealing with a forgetful functor here.

Example 2.4:

Consider the category of rings. Remember that each ring is an Abelian group with respect to addition. Hence, we may define a functor from the category of rings to the category of groups, sending each ring to the underlying group. This is also a forgetful functor; one which forgets the multiplication of the ring.

Natural transformations

Definition 2.5:

Let ${\mathcal {C}},{\mathcal {D}}$ be categories, and let $F,G:{\mathcal {C}}\to {\mathcal {D}}$ be two functors. A natural transformation from $F$ to $G$ is a family of morphisms in ${\mathcal {D}}$ $\eta _{X}:F(X)\to G(X)$ , where $X$ ranges over all objects of ${\mathcal {C}}$ , that are compatible with the images of morphisms $f:X\to Y$ of ${\mathcal {C}}$ by the functors $F$ and $G$ ; that is, for every morphism $f:X\to Y$ of ${\mathcal {C}}$ we have $\eta _{Y}\circ F(f)=G(f)\circ \eta _{X}$ (the square formed by these four morphisms commutes).

Example 2.6:

Let ${\mathcal {C}}$ be the category of all fields and ${\mathcal {D}}$ the category of all rings. We define a functor

F:{\mathcal {C}}\to {\mathcal {D}}

as follows: Each object $\mathbb {F}$ of ${\mathcal {C}}$ shall be sent to the ring $R_{\mathbb {F} }$ consisting of addition and multiplication inherited from the field, and whose underlying set are the elements

S_{\mathbb {F} }:=\{\overbrace {1_{\mathbb {F} }+1_{\mathbb {F} }+\cdots +1_{\mathbb {F} }} ^{n{\text{ times}}}|n\in \mathbb {N} _{0}\}\cup \{\overbrace {-1_{\mathbb {F} }-1_{\mathbb {F} }-\cdots -1_{\mathbb {F} }} ^{n{\text{ times}}}|n\in \mathbb {N} \}

,

where $1_{\mathbb {F} }$ is the unit of the field $\mathbb {F}$ . Any morphism $f:\mathbb {F} \to \mathbb {G}$ of fields shall be mapped to the restriction $f\upharpoonright _{S_{\mathbb {F} }}$ ; note that this is well-defined (that is, maps to the object associated to $\mathbb {G}$ under the functor $F$ ), since both

f(1_{\mathbb {F} }+1_{\mathbb {F} }+\cdots +1_{\mathbb {F} })=f(1_{\mathbb {F} })+f(1_{\mathbb {F} })+\cdots +f(1_{\mathbb {F} })=1_{\mathbb {G} }+1_{\mathbb {G} }+\cdots +1_{\mathbb {G} }

and

f(-1_{\mathbb {F} }-1_{\mathbb {F} }-\cdots -1_{\mathbb {F} })=-f(1_{\mathbb {F} })-f(1_{\mathbb {F} })-\cdots -f(1_{\mathbb {F} })=-1_{\mathbb {G} }-1_{\mathbb {G} }-\cdots -1_{\mathbb {G} }

,

where $1_{\mathbb {G} }$ is the unit of the field $\mathbb {G}$ .

We further define a functor

G:{\mathcal {C}}\to {\mathcal {D}}

,

sending each field $\mathbb {F}$ to its associated prime field $\mathbb {F} _{\text{prime}}$ , seen as a ring, and again restricting morphisms, that is sending each morphism $f:\mathbb {F} \to \mathbb {G}$ to $f\upharpoonright _{\mathbb {F} _{\text{prime}}}$ (this is well-defined by the same computations as above and noting that $f$ , being a field morphism, maps inverses to inverses).

In this setting, the maps

\eta _{\mathbb {F} }:R_{\mathbb {F} }\to \mathbb {F} _{\text{prime}}

,

given by inclusion, form a natural transformation from $F$ to $G$ ; this follows from checking the commutative diagram directly.
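The naturality condition of definition 2.5, $\eta _{Y}\circ F(f)=G(f)\circ \eta _{X}$ , can also be checked mechanically in a toy setting. The sketch below does not model the fields of example 2.6; as an assumption of ours, both functors act on Python lists by element-wise application, and the component `eta` keeps at most the first entry of a list. All names are hypothetical.

```python
def fmap(f):
    """Both functors act on a morphism f by mapping it over a Python list."""
    return lambda xs: [f(x) for x in xs]


def eta(xs):
    """Component of the transformation: keep at most the first entry."""
    return xs[:1]


def naturality_holds(f, xs):
    """Check G(f) . eta_X == eta_Y . F(f) on the single input xs."""
    return fmap(f)(eta(xs)) == eta(fmap(f)(xs))


# Check the naturality square for f = len on a few sample inputs.
checks = [naturality_holds(len, xs) for xs in ([], ["a"], ["ab", "", "cde"])]
```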

Universal arrows

Definition 2.7 (universal arrows):

Let ${\mathcal {C}},{\mathcal {D}}$ be categories, let $F:{\mathcal {C}}\to {\mathcal {D}}$ be a functor, let $Y$ be an object of ${\mathcal {D}}$ . A universal arrow from $Y$ to $F$ is a morphism $g:Y\to F(X)$ , where $X$ is a fixed object of ${\mathcal {C}}$ , such that for any other object $Z$ of ${\mathcal {C}}$ and morphism $h:Y\to F(Z)$ there exists a unique morphism $f:X\to Z$ such that $F(f)\circ g=h$ ; that is, the triangle formed by $g$ , $h$ and $F(f)$ commutes.

Kernels, cokernels, products, coproducts

Kernels

Definition 3.1:

Let ${\mathcal {C}}$ be a category with zero objects, and let $f:a\to b$ be a morphism between two objects $a,b$ of ${\mathcal {C}}$ . A kernel of $f$ is an arrow $k:o_{k}\to a$ , where $o_{k}$ is what we shall call the object associated to the kernel $k$ , such that

$f\circ k=0_{o_{k},b}$ , and
for each object $z$ of ${\mathcal {C}}$ and each morphism $g:z\to a$ such that $f\circ g=0_{z,b}$ , there exists a unique $g':z\to o_{k}$ such that $g=k\circ g'$ .

The second property is depicted in the following commutative diagram:

Note that here, we don't see kernels only as subsets, but rather as an object together with a morphism. This is because in the category of groups, for example, we can take the morphism just by inclusion. Let me explain.

Example 3.2:

In the category of groups, every morphism has a kernel.

Proof:

Let $G,H$ be groups and $\varphi :G\to H$ a morphism (that is, a group homomorphism). We set

o_{k}:=\{g\in G:\varphi (g)=0\}

and

k:o_{k}\to G,k(g)=g

,

the inclusion. This is indeed a kernel in the category of groups. For, if $\theta :K\to G$ is a group homomorphism such that $\varphi \circ \theta =0$ , then $\theta$ maps wholly to $o_{k}$ , and we may simply write $\theta =k\circ \theta '$ , where $\theta ':K\to o_{k}$ is $\theta$ with its codomain restricted to $o_{k}$ . This is also clearly the unique factorisation, since $k$ is injective. $\Box$
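A concrete instance of this construction can be computed directly. In the following sketch the groups and homomorphisms are our own choice (they do not appear in the text): $\varphi :\mathbb {Z} /6\mathbb {Z} \to \mathbb {Z} /3\mathbb {Z}$ is reduction modulo $3$ , and $\theta :\mathbb {Z} /2\mathbb {Z} \to \mathbb {Z} /6\mathbb {Z}$ sends $n$ to $3n$ .

```python
G = range(6)   # Z/6Z, written additively


def phi(g):
    """Homomorphism Z/6Z -> Z/3Z, reduction modulo 3."""
    return g % 3


# o_k: the elements of G that phi sends to the identity
kernel_object = [g for g in G if phi(g) == 0]


def k(g):
    """The kernel morphism: inclusion o_k -> G."""
    return g


K = range(2)   # Z/2Z


def theta(n):
    """Homomorphism Z/2Z -> Z/6Z, n -> 3n."""
    return (3 * n) % 6


composite_is_zero = all(phi(theta(n)) == 0 for n in K)


def theta_prime(n):
    """theta with its codomain restricted to o_k."""
    value = theta(n)
    assert value in kernel_object
    return value


factorises = all(k(theta_prime(n)) == theta(n) for n in K)
```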

For kernels the following theorem holds:

Theorem 3.3:

Let ${\mathcal {C}}$ be a category with zero objects, let $f:a\to b$ be a morphism and let $k:o_{k}\to a$ be a kernel of $f$ . Then $k$ is a monic (that is, a monomorphism).

Proof:

Let $k\circ s=k\circ t$ for two morphisms $s,t:z\to o_{k}$ . Since $f\circ (k\circ s)=(f\circ k)\circ s=0$ , the defining property of the kernel says that the morphism $k\circ s$ factors over $k$ in exactly one way. Now $s$ and $t$ both provide such a factorisation (the latter because $k\circ s=k\circ t$ ). By uniqueness of factorisations, $s=t$ . $\Box$

Kernels are essentially unique:

Theorem 3.4:

Let ${\mathcal {C}}$ be a category with zero objects, let $f:a\to b$ be a morphism and let $k:o_{k}\to a$ , ${\tilde {k}}:o_{\tilde {k}}\to a$ be two kernels of $f$ . Then

o_{k}\cong o_{\tilde {k}}

;

that is to say, $k$ and ${\tilde {k}}$ are isomorphic.

Proof:

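From the first property of kernels, we obtain $f\circ k=0$ and $f\circ {\tilde {k}}=0$ . Hence, the second property of kernels implies that there exist unique morphisms $k':o_{k}\to o_{\tilde {k}}$ and ${\tilde {k}}':o_{\tilde {k}}\to o_{k}$ such that $k={\tilde {k}}k'$ and ${\tilde {k}}=k{\tilde {k}}'$ .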

We claim that $k'$ and ${\tilde {k}}'$ are inverse to each other.

{\tilde {k}}k'{\tilde {k}}'=k{\tilde {k}}'={\tilde {k}}={\tilde {k}}1_{o_{\tilde {k}}}

and

k{\tilde {k}}'k'={\tilde {k}}k'=k=k1_{o_{k}}

.

Since both $k$ and ${\tilde {k}}$ are monic by theorem 3.3, we may cancel them to obtain

k'{\tilde {k}}'=1_{o_{\tilde {k}}}

and

{\tilde {k}}'k'=1_{o_{k}}

,

that is, we have inverse arrows and thus, by definition, isomorphisms. $\Box$

Cokernels

An analogous notion is that of a cokernel. This notion is actually common in mathematics, but not so much at the undergraduate level.

Definition 3.5:

Let ${\mathcal {C}}$ be a category with zero objects, and let $f:a\to b$ be a morphism between two objects $a,b$ of ${\mathcal {C}}$ . A cokernel of $f$ is an arrow $u:b\to o_{u}$ , where $o_{u}$ is an object of ${\mathcal {C}}$ which we may call the object associated to the cokernel $u$ , such that

$u\circ f=0_{a,o_{u}}$ , and
for each object $c$ of ${\mathcal {C}}$ and each morphism $h:b\to c$ such that $h\circ f=0_{a,c}$ , there exists a unique factorisation $h=h'\circ u$ for a suitable morphism $h'$ .

The second property is depicted in the following picture:

Again, this notion is just a generalisation of facts observed in "everyday" categories. Our first example of cokernels shall be the existence of cokernels in Abelian groups. Now actually, cokernels exist even in the category of groups, but the construction is a bit tricky since in general, the image need not be a normal subgroup, which is why we may not be able to form the factor group by the image. In Abelian groups though, all subgroups are normal, and hence this is possible.

Example 3.6:

In the category of Abelian groups, every morphism has a cokernel.

Proof:

Let $G,H$ be any two Abelian groups, and let $\varphi :G\to H$ be a group homomorphism. We set

o_{u}:=H/\operatorname {im} \varphi

;

we may form this quotient group because within an Abelian group, all subgroups are normal. Further, we set

u:H\to H/\operatorname {im} \varphi ,u(h)=h+\operatorname {im} \varphi

,

the projection (we adhere to the custom of writing Abelian groups in an additive fashion). Let now $\eta :H\to I$ be a group homomorphism such that $\eta \circ \varphi =0$ , where $I$ is another Abelian group. Then the function

\eta ':H/\operatorname {im} \varphi \to I,\eta '(h+\operatorname {im} \varphi ):=\eta (h)

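is well-defined (if $h+\operatorname {im} \varphi ={\tilde {h}}+\operatorname {im} \varphi$ , then $h-{\tilde {h}}\in \operatorname {im} \varphi$ , and hence $\eta (h)=\eta ({\tilde {h}})$ because $\eta \circ \varphi =0$ ), and the desired unique factorisation of $\eta$ is given by $\eta =\eta '\circ u$ . $\Box$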
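The quotient construction of this proof can be carried out by hand on a small example. The sketch below uses groups of our own choosing: $\varphi :\mathbb {Z} /2\mathbb {Z} \to \mathbb {Z} /6\mathbb {Z}$ , $n\mapsto 3n$ , and $\eta :\mathbb {Z} /6\mathbb {Z} \to \mathbb {Z} /3\mathbb {Z}$ , reduction modulo $3$ . Cosets are modelled as frozensets; all names are hypothetical.

```python
H = range(6)   # Z/6Z, written additively


def phi(n):
    """Homomorphism Z/2Z -> Z/6Z, n -> 3n."""
    return (3 * n) % 6


image = {phi(n) for n in range(2)}   # im(phi)


def u(h):
    """Projection H -> H / im(phi): send h to its coset h + im(phi)."""
    return frozenset((h + i) % 6 for i in image)


cokernel_object = {u(h) for h in H}


def eta(h):
    """Homomorphism Z/6Z -> Z/3Z with eta . phi = 0."""
    return h % 3


def eta_prime(coset):
    """Induced map on cosets; the assert checks well-definedness."""
    values = {eta(h) for h in coset}
    assert len(values) == 1
    return values.pop()


vanishes_on_image = all(eta(phi(n)) == 0 for n in range(2))
factorises = all(eta_prime(u(h)) == eta(h) for h in H)
```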

Theorem 3.7:

Every cokernel is an epi.

Proof:

Let $f$ be a morphism and $u$ a corresponding cokernel. Assume that $t\circ u=s\circ u$ . The situation is depicted in the following picture:

Now again, $t\circ u\circ f=0$ , and $t\circ u$ and $s\circ u$ are by their equality both factorisations of $t\circ u$ . Hence, by the uniqueness of such factorisations required in the definition of cokernels, $s=t$ . $\Box$

Theorem 3.8:

If a morphism $f$ has two cokernels $u$ and ${\tilde {u}}$ (let's call the associated objects $o_{u}$ and $o_{\tilde {u}}$ ), then $u\cong {\tilde {u}}$ ; that is, $u$ and ${\tilde {u}}$ are isomorphic.

Proof:

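Once again, we have $u\circ f=0$ and ${\tilde {u}}\circ f=0$ , and hence the second property of cokernels yields unique morphisms ${\tilde {u}}':o_{u}\to o_{\tilde {u}}$ and $u':o_{\tilde {u}}\to o_{u}$ such that ${\tilde {u}}={\tilde {u}}'u$ and $u=u'{\tilde {u}}$ .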

We once again claim that $u'$ and ${\tilde {u}}'$ are inverse to each other. Indeed, we obtain the equations

u'{\tilde {u}}'u=u'{\tilde {u}}=u=1_{o_{u}}u

and

{\tilde {u}}'u'{\tilde {u}}={\tilde {u}}'u={\tilde {u}}=1_{o_{\tilde {u}}}{\tilde {u}}

and by cancellation (both $u$ and ${\tilde {u}}$ are epis due to theorem 3.7) we obtain

u'{\tilde {u}}'=1_{o_{u}}

and

{\tilde {u}}'u'=1_{o_{\tilde {u}}}

and hence the theorem. $\Box$

Interplay between kernels and cokernels

Theorem 3.9:

Let ${\mathcal {C}}$ be a category with zero objects, and let $k$ be a morphism of ${\mathcal {C}}$ such that $k$ is the kernel of some arbitrary morphism $f$ of ${\mathcal {C}}$ . Then $k$ is also the kernel of any cokernel of itself.

Proof:

$k=\ker f$ means

.

We set $q:=\operatorname {coker} k$ , that is,

.

In particular, since $fk=0$ , there exists a unique $f'$ such that $f=f'q$ . We now want to show that $k$ is a kernel of $q$ , that is,

.

Hence assume $ql=0$ . Then $fl=f'ql=0$ . Hence, by the topmost diagram (in this proof), $l=kl'$ for a unique $l'$ , which is exactly what we want. Further, $qk=0$ follows from the second diagram of this proof. $\Box$

Theorem 3.10:

Let ${\mathcal {C}}$ be a category with zero objects, and let $q$ be a morphism of ${\mathcal {C}}$ such that $q$ is the cokernel of some arbitrary morphism $r$ of ${\mathcal {C}}$ . Then $q$ is also the cokernel of any kernel of itself.

Proof:

The statement that $q$ is the cokernel of $r$ reads

.

We set $k:=\ker q$ , that is

.

In particular, since $qr=0$ , $r=kr'$ for a suitable unique morphism $r'$ . We now want $q$ to be a cokernel of $k$ , that is,

.

Let thus $mk=0$ . Then also $mr=mkr'=0$ and hence $m$ has a unique factorisation $m=m'q$ by the topmost diagram. $\Box$

Corollary 3.11:

Let ${\mathcal {C}}$ be a category that has a zero object and where all morphisms have kernels and cokernels, and let $f$ be an arbitrary morphism of ${\mathcal {C}}$ . Then

\ker f=\ker(\operatorname {coker} (\ker f))

and

\operatorname {coker} f=\operatorname {coker} (\ker(\operatorname {coker} f))

.

The equation

\ker f=\ker(\operatorname {coker} (\ker f))

is to be read "the kernel of $f$ is a kernel of any cokernel of itself", and the same for the other equation with kernels replaced by cokernels and vice versa.

Proof:

$k:=\ker f$ is a morphism which is some kernel. Hence, by theorem 3.9

k=\ker(\operatorname {coker} (k))

(where the equation is to be read " $k$ is a kernel of any cokernel of $k$ "). Similarly, from theorem 3.10

q=\operatorname {coker} (\ker(q))

,

where $q:=\operatorname {coker} f$ . $\Box$

Products

Definition 3.12:

Let ${\mathcal {C}}$ be a category, and let $a,b$ be two objects of ${\mathcal {C}}$ . A product of $a$ and $b$ , denoted $a\times b$ , is an object of ${\mathcal {C}}$ together with two morphisms

\pi _{a}:a\times b\to a

and

\pi _{b}:a\times b\to b

,

called the projections of $a\times b$ , such that for any object $c$ of ${\mathcal {C}}$ and any morphisms $f:c\to a$ and $g:c\to b$ there exists a unique morphism $h:c\to a\times b$ such that $\pi _{a}\circ h=f$ and $\pi _{b}\circ h=g$ .
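In the category of sets, the cartesian product with its coordinate projections has this universal property: given $f:c\to a$ and $g:c\to b$ , the map $x\mapsto (f(x),g(x))$ is the only map into the product that is compatible with both projections. The following sketch checks this by brute force for small sets of our own choosing; the names are hypothetical.

```python
from itertools import product as cartesian

a, b = [0, 1], ["x", "y", "z"]
a_times_b = list(cartesian(a, b))   # the cartesian product a x b


def pi_a(p):
    """Projection onto the first factor."""
    return p[0]


def pi_b(p):
    """Projection onto the second factor."""
    return p[1]


c = [10, 11, 12]   # an arbitrary test object


def f(n):          # f: c -> a
    return n % 2


def g(n):          # g: c -> b
    return "xyz"[n - 10]


def h(n):          # the candidate mediating morphism c -> a x b
    return (f(n), g(n))


# Enumerate every function c -> a x b and keep those compatible with f and g.
candidates = [dict(zip(c, values))
              for values in cartesian(a_times_b, repeat=len(c))]
mediating = [m for m in candidates
             if all(pi_a(m[n]) == f(n) and pi_b(m[n]) == g(n) for n in c)]
```

Exactly one candidate survives, and it coincides with `h`.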

Example 3.13:

Theorem 3.14:

If ${\mathcal {C}}$ is a category, $a,b$ are objects of ${\mathcal {C}}$ and $p,q$ are products of $a$ and $b$ , then

p\cong q

,

that is, $p$ and $q$ are isomorphic.

Theorem 3.15:

Let ${\mathcal {C}}$ be a category with a zero object, $a,b$ objects of ${\mathcal {C}}$ and $a\times b$ a product of $a$ and $b$ . Then the projection morphisms $\pi _{a}$ and $\pi _{b}$ are epics.

Coproducts

Definition 3.16:

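Let ${\mathcal {C}}$ be a category, and let $a$ and $b$ be objects of ${\mathcal {C}}$ . Then a coproduct of $a$ and $b$ is another object of ${\mathcal {C}}$ , denoted $a\coprod b$ , together with two morphisms $i_{a}:a\to a\coprod b$ and $i_{b}:b\to a\coprod b$ such that for any object $c$ of ${\mathcal {C}}$ and any morphisms $f:a\to c$ and $g:b\to c$ , there exists a unique morphism $h:a\coprod b\to c$ such that $h\circ i_{a}=f$ and $h\circ i_{b}=g$ .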

Example 3.17:

Theorem 3.18:

Theorem 3.19:

Biproducts

Definition 3.20:

Let ${\mathcal {C}}$ be a category that contains two objects $a$ and $b$ . Assume we are given an object $c$ of ${\mathcal {C}}$ together with four morphisms that make it into a product, and simultaneously into a coproduct. Then we call $c$ a biproduct of the two objects $a$ and $b$ and denote it by

c=a\oplus b

.

Example 3.21:

Within the category of Abelian groups, a biproduct is given by the product group; if $G,H$ are Abelian groups, set the product group of $G$ and $H$ to be

G\times H

,

the cartesian product, with component-wise group operation.

Proof:

Diagram chasing within Abelian categories

Exact sequences of Abelian groups

Definition 4.1 (sequence):

Given $n$ Abelian groups $A_{1},A_{2},\ldots ,A_{n}$ and $n-1$ morphisms (that is, since we are in the category of Abelian groups, group homomorphisms)

\varphi _{j}:A_{j}\to A_{j+1}

,

we may define the whole of those to be a sequence of Abelian groups, and denote it by

A_{1}{\overset {\varphi _{1}}{\longrightarrow }}A_{2}{\overset {\varphi _{2}}{\longrightarrow }}\cdots {\overset {\varphi _{n-2}}{\longrightarrow }}A_{n-1}{\overset {\varphi _{n-1}}{\longrightarrow }}A_{n}

.

Note that if one of the objects is the trivial group, we denote it by $0$ and simply leave out the caption of the arrows going to it and emanating from it, since the trivial group is the zero object in the category of Abelian groups.

There are also infinite sequences, indicated by a notation of the form

A_{1}{\overset {\varphi _{1}}{\longrightarrow }}A_{2}{\overset {\varphi _{2}}{\longrightarrow }}\cdots {\overset {\varphi _{n-2}}{\longrightarrow }}A_{n-1}{\overset {\varphi _{n-1}}{\longrightarrow }}A_{n}{\overset {\varphi _{n}}{\longrightarrow }}\cdots

;

it just goes on and on and on. For the sequence to be infinite means that we have a sequence (in the classical sense) of objects and another classical sequence of morphisms between these objects (here, the two have the same cardinality: countably infinite).

Definition 4.2 (exact sequence):

A given sequence

A_{1}{\overset {\varphi _{1}}{\longrightarrow }}A_{2}{\overset {\varphi _{2}}{\longrightarrow }}\cdots {\overset {\varphi _{n-2}}{\longrightarrow }}A_{n-1}{\overset {\varphi _{n-1}}{\longrightarrow }}A_{n}

is called exact iff for all $i\in \{1,\ldots ,n-2\}$ ,

\operatorname {im} \varphi _{i}=\ker \varphi _{i+1}

.

There is a fundamental example to this notion.

Example 4.3 (short exact sequence):

A short exact sequence is simply an exact sequence of the form

0\longrightarrow A{\overset {f}{\longrightarrow }}B{\overset {g}{\longrightarrow }}C\longrightarrow 0

for suitable Abelian groups $A,B,C$ and group homomorphisms $f:A\to B,g:B\to C$ .

The exactness of this sequence means, considering the form of the image and kernel of the zero morphism:

$f$ injective
$\ker g=\operatorname {im} f$
$g$ surjective.

Example 4.4:

Set $A:=\mathbb {Z} /3\mathbb {Z}$ , $B:=\mathbb {Z} /15\mathbb {Z}$ , $C:=\mathbb {Z} /5\mathbb {Z}$ , where we only consider the additive group structure, and define the group homomorphisms

f:A\to B,f(n+3\mathbb {Z} ):=5n+15\mathbb {Z}

and

g:B\to C,g(n+15\mathbb {Z} ):=n+5\mathbb {Z}

.

This gives a short exact sequence

0\longrightarrow A{\overset {f}{\longrightarrow }}B{\overset {g}{\longrightarrow }}C\longrightarrow 0

,

as can be easily checked.

A similar construction can be done for any factorisation of natural numbers $k=m\cdot j$ (in our example, $k=15$ , $m=3$ , $j=5$ ).
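The "easy check" can be delegated to a few lines of code. The sketch below represents $\mathbb {Z} /k\mathbb {Z}$ by the residues $0,\ldots ,k-1$ and tests the three exactness conditions for a factorisation $k=m\cdot j$ ; the function name `short_exact` is a hypothetical helper of ours.

```python
def short_exact(m, j):
    """Test 0 -> Z/mZ -> Z/kZ -> Z/jZ -> 0 for k = m * j.

    Returns a triple (f injective, im f == ker g, g surjective).
    """
    k = m * j

    def f(n):              # f(n + mZ) := j*n + kZ
        return (j * n) % k

    def g(n):              # g(n + kZ) := n + jZ
        return n % j

    image_f = {f(a) for a in range(m)}
    kernel_g = {b for b in range(k) if g(b) == 0}
    image_g = {g(b) for b in range(k)}

    return (len(image_f) == m, image_f == kernel_g, image_g == set(range(j)))
```

For the example above one would call `short_exact(3, 5)`; other factorisations can be tested in the same way.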

Diagram chase: The short five lemma

We now should like to briefly exemplify a supremely important method of proof called diagram chase in the case of Abelian groups. We shall later like to generalize this method, and we will see that the classical diagram lemmas hold in huge generality (that includes our example below), namely in the generality of Abelian categories (to be introduced below).

Theorem 4.5 (the short five lemma):

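Assume we have a commutative diagram of Abelian groups whose top row is $0\to A{\overset {p}{\longrightarrow }}B{\overset {q}{\longrightarrow }}C\to 0$ , whose bottom row is $0\to A'{\overset {s}{\longrightarrow }}B'{\overset {t}{\longrightarrow }}C'\to 0$ , and whose vertical morphisms are $g:A\to A'$ , $f:B\to B'$ and $h:C\to C'$ (so that $f\circ p=s\circ g$ and $t\circ f=h\circ q$ ), where the two rows are exact. If $g$ and $h$ are isomorphisms, then so must be $f$ .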

Proof:

We first prove that $f$ is injective. Let $f(b)=0$ for a $b\in B$ . Since the given diagram is commutative, we have $0=t(f(b))=h(q(b))$ and since $h$ is an isomorphism, $q(b)=0$ . Since the top row is exact, it follows that $b\in \operatorname {im} p$ , that is, $b=p(a)$ for a suitable $a\in A$ . Hence, the commutativity of the given diagram implies $0=f(b)=f(p(a))=s(g(a))$ , and hence $a=0$ since $s\circ g$ is injective as the composition of two injective maps. Therefore, $b=p(a)=p(0)=0$ .

Next, we prove that $f$ is surjective. Let thus $b'\in B'$ be given. Set $c':=t(b')$ . Since $h\circ q$ is surjective as the composition of two surjective maps, there exists $b\in B$ such that $h(q(b))=c'$ . The commutativity of the given diagram yields $t(f(b))=c'$ . Thus, $t(f(b)-b')=0$ by linearity, whence $f(b)-b'\in \ker t=\operatorname {im} s$ , and since $g$ is an isomorphism, we find $a\in A$ such that $s(g(a))=f(b)-b'$ . The commutativity of the diagram yields $f(b)-b'=s(g(a))=f(p(a))$ , and hence $f(b-p(a))=b'$ . $\Box$

Additive categories

Definition 4.6:

An additive category is a category ${\mathcal {C}}$ such that the following holds:

$\operatorname {Hom} (a,b)$ is an Abelian group for all objects $a,b$ of ${\mathcal {C}}$ .
The composition of arrows

\circ :\operatorname {Hom} (b,c)\times \operatorname {Hom} (a,b)\to \operatorname {Hom} (a,c)

is bilinear; that is, for

f,f'\in \operatorname {Hom} (b,c)

and

g,g'\in \operatorname {Hom} (a,b)

, we have

(f+f')\circ (g+g')=f\circ g+f'\circ g+f\circ g'+f'\circ g'

(note that, since no scalar multiplication is involved, this definition of bilinearity is less rich than bilinearity in vector spaces).

${\mathcal {C}}$ has a zero object.
Each pair of objects $a,b$ of ${\mathcal {C}}$ has a biproduct $a\oplus b$ .
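The bilinearity condition can be seen at work in a familiar setting. As an assumption of this sketch (it is not taken from the text), we let integer matrices play the role of morphisms, matrix multiplication the role of composition and entry-wise addition the role of the group operation on $\operatorname {Hom}$ ; the four sample matrices are arbitrary.

```python
def add(m, n):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_m, row_n)] for row_m, row_n in zip(m, n)]


def compose(f, g):
    """Matrix product f . g, playing the role of composition."""
    return [
        [sum(f[i][k] * g[k][j] for k in range(len(g))) for j in range(len(g[0]))]
        for i in range(len(f))
    ]


f, f2 = [[1, 2], [0, 1]], [[0, 1], [1, 1]]
g, g2 = [[2, 0], [1, 3]], [[1, 1], [0, 2]]

# (f + f') . (g + g')
lhs = compose(add(f, f2), add(g, g2))
# f . g + f' . g + f . g' + f' . g'
rhs = add(add(compose(f, g), compose(f2, g)),
          add(compose(f, g2), compose(f2, g2)))
```

Both sides agree, as the bilinearity rule demands.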

Although additive categories are important in their own right, we shall only treat them as an intermediate step towards the definition of Abelian categories.

Abelian categories

Definition 4.7:

An Abelian category is an additive category ${\mathcal {C}}$ such that furthermore:

Every arrow of ${\mathcal {C}}$ has a kernel and a cokernel, and
every monic arrow of ${\mathcal {C}}$ is the kernel of some arrow, and every epic arrow of ${\mathcal {C}}$ is the cokernel of some arrow.

We now embark to obtain a canonical factorisation of arrows within Abelian categories.

Lemma 4.8:

Let ${\mathcal {C}}$ be a category with a zero object and kernels and cokernels for all arrows. Then every arrow $f$ of ${\mathcal {C}}$ admits a factorisation

f=kq

,

where $k=\ker(\operatorname {coker} f)$ .

Proof:

The factorisation comes from the following commutative diagram, where we call $u:=\operatorname {coker} f$ and $k:=\ker(\operatorname {coker} f)$ :

Indeed, by the property of $k$ as a kernel and since $u\circ f=0$ , $f$ factors uniquely through $k$ . $\Box$

In Abelian categories, $q$ is even an epimorphism:

Lemma 4.9:

Let ${\mathcal {C}}$ be an Abelian category. If $k=\ker(\operatorname {coker} f)$ and we have any factorisation $f=kq$ , then $q$ is an epimorphism.

Proof:

Theorem 4.10:

Let ${\mathcal {C}}$ be an Abelian category. Then every arrow $f$ of ${\mathcal {C}}$ has a factorisation

f=me

,

where $m=\ker(\operatorname {coker} f)$ and $e=\operatorname {coker} (\ker f)$ .

Exact sequences in Abelian categories

We begin by defining the image of a morphism in a general context.

Definition 4.12:

Let $f$ be a morphism of a (this time arbitrary) category ${\mathcal {C}}$ . If it exists, a kernel of a cokernel of $f$ is called an image of $f$ .

Construction 4.13:

We shall now construct an equivalence relation on the set $P_{c}$ of all morphisms whose codomain is a certain $c\in {\mathcal {C}}$ , where ${\mathcal {C}}$ is a category. We set

f\leq g:\Leftrightarrow f=gf'

for a suitable $f'$ (that is, $f$ factors through $g$ ).

This relation is transitive and reflexive. Hence, if we define

f\sim g:\Leftrightarrow f\leq g\wedge g\leq f

,

we have an equivalence relation (in fact, in this way we can always construct an equivalence relation from a transitive and reflexive binary relation, that is, a preorder).

With the image at hand, we may proceed to the definition of sequences, exact sequences and short exact sequences in a general context.

Definition 4.14:

Let ${\mathcal {C}}$ be an Abelian category.

Definition 4.15:

Let ${\mathcal {C}}$ be an Abelian category.

Definition 4.16:

Let ${\mathcal {C}}$ be an Abelian category.

Diagram chase within Abelian categories

Now comes the clincher we have been working towards. In the ordinary diagram chase, we used elements of sets. We will now replace those elements by arrows in a simple way: Instead of looking at "elements" $x\in a$ of some object $a$ of an Abelian category ${\mathcal {C}}$ , we look at arrows towards that object; that is, arrows $x:d\to a$ for arbitrary objects $d$ of ${\mathcal {C}}$ . For "the codomain of an arrow $x$ is $a$ ", we write

x\in _{m}a

,

where the subscript $m$ stands for "member".

We have now replaced the notion of elements of a set by the notion of members in category theory. We also need to replace the notion of equality of two elements. We don't want equality of two arrows, since then we would not obtain the usual rules for chasing diagrams. Instead, we define yet another equivalence relation on arrows with codomain $a$ (that is, on members of $a$ ). The following lemma will help to that end.

Lemma 4.18 (square completion):

Construction 4.19 (second equivalence relation):

Now we are finally able to prove the proposition that will enable us to do diagram chases using the same techniques we apply to diagram chases for Abelian groups (or modules, or any other Abelian category).

Theorem 4.20 (diagram chase enabling theorem):

Let ${\mathcal {C}}$ be an Abelian category and $a$ an object of ${\mathcal {C}}$ . We have the following rules concerning properties of a morphism:

$f:a\to b$ is monic iff $\forall x\in _{m}a:fx\equiv 0\Rightarrow x\equiv 0$ .
$f:a\to b$ is monic iff $\forall x,x'\in _{m}a:fx\equiv fx'\Rightarrow x\equiv x'$ .
$f:a\to b$ is epic iff $\forall y\in _{m}b:\exists x\in _{m}a:fx\equiv y$ .
$f:a\to b$ is the zero arrow iff $\forall x\in _{m}a:fx\equiv 0$ .
A sequence $a{\overset {f}{\longrightarrow }}b{\overset {g}{\longrightarrow }}c$ is exact iff
1. $gf=0$ and
2. for each $y\in _{m}b$ with $gy\equiv 0$ , there exists $x\in _{m}a$ such that $fx\equiv y$ .
If $f:a\to b$ is a morphism and $x,y\in _{m}a$ are such that $fx\equiv fy$ , there exists a member of $a$ , which we shall call $(x-y)$ (the brackets indicate that this is one morphism), such that:
1. $f(x-y)\equiv 0$
2. $gx\equiv 0\Rightarrow g(x-y)\equiv -gy$
3. $hy\equiv 0\Rightarrow h(x-y)\equiv hx$

We have thus constructed a relatively elaborate machinery in order to elevate our proof technique of diagram chase (which is quite abundant) to the very abstract level of Abelian categories.

Examples of diagram lemmas

Theorem 4.21 (the long five lemma):

Theorem 4.22 (the snake lemma):

Modules, submodules and homomorphisms

Basics

Definition 5.1 (modules):

Let $R$ be a ring. A left $R$ -module is an Abelian group $M$ together with a function

R\times M\to M,(r,m)\mapsto rm

such that

$\forall m\in M:1_{R}m=m$ ,
$\forall m,n\in M,r\in R:r(m+n)=rm+rn$ ,
$\forall m\in M,r,s\in R:(r+s)m=rm+sm$ and
$\forall m\in M,r,s\in R:r(sm)=(rs)m$ .

Analogously, one can define right $R$ -modules with an operation $M\times R\to M,(m,r)\mapsto mr$ ; for commutative $R$ the difference is only formal, but it will later help us define bimodules in a user-friendly way.

For the sake of brevity, we will often write module instead of left $R$ -module.
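To make definition 5.1 concrete, here is a minimal Python sketch that checks the four axioms by brute force. It assumes the small example $R=M=\mathbb {Z} /6\mathbb {Z}$ with multiplication modulo 6 as module operation; the names `N_MOD`, `act` and `is_module` are illustrative choices and do not come from the text.

```python
from itertools import product

N_MOD = 6  # assumption: R = M = Z/6Z, chosen only for illustration
R = range(N_MOD)
M = range(N_MOD)

def add(m, n):
    """Addition in the Abelian group M = Z/6Z."""
    return (m + n) % N_MOD

def act(r, m):
    """Candidate module operation R x M -> M, (r, m) |-> rm."""
    return (r * m) % N_MOD

def is_module(operation):
    """Check axioms 1.-4. of definition 5.1 by exhaustive search."""
    unit = all(operation(1, m) == m for m in M)
    dist_m = all(operation(r, add(m, n)) == add(operation(r, m), operation(r, n))
                 for r, m, n in product(R, M, M))
    dist_r = all(operation((r + s) % N_MOD, m) == add(operation(r, m), operation(s, m))
                 for r, s, m in product(R, R, M))
    assoc = all(operation(r, operation(s, m)) == operation((r * s) % N_MOD, m)
                for r, s, m in product(R, R, M))
    return unit and dist_m and dist_r and assoc

print(is_module(act))
```

Passing a different function, for instance `lambda r, m: m`, shows how a candidate operation can satisfy the first axiom and still fail one of the distributive laws.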

Exercise 5.1.1: Prove that every Abelian monoid $(M,+)$ together with an operation as specified in 1.) - 4.) of definition 5.1 is already a module.

Submodules

Definition 5.2 (submodules):

A subgroup $N$ of $M$ which is closed under the module operation (i.e. the left multiplication defined above) is called a submodule of $M$ . In this case we write $N\leq M$ .

The following lemma gives a criterion for a subset of a module being a submodule.

Lemma 5.3:

A nonempty subset $N\subseteq M$ is a submodule iff

\forall r\in R,n,q\in N:rn-q\in N

.

Proof:

Let $N$ be a submodule. Then $-q\in N$ since $N$ is an Abelian group, and $rn\in N$ due to closedness under the module operation; hence also $rn+(-q)=:rn-q\in N$ .

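Conversely, let $N\neq \emptyset$ be such that $\forall r\in R,n,q\in N:rn-q\in N$ . Choosing any $n\in N$ gives $0=1_{R}n-n\in N$ , and then $-q=0_{R}n-q\in N$ for every $q\in N$ (note that $0_{R}n=0$ ). Hence for any $n,m\in N$ also $n+m=1_{R}n-(-m)\in N$ , and $rn=rn-0\in N$ for every $r\in R$ . $\Box$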
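The criterion of lemma 5.3 is easy to test mechanically. The following Python sketch does so for subsets of $\mathbb {Z} /12\mathbb {Z}$ , regarded as a module over itself; the modulus 12 and the name `is_submodule` are assumptions made only for this illustration, and the check includes the requirement that the subset be nonempty.

```python
N_MOD = 12  # assumption: M = R = Z/12Z, for illustration only

def is_submodule(subset):
    """Criterion of lemma 5.3: nonempty, and r*n - q stays in the subset."""
    s = set(subset)
    return bool(s) and all((r * n - q) % N_MOD in s
                           for r in range(N_MOD) for n in s for q in s)

print(is_submodule({0, 4, 8}))  # the multiples of 4
print(is_submodule({0, 4}))     # lacks -4 = 8
```

The second subset fails because taking $r=1$ , $n=0$ , $q=4$ produces $-4=8$ , which is not in the subset.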

Definition and theorem 5.4 (factor modules): If $N$ is a submodule of $M$ , the factor module by $N$ is defined as the group $M/N$ together with the module operation

r(m+N):=rm+N

.

This operation is well-defined and satisfies 1. - 4. from definition 5.1.

Proof:

Well-definedness: If $m+N=p+N$ , then $m-p\in N$ , hence $r(m-p)=rm-rp\in N$ and thus $rm+N=rp+N$ .

$1_{R}(m+N)=(1_{R}m)+N=m+N$
$r((m+N)+(n+N))=r((m+n)+N)=r(m+n)+N=(rm+rn)+N=(rm+N)+(rn+N)$
$(r+s)(m+N)=(r+s)m+N=(rm+sm)+N=(rm+N)+(sm+N)$
analogous to 3. (replace $+$ by $\cdot$ ) $\Box$

Sum and intersection of submodules

We shall now ask the question: Given a module $M$ and certain submodules $\{N_{\alpha }\}_{\alpha \in A}$ , which module is the smallest module containing all the $N_{\alpha }$ ? And which module is the largest module that is itself contained within all $N_{\alpha }$ ? The following definitions and theorems answer those questions.

Definition and theorem 5.5 (sum of submodules):

Let $M$ be a module over a certain ring $R$ and let $\{N_{\alpha }\}_{\alpha \in A}$ be submodules of $M$ . The set

\sum _{\alpha \in A}N_{\alpha }:=\left\{\sum _{l=1}^{k}r_{l}n_{\alpha _{l}}{\big |}k\in \mathbb {N} ,r_{l}\in R,n_{\alpha _{l}}\in N_{\alpha _{l}}\right\}

is a submodule of $M$ , which is the smallest submodule of $M$ that contains all the $N_{\alpha }$ . It is called the sum of $\{N_{\alpha }\}_{\alpha \in A}$ .

Proof:

1. $\sum _{\alpha \in A}N_{\alpha }$ is a submodule:

It is an Abelian subgroup since if $\sum _{l=1}^{k}r_{l}n_{\alpha _{l}},\sum _{j=1}^{m}s_{j}n_{\beta _{j}}\in \sum _{\alpha \in A}N_{\alpha }$ , then

\sum _{l=1}^{k}r_{l}n_{\alpha _{l}}-\sum _{j=1}^{m}s_{j}n_{\beta _{j}}=\sum _{l=1}^{k}r_{l}n_{\alpha _{l}}+\sum _{j=1}^{m}(-s_{j})n_{\beta _{j}}\in \sum _{\alpha \in A}N_{\alpha }

.

It is closed under the module operation, since

s\left(\sum _{l=1}^{k}r_{l}n_{\alpha _{l}}\right)=\sum _{l=1}^{k}(sr_{l})n_{\alpha _{l}}\in \sum _{\alpha \in A}N_{\alpha }

.

2. Each $N_{\alpha }$ is contained in $\sum _{\alpha \in A}N_{\alpha }$ :

This follows since $n_{\alpha }=1_{R}n_{\alpha }\in \sum _{\alpha \in A}N_{\alpha }$ for each $\alpha \in A$ and each $n_{\alpha }\in N_{\alpha }$ .

3. $\sum _{\alpha \in A}N_{\alpha }$ is the smallest submodule containing all the $N_{\alpha }$ : If $K\leq M$ is another such submodule, then $K$ must contain all the elements

\sum _{l=1}^{k}r_{l}n_{\alpha _{l}},k\in \mathbb {N} ,r_{l}\in R,n_{\alpha _{l}}\in N_{\alpha _{l}}

due to closedness under addition and the module operation. $\Box$

Definition and theorem 5.6 (intersection of submodules):

Let $M$ be a module over a ring $R$ , and let $\{N_{\alpha }\}_{\alpha \in A}$ be submodules of $M$ . Then the set

\bigcap _{\alpha \in A}N_{\alpha }

is a submodule of $M$ , which is the largest submodule of $M$ contained in all the $N_{\alpha }$ . It is called the intersection of the $N_{\alpha }$ .

Proof:

1. It's a submodule: Indeed, if $r\in R,n,p\in \bigcap _{\alpha \in A}N_{\alpha }$ , then $n,p\in N_{\alpha }$ for each $\alpha$ and thus, by lemma 5.3, $rn-p\in N_{\alpha }$ for each $\alpha$ , hence $rn-p\in \bigcap _{\alpha \in A}N_{\alpha }$ .

2. It is contained in all $N_{\alpha }$ by definition of the intersection.

3. Any set that is contained in each of the $N_{\alpha }$ is contained within the intersection. $\Box$

We have the following rule for computing with intersections and sums:

Theorem 5.7 (modular law; Dedekind):

Let $M$ be a module and $K,L,N\leq M$ such that $L\subseteq K$ . Then

K\cap (L+N)=L+(K\cap N)

.

Proof:

$\subseteq$ : Let $l+n\in (L+N)\cap K$ with $l\in L$ , $n\in N$ . Since $L\subseteq K$ , $l\in K$ and hence $n=(l+n)-l\in K$ . Since also $n\in N$ by assumption, $l+n\in L+(K\cap N)$ .

$\supseteq$ : Let $l+m\in L+(K\cap N)$ with $l\in L$ , $m\in K\cap N$ . Since $L\subseteq K$ , $l\in K$ and since further $m\in K$ , $l+m\in K$ . Moreover, $l+m\in L+N$ because $m\in N$ . Hence, $l+m\in K\cap (L+N)$ . $\Box$
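The modular law can be checked exhaustively on small examples. The Python sketch below enumerates all submodules of $\mathbb {Z} /4\mathbb {Z} \times \mathbb {Z} /2\mathbb {Z}$ and of $\mathbb {Z} /2\mathbb {Z} \times \mathbb {Z} /2\mathbb {Z}$ , viewed as $\mathbb {Z}$ -modules, and tests the identity for every admissible triple. The helper names (`span`, `submodules`, `sum_of`, `modular_law_holds`) are hypothetical, and the enumeration relies on the assumption that every subgroup of a product of two cyclic groups is generated by at most two elements.

```python
from itertools import product

def add(x, y, moduli):
    """Componentwise addition in Z/m1 x ... x Z/mk."""
    return tuple((a + b) % m for a, b, m in zip(x, y, moduli))

def span(gens, moduli):
    """Submodule generated by gens: close {0} under adding generators.
    In a finite group this closure already contains all inverses."""
    zero = tuple(0 for _ in moduli)
    result, frontier = {zero}, [zero]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g, moduli)
            if y not in result:
                result.add(y)
                frontier.append(y)
    return frozenset(result)

def submodules(moduli):
    """All submodules, assuming two generators always suffice."""
    elems = list(product(*(range(m) for m in moduli)))
    return {span((g, h), moduli) for g in elems for h in elems}

def sum_of(a, b, moduli):
    """The sum A + B of two submodules."""
    return frozenset(add(x, y, moduli) for x in a for y in b)

def modular_law_holds(moduli):
    """Check K n (L + N) == L + (K n N) for all K, L, N with L inside K."""
    subs = submodules(moduli)
    return all(k & sum_of(l, n, moduli) == sum_of(l, k & n, moduli)
               for k in subs for l in subs for n in subs if l <= k)

print(modular_law_holds((4, 2)), modular_law_holds((2, 2)))

# Without the hypothesis L <= K the identity can fail:
m = (2, 2)
K, L, N = span(((1, 1),), m), span(((1, 0),), m), span(((0, 1),), m)
print(K & sum_of(L, N, m) == sum_of(L, K & N, m))
```

The last line illustrates why the hypothesis $L\subseteq K$ matters: for the three subgroups of order two in $\mathbb {Z} /2\mathbb {Z} \times \mathbb {Z} /2\mathbb {Z}$ , the left-hand side is $K$ while the right-hand side is $L$ .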

More abstractly, the properties of the sum and intersection of submodules may be theoretically captured in the following way:

Lattices

Definition 5.8:

A lattice is a set $L$ together with two operations $\vee :L\times L\to L$ (called the join or least upper bound) and $\wedge :L\times L\to L$ (called the meet or greatest lower bound) such that the following laws hold:

Commutative laws: $a\Box b=b\Box a$ , $\Box \in \{\vee ,\wedge \}$
Idempotency laws: $a\Box a=a$ , $\Box \in \{\vee ,\wedge \}$
Absorption laws: $a\Box (a\triangledown b)=a$ , $\{\Box ,\triangledown \}=\{\vee ,\wedge \}$
Associative laws: $a\Box (b\Box c)=(a\Box b)\Box c$ , $\Box \in \{\vee ,\wedge \}$
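As a quick sanity check of these laws, the following Python sketch tests them on a familiar example that is not taken from the text: the divisors of 12, with the least common multiple as join and the greatest common divisor as meet. The function name `lattice_laws_hold` is an illustrative assumption.

```python
from math import gcd
from itertools import product

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

DIVISORS = [d for d in range(1, 13) if 12 % d == 0]  # 1, 2, 3, 4, 6, 12

def lattice_laws_hold(elems, join, meet):
    """Check the four laws of definition 5.8 for both operations."""
    for a, b, c in product(elems, repeat=3):
        for op, dual in ((join, meet), (meet, join)):
            if op(a, b) != op(b, a):                    # commutative law
                return False
            if op(a, a) != a:                           # idempotency law
                return False
            if op(a, dual(a, b)) != a:                  # absorption law
                return False
            if op(a, op(b, c)) != op(op(a, b), c):      # associative law
                return False
    return True

print(lattice_laws_hold(DIVISORS, lcm, gcd))
```

Using the same operation for join and meet, for example `lattice_laws_hold(DIVISORS, lcm, lcm)`, violates the absorption laws, which shows that these laws genuinely tie the two operations together.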

There are some special types of lattices:

Definition 5.9:

A modular lattice $L$ is a lattice such that the identity

a\wedge (b\vee (a\wedge c))=(a\wedge b)\vee (a\wedge c)

holds for all $a,b,c\in L$ .

Theorem 5.10 (ordered sets as lattices):

Let $\leq$ be a partial order on the set $L$ such that

every set $S\subseteq L$ has a least upper bound (where a least upper bound $u$ of $S$ satisfies $u\geq s$ for all $s\in S$ (i.e. it is an upper bound) and $u\leq x$ for every other upper bound $x$ of $S$ ) and
every set $S\subseteq L$ has a greatest lower bound (defined analogously to least upper bound with inequality reversed).

Then $L$ , together with the join operation sending $\{a,b\}$ to the least upper bound of that set and the meet operation defined analogously, is a lattice.

In fact, it suffices to require conditions 1. and 2. only for sets $S$ with two elements. But as we have shown, in the case that $L$ is the set of all submodules of a given module, we have the "original" conditions satisfied.

Proof:

First, we note that least upper bounds and greatest lower bounds are unique, since if for example $u,u'$ are least upper bounds of $S$ , then $u\leq u'$ and $u'\leq u$ and hence $u=u'$ . Thus, the join and meet operations are well-defined.

The commutative laws follow from $\{a,b\}=\{b,a\}$ .

The idempotency laws follow from $a$ clearly being the least upper bound, as well as the greatest lower bound, of the set $\{a,a\}$ .

The first absorption law is seen as follows: Let $u$ be the least upper bound of $\{a,b\}$ . Then in particular, $u\geq a$ . Hence, $a$ is a lower bound of $\{a,u\}$ , and any lower bound $l$ satisfies $l\leq a$ , which is why $a$ is the greatest lower bound of $\{a,u\}$ . The second absorption law is proven analogously.

The first associative law follows since if $u$ is the least upper bound of $\{a,b,c\}$ and $v$ is the least upper bound of $\{a,b\}$ , then $u\geq v$ (as $u$ is an upper bound for $\{a,b\}$ ) and if $w$ is the least upper bound of $\{v,c\}$ , then $w=u$ : on the one hand $w\leq u$ since $u$ is an upper bound of $\{v,c\}$ , and on the other hand $w\geq v\geq a$ , $w\geq v\geq b$ and $w\geq c$ , so that $w$ is an upper bound of $\{a,b,c\}$ and thus $w\geq u$ . The same argument (with $a$ and $c$ swapped) proves that $u$ is also the least upper bound of the l.u.b. of $\{b,c\}$ and $a$ . Again, the second associative law is proven similarly. $\Box$

From theorems 5.5-5.7 and 5.10 we note that the submodules of a module form a modular lattice, where the order is given by set inclusion.

Exercises

Exercise 5.2.1: Let $R$ be a ring. Find a suitable module operation such that $R$ together with its own addition and this module operation is an $R$ -module. Make sure you define this operation in the simplest possible way. Prove further, that with respect to this module operation, the submodules of $R$ are exactly the ideals of $R$ .

Homomorphisms

We shall now get to know the morphisms within the category of modules over a fixed ring $R$ .

Definition 5.11 (homomorphisms):

Let $M,N$ be two modules over a ring $R$ . A homomorphism from $M$ to $N$ , also called an $R$ -linear function from $M$ to $N$ , is a function

f:M\to N

such that

$\forall m,p\in M:f(m+p)=f(m)+f(p)$ and
$\forall r\in R,m\in M:f(rm)=rf(m)$ .

The kernel and image of homomorphisms of modules are defined analogously to group homomorphisms.

Since we are cool, we will often simply write morphisms instead of homomorphisms where it's clear from the context in order to indicate that we have a clue about category theory.

We have the following useful lemma:

Lemma 5.12:

$f:M\to N$ is $R$ -linear iff

\forall r\in R,m,p\in M:f(rm+p)=rf(m)+f(p)

.

Proof:

Assume first $R$ -linearity. Then we have

f(rm+p)=f(rm)+f(p)=rf(m)+f(p)

.

Assume now the other condition. Then we have for $m,p\in M$

f(m+p)=f(1_{R}m+p)=1_{R}f(m)+f(p)=f(m)+f(p)

and

f(rm)=f(rm+0)=rf(m)+f(0)=rf(m)

since $f(0)=0$ due to $f(0)=f(0+0)=f(0)+f(0)$ ; since $N$ is an Abelian group, we may add the inverse of $f(0)$ on both sides. $\Box$

Lemma 5.13:

If $f:M\to N$ is $R$ -linear, then $\forall m\in M:f(-m)=-f(m)$ .

Proof:

This follows from the respective theorem for group homomorphisms, since each morphism of modules is also a morphism of Abelian groups. $\Box$

Definition 5.8 (isomorphisms):

An isomorphism $f:M\to N$ is a homomorphism which is bijective.

Lemma 5.14:

Let $f:M\to N$ be a morphism. The following are equivalent:

$f$ is an isomorphism
$\ker f=\{0\}$ and $f(M)=N$
$f$ has an inverse which is an isomorphism

Proof:

Lemma 5.15:

The kernel and image of morphisms are submodules.

Proof:

1. The kernel:

f(rn-q)=rf(n)+f(-q)=rf(n)-f(q)=0

2. The image:

rf(m)\overbrace {-f(p)} ^{=+f(-p)}=f(rm-p)

\Box

The following four theorems are in complete analogy to group theory.

Theorem 5.16 (factoring of morphisms):

Let $M,K$ be modules, let $\varphi :M\to K$ be a morphism and let $N\leq \ker \varphi$ . Then there exists a unique morphism ${\overline {\varphi }}:M/N\to K$ such that ${\overline {\varphi }}\circ \pi =\varphi$ , where $\pi :M\to M/N,\pi (m)=m+N$ is the canonical projection. In this situation, $\ker {\overline {\varphi }}=\ker \varphi /N$ .

Proof:

We define ${\overline {\varphi }}(m+N):=\varphi (m)$ . This is well-defined since $N\subseteq \ker \varphi$ . Furthermore, this definition is already enforced by ${\overline {\varphi }}\circ \pi =\varphi$ . Further, ${\overline {\varphi }}(m+N)=0\Leftrightarrow m\in \ker \varphi$ . $\Box$

Corollary 5.17 (first isomorphism theorem):

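Let $M,K$ be $R$ -modules and let $f:M\to K$ be a morphism. Then $M/\ker f\cong f(M)$ ; in particular, $M/\ker f\cong K$ if $f$ is surjective.

Proof:

We set $N=\ker f$ and obtain by theorem 5.16 a homomorphism ${\overline {f}}:M/\ker f\to K$ with kernel $N/N=\{0\}$ , whose image is $f(M)$ . Regarded as a morphism $M/\ker f\to f(M)$ , it is therefore bijective, and the claim follows from lemma 5.14. $\Box$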
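A small computation may help to see the first isomorphism theorem at work. The Python sketch below assumes the hypothetical example $f:\mathbb {Z} /12\mathbb {Z} \to \mathbb {Z} /4\mathbb {Z}$ , $m\mapsto 2m$ , which is a morphism of $\mathbb {Z}$ -modules; it computes the kernel, the cosets of the kernel and the image, and checks that the induced map on cosets is a well-defined bijection onto the image.

```python
M_MOD, K_MOD = 12, 4  # assumption: f maps Z/12Z to Z/4Z

def f(m):
    """The Z-linear map m |-> 2m (mod 4); well defined since 4 divides 2*12."""
    return (2 * m) % K_MOD

kernel = {m for m in range(M_MOD) if f(m) == 0}
image = {f(m) for m in range(M_MOD)}

def coset(m):
    """The element m + ker f of the factor module, as a set of representatives."""
    return frozenset((m + n) % M_MOD for n in kernel)

cosets = {coset(m) for m in range(M_MOD)}

# The induced map sends a coset to the common value of f on it.
induced = {c: {f(m) for m in c} for c in cosets}
well_defined = all(len(values) == 1 for values in induced.values())
targets = [next(iter(values)) for values in induced.values()]
injective = len(set(targets)) == len(cosets)
onto_image = set(targets) == image

print(sorted(kernel), sorted(image), len(cosets))
print(well_defined, injective, onto_image)
```

In this example the image is a proper submodule of the codomain, so the factor module is isomorphic to the image of $f$ rather than to all of $\mathbb {Z} /4\mathbb {Z}$ .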

Corollary 5.18 (third isomorphism theorem):

Let $M$ be an $R$ -module, let $N\leq M$ and let $L\leq N$ . Then

M/N\cong (M/L){\big /}(N/L)

.

Proof:

Since $L\leq N$ and $N\leq M$ also $L\leq M$ by definition. We define the function

\varphi :M/L\to M/N,m+L\mapsto m+N

.

This is well-defined since

m+L=p+L\Leftrightarrow m-p\in L\Rightarrow m-p\in N\Leftrightarrow m+N=p+N

.

Furthermore,

m+L\in \ker \varphi \Leftrightarrow m+N=0+N\Leftrightarrow m\in N

and hence $\ker \varphi =N/L$ . Since $\varphi$ is surjective, our claim follows from corollary 5.17. $\Box$

Theorem 5.19 (second isomorphism theorem):

Let $L,N\leq M$ . Then

L/(L\cap N)\cong (L+N)/N

.

Proof:

Consider the homomorphism

\varphi :L\to (L+N)/N,\varphi (l):=l+N

.

Then $\varphi (l)=0\Leftrightarrow l\in N$ , which is why the kernel of that homomorphism is given by $L\cap N$ . Moreover, $\varphi$ is surjective, since $(l+n)+N=l+N$ for all $l\in L$ , $n\in N$ . Hence, the theorem follows by the first isomorphism theorem. $\Box$

And now for something completely different:

Theorem 5.20:

Let $\varphi :M\to N$ be a homomorphism of modules over $R$ and let $L\leq N$ . Then $\varphi ^{-1}(L)$ is a submodule of $M$ .

Proof:

Let $a,b\in \varphi ^{-1}(L)$ . Then $\varphi (a+b)=\varphi (a)+\varphi (b)\in L$ and hence $a+b\in \varphi ^{-1}(L)$ . Let further $r\in R$ . Then $\varphi (ra)=r\varphi (a)\in L$ . $\Box$

Similarly:

Theorem 5.21:

Let $\varphi :M\to N$ be a homomorphism of modules over $R$ and let $K\leq M$ . Then $\varphi (K)$ is a submodule of $N$ .

Proof: Let $a,b\in \varphi (K)$ . Then $a=\varphi (i),b=\varphi (j)$ and $a+b=\varphi (i+j)\in \varphi (K)$ . Let further $r\in R$ . Then $ra=\varphi (ri)\in \varphi (K)$ . $\Box$

Exercises

Exercise 5.3.1: Let $R,S$ be rings regarded as modules over themselves as in exercise 5.2.1. Prove that the ring homomorphisms $\varphi :R\to S$ are exactly the module homomorphisms $R\to S$ ; that is, every ring hom. is a module hom. and vice versa.

The projection morphism

Definition 5.22:

Let $M$ be a module and $N\leq M$ . By the mapping $\pi _{N}:M\to M/N$ we mean the canonical projection mapping $m\in M$ to $m+N$ ; that is,

\pi _{N}:M\to M/N,\pi _{N}(m):=m+N

.

The following two fundamental equations for $\pi _{N}(\pi _{N}^{-1}(S))$ and $\pi _{N}^{-1}(\pi _{N}(K))$ , where $S\subseteq M/N$ and $K\leq M$ , shall gain supreme importance in later chapters.

Theorem 5.23:

Let $M$ be a module and $N\leq M$ . Then for every set $S\subseteq M/N$ , $\pi _{N}(\pi _{N}^{-1}(S))=S$ . Furthermore, for every submodule $K\leq M$ , $\pi _{N}^{-1}(\pi _{N}(K))=K+N$ .

Proof:

Let first $m+N\in S$ . Then $m\in \pi _{N}^{-1}(S)$ , since $\pi _{N}(m)=m+N$ . Hence, $m+N\in \pi _{N}(\pi _{N}^{-1}(S))$ . Let then $m+N\in \pi _{N}(\pi _{N}^{-1}(S))$ . Then there exists $m'\in \pi _{N}^{-1}(S)$ such that $\pi _{N}(m')=m+N$ , that is $m'+N=m+N$ . Now $m'\in \pi _{N}^{-1}(S)$ means that $\pi _{N}(m')=m'+N\in S$ . Hence, $m+N=m'+N\in S$ .

Let first $m\in K+N$ , that is, $m=k+n$ for suitable $k\in K$ , $n\in N$ . Then $\pi _{N}(m)=k+n+N=k+N=\pi _{N}(k)\in \pi _{N}(K)$ , which is why by definition $m\in \pi _{N}^{-1}(\pi _{N}(K))$ . Let then $m\in \pi _{N}^{-1}(\pi _{N}(K))$ . Then $\pi _{N}(m)=m+N\in \pi _{N}(K)$ , that is $m+N=k+N$ with $k\in K$ , that is $m=k+n$ for a suitable $n\in N$ , that is $m\in K+N$ . $\Box$
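Both equations of theorem 5.23 can be verified directly in a small case. The Python sketch below assumes $M=\mathbb {Z} /12\mathbb {Z}$ , $N=\{0,6\}$ and $K=\{0,4,8\}$ ; cosets are represented as frozen sets, and the names `pi`, `image` and `preimage` are illustrative.

```python
M_MOD = 12
N = frozenset({0, 6})      # assumption: N = 6Z/12Z
K = frozenset({0, 4, 8})   # assumption: K = 4Z/12Z

def pi(m):
    """Canonical projection m |-> m + N, with the coset stored as a set."""
    return frozenset((m + n) % M_MOD for n in N)

def image(subset):
    """pi_N applied to a subset of M."""
    return {pi(m) for m in subset}

def preimage(cosets):
    """All elements of M whose coset lies in the given set of cosets."""
    return {m for m in range(M_MOD) if pi(m) in cosets}

K_plus_N = {(k + n) % M_MOD for k in K for n in N}
S = {pi(1), pi(3)}  # an arbitrary set of cosets, not a submodule

print(sorted(preimage(image(K))), sorted(K_plus_N))
print(image(preimage(S)) == S)
```

Note that the first equation holds for arbitrary sets $S$ of cosets, whereas the second one describes what happens to a submodule $K$ when it is pushed down and pulled back.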

The following lemma from elementary set theory has relevance for the projection morphism, and we will need it several times:

Lemma 5.24:

Let $f:S\to T$ be a function, where $S,T$ are completely arbitrary sets. Then $f$ induces a function $2^{S}\to 2^{T}$ via $A\mapsto f(A)$ , the image of $A$ , where $A\subseteq S$ . This function preserves inclusion. Further, the function $2^{T}\to 2^{S},B\mapsto f^{-1}(B)$ , also preserves inclusion.

Proof:

If $A'\subseteq A$ , let $y'\in f(A')$ . Then $y'=f(x')$ for an $x'\in A'\subseteq A$ , and hence $y'\in f(A)$ . Similarly for $f^{-1}$ . $\Box$

Exercises

Generators and chain conditions

Generators

Definition 6.1 (generators of modules):

Let $M$ be a module over the ring $R$ . A generating set of $M$ is a subset $\{m_{j}\}_{j\in J}\subseteq M$ such that

\forall n\in M:\exists j_{1},\ldots ,j_{k}\in J,r_{1},\ldots ,r_{k}\in R:n=\sum _{l=1}^{k}r_{l}m_{j_{l}}

.

Example 6.2:

For every module $M$ , the whole module itself is a generating set.

Definition 6.3:

Let $M$ be a module. $M$ is called finitely generated if there exists a generating set of $M$ which has a finite cardinality.

Example 6.4: Every ring $R$ is a finitely generated $R$ -module over itself, and a generating set is given by $\{1_{R}\}$ .
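Generating sets can be explored computationally in a finite example. The following Python sketch assumes $M=R=\mathbb {Z} /12\mathbb {Z}$ and computes all linear combinations of a given finite set of elements; the function names `generated` and `is_generating_set` are hypothetical.

```python
M_MOD = 12  # assumption: M = R = Z/12Z

def generated(gens):
    """All linear combinations sum r_l * m_l with coefficients in Z/12Z."""
    result = {0}
    for g in gens:
        # add every multiple of the next generator to what we have so far
        result = {(x + r * g) % M_MOD for x in result for r in range(M_MOD)}
    return result

def is_generating_set(gens):
    """True iff the linear combinations exhaust the whole module."""
    return generated(gens) == set(range(M_MOD))

print(is_generating_set({1}))
print(sorted(generated({4, 6})))
print(is_generating_set({4, 9}))
```

The set $\{4,9\}$ generates the whole module although neither $\{4\}$ nor $\{9\}$ does, so a generating set need not contain a single generator even when one exists.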

Definition 6.5 (generated submodules):

Exercises

Noetherian and Artinian modules

Definition 6.6 (Noetherian modules):

Let $M$ be a module over the ring $R$ . $M$ is called a Noetherian module iff for every ascending chain of submodules

N_{1}\subseteq N_{2}\subseteq N_{3}\subseteq \cdots \subseteq N_{k}\subseteq \cdots

of $M$ , there exists an $l\in \mathbb {N}$ such that

\forall k\geq l:N_{k}=N_{l}

.

We also say that ascending chains of submodules eventually become stationary.

Definition 6.7 (Artinian modules):

A module $M$ over a ring $R$ is called an Artinian module iff for every descending chain of submodules

N_{1}\supseteq N_{2}\supseteq N_{3}\supseteq \cdots \supseteq N_{k}\supseteq \cdots

of $M$ , there exists an $l\in \mathbb {N}$ such that

\forall k\geq l:N_{k}=N_{l}

.

We also say that descending chains of submodules eventually become stationary.

We see that those definitions are similar, although they define slightly different objects.
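A standard worked example, not taken from the text above, shows that the two notions really differ. Consider $\mathbb {Z}$ as a module over itself; its submodules are the sets $n\mathbb {Z}$ with $n\geq 0$ (compare exercise 5.2.1), and $a\mathbb {Z} \subseteq b\mathbb {Z}$ holds iff $b$ divides $a$ .

```latex
% An ascending chain starting at a nonzero submodule can only grow finitely often,
% since each proper step replaces the positive generator by a proper divisor:
6\mathbb{Z} \subsetneq 3\mathbb{Z} \subsetneq \mathbb{Z}
% A descending chain that never becomes stationary:
2\mathbb{Z} \supsetneq 4\mathbb{Z} \supsetneq 8\mathbb{Z} \supsetneq \cdots \supsetneq 2^{k}\mathbb{Z} \supsetneq \cdots
```

Hence $\mathbb {Z}$ is a Noetherian module over itself, but not an Artinian one.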

Using the axiom of choice, we have the following characterisation of Noetherian modules:

Theorem 6.8:

Let $M$ be a module over $R$ . The following are equivalent:

$M$ is Noetherian.
All the submodules of $M$ are finitely generated.
Every nonempty set of submodules of $M$ has a maximal element.

Proof 1:

We prove 1. $\Rightarrow$ 2. $\Rightarrow$ 3. $\Rightarrow$ 1.

1. $\Rightarrow$ 2.: Assume there is a submodule $N$ of $M$ which is not finitely generated. Using the axiom of dependent choice, we choose a sequence $(n_{k})_{k\in \mathbb {N} }$ in $N$ such that

\forall k\in \mathbb {N} :\langle n_{1},\ldots ,n_{k}\rangle \subsetneq \langle n_{1},\ldots ,n_{k+1}\rangle

;

it is possible to find such a sequence since we may just always choose $n_{k+1}\in N\setminus \langle n_{1},\ldots ,n_{k}\rangle$ , since $N$ is not finitely generated. Thus we have an ascending sequence of submodules

\langle n_{1}\rangle \subsetneq \langle n_{1},n_{2}\rangle \subsetneq \cdots \subsetneq \langle n_{1},\ldots ,n_{k}\rangle \subsetneq \langle n_{1},\ldots ,n_{k+1}\rangle \subsetneq \cdots

which does not stabilize.

2. $\Rightarrow$ 3.: Let ${\mathcal {M}}$ be a nonempty set of submodules of $M$ . Due to Zorn's lemma, it suffices to prove that every chain within ${\mathcal {M}}$ has an upper bound (of course, our partial order is set inclusion, i.e. $N_{1}\leq N_{2}:\Leftrightarrow N_{1}\subseteq N_{2}$ ). Hence, let ${\mathcal {N}}$ be a chain within ${\mathcal {M}}$ . We write

{\mathcal {N}}=\left(N_{1}\subseteq N_{2}\subseteq \cdots \right)=\left(\langle n_{1},\ldots ,n_{k_{1}}\rangle \subseteq \langle n_{1},\ldots ,n_{k_{1}},n_{k_{1}+1},\ldots ,n_{k_{2}}\rangle \subseteq \cdots \right)

.

Since every submodule is finitely generated, so is

\langle n_{1},n_{2},\ldots ,n_{k},n_{k+1},\ldots \rangle =\langle m_{1},\ldots ,m_{l}\rangle

.

We write $m_{j}=\sum _{u\in \mathbb {N} }r_{u}n_{u}$ , where only finitely many of the $r_{u}$ are nonzero. Hence, we have

\langle n_{1},n_{2},\ldots ,n_{k},n_{k+1},\ldots \rangle =\langle n_{u_{1}},\ldots ,n_{u_{r}}\rangle

for suitably chosen $u_{1},\ldots ,u_{r}$ . Now each $n_{u_{i}}$ is eventually contained in some $N_{j}$ . Since the $N_{j}$ are an ascending sequence with respect to inclusion, we may just choose $j$ large enough such that all $n_{u_{i}}$ are contained within $N_{j}$ . Hence, $N_{j}$ is the desired upper bound.

3. $\Rightarrow$ 1.: Let

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

be an ascending chain of submodules of $M$ . The set $\{N_{j}|j\in \mathbb {N} \}$ has a maximal element $N_{l}$ and thus this ascending chain becomes stationary at $l$ . $\Box$

Proof 2:

We prove 1. $\Rightarrow$ 3. $\Rightarrow$ 2. $\Rightarrow$ 1.

1. $\Rightarrow$ 3.: Let ${\mathcal {N}}$ be a nonempty set of submodules of $M$ which does not have a maximal element. Then for each $N\in {\mathcal {N}}$ there exists $N'\in {\mathcal {N}}$ such that $N\subsetneq N'$ (as otherwise, $N$ would be maximal). Hence, using the axiom of dependent choice and starting with a completely arbitrary $N_{1}\in {\mathcal {N}}$ , we find an ascending sequence

N_{1}\subsetneq N_{2}\subsetneq \cdots \subsetneq N_{k}\subsetneq N_{k+1}\subsetneq \cdots

which does not stabilize.

3. $\Rightarrow$ 2.: Let $N\leq M$ be not finitely generated. Using the axiom of dependent choice, we choose first an arbitrary $x_{1}\in N$ and given $x_{1},\ldots ,x_{k}$ we choose $x_{k+1}$ in $N\setminus \langle x_{1},\ldots ,x_{k}\rangle$ . Then the set of submodules

\{\langle x_{1},\ldots ,x_{k}\rangle {\big |}k\in \mathbb {N} \}

does not have a maximal element, although it is nonempty.

2. $\Rightarrow$ 1.: Let

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

be an ascending chain of submodules of $M$ . Since these are finitely generated, we have

\left(N_{1}\subseteq N_{2}\subseteq \cdots \right)=\left(\langle n_{1},\ldots ,n_{k_{1}}\rangle \subseteq \langle n_{1},\ldots ,n_{k_{1}},n_{k_{1}+1},\ldots ,n_{k_{2}}\rangle \subseteq \cdots \right)

for suitable $(k_{j})_{j\in \mathbb {N} }$ and $(n_{j})_{j\in \mathbb {N} }$ . Since every submodule is finitely generated, so is

\langle n_{1},n_{2},\ldots ,n_{k},n_{k+1},\ldots \rangle =\langle m_{1},\ldots ,m_{l}\rangle

.

We write $m_{j}=\sum _{u\in \mathbb {N} }r_{u}n_{u}$ , where only finitely many of the $r_{u}$ are nonzero. Hence, we have

\langle n_{1},n_{2},\ldots ,n_{k},n_{k+1},\ldots \rangle =\langle n_{u_{1}},\ldots ,n_{u_{r}}\rangle

for suitably chosen $u_{1},\ldots ,u_{r}$ . Now each $n_{u_{i}}$ is eventually contained in some $N_{j}$ . Hence, the chain stabilizes at $l$ , if $l$ is chosen as the maximum of those $j$ . $\Box$

The second proof might be advantageous since it does not use Zorn's lemma, which needs the full axiom of choice.

We can characterize Noetherian and Artinian modules in the following way:

Theorem 6.9:

Let $M$ be a module over a ring $R$ , and let $N\leq M$ . Then the following are equivalent:

$M$ is Noetherian.
$N$ and $M/N$ are Noetherian.

Proof 1:

We prove the theorem directly.

1. $\Rightarrow$ 2.: $N$ is Noetherian since any ascending sequence of submodules of $N$

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

is also a sequence of submodules of $M$ (check the submodule properties), and hence eventually becomes stationary.

$M/N$ is Noetherian, since if

M_{1}\subseteq M_{2}\subseteq \cdots \subseteq M_{k}\subseteq M_{k+1}\subseteq \cdots

is a sequence of submodules of $M/N$ , we may write

M_{k}=N_{k}/N

,

where $N_{k}:=\{m+n|m+N\in M_{k},n\in N\}$ . Indeed, " $\subseteq$ " follows from $m+N\in M_{k}\Rightarrow m+0+N\in N_{k}/N$ and " $\supseteq$ " follows from

l+N\in N_{k}/N\Rightarrow \exists m+N\in M_{k},n,n'\in N:l=m+n+n'\Rightarrow l+N=m+N\in M_{k}

.

Furthermore, $N_{k}$ is a submodule of $M$ as follows:

$l,l'\in N_{k}\Rightarrow \exists m+N,m'+N\in M_{k},n,n'\in N:l=m+n,l'=m'+n'\Rightarrow (m+m')+(n+n')=l+l'\in N_{k}$ since $m+m'+N\in M_{k}$ and $n+n'\in N$ ,
$l\in N_{k}\Rightarrow \exists m+N\in M_{k},n\in N:l=m+n\Rightarrow al\in N_{k}$ since $a(m+N)\in M_{k}$ and $an\in N$ .

Now further, for each $k\in \mathbb {N}$ we have $N_{k}\subseteq N_{k+1}$ , as can be read from the definition of the $N_{k}$ by observing that $m+N\in M_{k},n\in N\Rightarrow m+N\in M_{k+1},n\in N$ . Thus the sequence

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

becomes stationary at some $j\in \mathbb {N}$ . But if $N_{k}=N_{k+1}$ , then also $M_{k}=M_{k+1}$ , since

m+N\in M_{k+1}\Rightarrow m\in N_{k+1}\Rightarrow m\in N_{k}\Rightarrow m=m'+n,m'+N\in M_{k},n\in N\Rightarrow m+N=m'+N\in M_{k}

.

Hence,

M_{1}\subseteq M_{2}\subseteq \cdots \subseteq M_{k}\subseteq M_{k+1}\subseteq \cdots

becomes stationary as well.

2. $\Rightarrow$ 1.: Let

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

be an ascending sequence of submodules of $M$ . Then

N\cap N_{1}\subseteq N\cap N_{2}\subseteq \cdots \subseteq N\cap N_{k}\subseteq N\cap N_{k+1}\subseteq \cdots

is an ascending sequence of submodules of $N$ , and since $N$ is Noetherian, this sequence stabilizes at an $l\in \mathbb {N}$ . Furthermore, the sequence

(N_{1}+N)/N\subseteq (N_{2}+N)/N\subseteq \cdots \subseteq (N_{k}+N)/N\subseteq (N_{k+1}+N)/N\subseteq \cdots

is an ascending sequence of submodules of $M/N$ , which also stabilizes (at $j\in \mathbb {N}$ , say). Set $K:=\max\{l,j\}$ , and let $k\geq K$ . Let $n\in N_{k+1}$ . Then $n+N\in (N_{k+1}+N)/N$ and thus $n+N\in (N_{k}+N)/N$ , that is $n=m+n'$ for an $m\in N_{k}$ and an $n'\in N$ . Now $n'=n-m\in N_{k+1}$ since $N_{k}\subseteq N_{k+1}$ , hence $n'\in N_{k+1}\cap N=N_{k}\cap N$ . Hence $n=m+n'\in N_{k}$ . Thus,

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

is stable after $K$ . $\Box$

Proof 2:

We prove the statement using the projection morphism to the factor module.

1. $\Rightarrow$ 2.: $N$ is Noetherian as in the first proof. Let

M_{1}\subseteq M_{2}\subseteq \cdots \subseteq M_{k}\subseteq M_{k+1}\subseteq \cdots

be a sequence of submodules of $M/N$ . If $\pi :M\to M/N$ is the projection morphism, then

N_{k}:=\pi ^{-1}(M_{k})

defines an ascending sequence of submodules of $M$ , as $\pi ^{-1}$ preserves inclusion (lemma 5.24) and preimages of submodules are submodules (theorem 5.20). Now since $M$ is Noetherian, this sequence stabilizes. Hence, since also $\pi$ preserves inclusion, the sequence

M_{1}\subseteq M_{2}\subseteq \cdots =\pi (\pi ^{-1}(M_{1}))\subseteq \pi (\pi ^{-1}(M_{2}))\subseteq \cdots =\pi (N_{1})\subseteq \pi (N_{2})\subseteq \cdots

also stabilizes ( $\pi (\pi ^{-1}(M_{k}))=M_{k}$ since $\pi$ is surjective).

2. $\Rightarrow$ 1.: Let

N_{1}\subseteq N_{2}\subseteq \cdots \subseteq N_{k}\subseteq N_{k+1}\subseteq \cdots

be an ascending sequence of submodules of $M$ . Then the sequences

\pi (N_{1})\subseteq \pi (N_{2})\subseteq \cdots

and

N\cap N_{1}\subseteq N\cap N_{2}\subseteq \cdots

both stabilize, since $M/N$ and $N$ are Noetherian. Now $\pi ^{-1}(\pi (N_{k}))=N_{k}+N$ , since $\pi (m)\in \pi (N_{k})\Leftrightarrow m=n'+n,n'\in N_{k},n\in N$ . Thus,

N_{1}+N\subseteq N_{2}+N\subseteq \cdots \subseteq N_{k}+N\subseteq N_{k+1}+N\subseteq \cdots

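stabilizes. But for $N_{k}\subseteq N_{k+1}$ we have $N_{k}=N_{k+1}\Leftrightarrow N_{k}\cap N=N_{k+1}\cap N\wedge N_{k}+N=N_{k+1}+N$ (the direction $\Leftarrow$ follows from the modular law, as $N_{k+1}=N_{k+1}\cap (N_{k}+N)=N_{k}+(N_{k+1}\cap N)=N_{k}+(N_{k}\cap N)=N_{k}$ ), and hence the theorem follows. $\Box$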

Proof 3:

We use the characterisation of Noetherian modules as those with finitely generated submodules.

1. $\Rightarrow$ 2.: Let $K\leq N$ . Then $K\leq M$ and hence $K$ is finitely generated. Let $J\leq M/N$ . Then the module $\pi _{N}^{-1}(J)$ is finitely generated, with generators $g_{1},\ldots ,g_{n}$ , say. Then the set $\pi _{N}(g_{1}),\ldots ,\pi _{N}(g_{n})$ generates $J$ since $\pi _{N}$ is surjective and linear.

2. $\Rightarrow$ 1.: Let now $K\leq M$ . Then $J:=K\cap N$ is finitely generated, since it is also a submodule of $N$ . Furthermore,

L:=\{k+N|k\in K\}

is finitely generated, since it is a submodule of $M/N$ . Let $\{k_{1}+N,\ldots ,k_{t}+N\}$ be a generating set of $L$ . Let further $S$ be a finite generating set of $J$ , and set $S':=\{k_{1},\ldots ,k_{t}\}$ . Let $k\in K$ be arbitrary. Then $k+N\in L$ , hence $k+N=\sum _{j=1}^{t}r_{j}k_{j}+N$ (with suitable $r_{j}\in R$ ) and thus $k=\sum _{j=1}^{t}r_{j}k_{j}+n$ , where $n\in N$ ; we even have $n\in J$ due to $n=k-\sum _{j=1}^{t}r_{j}k_{j}\in K$ , which is why we may write it as a linear combination of elements of $S$ . Hence $S\cup S'$ is a finite generating set of $K$ . $\Box$

Proof 4:

We use the characterisation of Noetherian modules as those with maximal elements for sets of submodules.

1. $\Rightarrow$ 2.: If $\{K_{\alpha }\}_{\alpha \in A}$ is a family of submodules of $N$ , it is also a family of submodules of $M$ and hence contains a maximal element.

If $\{J_{\alpha }\}_{\alpha \in A}$ is a nonempty family of submodules of $M/N$ , then $\{\pi _{N}^{-1}(J_{\alpha })\}_{\alpha \in A}$ is a family of submodules of $M$ , which has a maximal element $\pi _{N}^{-1}(J_{\beta })$ . Since $\pi _{N}$ is inclusion-preserving and $\pi _{N}(\pi _{N}^{-1}(J))=J$ for all $J\leq M/N$ , $J_{\beta }$ is maximal among $\{J_{\alpha }\}_{\alpha \in A}$ .

2. $\Rightarrow$ 1.: Let $\{K_{\alpha }\}_{\alpha \in A}$ be a nonempty family of submodules of $M$ . According to the hypothesis, the family $\{K_{\alpha }\cap N\}_{\alpha \in B}$ , where $B$ is defined such that the corresponding $K_{\alpha }\cap N,\alpha \in B$ are maximal elements of the family $\{K_{\alpha }\cap N\}_{\alpha \in A}$ , is nonempty. Hence, the family $\{L_{\alpha }\}_{\alpha \in B}$ , where

L_{\alpha }:=\{k+N|k\in K_{\alpha }\}

,

has a maximal element $L_{\gamma }$ . We claim that $K_{\gamma }$ is maximal among $\{K_{\alpha }\}_{\alpha \in A}$ . Indeed, let $K_{\delta }\supseteq K_{\gamma }$ . Then $K_{\delta }\cap N=K_{\gamma }\cap N$ since $\gamma \in B$ . Hence, $\delta \in B$ . Furthermore, $L_{\delta }\supseteq L_{\gamma }$ and hence $L_{\delta }=L_{\gamma }$ by maximality of $L_{\gamma }$ . Let now $k\in K_{\delta }$ . Then $k+N\in L_{\delta }=L_{\gamma }$ . Thus $k+n\in K_{\gamma }$ for a suitable $n\in N$ . Now $n=(k+n)-k\in K_{\delta }\cap N=K_{\gamma }\cap N$ , and hence $k=(k+n)-n\in K_{\gamma }$ . Thus $K_{\delta }=K_{\gamma }$ .

We also could have first maximized the $L_{\alpha }$ and then the $K_{\alpha }\cap N$ . $\Box$

These proofs show that if the axiom of choice turns out to be contradictory to evident principles, then the different types of Noetherian modules still have some properties in common.

The analogous statement also holds for Artinian modules:

Theorem 6.10:

Let $M$ be a module over a ring $R$ , and let $N\leq M$ . Then the following are equivalent:

$M$ is Artinian.
$N$ and $M/N$ are Artinian.

That statement is proven as in proofs 1 or 2 of the previous theorem.

Lemma 6.11:

Let $M,N$ be modules, and let $\varphi :M\to N$ be a module isomorphism. Then

M{\text{ Noetherian}}\Leftrightarrow N{\text{ Noetherian}}

.

Proof:

Since $\varphi ^{-1}$ is also a module isomorphism, $\Rightarrow$ suffices.

Let $M$ be Noetherian. Using that $\varphi$ is an inclusion-preserving bijection of submodules which maps generating sets to generating sets (due to linearity), we can use either characterisation of Noetherian modules to prove that $\varphi (M)=N$ is Noetherian. $\Box$

Theorem 6.12:

Let $M,N$ be modules and let $\varphi :M\to N$ be a surjective module homomorphism. If $M$ is Noetherian, then so is $N$ .

Proof:

By the first isomorphism theorem, we have $N\cong M/\ker \varphi$ . By theorem 6.9, $M/\ker \varphi$ is Noetherian. Hence, by lemma 6.11, $N$ is Noetherian. $\Box$

Exercises

Exercise 6.2.1: Is every Noetherian module $M$ finitely generated?
Exercise 6.2.2: We define the ring $R$ as the real polynomials in infinitely many variables, i.e. $R:=\mathbb {R} [x_{1},x_{2},x_{3},\ldots ]$ . Prove that $R$ is a finitely generated module over itself which is not Noetherian.

The Cayley–Hamilton theorem and Nakayama's lemma

Determinants within a commutative ring

We shall now derive the notion of a determinant in the setting of a commutative ring.

Definition 7.1 (Determinant):

Let $R$ be a commutative ring, and let $n\in \mathbb {N}$ . A determinant is a function $\det :R^{n\times n}\to R$ satisfying the following three axioms:

$\det I_{n}=1$ , where $I_{n}$ is the $n\times n$ identity matrix.
If $A$ is a matrix such that two adjacent columns are equal, then $\det A=0$ .
For each $j\in \{1,\ldots ,n\}$ we have $\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j}+c\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})+c\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})$ , where $\mathbf {a} _{1},\ldots ,\mathbf {a} _{n},\mathbf {b} _{j}$ are columns and $c\in R$ .

We shall later see that there exists exactly one determinant.

Theorem 7.2 (Properties of a (the) determinant):

If $A\in R^{n\times n}$ has a column consisting entirely of zeroes, then $\det A=0$ .
If $A$ is a matrix, and one adds a multiple of one column to an adjacent column, then $\det A$ does not change.
If two adjacent columns of $A$ are exchanged, then $\det A$ is multiplied by $-1$ .
If any two columns of a matrix $A$ are exchanged, then $\det A$ is multiplied by $-1$ .
If $A$ is a matrix, and one adds a multiple of one column to any other column, then $\det A$ does not change.
If $A$ is a matrix that has two equal columns, then $\det A=0$ .
Let $\sigma \in S_{n}$ be a permutation, where $S_{n}$ is the $n$ -th symmetric group. If $A=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{n})$ , then $\det(\mathbf {a} _{\sigma (1)},\ldots ,\mathbf {a} _{\sigma (n)})=\operatorname {sgn} \sigma \det A$ .

Proofs:

1. Let $A=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})$ , where the $j$ -th column $\mathbf {a} _{j}$ is the zero vector. Then by axiom 3 for the determinant, setting $c=-1$ and $\mathbf {b} _{j}=\mathbf {a} _{j}$ ,

\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j}-\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})-\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=0

.

Alternatively, we may also set $c=1$ and $\mathbf {b} _{j}=\mathbf {a} _{j}=\mathbf {0}$ to obtain

\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j}+c\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=(1+c)\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})

,

from which the theorem follows by subtracting $\det A$ from both sides.

Those proofs correspond to the proofs for $T0=0$ for a linear map $T$ (in whatever context).

2. If we set $\mathbf {b} _{j}=\mathbf {a} _{j+1}$ or $\mathbf {b} _{j}=\mathbf {a} _{j-1}$ (dependent on whether we add the column left or the column right to the current column), then axiom 3 gives us

\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j}+c\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})+c\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})

,

where the latter determinant is zero because we have two adjacent equal columns.

3. Consider the two matrices $A:=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})$ and $B:=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j+1},\mathbf {a} _{j},\ldots ,\mathbf {a} _{n})$ . By 7.2, 2. and axiom 3 for determinants, we have

{\begin{aligned}\det B&=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j+1}+\mathbf {a} _{j},\mathbf {a} _{j},\ldots ,\mathbf {a} _{n})\\&=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j+1}+\mathbf {a} _{j},-\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})\\&=\det(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j},-\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})\\&=-\det A\end{aligned}}

.

4. We exchange the $j$ -th and $k$ -th column by first moving the $j$ -th column successively to spot $k$ (using $|j-k|$ swaps) and the $k$ -th column, which is now one step closer to the $j$ -th spot, to spot $j$ using $|j-k|-1$ swaps. In total, we used an odd number of swaps, and all the other columns are in the same place since they moved once to the right and once to the left. Hence, 4. follows from applying 3. to each swap.

5. Let's say we want to add $c\cdot \mathbf {a} _{k}$ to the $j$ -th column. Then we first use 4. to put the $j$ -th column adjacent to $\mathbf {a} _{k}$ , then use 2. to do the addition without change to the determinant, and then use 4. again to put the $j$ -th column back to its place. In total, the only change our determinant has suffered was twice multiplication by $-1$ , which cancels even in a general ring.

6. Let's say that the $j$ -th column and the $k$ -th column are equal, $k\neq j$ . Then we subtract column $j$ from column $k$ (or, indeed, the other way round) without change to the determinant, obtain a matrix with a zero column and apply 1.

7. Split $\sigma$ into swaps, use 4. repeatedly and use further that $\operatorname {sgn}$ is a group homomorphism. $\Box$

Note that we have only used axioms 2 & 3 for the preceding proof.

The following lemma will allow us to prove the uniqueness of the determinant, and also the formula $\det(AB)=\det A\det B$ .

Lemma 7.3:

Let $A=(a_{i,j})_{1\leq i,j\leq n}$ and $B=(b_{i,j})_{1\leq i,j\leq n}$ be two $n\times n$ matrices with entries in a commutative ring $R$ . Then

\det(AB)=\det A\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )b_{1,\sigma (1)}\cdots b_{n,\sigma (n)}

.

Proof:

The matrix $AB$ has $k$ -th column $\sum _{\nu =1}^{n}b_{\nu ,k}\mathbf {a} _{\nu }$ . Hence, by axiom 3 for determinants and theorem 7.2, 7. and 6., and finally by reindexing the sum via $\sigma \mapsto \sigma ^{-1}$ (note $\operatorname {sgn}(\sigma )=\operatorname {sgn}(\sigma ^{-1})$ ), we obtain, denoting $AB=:C=(c_{i,j})_{1\leq i,j\leq n}=(\mathbf {c} _{1},\ldots ,\mathbf {c} _{n})$ :

{\begin{aligned}\det(AB)&=\sum _{\nu _{1}=1}^{n}b_{\nu _{1},1}\det(\mathbf {a} _{\nu _{1}},\mathbf {c} _{2},\ldots ,\mathbf {c} _{n})\\&=\sum _{\nu _{1}=1}^{n}\sum _{\nu _{2}=1}^{n}b_{\nu _{1},1}b_{\nu _{2},2}\det(\mathbf {a} _{\nu _{1}},\mathbf {a} _{\nu _{2}},\mathbf {c} _{3},\ldots ,\mathbf {c} _{n})\\&=\cdots =\sum _{\nu _{1},\ldots ,\nu _{n}=1}^{n}b_{\nu _{1},1}\cdots b_{\nu _{n},n}\det(\mathbf {a} _{\nu _{1}},\ldots ,\mathbf {a} _{\nu _{n}})\\&=\det A\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )b_{\sigma (1),1}\cdots b_{\sigma (n),n}\\&=\det A\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )b_{1,\sigma (1)}\cdots b_{n,\sigma (n)}\end{aligned}}

\Box

Theorem 7.4 (Uniqueness of the determinant):

For each commutative ring, there is at most one determinant, and if it exists, it equals

\det C=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )c_{1,\sigma (1)}\cdots c_{n,\sigma (n)}

.

Proof:

Let $C\in R^{n\times n}$ be an arbitrary matrix, and set $A=I_{n}$ and $B=C$ in lemma 7.3. Then we obtain by axiom 1 for determinants (the first time we use that axiom)

\det C=\det(I_{n}C)=1\cdot \sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )c_{1,\sigma (1)}\cdots c_{n,\sigma (n)}

.

\Box

Theorem 7.5 (Multiplicativity of the determinant):

If $\det$ is a determinant, then

\det(AB)=\det A\det B

.

Proof:

From lemma 7.3 and theorem 7.4 we may infer

\det(AB)=\det(A)\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )b_{1,\sigma (1)}\cdots b_{n,\sigma (n)}=\det(A)\det(B)

.

\Box
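The multiplicativity can be checked numerically in a commutative ring that is not a field. The following is a minimal sketch, assuming the ring $\mathbb {Z} /12\mathbb {Z}$ and the permutation-sum formula from theorem 7.4; the helper names `sign`, `det_mod` and `matmul_mod` are illustrative choices and do not come from the text.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det_mod(a, m):
    """Permutation-sum formula of theorem 7.4, reduced modulo m (assumed ring Z/mZ)."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total % m

def matmul_mod(a, b, m):
    """Matrix product with entries reduced modulo m."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % m for j in range(n)]
            for i in range(n)]

M = 12  # hypothetical modulus; 12 is not prime, so Z/12Z is not a field
A = [[2, 5], [7, 3]]
B = [[4, 1], [6, 9]]
lhs = det_mod(matmul_mod(A, B, M), M)
rhs = det_mod(A, M) * det_mod(B, M) % M
print(lhs, rhs)  # 6 6
```

Note that $\det B=6$ is a zero divisor modulo $12$ here; the identity does not need any invertibility.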

Theorem 7.6 (Existence of the determinant):

Let $R$ be a commutative ring. Then

\det(A):=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}

is a determinant.

Proof:

First of all, $I_{n}$ has zero entries everywhere except on the diagonal, where all entries equal $1$ . Hence, if $I_{n}=(a_{i,j})_{1\leq i,j\leq n}$ , then $a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}$ vanishes unless $\sigma (1)=1,\ldots ,\sigma (n)=n$ , i.e. unless $\sigma$ is the identity, in which case the product equals $1$ . Hence $\det(I_{n})=1$ .

Let now $A$ be a matrix whose $j$ -th and $(j+1)$ -th columns are equal. The function

f:S_{n}\to S_{n},f(\sigma )=k\mapsto {\begin{cases}\sigma (k)&\sigma (k)\notin \{j,j+1\}\\j+1&\sigma (k)=j\\j&\sigma (k)=j+1\end{cases}}

is bijective, since the inverse is given by $f$ itself. Furthermore, since $f$ amounts to composing $\sigma$ with the swap of $j$ and $j+1$ , it is sign reversing. Hence, we have

{\begin{aligned}\det(A)&=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}\\&=\sum _{\operatorname {sgn} \sigma =1}a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}-\sum _{\operatorname {sgn} \sigma =-1}a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}\\&=\sum _{\operatorname {sgn} \sigma =1}a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}-\sum _{\operatorname {sgn} \sigma =1}a_{1,f(\sigma )(1)}\cdots a_{n,f(\sigma )(n)}\end{aligned}}

.

Now since the $j$ -th and $(j+1)$ -th columns of $A$ are identical, and $f(\sigma )(l)$ differs from $\sigma (l)$ at most by exchanging the column indices $j$ and $j+1$ , we have $a_{k,\sigma (l)}=a_{k,f(\sigma )(l)}$ for all $k,l\in \{1,\ldots ,n\}$ . Hence both sums coincide and $\det A=0$ .

Linearity follows from the linearity of each summand:

\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots (a_{\sigma ^{-1}(j),j}+cb_{\sigma ^{-1}(j),j})\cdots a_{n,\sigma (n)}=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{\sigma ^{-1}(j),j}\cdots a_{n,\sigma (n)}+c\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots b_{\sigma ^{-1}(j),j}\cdots a_{n,\sigma (n)}

.

\Box
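The three axioms can also be checked on concrete integer matrices. The following is a minimal sketch, assuming $R=\mathbb {Z}$ and 0-based indices; the function names `sign` and `leibniz_det` are illustrative and not part of the text.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def leibniz_det(a):
    """Sum over all permutations sigma of sgn(sigma) * a[0][sigma(0)] * ... ."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
equal_columns = [[1, 1, 4], [2, 2, 5], [3, 3, 6]]  # columns 1 and 2 coincide
print(leibniz_det(identity))       # 1  (axiom 1)
print(leibniz_det(equal_columns))  # 0  (axiom 2)

# Axiom 3 on a 2x2 example: first column a + c*b with a=(1,3), b=(7,8), c=2.
combined = leibniz_det([[1 + 2 * 7, 2], [3 + 2 * 8, 4]])
separate = leibniz_det([[1, 2], [3, 4]]) + 2 * leibniz_det([[7, 2], [8, 4]])
print(combined, separate)          # 22 22
```

The sum has $n!$ terms, so this is only practical for small $n$ ; it is meant as a direct transcription of the formula, not as an efficient algorithm.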

Theorem 7.7:

The determinant of any matrix equals the determinant of the transpose of that matrix.

Proof:

Observe that inversion is a bijection on $S_{n}$ the inverse of which is given by inversion ( $(\sigma ^{-1})^{-1}=\sigma$ ). Further observe that $\operatorname {sgn}(\sigma )=\operatorname {sgn}(\sigma ^{-1})$ , since we just apply all the transpositions in reverse order. Hence,

\det A=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}=\sum _{\sigma ^{-1}\in S_{n}}\operatorname {sgn}(\sigma ^{-1})a_{1,\sigma ^{-1}(1)}\cdots a_{n,\sigma ^{-1}(n)}=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{\sigma (1),1}\cdots a_{\sigma (n),n}=\det A^{t}

.

\Box

Theorem 7.8 (column expansion):

Let $A$ be an $n\times n$ matrix over a commutative ring $R$ . For $1\leq i,j\leq n$ define $A_{i,j}$ to be the $(n-1)\times (n-1)$ matrix obtained by crossing out the $i$ -th row and $j$ -th column from $A$ . Then for any $k\in \{1,\ldots ,n\}$ we have

\det A=\sum _{\nu =1}^{n}(-1)^{\nu +k}a_{\nu ,k}\det A_{\nu ,k}

.

Proof 1:

We prove the theorem from the formula for the determinant given by theorems 7.4 and 7.6.

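Let $k\in \{1,\ldots ,n\}$ be fixed. For each $\nu \in \{1,\ldots ,n\}$ , we define

f_{\nu }:S_{n-1}\to S_{n},f_{\nu }(\sigma ):=m\mapsto {\begin{cases}k&m=\nu \\\sigma (m)&m<\nu \wedge \sigma (m)<k\\\sigma (m)+1&m<\nu \wedge \sigma (m)\geq k\\\sigma (m-1)&m>\nu \wedge \sigma (m-1)<k\\\sigma (m-1)+1&m>\nu \wedge \sigma (m-1)\geq k\end{cases}}

.

Every permutation in $S_{n}$ arises as $f_{\nu }(\sigma )$ for exactly one pair $(\nu ,\sigma )$ (here $\nu$ is the preimage of $k$ ). Moreover, $\operatorname {sgn}(f_{\nu }(\sigma ))=(-1)^{\nu +k}\operatorname {sgn}(\sigma )$ , since $f_{\nu }(\sigma )$ is obtained from $\sigma$ (extended to $\{1,\ldots ,n\}$ by $n\mapsto n$ ) by composing with $n-\nu$ adjacent swaps on one side and $n-k$ adjacent swaps on the other. Then

{\begin{aligned}\sum _{\nu =1}^{n}(-1)^{\nu +k}a_{\nu ,k}\det A_{\nu ,k}&=\sum _{\nu =1}^{n}(-1)^{\nu +k}a_{\nu ,k}\sum _{\sigma \in S_{n-1}}\operatorname {sgn}(\sigma )a_{1,f_{\nu }(\sigma )(1)}\cdots a_{\nu -1,f_{\nu }(\sigma )(\nu -1)}a_{\nu +1,f_{\nu }(\sigma )(\nu +1)}\cdots a_{n,f_{\nu }(\sigma )(n)}\\&=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}.\end{aligned}}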

\Box

Proof 2:

We note that all of the above derivations could have been done with rows instead of columns (which amounts to nothing more than exchanging $a_{i,j}$ with $a_{j,i}$ each time), and would have ended up with the same formula for the determinant since

\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{1,\sigma (1)}\cdots a_{n,\sigma (n)}=\sum _{\sigma ^{-1}\in S_{n}}\operatorname {sgn}(\sigma ^{-1})a_{1,\sigma ^{-1}(1)}\cdots a_{n,\sigma ^{-1}(n)}=\sum _{\sigma \in S_{n}}\operatorname {sgn}(\sigma )a_{\sigma (1),1}\cdots a_{\sigma (n),n}

as argued in theorem 7.7.

Hence, we prove that the function $R^{n\times n}\to R$ given by the formula $\sum _{\nu =1}^{n}(-1)^{\nu +k}a_{\nu ,k}\det A_{\nu ,k}$ satisfies 1 - 3 of 7.1 with rows instead of columns, and then apply theorem 7.4 with rows instead of columns.

1.

Set $A=I_{n}$ to obtain

\sum _{\nu =1}^{n}a_{\nu ,k}(-1)^{\nu +k}\det A_{\nu ,k}=(-1)^{2k}a_{k,k}\det A_{k,k}=1\cdot 1=1

.

2.

Let $A$ have two equal adjacent rows, the $j$ -th and $j+1$ -th, say. Then

\sum _{\nu =1}^{n}a_{\nu ,k}(-1)^{\nu +k}\det A_{\nu ,k}=(-1)^{j+k}a_{j,k}\det A_{j,k}+(-1)^{j+1+k}a_{j+1,k}\det A_{j+1,k}=0

,

since each of the $A_{\nu ,k}$ has two equal adjacent rows except for possibly $\nu =j$ and $\nu =j+1$ , which is why, by theorem 7.6, the determinant is zero in all those cases, and further $a_{j,k}=a_{j+1,k}$ and $A_{j,k}=A_{j+1,k}$ , since in both we deleted "the same" row.

3.

Define $B:=(b_{i,j})_{1\leq i,j\leq n}:=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {a} _{j}+c\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})^{t}$ , and for each $\nu ,k\in \{1,\ldots ,n\}$ define $C_{\nu ,k}$ as the matrix obtained by crossing out the $\nu$ -th row and the $k$ -th column from the matrix $C:=(c_{i,j})_{1\leq i,j\leq n}:=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {b} _{j},\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})^{t}$ . Note that $B_{j,k}=A_{j,k}=C_{j,k}$ and that $b_{\nu ,k}=a_{\nu ,k}=c_{\nu ,k}$ for $\nu \neq j$ . Then by theorem 7.6 and axiom 3 for the determinant,

{\begin{aligned}\sum _{\nu =1}^{n}b_{\nu ,k}(-1)^{\nu +k}\det B_{\nu ,k}&=\sum _{\nu =1}^{j-1}a_{\nu ,k}(-1)^{\nu +k}(\det A_{\nu ,k}+c\det C_{\nu ,k})+(-1)^{j+k}(a_{j,k}+cc_{j,k})\det A_{j,k}+\sum _{\nu =j+1}^{n}a_{\nu ,k}(-1)^{\nu +k}(\det A_{\nu ,k}+c\det C_{\nu ,k})\\&=\sum _{\nu =1}^{n}a_{\nu ,k}(-1)^{\nu +k}\det A_{\nu ,k}+c\sum _{\nu =1}^{n}c_{\nu ,k}(-1)^{\nu +k}\det C_{\nu ,k}\\&=\det A+c\det C\end{aligned}}

.

Hence follows linearity by rows. $\Box$
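The column expansion translates directly into a recursive procedure. The following is a minimal sketch, assuming integer entries and 0-based indices (so the sign $(-1)^{\nu +k}$ has the same parity as in the 1-based statement); the names `minor` and `det_by_column` are illustrative.

```python
def minor(a, i, j):
    """Matrix obtained from a by crossing out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det_by_column(a, k=0):
    """Expand along column k; the smaller determinants are expanded along column 0."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum(
        (-1) ** (nu + k) * a[nu][k] * det_by_column(minor(a, nu, k), 0)
        for nu in range(n)
    )

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
# The theorem says that every choice of column gives the same value.
print([det_by_column(A, k) for k in range(3)])  # [18, 18, 18]
```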

For the sake of completeness, we also note the following lemma:

Lemma 7.9:

Let $A$ be an invertible matrix. Then $\det(A)$ is invertible.

Proof:

Indeed, $\det(A)^{-1}=\det(A^{-1})$ due to the multiplicativity of the determinant. $\Box$

The converse is also true and will be proven in the next subsection.

Exercises

Exercise 7.1.1: Argue that the determinant, seen as a map from the set of all matrices (where scalars are $1\times 1$ -matrices), is idempotent.

Cramer's rule in the general case

Theorem 7.10 (Cramer's rule, solution of linear equations):

Let $R$ be a commutative ring, let $A=(a_{i,j})_{1\leq i,j\leq n}$ be a matrix with entries in $R$ and let $\mathbf {b} =(b_{1},\ldots ,b_{n})^{t}$ be a vector. If $A$ is invertible, the unique solution to $Ax=\mathbf {b}$ is given by

x_{j}={\frac {\det A_{j}}{\det A}}

,

where $A_{j}$ is obtained by replacing the $j$ -th column of $A$ by $\mathbf {b}$ .

Proof 1:

Let $j\in \{1,\ldots ,n\}$ be arbitrary but fixed. The determinant of $A$ is linear in the $j$ -th column, and hence gives rise to a linear map $L_{j}:R^{n}\to R$ mapping any vector to the determinant of $A$ with the $j$ -th column replaced by that vector. If $\mathbf {a} _{j}$ is the $j$ -th column of $A$ , then $L_{j}(\mathbf {a} _{j})=\det(A)$ . Furthermore, if we insert a different column $\mathbf {a} _{k}$ , $k\neq j$ , into $L_{j}$ , we obtain zero, since we obtain the determinant of a matrix where the column $\mathbf {a} _{k}$ appears twice. We now consider the system of equations

{\begin{cases}a_{1,1}x_{1}+\cdots +a_{1,n}x_{n}&=b_{1}\\&\vdots \\a_{n,1}x_{1}+\cdots +a_{n,n}x_{n}&=b_{n},\end{cases}}

where $(x_{1},\ldots ,x_{n})^{T}$ is the unique solution of the system $Ax=\mathbf {b}$ , which exists since it is given by $A^{-1}\mathbf {b}$ , $A$ being invertible. Since $L_{j}$ is linear, we find a $1\times n$ matrix $(c_{1},\ldots ,c_{n})$ such that for all $\mathbf {v} \in R^{n}$

(c_{1},\ldots ,c_{n})\cdot \mathbf {v} =L_{j}(\mathbf {v} )

;

in fact, due to theorem 7.8, $c_{k}=(-1)^{j+k}\det(A_{k,j})$ . We now add up the lines of the linear equation system above in the following way: We take $c_{1}$ times the first row, add $c_{2}$ times the second row and so on. Due to our considerations, this yields the result

\det(A)x_{j}=L_{j}(\mathbf {b} )

.

Due to lemma 7.9, $\det(A)$ is invertible. Hence, we get

x_{j}=(\det(A))^{-1}L_{j}(\mathbf {b} )=(\det(A))^{-1}\det(A_{j})

and hence the theorem. $\Box$

Proof 2:

For all $j\in \{1,\ldots ,n\}$ , we define the matrix

X_{j}:={\begin{pmatrix}1&0&\cdots &0&x_{1}&0&\cdots &&0\\0&1&\cdots &0&x_{2}&0&\cdots &&0\\\vdots &&\ddots &&\vdots &&&&\vdots \\\vdots &&&&\vdots &\ddots &&&\vdots \\0&&\cdots &0&x_{n-1}&0&\cdots &1&0\\0&&\cdots &0&x_{n}&0&\cdots &0&1\end{pmatrix}};

this matrix shall represent a unit matrix, where the $j$ -th column is replaced by the vector $(x_{1},\ldots ,x_{n})^{\mathbf {T} }$ . By expanding the $j$ -th column, we find that the determinant of this matrix is given by $\det(X_{j})=x_{j}$ .

We now note that if $A=(\mathbf {a} _{1},\ldots ,\mathbf {a} _{n})$ , then $X_{j}=A^{-1}(\mathbf {a} _{1},\ldots ,\mathbf {a} _{j-1},\mathbf {b} ,\mathbf {a} _{j+1},\ldots ,\mathbf {a} _{n})=A^{-1}A_{j}$ , since $A^{-1}\mathbf {a} _{k}$ is the $k$ -th unit vector and $A^{-1}\mathbf {b} =x$ . Hence

x_{j}=\det(A^{-1}A_{j})=\det(A^{-1})\det(A_{j})=\det(A)^{-1}\det(A_{j})

,

where the last equality follows as in lemma 7.9. $\Box$
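Cramer's rule can be tried out over a ring that is not a field. The following is a minimal sketch, assuming the ring $\mathbb {Z} /26\mathbb {Z}$ and Python 3.8 or later (for the modular inverse `pow(d, -1, m)`); the function names and the sample system are illustrative assumptions, not taken from the text.

```python
def det(a):
    """Determinant by expansion along the first column (the 0x0 case returns 1)."""
    n = len(a)
    if n == 0:
        return 1
    return sum(
        (-1) ** i * a[i][0] * det([row[1:] for r, row in enumerate(a) if r != i])
        for i in range(n)
    )

def cramer_solve_mod(a, b, m):
    """Solve a x = b over Z/mZ, assuming det(a) is a unit modulo m."""
    d_inv = pow(det(a) % m, -1, m)  # raises ValueError if det(a) is not a unit
    solution = []
    for j in range(len(a)):
        # A_j: replace the j-th column of a by b.
        a_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(det(a_j) * d_inv % m)
    return solution

A = [[3, 3], [2, 5]]  # det = 9, a unit modulo 26 (9 * 3 = 27)
b = [1, 2]
x = cramer_solve_mod(A, b, 26)
print(x)  # [23, 12]
```

If the determinant is not a unit (for instance $\det A=2$ modulo $26$ ), the call to `pow` fails, which matches the hypothesis of the theorem that $A$ be invertible.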

Theorem 7.11 (Cramer's rule, matrix inversion):

Let $A$ be an $n\times n$ matrix with entries in a commutative ring $R$ . We recall that the cofactor matrix $\operatorname {Cof} A$ of $A$ is the matrix with $(i,j)$ -th entry

(-1)^{i+j}\det(A_{i,j})

,

where $A_{i,j}$ is obtained from $A$ by crossing out the $i$ -th row and $j$ -th column. We further recall that the adjugate matrix $\operatorname {adj} (A)$ is given by

\operatorname {adj} (A):=\operatorname {Cof} (A)^{\mathsf {T}}

.

With this definition, we have

\operatorname {adj} (A)A=\det(A)I_{n}

.

In particular, if $\det(A)$ is a unit within $R$ , then $A$ is invertible and

A^{-1}={\frac {1}{\det(A)}}\operatorname {adj} (A)

.

Proof:

For $j\in \{1,\ldots ,n\}$ , we set $\mathbf {b} _{j}:=e_{j}=(0,\ldots ,0,1,0,\ldots ,0)^{T}$ , where the one is at the $j$ -th place. Further, we set $L_{j}$ to be the linear function from proof 1 of theorem 7.10, and $M_{j}$ its matrix. Then $\operatorname {adj} (A)$ is given by

\operatorname {adj} (A)={\begin{pmatrix}-M_{1}-\\\vdots \\-M_{n}-\end{pmatrix}}

due to theorem 7.8. Hence,

\operatorname {adj} (A)A={\begin{pmatrix}-M_{1}A-\\\vdots \\-M_{n}A-\end{pmatrix}}={\begin{pmatrix}\det(A)&0&\cdots &\cdots &0\\0&\det(A)&&0&\vdots \\\vdots &&\ddots &&\vdots \\\vdots &0&&\det(A)&0\\0&\cdots &\cdots &0&\det(A)\end{pmatrix}},

where we used the properties of $L_{j}$ established in proof 1 of theorem 7.10. $\Box$
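The identity $\operatorname {adj} (A)A=\det(A)I_{n}$ holds even when $\det(A)$ is not a unit, which the following sketch illustrates over $\mathbb {Z}$ . The function names (`minor`, `det`, `adjugate`, `matmul`) and the sample matrix are illustrative assumptions.

```python
def minor(a, i, j):
    """Cross out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Expansion along the first column; the empty matrix has determinant 1."""
    if not a:
        return 1
    return sum((-1) ** i * a[i][0] * det(minor(a, i, 0)) for i in range(len(a)))

def adjugate(a):
    """Transpose of the cofactor matrix."""
    n = len(a)
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
print(det(A))                  # 18, not a unit in Z
print(matmul(adjugate(A), A))  # [[18, 0, 0], [0, 18, 0], [0, 0, 18]]
```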

The theorems

Now we may finally apply the machinery we have set up to prove the following two fundamental theorems.

Theorem 7.12 (the Cayley–Hamilton theorem):

Let $M$ be a finitely generated $R$ -module, let $\phi :M\to M$ be a module morphism and let $I\leq R$ be an ideal of $R$ such that $\phi (M)\subseteq IM$ . Then there exist $n\in \mathbb {N}$ and $a_{n-1},\ldots ,a_{1},a_{0}\in I$ such that

\phi ^{n}+a_{n-1}\phi ^{n-1}+\cdots +a_{1}\phi +a_{0}=0

;

this equation is to be read as

\forall m\in M:\phi ^{n}(m)+a_{n-1}\phi ^{n-1}(m)+\cdots +a_{1}\phi (m)+a_{0}m=0

,

where $\phi ^{n}(m)$ means applying $\phi$ to $m$ $n$ times.

Note that the polynomial in $\phi$ is monic, that is, the leading coefficient is $1$ , the unit of the ring in question.

Proof: Assume that $\{m_{1},\ldots ,m_{n}\}$ is a generating set for $M$ . Since $\phi (M)\subseteq IM$ , we may write

\phi (m_{j})=\sum _{k=1}^{n}b_{j,k}m_{k},~j\in \{1,\ldots ,n\}

(*),

where $b_{j,k}\in I$ for all $j,k$ . We now define a new commutative ring as follows:

{\tilde {R}}:=R[\phi ]=\left\{\sum _{k=0}^{d}r_{k}\phi ^{k}\,{\Big |}\,d\in \mathbb {N} ,r_{0},\ldots ,r_{d}\in R\right\}

,

where we regard each element $r$ of $R$ as the endomorphism $m\mapsto rm$ on $M$ . That is, ${\tilde {R}}$ is a subring of the endomorphism ring of $M$ (that is, multiplication is given by composition). Since $\phi$ is $R$ -linear, ${\tilde {R}}$ is commutative.

Now to every $n\times n$ matrix $A$ with entries in ${\tilde {R}}$ we may associate a function

A():M^{n}\to M^{n},A\left((x_{1},\ldots ,x_{n})^{T}\right):=\left(\sum _{k=1}^{n}a_{1,k}(x_{k}),\ldots ,\sum _{k=1}^{n}a_{n,k}(x_{k})\right)^{T}

.

By exploiting the linearities of all functions involved, it is easy to see that for another $n\times n$ matrix with entries in ${\tilde {R}}$ called $B$ , the associated function of $AB$ equals the composition of the associated functions of $A$ and $B$ ; that is, $(AB)(x)=A(B(x))$ .

Now with this in mind, we may rewrite the system (*) as follows:

A(x)=0

,

where $x:=(m_{1},\ldots ,m_{n})^{T}$ is the vector of generators and $A$ has $j,k$ -th entry $\delta _{j,k}\phi -b_{j,k}\in {\tilde {R}}$ . Now define $B:=\operatorname {adj} (A)$ . From Cramer's rule (theorem 7.11) we obtain that

BA=I_{n}\det(A)

,

which is why

(\det(A)x_{1},\ldots ,\det(A)x_{n})^{t}=(BA)(x)=B(A(x))=B(0)=\mathbf {0}

, the zero vector.

Hence, $\det A\in {\tilde {R}}$ is the zero mapping, since it sends all generators to zero. Now further, as can be seen e.g. from the representation given in theorem 7.4, it has the form

\phi ^{n}+a_{n-1}\phi ^{n-1}+\cdots +a_{1}\phi +a_{0}

for suitable $a_{n-1},\ldots ,a_{0}\in I$ . $\Box$
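In the classical special case $M=\mathbb {Z} ^{n}$ , $I=R=\mathbb {Z}$ and $\phi$ given by an integer matrix $B$ , the polynomial produced by the proof is $\det(xI_{n}-B)$ , and the theorem says that $B$ is a root of it. The following is a minimal sketch of that special case only; polynomials are represented as coefficient lists (lowest degree first), and all function names are illustrative assumptions.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_det(a):
    """Expansion along the first column; entries of a are coefficient lists."""
    if not a:
        return [1]
    total = [0]
    for i in range(len(a)):
        sub = [row[1:] for r, row in enumerate(a) if r != i]
        term = poly_mul(a[i][0], poly_det(sub))
        if i % 2:
            term = [-c for c in term]
        total = poly_add(total, term)
    return total

def char_poly(b):
    """det(x*I - b); entry (j, k) of x*I - b is the polynomial delta_jk * x - b_jk."""
    n = len(b)
    return poly_det([[[-b[j][k], 1 if j == k else 0] for k in range(n)]
                     for j in range(n)])

def evaluate_at_matrix(p, b):
    """Horner scheme: substitute the matrix b for x in the polynomial p."""
    n = len(b)
    result = [[0] * n for _ in range(n)]
    for coeff in reversed(p):
        result = [[sum(result[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                  for i in range(n)]
        for i in range(n):
            result[i][i] += coeff
    return result

B = [[1, 2], [3, 4]]
p = char_poly(B)
print(p)                         # [-2, -5, 1], i.e. x^2 - 5x - 2
print(evaluate_at_matrix(p, B))  # [[0, 0], [0, 0]]
```

The leading coefficient is $1$ , in line with the remark that the polynomial is monic.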

Theorem 7.13 (Nakayama's lemma):

Let $R$ be a ring, $M$ a finitely generated $R$ -module and $I\leq R$ an ideal such that $IM=M$ . Then there exists an $x\equiv 1\mod I$ such that $xM=0$ .

Proof:

Choose $\phi =\operatorname {Id} _{M}$ in theorem 7.12 (which is applicable since $\phi (M)=M=IM$ ) to obtain for $m\in M$ that

\phi ^{n}(m)+a_{n-1}\phi ^{n-1}(m)+\cdots +a_{1}\phi (m)+a_{0}m=(1+a_{n-1}+\cdots +a_{0})m=0

for suitable $a_{n-1},\ldots ,a_{0}\in I$ , since the identity is idempotent. Hence $x:=1+a_{n-1}+\cdots +a_{0}$ satisfies $x\equiv 1\mod I$ and $xM=0$ . $\Box$
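A small concrete instance may help to see what the lemma asserts. The following sketch assumes $R=\mathbb {Z}$ , $M=\mathbb {Z} /n\mathbb {Z}$ and $I=k\mathbb {Z}$ ; these choices and the function names are illustrative and not taken from the text. In this setting $IM$ is the image of multiplication by $k$ on $M$ , and the sketch searches by brute force for an $x\equiv 1\mod k$ annihilating $M$ .

```python
def im_equals_m(n, k):
    """For M = Z/nZ and I = kZ, the submodule IM is the image of m -> k*m."""
    return {(k * m) % n for m in range(n)} == set(range(n))

def nakayama_witness(n, k):
    """Smallest x >= 1 with x = 1 (mod k) and x * M = 0, assuming IM = M."""
    if not im_equals_m(n, k):
        raise ValueError("hypothesis IM = M is not satisfied")
    x = 1
    while any((x * m) % n for m in range(n)):
        x += k  # stay in the residue class 1 + kZ
    return x

print(nakayama_witness(15, 4))  # 45: 45 = 11 * 4 + 1 and 15 divides 45
```

For instance, `nakayama_witness(4, 2)` raises an error, because $2\cdot \mathbb {Z} /4\mathbb {Z} \neq \mathbb {Z} /4\mathbb {Z}$ and the hypothesis of the lemma fails.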

Direct products, direct sums and the tensor product

Direct products and direct sums

Definition 8.1:

Let $M_{\alpha },\alpha \in A$ be modules. The direct product of $M_{\alpha },\alpha \in A$ is the (possibly infinite) cartesian product

\prod _{\alpha \in A}M_{\alpha }

together with component-wise addition, module operation and thus zero and additive inverses.

Theorem 8.2:

In the category of modules, the direct product constitutes a product.

Proof:

Let ${\mathcal {J}}$ be any index category, that contains one element $j_{\alpha }\in {\mathcal {J}}$ for each $\alpha \in A$ , no other elements, and only the identity morphisms. Let $N$ be any other object such that

Definition 8.3:

Let $R$ be a commutative ring, and let $\{M_{\alpha }\}_{\alpha \in A}$ be modules over $R$ . The direct sum

\bigoplus _{\alpha \in A}M_{\alpha }

is defined to be the module consisting of tuples $(m_{\alpha })_{\alpha \in A}$ where only finitely many of the $m_{\alpha }$ s are nonzero, together with component-wise addition and component-wise module operation.

Lemma 8.4:

Let $M_{\alpha },\alpha \in A$ be modules. Their direct sum is a submodule of the direct product.

Proof:

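The direct sum is a subset of the direct product, and both carry the same operations. Sums and scalar multiples of tuples with only finitely many nonzero entries again have only finitely many nonzero entries, so the direct sum is a module with those operations. Therefore we have a submodule. $\Box$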

Lemma 8.5:

For each $\alpha \in A$ , there is a canonical morphism

M_{\alpha }\to \bigoplus _{\alpha \in A}M_{\alpha }

.

Proof:

\iota _{\alpha }:M_{\alpha }\to \bigoplus _{\alpha \in A}M_{\alpha },\iota _{\alpha }(m):=(0,\ldots ,0,\overbrace {m} ^{\alpha {\text{-th place}}},0,\ldots ,0)

.

\Box

Lemma 8.6:

\operatorname {Hom} _{R}\left(\bigoplus _{\alpha \in A}M_{\alpha },N\right)\cong \prod _{\alpha \in A}\operatorname {Hom} _{R}(M_{\alpha },N)

.

Proof:

Consider the morphism

\varphi :\operatorname {Hom} _{R}\left(\bigoplus _{\alpha \in A}M_{\alpha },N\right)\to \prod _{\alpha \in A}\operatorname {Hom} _{R}(M_{\alpha },N),f\mapsto (f\circ \iota _{\alpha })_{\alpha \in A}

.

We claim that this is an isomorphism, so we check all points.

1. Well-defined:

Both $\iota _{\alpha }$ and $f$ are morphisms (with suitable domains and images), so $f\circ \iota _{\alpha }$ is as well.

2. Injective:

Assume $(f\circ \iota _{\alpha })_{\alpha \in A}=(g\circ \iota _{\alpha })_{\alpha \in A}$ . Then for any $(m_{\alpha })_{\alpha \in A}$ contained in $\bigoplus _{\alpha \in A}M_{\alpha }$ we have

f((m_{\alpha })_{\alpha \in A})=\sum _{\alpha \in A}f((0,\ldots ,0,\overbrace {m_{\alpha }} ^{\alpha {\text{-th place}}},0,\ldots ,0))=\sum _{\alpha \in A}f\circ \iota _{\alpha }(m_{\alpha })=\sum _{\alpha \in A}g\circ \iota _{\alpha }(m_{\alpha })=g((m_{\alpha })_{\alpha \in A})

;

note that the sum is finite, since we are in the direct sum; this is necessary since infinite sums are not defined. Hence $f=g$ .

3. Surjective:

Let $(h_{\alpha })_{\alpha \in A}\in \prod _{\alpha \in A}\operatorname {Hom} _{R}(M_{\alpha },N)$ . Define

f:\bigoplus _{\alpha \in A}M_{\alpha }\to N,f((m_{\alpha })_{\alpha \in A}):=\sum _{\alpha \in A}h_{\alpha }(m_{\alpha })

.

The latter sum is finite because $h_{\alpha }(0)=0$ and all but finitely many $m_{\alpha }$ are zero. Thus this is well-defined as a function, and direct computation proves easily that it is $R$ -linear. Hence we have a morphism, and further

\varphi (f)=(f\circ \iota _{\alpha })_{\alpha \in A}=(h_{\alpha })_{\alpha \in A}

.

\Box
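The two directions of this isomorphism can be sketched for an infinite index set. The following is a minimal sketch, assuming $M_{\alpha }=N=\mathbb {Z}$ for $\alpha \in \{0,1,2,\ldots \}$ ; elements of the direct sum are modelled as dictionaries holding only the finitely many nonzero components. The names `iota`, `assemble` and the family `h` are illustrative assumptions.

```python
def iota(alpha, m):
    """Canonical map M_alpha -> direct sum: m sits at index alpha, zero elsewhere."""
    return {alpha: m} if m != 0 else {}

def assemble(h):
    """Surjectivity step: from a family alpha -> h(alpha) build f on the direct sum."""
    def f(element):
        # Finite sum, since element stores only finitely many components.
        return sum(h(alpha)(m) for alpha, m in element.items())
    return f

def h(alpha):
    """Hypothetical family of homomorphisms Z -> Z: multiplication by alpha + 1."""
    return lambda m: (alpha + 1) * m

f = assemble(h)
element = {0: 5, 7: -2, 100: 1}
print(f(element))                # 90 = 1*5 + 8*(-2) + 101*1
print(f(iota(7, 3)), h(7)(3))    # 24 24, i.e. f composed with iota_7 recovers h_7
```

The same construction would not work on the direct product, where an element may have infinitely many nonzero components and the sum would not be defined.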

Theorem 8.7:

In the category of modules, the direct sum constitutes a coproduct.

Quotient spaces

To be then used to construct the tensor product.

The tensor product

Definition 8.8:

Let $R$ be a ring and $M,N$ modules over that ring. Consider the set of all pairs

(m,n),m\in M,n\in N

and endow this with multiplication and addition by formal linear combinations, producing elements such as

\sum _{k=1}^{l}r_{k}(m_{k},n_{k})

where the $r_{k}$ are in $R$ . We have obtained the free $R$ -module of formal linear combinations (call it $V$ ). Consider the submodule

{\begin{aligned}S:=\langle \{&(rm,n)-r(m,n),\\&(m,rn)-r(m,n),\\&(m+m',n)-(m,n)-(m',n),\\&(m,n+n')-(m,n)-(m,n')|r\in R,m,m'\in M,n,n'\in N\}\rangle \subseteq V\end{aligned}}

,

generated by the listed elements. We form the quotient

M\otimes N:=V/S

.

This is called the tensor product. To indicate that $M,N$ are $R$ -modules, one often writes

M\otimes N:=M\otimes _{R}N

.

The following theorem shows that the tensor product has something to do with bilinear maps:

Theorem 8.9:

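Let $M,N,K$ be $R$ -modules and let $f:M\times N\to K$ be $R$ -bilinear. Then there exists a unique morphism $g:M\otimes N\to K$ such that $g([(m,n)])=f(m,n)$ for all $m\in M$ , $n\in N$ ; that is, the triangle formed by $f$ , $g$ and the canonical map $M\times N\to M\otimes N,(m,n)\mapsto [(m,n)]$ commutes.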

Proof:

Let $f:M\times N\to K$ be any $R$ -bilinear map. Define

g\left(\left[\sum _{j=1}^{l}r_{j}(m_{j},n_{j})\right]\right):=\sum _{j=1}^{l}r_{j}f(m_{j},n_{j})

,

where the square brackets indicate the equivalence class.

Once we proved that this is well-defined, the linearity of $g$ easily follows. We thus have to show that $g$ maps equivalent vectors to the same element, which after subtracting the right hand side follows from $g$ mapping $S$ to zero.

Indeed, let

\sum _{j=1}^{l}r_{j}s_{j}\in S

,

where all $s_{j}$ are one of the four types of generators of $S$ . By distinguishing cases, one obtains that each type of generator of $S$ is mapped to zero by the linear extension of $f$ to $V$ because of bilinearity. Well-definedness follows, and linearity is clear from the definition and since addition and module operation interchange with equivalence class formation. Uniqueness holds because the classes $[(m,n)]$ generate $M\otimes N$ . $\Box$

Note that from a category theory perspective, this theorem 8.9 states that for any two modules $M,N$ over the same ring, the arrow

M\times N{\overset {\otimes }{\rightarrow }}M\otimes N

is a universal arrow. Hence, we call the result of theorem 8.9 the universal property of the tensor product.
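For free modules of finite rank, the universal property can be made concrete. The following sketch assumes the standard identification $\mathbb {Z} ^{2}\otimes \mathbb {Z} ^{3}\cong \mathbb {Z} ^{6}$ , under which $[(u,v)]$ corresponds to the flattened outer product of $u$ and $v$ ; this identification is not proven in the text, and the bilinear map `f` (given by a hypothetical matrix `G`) as well as all names are illustrative.

```python
def pure_tensor(u, v):
    """Flattened outer product, standing in for the class [(u, v)] (assumption)."""
    return [ui * vj for ui in u for vj in v]

G = [[1, 2, 0],
     [-1, 0, 3]]  # hypothetical coefficients of a bilinear map Z^2 x Z^3 -> Z

def f(u, v):
    """Bilinear map f(u, v) = sum_ij u_i * G_ij * v_j."""
    return sum(u[i] * G[i][j] * v[j] for i in range(2) for j in range(3))

def g(t):
    """Linear map on Z^6 induced by f: pair t with the flattened coefficients."""
    flat = [G[i][j] for i in range(2) for j in range(3)]
    return sum(c * x for c, x in zip(flat, t))

u, v = [2, -1], [1, 0, 4]
print(f(u, v), g(pure_tensor(u, v)))  # -9 -9

# The map (u, v) -> pure_tensor(u, v) is itself bilinear, e.g. additive in u:
u2 = [3, 5]
left = pure_tensor([a + b for a, b in zip(u, u2)], v)
right = [a + b for a, b in zip(pure_tensor(u, v), pure_tensor(u2, v))]
print(left == right)                  # True
```

Here `g` is linear while `f` is only bilinear; the factorisation $f=g\circ \otimes$ is exactly what theorem 8.9 provides in general.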

Lemma 8.10:

Let $R$ be a ring and $M$ be an $R$ -module. Recall that using canonical operations, $R$ is an $R$ -module over itself. We have

R\otimes _{R}M\cong M

.

Proof:

Define the morphism

R\times M\to M,(r,m)\mapsto rm

,

extend it to all formal linear combinations via summation

\sum _{k=1}^{l}s_{k}(r_{k},m_{k})\mapsto \sum _{k=1}^{l}s_{k}r_{k}m_{k}

and then observe that

\varphi :R\otimes _{R}M\to M,\varphi \left(\left[\sum _{k=1}^{l}s_{k}(r_{k},m_{k})\right]\right):=\sum _{k=1}^{l}s_{k}r_{k}m_{k}

is well-defined; again, by subtracting the right hand side, it's enough to show that $S$ is mapped to zero, and this is again done by consideration of each of the four generating types.

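This is a morphism as shown by direct computation (using the rules for the module operation), and it is clearly surjective, since $\varphi ([(1,m)])=m$ . To see that it is injective, note that the generators of $S$ yield $[s(r,m)]=[(sr,m)]=[(1,srm)]$ and $[(1,m)]+[(1,m')]=[(1,m+m')]$ , so that every element of $R\otimes _{R}M$ can be written as

\left[\sum _{k=1}^{l}s_{k}(r_{k},m_{k})\right]=\left[\left(1,\sum _{k=1}^{l}s_{k}r_{k}m_{k}\right)\right]

.

If $\varphi$ maps this element to zero, then $\sum _{k=1}^{l}s_{k}r_{k}m_{k}=0$ , and the element equals $[(1,0)]=[(1,0\cdot 0)]=0\cdot [(1,0)]=0$ . $\Box$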

Lemma 8.11:

Let $M,N,K$ be $R$ -modules. Then

M\otimes (N\otimes K)\cong (M\otimes N)\otimes K

.

Proof:

For $m\in M$ fixed, define the bilinear function

f_{m}:N\times K\to (M\otimes N)\otimes K,f_{m}((n,k)):=[([(m,n)],k)]

.

Applying theorem 8.9 yields

g_{m}:N\otimes K\to (M\otimes N)\otimes K

such that $g_{m}([(n,k)])=[([(m,n)],k)]$ . Then define

F:M\times (N\otimes K)\to (M\otimes N)\otimes K,F\left(m,\left[\sum _{j=1}^{l}r_{j}(n_{j},k_{j})\right]\right):=g_{m}\left(\left[\sum _{j=1}^{l}r_{j}(n_{j},k_{j})\right]\right)

.

This function is bilinear (linearity in $m$ from

[([(m+\lambda m',n)],k)]=[([(m,n)+\lambda (m',n)],k)]=[([(m,n)]+\lambda [(m',n)],k)]=\cdots =[([(m,n)],k)]+\lambda [([(m',n)],k)]

)

and thus theorem 8.9 yields a morphism

G:M\otimes (N\otimes K)\to (M\otimes N)\otimes K

such that

G([(m,[(n,k)])])=F(m,[(n,k)])=g_{m}([(n,k)])=[([(m,n)],k)]

.

An analogous process yields a morphism

H:(M\otimes N)\otimes K\to M\otimes (N\otimes K)

such that

H([([(m,n)],k)])=[(m,[(n,k)])]

.

Since addition within tensor products commutes with equivalence class formation, $G$ and $H$ are inverses. $\Box$

Lemma 8.12:

Let $N_{\alpha },\alpha \in A$ be $R$ -modules, let $M$ be an $R$ -module. Then

M\otimes \left(\bigoplus _{\alpha \in A}N_{\alpha }\right)\cong \bigoplus _{\alpha \in A}(M\otimes N_{\alpha })

.

Proof:

We define

f:M\times \left(\bigoplus _{\alpha \in A}N_{\alpha }\right)\to \bigoplus _{\alpha \in A}(M\otimes N_{\alpha }),(m,(n_{\alpha })_{\alpha \in A})\mapsto \left([(m,n_{\alpha })]\right)_{\alpha \in A}

.

This is bilinear (since formation of equivalence classes commutes with summation and module operation), and hence theorem 8.9 yields a morphism

g:M\otimes \left(\bigoplus _{\alpha \in A}N_{\alpha }\right)\to \bigoplus _{\alpha \in A}(M\otimes N_{\alpha })

such that

g([(m,(n_{\alpha })_{\alpha \in A})])=\left([(m,n_{\alpha })]\right)_{\alpha \in A}

.

This is obviously surjective. It is injective because

{\begin{aligned}&g\left(\left[\sum _{j=1}^{l}r_{j}(m_{j},(n_{\alpha }^{j})_{\alpha \in A})\right]\right)=g\left(\left[\sum _{q=1}^{p}s_{q}(x_{q},(y_{\alpha }^{q})_{\alpha \in A})\right]\right)\\\Leftrightarrow &\left(\left[\sum _{j=1}^{l}r_{j}(m_{j},n_{\alpha }^{j})\right]\right)_{\alpha \in A}=\left(\left[\sum _{q=1}^{p}s_{q}(x_{q},y_{\alpha }^{q})\right]\right)_{\alpha \in A}\end{aligned}}

by the linearity of $g$ and component-wise addition in the direct sum, and equality for the direct sum is component-wise. We split the argument up into sums where only one component of the right direct sum matters, and observe equality since we divide out isomorphic spaces. $\Box$
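
As a small illustration of lemma 8.12, the following sketch works out one concrete case. The choices $R=\mathbb {Z}$ , $M=\mathbb {Z} /2$ , $N_{1}=\mathbb {Z}$ and $N_{2}=\mathbb {Z} /3$ are ours and serve only as an example; we also use that $R$ acts as a unit for $\otimes$ , i.e. $M\otimes R\cong M$ .

```latex
% Illustrative assumption: R = Z, M = Z/2, N_1 = Z, N_2 = Z/3.
\begin{aligned}
\mathbb{Z}/2\otimes(\mathbb{Z}\oplus\mathbb{Z}/3)
 &\cong(\mathbb{Z}/2\otimes\mathbb{Z})\oplus(\mathbb{Z}/2\otimes\mathbb{Z}/3)\\
 &\cong\mathbb{Z}/2\oplus 0 .
\end{aligned}
% The second summand vanishes: for x in Z/2 and y in Z/3 we have 2x = 0 and 3y = 0, so
x\otimes y=3(x\otimes y)-2(x\otimes y)=x\otimes 3y-2x\otimes y=0 .
```

In particular, a tensor product of two nonzero modules can be zero, and the distributive law keeps track of this summand by summand.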

Lemma 8.13:

$M\otimes N\cong N\otimes M$ .

Proof:

Linear extension of

[(m,n)]\mapsto [(n,m)]

defines a morphism which is well-defined due to symmetry, linear by definition and bijective because of the obvious inverse. $\Box$

We have proven:

Theorem 8.14:

Let $R$ be a fixed ring. Up to isomorphism, the collection of all $R$ -modules forms a commutative semiring, where the addition is given by $\oplus$ (direct sum), the multiplication by $\otimes$ (tensor product), the zero by the trivial module and the unit by $R$ .

Note that we have more: From lemma 8.12 even infinite direct sums (uncountably many, as many as you like, ...) distribute over the tensor product. Incidentally, only finite direct sums are identical to the direct product. This may give hints for an infinite distributive law for infinitesimals.

Theorem 8.15 ("tensor-hom adjunction"):

Let $M,N,K$ be $R$ -modules. Then

\operatorname {Hom} (M\otimes N,K)\cong \operatorname {Hom} (M,\operatorname {Hom} (N,K))

.

Proof:

Set

\varphi :\operatorname {Hom} (M\otimes N,K)\to \operatorname {Hom} (M,\operatorname {Hom} (N,K)),\varphi (f)(m)=n\mapsto f([(m,n)])

.

Due to the equalities holding for elements of the tensor product and the linearity of $f$ , this is well-defined. Further, we obviously have linearity in $f$ since function addition and module operation are defined point-wise.

Further set

\psi :\operatorname {Hom} (M,\operatorname {Hom} (N,K))\to \operatorname {Hom} (M\times N,K),\psi (g)(m,n)=g(m)(n)

.

Applying theorem 8.9 to the bilinear map $\psi (g)$ for each $g$ , we get a map

\theta :\operatorname {Hom} (M,\operatorname {Hom} (N,K))\to \operatorname {Hom} (M\otimes N,K)

such that

\theta (g)([(m,n)])=g(m)(n)

.

Then $\theta$ and $\varphi$ are inverse morphisms, since $f:M\otimes N\to K$ is determined by what it does on elements of the form $[(m,n)]$ . $\Box$
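
The last step can be spelled out as follows. This is only an unwinding of the defining formulas of $\varphi$ and $\theta$ given above; no new notation is introduced.

```latex
% Check on generators [(m,n)] of M \otimes N, for f in Hom(M \otimes N, K):
\theta(\varphi(f))([(m,n)])=\varphi(f)(m)(n)=f([(m,n)]) ,
% and pointwise, for g in Hom(M, Hom(N, K)):
\varphi(\theta(g))(m)(n)=\theta(g)([(m,n)])=g(m)(n) .
```

The first line shows $\theta \circ \varphi =\operatorname {Id}$ because both sides are linear and agree on the generators $[(m,n)]$ ; the second shows $\varphi \circ \theta =\operatorname {Id}$ directly.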

Theorem 8.16:

Let $M,N$ be $R$ -modules isomorphic to each other (via $\theta :M\to N$ ), and let $K$ be any other $R$ -module. Then

M\otimes K\cong N\otimes K

via an isomorphism

\varphi :M\otimes K\to N\otimes K

such that

\varphi ([(m,k)])=[(\theta (m),k)]

for all $m\in M$ , $k\in K$ .

Proof:

The map

\phi :M\times K\to N\otimes K,\phi (m,k)=\theta (m)\otimes k

is bilinear, and hence induces a map

\varphi :M\otimes K\to N\otimes K

such that

\varphi ([(m,k)])=[(\theta (m),k)]

.

Similarly, the map

\phi _{-1}:N\times K\to M\otimes K,\phi _{-1}(n,k)=\theta ^{-1}(n)\otimes k

induces a map

\varphi ^{-1}:N\otimes K\to M\otimes K

such that

\varphi ^{-1}([(n,k)])=[(\theta ^{-1}(n),k)]

.

These maps are obviously inverse on elements of the type $m\otimes k$ , $n\otimes k$ , and by their linearity and since addition and equivalence classes commute, they are inverse to each other. $\Box$

Fractions, annihilator

Fractions within rings

Definition 9.1:

Let $R$ be a commutative ring, and let $S\subseteq R$ be an arbitrary subset. $S$ is called multiplicatively closed iff the following two conditions hold:

$1\in S$
$a,b\in S\Rightarrow ab\in S$

Definition 9.2:

Let $R$ be a ring and $S\subseteq R$ a multiplicatively closed subset. Define

S^{-1}R:=\{r/s|r\in R,s\in S\}/\sim _{S}

,

where the equivalence relation $\sim _{S}$ is defined as

r/s\sim _{S}u/t:\Leftrightarrow \exists i\in S:i(rt-su)=0

.

Equip this with addition

r/s+u/t:=(rt+us){\big /}st

and multiplication

r/s\cdot u/t:=(ru){\big /}st

.

The following two lemmata ensure that everything is correctly defined.

Lemma 9.3:

$\sim _{S}$ is an equivalence relation.

Proof:

For reflexivity and symmetry, nothing interesting happens. For transitivity, there is a little twist. Assume

r/s\sim _{S}u/t

and

u/t\sim _{S}v/w

.

Then there are $i,j\in S$ such that

i(rt-su)=0

and

j(uw-tv)=0

.

But in this case, we have

ijt(rw-vs)=ij(rwt-vst)=ij(rwt-suw+suw-vst)=0

;

note $ijt\in S$ because $S$ is multiplicatively closed. $\Box$
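
To see why the auxiliary factor $i$ is needed in the definition of $\sim _{S}$ , here is a small worked example. The ring $\mathbb {Z} /6$ and the set $S=\{1,2,4\}$ are our own choice for illustration.

```latex
% Illustrative assumption: R = Z/6, S = {1, 2, 4}.
% S is multiplicatively closed in Z/6:
2\cdot 2=4,\qquad 2\cdot 4=8=2,\qquad 4\cdot 4=16=4 .
% The fractions 3/1 and 0/1 are equivalent, witnessed by i = 2:
2\cdot (3\cdot 1-1\cdot 0)=6=0 ,
% although plain cross-multiplication (i = 1) does not give zero:
3\cdot 1-1\cdot 0=3\neq 0 .
```

So in a ring with zero divisors, two fractions can be equivalent without their cross-multiplied difference being zero; the factor from $S$ is exactly what makes transitivity work in the proof above.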

Lemma 9.4:

The addition and multiplication given above turn $S^{-1}R$ into a ring.

Proof:

We only prove well-definedness; the other rules follow from the definition and direct computation.

Let thus $r/s\sim _{S}u/t$ and $a/p\sim _{S}b/q$ .

Thus, we have $i(rt-su)=0$ and $j(aq-bp)=0$ for suitable $i,j\in S$ .

We want

(rp+as){\big /}sp=(uq+bt){\big /}tq

and

ra{\big /}sp=ub{\big /}tq

.

These translate to

x((rp+as)tq-(uq+bt)sp)=0

and

y(ratq-ubsp)=0

for suitable $x,y\in S$ . We get the desired result by picking $x=y=ij$ and observing

ij(ratq-ubsp)=ij(ratq-sauq+sauq-ubsp)=0

and

ij((rp+as)tq-(uq+bt)sp)=ij(rptq+astq-uqsp-btsp)=0

.

\Box

Note that we were heavily using commutativity here.

Theorem 9.5 (properties of augmentation):

Let $R$ be a ring and $S\subseteq R$ multiplicatively closed. Set

\pi _{S}:R\to S^{-1}R,\pi _{S}(r):=r/1

,

the projection morphism. Then:

$s\in S\Rightarrow \pi _{S}(s)$ is a unit.
$\pi _{S}(r)=0\Rightarrow rs=0$ for some $s\in S$ .
Every element of $S^{-1}R$ has the form $\pi _{S}(r)\pi _{S}(s)^{-1}$ for suitable $r\in R$ , $s\in S$ .
Let $I,J\leq R$ be ideals. Then $S^{-1}(I\cdot J)=S^{-1}I\cdot S^{-1}J$ , where

S^{-1}I:=\{i/s|i\in I,s\in S\}

.

Let $I\leq R$ be an ideal. If $S\cap I\neq \emptyset$ , then $S^{-1}I=S^{-1}R$ , that is, the ideal of $S^{-1}R$ generated by $\pi _{S}(I)$ is the whole ring.

We will see further properties like 4. when we go to modules, but we can't phrase it in full generality because in modules, we may not have a product of two module elements.

Proof:

1.:

If $s\in S$ , then the rules for multiplication for $S^{-1}R$ indicate that $1/s$ is an inverse for $\pi _{S}(s)=s/1$ .

2.:

Assume $r/1=0=0/1$ . Then there exists $s\in S$ such that $s(r-0)=sr=0$ .

3.:

Let $r/s$ be an arbitrary element of $S^{-1}R$ . Then $r/s=r/1\cdot 1/s=\pi _{S}(r)\pi _{S}(s)^{-1}$ .

4.

{\begin{aligned}r/s\in S^{-1}(I\cdot J)&\Leftrightarrow r/s=ij/t,i\in I,j\in J,t\in S\\&\Leftrightarrow r/s=(i/t)(j/1),i\in I,j\in J,t\in S\\&\Leftrightarrow r/s\in S^{-1}I\cdot S^{-1}J\end{aligned}}

5.

Let $I\cap S\neq \emptyset$ , that is, $s\in S\cap I$ . Then $\pi _{S}(s)\in \pi _{S}(I)$ , where $\pi _{S}(s)$ is a unit in $S^{-1}R$ , so the ideal of $S^{-1}R$ generated by $\pi _{S}(I)$ is the whole ring. Concretely, every element $r/t$ of $S^{-1}R$ can be written as $rs/ts$ with $rs\in I$ and $ts\in S$ , hence $S^{-1}I=S^{-1}R$ . $\Box$

Theorem 9.6 (universal property):

Let $R$ be a ring, $S\subseteq R$ multiplicatively closed, let $T$ be another ring and let

f:R\to T

be a morphism, such that for all $s\in S$ , $f(s)\in T^{\times }$ . Then there exists a unique morphism

g:S^{-1}R\to T

such that

f=g\circ \pi _{S}

.

Proof:

We first prove uniqueness. Assume there exists another such morphism $g'$ . Then we would have

g'(r/s)=g'(r/1)g'(s/1)^{-1}=f(r)f(s)^{-1}=g(r/1)g(s/1)^{-1}=g(r/s)

.

Then we prove existence; we claim that

g(r/s):=f(r)(f(s))^{-1}

defines the desired morphism.

First, we show well-definedness.

Firstly, $f(s)^{-1}$ exists for $s\in S$ .

Secondly, let $r/s\sim _{S}u/t$ , that is, $i(rt-su)=0$ . Then

{\begin{aligned}g(r/s)&=g(itr/its)\\&=f(itr)(f(its))^{-1}\\&=f(isu)(f(its))^{-1}\\&=g(isu/its)=g(u/t).\end{aligned}}

The multiplicativity of this morphism is visually obvious (use that $f\circ \pi _{S}$ is a morphism and commutativity); additivity is proven as follows:

{\begin{aligned}g(r/s+u/t)&=g\left((rt+su){\big /}st\right)\\&=f(rt+su)(f(st))^{-1}\\&=f(rt)(f(st))^{-1}+f(su)(f(st))^{-1}\\&=g(r/s)+g(u/t).\end{aligned}}

It is obvious that the unit is mapped to the unit. $\Box$
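
As a hedged illustration of the formula $g(r/s)=f(r)f(s)^{-1}$ , consider the following example; the rings and the morphism are our own choice and are not taken from the text.

```latex
% Illustrative assumption: R = Z, S = {1, 2, 4, 8, ...}, T = Z/3, f = reduction mod 3.
% f(2) = 2 is a unit in Z/3 because 2 \cdot 2 = 4 = 1, hence f(s) is a unit for every s in S.
g(1/2)=f(1)f(2)^{-1}=1\cdot 2=2 ,
\qquad
g(5/4)=f(5)f(4)^{-1}=2\cdot 1^{-1}=2 .
% Consistency check: 5/4 = (5/2) \cdot (1/2), and g(5/2) g(1/2) = (2 \cdot 2) \cdot 2 = 8 = 2 in Z/3.
```

By contrast, reduction modulo $2$ does not extend to $S^{-1}\mathbb {Z}$ for this $S$ , since it sends $2\in S$ to $0$ , which is not a unit in $\mathbb {Z} /2$ .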

Theorem 9.7:

Category theory context

Fractions within modules

Definition 9.8:

Let $R$ be a ring, $S\subseteq R$ a multiplicative subset of $R$ and $M$ an $R$ -module. Set $S^{-1}R$ to be the ring $R$ augmented by inverses of $S$ . We define the $S^{-1}R$ -module $S^{-1}M$ as follows:

S^{-1}M:=\left\{m/s{\big |}m\in M,s\in S\right\}/\sim _{S}

(the formal fractions),

where again

m/s\sim _{S}n/t:\Leftrightarrow \exists u\in S:u(tm-sn)=0

,

with addition

m/s+n/t:=(tm+sn)/st

and module operation

r/s\cdot m/t:=rm/st

.

Note that applying this construction to a ring $R$ that is canonically an $R$ -module over itself, we obtain nothing else but $S^{-1}R$ canonically seen as an $S^{-1}R$ -module over itself, since multiplication and addition coincide. Thus, we have a generalisation here!

That everything is well-defined is seen exactly as in the last section; the proofs carry over verbatim.

Theorem 9.9 (properties of the augmented module):

Let $M$ be an $R$ -module, let $S\subseteq R$ be a multiplicatively closed subset of $R$ , and let $N,K\leq M$ be submodules. Then

$S^{-1}(N+K)=S^{-1}N+S^{-1}K$ ,
$S^{-1}(N\cap K)=S^{-1}N\cap S^{-1}K$ , and
$S^{-1}(M/N)\cong (S^{-1}M)/(S^{-1}N)$ ;

in the first two equations, all modules are seen as submodules of $S^{-1}M$ (as above with $S^{-1}I$ ), and in the third isomorphy relation, the modules are seen as independent $S^{-1}R$ -modules.

Proof:

1.

{\begin{aligned}m/s\in S^{-1}(N+K)&\Leftrightarrow m/s=(n+k)/t,t\in S,n\in N,k\in K\\&\Leftrightarrow m/s=n/t+k/t,t\in S,n\in N,k\in K\\&\Leftrightarrow m/s\in S^{-1}N+S^{-1}K;\end{aligned}}

note that to get from the third row back to the second, we used that submodules are closed under multiplication by an element of $R$ to equalize denominators and thus get a suitable $t\in S$ ( $S$ is closed under multiplication).

2.

{\begin{aligned}m/s\in S^{-1}(N\cap K)&\Leftrightarrow m/s=l/t,t\in S,l\in N\cap K\\&\Leftrightarrow m/s=n/u=k/v,u,v\in S,n\in N,k\in K\\&\Leftrightarrow m/s\in S^{-1}N\cap S^{-1}K;\end{aligned}}

to get from the second to the first row, we note $n/u=k/v\Leftrightarrow w(vn-uk)=0$ for a suitable $w\in S$ , and in particular for example

m/s=wvn/wvu

,

where $wvn=wuk\in N\cap K$ .

3.

We set

\varphi :S^{-1}(M/N)\to (S^{-1}M)/(S^{-1}N),\varphi ((m+N)/s):=m/s+S^{-1}N

and prove that this is an isomorphism.

First we prove well-definedness. Indeed, if $m+N=m'+N$ , then $m-m'\in N$ , hence $(m-m')/s\in S^{-1}N$ and thus $m/s+S^{-1}N=m'/s+S^{-1}N$ .

Then we prove surjectivity. Let $m/s+S^{-1}N$ be given. Then obviously $(m+N)/s$ is mapped to that element.

Then we prove injectivity. Assume $m/s\in S^{-1}N$ . Then $m/s=n/t$ , where $n\in N$ and $t\in S$ , that is $u(tm-sn)=0$ for a suitable $u\in S$ . Then $utm\in N$ and therefore $(m+N)/s=(utm+N)/uts=0$ . $\Box$
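
The third statement can be tested on a small example. The following sketch assumes $R=M=\mathbb {Z}$ , $N=6\mathbb {Z}$ and $S=\{1,2,4,8,\ldots \}$ (our choice), so that $S^{-1}M=\mathbb {Z} [1/2]$ .

```latex
% Illustrative assumption: R = M = Z, N = 6Z, S = {1, 2, 4, 8, ...}.
% Left-hand side: in S^{-1}(Z/6Z) the class of 3 becomes zero, since 2 is in S and
2\cdot (3+6\mathbb{Z})=6+6\mathbb{Z}=0 .
% Right-hand side: 3/1 = 6/2 lies in S^{-1}(6Z), so
S^{-1}(6\mathbb{Z})=3\cdot \mathbb{Z}[1/2] ,
\qquad
\varphi\left((3+6\mathbb{Z})/1\right)=3/1+S^{-1}(6\mathbb{Z})=0 .
```

One checks that both sides consist of exactly the three classes represented by $0$ , $1$ and $2$ , so both are isomorphic to $\mathbb {Z} /3$ , in agreement with the theorem.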

Theorem 9.10:

functor relating tensor product and fractions

Theorem 9.11:

Let $M,N$ be $R$ -modules and $S\subseteq R$ multiplicatively closed. Then

S^{-1}M\otimes _{S^{-1}R}S^{-1}N\cong S^{-1}(M\otimes _{R}N)

.

Proof:

Exercises

Exercise 9.2.1: Let $M,N$ be $R$ -modules and $I\leq R$ an ideal. Prove that $IM:=\{i_{1}m_{1}+\cdots +i_{k}m_{k}|i_{j}\in I,m_{j}\in M\}$ is a submodule of $M$ and that $(M\otimes _{R}N)/I(M\otimes _{R}N)\cong (M/IM)\otimes _{R/I}(N/IN)$ (this exercise serves the purpose of practising the proof technique employed for theorem 9.11).

The annihilator, faithfulness

Definition 9.12:

Let $R$ be a ring, $M$ a module over $R$ and $S\subseteq M$ an arbitrary subset. Then the annihilator of $S$ with respect to $M$ is defined to be the set

{\text{Ann}}_{R}(S):=\{r\in R|\forall s\in S:rs=0\}

.

Theorem 9.13:

Let $R$ be a ring, $M$ a module over $R$ and $S\subseteq M$ an arbitrary subset. Then ${\text{Ann}}_{R}(S)$ is an ideal of $R$ .

Proof:

Let $a,b\in {\text{Ann}}_{R}(S)$ and $r\in R$ . Then for all $s\in S$ , $(ra-b)s=r(as)-bs=0$ . Hence the theorem by lemma 5.3. $\Box$

Definition 9.14:

An $R$ -module $M$ is called faithful iff ${\text{Ann}}_{R}M=\{0\}$ .

Theorem 9.15:

Let $R$ be a ring. Then $R$ regarded as an $R$ -module over itself is faithful.

Proof: Let $s\in R$ such that $\forall r\in R:rs=0$ . Then in particular $s=1s=0$ . $\Box$

Theorem 9.16:

Let $M$ be an $R$ -module and $S\subseteq M$ an arbitrary subset. Let $\langle S\rangle =:N\leq M$ be the submodule of $M$ generated by $S$ . Then ${\text{Ann}}_{R}S={\text{Ann}}_{R}N$ .

Proof:

From the definition it is clear that ${\text{Ann}}_{R}N\subseteq {\text{Ann}}_{R}S$ , since annihilating all elements of $N$ is a stronger condition than only those of $S$ .

Let now $t\in {\text{Ann}}_{R}S$ and $x_{1}s_{1}+\cdots +x_{n}s_{n}\in N$ , where $x_{j}\in R$ and $s_{j}\in S$ . Then $t(x_{1}s_{1}+\cdots +x_{n}s_{n})=x_{1}ts_{1}+\cdots +x_{n}ts_{n}=0+0+\cdots +0=0$ . $\Box$
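
A quick numerical check of theorem 9.16, under the assumption (ours, for illustration only) that $R=\mathbb {Z}$ , $M=\mathbb {Z} /12$ and $S=\{{\bar {4}}\}$ :

```latex
% Illustrative assumption: R = Z, M = Z/12, S = { \bar{4} }.
\langle S\rangle =\{\bar{0},\bar{4},\bar{8}\} ,
\qquad
\text{Ann}_{\mathbb{Z}}(S)=\{r\in \mathbb{Z}\,|\,12\mid 4r\}=3\mathbb{Z} ,
% and on the generated submodule:
\text{Ann}_{\mathbb{Z}}(\langle S\rangle )=\{r\in \mathbb{Z}\,|\,12\mid 4r{\text{ and }}12\mid 8r\}=3\mathbb{Z} .
```

Both annihilators agree, as predicted, while the annihilator of the whole module $\mathbb {Z} /12$ is the smaller ideal $12\mathbb {Z}$ .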

Local properties

Definition 9.17:

Let $M$ be an $R$ -module (where $R$ is a ring) and let $p\leq R$ be a prime ideal. Then the localisation of $M$ with respect to $p$ , denoted by

M_{p}

,

is defined to be $S^{-1}M$ with $S:=R\setminus p$ ; note that $S$ is multiplicatively closed because $p$ is a prime ideal.

Definition 9.18:

A property (*) which modules can have (such as being equal to zero) is called a local-global property iff the following are equivalent for every $R$ -module $M$ :

$M$ has property (*).
$S^{-1}M$ has property (*) for all multiplicatively closed $S\subseteq R$ .
$M_{p}$ has property (*) for all prime ideals $p\leq R$ .
$M_{m}$ has property (*) for all maximal ideals $m\leq R$ .

Theorem 9.19:

Being equal to zero is a local-global property.

Proof:

We check the equivalence of 1. - 4. from definition 9.18. Clearly, 4. $\Rightarrow$ 1. suffices.

Assume that $M$ is a nonzero module, that is, we have $n\in M$ such that $n\neq 0$ . By theorem 9.13, $\operatorname {Ann} _{R}(n):=\operatorname {Ann} _{R}(\{n\})$ is an ideal of $R$ , and it is not all of $R$ since $1n=n\neq 0$ . Therefore, it is contained within some maximal ideal of $R$ , call it $m$ (unfortunately, we have to refer to a later chapter, since we wanted to separate treatments of different algebraic objects. The required theorem is theorem 12.8). Then for $s\in R\setminus m$ we have $s\notin \operatorname {Ann} _{R}(n)$ , that is, $sn\neq 0$ , and therefore $n/1\neq 0/1$ in $M_{m}$ . $\Box$
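
The choice of the maximal ideal in this proof matters, as the following sketch shows. The module $\mathbb {Z} /6$ over $\mathbb {Z}$ and the element ${\bar {2}}$ are our own example.

```latex
% Illustrative assumption: R = Z, M = Z/6, n = \bar{2}, so Ann_R(n) = 3Z.
% Localising at the maximal ideal (3), which contains Ann_R(n): S = Z \setminus 3Z, and
s\cdot \bar{2}\neq 0{\text{ for all }}s\notin 3\mathbb{Z}
\quad \Rightarrow \quad \bar{2}/1\neq 0/1{\text{ in }}M_{(3)} .
% Localising at the maximal ideal (2), which does not contain Ann_R(n): 3 is in S, and
3\cdot \bar{2}=\bar{6}=0
\quad \Rightarrow \quad \bar{2}/1=0/1{\text{ in }}M_{(2)} .
```

So a nonzero element may vanish in some localisations, but never in the localisation at a maximal ideal containing its annihilator.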

The following theorems do not really describe local-global properties, but are certainly similar and perhaps related to those.

Theorem 9.20:

If $f:M\to N$ is a morphism, then the following are equivalent:

$f:M\to N$ surjective.
$f_{S}:S^{-1}M\to S^{-1}N$ surjective for all $S\subseteq R$ multiplicatively closed.
$f_{p}:M_{p}\to N_{p}$ surjective for all $p\leq R$ prime.
$f_{m}:M_{m}\to N_{m}$ surjective for all $m\leq R$ maximal.

Proof:

Sequences of modules

Modules in category theory

Definition 10.1 ( $R$ -mod):

For each ring $R$ , there exists one category of modules, namely the modules over $R$ with module homomorphisms as the morphisms. This category is called $R$ -mod.

We aim now to prove that if $R$ is a ring, $R$ -mod is an Abelian category. We do so by verifying that modules have all the properties required for being an Abelian category.

Theorem 10.1:

The category of modules has kernels.

Proof:

For $R$ -modules $M,N$ and a morphism $f:M\to N$ we define

\ker f:=\{m\in M|f(m)=0\}

.

Sequences of augmented modules

Theorem 10.?:

Let $R$ be a ring and let $S\subseteq R$ be multiplicatively closed. Let $M,N,K$ be $R$ -modules. Then

0\rightarrow M\rightarrow N\rightarrow K\rightarrow 0

exact implies

0\rightarrow S^{-1}M\rightarrow S^{-1}N\rightarrow S^{-1}K\rightarrow 0

exact.

-category-theoretic comment

Torsion-free, flat, projective and free modules

Free modules

The following definitions are straightforward generalisations from linear algebra. We begin by repeating a definition we already saw in chapter 6.

Definition 6.1 (generators of modules):

Let $M$ be a module over the ring $R$ . A generating set of $M$ is a subset $\{m_{j}\}_{j\in J}\subseteq M$ such that

\forall n\in M:\exists j_{1},\ldots ,j_{k}\in J,r_{1},\ldots ,r_{k}\in R:n=\sum _{l=1}^{k}r_{l}m_{j_{l}}

.

We also have:

Definition 11.1:

Let $M$ be an $R$ -module. A subset $\{m_{j}\}_{j\in J}$ of $M$ is called linearly independent if and only if, whenever $j_{1},\ldots ,j_{n}\in J$ are pairwise distinct, we have

r_{1}m_{j_{1}}+\cdots +r_{n}m_{j_{n}}=0\Rightarrow r_{1}=\cdots =r_{n}=0

.

Definition 11.2:

A free $R$ -module is a module $M$ over $R$ where there exists a basis, that is, a subset $\{m_{j}\}_{j\in J}$ of $M$ that is a linearly independent generating set.

Theorem 11.3:

Let $M_{\alpha },\alpha \in A$ be free modules. Then the direct sum

\bigoplus _{\alpha \in A}M_{\alpha }

is free.

Proof:

Let bases $\{e_{\beta }\}_{\beta \in B_{\alpha }}$ of the $M_{\alpha }$ be given. We claim that

\left\{\left(0,\ldots ,0,\overbrace {e_{\beta _{\alpha }}} ^{\alpha {\text{-th place}}},0,\ldots ,0\right){\big |}\alpha \in A,\beta _{\alpha }\in B_{\alpha }\right\}

is a basis of

M:=\bigoplus _{\alpha \in A}M_{\alpha }

.

Indeed, let an arbitrary element $(m_{\alpha })_{\alpha \in A}$ be given. Then by assumption, each of the $m_{\alpha }$ has a decomposition

m_{\alpha }=\sum _{j=1}^{n_{\alpha }}r_{j,\alpha }e_{\beta _{j,\alpha }}

for suitable $e_{\beta _{j,\alpha }}\in \{e_{\beta }\}_{\beta \in B_{\alpha }}$ . By summing this, we get a decomposition of $(m_{\alpha })_{\alpha \in A}$ in the aforementioned basis. Furthermore, this decomposition must be unique, for otherwise projecting onto a suitable component would give two different decompositions of the corresponding $m_{\alpha }$ in the basis of $M_{\alpha }$ . $\Box$

The converse is not true in general!

Theorem 11.4:

Let $M,N$ be free $R$ -modules, with bases $\{e_{\alpha }\}_{\alpha \in A}$ and $\{f_{\beta }\}_{\beta \in B}$ respectively. Then

M\otimes _{R}N

is a free module, with basis

\{e_{\alpha }\otimes f_{\beta }\}_{(\alpha ,\beta )\in A\times B}

,

where we wrote for short

e_{\alpha }\otimes f_{\beta }:=[(e_{\alpha },f_{\beta })]

(note that it is quite customary to use this notation).

Proof:

We first prove that our supposed basis forms a generating system. Clearly, by summation it suffices to show that elements of the form

m\otimes n

,

m\in M,n\in N

can be written in terms of the $e_{\alpha }\otimes f_{\beta }$ . Thus, write

m=\sum _{j=1}^{\mu }r_{j}e_{\alpha _{j}}

and

n=\sum _{i=1}^{\nu }s_{i}f_{\beta _{i}}

,

and obtain by the rules of computing within the tensor product, that

m\otimes n=\sum _{j=1}^{\mu }\sum _{i=1}^{\nu }r_{j}s_{i}e_{\alpha _{j}}\otimes f_{\beta _{i}}

.

On the other hand, if

0=\sum _{\alpha \in A,\beta \in B}t_{\alpha ,\beta }e_{\alpha }\otimes f_{\beta }

is a linear combination (i.e. all but finitely many summands are zero), then all the $t_{\alpha ,\beta }$ must be zero. The argument is this: Fix $\alpha ,\beta$ and define a bilinear function

f:M\times N\to R,(m,n)\mapsto r_{\alpha }s_{\beta }

,

where $r_{\alpha }$ , $s_{\beta }$ are the coefficients of $e_{\alpha }$ , $f_{\beta }$ in the decomposition of $m$ and $n$ respectively. According to the universal property of the tensor product, we obtain a linear map

g:M\otimes N\to R

with

g\circ \pi =f

,

where $\pi :M\times N\to M\otimes N$ is the canonical projection on the quotient space. We have the equations

g(e_{\alpha '}\otimes f_{\beta '})=f(e_{\alpha '},f_{\beta '})=[\alpha =\alpha '\wedge \beta =\beta ']

,

and inserting the given linear combination into this map therefore yields the desired result. $\Box$
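
For a concrete feeling, here is the expansion for small free modules. We assume (for illustration only) that $M$ has basis $e_{1},e_{2}$ and $N$ has basis $f_{1},f_{2},f_{3}$ , so that $M\otimes N$ has the six basis elements $e_{\alpha }\otimes f_{\beta }$ .

```latex
% Illustrative assumption: M free with basis e_1, e_2; N free with basis f_1, f_2, f_3.
(e_{1}+2e_{2})\otimes (f_{1}-f_{3})
 =e_{1}\otimes f_{1}-e_{1}\otimes f_{3}+2\,e_{2}\otimes f_{1}-2\,e_{2}\otimes f_{3} .
% The coefficient of e_2 \otimes f_1 is recovered by the bilinear map of the proof
% with (\alpha, \beta) = (2, 1): f(m, n) = r_2 s_1 = 2 \cdot 1 = 2.
```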

Projective modules

The following is a generalisation of free modules:

Definition 11.5:

Let $M$ be an $R$ -module. $M$ is called projective if and only if for every module $N$ and every surjection $f:N\twoheadrightarrow M$ , every module morphism with codomain $M$ (call it $g:K\to M$ ) has a factorisation $g=f\circ h$ for some morphism $h:K\to N$ .

Theorem 11.6:

Every free module is projective.

Proof:

Pick a basis $\{m_{j}\}_{j\in J}$ of $M$ , let $f:N\twoheadrightarrow M$ be surjective and let $g:K\to M$ be some morphism. For each $m_{j}$ pick $n_{j}\in N$ with $f(n_{j})=m_{j}$ . Define

h:K\to N,h(k)=\sum _{i=1}^{l}r_{i}n_{j_{i}}

where

g(k)=\sum _{i=1}^{l}r_{i}m_{j_{i}}

.

This is well-defined since the linear combination describing $g(k)$ is unique. Furthermore, it is linear, since we have

g(k+rk')=\sum _{i=1}^{l}r_{i}m_{j_{i}}+r\sum _{i'=1}^{l'}r_{i}'m'_{j_{i}'}

,

where the right hand side is the sum of the linear combinations coinciding with $g(k)$ and $g(k')$ respectively, which is why $h(k+rk')=h(k)+rh(k')$ . By linearity of $f$ and definition of the $n_{j}$ , it has the desired property. $\Box$

There are a couple of equivalent definitions of projective modules.

Theorem 11.7:

A module $M$ is projective if and only if there exists a module $N$ such that $K:=M\oplus N$ is free.

Proof:

$\Rightarrow$ : Define the module

L:=\bigoplus _{m\in M}R

(this obviously is a free module) and the function

f:L\to M,(0,\ldots ,0,\overbrace {r} ^{m{\text{-th place}}},0,\ldots ,0)\mapsto rm

.

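$f$ is a surjective morphism, whence projectivity of $M$ , applied to $g=\operatorname {Id} _{M}$ , yields a morphism $h:M\to L$ such that the corresponding diagram commutes; that is, $f\circ h=\operatorname {Id} _{M}$ .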

We claim that the map

\varphi :M\oplus \ker f\to L,\varphi (m,k):=h(m)+k

is an isomorphism. Indeed, if $h(m)+k=0$ , then $f(h(m)+k)=f(h(m))=m=0$ and thus also $k=0$ (injectivity). Further, $L$ is generated by the elements $(0,\ldots ,0,\overbrace {r} ^{m{\text{-th place}}},0,\ldots ,0)$ , and since $f(h(rm))=rm$ is also the image of this element under $f$ , we have $h(rm)=(0,\ldots ,0,\overbrace {r} ^{m{\text{-th place}}},0,\ldots ,0)+k$ for some $k\in \ker f$ , which is why

\varphi ((rm,-k))=h(rm)-k=(0,\ldots ,0,\overbrace {r} ^{m{\text{-th place}}},0,\ldots ,0)

(surjectivity).

$\Leftarrow$ : Assume $M\oplus L$ is a free module. Assume $f:N\twoheadrightarrow M$ is a surjective morphism, and let $g:K\to M$ be any morphism. We extend $g$ to ${\tilde {g}}:K\to M\oplus L$ via

{\tilde {g}}(k):=(g(k),0)

.

This is still linear as the composition of the linear map $g$ and the linear inclusion $M\hookrightarrow M\oplus L$ . Now $M\oplus L$ is projective since it's free, and $f\times \operatorname {Id} _{L}:N\oplus L\to M\oplus L$ is surjective. Hence, we get a morphism ${\tilde {h}}:K\to N\oplus L$ that satisfies $(f\times \operatorname {Id} _{L})\circ {\tilde {h}}={\tilde {g}}$ . Projecting ${\tilde {h}}$ to $N$ gives the desired factorisation of $g$ through $f$ . $\Box$
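
Theorem 11.7 makes it easy to exhibit a projective module that is not free. The following sketch uses the ring $\mathbb {Z} /6$ , which is our own choice of example.

```latex
% Illustrative assumption: R = Z/6, M = 3R = {0, 3}, N = 2R = {0, 2, 4}.
1=3+4\in 3R+2R ,\qquad 3R\cap 2R=\{0\} ,
\qquad \text{hence}\qquad R=3R\oplus 2R .
% M \oplus N is isomorphic to the free module R, so M = 3R is projective by theorem 11.7.
% M is not free: a nonzero free R-module contains a copy of R, hence has at least
% 6 elements, whereas M has only 2.
```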

Definition 11.8:

An exact sequence of modules

0\rightarrow K\rightarrow N\rightarrow M\rightarrow 0

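is called split exact iff we can augment it by three isomorphisms to the sequence

0\rightarrow K\rightarrow K\oplus M\rightarrow M\rightarrow 0

(with the canonical inclusion and projection) such that the resulting diagram commutes.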

Theorem 11.9:

A module $M$ is projective iff every exact sequence

0\rightarrow K\rightarrow N\rightarrow M\rightarrow 0

is split exact.

Proof:

$\Rightarrow$ : The morphism $N\rightarrow M$ is surjective, and thus every other morphism with codomain $M$ lifts to $N$ . In particular, so does the projection $\pi :K\oplus M\to M$ . Thus, we obtain a commutative diagram

where we don't know yet whether $h$ is an isomorphism, but we can use $h$ to define the function

{\tilde {h}}:K\oplus M\to N,{\tilde {h}}(k,m):=g(k)+h(0,m)

,

which is an isomorphism due to injectivity:

Let ${\tilde {h}}(k,m)=0$ , that is $h(0,m)+g(k)=0$ . Then first

m=f(h(0,m))=f(h(0,m)+g(k))=f(0)=0

and therefore second

g(k)=h(0,m)+g(k)=0\Rightarrow k=0

.

And surjectivity:

Let $n\in N$ . Set $m:=f(n)$ . Then

h(0,m)-n\in \ker f=\operatorname {im} g

and hence $g(k)=h(0,m)-n$ for a suitable $k\in K$ , thus

n={\tilde {h}}(-k,m)

.

We thus obtain the commutative diagram

and have proven what we wanted.

$\Leftarrow$ : We prove that $M\oplus N$ is free for a suitable $N$ .

We set

K:=\bigoplus _{m\in M}R

,

f:K\to M

where $f$ is defined as in the proof of theorem 11.7 $\Rightarrow$ . We obtain an exact sequence

0\rightarrow \ker f{\overset {\iota }{\hookrightarrow }}K{\overset {f}{\rightarrow }}M\rightarrow 0

which by assumption is split exact, which is why $\ker f\oplus M$ is isomorphic to the free module $K$ and hence itself free. $\Box$
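
Theorem 11.9 also gives a convenient way to show that a module is not projective: it suffices to find one exact sequence ending in it that does not split. A standard example over $R=\mathbb {Z}$ (our choice, not taken from the text) is:

```latex
% Illustrative assumption: R = Z; the first map is multiplication by 2.
0\rightarrow \mathbb{Z}{\overset{\cdot 2}{\rightarrow}}\mathbb{Z}\rightarrow \mathbb{Z}/2\rightarrow 0 .
% If this sequence were split exact, we would get Z \cong Z \oplus Z/2.
% But in Z \oplus Z/2 the nonzero element (0, \bar{1}) satisfies
2\cdot (0,\bar{1})=(0,\bar{0}) ,
% whereas in Z the equation 2x = 0 forces x = 0.
```

Hence $\mathbb {Z} /2$ is not a projective $\mathbb {Z}$ -module.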

Theorem 11.10:

Let $M$ and $N$ be projective $R$ -modules. Then $M\otimes N$ is projective.

Proof:

We choose $L,K$ $R$ -modules such that $M\oplus L$ and $N\oplus K$ are free. Since the tensor product of free modules is free, $(M\oplus L)\otimes (N\oplus K)$ is free. But

(M\oplus L)\otimes (N\oplus K)\cong (M\otimes N)\oplus (M\otimes K)\oplus (L\otimes N)\oplus (L\otimes K)

,

and thus $M\otimes N$ occurs as the summand of a free module and is thus projective. $\Box$

Theorem 11.11:

Let $M_{\alpha },\alpha \in A$ be $R$ -modules. Then $\bigoplus _{\alpha \in A}M_{\alpha }$ is projective if and only if each $M_{\alpha }$ is projective.

Proof:

Let first each of the $M_{\alpha }$ be projective. Then each of the $M_{\alpha }$ occurs as the direct summand of a free module, and summing all these free modules proves that $\bigoplus _{\alpha \in A}M_{\alpha }$ is a direct summand of a free module.

On the other hand, if $\bigoplus _{\alpha \in A}M_{\alpha }$ is the summand of a free module, then so are all the $M_{\alpha }$ s. $\Box$

Flat modules

The following is a generalisation of projective modules:

Definition 11.12:

An $R$ -module $M$ is called flat if and only if tensoring by it preserves exactness:

0\rightarrow N\rightarrow L\rightarrow K\rightarrow 0

exact implies

0\rightarrow N\otimes _{R}M\rightarrow L\otimes _{R}M\rightarrow K\otimes _{R}M\rightarrow 0

exact.

The morphisms in the right sequence induced by any morphism $f$ are given by the bilinear map

(x,m)\mapsto f(x)\otimes m

.
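
To see that flatness is a genuine restriction, consider the following sketch of a non-flat module. The ring $\mathbb {Z}$ and the module $M=\mathbb {Z} /2$ are our own example; we use $\mathbb {Z} \otimes \mathbb {Z} /2\cong \mathbb {Z} /2$ .

```latex
% Illustrative assumption: R = Z, M = Z/2, and the exact sequence
0\rightarrow \mathbb{Z}{\overset{\cdot 2}{\rightarrow}}\mathbb{Z}\rightarrow \mathbb{Z}/2\rightarrow 0 .
% After tensoring with M, the first map is induced by (x, m) \mapsto 2x \otimes m, and
2x\otimes m=x\otimes 2m=x\otimes 0=0 .
% So the induced map Z \otimes Z/2 \to Z \otimes Z/2 is the zero map on a nonzero module,
% hence not injective, and the tensored sequence is not exact on the left.
```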

Theorem 11.13:

The module $S^{-1}R$ is a flat $R$ -module.

Proof: This follows from theorems 9.10 and 10.?. $\Box$

Theorem 11.14:

Flatness is a local property.

Proof: Exactness is a local property. Furthermore, for any multiplicatively closed $S\subseteq R$

S^{-1}(M\otimes _{R}N)\cong S^{-1}M\otimes _{S^{-1}R}S^{-1}N

by theorem 9.11. Since every $S^{-1}R$ -module is the localisation of an $R$ -module (for instance itself as an $R$ -module via $rn=(r/1)n$ ), the theorem follows. $\Box$

Theorem 11.15:

A projective module is flat.

Proof:

We first prove that every free module is flat. This will enable us to prove that every projective module is flat.

Indeed, if $M$ is a free module and $\{e_{\alpha }\},\alpha \in A$ a basis of $M$ , we have

M\cong \bigoplus _{\alpha \in A}R

via

\sum _{\alpha \in A}r_{\alpha }e_{\alpha }\mapsto (r_{\alpha })_{\alpha \in A}

,

where all but finitely many of the summands on the left are zero. Hence, by distributivity of direct sum over tensor product, if we are given any exact sequence

0\rightarrow A\rightarrow B\rightarrow C\rightarrow 0

,

to show that the sequence

0\rightarrow A\otimes M\rightarrow B\otimes M\rightarrow C\otimes M\rightarrow 0

is exact, all we have to do is to prove that

0\rightarrow \bigoplus _{\alpha \in A}A\rightarrow \bigoplus _{\alpha \in A}B\rightarrow \bigoplus _{\alpha \in A}C\rightarrow 0

is exact, since we may then augment the latter sequence by suitable isomorphisms

Theorem 11.16:

direct sum flat iff all summands are

Theorem 11.17:

If $M,N$ are flat $R$ -modules, then $M\otimes _{R}N$ is as well.

Proof:

Let

0\rightarrow A\rightarrow B\rightarrow C\rightarrow 0

be an exact sequence of modules.

Torsion-free modules

The following is a generalisation of flat modules:

Definition 11.18:

Let $R$ be an integral domain and $M$ an $R$ -module. The torsion of $M$ is defined to be the set

T(M):=\{m\in M|\exists r\in R\setminus \{0\}:rm=0\}

.

Lemma 11.19:

The torsion of a module is a submodule of that module.

Proof:

Let $m,n\in T(M)$ , $r\in R$ . Obviously $m-n\in T(M)$ (just multiply the two annihilating elements together), and further $s(rm)=r(sm)=0$ if $sm=0$ (we used commutativity here). $\Box$

We may now define torsion-free modules. They are exactly what you think they are.

Definition 11.20:

Let $M$ be a module. $M$ is called torsion-free if and only if

T(M)=\{0\}

.

Theorem 11.21:

A flat module is torsion-free.

To get a feeling for the theory, we define $S$ -torsion for a multiplicatively closed subset $S\subseteq R$ .

Definition 11.22:

Let $S\subseteq R$ be a multiplicatively closed subset of a ring $R$ , and let $M$ be an $R$ -module. Then the $S$ -torsion of $M$ is defined to be

T_{S}(M):=\{m\in M|\exists s\in S:sm=0\}

.

Theorem 11.23:

Let $S\subseteq R$ be a multiplicatively closed subset of a ring $R$ , and let $M$ be an $R$ -module. Then the $S$ -torsion of $M$ is precisely the kernel of the canonical map $\pi _{S}:M\to S^{-1}M$ .
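
A short verification of theorem 11.23, using only the equivalence relation from definition 9.8 (with $s=t=1$ and $n=0$ ) and the map $\pi _{S}(m)=m/1$ , can be sketched as follows.

```latex
\begin{aligned}
m\in \ker \pi_{S}
 &\Leftrightarrow m/1\sim_{S}0/1\\
 &\Leftrightarrow \exists u\in S:u(1\cdot m-1\cdot 0)=0\\
 &\Leftrightarrow \exists u\in S:um=0\\
 &\Leftrightarrow m\in T_{S}(M).
\end{aligned}
```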

Basic ideal theory

Prime ideals

Definition 12.1:

Let $R$ be a ring. A prime ideal $p$ is an ideal of $R$ with $p\neq R$ such that whenever $ab\in p$ , either $a\in p$ or $b\in p$ .

Lemma 12.2:

Let $R$ be a ring and $I\leq R$ an ideal. $I$ is prime if and only if $R/I$ is an integral domain.

Proof:

$I$ prime is equivalent to $ab\in I\Rightarrow a\in I\vee b\in I$ . This is equivalent to

ab+I=0+I\Rightarrow a+I=0+I\vee b+I=0+I

.

\Box
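
Two familiar ideals of $\mathbb {Z}$ illustrate lemma 12.2; the example is ours and only meant as a sanity check.

```latex
% Illustrative assumption: R = Z.
% I = (6) is not prime: 2 \cdot 3 lies in (6), but neither 2 nor 3 does. Correspondingly,
(2+(6))\cdot (3+(6))=6+(6)=0+(6)\quad \text{in }\mathbb{Z}/(6) ,
% so Z/(6) has zero divisors and is not an integral domain.
% I = (5) is prime: 5 \mid ab implies 5 \mid a or 5 \mid b, and Z/(5) has no zero divisors.
```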

Theorem 12.3:

Let $S\subseteq R$ be multiplicatively closed with $0\notin S$ . Then there exists a prime ideal not intersecting $S$ .

Proof:

Order all ideals of $R$ not intersecting $S$ by set inclusion, and let a chain

I_{1}\subseteq I_{2}\subseteq \cdots \subseteq I_{k}\subseteq \cdots

be given. The ideal

I:=\bigcup _{k\in \mathbb {N} }I_{k}

(this is an ideal, since $a,b\in I\Rightarrow a\in I_{m},b\in I_{n}\Rightarrow a,b\in I_{\max\{m,n\}}$ , hence $a+b\in I$ , $ra\in I$ ) is an upper bound of the chain, since $I$ cannot intersect $S$ for else one of the $I_{k}$ would intersect $S$ . Since the given chain was arbitrary, Zorn's lemma implies the existence of a maximal ideal among all ideals not intersecting $S$ . This ideal shall be called $J$ ; we prove that it is prime.

Let $ab\in J$ , and assume for contradiction that $a\notin J$ and $b\notin J$ . Then $\langle a\rangle +J$ , $\langle b\rangle +J$ are strict superideals of $J$ and hence intersect $S$ , that is,

s=xa+yj

,

t=zb+wj'

,

$s,t\in S$ , $x,y,z,w\in R$ , $j,j'\in J$ . Then $S\ni st=xzab+yzbj+xawj'+ywjj'\in J$ , contradiction. $\Box$

Projection to the quotient ring

In this section, we want to fix a notation. Let $R$ be a ring and $I\leq R$ an ideal. Then we may form the quotient ring $R/I$ consisting of the elements of the form $r+I$ , $r\in R$ . Throughout the book, we shall use the following notation for the canonical projection $r\mapsto r+I$ :

Definition 12.4:

Let $I\leq R$ be an ideal. The map

\pi _{I}:R\to R/I,\pi _{I}(r):=r+I

is the canonical projection of $R$ to $R/I$ .

Maximal ideals

Definition 12.5:

Let $R$ be a ring. A maximal ideal $m$ of $R$ is an ideal that is not the whole ring, and there is no proper ideal $I\leq R$ such that $m\subsetneq I$ .

Lemma 12.6:

An ideal $I\leq R$ is maximal iff $R/I$ is a field.

Proof:

A ring is a field if and only if its only proper ideal is the zero ideal. For, in a field, every nonzero ideal contains $1$ , and if $R$ is not a field, it contains a non-unit $a$ , and then $\langle a\rangle$ does not contain $1$ .

By the correspondence given by the correspondence theorem, $R/I$ corresponds to $R$ , the zero ideal of $R/I$ corresponds to $I$ , and any ideal strictly in between corresponds to an ideal $K\leq R$ such that $I\subsetneq K\subsetneq R$ . Hence, $R/I$ is a field if and only if there are no proper ideals strictly containing $I$ . $\Box$

Lemma 12.7:

Any maximal ideal is prime.

Proof 1:

If $R$ is a ring, $m\leq R$ maximal, then $R/m$ is a field. Hence $R/m$ is an integral domain, hence $m$ is prime. $\Box$

Proof 2:

Let $m\leq R$ be maximal. Let $ab\in m$ . Assume $a,b\notin m$ . Then $\langle a\rangle +m=R=\langle b\rangle +m$ by maximality, so $1=ra+sn=tb+uk$ for suitable $n,k\in m$ , $r,s,t,u\in R$ . But then $1=1^{2}=(ra+sn)(tb+uk)=rtab+stbn+rauk+sunk\in m$ , a contradiction since $m\neq R$ . $\Box$

Theorem 12.8:

Let $R$ be a ring and $I\leq R$ an ideal not equal to all of $R$ . Then there exists a maximal $m\leq R$ with $I\subseteq m$ .

Proof:

We order the set of all ideals $J$ such that $I\subseteq J$ and $J\neq R$ by inclusion. Let

J_{1}\subseteq J_{2}\subseteq \cdots \subseteq J_{k}\subseteq \cdots

be a chain of those ideals. Then set

J:=\bigcup _{k\in \mathbb {N} }J_{k}

.

$J$ is an ideal, since any two of its elements lie in a common $J_{k}$. Clearly, all $J_{k}$ are contained within $J$. Since $I\subseteq J_{1}$, $I\subseteq J$. Further, assume $1\in J$. Then $1\in J_{k}$ for some $k$, contradiction. Hence, $J$ is a proper ideal such that $I\subseteq J$, and hence an upper bound for the given chain. The same argument applies to chains indexed by an arbitrary totally ordered set. Since the given chain was arbitrary, we may apply Zorn's lemma to obtain a maximal element with respect to inclusion. This ideal is then a maximal ideal of $R$, for any proper ideal containing it also contains $I$ and hence belongs to the ordered set. $\Box$

Lemma 12.9:

Let $R$ be a ring, $I\leq R$ . Then via $\pi _{I}$ , maximal ideals of $R$ containing $I$ correspond to maximal ideals of $R/I$ .

Proof: From the correspondence theorem. $\Box$

Local rings

Definition 12.10:

A local ring is a ring that has exactly one maximal ideal.

Theorem 12.11 (characterisation of local rings):

Let $R$ be a ring. The following are equivalent:

$R$ is a local ring.
If $a+b$ is a unit, then either $a$ or $b$ is a unit, where $a,b\in R$ arbitrary.
The set of all non-units forms a maximal ideal.
If $r_{1},\ldots ,r_{n}\in R$ where $r_{1}+\cdots +r_{n}$ is a unit, then one of the $r_{j}$ is a unit.
If $r\in R$ is arbitrary, either $r$ or $1-r$ is a unit.

Proof:

1. $\Rightarrow$ 2.: Assume $a$ and $b$ are both non-units. Then $\langle a\rangle$ and $\langle b\rangle$ are proper ideals of $R$ and hence they are contained in some maximal ideal of $R$ by theorem 12.8. But there is only one maximal ideal $m$ of $R$, and hence $a,b\in m$, thus $a+b\in m$. Since maximal ideals cannot contain units, $a+b$ is not a unit.

2. $\Rightarrow$ 3.: The sum of two non-units is a non-unit, and if $a$ is a non-unit and $r\in R$ , $ra$ is a non-unit (for if $sra=1$ , $sr$ is an inverse of $a$ ). Hence, all non-units form an ideal. Any proper ideal of $R$ contains only non-units, hence this ideal is maximal.

3. $\Rightarrow$ 4.: Assume the $r_{j}$ are all non-units. Since the non-units form an ideal, $r_{1}+\cdots +r_{n}$ is contained in that ideal of non-units, contradiction.

4. $\Rightarrow$ 5.: Assume $r$ , $1-r$ are non-units. Then $1=r+(1-r)$ is a non-unit, contradiction.

5. $\Rightarrow$ 1.: Let $m,n\leq R$ be two distinct maximal ideals. Then $m+n=R$, hence $1=s+t$ with $s\in m$, $t\in n$, that is, $s=1-t$. $t$ is not a unit (it lies in the proper ideal $n$), so by assumption $1-t=s$ is a unit, contradicting $s\in m$. $\Box$
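The equivalences of theorem 12.11 can be tested on the finite rings $\mathbb {Z} /n\mathbb {Z}$ . The sketch below assumes the standard description of the maximal ideals of $\mathbb {Z} /n\mathbb {Z}$ as $p\mathbb {Z} /n\mathbb {Z}$ for the primes $p$ dividing $n$ (compare lemma 12.9); all function names are hypothetical.

```python
from math import gcd


def units(n):
    """Units of Z/nZ: the classes coprime to n."""
    return {a for a in range(n) if gcd(a, n) == 1}


def maximal_ideals(n):
    """Maximal ideals of Z/nZ, one for each prime p dividing n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    return [frozenset(a for a in range(n) if a % p == 0) for p in primes]


def is_local(n):
    """Statement 1: exactly one maximal ideal."""
    return len(maximal_ideals(n)) == 1


def condition_5(n):
    """Statement 5: for every r, either r or 1 - r is a unit."""
    u = units(n)
    return all(r in u or (1 - r) % n in u for r in range(n))


def nonunits_closed_under_addition(n):
    """Contrapositive of statement 2: a sum of two non-units is a non-unit."""
    u = units(n)
    non_units = [a for a in range(n) if a not in u]
    return all((a + b) % n not in u for a in non_units for b in non_units)
```

For instance, $\mathbb {Z} /8\mathbb {Z}$ passes all three checks, whereas $\mathbb {Z} /6\mathbb {Z}$ fails all of them (neither $3$ nor $1-3$ is a unit modulo 6).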

Localisation at prime ideals

In chapter 9, we saw how to localise a ring at a multiplicatively closed subset $S$ . An important special case is $S=R\setminus p$ , where $p$ is a prime ideal.

Lemma 12.12:

Let $p\leq R$ be a prime ideal of a ring. Then $S:=R\setminus p\subseteq R$ is multiplicatively closed.

Proof: Let $a,b\notin p$ . Then $ab$ can't be in $p$ , hence $ab\in S$ . $\Box$

Definition 12.13:

Let $p\leq R$ be a prime ideal of a ring. Set $S:=R\setminus p$ . Then

R_{p}:=S^{-1}R

is called the localisation of $R$ at $p$ .

Theorem 12.14:

Let $R$ be a ring and let $p\leq R$ be prime. Then $R_{p}$ is a local ring.

Proof:

Set $S:=R\setminus p$ , then $R_{p}=S^{-1}R$ . Set

m:=\{r/s|r\in p,s\in S\}\subseteq S^{-1}R

.

All elements of $m$ are non-units, and all elements of $R_{p}\setminus m$ are of the form $r'/t$ with $r'\notin p$, $t\in S$, and thus are units (with inverse $t/r'$). Further, $m$ is an ideal, since $p$ is an ideal, by the definition of addition and multiplication in $S^{-1}R$, and since $S$ is multiplicatively closed. Hence the non-units of $R_{p}$ form an ideal, and $R_{p}$ is a local ring by theorem 12.11. $\Box$

This finally explains why we speak of localisation.
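The proof of theorem 12.14 can be made concrete for $R=\mathbb {Z}$ and $p=(5)$ . In the sketch below, elements of $\mathbb {Z} _{(5)}$ are modelled by Python's `Fraction` type; the prime 5, the sample range and the function names are assumptions made for illustration only.

```python
from fractions import Fraction

P = 5  # we localise Z at the prime ideal (5)


def in_localisation(q):
    """q lies in Z_(5) iff its reduced denominator is not divisible by 5."""
    return q.denominator % P != 0


def in_m(q):
    """q lies in m = {r/s : r in (5), s not in (5)} iff 5 divides the reduced numerator."""
    return in_localisation(q) and q.numerator % P == 0


def is_unit(q):
    """q is a unit of Z_(5) iff both q and 1/q lie in Z_(5)."""
    return q != 0 and in_localisation(q) and in_localisation(1 / q)


# A finite sample of elements of Z_(5).
samples = [Fraction(a, b) for a in range(-12, 13)
           for b in range(1, 13) if b % P != 0]

# Every sampled element is either a unit or an element of m, never both.
units_are_complement_of_m = all(is_unit(q) != in_m(q) for q in samples)
```

On this sample, the units are exactly the elements outside $m$ , in line with the proof above.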

Nilradical and Jacobson radical

Jacobson rings

Definition and elementary characterisations

Definition 14.1:

A Jacobson ring is a ring such that every prime ideal is the intersection of some maximal ideals.

Before we strive for a characterisation of Jacobson rings, we shall prove a lemma first which will be of great use in one of the proofs in that characterisation.

Lemma 14.2:

Let $R$ be a Jacobson ring and let $I\leq R$ be an ideal. Then $R/I$ is a Jacobson ring.

Proof:

Let $p\leq R/I$ be prime. Then $P:=\pi _{I}^{-1}(p)$ is prime. Hence, according to the hypothesis, we may write

P=\bigcap _{\alpha \in A}m_{\alpha }

,

where the $m_{\alpha }$ are all maximal. As $\pi _{I}$ is surjective, we have $\pi _{I}(P)=\pi _{I}(\pi _{I}^{-1}(p))=p$ . Hence, we have

p=\pi _{I}\left(\bigcap _{\alpha \in A}m_{\alpha }\right)=\bigcap _{\alpha \in A}\pi _{I}(m_{\alpha })

,

where the latter equality follows from $\forall \alpha \in A:y+I\in \pi _{I}(m_{\alpha })$ implying that for all $\alpha$ , $y=x_{\alpha }+i_{\alpha }$ , where $x_{\alpha }\in m_{\alpha }$ and $i_{\alpha }\in I\subseteq m_{\alpha }$ and thus $y\in m_{\alpha }$ . Since the ideals $\pi _{I}(m_{\alpha })$ are maximal, the claim follows. $\Box$

Theorem 14.3:

Let $R$ be a ring. The following are equivalent:

$R$ is a Jacobson ring.
Every radical ideal (see def. 13.1) is an intersection of maximal ideals.
For every $p\leq R$ prime the Jacobson radical of $R/p$ equals the zero ideal.
For every ideal $I\leq R$ , the Jacobson radical of $R/I$ is equal to the nilradical of $R/I$ .

Proof 1: We prove 1. $\Rightarrow$ 2. $\Rightarrow$ 3. $\Rightarrow$ 4. $\Rightarrow$ 1.

1. $\Rightarrow$ 2.: Let $I$ be a radical ideal. Due to theorem 13.3,

I=\bigcap _{I\subseteq p \atop p{\text{ prime}}}p

.

Now we may write each prime ideal $p$ containing $I$ as the intersection of maximal ideals (we are in a Jacobson ring) and hence obtain 1. $\Rightarrow$ 2.

2. $\Rightarrow$ 3.: Let $p\leq R$ be prime. In particular, $p$ is radical. Hence, we may write

p=\bigcap _{i\in I}m_{i}

,

where the $m_{i}$ are maximal. Now suppose that $x+p$ is contained within the Jacobson radical of $R/p$. According to theorem 13.7, $(1-xy)+p$ is a unit within $R/p$ for every $y\in R$. We want to prove $x\in p$. Assume otherwise; then there is a $k\in I$ such that $x\notin m_{k}$. Then $\langle x\rangle +m_{k}=R$ and thus $1=xy+s$ with $y\in R$ and $s\in m_{k}$, that is, $s=1-xy$. Let $a+p$ be the inverse of $s+p$, that is, $as-1\in p$. This means $as-1\in m_{i}$ for all $i\in I$, and in particular $as-1\in m_{k}$. Since $s\in m_{k}$, it follows that $1\in m_{k}$, contradiction.

3. $\Rightarrow$ 4.: Let $I\leq R$. Since the nilradical is always contained in the Jacobson radical, it suffices to show the reverse inclusion. Assume there exists $x+I\in R/I$ and a prime ideal $q\leq R/I$ such that $x+I\notin q$, but $x+I\in m$ for all maximal $m\leq R/I$. Let $\pi _{I}:R\to R/I$ be the canonical projection. Since preimages of prime ideals under homomorphisms are prime, $p:=\pi _{I}^{-1}(q)$ is prime.

Let $m'$ be a maximal ideal within $R/p$ . Assume $x+p\notin m'$ . Let $\pi _{p}:R\to R/p$ be the canonical projection. As in the first proof of theorem 12.2, $J:=\pi _{p}^{-1}(m')$ is maximal.

We claim that $K:=\pi _{I}(J)$ is maximal. Assume $1+I\in K$ , that is $i-1\in J$ for a suitable $i\in I$ . Since $I\subseteq p\subseteq J$ , $1\in J$ , contradiction. Assume $K$ is strictly contained within $L\leq R/I$ . Let $x+I\in L\setminus K$ . Then $x\in \pi _{I}^{-1}(L)$ . If $x\in \pi _{I}^{-1}(K)$ , then $x+I\in K$ , contradiction. Hence $\pi _{I}^{-1}(L)\supsetneq \pi _{I}^{-1}(K)=J$ and thus $1\in \pi _{I}^{-1}(L)$ , that is $1+I\in L$ .

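Furthermore, $x+I\in K$, since $K$ is maximal and $x+I$ lies in every maximal ideal of $R/I$. Thus $x\in \pi _{I}^{-1}(\pi _{I}(J))$. Now $\pi _{I}^{-1}(\pi _{I}(J))=I+J=J$ since $I\subseteq J$. Hence, $x\in J$, that is, $\pi _{p}(x)\in m'$, a contradiction to $x+p\notin m'$.

Thus, $x+p$ is contained within the Jacobson radical of $R/p$, which is the zero ideal by 3. Hence $x\in p=\pi _{I}^{-1}(q)$, that is, $x+I\in q$, contradicting the choice of $x$.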

4. $\Rightarrow$ 1.: Assume $q\leq R$ is a prime ideal that is not the intersection of maximal ideals. Then

q\subsetneq \bigcap _{q\subseteq m\leq R \atop m{\text{ maximal}}}m

.

Hence, there exists an $x\in R$ such that $q\subseteq m\Rightarrow x\in m\setminus q$ for every maximal ideal $m$ of $R$ .

The set $\{(x+q)^{n}|n\in \mathbb {N} _{0}\}$ is multiplicatively closed and does not contain zero, since $x\notin q$ and $q$ is prime. Thus, theorem 12.3 gives us a prime ideal $p\leq R/q$ such that $x+q\notin p$.

Assume $m$ is a maximal ideal of $R/q$ that does not contain $x+q$. Let $\pi :R\to R/q$ be the canonical projection. We claim that $\pi ^{-1}(m)$ is a maximal ideal containing $q$. Indeed, the proof runs as in the first proof of theorem 12.2. Furthermore, $\pi ^{-1}(m)$ does not contain $x$, for if it did, then $\pi (x)=x+q\in m$. This contradicts the choice of $x$, which is why every maximal ideal of $R/q$ contains $x+q$.

Since within $R/q$ the Jacobson radical equals the nilradical, $x+q$ is also contained within all prime ideals of $R/q$, in particular within $p$. Thus we have obtained a contradiction. $\Box$

Proof 2: We prove 1. $\Rightarrow$ 4. $\Rightarrow$ 3. $\Rightarrow$ 2. $\Rightarrow$ 1.

1. $\Rightarrow$ 4.: Due to lemma 14.2, $R/I$ is a Jacobson ring. Hence, it follows from the representations of theorem 13.3 and def. 13.6 that the nilradical and the Jacobson radical of $R/I$ are equal.

4. $\Rightarrow$ 3.: Since $p$ is a radical ideal (it is even a prime ideal), $R/p$ has no nonzero nilpotent elements and thus its nilradical vanishes. Since the Jacobson radical of that ring equals the nilradical by hypothesis (applied to $I=p$), the Jacobson radical vanishes as well.

3. $\Rightarrow$ 2.: This follows by combining 3. $\Rightarrow$ 1. with 1. $\Rightarrow$ 2.

2. $\Rightarrow$ 1.: Every prime ideal is radical. $\Box$

Remaining arrows:

1. $\Rightarrow$ 3.: Let $p$ be a prime ideal of $R$ . Now suppose that $x+p$ is contained within the Jacobson radical of $R/p$ . According to theorem 13.7, $(1-xy)+p$ is a unit within $R/p$ , where $y\in R$ is arbitrary. Write

p=\bigcap _{i\in I}m_{i}

,

where the $m_{i}$ are maximal. We want to prove $x\in p$. Assume otherwise; then there is a $k\in I$ such that $x\notin m_{k}$. Then $\langle x\rangle +m_{k}=R$ and thus $1=xy+s$ with $y\in R$ and $s\in m_{k}$, that is, $s=1-xy$. Let $a+p$ be the inverse of $s+p$, that is, $as-1\in p$. This means $as-1\in m_{i}$ for all $i\in I$, and in particular $as-1\in m_{k}$. Since $s\in m_{k}$, it follows that $1\in m_{k}$, contradiction.

3. $\Rightarrow$ 1.: Let $p\leq R$ be prime. If $p$ is maximal, there is nothing to show. If $p$ is not maximal, $R/p$ is not a field. In this case, there exists a non-unit within $R/p$ , and hence, by theorem 12.1 or 12.2 (applied to $I=(a)$ where $a$ is a non-unit), $R/p$ contains at least one maximal ideal. Furthermore, the Jacobson radical of $R/p$ is trivial, which is why there are some maximal ideals $m_{i},i\in I$ of $R/p$ such that

\bigcap _{i\in I}m_{i}=\{0\}

.

As in the first proof of theorem 12.2, the preimages $K_{i}:=\pi ^{-1}(m_{i})$ under the canonical projection $\pi :R\to R/p$ are maximal ideals of $R$ . Furthermore,

p=\bigcap _{i\in I}K_{i}

.

2. $\Rightarrow$ 4.: Let ${\mathcal {N}}_{I}$ be the nilradical of $R/I$ . We claim that

K:=\pi _{I}^{-1}({\mathcal {N}}_{I})=r(I)

.

Let first $k\in K$, that is, $k+I\in {\mathcal {N}}_{I}$. Then $k^{n}+I=0+I$ for some $n\in \mathbb {N}$, that is, $k^{n}\in I$ and $k\in r(I)$. The other inclusion follows by reading these steps in reverse, since each step is an equivalence.

Due to the assumption, we may write

r(I)=\bigcap _{\alpha \in A}m_{\alpha }

,

where the $m_{\alpha }$ are maximal ideals of $R$ .

Since $\pi _{I}$ is surjective, $\pi _{I}(\pi _{I}^{-1}({\mathcal {N}}_{I}))={\mathcal {N}}_{I}$ . Hence,

{\mathcal {N}}_{I}=\pi _{I}(r(I))=\pi _{I}\left(\bigcap _{\alpha \in A}m_{\alpha }\right)=\bigcap _{\alpha \in A}\pi _{I}(m_{\alpha })

,

where the last equality follows from $\forall \alpha \in A:y+I\in \pi _{I}(m_{\alpha })$ implying that $y=x_{\alpha }+i_{\alpha }$ for $i_{\alpha }\in I\subseteq m_{\alpha }$ and $x_{\alpha }\in m_{\alpha }$ and hence $y\in m_{\alpha }$ for all $\alpha$ . Furthermore, the $\pi _{I}(m_{\alpha })$ are either maximal or equal to $R/I$ , since any ideal $J$ of $R/I$ properly containing $\pi _{I}(m_{\alpha })$ contains one element $y+I$ not contained within $\pi _{I}(m_{\alpha })$ , which is why $y\notin \pi _{I}^{-1}(\pi _{I}(m_{\alpha }))=m_{\alpha }$ , hence $\pi _{I}^{-1}(J)=R$ and thus $J=\pi _{I}(\pi _{I}^{-1}(J))=R/I$ .

Thus, ${\mathcal {N}}_{I}$ is the intersection of some maximal ideals of $R/I$ , and thus the Jacobson radical of $R/I$ is contained within it. Since the other inclusion holds in general, we are done.

4. $\Rightarrow$ 2.: As before, we have

\pi _{I}^{-1}({\mathcal {N}}_{I})=r(I)

.

Let now ${\mathcal {J}}_{I}$ be the Jacobson radical of $R/I$ , that is,

{\mathcal {J}}_{I}=\bigcap _{\alpha \in A}m_{\alpha }

,

where the $m_{\alpha }$ are the maximal ideals of $R/I$ . Then we have by the assumption:

\bigcap _{\alpha \in A}\pi _{I}^{-1}(m_{\alpha })=\pi _{I}^{-1}\left({\mathcal {J}}_{I}\right)=\pi _{I}^{-1}\left({\mathcal {N}}_{I}\right)=r(I)

.

Furthermore, as in the first proof of theorem 12.2, $\pi _{I}^{-1}(m_{\alpha })$ are maximal.
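Statement 4 of theorem 14.3 can be checked by brute force for the Jacobson ring $R=\mathbb {Z}$ and the ideals $I=n\mathbb {Z}$ . The sketch below assumes that the maximal ideals of $\mathbb {Z} /n\mathbb {Z}$ are the $p\mathbb {Z} /n\mathbb {Z}$ for primes $p$ dividing $n$ ; the function names are hypothetical.

```python
def prime_divisors(n):
    """Primes dividing n, found by trial division."""
    return [p for p in range(2, n + 1)
            if n % p == 0 and all(p % q for q in range(2, p))]


def nilradical(n):
    """Classes a in Z/nZ with a^k = 0 for some k; exponents up to n suffice."""
    return {a for a in range(n)
            if any(pow(a, k, n) == 0 for k in range(1, n + 1))}


def jacobson_radical(n):
    """Intersection of the maximal ideals pZ/nZ, p a prime divisor of n."""
    rad = set(range(n))
    for p in prime_divisors(n):
        rad &= {a for a in range(n) if a % p == 0}
    return rad


# Both radicals coincide for every modulus in a small range.
radicals_agree = all(nilradical(n) == jacobson_radical(n) for n in range(2, 80))
```

For $n=12$ , for example, both radicals consist of the classes of $0$ and $6$ .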

Goldman's criteria

Now we shall prove two more characterisations of being a Jacobson ring. These were established by Oscar Goldman.

Theorem 14.4 (Goldman's first criterion):

Let $R$ be a ring. $R$ is Jacobson if and only if $R[x]$ is.

This is the hard one, and we do it right away so that we have it done.

Proof:

One direction ( $\Leftarrow$ ) isn't too horrible. Let $R[x]$ be a Jacobson ring, and let $p_{0}\leq R$ be a prime ideal of $R$ . (We shall denote ideals of $R$ with a small zero as opposed to ideals of $R[x]$ to avoid confusion.)

We now define

p:=p_{0}R[x]+xR[x]

.

This ideal contains exactly the polynomials whose constant term is in $p_{0}$ . It is prime since

fg\in p\Rightarrow f\in p\vee g\in p

as can be seen by comparing the constant coefficients. Since $R[x]$ is Jacobson, for a given $a$ that is not contained within $p_{0}$ , and hence not in $p$ , there exists a maximal ideal $m$ containing $p$ , but not containing $a$ . Set $m_{0}:=m\cap R$ . We claim that $m_{0}$ is maximal. Indeed, we have an isomorphism

R[x]/m\cong R/m_{0}

via

a_{n}x^{n}+\cdots +a_{1}x+a_{0}+m\mapsto a_{0}+m_{0}

.

Therefore, $R[x]/m$ is a field if and only if $R/m_{0}$ is. Hence, $m_{0}$ is maximal, and it does not contain $a$ . Since thus every element outside $p_{0}$ can be separated from $p_{0}$ by a maximal ideal, $R$ is a Jacobson ring.

The other direction $\Rightarrow$ is a bit longer.

Let $R$ be a Jacobson ring; we want to prove that $R[x]$ is Jacobson. Hence, let $p\leq R[x]$ be a prime ideal; we want to show that it is the intersection of maximal ideals.

We first treat the case where $p\cap R=\{0\}$ and $R$ is an integral domain.

Assume first that $p$ contains a nonzero element (i.e. is not equal to the zero ideal).

Assume $g\in R[x]$ is contained within all maximal ideals containing $p$ , but not within $p$ . Let $f\in p$ such that $f$ is of lowest degree among all nonzero polynomials in $p$ . Since $p\cap R=\{0\}$ , $\deg f\geq 1$ . Since $R$ is an integral domain, we can form the quotient field $K=\operatorname {Quot} R$ . Then $R[x]\subseteq K[x]$ .

Assume that $f$ is not irreducible in $K[x]$ . Then $f=f_{1}f_{2}$ , $f_{1},f_{2}\in K[x]$ , where $f_{1}$ , $f_{2}$ are not associated to $f$ . Let $\alpha ,\beta ,\gamma$ such that $\alpha f,\beta f_{1},\gamma f_{2}\in R[x]$ . Then $\alpha \beta \gamma f=\alpha (\beta f_{1})(\gamma f_{2})$ . As $p$ is prime, wlog. $\alpha \beta f_{1}\in p$ . Hence $\deg f_{1}=\deg f$ . Thus, $f$ and $f_{1}$ are associated, contradiction.

$K[x]$ is Euclidean with the degree as absolute value. Uniqueness of prime factorisation gives a definition of the greatest common divisor. Since $f$ is irreducible in $K[x]$ and $g\notin p$ , $\gcd(f,g)=1$ . Applying the Euclidean algorithm, $1=fh_{1}+gh_{2}$ , $h_{1},h_{2}\in K[x]$ . Multiplication by an appropriate constant $b$ yields $b=fbh_{1}+gbh_{2}$ , $bh_{1},bh_{2}\in R[x]$ . Thus, $b\in p+gR[x]$ . Hence, $b$ is contained within every maximal ideal containing $p$ . Further, $p\cap R=\{0\}\Rightarrow b\notin p$ .

Let $m_{0}\leq R$ be any maximal ideal of $R$ not containing $a$ , where $a$ denotes the leading coefficient of $f$ . Set

I:=m_{0}R[x]+p

.

Assume $I=R[x]$ . Then $1=u(x)+v(x)$ , $u\in m_{0}R[x],v\in p$ . We divide $v$ by $f$ by applying a polynomial long division algorithm working for elements of a general polynomial ring: We successively eliminate the first coefficient of $v$ by subtracting an appropriate multiple of $f$ . Should that not be possible, we multiply $v$ by the leading coefficient of $f$ , that shall be denoted by $a$ . Then we cannot eliminate the desired coefficient of $v$ , but we can eliminate the desired coefficient of $av$ . Repeating this process gives us

a^{n}v(x)=f(x)h(x)+i(x)

,

\deg i<\deg f

for $h,i\in R[x]$ . Furthermore, since this equation implies $i\in p$ , we must have $i=0$ since the degree of $f$ was minimal among polynomials in $p$ . Then

{\begin{aligned}a^{n}&=a^{n}v(x)+a^{n}u(x)\\&=f(x)h(x)+a^{n}u(x)\\&=f(x)h(x)+r(x)\end{aligned}}

with $r(x):=a^{n}u(x)\in m_{0}R[x]$ . By moving such coefficients to $r(x)$ , we may assume that no coefficient of $h$ is in $m_{0}$ . Further, $h$ is nonzero since otherwise $a^{n}\in m_{0}\Rightarrow a\in m_{0}$ . Denote the highest coefficient of $h$ by $\delta$ , and the highest coefficient of $r$ by $\epsilon$ . Since the highest coefficients of $fh$ and $r$ must cancel out (as $\deg f\geq 1$ ),

a\delta =-\epsilon

.

Thus, $a\notin m_{0}$ and $\delta \notin m_{0}$, but $a\delta =-\epsilon \in m_{0}$, which is absurd as every maximal ideal is prime. Hence, $I\subsetneq R[x]$.
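The division procedure used in this argument can be sketched for $R=\mathbb {Z}$ . Polynomials are represented as coefficient lists with the constant term first. This is a simplified variant, not necessarily the exact procedure of the text: it multiplies by the leading coefficient $a$ of $f$ in every step, not only when needed, which still yields an identity of the form $a^{n}v=fh+i$ with $\deg i<\deg f$ . The function names are hypothetical.

```python
def degree(p):
    """Degree of a coefficient list [c0, c1, ...]; the zero polynomial gets -1."""
    d = len(p) - 1
    while d >= 0 and p[d] == 0:
        d -= 1
    return d


def pseudo_divide(v, f):
    """Return (n, h, i) with a**n * v == f*h + i and deg i < deg f.

    Here a is the leading coefficient of f, and f must be nonzero.
    """
    df = degree(f)
    a = f[df]
    rem = list(v)
    quo = [0] * max(len(v) - df, 1)
    n = 0
    while degree(rem) >= df:
        dr = degree(rem)
        c = rem[dr]
        # Scale by a so that the leading term a*c*x^dr is cancelled
        # by subtracting c * x^(dr - df) * f.
        rem = [a * t for t in rem]
        quo = [a * t for t in quo]
        n += 1
        shift = dr - df
        quo[shift] += c
        for k in range(df + 1):
            rem[shift + k] -= c * f[k]
    return n, quo, rem


# Example over Z: v = 3x^3 + x + 5 and f = 2x + 1, so a = 2.
n, h, i = pseudo_divide([5, 1, 0, 3], [1, 2])
```

Each pass through the loop preserves the invariant $a^{n}v=fh+i$ and lowers the degree of the remainder, so the loop terminates.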

According to theorem 12.2, there exists a maximal ideal $m\leq R[x]$ containing $I$ . Now $m\cap R$ does not equal all of $R$ , since otherwise $m=R[x]$ . Hence, $m_{0}\subseteq m\cap R$ and the maximality of $m_{0}$ imply $m\cap R=m_{0}$ . Further, $m$ is a maximal ideal containing $p$ and thus contains $b$ . Hence, $b\in m_{0}$ .

Thus, every maximal ideal $m_{0}$ that does not contain $a$ contains $b$ ; that is, $ab\in m_{0}$ for all maximal ideals $m_{0}$ of $R$ . But according to theorem 12.3, we may choose a prime ideal $p_{0}$ of $R$ not intersecting the (multiplicatively closed) set $\{(ab)^{n}|n\in \mathbb {N} \}$ , and since $R$ is a Jacobson ring, there exists a maximal ideal $m_{0}$ containing $p_{0}$ and not containing $ab$ . This is a contradiction.

Let now $p\leq R[x]$ be the zero ideal (which is prime within an integral domain). Assume that there are only finitely many elements in $R[x]$ which are irreducible in $K[x]$ , and call them $f_{1},\ldots ,f_{n}$ . The element

f_{1}(x)\cdots f_{n}(x)+1

factors into irreducible elements, but at the same time is not divisible by any of $f_{1},\ldots ,f_{n}$ , since otherwise wlog.

f_{1}(x)\cdots f_{n}(x)+1=f_{1}(x)\cdot s(x)\Leftrightarrow 1=f_{1}(x)(s(x)-f_{2}(x)\cdots f_{n}(x))

,

which is absurd. Thus, there exists at least one further irreducible element not listed in $f_{1},\ldots ,f_{n}$ , and multiplying this by an appropriate constant yields a further element of $R[x]$ irreducible in $K[x]$ .

Let $f\in R[x]$ be irreducible in $K[x]$ . We form the ideal $\langle f\rangle \leq K[x]$ and define $I_{f}:=R[x]\cap \langle f\rangle$ . We claim that $I_{f}$ is prime. Indeed, if $a(x)b(x)\in \langle f\rangle$ , then $a$ and $b$ factor in $K[x]$ into irreducible components. Since $K[x]$ is a unique factorisation domain, $f$ occurs in at least one of those two factorisations.

Assume there is a nonzero element $w(x)$ contained within all the $I_{f}$ , where $f$ is irreducible over $K[x]$ . $w$ factors in $K[x]$ uniquely into finitely many irreducible components, leading to a contradiction to the infinitude of irreducible elements of $K[x]$ . Hence,

\bigcap _{f\in K[x] \atop f{\text{ irreducible}}}I_{f}=\{0\}

,

where each $I_{f}$ is prime and $I_{f}\cap R=\{0\}$ . Hence, by the previous case, each $I_{f}$ can be written as the intersection of maximal ideals, and thus, so can $p=\{0\}$ .

Now for the general case where $R$ is an arbitrary Jacobson ring and $p\leq R[x]$ is a general prime ideal of $R[x]$ . Set $p_{0}:=p\cap R$ . $p_{0}$ is a prime ideal, since if $ab\in p_{0}$ , where $a,b\in R$ , then $a\in p$ or $b\in p$ , and hence $a\in p_{0}$ or $b\in p_{0}$ . We further set $q:=p_{0}R[x]$ . Then we have

R[x]/q\cong (R/p_{0})[x]

via the isomorphism

\varphi :a_{n}x^{n}+\cdots +a_{1}x+a_{0}+q\mapsto (a_{n}+p_{0})x^{n}+\cdots +(a_{1}+p_{0})x+(a_{0}+p_{0})

.

Set

R':=R/p_{0}

and

p':=\varphi (\pi _{q}(p))

.

Then $R'$ is an integral domain and a Jacobson ring (lemma 14.2), and $p'$ is a prime ideal of $R'[x]$ with the property that $p'\cap R'=\{0\}$ . Hence, by the previous case,

p'=\bigcap _{p'\subseteq m'\leq R'[x] \atop m'{\text{ max.}}}m'

.

Thus, since $q\subseteq p$ ,

p=\pi _{q}^{-1}(\varphi ^{-1}(p'))=(\varphi \circ \pi _{q})^{-1}\left(\bigcap _{p'\subseteq m'\leq R'[x] \atop m'{\text{ max.}}}m'\right)=\bigcap _{p'\subseteq m'\leq R'[x] \atop m'{\text{ max.}}}(\varphi \circ \pi _{q})^{-1}(m')

,

which is an intersection of maximal ideals due to lemma 12.4 and since isomorphisms preserve maximal ideals. $\Box$

Theorem 14.5 (Goldman's second criterion):

A ring $R$ is Jacobson if and only if for every maximal ideal $m\leq R[x]$ , $m_{0}:=m\cap R$ is maximal in $R$ .

Proof:

The reverse direction $\Leftarrow$ is once again easier.

Let $p_{0}\leq R$ be a prime ideal within $R$ , and let $a\notin p_{0}$ . Set

I:=p_{0}R[x]+(ax-1)R[x]

.

Assume $I=R[x]$ . Then there exist $f\in p_{0}R[x]$ , $g\in R[x]$ such that

1=f(x)+(ax-1)g(x)

.

By shifting parts of $g$ to $f$, one may assume that $g$ does not have any coefficients contained within $p_{0}$. Furthermore, if $g=0$, then $1\in p_{0}R[x]$. But $p_{0}R[x]\cap R=p_{0}$, since every polynomial in $p_{0}R[x]$ has all its coefficients in $p_{0}$; hence $1\in p_{0}$, which is impossible. Hence $g\neq 0$; let $b$ be the leading coefficient of $g$. Since the non-constant coefficients of the polynomial $f(x)+(ax-1)g(x)$ must be zero for it to be constantly one, comparing leading coefficients gives $ab\in p_{0}$, contradicting the primality of $p_{0}$, as $a\notin p_{0}$ and $b\notin p_{0}$.

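Thus, let $m\leq R[x]$ be a maximal ideal containing $I$. Assume $m$ contains $a$. Then $ax-(ax-1)=1\in m$ and thus $m=R[x]$, a contradiction. By hypothesis, $m$ contracts to a maximal ideal $m_{0}=m\cap R$ of $R$, which does not contain $a$, but does contain $p_{0}$. Since $a\notin p_{0}$ was arbitrary, $p_{0}$ is the intersection of the maximal ideals containing it. Hence the claim.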

The other direction is more tricky, but not as bad as in the previous theorem.

Let thus $R$ be a Jacobson ring. Assume there exists a maximal ideal $m\leq R[x]$ such that $R\cap m$ is not maximal within $R$ . Define

p_{0}:=m\cap R

and

p:=p_{0}R[x]

.

$p_{0}$ is a prime ideal, since if $a,b\in R$ are such that $ab\in p_{0}$, then $a\in m$ or $b\in m$ (as $m$ is prime) and hence $a\in p_{0}$ or $b\in p_{0}$. Further,

R[x]/p\cong (R/p_{0})[x]

via the isomorphism

\varphi :a_{n}x^{n}+\cdots +a_{1}x+a_{0}+p\mapsto (a_{n}+p_{0})x^{n}+\cdots +(a_{1}+p_{0})x+(a_{0}+p_{0})

.

According to lemma 12.5, $\pi _{p}(m)$ is a maximal ideal within $R[x]/p$ . We set

R':=R/p_{0}

and

m':=\varphi (\pi _{p}(m))

.

Then $R'$ is a Jacobson ring that is not a field, $m'$ is a maximal ideal within $R'[x]$ (isomorphisms preserve maximal ideals) and $m'\cap R'=\{0\}$. Indeed, if $w=a_{n}x^{n}+\cdots +a_{1}x+a_{0}$ is any element of $m$ which is not mapped to zero by $\pi _{p}$, then at least one of $a_{n}+p_{0},\ldots ,a_{1}+p_{0}$ must be nonzero, for if only $a_{0}\notin p_{0}$, then $a_{0}\in (m\cap R)\setminus p_{0}$, which is absurd.

Replacing $R$ by $R'$ and $m$ by $m'$ , we lead the assumption to a contradiction where $R$ is an integral domain but not a field and $m\cap R=\{0\}$ .

$m$ is nonzero, because else $R[x]$ would be a field. Let $f\neq 0$ have minimal degree among the nonzero polynomials of $m$ , and let $a\in R$ be the leading coefficient of $f$ .

Let $n_{0}\leq R$ be an arbitrary maximal ideal of $R$. $n_{0}$ cannot be the zero ideal, for otherwise $R$ would be a field. Hence, let $b\in n_{0}$ be nonzero. Since $m\cap R=\{0\}$, $b\notin m$. Since $m$ is maximal, $m+\langle b\rangle =R[x]$. Hence, $1=g(x)+bh(x)$, where $g\in m$ and $h\in R[x]$. We apply the general division algorithm described above to divide $h$ by $f$ and obtain

a^{n}h(x)=s(x)f(x)+r(x)

for suitable $n\in \mathbb {N}$ and $r,s\in R[x]$ such that $\deg r<\deg f$ . From the equality holding for $h$ we get

a^{n}bh(x)=a^{n}(1-g(x))=bs(x)f(x)+br(x)\Leftrightarrow a^{n}-br(x)=bs(x)f(x)+a^{n}g(x)

.

Hence, $a^{n}-br(x)\in m$ , and since the degree of $f$ was minimal in $m$ , $a^{n}-br(x)=0$ . Since all coefficients of $br(x)$ are contained within $n_{0}$ (since they are multiplied by $b$ ), $a^{n}\in n_{0}$ . Thus $a\in n_{0}$ (maximal ideals are prime).

Hence, $a$ is contained in all maximal ideals of $R$ . But since $R$ was assumed to be an integral domain, this is impossible in view of lemma 12.3 applied to the set $S=\{a^{n}|n\in \mathbb {N} _{0}\}$ , yielding a prime ideal $p_{0}\leq R$ which is separated from $a$ by a maximal ideal since $R$ is a Jacobson ring. Hence, we have obtained a contradiction. $\Box$

The spectrum and the Zariski topology

Definition 16.1:

Let $R$ be a commutative ring. The spectrum of $R$ is the set

\operatorname {Spec} (R):=\left\{p\leq R|p{\text{ prime}}\right\}

;

i.e. the set of all prime ideals of $R$ .

On $\operatorname {Spec} R$ , we will define a topology, turning $\operatorname {Spec} R$ into a topological space. This topology is called the Zariski topology, although it was Alexander Grothendieck who gave the definition in the above generality.

Closed sets

Definition 16.2:

Let $R$ be a ring and $S\subseteq R$ a subset of $R$ . Then define

V(S):=\left\{p\in \operatorname {Spec} R|S\subseteq p\right\}

.

The sets $V(S)$ , where $S$ ranges over subsets of $R$ , satisfy the following equations:

Proposition 16.3:

Let $R$ be a ring, and let $(S_{\alpha })_{\alpha \in A}$ be a family of subsets of $R$ .

$V(\emptyset )=\operatorname {Spec} R$ and $V(R)=\emptyset$
$\bigcap _{\alpha \in A}V(S_{\alpha })=V\left(\bigcup _{\alpha \in A}S_{\alpha }\right)$
If $A=\{\alpha _{1},\ldots ,\alpha _{n}\}$ is finite and the $S_{\alpha _{j}}$ are ideals of $R$ , then $V(S_{\alpha _{1}})\cup \cdots \cup V(S_{\alpha _{n}})=V(S_{\alpha _{1}}\cap \cdots \cap S_{\alpha _{n}})$ .

Proof:

The first two items are straightforward. For the third, we use induction on $n$ . $n=1$ is clear; otherwise, the direction $\subseteq$ is clear, and the other direction follows from lemma 14.20. $\Box$
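The identities of proposition 16.3 can be tried out on the finite spectrum of $\mathbb {Z} /30\mathbb {Z}$ . The sketch assumes that the prime ideals of $\mathbb {Z} /n\mathbb {Z}$ are the $p\mathbb {Z} /n\mathbb {Z}$ for primes $p$ dividing $n$ , and represents each of them by its generator $p$ ; the function names are hypothetical. The third identity is checked for ideals; for arbitrary subsets it can fail, as the last comparison in the sketch shows.

```python
def spec(n):
    """Prime ideals of Z/nZ, each represented by the prime p generating it."""
    return {p for p in range(2, n + 1)
            if n % p == 0 and all(p % q for q in range(2, p))}


def V(S, n):
    """V(S): the primes of Z/nZ that contain every element of S."""
    return {p for p in spec(n) if all(s % p == 0 for s in S)}


def ideal(g, n):
    """The principal ideal generated by the class of g in Z/nZ."""
    return {g * r % n for r in range(n)}


N = 30
whole_ring = set(range(N))

first_item = V(set(), N) == spec(N) and V(whole_ring, N) == set()
second_item = V({6}, N) & V({10}, N) == V({6} | {10}, N)
third_item_for_ideals = (
    V(ideal(2, N), N) | V(ideal(3, N), N) == V(ideal(2, N) & ideal(3, N), N)
)
# For the mere subsets {2} and {3} the intersection is empty, and V of it is everything.
third_item_for_subsets = V({2}, N) | V({3}, N) == V({2} & {3}, N)
```

Here the union $V(\langle 2\rangle )\cup V(\langle 3\rangle )$ consists of the primes 2 and 3, which is also $V(\langle 6\rangle )$ .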

Definition 16.4:

Principal open sets

Topological properties of the spectrum

Noetherian rings

Rings as modules

Theorem 14.1:

We had already observed that a ring $R$ is a module over itself, where the module operation is given by multiplication and the addition by ring addition. In this context, we further have that the submodules of $R$ are exactly the ideals.

Proof: Being a submodule means being an additive subgroup closed under the module operation. In the above context, this is exactly the definition of ideals. $\Box$

Transfer of the properties

Definition 14.2:

Let $R$ be a (commutative) ring. $R$ is called Noetherian if and only if every ascending chain of ideals of $R$

I_{1}\subseteq I_{2}\subseteq \cdots \subseteq I_{k}\subseteq \cdots

eventually becomes stationary.
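The ascending chain condition can be illustrated in $R=\mathbb {Z}$ , where the ideal generated by $a_{1},\ldots ,a_{k}$ is generated by their greatest common divisor. The sketch below builds the chain $I_{k}=\langle a_{1},\ldots ,a_{k}\rangle$ for an arbitrarily chosen list of integers; the list and the function name are assumptions made for illustration.

```python
from itertools import accumulate
from math import gcd


def chain_generators(elements):
    """Generators g_k of I_k = (a_1, ..., a_k) in Z, where g_k = gcd(a_1, ..., a_k)."""
    return list(accumulate(elements, gcd))


gens = chain_generators([720, 1080, 84, 126, 30, 42, 18, 66])

# I_k is contained in I_(k+1) exactly when g_(k+1) divides g_k.
ascending = all(gens[k] % gens[k + 1] == 0 for k in range(len(gens) - 1))
```

Each strict inclusion replaces the generator by a proper divisor, and a positive integer has only finitely many divisors, so every such chain in $\mathbb {Z}$ becomes stationary.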

From theorems 6.7 and 14.1, we obtain the following characterisation of Noetherian rings:

Theorem 14.3:

The following are equivalent:

$R$ is Noetherian.
Every ideal of $R$ is finitely generated.
Every set of ideals of $R$ has a maximal element with respect to inclusion.

In analogy to theorem 6.11, we further obtain

Theorem 14.4:

If $R$ is Noetherian, $S$ is another ring and $\phi :R\to S$ is a surjective ring homomorphism, then $S$ is Noetherian.

Proof 1: Proceed in analogy to theorem 6.11, using the isomorphism theorem of rings. $\Box$

Proof 2: Use theorem 6.11 directly. $\Box$

New properties in the ring setting

When rings are considered, several new properties show themselves in the noetherian case.

Noetherian rings and constructions

In this section we will prove theorems involving Noetherian rings and module or localisation-like structures over them.

Theorem 14.4:

Let $R$ be Noetherian and let $M$ be a finitely generated $R$ -module. Then $M$ is Noetherian.

Theorem 14.5 (Hilbert's basis theorem):

Let $R$ be a Noetherian ring. Then the polynomial ring over $R$ , $R[x]$ , is also Noetherian.

Proof 1:

Consider any ideal $I\leq R[x]$ . We form the ideal $J\leq R$ , that shall contain all the leading coefficients of any polynomials in $I$ ; that is

a\in J:\Leftrightarrow \exists f\in I:f(x)=ax^{m}+{\text{(lower terms)}}

.

Since $R$ is Noetherian, $J$ has a finite set of generators; call those generators $j_{1},\ldots ,j_{n}$ . Each $j_{k}$ is the leading coefficient of a certain polynomial $f_{k}\in I$ ; let $d_{k}$ be the degree of that polynomial for all $1\leq k\leq n$ . Set

d:=\max _{1\leq k\leq n}d_{k}

.

We further form the ideals $K:=\langle f_{1},\ldots ,f_{n}\rangle$ and $L:=\langle 1,x,x^{2},\ldots ,x^{d-1}\rangle \cap I$ of $R[x]$ and claim that

I=K+L

.

Indeed, certainly $K,L\subseteq I$ and thus $K+L\subseteq I$ (see the section on modules). The other direction is seen as follows: If $g(x)\in I$ , $\deg g=m$ , then we let $a\in R$ be the leading coefficient of $g$ , write $a=r_{1}j_{1}+\cdots +r_{n}j_{n}$ for suitable $r_{1},\ldots ,r_{n}\in R$ and then subtract $h(x):=r_{1}f_{1}x^{m-d_{1}}+\cdots +r_{n}f_{n}x^{m-d_{n}}$ , to obtain

\deg(g-h)<m

so long as $m\geq d$ . By repetition of this procedure, we subtract a polynomial $h'\in K$ from $g$ to obtain a polynomial in $L$ , that is, $g\in K+L$ .

However, both $K$ and $L$ are finitely generated ideals ( $\langle 1,x,x^{2},\ldots ,x^{d-1}\rangle$ is finitely generated as an $R$ -module and hence Noetherian by the previous theorem, which is why so is $L$ as a submodule of a Noetherian module). Since the sum of finitely generated ideals is clearly finitely generated, $I$ is finitely generated. $\Box$
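A single reduction step from this proof can be sketched over $R=\mathbb {Z}$ . The polynomials $f_{1}=4x+1$ , $f_{2}=6x^{2}+x$ and $g=2x^{3}+5x^{2}$ are hypothetical choices, as are the function names; coefficient lists start with the constant term. The leading coefficient of $g$ is $2=(-1)\cdot 4+1\cdot 6$ , so $r_{1}=-1$ and $r_{2}=1$ .

```python
def degree(p):
    """Degree of a coefficient list [c0, c1, ...]; the zero polynomial gets -1."""
    d = len(p) - 1
    while d >= 0 and p[d] == 0:
        d -= 1
    return d


def reduce_once(g, fs, rs):
    """Return g - h with h = sum_k r_k * x^(m - d_k) * f_k, m = deg g, d_k = deg f_k.

    Assumes sum_k r_k * lead(f_k) equals the leading coefficient of g
    and that m is at least every d_k.
    """
    m = degree(g)
    out = list(g)
    for f, r in zip(fs, rs):
        d = degree(f)
        for k in range(d + 1):
            out[m - d + k] -= r * f[k]
    return out


f1 = [1, 4]        # 4x + 1, leading coefficient j_1 = 4
f2 = [0, 1, 6]     # 6x^2 + x, leading coefficient j_2 = 6
g = [0, 0, 5, 2]   # 2x^3 + 5x^2

reduced = reduce_once(g, [f1, f2], [-1, 1])
```

The leading terms cancel, so the degree drops from 3 to 2; repeating the step lowers the degree further as long as it is at least $d$ .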

Exercises

Let $R$ be a Noetherian ring, and let $M$ be an $R$ -module. Prove that $M$ is Noetherian if and only if it is finitely generated. (Hint: Is there any surjective ring homomorphism $R[x_{1},\ldots ,x_{n}]\to M$ , where $n$ is the number of generators of $M$ ? If so, what does the first isomorphism theorem say to that?)

Noetherian spaces

Primary decomposition

The following theory was originally developed by world chess champion Emanuel Lasker in his 1905 paper "Zur Theorie der Moduln und Ideale" ("On the theory of modules and ideals") on polynomial rings, and then generalised by Emmy Noether to commutative rings satisfying the ascending chain condition (Noetherian rings), in her revolutionary 1921 paper "Idealtheorie in Ringbereichen".

Primary ideals

Definition 19.4:

An ideal $q\leq R$ is called a primary ideal if and only if the following holds:

xy\in q\Rightarrow x\in q\vee \exists n\in \mathbb {N} :y^{n}\in q

.

Clearly, every prime ideal is primary.

We have the following characterisations:

Theorem 19.5 (characterisations of primary ideals):

Let $q\leq R$ , with $r(q)$ denoting the radical ideal of $q$ . The following are equivalent:

$q$ is primary.
If $xy\in q$ , then either $x\in q$ or $y\in q$ or $x\in r(q)\wedge y\in r(q)$ .
Every zerodivisor of $R/q$ is nilpotent.

Proof 1:

1. $\Rightarrow$ 2.: Let $q$ be primary. Assume $xy\in q$ and neither $x\in q$ nor $y\in q$ . Since $x\notin q$ , $y^{k}\in q$ for a suitable $k\geq 2$ . Since $yx\in q$ and $y\notin q$ , $x^{m}\in q$ for a suitable $m\geq 2$ .

2. $\Rightarrow$ 3.: Let $d+q$ be a zerodivisor of $R/q$ , that is, $cd\in q$ for a certain $c\in R$ such that $c\notin q$ . Hence $d\in r(q)$ , that is, $d^{k}\in q$ for a suitable $k$ .

3. $\Rightarrow$ 1.: Let $xy\in q$ and assume $x\notin q$ . Then $y+q$ is a zerodivisor within $R/q$ , hence nilpotent, which is why $y^{k}\in q$ for a suitable $k$ . $\Box$

Proof 2:

1. $\Rightarrow$ 3.: Let $q$ be primary, and let $x+q$ be a zerodivisor within $R/q$ . Then $xy\in q$ for a $y\notin q$ and hence $x^{k}\in q$ for a suitable $k$ .

3. $\Rightarrow$ 2.: Let $xy\in q$ . Assume neither $x\in q$ nor $y\in q$ . Then both $x+q$ and $y+q$ are zerodivisors in $R/q$ , and hence are nilpotent, which is why $x^{k},y^{m}\in q$ for suitable $k,m$ and hence $x,y\in r(q)$ .

2. $\Rightarrow$ 1.: Let $xy\in q$ . Assume not $x\in q$ and not $y\in q$ . Then in particular $y\in r(q)$ , that is, $y^{k}\in q$ for suitable $k$ . $\Box$
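Characterisation 3 of theorem 19.5 can be tested by brute force in the finite rings $\mathbb {Z} /n\mathbb {Z}$ . The following is a minimal sketch under the assumption $R=\mathbb {Z}$ and $q=n\mathbb {Z}$ with $n\geq 2$ ; the function names are ours and purely illustrative. It recovers the familiar fact that $n\mathbb {Z}$ is primary exactly when $n$ is a prime power.

```python
def is_nilpotent(x, n):
    """Return True if some power of x is 0 in Z/nZ."""
    # If x is nilpotent modulo n, then already x**n is 0 modulo n,
    # because every prime exponent in n is at most n.
    return pow(x, n, n) == 0


def is_zerodivisor(x, n):
    """Return True if x*y = 0 in Z/nZ for some nonzero y."""
    return any((x * y) % n == 0 for y in range(1, n))


def is_primary(n):
    """Test whether nZ (n >= 2) is primary, using the criterion
    'every zerodivisor of Z/nZ is nilpotent'."""
    return all(is_nilpotent(x, n) for x in range(n) if is_zerodivisor(x, n))


if __name__ == "__main__":
    # Lists the n below 20 for which nZ is primary.
    print([n for n in range(2, 20) if is_primary(n)])
```

For $n=12$ the residue of $3$ is a zerodivisor (since $3\cdot 4=12$ ) but not nilpotent, so $12\mathbb {Z}$ is not primary.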

Theorem 19.6:

If $q\leq R$ is any primary ideal, then $r(q)$ is prime.

Proof:

Let $xy\in r(q)$ . Then $(xy)^{n}\in q$ for a suitable $n\in \mathbb {N}$ . Hence either $x^{n}\in q$ and thus $x\in r(q)$ or $(y^{n})^{m}\in q$ for a suitable $m\in \mathbb {N}$ and hence $y\in r(q)$ . $\Box$

Existence

Existence in the Noetherian case

Following the exposition of Zariski, Samuel and Cohen, we deduce the classical Noetherian existence theorem from two lemmas and a definition.

Definition 19.7:

An ideal $I\leq R$ is called irreducible if and only if it cannot be written as the intersection of finitely many proper superideals.

Lemma 19.8:

In a Noetherian ring, every irreducible ideal is primary.

Proof:

Assume there exists an irreducible ideal $I$ which is not primary. Since $I$ is not primary, there exist $x,y\in R$ such that $xy\in I$ , but neither $x\in I$ nor $y^{n}\in I$ for any $n\in \mathbb {N}$ . We form the ascending chain of ideals

(I:y)\subseteq (I:y^{2})\subseteq (I:y^{3})\subseteq \cdots

;

this chain is ascending because $ry^{k}\in I\Rightarrow ry^{k+1}\in I$ . Since we are in a Noetherian ring, this chain eventually stabilizes at some $n\in \mathbb {N}$ ; that is, for $k\geq n$ we have $(I:y^{k})=(I:y^{k+1})$ . We now claim that

I=(I+\langle x\rangle )\cap (I+y^{n}R)

.

Indeed, $\subseteq$ is obvious, and for $\supseteq$ we note that if $r\in (I+\langle x\rangle )\cap (I+y^{n}R)$ , then

r=i+sx=j+y^{n}t

for suitable $i,j\in I$ and $s,t\in R$ , which is why $sx-y^{n}t\in I$ , hence $sxy-y^{n+1}t\in I$ , since $xy\in I$ thus $y^{n+1}t\in I$ , $t\in (I:y^{n+1})=(I:y^{n})$ , hence $y^{n}t\in I$ and $sx\in I$ . Therefore $r\in I$ .

Furthermore, by the choice of $x$ and $y$ both $I+\langle x\rangle$ and $I+y^{n}R$ are proper superideals, contradicting the irreducibility of $I$ . $\Box$

Lemma 19.9:

In a Noetherian ring, every ideal can be written as a finite intersection of irreducible ideals.

Proof:

Assume otherwise. Consider the set of all ideals that are not the finite intersection of irreducible ideals. If we are given an ascending chain within that set

I_{1}\subsetneq I_{2}\subsetneq \cdots

,

this chain has an upper bound, since it stabilizes as we are in a Noetherian ring. We may hence choose a maximal element $I$ among all ideals that are not the finite intersection of irreducible ideals. $I$ itself is thus not irreducible. Hence, it can be written as the intersection of strict superideals; that is

I=J_{1}\cap \cdots \cap J_{n}

for appropriate $J_{i}\supsetneq I$ . Since $I$ is maximal, each $J_{i}$ is a finite intersection of irreducible ideals, and hence so is $I$ , which contradicts the choice of $I$ . $\Box$

Corollary 19.10:

In a Noetherian ring, every ideal can be written as a finite intersection of primary ideals.

Proof:

Combine lemmas 19.8 and 19.9. $\Box$
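In the Noetherian ring $\mathbb {Z}$ the corollary is just prime factorisation: $n\mathbb {Z}$ is the intersection of the primary ideals $p^{e}\mathbb {Z}$ for the prime powers $p^{e}$ exactly dividing $n$ . The sketch below illustrates this special case only; the helper names are hypothetical, and it relies on the fact that the intersection of $a\mathbb {Z}$ and $b\mathbb {Z}$ is generated by the least common multiple.

```python
from math import gcd


def primary_components(n):
    """Prime-power factors p**e of n >= 2, in increasing order of p.
    Each p**e generates a primary ideal of Z."""
    components = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            components.append(q)
        p += 1
    if n > 1:  # whatever remains is a single prime
        components.append(n)
    return components


def intersection_generator(generators):
    """Generator of the intersection of the ideals gZ, i.e. the lcm."""
    result = 1
    for g in generators:
        result = result * g // gcd(result, g)
    return result


if __name__ == "__main__":
    parts = primary_components(360)
    print(parts, intersection_generator(parts))
```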

Minimal decomposition

Definition 19.11:

Let $I\leq R$ be an ideal in a ring, and let

I=\bigcap _{s=1}^{r}q_{s}

be a primary decomposition of $I$ . This decomposition is called minimal if and only if

there does not exist $t\in \{1,\ldots ,r\}$ with $q_{t}\supseteq \bigcap _{s=1 \atop s\neq t}^{r}q_{s}$ , and
for all $i\neq j$ , $r(q_{i})\neq r(q_{j})$ (that is, the radicals of the primary ideals are pairwise distinct).

In fact, once we have a primary decomposition for a given ideal, we can find a minimal primary decomposition of that ideal. But before we prove that, we need a general fact about radicals first.

Lemma 19.12:

Let $I_{1},\ldots ,I_{n}$ be ideals. Then

r\left(\bigcap _{j=1}^{n}I_{j}\right)=\bigcap _{j=1}^{n}r(I_{j})

.

One could phrase this lemma as "radical interchanges with finite intersections".

Proof:

$\subseteq$ :

{\begin{aligned}s\in r\left(\bigcap _{j=1}^{n}I_{j}\right)&\Leftrightarrow \exists k\in \mathbb {N} :s^{k}\in \bigcap _{j=1}^{n}I_{j}\\&\Leftrightarrow \exists k\in \mathbb {N} :\forall j\in \{1,\ldots ,n\}:s^{k}\in I_{j}\\&\Rightarrow \forall j\in \{1,\ldots ,n\}:\exists k\in \mathbb {N} :s^{k}\in I_{j}\\&\Leftrightarrow s\in \bigcap _{j=1}^{n}r(I_{j}).\end{aligned}}

$\supseteq$ : Let $s\in \bigcap _{j=1}^{n}r(I_{j})$ . For each $j$ , choose $k_{j}$ such that $s^{k_{j}}\in I_{j}$ . Set

k:=\max\{k_{1},\ldots ,k_{n}\}

.

Then $s^{k}\in \bigcap _{j=1}^{n}I_{j}$ , hence $s\in r\left(\bigcap _{j=1}^{n}I_{j}\right)$ . $\Box$

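Note that for infinite intersections, the lemma need not be true. For instance, in $\mathbb {Z}$ we have $\bigcap _{j\in \mathbb {N} }\langle 2^{j}\rangle =\langle 0\rangle$ , whose radical is $\langle 0\rangle$ , whereas $r(\langle 2^{j}\rangle )=\langle 2\rangle$ for every $j$ .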
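For a quick numerical check of lemma 19.12, one may again specialise to $R=\mathbb {Z}$ . There, $a\mathbb {Z} \cap b\mathbb {Z}$ is generated by $\operatorname {lcm} (a,b)$ , and $r(n\mathbb {Z} )$ is generated by the product of the distinct primes dividing $n$ . The following sketch (with hypothetical helper names) compares both sides of the lemma for two ideals.

```python
from math import gcd


def radical(n):
    """Product of the distinct primes dividing n >= 1; generates r(nZ)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    return result * n  # the remaining n is 1 or a single prime


def lcm(a, b):
    """Generator of the intersection of aZ and bZ."""
    return a * b // gcd(a, b)


def lemma_holds(a, b):
    """Compare r(aZ ∩ bZ) with r(aZ) ∩ r(bZ) via their generators."""
    return radical(lcm(a, b)) == lcm(radical(a), radical(b))


if __name__ == "__main__":
    print(all(lemma_holds(a, b) for a in range(1, 60) for b in range(1, 60)))
```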

Theorem 19.13:

Let $I\leq R$ be an ideal in a ring that has a primary decomposition. Then $I$ also has a minimal primary decomposition.

Proof:

First of all, we may exclude all primary ideals $q_{t}$ for which

q_{t}\supseteq \bigcap _{s=1 \atop s\neq t}^{r}q_{s}

;

the intersection won't change if we do that, for intersecting with a superset changes nothing in general.

Then assume we are given a decomposition

I=\bigcap _{j=1}^{n}q_{j}

,

and for a fixed prime ideal $p$ set

q_{p}:=\bigcap _{r(q_{j})=p}q_{j}

;

due to theorem 19.6,

I=\bigcap _{p\leq R{\text{ prime}}}q_{p}

.

We claim that $r(q_{p})=p$ and that $q_{p}$ is primary. For the first claim, note that by the previous lemma

r\left(\bigcap _{r(q_{j})=p}q_{j}\right)=\bigcap _{r(q_{j})=p}r(q_{j})=p

.

For the second claim, let $xy\in q_{p}$ . If $x\in q_{p}$ there is nothing to prove. Otherwise let $x\notin q_{p}$ . Then there exists $q_{l}$ with $r(q_{l})=p$ such that $x\notin q_{l}$ , and hence $y^{k}\in q_{l}$ for a suitable $k$ . Thus $y\in p$ , and hence $y^{k_{j}}\in q_{j}$ for all $j$ with $r(q_{j})=p$ and suitable $k_{j}$ . Pick

m:=\max\{k_{j}|r(q_{j})=p\}

.

Then $y^{m}\in q_{p}$ . Hence, $q_{p}$ is primary. $\Box$

Uniqueness properties

In general, we don't have uniqueness for primary decompositions, but still, any two primary decompositions of the same ideal in a ring look somewhat similar. The classical first and second uniqueness theorems uncover some of these similarities.

Theorem 19.14 (first uniqueness theorem):

Let $I\leq R$ be an ideal within a ring $R$ , and assume we are given a minimal primary decomposition

I=\bigcap _{j=1}^{n}q_{j}

.

Then the prime (theorem 19.6) ideals $p_{j}:=r(q_{j})$ are exactly the prime ideals among the ideals $r((I:x)),x\in R$ and hence are independent of the choice of the particular decomposition. That is, the ideals $p_{j}$ are uniquely determined by $I$ .

Proof:

We begin by deducing an equation. According to theorem 19.2 and lemma 19.12,

r((I:x))=r\left(\left(\bigcap _{j=1}^{n}q_{j}:x\right)\right)=r\left(\bigcap _{j=1}^{n}(q_{j}:x)\right)=\bigcap _{j=1}^{n}r((q_{j}:x))

.

Now we fix $q_{j}$ and distinguish a few cases.

If $x\in q_{j}$ , then obviously $(q_{j}:x)=R$ .
If $x\notin p_{j}$ (where again $p_{j}=r(q_{j})$ ), then if $sx\in q_{j}$ we must have $s\in q_{j}$ since no power of $x$ is contained within $q_{j}$ ; hence $(q_{j}:x)=q_{j}$ and $r((q_{j}:x))=p_{j}$ .
If $x\in p_{j}$ , but $x\notin q_{j}$ , we have $r((q_{j}:x))=p_{j}$ , since
${\begin{aligned}s\in r((q_{j}:x))&\Leftrightarrow \exists k\in \mathbb {N} :s^{k}x\in q_{j}\\&\Leftrightarrow \exists k\in \mathbb {N} :\exists m\in \mathbb {N} :(s^{k})^{m}\in q_{j}\\&\Leftrightarrow s\in p_{j}.\end{aligned}}$

In conclusion, we find

r((I:x))=\bigcap _{j=1 \atop x\notin q_{j}}^{n}p_{j}

.

Assume first that $r((I:x))$ is prime. A prime ideal that contains a finite intersection of ideals contains one of them (proposition 19.20), so $r((I:x))\supseteq p_{j}$ for some $j$ with $x\notin q_{j}$ ; and since $r((I:x))\subseteq p_{j}=r((q_{j}:x))$ by the above equation, $r((I:x))=p_{j}$ .

Let now $p_{j}$ for $j\in \{1,\ldots ,n\}$ be given. Since the given primary decomposition is minimal, we find $x\in R$ such that $x\notin q_{j}$ , but $x\in \bigcap _{l=1 \atop l\neq j}^{n}q_{l}$ . In this case, $r((I:x))=p_{j}$ by the above equation. $\Box$
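The first uniqueness theorem can be made concrete in $R=\mathbb {Z}$ , where all the ideals involved are principal. For $I=n\mathbb {Z}$ one has $(I:x)=(n/\gcd(n,x))\mathbb {Z}$ , and the radical of $m\mathbb {Z}$ is generated by the product of the distinct primes dividing $m$ . The sketch below, whose function names are our own choice, collects the prime ideals among the $r((I:x))$ and finds exactly the primes dividing $n$ .

```python
from math import gcd


def radical(n):
    """Product of the distinct primes dividing n >= 1; generates r(nZ)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    return result * n  # the remaining n is 1 or a single prime


def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))


def colon_generator(n, x):
    """Generator of the colon ideal (nZ : x) = {r in Z : r*x in nZ}."""
    return n // gcd(n, x)


def primes_belonging_to(n):
    """Generators of the prime ideals among r((nZ : x)).
    It suffices to let x run through 1..n, since gcd(n, x) only
    depends on x modulo n."""
    radicals = {radical(colon_generator(n, x)) for x in range(1, n + 1)}
    return sorted(r for r in radicals if is_prime(r))


if __name__ == "__main__":
    print(primes_belonging_to(360))
```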

This theorem motivates and enables the following definition:

Definition 19.15:

Let $I$ be any ideal that has a minimal primary decomposition

I=\bigcap _{j=1}^{n}q_{j}

.

Then the ideals $p_{j}:=r(q_{j})$ are called the prime ideals belonging to $I$ .

We now prove two lemmas, each of which yields a proof of the second uniqueness theorem (see below).

Lemma 19.16:

Let $I\leq R$ be an ideal which has a primary decomposition

I=\bigcap _{j=1}^{n}q_{j}

,

and let again $p_{j}:=r(q_{j})$ for all $j$ . If we define

q_{j}':=\{x\in R|(I:x)\not \subseteq p_{j}\}

,

then $q_{j}'$ is an ideal of $R$ and $q_{j}'\subseteq q_{j}$ .

Proof:

Let $a,b\in q_{j}'$ . There exists $c\notin p_{j}$ such that $c\langle a\rangle \subseteq I$ , and a similar $d\notin p_{j}$ with the analogous property in regard to $b$ . Hence $cd\langle a-b\rangle \subseteq I$ , but not $cd\in p_{j}$ since $p_{j}$ is prime. Also, $c\langle sa\rangle \subseteq I$ for every $s\in R$ . Hence, we have an ideal.

Let $x\in q_{j}'$ . There exists $c\notin p_{j}$ such that

c\langle x\rangle \subseteq I=\bigcap _{i=1}^{n}q_{i}

.

In particular, $cx\in q_{j}$ . Since no power of $c$ is in $q_{j}$ (as $c\notin p_{j}=r(q_{j})$ ) and $q_{j}$ is primary, $x\in q_{j}$ . $\Box$

Lemma 19.17:

Let $S\subseteq R$ be multiplicatively closed, and let

\pi _{S}:R\to S^{-1}R,r\mapsto r/1

be the canonical morphism. Let $I$ be a decomposable ideal, that is

I=\bigcap _{j=1}^{n}q_{j}

for primary $q_{j}$ , and number the $q_{j}$ such that the first $r$ $q_{j}$ have empty intersection with $S$ , and the others nonempty intersection. Then

\pi _{S}^{-1}\circ \pi _{S}(I)=\bigcap _{j=1}^{r}q_{j}

.

Proof:

We have

\pi _{S}(I)=\pi _{S}\left(\bigcap _{j=1}^{n}q_{j}\right)=\bigcap _{j=1}^{n}\pi _{S}(q_{j})

by theorem 9.?. If now $S\cap q_{j}\neq \emptyset$ , lemma 9.? yields $\pi _{S}(q_{j})=S^{-1}R$ . Hence,

\pi _{S}(I)=\bigcap _{j=1}^{n}\pi _{S}(q_{j})=\bigcap _{j=1}^{r}\pi _{S}(q_{j})

.

Application of $\pi _{S}^{-1}$ on both sides yields

\pi _{S}^{-1}\circ \pi _{S}(I)=\bigcap _{j=1}^{r}\pi _{S}^{-1}\pi _{S}(q_{j})

,

and

\pi _{S}^{-1}\pi _{S}(q_{j})=q_{j}

since $\supseteq$ holds for general maps, and $x\in \pi _{S}^{-1}\pi _{S}(q_{j})$ means $\pi _{S}(x)=r/s$ , where $r\in q_{j}$ and $s\in S$ ; thus $\pi _{S}(sx)=r/1$ , that is $(sx)/1=r/1$ . This means that

\exists t\in S:tsx=tr

.

Hence $tsx\in q_{j}$ , and since no power of $ts$ is in $q_{j}$ ( $S$ is multiplicatively closed and $S\cap q_{j}=\emptyset$ ), $x\in q_{j}$ . $\Box$

Definition 19.18:

Let $I$ be an ideal which admits a primary decomposition, and let $P\subseteq \operatorname {Spec} R$ be a set of prime ideals of $R$ that all belong to $I$ . $P$ is called isolated if and only if for every prime ideal $p\in P$ , if $p'$ is a prime ideal belonging to $I$ such that $p'\subseteq p$ , then $p'\in P$ as well.

Theorem 19.19 (second uniqueness theorem):

Let $I$ be an ideal that has a minimal primary decomposition. If $P=\{p_{i_{1}},\ldots ,p_{i_{k}}\}$ is a subset of the set of the prime ideals belonging to $I$ which is isolated, then

\bigcap _{l=1}^{k}q_{i_{l}}

is independent of the particular minimal primary decomposition from which the $q_{i_{l}}$ are coming.

Note that applied to isolated sets consisting of only one prime ideal, this means that if a prime ideal $p_{i}$ belonging to $I$ does not strictly contain any other prime ideal belonging to $I$ , then the corresponding $q_{i}$ is predetermined.

Proof 1 (using lemma 19.16):

We first reduce the theorem down to the case where $P$ is the set of all prime subideals belonging to $I$ of a prime ideal that belongs to $I$ . Let $P$ be any isolated system. For each maximal element of that set $p_{r}$ (w.r.t. inclusion) define $P_{r}$ to be the set of all ideals in $P$ contained in $p_{r}$ . Since $P$ is finite,

P=\bigcup _{p_{r}{\text{ maximal in }}P}P_{r}

;

this need not be a disjoint union (note that these are not maximal ideals!). Hence

\bigcap _{l=1}^{k}q_{i_{l}}=\bigcap _{p_{r}{\text{ maximal in }}P}\bigcap _{p_{j}\in P_{r}}q_{j}

.

Hence, let $p$ be a prime ideal belonging to $I$ and let $P=\{p_{i_{1}},\ldots ,p_{i_{k}}\}$ be the set of all prime ideals belonging to $I$ that are contained in $p$ . Let $p_{j_{1}},\ldots ,p_{j_{m}}$ be all the prime ideals belonging to $I$ not in $P$ . For those ideals, we have $p_{j_{l}}\not \subseteq p$ , and hence we find $b_{j_{l}}\in p_{j_{l}}\setminus p$ . For each $l$ take $s(l)\in \mathbb {N}$ large enough so that $b_{j_{l}}^{s(l)}\in q_{j_{l}}$ . Then

b:=\prod _{l=1}^{m}b_{j_{l}}^{s(l)}\in \bigcap _{l=1}^{m}q_{j_{l}}

,

which is why $b\bigcap _{p_{t}\in P}q_{t}\subseteq I$ . From this follows that

\bigcap _{p_{t}\in P}q_{t}\subseteq q'

,

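where $q$ is the element in the primary decomposition of $I$ to which $p$ is associated and $q'$ is formed from $p$ as in lemma 19.16, since clearly for each element $x$ of the left hand side, $bx\in I$ and thus $b\langle x\rangle \subseteq I$ , but also $b\notin p$ (as $p$ is prime and no $b_{j_{l}}$ lies in $p$ ). On the other hand, let $p_{t}\in P$ , so that $p_{t}\subseteq p$ . If $x\in q'$ , there is $c\notin p$ with $cx\in I\subseteq q_{t}$ ; since $c\notin p_{t}=r(q_{t})$ , no power of $c$ lies in $q_{t}$ , and as $q_{t}$ is primary, $x\in q_{t}$ (this is the argument of lemma 19.16). Hence for any such $t$

q'\subseteq q_{t}

,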

which in turn implies

\bigcap _{p_{t}\in P}q_{t}\supseteq q'

.

\Box

Proof 2 (using lemma 19.17):

Let $\{p_{i_{1}},\ldots ,p_{i_{k}}\}$ be an isolated system of prime ideals belonging to $I$ . Pick

S:=R\setminus (p_{i_{1}}\cup \ldots \cup p_{i_{k}})=(R\setminus p_{i_{1}})\cap \cdots \cap (R\setminus p_{i_{k}})

,

which is multiplicatively closed since it's the intersection of multiplicatively closed subsets. The primary ideals of the decomposition of $I$ which correspond to the $p_{i_{l}}$ are precisely those having empty intersection with $S$ : any other primary ideal $q_{j}$ in the decomposition of $I$ must contain an element outside all of $p_{i_{1}},\ldots ,p_{i_{k}}$ , since otherwise prime avoidance (theorem 18.1) would give $q_{j}\subseteq p_{i_{l}}$ for some $l$ , hence $r(q_{j})\subseteq p_{i_{l}}$ , and $r(q_{j})$ would be one of them by isolatedness. Hence, lemma 19.17 gives

\bigcap _{l=1}^{k}q_{i_{l}}=\pi _{S}^{-1}\circ \pi _{S}(I)

and we have independence of the particular decomposition. $\Box$

Characterisation of prime ideals belonging to an ideal

The following are useful further theorems on primary decomposition.

First of all, we give a proposition on general prime ideals.

Proposition 19.20:

Let $R$ be a (commutative) ring, and let $p\leq R$ be a prime ideal. If $p$ contains

either the intersection

\bigcap _{j=1}^{n}I_{j}

or the product

I_{1}\cdot I_{2}\cdots I_{n}

of certain arbitrary ideals, then it contains one of the $I_{j}$ completely.

Proof:

Since the product is contained in the intersection, it suffices to prove the proposition under the assumption that $I_{1}\cdots I_{n}\subseteq p$ .

Indeed, assume none of the $I_{1},\ldots ,I_{n}$ is contained in $p$ . Choose $r_{j}\in I_{j}\setminus p$ for $j=1,\ldots ,n$ . Since $p$ is prime, $r_{1}\cdots r_{n}\notin p$ . But it's in the product, contradiction. $\Box$

This proposition has far-reaching consequences for primary decomposition, given in Corollary 19.22. But first, we need a lemma.

Lemma 19.21:

Let $q\leq R$ be a primary ideal, and assume $p\leq R$ is prime such that $q\subseteq p$ . Then $r(q)\subseteq p$ .

Proof:

If $x^{n}\in q$ , then $x^{n}\in p$ , and hence $x\in p$ since $p$ is prime. $\Box$

Corollary 19.22:

Let $I$ be an ideal admitting a primary decomposition

I=\bigcap _{j=1}^{n}q_{j}

.

If $p\leq R$ is any prime ideal that contains $I$ , then it also contains a prime ideal belonging to $I$ . Further, the minimal elements of $V(I)\subseteq \operatorname {Spec} (R)$ with respect to the partial order induced by inclusion are exactly the minimal ones among the prime ideals belonging to $I$ .

Proof:

The first assertion follows from proposition 19.20 and lemma 19.21. The second assertion follows since any prime ideal belonging to $I$ contains $I$ . $\Box$

Artinian rings

Definition, first property

Definition 19.1:

A ring $R$ is called artinian if and only if each descending chain

I_{1}\supseteq I_{2}\supseteq \cdots \supseteq I_{k}\supseteq \cdots

of ideals of $R$ eventually terminates.

Equivalently, $R$ is artinian if and only if it is artinian as an $R$ -module over itself.

Proposition 19.2:

Let $R$ be an artinian integral domain. Then $R$ is a field.

Proof:

Let $r\in R$ be nonzero. Consider in $R$ the descending chain

\langle r\rangle \supseteq \langle r^{2}\rangle \supseteq \langle r^{3}\rangle \supseteq \cdots \supseteq \langle r^{n}\rangle \supseteq \cdots

.

Since $R$ is artinian, this chain eventually stabilizes; in particular, there exists $n\in \mathbb {N}$ such that

\langle r^{n}\rangle =\langle r^{n+1}\rangle

.

Then write $r^{n}=sr^{n+1}$ , that is, $r^{n}(1-sr)=0$ , that is (as we are in an integral domain and $r^{n}\neq 0$ ) $sr=1$ and $r$ has an inverse. $\Box$
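The argument of proposition 19.2 can be traced in the finite (hence artinian) rings $\mathbb {Z} /n\mathbb {Z}$ . The following sketch is only an illustration with hypothetical function names: it finds the first index $k$ at which $\langle r^{k}\rangle =\langle r^{k+1}\rangle$ and an $s$ with $r^{k}=sr^{k+1}$ . For prime $n$ the ring is an integral domain and $s$ is the inverse of $r$ ; for composite $n$ the cancellation step of the proof may fail.

```python
def principal_ideal(g, n):
    """The ideal of Z/nZ generated by g, as a frozenset of residues."""
    return frozenset((g * t) % n for t in range(n))


def stabilise(r, n):
    """Return (k, s): k is the first index with <r^k> = <r^(k+1)> in Z/nZ,
    and s is the smallest residue with r^k = s * r^(k+1)."""
    k = 1
    while principal_ideal(pow(r, k, n), n) != principal_ideal(pow(r, k + 1, n), n):
        k += 1
    # Such an s exists because r^k lies in the ideal generated by r^(k+1).
    s = next(s for s in range(n)
             if (s * pow(r, k + 1, n) - pow(r, k, n)) % n == 0)
    return k, s


if __name__ == "__main__":
    print(stabilise(3, 7))   # Z/7Z is a field
    print(stabilise(2, 12))  # Z/12Z is not an integral domain
```

In $\mathbb {Z} /12\mathbb {Z}$ with $r=2$ the chain stabilises at $\langle 4\rangle =\langle 8\rangle$ , but the resulting $s$ does not satisfy $sr=1$ , which shows where the integral domain hypothesis enters.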

Corollary 19.3:

Let $R$ be an artinian ring. Then each prime ideal of $R$ is maximal.

Proof:

If $p\leq R$ is a prime ideal, then $R/p$ is an artinian (theorem 12.9) integral domain, hence a field, hence $p$ is maximal. $\Box$

Characterisation

Theorem 19.4:

Let $R$ be a ring. We have:

$R$ is artinian $\Leftrightarrow$ $R$ is noetherian and every prime ideal of $R$ is maximal.

Proof:

First assume that the zero ideal $\langle 0\rangle$ of $R$ can be written as a product of maximal ideals; i.e.

\langle 0\rangle =m_{1}\cdots m_{n}

for certain maximal ideals $m_{1},\ldots ,m_{n}\leq R$ . In this case, if either chain condition is satisfied, one may consider the normal series of $R$ considered as an $R$ -module over itself given by

R\geq m_{1}\geq m_{1}\cdot m_{2}\geq \cdots \geq m_{1}\cdot m_{2}\cdots m_{n}=\langle 0\rangle

.

Consider the quotient modules $m_{1}\cdots m_{k}/m_{1}\cdots m_{k+1}$ . This is a vector space over the field $R/m_{k+1}$ ; for, it is an $R$ -module, and $m_{k+1}$ annihilates it.

Hence, in the presence of either chain condition, we have a finite-dimensional vector space, and thus $R$ has a composition series (use theorem 12.9 and proceed from left to right to get a composition series), so that both chain conditions hold. We shall now go on to prove that $\langle 0\rangle$ is a product of maximal ideals in each of the following cases:

$R$ is noetherian and every prime ideal is maximal
$R$ is artinian.

1.: If $R$ is noetherian, every ideal (in particular $\langle 0\rangle$ ) contains a product of prime ideals, hence equals a product of prime ideals. All these are then maximal by assumption.

2.: If $R$ is artinian, we use the descending chain condition to show that if (for a contradiction) $\langle 0\rangle$ is not a product of prime ideals, the set of ideals of $R$ that are products of prime ideals is inductive with respect to the reverse order of inclusion, and hence contains a minimal (w.r.t. inclusion) element $I\neq \langle 0\rangle$ . We lead this to a contradiction.

We form $A:=(\langle 0\rangle :I)$ . Since $1\notin A$ as $I\neq 0$ , $A\neq R$ . Again using that $R$ is artinian, we pick $B$ minimal subject to the condition $B>A$ . We set $p:=(A:B)$ and claim that $p$ is prime. Let indeed $a\notin p$ and $b\notin p$ . We have

A\subsetneq aB+A\subseteq B

, hence, by minimality of $B$ , $aB+A=B$ , and similarly for $b$ . Therefore

abB+A=a(bB+A)+A=aB+A=B

,

whence $ab\notin p$ . We will soon see that $p\neq R$ . Indeed, we have $pB\leq A$ , hence $IpB\subseteq \langle 0\rangle$ and therefore

(\langle 0\rangle :Ip)\geq B>A=(\langle 0\rangle :I)

.

This shows $p\neq R$ , and $Ip\subsetneq I$ contradicts the minimality of $I$ . $\Box$

Krull dimension

Definition 17.1:

Let $R$ be a ring. The (Krull) dimension of $R$ is defined to be

\dim R:=\sup\{n\in \mathbb {N} |\exists p_{0},\ldots ,p_{n}\leq R{\text{ prime }}:p_{0}\subsetneq p_{1}\subsetneq \cdots \subsetneq p_{n}\}

.

Theorem 18.1 (prime avoidance):

Let $J,I_{1},\ldots ,I_{n}$ be ideals within a ring $R$ such that at most two of the ideals $I_{1},\ldots ,I_{n}$ are not prime ideals. If $J\subseteq \bigcup \limits _{k=1}^{n}I_{k}$ , then there exists an $m\in \{1,\ldots ,n\}$ such that $J\subseteq I_{m}$ .

Proof 1:

We prove the theorem directly. First consider the case $n=2$ . Assume, for a contradiction, that $J$ is contained in neither $I_{1}$ nor $I_{2}$ , and let $a\in J\setminus I_{1}$ and $b\in J\setminus I_{2}$ . Then $a\in I_{2}$ , $b\in I_{1}$ and $a+b\in J\subseteq I_{1}\cup I_{2}$ . In case $a+b\in I_{1}$ , we have $a\in I_{1}$ and in case $a+b\in I_{2}$ we have $b\in I_{2}$ . Both are contradictions.

Now consider the case $n>2$ . Without loss of generality, we may assume $I_{1},I_{2}$ are not prime and all the other ideals are prime. If $J\subseteq I_{1}\cup I_{2}$ , the claim follows by what we already proved. Otherwise, there exists an element $b\in J\cap (I_{3}\cup \cdots \cup I_{n}\setminus (I_{1}\cup I_{2}))$ . Without loss of generality, we may assume $b\in J\cap (I_{3}\setminus (I_{1}\cup I_{2}))$ . We claim that $J\subseteq I_{3}$ . First assume

Assume otherwise. If there exists $a\in I_{1}$ (or $I_{2}$ ), then .

INCOMPLETE

Proof 2:

We prove the theorem by induction on $n$ . The case $n=2$ we take from the preceding proof. Let $n>2$ , and assume, for a contradiction, that $J$ is not contained in any of the $I_{k}$ . By the induction hypothesis, $J$ is then not contained within any of the unions $I_{1}\cup \cdots \cup {\hat {I_{k}}}\cup \cdots \cup I_{n}$ , $k\in \{1,\ldots ,n\}$ , where the hat symbol means that the $k$ -th ideal is omitted from the union. Hence, we may choose for each $k\in \{1,\ldots ,n\}$ an element $a_{k}\in J\setminus (I_{1}\cup \cdots \cup {\hat {I_{k}}}\cup \cdots \cup I_{n})$ ; since $J\subseteq I_{1}\cup \cdots \cup I_{n}$ , we have $a_{k}\in I_{k}$ . Since $n>2$ , at least one of the ideals $I_{1},\ldots ,I_{n}$ is prime; say $I_{m}$ is this prime ideal. Consider the element of $J$

b:=a_{m}+a_{1}\cdots a_{m-1}a_{m+1}\cdots a_{n}

.

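For $j\neq m$ , $b$ is not contained in $I_{j}$ because otherwise $a_{m}$ would be contained within $I_{j}$ (the product contains the factor $a_{j}$ , which lies in $I_{j}$ since $J\subseteq \bigcup _{k}I_{k}$ ). For $j=m$ , $b$ is also not contained within $I_{j}$ , this time because otherwise $a_{1}\cdots a_{m-1}a_{m+1}\cdots a_{n}\in I_{j}=I_{m}$ , so that some $a_{k}$ with $k\neq m$ would lie in $I_{m}$ since $I_{m}$ is prime, contradicting the choice of $a_{k}$ . Hence $b\in J$ lies in none of the $I_{j}$ , contradicting the hypothesis. $\Box$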

Valuation rings

Augmented ordered Abelian groups

In this section, for reasons that will become apparent soon, we write Abelian groups multiplicatively.

Definition 18.1:

An ordered Abelian group is an Abelian group $G$ together with a subset $T\subset G$ such that:

$T$ is closed under multiplication (that is, $a,b\in T\Rightarrow ab\in T$ ).
If $a\in T$ , then $a^{-1}\notin T$ . (This implies in particular that $1\notin T$ .)
$G=T\cup \{1\}\cup T^{-1}$ .

We write ordered Abelian groups as a pair $(G,T)$ .

The last two conditions may be summarized as: $G$ is the disjoint union of $T$ , $\{1\}$ and $T^{-1}$ .

Theorem 18.2:

Let an ordered Abelian group $(G,T)$ be given. Define an order on $G$ by

a<b:\Leftrightarrow ab^{-1}\in T

,

a\leq b:\Leftrightarrow a<b\vee a=b

.

Then $\leq$ has the following properties:

$\leq$ is a total order of $G$ .
$\leq$ is compatible with multiplication of $G$ (that is, $c\in G$ and $a\leq b$ implies $ca\leq cb$ ).

Proof:

We first prove the first assertion.

$\leq$ is reflexive by definition. It is also transitive: Let $a\leq b$ and $b\leq c$ . When $a=b$ or $b=c$ , the claim $a\leq c$ follows trivially by replacing $b$ in either of the given inequalities. Thus assume $ab^{-1}\in T$ and $bc^{-1}\in T$ . Then $(ab^{-1})(bc^{-1})=ac^{-1}\in T$ and hence $a\leq c$ (even $a<c$ ).

Let $a\leq b$ and $b\leq a$ . Assume $a\neq b$ for a contradiction. Then $ab^{-1}\in T$ and $ba^{-1}\in T$ , and since $T$ is closed under multiplication, $1\in T$ , contradiction. Hence $a=b$ .

Let $a,b\in G$ such that $a\neq b$ . Since $G=T\cup \{1\}\cup T^{-1}$ , $ab^{-1}$ (which is not equal to $1$ ) is either in $T$ or in $T^{-1}$ (but not in both, since otherwise $ab^{-1}\in T^{-1}\Rightarrow (ab^{-1})^{-1}\in T$ and since $ab^{-1}\in T$ , $1\in T$ , contradiction). Thus either $a<b$ or $b<a$ .

Then we proceed to the second assertion.

Let $c\in G$ . If $a=b$ , the claim is trivial. If $a<b$ , then $ab^{-1}\in T$ , but $ab^{-1}=acc^{-1}b^{-1}=ac(bc)^{-1}$ . Hence $ac<bc$ . $\Box$

Definition 18.3:

Let $(G,T)$ be an ordered Abelian group. An augmented ordered Abelian group is $(G,T)$ together with an element $0$ (zero) such that the following rules hold:

00=0

,

\forall g\in G:0g=0

.

We write an augmented ordered Abelian group as a triple $(G,T,0)$ .

Valuations and valuation rings

Definition 18.4:

Let $\mathbb {F}$ be a field, and let $(G,T,0)$ be an augmented ordered Abelian group. A valuation of the field $\mathbb {F}$ is a mapping $\varphi :\mathbb {F} \to G\cup \{0\}$ such that:

$\varphi (x)=0\Leftrightarrow x=0$ .
$\forall x,y\in \mathbb {F} :\varphi (xy)=\varphi (x)\varphi (y)$ .
$\forall x,y\in \mathbb {F} :\varphi (x+y)\leq \max\{\varphi (x),\varphi (y)\}$ .

Definition 18.5:

A valuation ring is an integral domain $D$ , such that there exists an augmented ordered Abelian group $(G,T,0)$ and a valuation $\varphi :\operatorname {Quot} (D)\to G\cup \{0\}$ with $D=\{c\in \operatorname {Quot} (D)|\varphi (c)\leq 1\}$ .
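A standard example that fits definitions 18.4 and 18.5 is the $p$ -adic absolute value on $\mathbb {Q}$ , which sends $x=p^{v}{\tfrac {a}{b}}$ (with $a,b$ not divisible by $p$ ) to $p^{-v}$ . The sketch below is an illustration under the assumption that the value group is $\{p^{k}|k\in \mathbb {Z} \}$ , ordered as a subset of the rationals, with $T$ the elements smaller than $1$ ; the function names are ours. The associated valuation ring consists of the fractions whose reduced denominator is not divisible by $p$ .

```python
from fractions import Fraction


def padic_abs(x, p):
    """p-adic absolute value of a rational x: p**(-v), where v is the
    exponent of p in x, and 0 for x == 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:  # powers of p in the numerator raise v
        num //= p
        v += 1
    while den % p == 0:  # powers of p in the denominator lower v
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def in_valuation_ring(x, p):
    """Membership in D = {c : phi(c) <= 1}."""
    return padic_abs(x, p) <= 1


if __name__ == "__main__":
    print(padic_abs(12, 2), padic_abs(Fraction(3, 8), 2))
    print(in_valuation_ring(Fraction(5, 3), 2), in_valuation_ring(Fraction(1, 2), 2))
```

On a finite sample one can confirm the multiplicativity and the inequality $\varphi (x+y)\leq \max\{\varphi (x),\varphi (y)\}$ , and also that for each nonzero $c$ either $c$ or $1/c$ lies in the valuation ring.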

Theorem 18.6:

Let $R$ be a ring, and, in case $R$ is an integral domain, let $\operatorname {Quot} (R)$ be its field of fractions. Then the following are equivalent:

$R$ is a valuation ring.
$R$ is an integral domain and the ideals of $R$ are linearly ordered with respect to set inclusion.
$R$ is an integral domain and for each $c\in \operatorname {Quot} (R)$ , either $c\in R$ or ${\frac {1}{c}}\in R$ .

Proof:

We begin with 3. $\Rightarrow$ 1.; assume that $R$

1. $\Rightarrow$ 2.: Let $I,J\leq R$ be any two ideals. Assume there exists $a\in J\setminus I$ . Let any element $b\in I$ be given.

Properties of valuation rings

Theorem 18.8:

A valuation ring is a local ring.

Proof:

The ideals of a valuation ring $R$ are ordered by inclusion. Set $m:=\bigcup _{I\leq R \atop I\neq R}I$ . We claim that $m$ is a proper ideal of $R$ . Certainly $1\notin m$ for otherwise $1\in I$ for some proper ideal $I$ of $R$ . Furthermore, .

Theorem 18.9:

Let $R$ be a Noetherian ring and a valuation ring. Then $R$ is a principal ideal domain.

Proof:

Let $I\leq R$ be an ideal; in any Noetherian ring, the ideals are finitely generated. Hence let $I=\langle a_{1},\ldots ,a_{n}\rangle$ . Consider the ideals $\langle a_{1}\rangle ,\ldots ,\langle a_{n}\rangle$ of $R$ . In a valuation ring, the ideals are totally ordered, so we may renumber the $a_{j}$ such that $\langle a_{1}\rangle \subseteq \langle a_{2}\rangle \subseteq \cdots \subseteq \langle a_{n}\rangle$ . Then $I=\langle a_{1},\ldots ,a_{n}\rangle =\langle a_{n}\rangle$ . $\Box$

Algebras and integral elements

Algebras

note to self: 21.4 is false when the constant polynomials are allowed!

Definition 21.1:

Let $R$ be a ring. An algebra $A$ over $R$ is an $R$ -module together with a multiplication $\cdot :A\times A\to A$ . This multiplication shall be $R$ -bilinear.

Within an algebra it is thus true that we have an addition and a multiplication, and many of the usual rules of algebra stay true. Thus the name algebra.

Of course, there are some algebras whose multiplication is not commutative or associative. If the underlying ring is commutative, the ring gives a certain commutativity property in the sense of

r(sa)=(rs)a=(sr)a=s(ra)

.

Definition 21.2:

Let $A$ be an algebra, and let $Z\subseteq A$ be a subset of $A$ . $Z$ is called a subalgebra of $A$ iff it is closed with respect to the operations

addition
multiplication
module operation

of $A$ .

Note that this means that $Z$ , together with the operations inherited from $A$ , is itself an $R$ -algebra; the necessary rules just carry over from $A$ .

Example 21.3: Let $R$ be a ring, let $S$ be another ring, and let $\varphi :R\to S$ be a ring homomorphism. Then $S$ is an $R$ -algebra, where the module operation is given by

rs:=\varphi (r)s

,

and multiplication and addition for this algebra are given by the multiplication and addition of $S$ , the ring.

Proof:

The required rules for the module operation follow thus:

$1_{R}s=\varphi (1_{R})s=1_{S}s=s$
$r(s+t)=\varphi (r)(s+t)=\varphi (r)s+\varphi (r)t=rs+rt$
$(r+r')s=\varphi (r+r')s=(\varphi (r)+\varphi (r'))s=rs+r's$
$r(r's)=\varphi (r)(r's)=\varphi (r)\varphi (r')s=\varphi (rr')s=(rr')s$

Since in $S$ we have all the rules for a ring, the only thing we need to check for the $R$ -bilinearity of the multiplication is compatibility with the module operation.

Indeed,

(rs)t=\varphi (r)st=r(st)

and analogously for the other argument. $\Box$

We shall note that if we are given an $R$ -algebra $A$ , then we can take a polynomial $p\in R[x_{1},\ldots ,x_{n}]$ and some elements $a_{1},\ldots ,a_{n}$ of $A$ and evaluate $p(a_{1},\ldots ,a_{n})\in A$ as follows:

Using the algebra multiplication, we form the monomials $a_{1}^{k_{1}}a_{2}^{k_{2}}\cdots a_{n}^{k_{n}}$ .
Using the module operation, we multiply each monomial with the respective coefficient: $r_{k_{1},\ldots ,k_{n}}a_{1}^{k_{1}}a_{2}^{k_{2}}\cdots a_{n}^{k_{n}}$ .
Using the algebra addition (=module addition), we add all these $r_{k_{1},\ldots ,k_{n}}a_{1}^{k_{1}}a_{2}^{k_{2}}\cdots a_{n}^{k_{n}}$ together.

The commutativity of multiplication (1.) and addition (3.) ensures that this procedure does not depend on the order in which the multiplications and additions are carried out.
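The three evaluation steps can be sketched in code. As an illustrative assumption, we take $R=\mathbb {Z}$ and $A=\mathbb {Z} \times \mathbb {Z}$ with componentwise operations (a commutative $\mathbb {Z}$ -algebra), and we represent a polynomial as a dictionary from exponent tuples $(k_{1},\ldots ,k_{n})$ to coefficients. All names are hypothetical, and the sketch only handles polynomials without constant term.

```python
from functools import reduce


def mul(a, b):
    """Algebra multiplication in Z x Z (componentwise)."""
    return tuple(x * y for x, y in zip(a, b))


def add(a, b):
    """Algebra addition in Z x Z (componentwise)."""
    return tuple(x + y for x, y in zip(a, b))


def scale(r, a):
    """Module operation: an integer r acting on a."""
    return tuple(r * x for x in a)


def power(a, k):
    """a multiplied with itself k times in total, for k >= 1."""
    result = a
    for _ in range(k - 1):
        result = mul(result, a)
    return result


def evaluate(poly, elems):
    """Evaluate a polynomial without constant term at algebra elements."""
    terms = []
    for exponents, coeff in poly.items():
        # Step 1: form the monomial using the algebra multiplication.
        factors = [power(a, k) for a, k in zip(elems, exponents) if k > 0]
        monomial = reduce(mul, factors)
        # Step 2: multiply by the coefficient using the module operation.
        terms.append(scale(coeff, monomial))
    # Step 3: add everything up using the algebra addition.
    return reduce(add, terms)


if __name__ == "__main__":
    # p = 3*x1^2*x2 + 2*x2 evaluated at a1 = (1, 2), a2 = (3, -1)
    p = {(2, 1): 3, (0, 1): 2}
    print(evaluate(p, [(1, 2), (3, -1)]))
```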

Definition 21.4:

Let $A$ be an $R$ -algebra, and let $a_{1},\ldots ,a_{n}$ be any elements of $A$ . We then define a new object, $R[a_{1},\ldots ,a_{n}]$ , to be the set of all elements of $A$ that arise when applying the algebra operations of $A$ and the module operation (with arbitrary elements $r\in R$ of the underlying ring) to the elements $a_{1},\ldots ,a_{n}$ a finite number of times, in an arbitrary fashion (for example the elements $a_{1}\cdot a_{2}$ , $a_{3}+ra_{1}\cdot a_{2}$ , $a_{1}\cdot (ra_{2})$ are all in $R[a_{1},\ldots ,a_{n}]$ ). By multiplying everything out (using the rules we are given for an algebra), we find that this is equal to

R[a_{1},\ldots ,a_{n}]=\{p(a_{1},\ldots ,a_{n})|p\in R[x_{1},\ldots ,x_{n}]\}

.

We call $R[a_{1},\ldots ,a_{n}]$ the algebra generated by the elements $a_{1},\ldots ,a_{n}$ .

Theorem 21.5:

Let an $R$ -algebra $A$ be given, and let $a_{1},\ldots ,a_{n}\in A$ . Then

$R[a_{1},\ldots ,a_{n}]$ is a subalgebra of $A$ .

Furthermore,

$R[a_{1},\ldots ,a_{n}]=\bigcap _{\{a_{1},\ldots ,a_{n}\}\subseteq Z\subseteq A \atop Z{\text{ subalgebra}}}Z$

and

$R[a_{1},\ldots ,a_{n}]$ is (with respect to set inclusion) smaller than any other subalgebra of $A$ containing each element $a_{1},\ldots ,a_{n}$ .

Proof:

The first claim follows from the very definition of subalgebras of $A$ : The closedness under the three operations. For, if we are given any elements of $R[a_{1},\ldots ,a_{n}]$ , applying any operation to them is just one further step of manipulations with the elements $a_{1},\ldots ,a_{n}$ .

We go on to prove the equation

R[a_{1},\ldots ,a_{n}]=\bigcap _{\{a_{1},\ldots ,a_{n}\}\subseteq Z\subseteq A \atop Z{\text{ subalgebra}}}Z

.

For " $\subseteq$ " we note that $a_{1},\ldots ,a_{n}$ are contained within every $Z$ occurring on the right hand side. Thus, by the closedness of these $Z$ , we can infer that all finite manipulations by the three algebra operations (addition, multiplication, module operation) are included in each $Z$ . From this follows " $\subseteq$ ".

For " $\supseteq$ " we note that $R[a_{1},\ldots ,a_{n}]$ is also a subalgebra of $A$ containing $\{a_{1},\ldots ,a_{n}\}$ , and intersection with more things will only make the set at most smaller.

Now if any other subalgebra of $A$ is given that contains $a_{1},\ldots ,a_{n}$ , the intersection on the right hand side of our equation must be contained within it, since that subalgebra would be one of the $Z$ . $\Box$

Exercises

Exercise 21.1.1:

Symmetric polynomials

Definition 21.6:

Let $R$ be a ring. A polynomial $f\in R[x_{1},\ldots ,x_{n}]$ is called symmetric if and only if for all $\sigma \in S_{n}$ ( $S_{n}$ being the symmetric group), we have

f(x_{1},\ldots ,x_{n})=f(x_{\sigma (1)},\ldots ,x_{\sigma (n)})

.

That means, we can permute the variables arbitrarily and still get the same result.

This section shall be devoted to proving a very fundamental fact about these polynomials. That is, there are some so-called elementary symmetric polynomials, and every symmetric polynomial can be written as a polynomial in those elementary symmetric polynomials.

Definition 21.7:

Fix an $n\in \mathbb {N}$ . The elementary symmetric polynomials in $n$ variables are the $n$ polynomials

{\begin{aligned}s_{n,1}(x_{1},\ldots ,x_{n})&:=x_{1}+x_{2}+\cdots +x_{n-1}+x_{n}\\s_{n,2}(x_{1},\ldots ,x_{n})&:=x_{1}x_{2}+\cdots +x_{1}x_{n}~~~~+~~~~x_{2}x_{3}+\cdots +x_{2}x_{n}~~~~+~~~~\cdots ~~~~+~~~~x_{n-2}x_{n-1}+x_{n-2}x_{n}~~~~+~~~~x_{n-1}x_{n}\\\vdots &\\s_{n,k}(x_{1},\ldots ,x_{n})&:=\sum _{1\leq j_{1}<j_{2}<\cdots <j_{k}\leq n}~~\prod _{i=1}^{k}x_{j_{i}}\\\vdots &\\s_{n,n}(x_{1},\ldots ,x_{n})&:=x_{1}x_{2}\cdots x_{n-1}x_{n}.\end{aligned}}

Without further ado, we shall proceed to the theorem that we promised:

Theorem 21.8:

Let any symmetric polynomial $f\in R[x_{1},\ldots ,x_{n}]$ be given. Then we find another polynomial $p\in R[x_{1},\ldots ,x_{n}]$ such that

f(x_{1},\ldots ,x_{n})=p(s_{n,1}(x_{1},\ldots ,x_{n}),s_{n,2}(x_{1},\ldots ,x_{n}),\ldots ,s_{n,n}(x_{1},\ldots ,x_{n}))

.

Hence, every symmetric polynomial is a polynomial in the elementary symmetric polynomials.

Proof 1:

We start out by ordering all monomials (remember, those are polynomials of the form $x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{n-1}^{k_{n-1}}x_{n}^{k_{n}}$ ), using the following order:

x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{n-1}^{k_{n-1}}x_{n}^{k_{n}}<x_{1}^{m_{1}}x_{2}^{m_{2}}\cdots x_{n-1}^{m_{n-1}}x_{n}^{m_{n}}:\Leftrightarrow {\begin{cases}k_{1}+\cdots +k_{n}<m_{1}+\cdots +m_{n}&\\{\text{or}}&\\{\big (}k_{1}+\cdots +k_{n}=m_{1}+\cdots +m_{n}{\big )}\wedge {\big (}k_{j}<m_{j},{\text{ where }}j:=\min\{i\in \{1,\ldots ,n\}:k_{i}\neq m_{i}\}{\big )}&\end{cases}}

.

With this order, the largest monomial of $s_{n,m}$ is given by $x_{1}\cdots x_{m}$ ; this is because for all monomials of $s_{n,m}$ , the sum of the exponents equals $m$ , and the last condition of the order is optimized by monomials which have the first zero exponent as late as possible.

Furthermore, for any given $r_{1},\ldots ,r_{n}\in \mathbb {N} _{0}$ , the largest monomial of

s_{n,1}^{r_{1}}\cdots s_{n,n}^{r_{n}}

is given by $x_{1}^{r_{1}+\cdots +r_{n}}x_{2}^{r_{2}+\cdots +r_{n}}\cdots x_{n-1}^{r_{n-1}+r_{n}}x_{n}^{r_{n}}$ . Indeed, for every monomial of the product the sum of the exponents equals $r_{1}+2r_{2}+\cdots +(n-1)r_{n-1}+nr_{n}$ . The above monomial does occur (multiply the largest monomials of all the elementary symmetric factors together). Moreover, if one of the factors of a given monomial of $s_{n,1}^{r_{1}}\cdots s_{n,n}^{r_{n}}$ coming from an elementary symmetric polynomial is not the largest monomial of that elementary symmetric polynomial, we may replace it by the largest one and obtain a strictly larger monomial of the product $s_{n,1}^{r_{1}}\cdots s_{n,n}^{r_{n}}$ , because a part of the sum $r_{1}+2r_{2}+\cdots +(n-1)r_{n-1}+nr_{n}$ is moved to the front.

Now, let a symmetric polynomial $f\in R[x_{1},\ldots ,x_{n}]$ be given. We claim that if $x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{n-1}^{k_{n-1}}x_{n}^{k_{n}}$ is the largest monomial of $f$ , then we have $k_{1}\geq k_{2}\geq \cdots \geq k_{n-1}\geq k_{n}$ .

For assume otherwise, say $k_{j}<k_{j+1}$ . Then since $f$ is symmetric, we may exchange the exponents of the $j$ -th and the $(j+1)$ -th variable and still obtain a monomial of $f$ , and the resulting monomial will be strictly larger.

Thus, if we define for $j=1,\ldots ,n-1$

d_{j}:=k_{j}-k_{j+1}

and furthermore $d_{n}:=k_{n}$ , we obtain numbers that are non-negative. Hence, we may form the product

h(x):=s_{n,1}^{d_{1}}\cdots s_{n,n}^{d_{n}}

,

and if $c$ is the coefficient of the largest monomial of $f$ , then the largest monomial of

f(x)-ch(x)

is strictly smaller than that of $f$ ; this is because the largest monomial of $h$ is, by our above computation and calculating some telescopic sums, equal to the largest monomial of $f$ , and the two thus cancel out.

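Since the elementary symmetric polynomials are symmetric and sums, linear combinations and products of symmetric polynomials are symmetric, $f-ch$ is again symmetric, and we may repeat this procedure. It terminates after finitely many steps, because only finitely many monomials are smaller than a given monomial (their degree is bounded) and the largest monomial strictly decreases in each step. Collecting all the terms $ch$ that we subtracted from $f$ then yields the polynomial in the elementary symmetric polynomials we have been looking for. $\Box$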
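The procedure of proof 1 is effectively an algorithm. The following Python sketch follows it step by step for polynomials with integer coefficients. The representation (a dict mapping exponent tuples to coefficients) and all function names are our own assumptions for illustration; the input is assumed to be symmetric and is not checked.

```python
from itertools import combinations

def poly_mul(p, q):
    """Multiply two polynomials given as {exponent tuple: coefficient}."""
    result = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            result[e] = result.get(e, 0) + c1 * c2
    return {e: c for e, c in result.items() if c != 0}

def elementary(n, k):
    """The elementary symmetric polynomial s_{n,k} as a dict."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def order_key(exponents):
    """Total degree first, then the first differing exponent decides."""
    return (sum(exponents), exponents)

def reduce_symmetric(f, n):
    """Return {(d_1,...,d_n): c} with f = sum of c * s_{n,1}^d_1 ... s_{n,n}^d_n.

    Assumes that f is symmetric; no check is made.
    """
    f = dict(f)
    result = {}
    while f:
        k = max(f, key=order_key)            # largest monomial of f
        c = f[k]                             # its coefficient
        d = tuple(k[j] - k[j + 1] for j in range(n - 1)) + (k[n - 1],)
        h = {(0,) * n: 1}                    # build h = s_{n,1}^d_1 ... s_{n,n}^d_n
        for j, dj in enumerate(d):
            for _ in range(dj):
                h = poly_mul(h, elementary(n, j + 1))
        for e, ch in h.items():              # replace f by f - c*h
            f[e] = f.get(e, 0) - c * ch
            if f[e] == 0:
                del f[e]
        result[d] = result.get(d, 0) + c
    return result

# x1^2 + x2^2 = s_{2,1}^2 - 2*s_{2,2}
print(reduce_symmetric({(2, 0): 1, (0, 2): 1}, 2))  # {(2, 0): 1, (0, 1): -2}
```

The keys of the returned dict are the exponent vectors $(d_{1},\ldots ,d_{n})$ from the proof, and the values are the coefficients $c$ .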

Proof 2:

Let $f\in R[x_{1},\ldots ,x_{n}]$ be an arbitrary symmetric polynomial, and let $d$ be the degree of $f$ and $n$ be the number of variables of $f$ .

In order to prove the theorem, we use induction on the sum $n+d$ of the degree and number of variables of $f$ .

If $n+d=1$ , we must have $n=1$ (since $d=1$ would imply the absurd $n=0$ ). But any polynomial of one variable is already a polynomial of the symmetric polynomial $s_{1,1}(x)=x$ .

Let now $n+d=k$ . We write

f(x_{1},\ldots ,x_{n})=g(x_{1},\ldots ,x_{n})+x_{1}\cdots x_{n}h(x_{1},\ldots ,x_{n})

,

where every monomial occurring within $g$ lacks at least one variable, that is, is not divisible by $x_{1}\cdots x_{n}$ .

The polynomial $g$ is still symmetric, because any permutation of a monomial that lacks at least one variable, also lacks at least one variable and hence occurs in $g$ with same coefficient, since no bit of it could have been sorted to the " $x_{1}\cdots x_{n}h(x_{1},\ldots ,x_{n})$ " part.

The polynomial $h$ has the same number of variables, but the degree of $h$ is smaller than the degree of $f$ . Furthermore, $h$ is symmetric because of

h(x_{1},\ldots ,x_{n})={\frac {f(x_{1},\ldots ,x_{n})-g(x_{1},\ldots ,x_{n})}{x_{1}\cdots x_{n}}}

.

Hence, by induction hypothesis, $h$ can be written as a polynomial in the elementary symmetric polynomials:

h(x_{1},\ldots ,x_{n})=p_{1}(s_{n,1}(x_{1},\ldots ,x_{n}),\ldots ,s_{n,n}(x_{1},\ldots ,x_{n}))

for a suitable $p_{1}\in R[x_{1},\ldots ,x_{n}]$ .

If $n=1$ , then $f$ is a polynomial of the elementary symmetric polynomial $s_{1,1}(x)$ anyway. Hence, it is sufficient to only consider the case $n\geq 2$ . In that case, we may define the polynomial

q(x_{1},\ldots ,x_{n-1}):=g(x_{1},\ldots ,x_{n-1},0)

.

Now $q$ has one less variable than $f$ and at most the same degree, which is why by induction hypothesis, we find a representation

q(x_{1},\ldots ,x_{n-1})=p_{2}(s_{n-1,1}(x_{1},\ldots ,x_{n-1}),\ldots ,s_{n-1,n-1}(x_{1},\ldots ,x_{n-1}))

for a suitable $p_{2}\in R[x_{1},\ldots ,x_{n-1}]$ .

We observe that for all $j\in \{1,\ldots ,n-1\}$ , we have $s_{n-1,j}(x_{1},\ldots ,x_{n-1})=s_{n,j}(x_{1},\ldots ,x_{n-1},0)$ . This is because the unnecessary monomials just vanish. Hence,

g(x_{1},\ldots ,x_{n-1},0)=p_{2}(s_{n,1}(x_{1},\ldots ,x_{n-1},0),\ldots ,s_{n,n-1}(x_{1},\ldots ,x_{n-1},0))

.

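We claim that the polynomial

u(x_{1},\ldots ,x_{n}):=g(x_{1},\ldots ,x_{n})-p_{2}(s_{n,1}(x_{1},\ldots ,x_{n}),\ldots ,s_{n,n-1}(x_{1},\ldots ,x_{n}))~~~~~~~(*)

is divisible by $x_{1}\cdots x_{n}=s_{n,n}$ , with a symmetric quotient.

Indeed, by the previous equation, $u$ vanishes when $x_{n}$ is set to zero, so that every monomial of $u$ contains $x_{n}$ . By the symmetry of $g$ and $s_{n,1},\ldots ,s_{n,n-1}$ , the polynomial $u$ is symmetric, and hence every monomial of $u$ contains each of the variables. Thus $u=s_{n,n}\cdot v$ for a polynomial $v$ , and $v$ is symmetric because $u$ and $s_{n,n}$ are. (In general $u\neq 0$ : for $g=x_{1}^{2}+x_{2}^{2}$ we get $p_{2}(s_{2,1})=(x_{1}+x_{2})^{2}$ and $u=-2x_{1}x_{2}$ .) If $p_{2}$ is chosen such that the degree of $p_{2}(s_{n,1},\ldots ,s_{n,n-1})$ is at most the degree of $q$ (which can be arranged by carrying this degree bound along in the induction), then $v$ has smaller degree than $f$ , so that by induction hypothesis $v$ is a polynomial in the elementary symmetric polynomials as well. Altogether,

f=p_{2}(s_{n,1},\ldots ,s_{n,n-1})+s_{n,n}\cdot (v+h)

is a polynomial in the elementary symmetric polynomials. $\Box$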

Integral dependence

Definition 21.9:

If $R$ is any ring and $S\subseteq R$ a subring, $r\in R$ is called integral over $S$ iff

r^{n}+a_{n-1}r^{n-1}+\cdots +a_{1}r+a_{0}=0

for suitable $a_{n-1},\ldots ,a_{0}\in S$ .

A polynomial of the form

x^{n}+a_{n-1}x^{n-1}+\cdots +a_{1}x+a_{0}

(that is, with leading coefficient equal to $1$ ) is called a monic polynomial. Thus, $r$ being integral over $S$ means that $r$ is the root of a monic polynomial with coefficients in $S$ .
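As an illustration, integrality of a concrete element can be verified by exact arithmetic. The sketch below is our own example and not part of the text: it takes $S=\mathbb {Z}$ and works in the ring of numbers $a+b{\sqrt {2}}$ with integers $a,b$ , stored as pairs `(a, b)`. It checks that $r=1+{\sqrt {2}}$ is a root of the monic polynomial $x^{2}-2x-1$ . The helper names `mul` and `evaluate_monic` are hypothetical.

```python
def mul(u, v):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), stored as integer pairs."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def evaluate_monic(coeffs, r):
    """Evaluate x^n + a_{n-1} x^{n-1} + ... + a_0 at r.

    `coeffs` is the list [a_0, ..., a_{n-1}] of integer coefficients.
    """
    total, power = (0, 0), (1, 0)
    for a in coeffs:
        total = (total[0] + a * power[0], total[1] + a * power[1])
        power = mul(power, r)
    # `power` now equals r^n; add the leading term.
    return (total[0] + power[0], total[1] + power[1])

r = (1, 1)                           # r = 1 + sqrt(2)
print(evaluate_monic([-1, -2], r))   # x^2 - 2x - 1 at r; prints (0, 0)
```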

Whenever we have a subring $S\subseteq R$ of a ring $R$ , we consider the module structure of $R$ as an $S$ -module, where the module operation and summation are given by the ring operations of $R$ .

Theorem 21.10 (characterisation of integral dependence):

Let $R$ be a ring, $S\subseteq R$ a subring and $r\in R$ . The following are equivalent:

$r$ is integral over $S$
$S[r]$ is a finitely generated $S$ -module.
$S[r]$ is contained in a subring $T\subseteq R$ that is finitely generated as an $S$ -module.
There exists a faithful, nonzero $S[r]$ -module which is finitely generated as an $S$ -module.

Proof:

1. $\Rightarrow$ 2.: Let $r$ be integral over $S$ , that is, $r^{n}=-(a_{n-1}r^{n-1}+\cdots +a_{1}r+a_{0})$ for suitable $a_{n-1},\ldots ,a_{0}\in S$ . Let $b_{k}r^{k}+b_{k-1}r^{k-1}+\cdots +b_{1}r+b_{0}$ be an arbitrary element of $S[r]$ . If $j$ is larger than or equal to $n$ , then we can express $r^{j}=r^{j-n}r^{n}$ in terms of lower powers of $r$ using the integral relation. Repetition of this process yields that $1,r,r^{2},\ldots ,r^{n-1}$ generate $S[r]$ over $S$ .

2. $\Rightarrow$ 3.: $T=S[r]$ .

3. $\Rightarrow$ 4.: Set $M=T$ ; $T$ is faithful because if $u\in S[r]$ annihilates $T$ , then in particular $u=u\cdot 1=0$ .

4. $\Rightarrow$ 1.: Let $M$ be such a module. We define the morphism of modules

\phi :M\to M,m\mapsto rm

.

We may restrict the module operation of $M$ to $S$ to obtain an $S$ -module. $\phi$ is also a morphism of $S$ -modules. Further, set $I=S$ . Then $\phi (M)\subseteq M=IM$ ( $1\in S$ ). The Cayley–Hamilton theorem gives an equation

r^{n}+a_{n-1}r^{n-1}+\cdots +a_{1}r+a_{0}=0,\qquad a_{n-1},\ldots ,a_{0}\in S,

where $r$ is to be read as the multiplication operator by $r$ and $0$ as the zero operator, and by the faithfulness of $M$ , $r^{n}+a_{n-1}r^{n-1}+\cdots +a_{1}r+a_{0}=0$ in the usual sense. $\Box$
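The step 1. $\Rightarrow$ 2. of the proof can be made concrete: a monic relation lets us rewrite every power $r^{j}$ as an $S$ -linear combination of $1,r,\ldots ,r^{n-1}$ . The following Python sketch does this for integer coefficients; the function name `reduce_power` and the sample relation $r^{2}-r-1=0$ are our own illustrative assumptions.

```python
def reduce_power(j, coeffs):
    """Coordinates of r^j with respect to 1, r, ..., r^(n-1), given the relation
    r^n + a_{n-1} r^{n-1} + ... + a_0 = 0 with coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    vec = [1] + [0] * (n - 1)          # coordinates of r^0
    for _ in range(j):
        top = vec[-1]                  # coefficient of r^(n-1)
        vec = [0] + vec[:-1]           # multiply by r: shift all powers up by one
        # replace top * r^n by -top * (a_0 + a_1 r + ... + a_{n-1} r^{n-1})
        vec = [v - top * a for v, a in zip(vec, coeffs)]
    return vec

# Relation r^2 - r - 1 = 0, that is a_0 = -1 and a_1 = -1.
print(reduce_power(5, [-1, -1]))       # r^5 = 3 + 5r; prints [3, 5]
```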

Theorem 21.11:

Let $\mathbb {F}$ be a field and $S\subseteq \mathbb {F}$ a subring of $\mathbb {F}$ . If $\mathbb {F}$ is integral over $S$ , then $S$ is a field.

Proof:

Let $s\in S$ with $s\neq 0$ . Since $\mathbb {F}$ is a field, we find an inverse $s^{-1}\in \mathbb {F}$ ; we don't know yet whether $s^{-1}$ is contained within $S$ . Since $\mathbb {F}$ is integral over $S$ , $s^{-1}$ satisfies an equation of the form

(s^{-1})^{n}+a_{n-1}(s^{-1})^{n-1}+\cdots +a_{1}s^{-1}+a_{0}=0

for suitable $a_{n-1},\ldots ,a_{1},a_{0}\in S$ . Multiplying this equation by $s^{n-1}$ yields

s^{-1}=-(a_{n-1}+a_{n-2}s+\cdots +a_{1}s^{n-2}+a_{0}s^{n-1})\in S

.

\Box

Theorem 21.12:

Let $S$ be a subring of $R$ . The set of all elements of $R$ which are integral over $S$ constitutes a subring of $R$ .

Proof 1 (from the Atiyah–Macdonald book):

If $x,y\in R$ are integral over $S$ , $y$ is integral over $S[x]$ . By theorem 21.10, $S[x]$ is finitely generated as $S$ -module and $S[x][y]=S[x,y]$ is finitely generated as $S[x]$ -module. Hence, $S[x,y]$ is finitely generated as $S$ -module. Further, $S[x+y]\subseteq S[x,y]$ and $S[x\cdot y]\subseteq S[x,y]$ . Hence, by theorem 21.10, $x+y$ and $x\cdot y$ are integral over $S$ . $\Box$

Proof 2 (Dedekind):

If $x,y$ are integral over $S$ , $S[x]$ and $S[y]$ are finitely generated as $S$ -modules. Hence, so is

S[x]\cdot S[y]:=\left\{\sum _{j=1}^{n}a_{j}b_{j}{\big |}n\in \mathbb {N} ,a_{j}\in S[x],b_{j}\in S[y]\right\}

.

Furthermore, $S[xy]\subseteq S[x]\cdot S[y]$ and $S[x+y]\subseteq S[x]\cdot S[y]$ . Hence, by theorem 21.10, $x\cdot y,x+y$ are integral over $S$ . $\Box$
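To illustrate the theorem with a concrete case (the choice $S=\mathbb {Z}$ , $R=\mathbb {R}$ , $x={\sqrt {2}}$ , $y={\sqrt {3}}$ is ours and only meant as an example): $x$ and $y$ are integral over $\mathbb {Z}$ as roots of $t^{2}-2$ and $t^{2}-3$ , so $x+y$ must be integral as well. A monic equation can be found by squaring twice:

(x+y)^{2}=5+2{\sqrt {6}},\qquad {\big (}(x+y)^{2}-5{\big )}^{2}=24

.

Expanding the second equation shows that $x+y$ is a root of the monic polynomial $t^{4}-10t^{2}+1\in \mathbb {Z} [t]$ .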

Definition 21.13:

Let $S$ be a subring of the ring $R$ . The integral closure of $S$ in $R$ is the ring consisting of all elements of $R$ which are integral over $S$ .

Definition 21.14:

Let $S$ be a subring of the ring $R$ . If all elements of $R$ are integral over $S$ , $R$ is called an integral ring extension of $S$ .

Irreducibility, algebraic sets and varieties

Irreducibility

Definition 21.1:

Let $X$ be a topological space. $X$ is said to be irreducible if and only if no two non-empty open subsets of $X$ are disjoint.

Some people (topologists) call irreducible spaces hyperconnected.

Theorem 21.2 (characterisation of irreducible spaces):

Let $X$ be a topological space. The following are equivalent:

$X$ is irreducible.
$X$ cannot be written as the union of two proper closed subsets.
Every non-empty open subset of $X$ is dense in $X$ .
The interior of every proper closed subset of $X$ is empty.

Proof 1: We prove 1. $\Rightarrow$ 2. $\Rightarrow$ 3. $\Rightarrow$ 4. $\Rightarrow$ 1.

1. $\Rightarrow$ 2.: Assume that $X=A\cup B$ , where $A$ , $B$ are proper and closed. Define $O:=X\setminus A$ and $U:=X\setminus B$ . Then $O,U$ are open and

O\cap U=(X\setminus A)\cap (X\setminus B)=X\setminus (A\cup B)=\emptyset

by one of deMorgan's rules, contradicting 1.

2. $\Rightarrow$ 3.: Assume that $U\subseteq X$ is open and non-empty, but not dense. Then $A:={\overline {U}}$ is closed and proper in $X$ (as $U$ is not dense), and so is $B:=X\setminus U$ (as $U$ is non-empty). Furthermore, $X=A\cup B$ , contradicting 2.

3. $\Rightarrow$ 4.: Let $\emptyset \neq A\subsetneq X$ be closed such that ${\overset {\circ }{A}}\neq \emptyset$ . By definition of the closure, ${\overline {\overset {\circ }{A}}}\subseteq A$ , which is why ${\overset {\circ }{A}}$ is a non-dense open set, contradicting 3.

4. $\Rightarrow$ 1.: Let $O,U\subseteq X$ be open and non-empty such that $O\cap U=\emptyset$ . Define $A:=X\setminus O$ . Then $A$ is a proper, closed subset of $X$ , since $O=X\setminus A$ is non-empty. Furthermore, $U\subseteq A$ is open and hence $U\subseteq {\overset {\circ }{A}}$ , which is why $A$ has non-empty interior, contradicting 4. $\Box$

Proof 2: We prove 1. $\Rightarrow$ 4. $\Rightarrow$ 3. $\Rightarrow$ 2. $\Rightarrow$ 1.

1. $\Rightarrow$ 4.: Assume we have a proper closed subset $A$ of $X$ with nonempty interior. Then $X\setminus A$ and ${\overset {\circ }{A}}$ are two disjoint nonempty open subsets of $X$ .

4. $\Rightarrow$ 3.: Let $O\subseteq X$ be open and non-empty. If $O$ were not dense in $X$ , then ${\overline {O}}$ would be a proper closed subset of $X$ with nonempty interior.

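3. $\Rightarrow$ 2.: Assume $X=A\cup B$ , $A,B\subsetneq X$ proper and closed. Set $O:=X\setminus B$ . Then $O$ is open and non-empty, and $O\subseteq A$ ; since $A$ is closed and proper, ${\overline {O}}\subseteq A\neq X$ , and hence $O$ is not dense within $X$ .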

2. $\Rightarrow$ 1.: Let $O,U\subseteq X$ be open and non-empty. If they are disjoint, then $X=(X\setminus O)\cup (X\setminus U)$ is a union of two proper closed subsets. $\Box$

Remaining arrows:

1. $\Rightarrow$ 3.: Assume $U\subseteq X$ is open and non-empty, but not dense. Then $X\setminus {\overline {U}}$ is open, nonempty and disjoint from $U$ .

3. $\Rightarrow$ 1.: Let $O,U\subseteq X$ be open and non-empty. If they are disjoint, then $O\subseteq X\setminus U$ , where $X\setminus U$ is closed and proper, and thus $O$ is not dense.

2. $\Rightarrow$ 4.: Let $A\subsetneq X$ be proper and closed with nonempty interior. Then $X=A\cup (X\setminus {\overset {\circ }{A}})$ is a union of two proper closed subsets.

4. $\Rightarrow$ 2.: Let $X=A\cup B$ , $A,B\subsetneq X$ proper and closed. Then $\emptyset \neq X\setminus A\subseteq B$ is open, so that $B$ is a proper closed subset with nonempty interior.
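For finite spaces the characterisations can be checked mechanically. The following Python sketch uses two small topologies of our own choosing and hypothetical function names; a topology is given as a list of `frozenset`s. It tests the definition of irreducibility and the second characterisation against each other.

```python
from itertools import combinations

def is_irreducible(opens):
    """Definition: no two non-empty open sets are disjoint."""
    nonempty = [o for o in opens if o]
    return all(a & b for a, b in combinations(nonempty, 2))

def no_proper_closed_cover(space, opens):
    """Characterisation 2: X is not a union of two proper closed subsets."""
    closed = [space - o for o in opens]
    proper = [c for c in closed if c != space]
    return not any(a | b == space for a in proper for b in proper)

# A nested topology on {0, 1, 2}: every non-empty open set contains 0.
X = frozenset({0, 1, 2})
nested = [frozenset(), frozenset({0}), frozenset({0, 1}), X]

# The discrete topology on a two-point space.
Y = frozenset({0, 1})
discrete2 = [frozenset(), frozenset({0}), frozenset({1}), Y]

print(is_irreducible(nested), no_proper_closed_cover(X, nested))        # True True
print(is_irreducible(discrete2), no_proper_closed_cover(Y, discrete2))  # False False
```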

We shall go on to prove a couple of properties of irreducible spaces.

Theorem 21.3:

Every irreducible space $X$ is connected and locally connected.

Proof:

1. Connectedness: Assume $X=U{\dot {\cup }}O$ , $U,O$ open, non-empty. This certainly contradicts irreducibility.

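2. Local connectedness: Let $x\in V\subseteq X$ , where $V$ is open. Any two non-empty open subsets of $V$ are also open in $X$ and hence intersect, so $V$ is connected as in 1. Thus every point has arbitrarily small connected open neighbourhoods, which is why we have local connectedness. $\Box$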

Theorem 21.4:

Let $X$ be an irreducible space. Then $X$ is Hausdorff if and only if $|X|\leq 1$ .

Proof:

If $|X|\leq 1$ , then $X$ is trivially Hausdorff. Assume that $X$ is Hausdorff and contains two distinct points $x\neq y$ . Then we find $U_{x},U_{y}\subseteq X$ open such that $x\in U_{x}$ , $y\in U_{y}$ and $U_{x}\cap U_{y}=\emptyset$ , contradicting irreducibility. $\Box$

Theorem 21.5:

Let $X,Y$ be topological spaces, where $X$ is irreducible, and let $f:X\to Y$ be a continuous function (i.e. a morphism in the category of topological spaces). Then $f(X)$ is irreducible with the subspace topology induced by $Y$ .

Proof: Let $O,U$ be two disjoint non-empty open subsets of $f(X)$ . Since we are working with the subspace topology, we may write $O=f(X)\cap V$ , $U=f(X)\cap W$ , where $V,W\subseteq Y$ are open. We have

f^{-1}(O)=f^{-1}(f(X)\cap V)=f^{-1}(V)

and similarly

f^{-1}(U)=f^{-1}(W)

.

Hence, $f^{-1}(O)$ and $f^{-1}(U)$ are open in $X$ by continuity, and since they further are disjoint (since if $x\in f^{-1}(O)$ , then $f(x)\in O$ and thus $f(x)\notin U$ ) and non-empty (since e.g. if $y\in O$ , since $O\subset f(X)$ , $y=f(x)$ for an $x\in X$ and hence $x\in f^{-1}(O)$ ), we have a contradiction. $\Box$

Corollary 21.6:

If $X$ is irreducible, $Y$ is Hausdorff and $f:X\to Y$ is continuous, then $f$ is constant.

Proof: Follows from theorems 21.4 and 21.5. $\Box$

We may now connect irreducible spaces with Noetherian spaces.

Theorem 21.7:

Let $X$ be a Noetherian topological space, and let $A\subseteq X$ be closed. Then there exists a finite decomposition

A=B_{1}\cup \cdots \cup B_{n}

where each $B_{j}$ is closed and irreducible, and no $B_{j}$ is a subset of (or equals) any of the other $B_{i},i\neq j$ . Furthermore, this decomposition is unique up to order.

Proof:

First we prove existence. Let $A\subseteq X$ be closed. Then either $A$ is irreducible, and we are done, or $A$ can be written as the union of two proper closed subsets $A=B_{1}\cup B_{2}$ . Now again either $B_{1}$ and $B_{2}$ are irreducible, or they can be written as the union of two proper closed subsets again. The process of thus splitting up the sets must eventually terminate with all involved subsets being irreducible, since $X$ is Noetherian and otherwise we would have an infinite properly descending chain of closed subsets, contradiction. To get the last condition satisfied, we unite any subset contained within another with the greater subset (this can be done successively since there are only finitely many of them). Hence, we have a decomposition of the desired form.

We proceed to proving uniqueness up to order. Let $A=B_{1}\cup \cdots \cup B_{n}=C_{1}\cup \cdots \cup C_{m}$ be two such decompositions. For $k\in \{1,\ldots ,n\}$ , we may thus write $B_{k}=(B_{k}\cap C_{1})\cup \cdots \cup (B_{k}\cap C_{m})$ . Assume that there does not exist $j\in \{1,\ldots ,m\}$ such that $B_{k}\subseteq C_{j}$ (equivalently, $B_{k}=B_{k}\cap C_{j}$ ). Then we may define $S_{1}:=B_{k}\cap C_{1}$ and then successively

S_{l+1}:=S_{l}\cup (B_{k}\cap C_{l+1})

for $1\leq l<m$ , so that $S_{m}=B_{k}$ . Let $l$ be the smallest index such that $S_{l}\cup (B_{k}\cap C_{l+1})=B_{k}$ ; then this is a decomposition of $B_{k}$ into two proper closed subsets, contradicting the irreducibility of $B_{k}$ . Thus, our assumption was false; there does exist $j\in \{1,\ldots ,m\}$ such that $B_{k}\subseteq C_{j}$ . Thus, each $B_{k}$ is contained within a $C_{j}$ , and by symmetry $C_{j}$ is contained within some $B_{k'}$ . By transitivity of $\subseteq$ this implies $B_{k}\subseteq B_{k'}$ , hence $k=k'$ (as no $B_{k}$ is contained in another one) and $C_{j}=B_{k}$ . For a fixed $k$ , we set $\sigma (k)=j$ , where $j$ is thus defined ( $j$ is unique since otherwise two of the $C$ -sets would be equal). In a symmetric fashion, we may define $\tau (j)=k$ , where $B_{k}=C_{j}$ . Then $\tau$ and $\sigma$ are inverse to each other; hence $n=m$ (sets with a bijection between them have equal cardinality), and by the definition of $\sigma$ both decompositions are equal except for order. $\Box$

Exercises

Exercise 21.1.1: Let $X$ be an irreducible topological space, and let $O\subseteq X$ be open. Prove that $O$ is irreducible.

Algebraic sets and varieties

Definition 21.8:

Let $\mathbb {F}$ be a field. Then the sets of the form

V(S):=\{(x_{1},\ldots ,x_{n})\in \mathbb {F} ^{n}|\forall f\in S:f(x_{1},\ldots ,x_{n})=0\}

,

where $S$ is a subset of the ring of polynomials in $n$ variables over $\mathbb {F}$ (that is $S\subseteq \mathbb {F} [x_{1},\ldots ,x_{n}]$ ), are called algebraic sets. If $S=\{f\}$ for a single $f\in \mathbb {F} [x_{1},\ldots ,x_{n}]$ , we shall occasionally write

V(f):=V(\{f\})

.

The following picture depicts three algebraic sets (apart from the cube lines):

The orange surface is the set $V(f_{1})$ , the blue surface is the set $V(f_{2})$ , and the green line is the intersection of the two, equal to the set $V(\{f_{1},f_{2}\})$ , where

f_{1}(x,y,z)=z+y^{3}

and

f_{2}(x,y,z)=x+z^{2}

.
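Membership in such an algebraic set can be tested by evaluating the defining polynomials. The Python sketch below does this for the two polynomials above over the rational numbers; the helper names `in_V` and `curve_point` and the sample points are our own assumptions. The curve $V(\{f_{1},f_{2}\})$ is parametrised by solving $f_{1}=0$ for $z$ and then $f_{2}=0$ for $x$ .

```python
from fractions import Fraction

def f1(x, y, z):
    return z + y**3

def f2(x, y, z):
    return x + z**2

def in_V(polys, point):
    """True if every polynomial in `polys` vanishes at `point`."""
    return all(f(*point) == 0 for f in polys)

def curve_point(t):
    """Point of V({f1, f2}) with y = t: z = -t^3 from f1, then x = -z^2 from f2."""
    z = -t**3
    x = -z**2
    return (x, t, z)

p = curve_point(Fraction(1, 2))
print(in_V([f1, f2], p))            # True
print(in_V([f1, f2], (0, 1, -1)))   # False: f1 vanishes there, f2 does not
```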

Three immediate lemmata are apparent.

Lemma 21.9:

S\subseteq T\Rightarrow V(T)\subseteq V(S)

.

Proof: Being in $V(T)$ is the stronger condition. $\Box$

Lemma 21.10 (formulas for algebraic sets):

Let $\mathbb {F}$ be a field and set $R:=\mathbb {F} [x_{1},\ldots ,x_{n}]$ . Then the following rules hold for algebraic sets of $\mathbb {F} ^{n}$ :

$V(S)=V(\langle S\rangle )$ ( $S\subseteq R$ a set)
$V(R)=\emptyset$ and $V(\emptyset )=\mathbb {F} ^{n}$
$V(I_{1})\cup \cdots \cup V(I_{k})=V(I_{1}\cap \cdots \cap I_{k})$ ( $I_{1},\ldots ,I_{k}\leq R$ ideals)
$\bigcap _{j\in J}V(S_{j})=V\left(\bigcup _{j\in J}S_{j}\right)$ ( $S_{j}\subseteq R$ sets)

Proof:

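1. Let $i:=\sum _{j=1}^{k}r_{j}s_{j}\in \langle S\rangle$ with $r_{j}\in R$ and $s_{j}\in S$ . If $x=(x_{1},\ldots ,x_{n})\in V(S)$ , then $i(x)=\sum _{j=1}^{k}r_{j}(x)s_{j}(x)=0$ . This proves $\subseteq$ . The other direction follows from lemma 21.9, since $S\subseteq \langle S\rangle$ .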

2. $V(R)=\emptyset$ follows from the nonzero constant functions being contained within $R$ , and $V(\emptyset )$ gives no condition on the points of $\mathbb {F} ^{n}$ to be contained within it.

3. $\subseteq$ follows by

{\begin{aligned}x\in V(I_{1})\cup \cdots \cup V(I_{k})&\Rightarrow \exists j\in \{1,\ldots ,k\}:x\in V(I_{j})\\&\Rightarrow \forall f\in I_{j}:f(x)=0\\&\Rightarrow \forall f\in I_{1}\cap \cdots \cap I_{k}:f(x)=0,\end{aligned}}

since clearly $I_{1}\cap \cdots \cap I_{k}\subseteq I_{j}$ .

We will first prove $\supseteq$ for the case $k=2$ . Indeed, let $x\notin V(I_{1})\cup V(I_{2})$ , that is, neither $x\in V(I_{1})$ nor $x\in V(I_{2})$ . Hence, we find a polynomial $f\in I_{1}$ such that $f(x)\neq 0$ and a polynomial $g\in I_{2}$ such that $g(x)\neq 0$ . The polynomial $f\cdot g$ is contained within $I_{1}\cap I_{2}$ and $(f\cdot g)(x)=f(x)\cdot g(x)\neq 0$ , since every field is an integral domain. Thus, $x\notin V(I_{1}\cap I_{2})$ .

Assume $\supseteq$ holds for $k-1$ many sets. Then we have

V(I_{1}\cap \cdots \cap I_{k})=V((I_{1}\cap \cdots \cap I_{k-1})\cap I_{k})\subseteq V(I_{1}\cap \cdots \cap I_{k-1})\cup V(I_{k})\subseteq V(I_{1})\cup \cdots \cup V(I_{k-1})\cup V(I_{k})

.

4.

{\begin{aligned}x\in \bigcap _{j\in J}V(S_{j})&\Leftrightarrow \forall j\in J:x\in V(S_{j})\\&\Leftrightarrow \forall j\in J:\forall f\in S_{j}:f(x)=0\\&\Leftrightarrow \forall f\in \bigcup _{j\in J}S_{j}:f(x)=0\\&\Leftrightarrow x\in V\left(\bigcup _{j\in J}S_{j}\right).\end{aligned}}

\Box
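Over a finite field, these rules can be verified by brute force, since $\mathbb {F} ^{n}$ has only finitely many points. The following Python sketch works over the field with five elements and $n=2$ ; the field, the two polynomials and the function name `V` are our own illustrative choices. It checks lemma 21.9, rule 4, and the fact that a product of two polynomials cuts out the union of their zero sets, which is the mechanism behind the proof of rule 3.

```python
from itertools import product

P = 5  # work over the finite field with 5 elements (an illustrative choice)

def V(polys, n=2):
    """All points of the n-dimensional space over F_P where every polynomial vanishes."""
    return {pt for pt in product(range(P), repeat=n)
            if all(f(*pt) % P == 0 for f in polys)}

f = lambda x, y: x * x - y        # a parabola
g = lambda x, y: x + y - 1        # a line
fg = lambda x, y: f(x, y) * g(x, y)

# Lemma 21.9: enlarging the set of polynomials shrinks the algebraic set.
assert V([f, g]) <= V([f])
# Rule 4: an intersection of algebraic sets is the algebraic set of the union.
assert V([f]) & V([g]) == V([f, g])
# Products cut out unions, because a field has no zero divisors.
assert V([fg]) == V([f]) | V([g])
print(sorted(V([f, g])))          # prints [(2, 4)]
```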

From this lemma we see that the algebraic sets form the closed sets of a topology, much like the Zariski-closed sets we got to know in chapter 14. We shall soon find a name for that topology, but we shall first define it in a different way to justify the name we will give.

Lemma 21.11:

Let $\mathbb {F}$ be a field and $I\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ an ideal. Then

V(I)=V(r(I))

;

we recall that $r(I)$ is the radical of $I$ .

Proof: " $\supseteq$ " follows from lemma 21.9. Let on the other hand $x\in V(I)$ and $g\in r(I)$ . Then $g^{k}\in I$ for a suitable $k\in \mathbb {N}$ . Thus, $g^{k}(x)=g(x)^{k}=0$ . Assume $g(x)\neq 0$ . Then $g(x)^{k}\neq 0$ , contradiction. Hence, $x\in V(r(I))$ . $\Box$

From calculus, we all know that there is a natural topology on $\mathbb {R} ^{n}$ , namely the one induced by the Euclidean norm. However, there exists also a different topology on $\mathbb {R} ^{n}$ , and in fact, on $\mathbb {F} ^{n}$ for any field $\mathbb {F}$ . This topology is called the Zariski topology on $\mathbb {F} ^{n}$ . Now the Zariski topology actually is a topology on $\operatorname {Spec} R$ , for $R$ a ring, isn't it? Yes, and if $R=\mathbb {F} [x_{1},\ldots ,x_{n}]$ , then $\mathbb {F} ^{n}$ is in bijective correspondence with a subset of $\operatorname {Spec} R$ . Through this correspondence we will define the Zariski topology. So let's establish this correspondence by beginning with the following lemma.

Lemma 21.12:

Let $\mathbb {F}$ be a field and set $R:=\mathbb {F} [x_{1},\ldots ,x_{n}]$ . If $(\alpha _{1},\ldots ,\alpha _{n})\in \mathbb {F} ^{n}$ , then the ideal

\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle

is a maximal ideal of $R$ .

Proof:

Set

\varphi :\mathbb {F} [x_{1},\ldots ,x_{n}]\to \mathbb {F} ,\varphi (f):=f(\alpha _{1},\ldots ,\alpha _{n})

.

This is a surjective ring homomorphism. We claim that its kernel is given by $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ . This is actually not trivial and requires explanation. The relation $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle \subseteq \ker \varphi$ is trivial. We shall now prove the other direction, which isn't. For a given $f\in \mathbb {F} [x_{1},\ldots ,x_{n}]$ , we define ${\tilde {f}}(x_{1},x_{2},\ldots ,x_{n}):=f(x_{1}+\alpha _{1},\ldots ,x_{n}+\alpha _{n})$ ; hence,

{\begin{aligned}f(x_{1},x_{2},\ldots ,x_{n})&=f(x_{1}+\alpha _{1}-\alpha _{1},\ldots ,x_{n}+\alpha _{n}-\alpha _{n})\\&={\tilde {f}}(x_{1}-\alpha _{1},x_{2}-\alpha _{2},\ldots ,x_{n}-\alpha _{n}).\end{aligned}}

Furthermore, $f(\alpha _{1},\ldots ,\alpha _{n})=0$ if and only if ${\tilde {f}}(0,\ldots ,0)=0$ . The latter condition is satisfied if and only if ${\tilde {f}}$ has no constant term, and this happens if and only if ${\tilde {f}}$ is contained within the ideal $\langle x_{1},\ldots ,x_{n}\rangle$ . This means we can write ${\tilde {f}}$ as an $\mathbb {F} [x_{1},\ldots ,x_{n}]$ -linear combination of $x_{1},\ldots ,x_{n}$ , and inserting $x_{j}-\alpha _{j}$ for $x_{j}$ gives the desired statement.

Hence, by the first isomorphism theorem for rings,

\mathbb {F} [x_{1},\ldots ,x_{n}]{\big /}\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle \cong \mathbb {F}

.

Thus, $\mathbb {F} [x_{1},\ldots ,x_{n}]{\big /}\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ is a field and hence $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ is maximal. $\Box$
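As a small worked instance of the kernel computation in this proof (the polynomial and the point are chosen by us for illustration), let $n=2$ , $f=x_{1}x_{2}-2$ and $(\alpha _{1},\alpha _{2})=(1,2)$ , so that $f(1,2)=0$ . Then ${\tilde {f}}(x_{1},x_{2})=(x_{1}+1)(x_{2}+2)-2=x_{1}x_{2}+2x_{1}+x_{2}$ has no constant term, and inserting $x_{j}-\alpha _{j}$ for $x_{j}$ gives

f(x_{1},x_{2})=(x_{1}-1)(x_{2}-2)+2(x_{1}-1)+(x_{2}-2)\in \langle x_{1}-1,x_{2}-2\rangle

.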

Lemma 21.13:

Let $\mathbb {F}$ be a field. Define

{\mathcal {M}}_{\mathbb {F} }:=\left\{\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle {\big |}(\alpha _{1},\ldots ,\alpha _{n})\in \mathbb {F} ^{n}\right\}

(according to the previous lemma this is a subset of $\operatorname {Spec} \mathbb {F} [x_{1},\ldots ,x_{n}]$ , as maximal ideals are prime). Then the function

\Phi :\mathbb {F} ^{n}\to {\mathcal {M}}_{\mathbb {F} },\Phi ((\alpha _{1},\ldots ,\alpha _{n})):=\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle

is a bijection.

Proof:

The function is certainly surjective. Let $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle =\langle x_{1}-\beta _{1},\ldots ,x_{n}-\beta _{n}\rangle$ , and assume $\beta _{j}\neq \alpha _{j}$ for a certain $j\in \{1,\ldots ,n\}$ . Then $x_{j}-\beta _{j}\in \langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ , and thus

0\neq \alpha _{j}-\beta _{j}=x_{j}-\beta _{j}-(x_{j}-\alpha _{j})\in \langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle

.

Thus, $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ contains a unit and therefore equals $\mathbb {F} [x_{1},\ldots ,x_{n}]$ , contradicting its maximality that was established in the last lemma. $\Box$

Definition 21.14:

Let $\mathbb {F}$ be a field. Then the Zariski topology on $\mathbb {F} ^{n}$ is defined to consist of the open sets

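\Phi ^{-1}(O),\qquad O\subseteq {\mathcal {M}}_{\mathbb {F} }{\text{ open}},

where $\Phi$ and ${\mathcal {M}}_{\mathbb {F} }$ are given as in lemma 21.13 and ${\mathcal {M}}_{\mathbb {F} }$ carries the subspace topology induced by $\operatorname {Spec} \mathbb {F} [x_{1},\ldots ,x_{n}]$ (that is, the Zariski topology on $\mathbb {F} ^{n}$ is defined to be the initial topology with respect to $\Phi$ ).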

It is easy to check that the sets $\Phi ^{-1}(O)$ , $O\subseteq {\mathcal {M}}_{\mathbb {F} }$ really do form a topology.

There is a very simple different way to characterise the Zariski topology:

Theorem 21.15:

Let $\mathbb {F}$ be a field. The closed sets of the Zariski topology on $\mathbb {F} ^{n}$ are exactly the algebraic sets.

Proof:

Unfortunately, for a set $T\subseteq \mathbb {F} [x_{1},\ldots ,x_{n}]$ , the notation $V(T)$ is now ambiguous; it could refer to the algebraic set associated to $T$ , or to the set of prime ideals $p$ of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ satisfying $T\subseteq p$ . Hence, we shall write the latter as ${\tilde {V}}(T)$ for the remainder of this wikibook.

Let $A\subseteq \mathbb {F} ^{n}$ be closed w.r.t. the Zariski topology; that is, $A=\Phi ^{-1}({\tilde {V}}(T)\cap {\mathcal {M}}_{\mathbb {F} })$ , where $\Phi$ is the function from lemma 21.13 and $T\subseteq R:=\mathbb {F} [x_{1},\ldots ,x_{n}]$ . We claim that $A=V(T)$ . Indeed, for $\alpha =(\alpha _{1},\ldots ,\alpha _{n})\in \mathbb {F} ^{n}$ , the proof of lemma 21.12 shows that $f(\alpha )=0$ holds if and only if $f\in \langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ , and therefore

{\begin{aligned}\alpha \in V(T)&\Leftrightarrow \forall f\in T:f(\alpha )=0\\&\Leftrightarrow \forall f\in T:f\in \langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle \\&\Leftrightarrow T\subseteq \Phi (\alpha )\\&\Leftrightarrow \Phi (\alpha )\in {\tilde {V}}(T)\\&\Leftrightarrow \alpha \in \Phi ^{-1}({\tilde {V}}(T)\cap {\mathcal {M}}_{\mathbb {F} })\end{aligned}}

.

Let now $V(S)$ be an algebraic set. We claim $V(S)=\Phi ^{-1}({\tilde {V}}(S)\cap {\mathcal {M}}_{\mathbb {F} })$ . Indeed, the above equivalences prove also this identity (with $S$ replacing $T$ ). $\Box$

In fact, we could have defined the Zariski topology in this way (that is, just defining the closed sets to be the algebraic sets), but then we would have hidden the connection to the Zariski topology we already knew.

We shall now go on to give the next important definition, which also shows why we dealt with irreducible spaces.

Definition 21.16:

Let $\mathbb {F}$ be a field and let $V(S)$ be an algebraic set. If $V(S)$ is irreducible w.r.t. the subspace topology induced by the Zariski topology, $V(S)$ is called an algebraic variety.

Often, we shall just write variety for algebraic variety.

We have an easy characterisation of algebraic varieties. But in order to prove it, we need a definition with theorem first.

Theorem and definition 21.17:

Let $V(S)$ be an algebraic set. We define

I(V(S)):=\{f\in \mathbb {F} [x_{1},\ldots ,x_{n}]|\forall x\in V(S):f(x)=0\}

and call $I(V(S))$ the ideal associated to $V(S)$ or the ideal of vanishing of $V(S)$ . We have

V(I(V(S)))=V(S)

and any set $T$ such that $V(T)=V(S)$ is contained within $I(V(S))$ .

Proof:

Let first $T$ be any set such that $V(T)=V(S)$ . Then for all $f\in T$ and $x\in V(S)=V(T)$ , $f(x)=0$ and hence $f\in I(V(S))$ . Thus $T\subseteq I(V(S))$ .

Therefore, $S\subseteq I(V(S))$ , and hence $V(I(V(S)))\subseteq V(S)$ by lemma 21.9. On the other hand, if $x\in V(S)$ , then $f(x)=0$ for all $f\in I(V(S))$ by definition. Hence $x\in V(I(V(S)))$ . This proves $V(S)\subseteq V(I(V(S)))$ . $\Box$

Theorem 21.18:

Let $\mathbb {F}$ be a field and let $V(S)$ be an algebraic set. Then $V(S)$ is an algebraic variety if and only if $V(S)=V(p)$ for a prime ideal $p\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ .

Proof:

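Let first $p\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ be a prime ideal. Assume that $V(p)=V(S)\cup V(T)$ , where $V(S),V(T)$ are two proper closed subsets of $V(p)$ (according to lemma 21.10, all subsets of $V(p)$ closed w.r.t. the subspace topology of $V(p)$ have this form). Then there exist $x\in V(p)\setminus V(T)$ and $y\in V(p)\setminus V(S)$ . Hence, there is $g\in T$ such that $g(x)\neq 0$ and $f\in S$ such that $f(y)\neq 0$ . Now $f\cdot g$ vanishes on all of $V(p)$ , since for all $z\in V(S)\cup V(T)$ either $f(z)=0$ or $g(z)=0$ ; that is, $f\cdot g\in I(V(p))$ . Provided that $I(V(p))=p$ (which holds, for instance, when $\mathbb {F}$ is algebraically closed, by Hilbert's Nullstellensatz), this gives $f\cdot g\in p$ , although neither $f\in p$ nor $g\in p$ (as $f(y)\neq 0$ and $g(x)\neq 0$ with $x,y\in V(p)$ ). This contradicts the primality of $p$ .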

Let now $V(S)$ be an algebraic set, and assume that $I:=I(V(S))$ is not prime. Let $fg\in I$ such that neither $f\in I$ nor $g\in I$ . Set $J_{f}:=I+\langle f\rangle$ and $J_{g}:=I+\langle g\rangle$ . Then $J_{f}$ and $J_{g}$ are strictly larger than $I$ . According to 21.17, $V(J_{f})\neq V(S)$ and $V(J_{g})\neq V(S)$ , since otherwise $J_{f}\subseteq I$ or $J_{g}\subseteq I$ respectively. Hence, both $V(J_{f})$ and $V(J_{g})$ are proper subsets of $V(S)$ . But if $x\in V(S)=V(I)$ , then $fg(x)=f(x)g(x)=0$ . Hence, either $f(x)=0$ or $g(x)=0$ , and thus either $x\in V(J_{f})$ or $x\in V(J_{g})$ . Thus, $V(S)$ is the union of two proper closed subsets,

V(S)=V(J_{g})\cup V(J_{f})

,

and is not irreducible. Hence, if irreducibility is present, then $I(V(S))$ is prime and from 21.17 $V(I(V(S)))=V(S)$ . $\Box$

Theorem 21.19:

$\mathbb {F} ^{n}$ , equipped with the Zariski topology, is a Noetherian space.

Proof:

Let $O_{1}\subseteq O_{2}\subseteq \cdots \subseteq O_{k}\subseteq \cdots$ be an ascending chain of open sets. Let $\Phi$ and ${\mathcal {M}}_{\mathbb {F} }$ be given as in lemma 21.13 and definition 21.14. Set $U_{j}=\Phi (O_{j})$ for all $j\in \mathbb {N}$ . Then, since $\Phi$ , being a function, preserves inclusion,

U_{1}\subseteq U_{2}\subseteq \cdots \subseteq U_{k}\subseteq \cdots

.

Since $\mathbb {F}$ is a Noetherian ring, so is $\mathbb {F} [x_{1},\ldots ,x_{n}]$ (by repeated application of Hilbert's basis theorem). Hence, the above ascending chain of the $U_{j}$ eventually stabilizes at some $N\in \mathbb {N}$ . Since $\Phi$ is a bijection, $O_{j}=\Phi ^{-1}(\Phi (O_{j}))=\Phi ^{-1}(U_{j})$ . Hence, the $O_{j}$ stabilize at $N$ as well. $\Box$

Corollary 21.20:

Every algebraic set $V(S)$ has a decomposition

V(S)=V(p_{1})\cup V(p_{2})\cup \cdots \cup V(p_{k})

for certain prime ideals $p_{1},\ldots ,p_{k}\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ such that none of the $V(p_{j})$ is a proper subset of another. This decomposition is unique up to order.

That is, we can decompose algebraic sets into algebraic varieties.

Proof:

Combine theorems 21.19, 21.7 and 21.18. $\Box$
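The decomposition rests on the fact that factoring a defining polynomial splits an algebraic set into closed pieces. The following is a minimal brute-force sketch of that union rule, under assumptions that are not in the text above: the field is the finite field with five elements (chosen only so that all points can be enumerated), and the polynomials `f` and `g` are hypothetical examples. Over a finite field every subset is closed, so the two pieces found here are not irreducible components; the sketch only checks $V(\{f\cdot g\})=V(\{f\})\cup V(\{g\})$ for one pair.

```python
from itertools import product

P = 5  # assumption: work over the prime field with 5 elements


def zero_set(poly, n=2, p=P):
    """Return all points of (Z/p)^n at which poly vanishes modulo p."""
    return {pt for pt in product(range(p), repeat=n) if poly(*pt) % p == 0}


# Hypothetical example polynomials in two variables.
f = lambda x, y: x * x - y      # the parabola y = x^2
g = lambda x, y: x + y - 1      # the line x + y = 1
fg = lambda x, y: f(x, y) * g(x, y)

V_f, V_g, V_fg = zero_set(f), zero_set(g), zero_set(fg)

# Z/5 has no zero divisors, so fg vanishes exactly where f or g does.
union_rule_holds = V_fg == V_f | V_g
```

The check depends on the coefficient ring being a field: modulo a composite number, a product can vanish where neither factor does.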

Exercises

Exercise 21.2.1: Let $f,g\in \mathbb {F} [x_{1},\ldots ,x_{n}]$ . Prove that $V(\{f\cdot g\})=V(\{f\})\cup V(\{g\})$ .

Noether's normalisation lemma

Computational preparation

Lemma 23.1:

Let $R$ be a ring, and let $f\in R[x_{1},\ldots ,x_{n}]$ be a polynomial. Let $N\in \mathbb {N}$ be a number that is strictly larger than the degree of any monomial of $f$ (where the degree of an arbitrary monomial $x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{n}^{k_{n}}$ of $f$ is defined to be $k_{1}+k_{2}+\cdots +k_{n}$ ). Then the largest monomial (with respect to degree) of the polynomial

g(x_{1},\ldots ,x_{n}):=f(x_{1}+x_{n}^{N^{n-1}},x_{2}+x_{n}^{N^{n-2}},\ldots ,x_{n-2}+x_{n}^{N^{2}},x_{n-1}+x_{n}^{N},x_{n})

has the form $x_{n}^{m}$ for a suitable $m\in \mathbb {N}$ .

Proof:

Let $x_{1}^{k_{1}}x_{2}^{k_{2}}\cdots x_{n}^{k_{n}}$ be an arbitrary monomial of $f$ . Inserting $x_{1}+x_{n}^{N^{n-1}}$ for $x_{1}$ , $x_{2}+x_{n}^{N^{n-2}}$ for $x_{2}$ , and so on up to $x_{n-1}+x_{n}^{N}$ for $x_{n-1}$ , gives

(x_{1}+x_{n}^{N^{n-1}})^{k_{1}}(x_{2}+x_{n}^{N^{n-2}})^{k_{2}}\cdots (x_{n-1}+x_{n}^{N})^{k_{n-1}}x_{n}^{k_{n}}

.

This is a polynomial, and moreover, by definition $g$ consists of certain coefficients multiplied by polynomials of that form.

We want to find the largest monomial of $g$ . To do so, we first identify the largest monomial of

(x_{1}+x_{n}^{N^{n-1}})^{k_{1}}(x_{2}+x_{n}^{N^{n-2}})^{k_{2}}\cdots (x_{n-1}+x_{n}^{N})^{k_{n-1}}x_{n}^{k_{n}}

by multiplying out; it turns out that always choosing $x_{n}^{N^{n-j}}$ yields a strictly larger monomial than instead preferring the other variable $x_{j}$ . Hence, the strictly largest monomial of the polynomial under consideration is

(x_{n}^{N^{n-1}})^{k_{1}}(x_{n}^{N^{n-2}})^{k_{2}}\cdots (x_{n}^{N})^{k_{n-1}}x_{n}^{k_{n}}=x_{n}^{k_{1}N^{n-1}+k_{2}N^{n-2}+\cdots +k_{n-1}N+k_{n}}

.

Now $N$ is larger than all the $k_{j}$ involved here, since it's even larger than the degree of any monomial of $f$ . Therefore, for $(k_{1},\ldots ,k_{n})$ coming from monomials of $f$ , the numbers

k_{1}N^{n-1}+k_{2}N^{n-2}+\cdots +k_{n-1}N+k_{n}

represent numbers in the number system base $N$ , with digits $k_{1},\ldots ,k_{n}$ . In particular, no two of them are equal for distinct $(k_{1},\ldots ,k_{n})$ , since two numbers written in base $N$ are equal only if all of their digits agree. Hence, there is a largest of them, call it $m_{1}N^{n-1}+m_{2}N^{n-2}+\cdots +m_{n-1}N+m_{n}$ . The largest monomial of

(x_{1}+x_{n}^{N^{n-1}})^{m_{1}}(x_{2}+x_{n}^{N^{n-2}})^{m_{2}}\cdots (x_{n-1}+x_{n}^{N})^{m_{n-1}}x_{n}^{m_{n}}

is then

x_{n}^{m_{1}N^{n-1}+m_{2}N^{n-2}+\cdots +m_{n-1}N+m_{n}}

;

its size dominates certainly all monomials coming from the monomial of $f$ with powers $(m_{1},\ldots ,m_{n})$ , and by choice it also dominates the largest monomial of any polynomials generated by any other monomial of $f$ . Hence, it is the largest monomial of $g$ measured by degree, and it has the desired form. $\Box$
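The substitution of lemma 23.1 can be tried out on a small case. The following is a minimal sketch, assuming integer coefficients and a dictionary encoding of polynomials (exponent tuple mapped to coefficient); the helper names `p_add`, `p_mul`, `p_pow` and `substitute`, and the sample polynomial, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def p_add(a, b):
    """Add two polynomials given as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for poly in (a, b):
        for exps, c in poly.items():
            out[exps] += c
    return {e: c for e, c in out.items() if c != 0}


def p_mul(a, b):
    """Multiply two polynomials given as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(i + j for i, j in zip(ea, eb))] += ca * cb
    return {e: c for e, c in out.items() if c != 0}


def p_pow(a, k, n):
    """Raise the polynomial a in n variables to the k-th power."""
    result = {(0,) * n: 1}
    for _ in range(k):
        result = p_mul(result, a)
    return result


def substitute(f, N):
    """Compute g = f(x_1 + x_n^(N^(n-1)), ..., x_(n-1) + x_n^N, x_n)."""
    n = len(next(iter(f)))

    def var(i):  # the polynomial x_(i+1)
        return {tuple(1 if j == i else 0 for j in range(n)): 1}

    images = []
    for i in range(n - 1):
        # x_n raised to N^(n-1-i), the shift attached to x_(i+1)
        shift = {tuple(N ** (n - 1 - i) if j == n - 1 else 0 for j in range(n)): 1}
        images.append(p_add(var(i), shift))
    images.append(var(n - 1))

    g = {}
    for exps, c in f.items():
        term = {(0,) * n: c}
        for img, k in zip(images, exps):
            term = p_mul(term, p_pow(img, k, n))
        g = p_add(g, term)
    return g


# Sample: f = x1*x2 + 2*x1*x3^2 + x3; every monomial has degree < 4, so N = 4 is allowed.
f = {(1, 1, 0): 1, (1, 0, 2): 2, (0, 0, 1): 1}
g = substitute(f, N=4)
top_degree = max(sum(e) for e in g)
top = [e for e in g if sum(e) == top_degree]
```

Here the monomial $x_{1}x_{2}$ has exponent vector $(1,1,0)$ , so the predicted top exponent is $1\cdot 4^{2}+1\cdot 4+0=20$ , and `top` consists of the single exponent tuple of $x_{3}^{20}$ .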

Algebraic independence in algebras

A notion well-known in the theory of fields extends to algebras.

Definition 23.2:

Let $R$ be a ring and $A$ an $R$ -algebra. Elements $a_{1},\ldots ,a_{n}$ in $A$ are called algebraically independent over $R$ iff there does not exist a nonzero polynomial $p\in R[x_{1},\ldots ,x_{n}]$ such that $p(a_{1},\ldots ,a_{n})=0$ (where the polynomial is evaluated as explained in chapter 21).
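
Example: As an illustration (the choice of algebra is made here and is not taken from the text above), let $\mathbb {F}$ be a field and take $A=\mathbb {F} [x,y]$ , regarded as an $\mathbb {F}$ -algebra. The elements $a_{1}=x$ , $a_{2}=y$ are algebraically independent over $\mathbb {F}$ , since $p(x,y)=0$ in $A$ forces $p=0$ . In contrast, $a_{1}=x$ , $a_{2}=x^{2}$ are not algebraically independent: the nonzero polynomial $p(t_{1},t_{2})=t_{1}^{2}-t_{2}$ satisfies

p(x,x^{2})=x^{2}-x^{2}=0

.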

Transitivity of localisation

The theorem

Theorem 23.3 (Noether's normalisation lemma):

Let $D$ be an integral domain, and let $E\supseteq D$ be a ring extension of $D$ that is finitely generated as a $D$ -module; in particular, $E$ is a $D$ -algebra, where the algebra operations are induced by the ring operations. Then we may pick a $d\in D$ such that there exist $c_{1},\ldots ,c_{n}\in E_{d}$ ( $E_{d}$ denoting the localisation of $E$ at $d$ ) which are algebraically independent over $D_{d}$ , where $E_{d}$ is regarded as a $D_{d}$ -algebra.

Localisation of fields

Hilbert's Nullstellensatz

Zariski's lemma

Definition 24.1 (Finitely generated algebra):

Let $R$ be a ring. An $R$ -algebra $A$ is called finitely generated, iff there are elements $a_{1},\ldots ,a_{n}\in A$ such that $R[a_{1},\ldots ,a_{n}]$ is already all of $A$ ; that is $A=R[a_{1},\ldots ,a_{n}]$ .

$A$ being a finitely generated $R$ -algebra thus means that we may write any element of $A$ as a polynomial $p(a_{1},\ldots ,a_{n})$ for a certain $p\in R[x_{1},\ldots ,x_{n}]$ (where polynomials are evaluated as explained in chapter 21).

Lemma 24.2 (Artin–Tate):

Let $R\subseteq S\subseteq T$ be ring extensions such that $R$ is a Noetherian ring, and $T$ is finitely generated as an $S$ -module and also finitely generated as an $R$ -algebra. Then $S$ is finitely generated as an $R$ -algebra.

Proof:

Since $T$ is finitely generated as an $S$ -module, there exist $u_{1},\ldots ,u_{n}\in T$ such that $T=\langle u_{1},\ldots ,u_{n}\rangle$ as an $S$ -module. Further, since $T$ is finitely generated as an $R$ -algebra, we find $v_{1},\ldots ,v_{m}\in T$ such that $T$ equals $R[v_{1},\ldots ,v_{m}]$ . Now by the generating property of the $u_{1},\ldots ,u_{n}$ , we may determine suitable coefficients $a_{i,j}\in S$ (where $i$ ranges in $\{1,\ldots ,n\}$ and $j$ in $\{1,\ldots ,m\}$ ) such that

v_{j}=a_{1,j}u_{1}+\cdots +a_{n,j}u_{n},\qquad j\in \{1,\ldots ,m\}~~~~~~~(*)

.

Furthermore, there exist suitable $b_{i,j,k}\in S$ ( $i,j,k\in \{1,\ldots ,n\}$ ) such that

u_{j}u_{k}=b_{1,j,k}u_{1}+\cdots +b_{n,j,k}u_{n}~~~~~~~(**)

.

We define $S':=R[a_{i,j}(1\leq i\leq n,1\leq j\leq m),b_{i,j,k}(1\leq i,j,k\leq n)]\subseteq T$ ; this notation shall mean: $S'$ is the algebra generated by all the elements $a_{i,j},b_{i,j,k}$ . Since the algebra operations of $T$ are the ones induced by its ring operations, $S'$ , being a subalgebra, is a subring of $T$ . Furthermore, $S'\subseteq S$ and $R\subseteq S'$ . Since $R$ is a Noetherian ring, $S'$ is also Noetherian by theorem 16.?.

We claim that $T$ is even finitely generated as an $S'$ -module. Indeed, if any element $t\in T$ is given, we may write it as a polynomial in the $v_{1},\ldots ,v_{m}$ . Using $(*)$ , multiplying everything out, and then using $(**)$ repeatedly, we can write this polynomial as a linear combination of the $u_{1},\ldots ,u_{n}$ with coefficients all in $S'$ . This proves that indeed, $T$ is finitely generated as an $S'$ -module. Hence, $T$ is Noetherian as an $S'$ -module.

Therefore, $S$ , being a submodule of $T$ as $S'$ -module, is finitely generated as an $S'$ -module. We claim that $S$ is finitely generated as an $R$ -algebra. To this end, assume we are given a set of generators $m_{1},\ldots ,m_{l}\in S$ of $S$ as an $S'$ -module. Any element $s\in S$ can be written

s=c_{1}m_{1}+\cdots +c_{l}m_{l},\qquad c_{1},\ldots ,c_{l}\in S'

.

Each of the $c_{i}$ is a polynomial in the generators of $S'$ (that is, the elements $a_{i,j},b_{i,j,k}$ ) with coefficients in $R$ . Inserting this, we see that $s$ is a polynomial in the elements $a_{i,j},b_{i,j,k},m_{i}$ with coefficients in $R$ . But this implies the claim. $\Box$

Theorem 24.3 (Zariski's lemma):

Let $L$ be a field extension of a field $K$ . Assume that for some $\alpha _{1},\ldots ,\alpha _{n}$ in $L$ , $R=K[\alpha _{1},\ldots ,\alpha _{n}]$ is a field. Then every $\alpha _{i}$ is algebraic over $K$ .
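
Example: Two standard cases may help to see what the statement says (both are illustrations chosen here, not taken from the proofs below). For $K=\mathbb {R}$ , $L=\mathbb {C}$ and $\alpha _{1}=i$ , the ring $\mathbb {R} [i]=\mathbb {C}$ is a field, and indeed $i$ is algebraic over $\mathbb {R}$ , being a root of $x^{2}+1$ . Conversely, for $L=K(t)$ the field of rational functions and $\alpha _{1}=t$ , the element $t$ is not algebraic over $K$ , and accordingly $K[t]$ is not a field: $t$ has no inverse in $K[t]$ , since $t\cdot q(t)=1$ is impossible for degree reasons.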

Proof 1 (Azarang 2015):

Before giving the proof of the lemma, we recall the following two well-known facts.

Fact 1. If a field $F$ is integral over a subdomain $D$ , then $D$ is a field.

Fact 2. If $D$ is any principal ideal domain (or just a UFD) with infinitely many (non-associate) prime elements, then its field of fractions is not a finitely generated $D$ -algebra.

Proof of the Lemma: We use induction on $n$ for arbitrary fields $K$ and $L$ . For $n=1$ the assertion is clear. Let us assume that $n>1$ and that the lemma is true for fewer than $n$ generators. To show it for $n$ , suppose that one of the $\alpha _{i}$ , say $\alpha _{1}$ , is not algebraic over $K$ . Since $K[\alpha _{1},\ldots ,\alpha _{n}]=K(\alpha _{1})[\alpha _{2},\ldots ,\alpha _{n}]$ is a field, the induction hypothesis implies that $\alpha _{2},\ldots ,\alpha _{n}$ are all algebraic over $K(\alpha _{1})$ . This implies that there are nonzero polynomials $f_{2}(\alpha _{1}),\ldots ,f_{n}(\alpha _{1})\in K[\alpha _{1}]$ such that all the $\alpha _{i}$ are integral over the domain $A=K[\alpha _{1}][1/f_{2}(\alpha _{1}),\ldots ,1/f_{n}(\alpha _{1})]$ . Since $R$ is integral over $A$ , by Fact 1, $A$ is a field. Consequently, $A=K(\alpha _{1})$ , which contradicts Fact 2 applied to $D=K[\alpha _{1}]$ . Hence every $\alpha _{i}$ is algebraic over $K$ .

Proof 2 (Artin–Tate):

In this proof and the next, we write $\mathbb {F}$ for $K$ and $A$ for $R$ ; the generators $\alpha _{i}$ are denoted $a_{i}$ . If all of the generators of $A$ over $\mathbb {F}$ are algebraic over $\mathbb {F}$ , then $A$ is a finite field extension of $\mathbb {F}$ , being generated by finitely many algebraic elements. Hence, we only have to consider the case where at least one of the generators of $A$ over $\mathbb {F}$ is transcendental over $\mathbb {F}$ .

Indeed, assume that $A=\mathbb {F} [a_{1},\ldots ,a_{n}]$ . By reordering, we may assume that $a_{1},\ldots ,a_{r}$ are transcendental over $\mathbb {F}$ ( $r\geq 1$ ) and $a_{r+1},\ldots ,a_{n}$ are algebraic over $\mathbb {F}$ . We have $A=\mathbb {F} [a_{1},\ldots ,a_{n}]\subseteq \mathbb {F} (a_{1},\ldots ,a_{n})$ , and furthermore $\mathbb {F} (a_{1},\ldots ,a_{n})\subseteq A$ since $A$ is a field extension of $\mathbb {F}$ containing all the elements $a_{1},\ldots ,a_{n}$ . Hence, $A=\mathbb {F} (a_{1},\ldots ,a_{n})$ .

Since all the $a_{r+1},\ldots ,a_{n}$ are algebraic over $\mathbb {F}$ , they are also algebraic over $\mathbb {F} (a_{1},\ldots ,a_{r})$ . Assume that there exists a polynomial $f\in \mathbb {F} [x_{1},\ldots ,x_{r}]\setminus \{0\}$ such that $f(a_{1},\ldots ,a_{r})=0$ . Then $a_{r}$ is algebraic over $\mathbb {F} (a_{1},\ldots ,a_{r-1})$ ; for, the part of the monomials not being a power of $a_{r}$ may be seen as coefficients within that field. Hence, we may lower $r$ by one and still obtain that $a_{r+1},\ldots ,a_{n}$ are algebraic over $\mathbb {F} (a_{1},\ldots ,a_{r})$ . Repetition of this process eventually terminates, or otherwise $a_{1}$ would be algebraic over $\mathbb {F}$ , and $A$ would be a finite tower of algebraic extensions ( $\mathbb {F} (a_{1})$ , $\mathbb {F} (a_{1})(a_{2})$ and so on) and thus a finite field extension.

Therefore, we may assume that $a_{1},\ldots ,a_{r}$ are algebraically independent over $\mathbb {F}$ . In this case, the map

\mathbb {F} [x_{1},\ldots ,x_{r}]\to \mathbb {F} [a_{1},\ldots ,a_{r}],f(x_{1},\ldots ,x_{r})\mapsto f(a_{1},\ldots ,a_{r})

is an isomorphism (it is a homomorphism, surjective and injective), and hence, $\mathbb {F} [a_{1},\ldots ,a_{r}]$ is a unique factorisation domain (since $\mathbb {F} [x_{1},\ldots ,x_{r}]$ is).

Now set $\mathbb {G} :=\mathbb {F} (a_{1},\ldots ,a_{r})$ . Then $\mathbb {F} \subseteq \mathbb {G} \subseteq A$ , and $A$ is finitely generated as an $\mathbb {F}$ -algebra and finitely generated as a $\mathbb {G}$ -module (since it is a finite field extension of $\mathbb {G}$ ). Therefore, by lemma 24.2, $\mathbb {G}$ is finitely generated as an $\mathbb {F}$ -algebra. Let

{\frac {f_{1}(a_{1},\ldots ,a_{r})}{g_{1}(a_{1},\ldots ,a_{r})}},\ldots ,{\frac {f_{m}(a_{1},\ldots ,a_{r})}{g_{m}(a_{1},\ldots ,a_{r})}}

be generators of $\mathbb {G}$ as $\mathbb {F}$ -algebra. Let $p_{1},\ldots ,p_{l}$ be all the primes occurring in the (unique) prime factorisations of $g_{1},\ldots ,g_{m}$ . Now $\mathbb {F} [a_{1},\ldots ,a_{r}]$ contains an infinite number of primes. This is seen as follows.

Assume $q_{1},\ldots ,q_{k}$ were the only primes of $\mathbb {F} [a_{1},\ldots ,a_{r}]$ . Since we have prime factorisation, the element $q_{1}\cdot q_{2}\cdots q_{k}+1\in \mathbb {F} [a_{1},\ldots ,a_{r}]$ is divisible by at least one of $q_{1},\ldots ,q_{k}$ , say $q_{j}$ . This means

1=q_{j}(q_{1}\cdots q_{j-1}q_{j+1}\cdots q_{k}+s)

for a certain $s\in \mathbb {F} [a_{1},\ldots ,a_{r}]$ , which is absurd, since applying the inverse of the above isomorphism to $\mathbb {F} [x_{1},\ldots ,x_{r}]$ , we find that $1$ is mapped to $1$ , but the right hand side has strictly positive degree.

Hence, we may pick a prime $p\notin \{p_{1},\ldots ,p_{l}\}$ . Then $1/p$ cannot be written as a polynomial in terms of the generators, but is nonetheless contained within $\mathbb {G}$ . This is a contradiction. $\Box$
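The Euclid-style step in the proof above can be tried out concretely. The following is a minimal sketch, assuming the hypothetical special case where $\mathbb {F}$ is the field with two elements and $r=1$ , so that $\mathbb {F} [a_{1}]$ is isomorphic to the polynomial ring in one variable over that field. Polynomials are encoded as Python integers (bit $i$ is the coefficient of $x^{i}$ ); this encoding and the helper names `gf2_mul` and `gf2_mod` are choices made here for illustration.

```python
def gf2_mul(a, b):
    """Multiply two polynomials over the field with two elements (bit-mask encoding)."""
    out = 0
    while b:
        if b & 1:
            out ^= a      # add a shifted copy of a (addition is XOR)
        a <<= 1
        b >>= 1
    return out


def gf2_mod(a, m):
    """Remainder of a on division by the nonzero polynomial m."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)   # cancel the leading term of a
    return a


# Three primes of the ring: x, x + 1 and x^2 + x + 1.
primes = [0b10, 0b11, 0b111]

prod = 1
for q in primes:
    prod = gf2_mul(prod, q)

euclid = prod ^ 1                         # q_1 * q_2 * q_3 + 1
remainders = [gf2_mod(euclid, q) for q in primes]
```

Each remainder equals the constant polynomial $1$ , so none of the listed primes divides $q_{1}q_{2}q_{3}+1=x^{4}+x+1$ , and any prime factor of that element is a prime not on the list.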

Proof 3 (using Noether normalisation):

According to Noether's normalisation lemma for fields, we may pick $c_{1},\ldots ,c_{k}\in A$ algebraically independent over $\mathbb {F}$ such that $A$ is a finitely generated $\mathbb {F} [c_{1},\ldots ,c_{k}]$ -module. Let $m_{1},\ldots ,m_{l}$ be elements of $A$ that generate $A$ as an $\mathbb {F} [c_{1},\ldots ,c_{k}]$ -module. Then according to theorem 21.10 3. $\Rightarrow$ 1., the generators are all integral over $\mathbb {F} [c_{1},\ldots ,c_{k}]$ , and since the integral elements form a ring, $A$ is integral over $\mathbb {F} [c_{1},\ldots ,c_{k}]$ . Hence, $\mathbb {F} [c_{1},\ldots ,c_{k}]$ is a field by theorem 21.11. But if $k\geq 1$ , then the $c_{1},\ldots ,c_{k}$ being algebraically independent means that the homomorphism

\mathbb {F} [x_{1},\ldots ,x_{k}]\to \mathbb {F} [c_{1},\ldots ,c_{k}],f(x_{1},\ldots ,x_{k})\mapsto f(c_{1},\ldots ,c_{k})

is in fact an isomorphism, whence $\mathbb {F} [c_{1},\ldots ,c_{k}]$ is not a field, contradiction. Thus, $k=0$ , and hence $A$ is finitely generated as an $\mathbb {F}$ -module. This implies that we have a finite field extension; all elements of $A$ are finite $\mathbb {F}$ -linear combinations of certain generators. $\Box$

Hilbert's Nullstellensatz

There are several closely related results bearing the name Hilbert's Nullstellensatz. We shall state and prove the ones commonly found in the literature. These are the "weak form", the "common roots form" and the "strong form". The result that Hilbert originally proved was the strong form.

Weak form

The formulation and proof of the weak form of Hilbert's Nullstellensatz are naturally preceded by the following lemma.

Lemma 24.5:

Let $\mathbb {F}$ be any field. For any maximal ideal $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ , the field $\mathbb {F} [x_{1},\ldots ,x_{n}]/m$ is a finite field extension of the field $\{c+m|c\in \mathbb {F} \}\subseteq \mathbb {F} [x_{1},\ldots ,x_{n}]/m$ . In particular, if $\mathbb {F}$ is algebraically closed (and thus has no proper finite field extensions), then $\mathbb {F} [x_{1},\ldots ,x_{n}]/m=\{c+m|c\in \mathbb {F} \}$ .

Proof 1 (using Zariski's lemma):

$\mathbb {F} [x_{1},\ldots ,x_{n}]/m$ is a finitely generated $\{c+m|c\in \mathbb {F} \}$ -algebra, where all the operations are induced by the ring structure of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ ; this is because the set $\{x_{1}+m,\ldots ,x_{n}+m\}$ constitutes a set of generators, since every element in $\mathbb {F} [x_{1},\ldots ,x_{n}]/m$ can be written as polynomials in those elements over $\{c+m|c\in \mathbb {F} \}$ . Therefore, Zariski's lemma implies that $\mathbb {F} [x_{1},\ldots ,x_{n}]/m$ is a finite field extension of the field $\{c+m|c\in \mathbb {F} \}$ . $\Box$

Proof 2 (using Jacobson rings):

We proceed by induction on $n$ .

The case $n=1$ follows by noting that $\mathbb {F} [x_{1}]$ is a principal ideal domain (as a Euclidean domain) and hence, if $m\leq \mathbb {F} [x_{1}]$ is a (maximal) ideal, then $m=\langle f\rangle$ for a suitable $f\in \mathbb {F} [x_{1}]$ . Now $\mathbb {F} [x_{1}]/m$ is a field if $m$ is maximal; we claim that it is a finite field extension of the field $\{c+m|c\in \mathbb {F} \}$ . Indeed, as basis elements we may take $1+m,x_{1}+m,x_{1}^{2}+m,\ldots ,x_{1}^{d-1}+m$ , where $d:=\deg f$ is the degree of the generating polynomial of the maximal ideal $m=\langle f\rangle$ . Any element of $\mathbb {F} [x_{1}]/m$ can thus be expressed as a linear combination of these basis elements, since the relation

a_{d}x_{1}^{d}+m=-(a_{d-1}x_{1}^{d-1}+\cdots +a_{1}x_{1}+a_{0})+m

(where $f(x_{1})=a_{d}x_{1}^{d}+\cdots +a_{1}x_{1}+a_{0}$ ) allows us to express monomials of degree $\geq d$ in terms of smaller ones.

Assume now the case $n-1$ is proven. Let $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ be a maximal ideal. According to Goldman's first criterion, $\mathbb {F} [x_{1},\ldots ,x_{n-1}]$ is a Jacobson ring (since $\mathbb {F}$ is, being a field). Now $\mathbb {F} [x_{1},\ldots ,x_{n}]=\mathbb {F} [x_{1},\ldots ,x_{n-1}][x_{n}]$ and hence $m$ is a maximal ideal of $\mathbb {F} [x_{1},\ldots ,x_{n-1}][x_{n}]$ . Thus, Goldman's second criterion asserts that $m_{0}:=\mathbb {F} [x_{1},\ldots ,x_{n-1}]\cap m$ is a maximal ideal of $\mathbb {F} [x_{1},\ldots ,x_{n-1}]$ . Thus, $\mathbb {F} [x_{1},\ldots ,x_{n-1}]/m_{0}$ is a field, and, by the induction hypothesis, a finite field extension of $\{c+m_{0}|c\in \mathbb {F} \}$ .

We define the ideal $p:=m_{0}\mathbb {F} [x_{1},\ldots ,x_{n}]$ . The following map is manifestly an isomorphism:

{\begin{aligned}\varphi :\mathbb {F} [x_{1},\ldots ,x_{n-1}][x_{n}]/p&\to (\mathbb {F} [x_{1},\ldots ,x_{n-1}]/m_{0})[x_{n}]\\a_{k}x_{n}^{k}+\cdots +a_{1}x_{n}+a_{0}+p&\mapsto (a_{k}+m_{0})x_{n}^{k}+\cdots +(a_{1}+m_{0})x_{n}+(a_{0}+m_{0})\end{aligned}}

This map sends $\{c+p|c\in \mathbb {F} \}$ to $\{(c+m_{0})|c\in \mathbb {F} \}$ (and, being an isomorphism, vice versa).

Furthermore, since $p\subseteq m$ , the ideal $\pi _{p}(m)$ (the image of $m$ under the canonical projection $\pi _{p}$ onto the quotient by $p$ ) is maximal in $\mathbb {F} [x_{1},\ldots ,x_{n-1}][x_{n}]/p$ . Hence, $\varphi (\pi _{p}(m))$ is maximal in $(\mathbb {F} [x_{1},\ldots ,x_{n-1}]/m_{0})[x_{n}]$ and thus $\left((\mathbb {F} [x_{1},\ldots ,x_{n-1}]/m_{0})[x_{n}]\right){\big /}\varphi (\pi _{p}(m))$ is a field. By the case $n=1$ it is a finite field extension of the field $\{d+\varphi (\pi _{p}(m))|d\in \mathbb {F} [x_{1},\ldots ,x_{n-1}]/m_{0}\}$ .

In general, any proper ideal of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ , where $\mathbb {F}$ is a field, does not contain any constants (apart from zero), for else it would contain a unit and thus be equal to the whole of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ . This applies, in particular, to all maximal ideals of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ . Thus, elements of $\mathbb {F} [x_{1},\ldots ,x_{n}]/m$ of the form $c+m$ are distinct for pairwise distinct $c$ . By definition of addition and multiplication of residue class rings, this implies that we have an isomorphism of rings (and thus, of fields)

\mathbb {F} [x_{1},\ldots ,x_{n}]/m\supseteq \{c+m|c\in \mathbb {F} \}\cong \mathbb {F} ,c+m\mapsto c

.

Hence, in the case that $\mathbb {F}$ is algebraically closed, the above lemma implies $\mathbb {F} [x_{1},\ldots ,x_{n}]/m\cong \mathbb {F}$ via that isomorphism.

Theorem 24.6 (Hilbert's Nullstellensatz, weak form):

Let $\mathbb {F} ={\overline {\mathbb {F} }}$ be an algebraically closed field. For any $\xi =(\xi _{1},\ldots ,\xi _{n})\in \mathbb {F} ^{n}$ , set

m_{\xi }:=\langle x_{1}-\xi _{1},\ldots ,x_{n}-\xi _{n}\rangle \leq \mathbb {F} [x_{1},\ldots ,x_{n}]

;

according to lemma 21.12, $m_{\xi }$ is a maximal ideal.

The claim of the weak Hilbert's Nullstellensatz is this: Every maximal ideal $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ has the form $m_{\xi }$ for a suitable $\xi \in \mathbb {F} ^{n}$ .

Proof:

Let $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ be any maximal ideal of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ . According to the preceding lemma, and since $\mathbb {F}$ is algebraically closed, we have $\mathbb {F} [x_{1},\ldots ,x_{n}]/m\cong \mathbb {F}$ via an isomorphism that sends elements of the type $c+m$ to $c$ . Now this isomorphism must send any element of the type $x_{j}+m$ to some element $\alpha _{j}$ of $\mathbb {F}$ . But further, the element $\alpha _{j}+m$ is sent to $\alpha _{j}\in \mathbb {F}$ . Since we have an isomorphism (in particular injectivity), we have $\alpha _{j}+m=x_{j}+m\Leftrightarrow x_{j}-\alpha _{j}\in m$ . Thus $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle \subseteq m$ for suitable $\alpha _{1},\ldots ,\alpha _{n}$ . Since the ideal $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ is maximal (lemma 21.12), we have equality: $\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle =m$ . $\Box$
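
Example: The hypothesis that $\mathbb {F}$ is algebraically closed cannot be dropped. As an illustration (not part of the argument above), take $\mathbb {F} =\mathbb {R}$ and $n=1$ . The polynomial $x_{1}^{2}+1$ is irreducible over $\mathbb {R}$ , so $m=\langle x_{1}^{2}+1\rangle$ is a maximal ideal of $\mathbb {R} [x_{1}]$ , and $\mathbb {R} [x_{1}]/m\cong \mathbb {C}$ is a proper finite field extension of $\mathbb {R}$ , in accordance with lemma 24.5. But $m\neq m_{\xi }$ for every $\xi \in \mathbb {R}$ , since $x_{1}-\xi \in m$ would force $x_{1}^{2}+1$ to divide $x_{1}-\xi$ , which is impossible for degree reasons. Over $\mathbb {C}$ , by contrast, $x_{1}^{2}+1=(x_{1}-i)(x_{1}+i)$ , and the maximal ideals containing it are $m_{i}$ and $m_{-i}$ .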

Common roots form

Theorem 24.7 (Hilbert's Nullstellensatz, common root form):

Let $\mathbb {F} ={\overline {\mathbb {F} }}$ be an algebraically closed field and let $f_{1},\ldots ,f_{k}\in \mathbb {F} [x_{1},\ldots ,x_{n}]$ . If

\langle f_{1},\ldots ,f_{k}\rangle \subsetneq \mathbb {F} [x_{1},\ldots ,x_{n}]

,

then there exists $\xi =(\xi _{1},\ldots ,\xi _{n})\in \mathbb {F} ^{n}$ such that $f_{1}(\xi )=f_{2}(\xi )=\ldots =f_{k}(\xi )=0$ .

Proof:

This follows from the weak form, since $\langle f_{1},\ldots ,f_{k}\rangle$ is contained within some maximal ideal $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ , which by the weak form has the form $m=\langle x_{1}-\alpha _{1},\ldots ,x_{n}-\alpha _{n}\rangle$ for suitable $\alpha _{1},\ldots ,\alpha _{n}\in \mathbb {F}$ and hence $\{(\alpha _{1},\ldots ,\alpha _{n})\}=V(m)\subseteq V(\langle f_{1},\ldots ,f_{k}\rangle )$ ; in particular, $(\alpha _{1},\ldots ,\alpha _{n})\in V(\langle f_{1},\ldots ,f_{k}\rangle )$ , that is, $\xi :=(\alpha _{1},\ldots ,\alpha _{n})$ is a common root of $f_{1},\ldots ,f_{k}$ . $\Box$

Strong form

Theorem 24.8 (Hilbert's Nullstellensatz, strong form):

Let $\mathbb {F} ={\overline {\mathbb {F} }}$ be an algebraically closed field. If $I\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ is an arbitrary ideal, then

I(V(I))=r(I)

;

recall: $r(I)$ is the radical of $I$ .

In particular, if $I$ is a radical ideal (that is, $r(I)=I$ ), then

I(V(I))=I

.

Note that together with the rule

V(I(V(S)))=V(S)

for any algebraic set $V(S)$ (that was established in chapter 22), this establishes a bijective correspondence between radical ideals of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ and algebraic sets in $\mathbb {F} ^{n}$ , given by the function

V(\cdot ):\{{\text{radical ideals of }}\mathbb {F} [x_{1},\ldots ,x_{n}]\}\to \{{\text{algebraic sets in }}\mathbb {F} ^{n}\}

and inverse

I(\cdot ):\{{\text{algebraic sets in }}\mathbb {F} ^{n}\}\to \{{\text{radical ideals of }}\mathbb {F} [x_{1},\ldots ,x_{n}]\}

.
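
Example: A small case shows why the radical appears (an illustration chosen here). Take $n=1$ and $I=\langle x_{1}^{2}\rangle$ . Then $V(I)=\{0\}$ , and a polynomial vanishes at $0$ exactly when it is divisible by $x_{1}$ , so

I(V(I))=\langle x_{1}\rangle =r(\langle x_{1}^{2}\rangle )\supsetneq \langle x_{1}^{2}\rangle

.

Thus $V(\langle x_{1}\rangle )=V(\langle x_{1}^{2}\rangle )$ although the two ideals differ; restricting to radical ideals is what makes $V(\cdot )$ injective.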

Proof 1 (using Jacobson rings):

Certainly, a field is a Jacobson ring. Furthermore, from Goldman's first criterion (theorem 14.4) we may infer that $\mathbb {F} [x_{1},\ldots ,x_{n}]$ is a Jacobson ring as well. Let now $f\in \mathbb {F} [x_{1},\ldots ,x_{n}]$ be a polynomial vanishing at all of $V(I)$ , and let $m\leq \mathbb {F} [x_{1},\ldots ,x_{n}]$ be any maximal ideal of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ that contains $I$ . By the weak Nullstellensatz, $m$ has the form $m_{\xi }=\langle x_{1}-\xi _{1},\ldots ,x_{n}-\xi _{n}\rangle$ for a suitable $\xi =(\xi _{1},\ldots ,\xi _{n})\in \mathbb {F} ^{n}$ .

Now we have $\xi \in V(m_{\xi })$ , since any polynomial in $m_{\xi }$ can be written as an $\mathbb {F} [x_{1},\ldots ,x_{n}]$ -linear combination of the generators $x_{1}-\xi _{1},\ldots ,x_{n}-\xi _{n}$ . Hence, $I(V(m_{\xi }))$ is not all of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ : a nonzero constant polynomial vanishes nowhere, so only the empty set has the whole ring as its vanishing ideal. This, in combination with the fact that $m_{\xi }\subseteq I(V(m_{\xi }))$ and the maximality of $m_{\xi }$ , implies $I(V(m_{\xi }))=m_{\xi }$ .

Furthermore, $V(m_{\xi })\subseteq V(I)$ , and hence $I(V(I))\subseteq I(V(m_{\xi }))$ . Therefore, $f\in I(V(m_{\xi }))=m_{\xi }$ .

Since $f\in I(V(I))$ was arbitrary, $I(V(I))$ is thus contained in all maximal ideals containing $I$ and hence, since $\mathbb {F} [x_{1},\ldots ,x_{n}]$ is Jacobson, $I(V(I))\subseteq r(I)$ . However, the other direction $r(I)\subseteq I(V(I))$ is easy to see (we will prove this in the first paragraph of the next proof; there is no need to repeat the same argument in two proofs). Thus, $I(V(I))=r(I)$ . $\Box$

Proof 2 (Rabinowitsch trick):

First we note $\supseteq$ : Indeed, if $g^{n}\in I$ , then $g(x)^{n}=0$ for all $x\in V(I)$ . Hence also $g(x)=0$ for all $x\in V(I)$ since a field does not have nilpotent elements except zero (in fact, not even zero divisors). This implies $g\in I(V(I))$ .

$\subseteq$ is the longer direction. Note that any field is Noetherian, and thus, by Hilbert's basis theorem, so is $\mathbb {F} [x_{1},\ldots ,x_{n}]$ . Hence, $I$ , being an ideal of $\mathbb {F} [x_{1},\ldots ,x_{n}]$ , is finitely generated. Write

I=\langle f_{1}(x),\ldots ,f_{k}(x)\rangle

.

Let $g\in I(V(I))$ ; we may assume $g\neq 0$ , since $0\in r(I)$ trivially. Consider the polynomial ring $\mathbb {F} [x_{1},\ldots ,x_{n},z]$ , which is augmented by an additional variable. In that ring, consider the polynomial $h(x_{1},\ldots ,x_{n},z):=1-zg(x_{1},\ldots ,x_{n})$ . The polynomials $f_{1},\ldots ,f_{k},h$ have no common zero (where the polynomials $f_{1},\ldots ,f_{k}$ are seen as polynomials in the variables $x_{1},\ldots ,x_{n},z$ by the way of $f_{j}(x_{1},\ldots ,x_{n},z):=f_{j}(x_{1},\ldots ,x_{n})$ ), since if all the polynomials $f_{1},\ldots ,f_{k}$ are zero at $(\alpha _{1},\ldots ,\alpha _{n},\beta )$ (where the variable $\beta$ does not matter for the evaluation of $f_{1},\ldots ,f_{k}$ ), then $(\alpha _{1},\ldots ,\alpha _{n})\in V(I)$ and hence $g(\alpha _{1},\ldots ,\alpha _{n})=0$ . Hence, in this case, $h(\alpha _{1},\ldots ,\alpha _{n},\beta )=1-\beta \cdot 0=1\neq 0$ .

Now we may apply the common roots form of the Nullstellensatz for the case of $n+1$ variables. The polynomials $f_{1},\ldots ,f_{k},h$ have no common zero, and therefore, the common roots form of the Nullstellensatz implies that the ideal $\langle f_{1},\ldots ,f_{k},h\rangle$ must be all of $\mathbb {F} [x_{1},\ldots ,x_{n},z]$ . In particular, we can find $\eta _{1},\ldots ,\eta _{k},\mu \in \mathbb {F} [x_{1},\ldots ,x_{n},z]$ such that

1=\eta _{1}(x_{1},\ldots ,x_{n},z)f_{1}(x_{1},\ldots ,x_{n},z)+\cdots +\eta _{k}(x_{1},\ldots ,x_{n},z)f_{k}(x_{1},\ldots ,x_{n},z)+\mu (x_{1},\ldots ,x_{n},z)h(x_{1},\ldots ,x_{n},z)

.

Passing to the field of rational functions $\mathbb {F} (x_{1},\ldots ,x_{n})$ , we may insert ${\frac {1}{g(x_{1},\ldots ,x_{n})}}$ for $z$ (recall that $g\neq 0$ ) to obtain

1=\eta _{1}(x_{1},\ldots ,x_{n},1/g)f_{1}(x_{1},\ldots ,x_{n},1/g)+\cdots +\eta _{k}(x_{1},\ldots ,x_{n},1/g)f_{k}(x_{1},\ldots ,x_{n},1/g)+\mu (x_{1},\ldots ,x_{n},1/g)h(x_{1},\ldots ,x_{n},1/g)

,

where we left out the variables of $g$ for brevity. Now $h(x_{1},\ldots ,x_{n},1/g)=1-{\frac {g(x_{1},\ldots ,x_{n})}{g(x_{1},\ldots ,x_{n})}}=0$ , whence

1=\eta _{1}(x_{1},\ldots ,x_{n},1/g)f_{1}(x_{1},\ldots ,x_{n},1/g)+\cdots +\eta _{k}(x_{1},\ldots ,x_{n},1/g)f_{k}(x_{1},\ldots ,x_{n},1/g)

.

Multiplying this equation by $g^{N}$ , where $N\in \mathbb {N}$ is chosen sufficiently large that all denominators are cleared, and noting that the last variable does not matter for $f_{1},\ldots ,f_{k}$ , yields that $g^{N}$ equals an $\mathbb {F} [x_{1},\ldots ,x_{n}]$ -linear combination of $f_{1},\ldots ,f_{k}$ and is thus contained within $I$ . Hence, $g\in r(I)$ . $\Box$
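As a sanity check of the construction, consider the hypothetical toy case $n=1$ , $I=\langle x^{2}\rangle$ and $g=x$ (chosen here, not taken from the text). Then $h=1-zx$ , and one possible certificate is $1=z^{2}\cdot x^{2}+(1+zx)(1-zx)$ , that is, $\eta _{1}=z^{2}$ and $\mu =1+zx$ . Substituting $z=1/x$ kills the second summand, and multiplying by $x^{2}$ gives $x^{2}=1\cdot x^{2}\in I$ , so $N=2$ works. The sketch below checks the certificate at rational sample points using exact arithmetic; the function name `certificate_value` is an assumption made for illustration.

```python
from fractions import Fraction


def certificate_value(x, z):
    """Evaluate eta_1 * f_1 + mu * h for the toy data I = <x^2>, g = x."""
    f1 = x * x          # the single generator of I
    h = 1 - z * x       # Rabinowitsch polynomial h = 1 - z * g
    eta1 = z * z        # certificate coefficients chosen by hand
    mu = 1 + z * x
    return eta1 * f1 + mu * h


# Exact rational sample points; no floating point error is involved.
samples = [Fraction(a, b) for a in range(-3, 4) for b in (1, 2, 3)]

# The certificate equals 1 at every sampled pair (x, z).
identity_ok = all(certificate_value(x, z) == 1 for x in samples for z in samples)

# After substituting z = 1/g, the polynomial h vanishes wherever g is nonzero.
h_vanishes = all(1 - (1 / x) * x == 0 for x in samples if x != 0)
```

The sample points only test the identity; the identity itself follows by multiplying out $(1+zx)(1-zx)=1-z^{2}x^{2}$ .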

Note how Yuri Rainich ("Rabinowitsch") may have found this trick. Perhaps he realized that the weak Nullstellensatz is a claim for arbitrary $n$ , and for the proof of the strong Nullstellensatz, we can do one $n$ at a time, using the infinitude of cases of the common roots form Nullstellensatz. That is, compared to a particular dimensional case in the strong Nullstellensatz, the infinitude of cases for the common roots form Nullstellensatz are not so weak at all, despite the common roots form being a consequence of the weak Nullstellensatz. This could have given Rainich evidence that using more cases, one obtains a stronger tool. And indeed, it worked out.