# Mathematical Proof/Functions

## Introduction

In the previous chapter, we have discussed the concept of relations. There are not much restrictions on the relations. For instance, for a relation from set ${\displaystyle A}$ to set ${\displaystyle B}$, every element of ${\displaystyle A}$ can be related to no elements of ${\displaystyle B}$, exactly one element of ${\displaystyle B}$, or multiple elements of ${\displaystyle B}$. In this chapter, we will focus on a particular type of relations from set ${\displaystyle A}$ to set ${\displaystyle B}$ where every element in ${\displaystyle A}$ is related to exactly one (or an unique) element of ${\displaystyle B}$.

You should have encountered the concept of functions before, e.g. a function defined by ${\displaystyle f(x)=x^{2}}$ gives you a value of ${\displaystyle f(x)}$ for every given value of ${\displaystyle x}$. The functions you have encountered are likely to be in the above form, i.e. in the form of an equation. Also, the value of ${\displaystyle x}$ and ${\displaystyle f(x)}$ are likely to be real numbers.

But we can generalize such ideas to more general situations where ${\displaystyle x}$ and ${\displaystyle f(x)}$ are not necessarily real numbers (e.g., they can be elements of certain sets), and the functions need not be expressed by a formula like above. As a result, the concept and definition of functions discussed here may seem foreign to you, and look quite different from what you have previously learnt about functions.

## Definitions

Definition. (Functions) Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets. A function or mapping from set ${\displaystyle A}$ to set ${\displaystyle B}$, written ${\displaystyle f:A\to B}$, is a relation from set ${\displaystyle A}$ to set ${\displaystyle B}$ such that for every element ${\displaystyle a\in A}$, there exists a unique ${\displaystyle b\in B}$ such that ${\displaystyle (a,b)\in f}$.

Remark.

• The notation "${\displaystyle (a,b)\in f}$" may seem weird at the first place. However, when we think about what ${\displaystyle f}$ represents here, it becomes more natural: since ${\displaystyle f}$ is a relation from set ${\displaystyle A}$ to set ${\displaystyle B}$, it is indeed a subset of ${\displaystyle A\times B}$. So, it makes sense to say ${\displaystyle (a,b)}$ is an element of ${\displaystyle f}$.
• However, it is much more common to write ${\displaystyle f(a)=b}$ instead (and you should have seen this notation before).
• Since ${\displaystyle f}$ is a relation, we can apply the terminologies to functions: the set ${\displaystyle A}$ is the domain of function ${\displaystyle f}$, denoted by ${\displaystyle \operatorname {dom} f}$.
• We also have some more terminologies for functions: The set ${\displaystyle B}$ is called the codomain of ${\displaystyle f}$, ${\displaystyle b=f(a)}$ is the image of ${\displaystyle a}$ under ${\displaystyle f}$. Also, ${\displaystyle a}$ is referred to as the pre-image of ${\displaystyle b}$ under ${\displaystyle f}$. Furthermore, we say that ${\displaystyle f}$ maps ${\displaystyle a}$ into ${\displaystyle b}$.
• From the definition, every element ${\displaystyle a\in A}$ has an unique (or exactly one) image.
• When the set ${\displaystyle A}$ is empty, then the only function (and also the only relation) from set ${\displaystyle A}$ to ${\displaystyle B}$ is the empty set. When set ${\displaystyle A}$ is nonempty while the set ${\displaystyle B}$ is empty, since the only relation from set ${\displaystyle A}$ to ${\displaystyle B}$ is also the empty set, but such relation does not satisfy the requirements to be a function (there does not exist ${\displaystyle b\in B}$ such that ${\displaystyle (a,b)\in f}$ for every ${\displaystyle a\in A}$).

Exercise. Write down the meaning of "${\displaystyle f}$ is not a function from ${\displaystyle A}$ to ${\displaystyle B}$", i.e., the negation of the definition above. (Hint: first split the unique existential quantifier into two the existence part and uniqueness part.)

Solution

First, we express the meaning of "${\displaystyle f}$ is a function from ${\displaystyle A}$ to ${\displaystyle B}$" using logical notations:

${\displaystyle {\big (}\forall a\in A,\exists b\in B,f(a)=b{\big )}{\text{ and }}{\big (}\forall a\in A,\forall b,c\in B,{\text{if }}f(a)=b{\text{ and }}f(a)=c,{\text{ then }}b=c{\big )}.}$
Then, the meaning of "${\displaystyle f}$ is not a function from ${\displaystyle A}$ to ${\displaystyle B}$" is:
${\displaystyle {\big (}\exists a\in A,\forall b\in B,f(a)\neq b{\big )}{\text{ or }}{\big (}\exists a\in A,\exists b,c\in B,f(a)=b{\text{ and }}f(a)=c{\text{ and }}b\neq c{\big )}.}$
(LHS means the existential requirement is violated, and RHS means the uniqueness requirement is violated.) In other words, the meaning is "there exists some ${\displaystyle a\in A}$ with no image, OR there exists some ${\displaystyle a\in A}$ with more than one image".

Example. Let ${\displaystyle A=\{1,2,3\}}$ and ${\displaystyle B=\{a,b\}}$. Then,

${\displaystyle f=\{(1,a),(2,a),(3,b)\}}$
defines a function. Graphically, it looks like

  A               B
*----*          *----*
|1*  #
|    |  #       |    |
|    |       #  |    |
|    |          |#   |
|2* # # # # # #  # *a|
|    |          |    |
|3* # # # # # #  # *b|
|    |          |    |
*----*          *----*


Here, ${\displaystyle a}$ is the image of 1 and 2 under ${\displaystyle f}$. In other words, 1 and 2 are the pre-images of ${\displaystyle a}$. Similarly, ${\displaystyle b}$ is the image of 3 under ${\displaystyle f}$ and 3 is the pre-image of ${\displaystyle b}$ under ${\displaystyle f}$.

Example. Let ${\displaystyle A=\{a,b,c,d\}}$ and ${\displaystyle B=\{1,2,3,4,5,6\}}$. Then, ${\displaystyle f=\{(a,1),(a,2),(b,3),(c,4),(d,5)\}}$ is not a function from ${\displaystyle A}$ to ${\displaystyle B}$ since ${\displaystyle a}$ has two images: 1 and 2. Also, ${\displaystyle f=\{(a,1),(b,3)\}}$ is not a function from ${\displaystyle A}$ to ${\displaystyle B}$ since ${\displaystyle c}$ and ${\displaystyle d}$ have no image.

Example. It is quite common to refer, say, "${\displaystyle f(x)=x^{2}}$" to as a "function". But, when we look at the definition of function, such saying actually has some issues, strictly speaking:

1. What are the domain and codomain?
2. ${\displaystyle f(x)}$ is just an image of a real number ${\displaystyle x}$ under ${\displaystyle f}$. The function ${\displaystyle f}$ itself should be a set.

For the first question, usually the domain and codomain are taken to be ${\displaystyle \mathbb {R} }$. But for some functions, ${\displaystyle f(x)}$ becomes undefined for some values of real number ${\displaystyle x}$. In this case, it is common to set the domain as ${\displaystyle \{x:f(x){\text{ is a real number}}\}}$, which is referred to as the natural domain.

For the second question, the actual function ${\displaystyle f}$ is indeed the set

${\displaystyle f=\{(x,x^{2}):x\in \mathbb {R} \}.}$
This set of point in the plane is the graph of ${\displaystyle f}$ (a parabola). However, it is acceptable to have such saying for convenience.

Exercise. A commonly encountered function is the exponential function, which is defined by ${\displaystyle g(x)=e^{x}}$. Another one is the natural logarithm function defined by ${\displaystyle h(x)=\ln x}$.

(a) Find the domain and codomain of the functions ${\displaystyle g}$ and ${\displaystyle h}$.

(b) Express the functions ${\displaystyle g}$ and ${\displaystyle h}$ using sets.

Solution

(a) The domain and codomain of ${\displaystyle g}$ are both ${\displaystyle \mathbb {R} }$. The domain of ${\displaystyle h}$ is ${\displaystyle (0,\infty )}$ (when ${\displaystyle x}$ is less than or equal to zero, ${\displaystyle \ln x}$ is undefined), and the codomain of ${\displaystyle h}$ is ${\displaystyle \mathbb {R} }$.

(b) ${\displaystyle g=\{(x,e^{x}):x\in \mathbb {R} \}}$ and ${\displaystyle h=\{(x,\ln x):x\in (0,\infty )\}}$.

We should be aware that the codomain of a function is not necessarily the same as the range of a function. The following is the definition of the range of a function (same as the definition for a relation).

Definition. (Range of function) The range of a function ${\displaystyle f}$, denoted by ${\displaystyle \operatorname {ran} f}$, is

${\displaystyle \operatorname {ran} f=\{b\in B:b=f(x){\text{ for some }}x\in A\}=\{f(x):x\in A\}\subseteq B.}$

Remark.

• Comparing the definitions of range and codomain, we can notice that for the range, it includes all elements in the codomain that have at least one pre-image. On the other hand, the codomain is the set that contains everything into which every element ${\displaystyle a\in A}$ can be possibly mapped.
• When the range and the codomain of a function ${\displaystyle f}$ turn out to be the same, we call such function ${\displaystyle f}$ to be surjective. We will discuss more properties for a function later.
• We can observe that we can actually manually set (or restrict) the codomain of function to be its range. So, even if a given function is not surjective, we can make it surjective by changing its codomain manually.

Example. Let ${\displaystyle A=\{1,2\}}$ and ${\displaystyle B=\{3,4,5\}}$. Define the function ${\displaystyle f}$ by

${\displaystyle f=\{(1,3),(2,4)\}.}$
Then, ${\displaystyle \operatorname {ran} f=\{3,4\}.}$

Exercise. Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets, and ${\displaystyle b}$ be a fixed element of ${\displaystyle B}$. Then, the function ${\displaystyle f:A\to B}$ defined by ${\displaystyle f(a)=b}$ for every ${\displaystyle a\in A}$ is called the constant function from ${\displaystyle A}$ to ${\displaystyle B}$.

(a) What is the range of constant function?

(b) What is the codomain of constant function?

(c) Can the range of constant function equal its codomain?

(d) What is/are the pre-image(s) of ${\displaystyle b}$?

Solution

(a) The range is ${\displaystyle \{b\}}$.

(b) The codomain is the set ${\displaystyle B}$.

(c) Yes, if ${\displaystyle B=\{b\}}$.

(d) The pre-images of ${\displaystyle b}$ are all elements of ${\displaystyle A}$.

Example. (Identity function) Let ${\displaystyle A}$ be a set. Then the function ${\displaystyle id_{A}}$ defined by ${\displaystyle id_{A}(a)=a}$ for every ${\displaystyle a\in A}$ is called the identity function of ${\displaystyle A}$. Every ${\displaystyle a\in A}$ is the image of itself. The domain, codomain, and range of ${\displaystyle id_{A}}$ are all ${\displaystyle A}$.

Remark.

• This function is quite important in the section about inverse functions.

Example. Consider the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ defined by ${\displaystyle f(x)=e^{x}}$. Find the range of the function ${\displaystyle f}$, ${\displaystyle \operatorname {ran} f}$, and prove your answer.

The range of the function ${\displaystyle f}$ is ${\displaystyle \operatorname {ran} f=(0,\infty )}$.

Proof. To prove that ${\displaystyle \operatorname {ran} f=(0,\infty )}$, we will prove (i) ${\displaystyle \operatorname {ran} f\subseteq [0,\infty )}$; (ii) ${\displaystyle [0,\infty )\subseteq \operatorname {ran} f}$.

Firstly, for ${\displaystyle \operatorname {ran} f\subseteq (0,\infty )}$: for every ${\displaystyle y}$,

{\displaystyle {\begin{aligned}y\in \operatorname {ran} f&\implies y=e^{x}{\text{ for some }}x\in \mathbb {R} \\&\implies y>0&({\text{we do not have the reverse implication for this step}})\\&\implies y\in (0,\infty ).\end{aligned}}}
On the other hand, for ${\displaystyle (0,\infty )\subseteq \operatorname {ran} f}$: for every ${\displaystyle y\in (0,\infty )}$, we choose ${\displaystyle x=\ln y\in \mathbb {R} }$. Then, ${\displaystyle f(x)=e^{x}=e^{\ln y}=y}$. So, by definition, we have ${\displaystyle y\in \operatorname {ran} f}$.

${\displaystyle \Box }$

Exercise. Consider the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ defined by ${\displaystyle f(x)=x^{2}}$. Find ${\displaystyle \operatorname {ran} f}$, and prove your answer.

Solution

We have ${\displaystyle \operatorname {ran} f=[0,\infty )}$.

Proof. First, for every ${\displaystyle y}$,

{\displaystyle {\begin{aligned}y\in \operatorname {ran} f&\implies y=x^{2}{\text{ for some }}x\in \mathbb {R} \\&\implies y\geq 0\\&\implies y\in [0,\infty ).\end{aligned}}}
This shows that ${\displaystyle \operatorname {ran} f\subseteq [0,\infty )}$.

On the other hand, for every ${\displaystyle y\in [0,\infty )}$, we choose ${\displaystyle x={\sqrt {y}}\in \mathbb {R} }$. Then, ${\displaystyle f(x)=x^{2}=\left({\sqrt {y}}\right)^{2}=y}$ . So, we have ${\displaystyle y\in \operatorname {ran} f}$. This shows that ${\displaystyle [0,\infty )\subseteq \operatorname {ran} f}$, and we are done.

${\displaystyle \Box }$

Since the domain and codomain, in addition to the "way" of the mapping, all affect the "behaviour" and properties of a function, it is natural to incorporate all these features into the definition of equality of functions:

Definition. (Equality of functions) Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets, and let ${\displaystyle f:A\to B}$ and ${\displaystyle g:A\to B}$ be two functions from ${\displaystyle A}$ to ${\displaystyle B}$ (notice that their domains and codomains have to be the same). Then, the functions ${\displaystyle f}$ and ${\displaystyle g}$ are said to be equal, written ${\displaystyle f=g}$, if ${\displaystyle f(a)=g(a)}$ for every ${\displaystyle a\in A}$.

Exercise. Write down the meaning for two functions ${\displaystyle f}$ and ${\displaystyle g}$ to be not equal.

Solution

(${\displaystyle f}$ and ${\displaystyle g}$ have different domain) or (${\displaystyle f}$ and ${\displaystyle g}$ have different codomain) or (there exists ${\displaystyle a\in A}$ such that ${\displaystyle f(a)\neq g(a)}$).

Example. Consider the functions ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ and ${\displaystyle g:\mathbb {R} \to (0,\infty )}$ defined by ${\displaystyle f(x)=e^{x}}$ and ${\displaystyle g(x)=e^{x}}$. Although their "formulas" are the same, they are not equal since their codomains are different.

Exercise. Consider the functions ${\displaystyle f}$ and ${\displaystyle g}$ defined by ${\displaystyle f(x)=x}$ and ${\displaystyle g(x)={\frac {x^{2}}{x}}}$. Are the functions ${\displaystyle f}$ and ${\displaystyle g}$ equal? Why?

Solution

No.

Notice hat ${\displaystyle g(x)}$ is undefined when ${\displaystyle x=0}$ (and defined for every other real number ${\displaystyle x}$), while ${\displaystyle f(x)}$ is defined for every real number ${\displaystyle x}$. Thus, the domain of ${\displaystyle f}$ is ${\displaystyle \mathbb {R} }$ and the domain of ${\displaystyle g}$ is ${\displaystyle \mathbb {R} \setminus \{0\}}$. So, ${\displaystyle f}$ and ${\displaystyle g}$ are not equal.

## Injective, surjective and bijective functions

In this section, we will discuss some important properties that a function may possess, namely injectivity, surjectivity, and bijectivity.

Definition. (Injective function) A function ${\displaystyle f:A\to B}$ is injective or one-to-one if for every ${\displaystyle x,y\in A}$, ${\displaystyle f(x)=f(y)\implies x=y}$, or equivalently (contrapositive), for every ${\displaystyle x,y\in A}$, ${\displaystyle x\neq y\implies f(x)\neq f(y)}$.

Remark.

• The definition tells us that an injective function from ${\displaystyle A}$ to ${\displaystyle B}$ maps distinct elements of ${\displaystyle A}$ into distinct elements of ${\displaystyle B}$.
• To prove that a function is injective, it is common to use a direct proof: for every ${\displaystyle x,y\in A}$, assume ${\displaystyle f(x)=f(y)}$, and then proceed to prove that ${\displaystyle x=y}$.

Exercise. Write down the meaning of "a function ${\displaystyle f:A\to B}$ is not injective".

Solution

There exist ${\displaystyle x,y\in A}$ such that ${\displaystyle f(x)=f(y)}$ and ${\displaystyle x\neq y}$.

Example. A function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ is defined by ${\displaystyle f(x)=2x+5}$. Prove that ${\displaystyle f}$ is injective.

Proof. For every ${\displaystyle x,y\in \mathbb {R} }$,

${\displaystyle f(x)=f(y)\implies 2x+5=2y+5\implies 2x=2y\implies x=y.}$
Thus, ${\displaystyle f}$ is injective.

${\displaystyle \Box }$

Example. A function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ is defined by ${\displaystyle f(x)=x^{2}}$. Prove or disprove that ${\displaystyle f}$ is injective.

Disproof. Since ${\displaystyle f(-1)=f(1)=1}$, ${\displaystyle f}$ is not injective.

${\displaystyle \Box }$

Exercise. A function ${\displaystyle g:[0,\infty )\to \mathbb {R} }$ is defined by ${\displaystyle g(x)=x^{2}}$. Prove that ${\displaystyle g}$ is injective.

Proof

Proof. For every ${\displaystyle x,y\in [0,\infty )}$,

${\displaystyle g(x)=g(y)\implies x^{2}=y^{2}\implies {\sqrt {x^{2}}}={\sqrt {y^{2}}}\implies |x|=|y|\implies x=y.}$

${\displaystyle \Box }$

Remark.

• It suggests that the domain of a function is important when we determine whether a function is injective. We should not just look at the formula of the function.

Example. A function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ is defined by ${\displaystyle f(x)=x^{2}-3x+5}$. Prove or disprove that ${\displaystyle f}$ is injective.

This time, it is not immediately clear that whether this function is injective or not. So, we may first try to prove that ${\displaystyle f}$ is injective, and see what we get: for every ${\displaystyle x,y\in \mathbb {R} }$,

${\displaystyle f(x)=f(y)\implies x^{2}-3x+5=y^{2}-3y+5\implies x(x-3)=y(y-3).}$
Notice that for the last equation to imply ${\displaystyle x=y}$, we need ${\displaystyle x\neq 3}$ and ${\displaystyle y\neq 3}$. However, ${\displaystyle x}$ and ${\displaystyle y}$ can take any real values. Hence, this suggests us to consider ${\displaystyle f(3)}$ and disprove that ${\displaystyle f}$ is injective. Another observation is that when ${\displaystyle x=3}$, then ${\displaystyle x(x-3)=0}$. But to make it equal zero, we can also take ${\displaystyle x=0}$. Thus, this gives us a counterexample for disproving that ${\displaystyle f}$ is injective:

Disproof. Since ${\displaystyle f(0)=f(3)=5}$, ${\displaystyle f}$ is not injective.

${\displaystyle \Box }$

Exercise. A function ${\displaystyle f:[-1,1]\to [0,1]}$ is defined by ${\displaystyle f(x)={\sqrt {1-x^{2}}}}$. Prove or disprove that ${\displaystyle f}$ is injective.

Solution

Disproof. Since ${\displaystyle f(-1)=f(1)=0}$, ${\displaystyle f}$ is not injective.

${\displaystyle \Box }$

Remark.

• The graph of this function is a semicircle located above the ${\displaystyle x}$-axis.

Definition. (Surjective function) A function ${\displaystyle f:A\to B}$ is surjective or onto if for every ${\displaystyle y\in B}$, there exists ${\displaystyle x\in A}$ such that ${\displaystyle f(x)=y}$.

Remark.

• The definition tells us that ${\displaystyle f}$ is surjective if every element of the codomain ${\displaystyle B}$ is the image of some element of ${\displaystyle A}$. In other words, ${\displaystyle \operatorname {ran} f=B}$.

Exercise. Write down the meaning of "${\displaystyle f:A\to B}$ is not surjective".

Solution

There exists ${\displaystyle y\in B}$ such that for every ${\displaystyle x\in A}$, ${\displaystyle f(x)\neq y}$.

In other words, some element of the codomain ${\displaystyle B}$ is not the image of every element of ${\displaystyle A}$. In other words, ${\displaystyle \operatorname {ran} f\neq B}$ (since ${\displaystyle \operatorname {ran} f\subseteq B}$ by definition, we may also write this as ${\displaystyle \operatorname {ran} f\subset B}$).

Example. A function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ is defined by ${\displaystyle f(x)=2x+5}$. Prove that ${\displaystyle f}$ is surjective.

To prove that ${\displaystyle f}$ is surjective, we need to prove that for every ${\displaystyle y\in \mathbb {R} }$, there exists ${\displaystyle x\in \mathbb {R} }$ such that ${\displaystyle f(x)=y}$, i.e., ${\displaystyle y=2x+5}$. To show this, we can see that the choices of ${\displaystyle x\in \mathbb {R} }$ to satisfy the requirement for different values of ${\displaystyle y}$ are different. Indeed, the choice of ${\displaystyle x}$ depend on the value of ${\displaystyle y\in \mathbb {R} }$. But it is not hard to choose such ${\displaystyle x\in \mathbb {R} }$ with a given value of ${\displaystyle y\in \mathbb {R} }$: we can just rearrange the above equation to make ${\displaystyle x}$ to be the subject of the equation:

${\displaystyle x={\frac {y-5}{2}}\in \mathbb {R} }$

Proof. For every ${\displaystyle y\in \mathbb {R} }$, choose ${\displaystyle x={\frac {y-5}{2}}\in \mathbb {R} }$. Then,

${\displaystyle f(x)=f\left({\frac {y-5}{2}}\right)=2\left({\frac {y-5}{2}}\right)+5=y-5+5=y.}$

${\displaystyle \Box }$

Example. Prove or disprove that the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ defined by ${\displaystyle f(x)=x^{2}}$ is surjective.

Disproof. Take ${\displaystyle y=-1}$. Then, there is no ${\displaystyle x\in \mathbb {R} }$ such that

${\displaystyle x^{2}=f(x)=-1}$
since ${\displaystyle x^{2}=-1}$ has no real solution. Hence, ${\displaystyle f}$ is not surjective.

${\displaystyle \Box }$

Exercise. Prove that the function ${\displaystyle g:\mathbb {R} \to [0,\infty )}$ defined by ${\displaystyle g(x)=x^{2}}$ is surjective.

Proof

Proof. For every ${\displaystyle y\in [0,\infty )}$, choose ${\displaystyle x={\sqrt {y}}\in \mathbb {R} }$ (it is possible to choose such real number ${\displaystyle x}$ since ${\displaystyle y\in [0,\infty )}$). Then,

${\displaystyle f(x)=f\left({\sqrt {y}}\right)=\left({\sqrt {y}}\right)^{2}=y}$

${\displaystyle \Box }$

Remark.

• It suggests that the codomain of a function is important when we determine whether a function is surjective. So, again, we should not just look at the formula of the function.

Definition. (Bijective function) A function ${\displaystyle f:A\to B}$ is bijective or one-to-one correspondence if it is both injective and surjective.

Remark.

• We should be aware that "one-to-one correspondence" (which means bijective) is different from "one-to-one" (which means injective).
• In other words, ${\displaystyle f}$ is bijective if for every ${\displaystyle y\in B}$, there exists a unique ${\displaystyle x\in A}$ such that ${\displaystyle f(x)=y}$ (the existence part corresponds to the surjectivity, and the uniqueness part corresponds to the injectivity).
• However, we usually still prove the bijectivity by proving the injectivity and surjectivity separately, similar to the case for proving "there exists a unique ..." where we prove the existence part and uniqueness part separately.

Exercise. Write down the meaning of "${\displaystyle f:A\to B}$ is not bijective".

Solution

${\displaystyle f}$ is not injective or ${\displaystyle f}$ is not surjective.

Example. We have proved the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ defined by ${\displaystyle f(x)=2x+5}$ is both injective and surjective. Hence, it is bijective.

Example. We have proved the function ${\displaystyle g_{1}:[0,\infty )\to \mathbb {R} }$ defined by ${\displaystyle g_{1}(x)=x^{2}}$ is injective and the function ${\displaystyle g_{2}:\mathbb {R} \to [0,\infty )}$ defined by ${\displaystyle g_{2}(x)=x^{2}}$ is surjective. Then, with basically the same proof, we can prove that the function ${\displaystyle g_{3}:[0,\infty \to [0,\infty )}$ defined by ${\displaystyle g_{3}(x)=x^{2}}$ is both injective or surjective, and hence bijective.

Example. Prove that the function ${\displaystyle f:\mathbb {R} \setminus \{1\}\to \mathbb {R} \setminus \{3\}}$ defined by ${\displaystyle f(x)={\frac {3x+1}{x-1}}}$ is bijective.

For proving the surjective part, we need to make ${\displaystyle x}$ as the subject for the equaton ${\displaystyle y={\frac {3x+1}{x-1}}}$:

${\displaystyle y={\frac {3x+1}{x-1}}\iff xy-y=3x+1\iff (y-3)x=y+1\iff x={\frac {y+1}{y-3}}.}$

Proof.

Injective: For every ${\displaystyle x,y\in \mathbb {R} \setminus \{1\}}$,

${\displaystyle f(x)=f(y)\implies {\frac {3x+1}{x-1}}={\frac {3y+1}{y-1}}\implies 3x+1=3y+1\implies 3x=3y\implies x=y.}$
Surjective: For every ${\displaystyle y\in \mathbb {R} \setminus \{3\}}$, choose ${\displaystyle x={\frac {y+1}{y-3}}}$. We then need to show that ${\displaystyle x\in \mathbb {R} \setminus \{1\}}$. Since ${\displaystyle x\in \mathbb {R} }$ (${\displaystyle y\neq 3}$, so ${\displaystyle x}$ is defined), it suffices to show that ${\displaystyle x\neq 1}$. Let us prove it by contradiction:

Assume to the contrary that ${\displaystyle x=1}$. Then, ${\displaystyle {\frac {y+1}{y-3}}=1\implies y+1=y-3\implies 1=-3}$. So we arrive at a contradiction.

Then, we have

${\displaystyle f(x)=f\left({\frac {y+1}{y-3}}\right)={\frac {3\left({\frac {y+1}{y-3}}\right)+1}{{\frac {y+1}{y-3}}-1}}={\frac {3y+3+y-3}{y+1-y+3}}={\frac {4y}{4}}=y.}$

${\displaystyle \Box }$

Exercise. Let ${\displaystyle A}$ be a set. Prove that the identity function ${\displaystyle id_{A}:A\to A}$ is bijective.

Proof

Proof.

Injective: For every ${\displaystyle x,y\in A}$, ${\displaystyle id_{A}(x)=id_{A}(y)\implies x=y}$.

Surjective: For every ${\displaystyle y\in A}$, choose ${\displaystyle x=y\in A}$. Then, ${\displaystyle id_{A}(x)=id_{A}(y)=y}$.

${\displaystyle \Box }$

Exercise. A function ${\displaystyle f:\mathbb {Z} \times \mathbb {Z} \to \mathbb {Z} \times \mathbb {Z} }$ is defined by

${\displaystyle f(m,n)=(n,m).}$
Prove or disprove that ${\displaystyle f}$ is bijective.

Solution

Proof.

Injective: For every ${\displaystyle (m_{1},n_{1}),(m_{2},n_{2})\in \mathbb {Z} \times \mathbb {Z} }$,

${\displaystyle f(m_{1},n_{1})=f(m_{2},n_{2})\implies (n_{1},m_{1})=(n_{2},m_{2})\implies n_{1}=n_{2}{\text{ and }}m_{1}=m_{2}\implies (m_{1},n_{1})=(m_{2},n_{2}).}$

Surjective: For every ${\displaystyle (x,y)\in \mathbb {Z} \times \mathbb {Z} }$, choose ${\displaystyle (y,x)\in \mathbb {Z} \times \mathbb {Z} }$. Then,

${\displaystyle f(y,x)=(x,y).}$

${\displaystyle \Box }$

Exercise. Consider a function ${\displaystyle f:\mathbb {N} \times \mathbb {N} \to \mathbb {N} }$ defined by ${\displaystyle f(m,n)=m+n}$, and another function ${\displaystyle g:\mathbb {N} \times \mathbb {N} \to \mathbb {N} }$ defined by ${\displaystyle g(m,n)=mn}$.

(a) Prove or disprove that ${\displaystyle f}$ is (i) injective; (ii) surjective.

(b) Prove or disprove that ${\displaystyle g}$ is (i) injective; (ii) surjective.

Solution
(a) (i)

Disproof. Since ${\displaystyle f(1,2)=f(2,1)=3}$, ${\displaystyle f}$ is not injective.

${\displaystyle \Box }$

(ii)

Disproof. Take ${\displaystyle y=1}$. Then, for every ${\displaystyle (m,n)\in \mathbb {N} \times \mathbb {N} }$, ${\displaystyle m+n\neq y=1}$.

${\displaystyle \Box }$

(b) (i)

Disproof. Since ${\displaystyle f(1,2)=f(2,1)=2}$, ${\displaystyle f}$ is not injective.

${\displaystyle \Box }$

(ii)

Proof. For every ${\displaystyle y\in \mathbb {N} }$, choose ${\displaystyle (m,n)=(1,y)\in \mathbb {N} \times \mathbb {N} }$. Then, ${\displaystyle f(m,n)=1(y)=y}$.

${\displaystyle \Box }$

## Composition of functions

Let ${\displaystyle A,B}$ and ${\displaystyle C}$ be nonempty sets, and let ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ be two functions. In this section, we are going to discuss a way to create a new function from "combining" the two functions ${\displaystyle f}$ and ${\displaystyle g}$, called their composition:

Definition. (Composition) Let ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ be two functions, where ${\displaystyle A,B}$ and ${\displaystyle C}$ are nonempty sets. Then, the composition of ${\displaystyle f}$ and ${\displaystyle g}$, denoted by ${\displaystyle g\circ f}$ (read "g-circle-f"), is the function from ${\displaystyle A}$ to ${\displaystyle C}$ defined by

${\displaystyle (g\circ f)(a)=g(f(a)){\text{ for every }}a\in A.}$

Remark.

• We can see that the composition ${\displaystyle g\circ f}$ is obtained by first applying ${\displaystyle f}$ and then applying ${\displaystyle g}$.

Example. Let ${\displaystyle A=\{1,2,3\},B=\{a,b,c,d\}}$ and ${\displaystyle C=\{4,5\}}$. Consider the functions ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ defined by

${\displaystyle f=\{(1,a),(2,b),(3,b)\}{\text{ and }}g=\{((a,4),(b,5),(c,5),(d,5)\}.}$
Then, we have ${\displaystyle (g\circ f)(1)=4,(g\circ f)(2)=5}$ and ${\displaystyle (g\circ f)(3)=5}$. So, the function ${\displaystyle g\circ f:A\to C}$ is given by
${\displaystyle g\circ f=\{(1,4),(2,5),(3,5)\}.}$
However, ${\displaystyle f\circ g}$ is undefined, since the codomain of ${\displaystyle g}$ (${\displaystyle C}$) is different from the domain of ${\displaystyle f}$ (${\displaystyle A}$).

Remark.

• To make both ${\displaystyle g\circ f}$ and ${\displaystyle f\circ g}$ defined, we need to have ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to A}$, that is, the domain (codomain) of ${\displaystyle f}$ must be equal to the codomain (domain) of ${\displaystyle g}$.
• In this case, ${\displaystyle g\circ f}$ is a function from ${\displaystyle B}$ to ${\displaystyle B}$, and ${\displaystyle f\circ g}$ is a function from ${\displaystyle A}$ to ${\displaystyle A}$. So, in order for them to be possibly equal, we need to further have ${\displaystyle A=B}$.
• But, even if ${\displaystyle A=B}$, it is still possible that ${\displaystyle f\circ g\neq g\circ f}$. Consider the following example.

Example. Let ${\displaystyle A=\{1,2\}}$. Consider the functions ${\displaystyle f:A\to A}$ and ${\displaystyle g:A\to A}$ defined by

${\displaystyle f=\{(1,2),(2,2)\}{\text{ and }}g=\{(1,1),(2,1)\}.}$
Then, ${\displaystyle g\circ f=\{(1,1),(2,1)\}}$ and ${\displaystyle f\circ g=\{(1,2),(2,2)\}}$. Since, for example, ${\displaystyle (g\circ f)(1)=1\neq 2=(f\circ g)(1)}$, it follows that ${\displaystyle g\circ f\neq f\circ g}$.

Remark.

• This example shows that the composition of functions is not commutative. That is, after changing the order of the composition, the result may be different.

Although the composition of functions is not commutative, it is associative.

Theorem. (Associativity of the composition of functions) For every nonempty set ${\displaystyle A,B,C,D}$, and for every function ${\displaystyle f:A\to B,g:B\to C}$ and ${\displaystyle h:C\to D}$, we have

${\displaystyle h\circ (g\circ f)=(h\circ g)\circ f.}$

Proof. First, since ${\displaystyle g\circ f}$ is a function from ${\displaystyle A}$ to ${\displaystyle C}$, it follows that ${\displaystyle h\circ (g\circ f)}$ is a function from ${\displaystyle A}$ to ${\displaystyle D}$. Similarly, since ${\displaystyle h\circ g}$ is a function from ${\displaystyle B\to D}$, it follows that ${\displaystyle (h\circ g)\circ f}$ is function from ${\displaystyle A}$ to ${\displaystyle D}$.

It now remains to prove that both functions map every element ${\displaystyle a\in A}$ into the same image in ${\displaystyle D}$. For every ${\displaystyle a\in A}$,

${\displaystyle {\big (}h\circ (g\circ f){\big )}(a)=h{\big (}(g\circ f)(a){\big )}=h{\big (}g(f(a)){\big )}.}$
(a bracket for ${\displaystyle g\circ f}$ means "grouping" ${\displaystyle g}$ and ${\displaystyle f}$ first, that is, we regard ${\displaystyle g\circ f}$ as a function first. To understand this more easily, one may write ${\displaystyle f_{1}}$ instead of "${\displaystyle g\circ f}$" above.)

Also,

${\displaystyle {\big (}(h\circ g)(f(a)){\big )}=h{\big (}g(f(a)){\big )}.}$

${\displaystyle \Box }$

Example. Let ${\displaystyle A=\{1,2,3\},B=\{a,b,c,d\},C=\{4,5\}}$ and ${\displaystyle D=\{\alpha ,\beta ,\gamma \}}$. Consider the functions ${\displaystyle f:A\to B}$, ${\displaystyle g:B\to C}$ and ${\displaystyle h:C\to D}$ defined by

${\displaystyle f=\{(1,a),(2,b),(3,b)\},g=\{((a,4),(b,5),(c,5),(d,5)\}{\text{ and }}h=\{(4,\beta ),(5,\gamma )\}.}$
Recall that we have shown that ${\displaystyle g\circ f=\{(1,4),(2,5),(3,5)\}}$. We can further show that ${\displaystyle h\circ g=\{(a,\beta ),(b,\gamma ),(c,\gamma ),(d,\gamma )\}}$. Thus, ${\displaystyle h\circ (g\circ f):A\to D}$ and ${\displaystyle (h\circ g)\circ f:A\to D}$ are given by
${\displaystyle h\circ (g\circ f)=(h\circ g)\circ f=\{(1,\beta ),(2,\gamma ),(3,\gamma )\}.}$

Now, let us study some results that are related to the composition, injectivity, surjectivity, and bijectivity.

Proposition. For every nonempty set ${\displaystyle A,B,C}$ and for every function ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$,

(a) if ${\displaystyle f}$ and ${\displaystyle g}$ are injective, then ${\displaystyle g\circ f}$ is injective.

(b) if ${\displaystyle f}$ and ${\displaystyle g}$ are surjective, then ${\displaystyle g\circ f}$ is surjective.

Proof. For (a), assume that ${\displaystyle f}$ and ${\displaystyle g}$ are injective. Then, for every ${\displaystyle x,y\in A}$

{\displaystyle {\begin{aligned}g\circ f(x)=g\circ f(y)&\implies g(f(x))=g(f(y))\\&\implies f(x)=f(y)&(g{\text{ is injective}})\\&\implies x=y.&(f{\text{ is injective}})\\\end{aligned}}}
For (b), assume that ${\displaystyle f}$ and ${\displaystyle g}$ are injective. Then, for every ${\displaystyle c\in C}$, since ${\displaystyle g}$ is surjective, there exists ${\displaystyle b\in B}$ such that ${\displaystyle g(b)=c}$. For this ${\displaystyle b\in B}$, there must also exist ${\displaystyle a\in A}$ such that ${\displaystyle f(a)=b}$ since ${\displaystyle f}$ is surjective. It follows that for every ${\displaystyle c\in C}$, there exists ${\displaystyle a\in A}$ such that
${\displaystyle c=g(b)=g(f(a))=(g\circ f)(a).}$

${\displaystyle \Box }$

Corollary. For every nonempty set ${\displaystyle A,B,C}$ and for every function ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$, if ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ are bijective, then ${\displaystyle g\circ f}$ is bijective.

Proof. Assume that ${\displaystyle f}$ and ${\displaystyle g}$ are bijective, i.e., are injective and surjective. It follows by the above proposition that ${\displaystyle g\circ f}$ is injective and surjective, i.e., is bijective.

${\displaystyle \Box }$

After knowing such results, it is natural to question that whether the converse of the above proposition holds. Indeed, the converse does not hold, and we have the following results:

Proposition. For every nonempty set ${\displaystyle A,B,C}$ and for every function ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$,

(a) if ${\displaystyle g\circ f}$ is injective, then ${\displaystyle f}$ is injective.

(b) if ${\displaystyle g\circ f}$ is surjective, then ${\displaystyle g}$ is surjective.

Proof. For (a), assume that ${\displaystyle g\circ f}$ is injective. Then, for every ${\displaystyle x,y\in A}$,

{\displaystyle {\begin{aligned}f(x)=f(y)&\implies g(f(x))=g(f(y))&({\text{definition of function}})\\&\implies x=y.&(g\circ f{\text{ is injective}})\\\end{aligned}}}
For (b), assume that ${\displaystyle g\circ f}$ is surjective. Then, for every ${\displaystyle c\in C}$, there exists ${\displaystyle a\in A}$ such that ${\displaystyle g(f(a))=c}$. So, we now can show that ${\displaystyle g}$ is surjective: for every ${\displaystyle c\in C}$, we can choose ${\displaystyle f(a)\in B}$, and then we have ${\displaystyle g(f(a))=c}$.

${\displaystyle \Box }$

Remark.

• It follows that if ${\displaystyle g\circ f}$ is bijective, then ${\displaystyle f}$ is injective and ${\displaystyle g}$ is surjective.
• For (a), with the assumption, ${\displaystyle f}$ may or may not be surjective. Also, for (b), ${\displaystyle g}$ may or may not be injective.

To summarize the results, we have the following table:

Summary
${\displaystyle g\circ f}$ (given) ${\displaystyle f}$ ${\displaystyle g}$
injective injective injective/not injective
surjective surjective surjective/not surjective
bijective injective surjective

Exercise. Disprove that for every nonempty set ${\displaystyle A,B,C}$ and for every function ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$,

(a) if ${\displaystyle g\circ f}$ is injective, then ${\displaystyle g}$ is injective.

(b) if ${\displaystyle g\circ f}$ is surjective, then ${\displaystyle f}$ is surjective.

Solution
(a)

Disproof. Take ${\displaystyle A=\{1,2\},B=\{1,2,3\},C=\{1,2\}}$, and take the functions ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ defined by ${\displaystyle f=\{(1,1),(2,2)\},g=\{(1,1),(2,2),(3,1)\}}$.

Then, the function ${\displaystyle g\circ f:A\to C}$ is given by ${\displaystyle g\circ f=\{(1,1),(2,2)\}}$, which is injective, since for every ${\displaystyle x,y\in A}$, ${\displaystyle (g\circ f)(x)=(g\circ f)(y)\implies x=y}$. However, the function ${\displaystyle g}$ is not injective since ${\displaystyle g(1)=g(3)=1}$.

${\displaystyle \Box }$

(b)

Disproof. Take ${\displaystyle A=\{1,2\},B=\{1,2,3\},C=\{1,2\}}$, and take the functions ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to C}$ defined by ${\displaystyle f=\{(1,1),(2,2)\},g=\{(1,1),(2,2),(3,1)\}}$.

Then, the function ${\displaystyle g\circ f:A\to C}$ is given by ${\displaystyle g\circ f=\{(1,1),(2,2)\}}$, which is surjective, since for every ${\displaystyle c\in C}$, there exists ${\displaystyle a\in A}$ such that ${\displaystyle (g\circ f)(a)=c}$. However, the function ${\displaystyle f}$ is not surjective, since, for example, take ${\displaystyle c=3}$. Then, for every ${\displaystyle a\in A}$, ${\displaystyle f(a)\neq c=3}$.

${\displaystyle \Box }$

Example. Consider two arbitrary functions ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ and ${\displaystyle g:\mathbb {R} \to \mathbb {R} }$ such that the function ${\displaystyle g\circ f:\mathbb {R} \to \mathbb {R} }$ is given by

${\displaystyle g\circ f(x)=x^{3}.}$
What properties do the functions ${\displaystyle f}$ and ${\displaystyle g}$ possess?

Solution. First, we claim that ${\displaystyle g\circ f}$ is bijective.

Proof.

Injective: for every ${\displaystyle x,y\in \mathbb {R} }$, ${\displaystyle (g\circ f)(x)=(g\circ f)(y)=x^{3}=y^{3}\implies x=y}$.

Surjective: for every ${\displaystyle y\in \mathbb {R} }$, choose ${\displaystyle x=y^{1/3}\in \mathbb {R} }$. Then, ${\displaystyle (g\circ f)(x)=(y^{1/3})^{3}=y}$.

${\displaystyle \Box }$

Then, we can conclude that ${\displaystyle f}$ is injective and ${\displaystyle g}$ is surjective by the above proposition.

Exercise. Consider the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ satisfying ${\displaystyle (f\circ f)(x)=x}$ for every ${\displaystyle x\in \mathbb {R} }$. Prove or disprove that ${\displaystyle f}$ is (i) injective; (ii) surjective.

Solution

(i) and (ii):

Proof. First, notice that ${\displaystyle f\circ f=id_{\mathbb {R} }}$, and we have proved that ${\displaystyle id_{\mathbb {R} }}$ is bijective. So, by the above proposition, ${\displaystyle f}$ is injective and surjective.

${\displaystyle \Box }$

## Inverse functions

Recall that a function ${\displaystyle f:A\to B}$ is a special relation from set ${\displaystyle A}$ to set ${\displaystyle B}$ satisfying some requirements. Also, recall that given a relation from ${\displaystyle A}$ to ${\displaystyle B}$, the inverse relation ${\displaystyle R^{-1}}$ is defined to be

${\displaystyle R^{-1}=\{(b,a):(a,b)\in R\}\subseteq B\times A.}$
We know that the inverse relation is always a relation itself. However, is the inverse relation of a function ${\displaystyle f:A\to B}$ always a function from ${\displaystyle B}$ to ${\displaystyle A}$ itself? The answer is, indeed, no. Consider the following example.

Example. Let ${\displaystyle A=\{1,2,3\}}$ and ${\displaystyle B=\{a,b\}}$. Consider the function ${\displaystyle f:A\to B}$ defined by

${\displaystyle f=\{(1,a),(2,a),(3,b)\}.}$
Then, the inverse relation (we are not calling it inverse function) of ${\displaystyle f}$, denoted by ${\displaystyle f^{-1}}$, is
${\displaystyle f^{-1}=\{(a,1),(a,2),(b,3)\}.}$
We can then see that this inverse relation ${\displaystyle f^{-1}}$ is not a function from ${\displaystyle B}$ to ${\displaystyle A}$, since the element ${\displaystyle a\in B}$ has two images: 1 and 2.

Of course, when the inverse relation of a function ${\displaystyle f:A\to B}$ turns out to be also a function from ${\displaystyle B}$ to ${\displaystyle A}$, it is natural to define it as the inverse function of ${\displaystyle f}$:

Definition. (Inverse function) Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets, and let ${\displaystyle f:A\to B}$ be a function. The inverse function of the function ${\displaystyle f}$ is the inverse relation of ${\displaystyle f}$, denoted by ${\displaystyle f^{-1}}$, provided that ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$.

Remark.

• Since given a relation from ${\displaystyle A}$ to ${\displaystyle B}$, it has a unique (or exactly one) inverse relation from ${\displaystyle B}$ to ${\displaystyle A}$ (according to the definition), it follows that the inverse function of a function ${\displaystyle f}$, if exists, is unique.

We are then interested in knowing under what conditions the inverse relation ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$, so that the inverse function exists.

First, in order for ${\displaystyle f^{-1}}$ to be a function from ${\displaystyle B}$ to ${\displaystyle A}$, we must have ${\displaystyle \operatorname {dom} f^{-1}=B}$. So, we need to ensure that every element of ${\displaystyle B}$ is related to some elements in ${\displaystyle A}$, so that when we "reverse" the ordered pairs in ${\displaystyle f}$ to get ${\displaystyle f^{-1}}$, there is at least one image for every ${\displaystyle b\in B}$. This means ${\displaystyle \operatorname {ran} f=B}$, i.e., ${\displaystyle f}$ needs to be surjective.

Of course, we also need to ensure that there is a unique image for every ${\displaystyle b\in B}$. Under the condition that ${\displaystyle f}$ is surjective, there is at least one image for every ${\displaystyle b\in B}$ already. So, it remains to ensure that there is at most one image for every ${\displaystyle b\in B}$.

To ensure this, we need the function ${\displaystyle f}$ to be injective, since, if ${\displaystyle f}$ is injective, then every element of ${\displaystyle B}$ has at most one pre-image. So, when we "reverse" the ordered pairs in ${\displaystyle f}$ to get ${\displaystyle f^{-1}}$, every element of ${\displaystyle B}$ has at most one image.

From this discussion, we know that if ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$, then ${\displaystyle f}$ has to be injective and surjective, i.e. bijective. This shows that the bijectivity of ${\displaystyle f}$ is the necessary condition for the existence of the inverse function. Is it also the sufficient condition? It turns out that the bijectivity of ${\displaystyle f}$ is actually the necessary and sufficient condition for the existence of the inverse function:

Theorem. Let ${\displaystyle f:A\to B}$ be a function. Then its inverse relation ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$ (i.e., the inverse function of ${\displaystyle f}$ exists) if and only if the function ${\displaystyle f}$ is bijective.

Proof.

"${\displaystyle \Rightarrow }$" direction: Assume that ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$. Then, we will proceed to prove that ${\displaystyle f}$ is injective and surjective:

Injective: For every ${\displaystyle x,y\in {\color {darkgreen}A}}$, first assume ${\displaystyle z=f(x)=f(y){\color {blue}\in B}}$. Then, ${\displaystyle (x,z),(y,z)\in f}$. By definition of inverse relation, we have ${\displaystyle (z,x),(z,y)\in f^{-1}}$. Since ${\displaystyle f^{-1}}$ is a function from ${\displaystyle {\color {blue}B}}$ to ${\displaystyle {\color {darkgreen}A}}$, we have by definition ${\displaystyle x=y}$.

Surjective: For every ${\displaystyle b\in B}$, since ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$, there exists a unique ${\displaystyle a\in A}$ such that ${\displaystyle (b,a)\in f^{-1}}$. This means by definition of inverse relation that ${\displaystyle (a,b)\in f}$, i.e., ${\displaystyle f(a)=b}$ .

"${\displaystyle \Leftarrow }$" direction: Assume that ${\displaystyle f}$ is bijective. Then, we will proceed to prove that ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$. That is, we need to ensure that for every element ${\displaystyle b\in B}$, there exists a unique ${\displaystyle a\in A}$ such that ${\displaystyle (b,a)\in f^{-1}}$.

Existence: For every ${\displaystyle b\in B}$, since ${\displaystyle f}$ is surjective, there exists ${\displaystyle a\in A}$ such that ${\displaystyle f(a)=b}$, i.e. ${\displaystyle (a,b)\in f}$. Hence, ${\displaystyle (b,a)\in f^{-1}}$. This shows that for every ${\displaystyle b\in B}$, there exists ${\displaystyle a\in A}$ such that ${\displaystyle (b,a)\in f^{-1}}$.

Uniqueness: Assume ${\displaystyle (b,a')\in f^{-1}}$, in addition to ${\displaystyle (b,a)\in f^{-1}}$. So, we have ${\displaystyle (a,b),(a',b)\in f}$, i.e., ${\displaystyle f(a)=f(a')=b}$. Since ${\displaystyle f}$ is injective, we have ${\displaystyle a=a'}$.

${\displaystyle \Box }$

Hence, from this theorem, we know that it is only meaningful to talk about inverse function of ${\displaystyle f}$ when ${\displaystyle f}$ is bijective. If ${\displaystyle f}$ is not bijective, then its inverse function does not exist at all, and it is meaningless to talk about it. The following theorem further suggests that the inverse function must also be bijective.

Theorem. If the function ${\displaystyle f:A\to B}$ is bijective, then its inverse function ${\displaystyle f^{-1}:B\to A}$ is also bijective.

Proof. Assume that ${\displaystyle f:A\to B}$ is bijective. Then, ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$. Then, we will show that it is injective and surjective.

Injective: For every ${\displaystyle b,b'\in B}$, first assume ${\displaystyle a=f^{-1}(b)=f^{-1}(b')}$. Then, ${\displaystyle (b,a),(b',a)\in f^{-1}}$. Thus, ${\displaystyle (a,b),(a,b')\in f}$. It follows that ${\displaystyle b=b'}$ since ${\displaystyle f}$ is a function.

Surjective: For every ${\displaystyle a\in A}$, since ${\displaystyle f}$ is a function, there exists a unique ${\displaystyle b\in B}$ such that ${\displaystyle f(a)=b}$, i.e., ${\displaystyle (a,b)\in f}$, and hence ${\displaystyle (b,a)\in f^{-1}}$, i.e., ${\displaystyle f^{-1}(b)}$.

${\displaystyle \Box }$

Remark.

• Notice that in the proof, we do not make use of the bijectivity of ${\displaystyle f}$ to prove the injectivity and surjectivity of ${\displaystyle f^{-1}}$. Indeed, we just use the definition of function to prove them. The bijectivity of ${\displaystyle f}$ serves only one purpose: ensuring that the inverse function exists.

Another common definition of inverse function is that the inverse function of ${\displaystyle f}$, denoted by ${\displaystyle f^{-1}}$, is a function satisfying

${\displaystyle f^{-1}\circ f=id_{A}{\text{ and }}f\circ f^{-1}=id_{B}.}$
It turns out that these two definitions of inverse function are indeed (logically) equivalent. Consider the following theorem.

Theorem. Let ${\displaystyle f:A\to B}$ and ${\displaystyle g:B\to A}$ be two functions such that ${\displaystyle f\circ g=id_{A}}$ and ${\displaystyle g\circ f=id_{B}}$. Then, ${\displaystyle f}$ is bijective and ${\displaystyle g}$ equals the inverse function of ${\displaystyle f}$, ${\displaystyle f^{-1}}$.

Proof. Let us first prove that ${\displaystyle f}$ is bijective.

Injective: For every ${\displaystyle x,y\in A}$,

{\displaystyle {\begin{aligned}f(x)=f(y)&\implies g(f(x))=g(f(y))&(g{\text{ is a function}})\\&\implies (g\circ f)(x)=(g\circ f)(y)\\&\implies id_{A}(x)=id_{A}(y)&({\text{assumption}})\\&\implies x=y.\end{aligned}}}
Surjective: For every ${\displaystyle y\in B}$, choose ${\displaystyle x=g(y)\in A}$. Then,
${\displaystyle f(x)=f(g(y))=id_{B}(y)=y.}$

Now, let us prove that ${\displaystyle g=f^{-1}}$. Since ${\displaystyle f}$ is bijective, the inverse function ${\displaystyle f^{-1}}$ exists. Then, since the domain and codomain of the inverse function ${\displaystyle f^{-1}}$ is ${\displaystyle B}$ and ${\displaystyle A}$ respectively by definition, ${\displaystyle g}$ and ${\displaystyle f^{-1}}$ have the same domain and codomain. It now remains to show that ${\displaystyle g(b)=f^{-1}(b)}$ for every ${\displaystyle b\in B}$. Since ${\displaystyle f}$ is surjective, for every ${\displaystyle b\in B}$, there exists ${\displaystyle a\in A}$ such that ${\displaystyle f(a)=b}$. This means ${\displaystyle a=f^{-1}(b)}$. Now, we have for every ${\displaystyle b\in B}$,

${\displaystyle g(b)=g(f(a))=id_{A}(a)=a=f^{-1}(b).}$

${\displaystyle \Box }$

The converse of the above theorem is also true. More precisely, if ${\displaystyle f}$ is bijective, and thus its inverse function ${\displaystyle f^{-1}}$ exists, then we have ${\displaystyle f\circ f^{-1}=id_{A}}$ and ${\displaystyle f^{-1}\circ f=id_{B}}$. (Details are left to the following exercise.) Hence, the two definitions are actually logically equivalent, in the sense that

• by the above theorem, the conditions in the alternative definition imply the conditions in our definition.
• by the above remark, the conditions in our definition imply the conditions in the alternative definition.

It then follows that the function ${\displaystyle g}$ satisfying ${\displaystyle f\circ g=id_{A}}$ and ${\displaystyle g\circ f=id_{B}}$ is unique, since the inverse function is unique.

Exercise. Let ${\displaystyle f:A\to B}$ be a function. Prove that if ${\displaystyle f}$ is bijective, and thus its inverse function ${\displaystyle f^{-1}}$ exists, then we have ${\displaystyle f\circ f^{-1}=id_{A}}$ and ${\displaystyle f^{-1}\circ f=id_{B}}$.

Proof. First, since ${\displaystyle f^{-1}}$ is a function from ${\displaystyle B}$ to ${\displaystyle A}$, it follows that ${\displaystyle f\circ f^{-1}}$ is a function from ${\displaystyle A}$ to ${\displaystyle A}$, and ${\displaystyle f^{-1}\circ f}$ is a function from ${\displaystyle B}$ to ${\displaystyle B}$.

Then, for every ${\displaystyle a\in A}$, there exists a unique ${\displaystyle b\in B}$ such that ${\displaystyle f(a)=b}$, and hence ${\displaystyle a=f^{-1}(b)}$. So,

${\displaystyle (f^{-1}\circ f)(a)=f^{-1}(f(a))=f^{-1}(b)=a.}$
Also, for every ${\displaystyle b\in B}$,
${\displaystyle (f\circ f^{-1})(b)=f(f^{-1}(b))=f(a)=b.}$

${\displaystyle \Box }$

Here, we will introduce an approach to find a formula for the inverse function. But this approach does not always work.

Example. Recall that we have proved that the function ${\displaystyle f:\mathbb {R} \setminus \{1\}\to \mathbb {R} \setminus \{3\}}$ defined by ${\displaystyle f(x)={\frac {3x+1}{x-1}}}$ is bijective. So, its inverse function ${\displaystyle f^{-1}}$ exists. Determine the inverse function ${\displaystyle f^{-1}}$.

Solution. We have for every ${\displaystyle x\in \mathbb {R} \setminus \{3\}}$, ${\displaystyle (f\circ f^{-1})(x)=x}$. Hence,

${\displaystyle x=(f\circ f^{-1})(x)=f(f^{-1}(x))={\frac {3f^{-1}(x)+1}{f^{-1}(x)-1}}\implies xf^{-1}(x)-x=3f^{-1}(x)+1\implies f^{-1}(x)={\frac {x+1}{x-3}}.}$
Thus, the inverse function ${\displaystyle f^{-1}:\mathbb {R} \setminus \{3\}\to \mathbb {R} \setminus \{1\}}$ is given by
${\displaystyle f^{-1}(x)={\frac {x+1}{x-3}}.}$

In this approach, we use some algebraic manipulation to find the inverse function. However, such method is not always possible. For instance, the function ${\displaystyle f:\mathbb {R} \to (0,\infty )}$ defined by ${\displaystyle f(x)=e^{x}}$ is bijective, but its inverse function ${\displaystyle f^{-1}:(0,\infty )\to \mathbb {R} }$ is given by (indeed, defined to be) ${\displaystyle f^{-1}(x)=\ln x}$. In this case, such inverse function cannot be obtained by such algebraic manipulation.

Exercise. Consider the function ${\displaystyle f:[0,1]\to [0,1]}$ defined by

${\displaystyle f(x)={\sqrt {1-x^{2}}}.}$
(a) Prove that ${\displaystyle f}$ is bijective.

(b) Determine the formula for inverse function, ${\displaystyle f^{-1}(x)}$.

Solution
(a)

Proof.

Injective: For every ${\displaystyle x,y\in [0,1]}$,

{\displaystyle {\begin{aligned}f(x)=f(y)&\implies {\sqrt {1-x^{2}}}={\sqrt {1-y^{2}}}\\&\implies 1-x^{2}=1-y^{2}\\&\implies x^{2}=y^{2}\\&\implies x=y.&(x,y\geq 0)\end{aligned}}}
Surjective: For every ${\displaystyle y\in [0,1]}$, choose ${\displaystyle x={\sqrt {1-y^{2}}}\in [0,1]}$. Then,
${\displaystyle f(x)=f\left({\sqrt {1-y^{2}}}\right)={\sqrt {1-\left({\sqrt {1-y^{2}}}\right)^{2}}}={\sqrt {1-(1-y^{2})}}={\sqrt {y^{2}}}=|y|=y.}$

${\displaystyle \Box }$

(b) Since ${\displaystyle f(f^{-1}(x))=x}$ for every ${\displaystyle x\in [0,1]}$, we have

${\displaystyle x={\sqrt {1-(f^{-1}(x))^{2}}}\implies x^{2}=1-(f^{-1}(x))^{2}\implies f^{-1}(x)=\pm {\sqrt {1-x^{2}}}.}$
But the codomain of the inverse function ${\displaystyle f^{-1}}$ is ${\displaystyle [0,1]}$. So we should choose the positive square root as the formula, i.e.,
${\displaystyle f^{-1}(x)={\sqrt {1-x^{2}}}.}$

Remark.

• It turns out that, in this case the function ${\displaystyle f}$ is equal to its inverse function ${\displaystyle f^{-1}}$.

## Image sets and preimage sets

The concepts discussed in this section are the generalizations of the concepts of image and pre-image.

Definition. (Image (set) and preimage (set)) Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets, and let ${\displaystyle f:A\to B}$ be a function.

(a) Suppose ${\displaystyle X\subseteq A}$. The image (set) of ${\displaystyle X}$ is the set

${\displaystyle f(X)=\{y\in B:y=f(x){\text{ for some }}x\in X\}\subseteq B.}$
(b) Suppose ${\displaystyle Y\subseteq B}$. The preimage (set) of ${\displaystyle Y}$ is the set
${\displaystyle f^{-1}(Y)=\{x\in A:f(x)\in Y\}\subseteq A.}$

Remark.

• Notice that the "${\displaystyle f^{-1}}$" used in the notation for the preimage set is not referring to the inverse function. The preimage set ${\displaystyle f^{-1}(Y)}$ of ${\displaystyle Y}$ still makes sense even if the function ${\displaystyle f}$ is not bijective. However, the inverse function ${\displaystyle f^{-1}}$ does not exist if the function ${\displaystyle f}$ is not bijective.
• In other words, the image set contains the image of every element ${\displaystyle x\in X}$. So, it is a set of images, and hence the name.
• On the other hand, the preimage set contains the pre-images of every element ${\displaystyle y\in Y}$. That is, the preimage set ${\displaystyle f^{-1}(Y)}$ is the set of elements in ${\displaystyle A}$ mapped into ${\displaystyle Y}$ by ${\displaystyle f}$.
• Special case: suppose ${\displaystyle a\in A}$ and ${\displaystyle b\in B}$. Then, the image set of ${\displaystyle \{a\}}$ is the set containing image of ${\displaystyle a}$ under ${\displaystyle f}$: ${\displaystyle \{f(a)\}}$, and the preimage of set ${\displaystyle \{b\}}$ is the set containing the preimages (if any) of ${\displaystyle b}$ under ${\displaystyle f}$.

Graphically, the image set looks like:

   A              B
*------*       *------*
|      |       |      |
|  . X |       | f(X) |
|.###. |       |  .   |
|.###.--------->.###. |
|.###. |       |.###. |
|  .   |       |  .   |
*------*       *------*


The preimage set looks like

   A              B
*------*       *------*
|f-1(Y)|       |      |
|  .   |       |  Y   |
|.###. |       |  .   |
|.###.<---------.###. |
|.###. |       |.###. |
|  .   |       |  .   |
*------*       *------*


Example. Let ${\displaystyle A=\{1,2,3,4,5,6,7\}}$ and ${\displaystyle B=\{a,b,c,d,e\}}$. Consider the function ${\displaystyle f:A\to B}$ defined by

${\displaystyle f=\{(1,a),(2,a),(3,b),(4,b),(5,d),(6,d),(7,a)\}.}$
Then,

• ${\displaystyle f(\{1,3\})=\{a,b\}}$
• ${\displaystyle f^{-1}(\{a,b\})=\{1,2,3,4\}}$
• ${\displaystyle f(\{1,2,3,4\})=\{a,b\}}$
• ${\displaystyle f(\{1,2,3,4,5,6,7\})=\{a,b,d\}}$
• ${\displaystyle f^{-1}(\{c\})=\varnothing }$
• ${\displaystyle f^{-1}(\{a,c,d,e\})=\{1,2,5,6,7\}}$
• ${\displaystyle f(\{1,2,5,6,7\})=\{a,d\}}$

Remark.

• Notice that ${\displaystyle f}$ is not bijective, but it is still meaningful to consider the preimage sets of ${\displaystyle f}$. For instance, we have ${\displaystyle f^{-1}(\{a\})=\{1,2\}}$, but "${\displaystyle f^{-1}(a)}$" has no meaning since ${\displaystyle f^{-1}}$ does not exist at all.
• We can observe that ${\displaystyle f(\{1,3\})=\{a,b\}}$ but ${\displaystyle f^{-1}(\{a,b\})=\{1,2,3,4\}}$. So, ${\displaystyle f^{-1}(f(X))\neq X}$ in general. Also, we can notice that after "applying" the image set and then preimage set on a set ${\displaystyle X}$, it seems that we may get some set that is larger than ${\displaystyle X}$.
• On the other hand, we can see that ${\displaystyle f^{-1}(\{a,c,d,e\})=\{1,2,5,6,7\}}$ but ${\displaystyle f(\{1,2,5,6,7\})=\{a,d\}}$. So, ${\displaystyle f(f^{-1}(Y))\neq Y}$ in general. Also, it seems that we get some set that is smaller than ${\displaystyle X}$.
• Such kind of results turns out to be true in general (see the following theorem).

Exercise. Consider a function ${\displaystyle f:A\to B}$. Find (a) ${\displaystyle f(A)}$; (b) ${\displaystyle f^{-1}(B)}$; (c) ${\displaystyle f(\varnothing )}$; (d) ${\displaystyle f^{-1}(\varnothing )}$.

Solution

(a) ${\displaystyle f(A)=\{y\in B:y=f(x){\text{ for some }}x\in A\}=\operatorname {ran} f}$.

(b) ${\displaystyle f^{-1}(B)=\{x\in A:\underbrace {f(x)\in B} _{\text{always true}}\}=A}$.

(c) ${\displaystyle f(\varnothing )=\{y\in B:\underbrace {y=f(x){\text{ for some }}x\in \varnothing } _{\text{always false}}\}=\varnothing }$.

(d) ${\displaystyle f^{-1}(\varnothing )=\{x\in A:\underbrace {f(x)\in \varnothing } _{\text{always false}}\}=\varnothing }$.

Proposition. Let ${\displaystyle A}$ and ${\displaystyle B}$ be sets, and let ${\displaystyle f:A\to B}$ be a function. Then,

(a) ${\displaystyle X\subseteq f^{-1}(f(X))}$ for every subset ${\displaystyle X\subseteq A}$.

(b) ${\displaystyle f(f^{-1}(Y))\subseteq Y}$ for every subset ${\displaystyle Y\subseteq B}$.

Proof. (a) For every subset ${\displaystyle X\subseteq A}$, we have ${\displaystyle f^{-1}(f(X))=\{x\in A:f(x)\in f(X)\}}$. So, for every ${\displaystyle x}$,

${\displaystyle x\in X\implies f(x)\in f(X)\implies x\in f^{-1}(f(X)).}$
(b) For every subset ${\displaystyle Y\subseteq B}$, we have ${\displaystyle f(f^{-1}(Y))=\{y\in B:y=f(x){\text{ for some }}x\in f^{-1}(Y)\}}$. So, for every ${\displaystyle y}$, assume ${\displaystyle y\in f(f^{-1}(Y))}$. Then, ${\displaystyle y=f(x)}$ for some ${\displaystyle x\in f^{-1}(Y)}$. But ${\displaystyle x\in f^{-1}(Y)}$ means ${\displaystyle f(x)\in Y}$. So, we have ${\displaystyle y=f(x)\in Y}$.

${\displaystyle \Box }$

Exercise. Propose an additional assumption on the function ${\displaystyle f}$ so that (a) ${\displaystyle X=f^{-1}(f(X))}$ under this assumption; (b) ${\displaystyle Y=f(f^{-1}(Y))}$ under this assumption. Then, prove them. (Hint: the assumption is either "${\displaystyle f}$ is injective" or "${\displaystyle f}$ is surjective". Construct some simple examples of injective/surjective functions to make your choice.)

Solution

(a) The assumption is "${\displaystyle f}$ is injective".

Proof. It suffices to show that ${\displaystyle f^{-1}(f(X))\subseteq X}$, since another subset inclusion is immediate by the proposition above. For every ${\displaystyle x}$,

{\displaystyle {\begin{aligned}x\in f^{-1}(f(X))&\implies f(x)\in f(X)\\&\implies f(x)=f(x'){\text{ for some }}x'\in X\\&\implies x=x'&({\text{injective}}).\end{aligned}}}

${\displaystyle \Box }$

(b) The assumption is "${\displaystyle f}$ is surjective".

Proof. It suffices to show that ${\displaystyle Y\subseteq f^{-1}(f(Y))}$. For every ${\displaystyle y}$, assume ${\displaystyle y\in Y}$. Then, since ${\displaystyle f}$ is surjective, ${\displaystyle y=f(x)}$ for some ${\displaystyle x\in A}$. This means ${\displaystyle x\in f^{-1}(Y)}$ since ${\displaystyle f(x)\in Y}$. Combining this with the previous equation, we have ${\displaystyle y=f(x)}$ for some ${\displaystyle x\in f^{-1}(Y)}$. Thus, ${\displaystyle y\in f(f^{-1}(Y))}$ by the definition of image set.

${\displaystyle \Box }$

Exercise. Prove or disprove that (a) if ${\displaystyle x\in X}$, then ${\displaystyle f(x)\in f(X)}$; (b) if ${\displaystyle f(x)\in f(X)}$, then ${\displaystyle x\in X}$.

Solution
(a)

Proof. Assume ${\displaystyle x\in X}$. Then by definition of image set, ${\displaystyle f(x)\in f(X)}$.

${\displaystyle \Box }$

(b)

Disproof. Take ${\displaystyle A=\{1,2\}}$, ${\displaystyle B=\{1\}}$, and the function ${\displaystyle f:A\to B}$ defined by ${\displaystyle f=\{(1,1),(2,1)\}}$. Then, take ${\displaystyle x=2}$ and ${\displaystyle X=\{1\}}$. After that, we have ${\displaystyle f(x)=1\in \{1\}=f(X)}$, but ${\displaystyle x=2\notin \{1\}=X}$.

${\displaystyle \Box }$

## Cardinalities of sets

Recall that a set is ${\displaystyle S}$ finite if it contains a finite number of elements, i.e., ${\displaystyle S=\varnothing }$ or ${\displaystyle |S|=n}$ for some ${\displaystyle n\in \mathbb {N} }$. On the other hand, a set is infinite if it is not finite. Previously, we have studied cardinalities of finite sets, and we have left cardinalities of infinite sets undefined. But, it turns out that we can define cardinalities of infinite sets in a more complicated way, using bijective functions.

It may now seem that we can write something like ${\displaystyle |S|=\infty }$ for an infinite set ${\displaystyle S}$. But it turns out that this is not quite meaningful, and as we shall see, infinite sets can have different cardinalities, or different "sizes". That is, there are different sizes of infinity! (in some sense).

To motivate the definition for the cardinalities of infinite sets, let us first consider a simple example of finite sets.

Example. Let ${\displaystyle A=\{a,b,c\}}$ and ${\displaystyle B=\{1,2,3\}}$. We can then clearly observe that ${\displaystyle |A|=|B|=3}$