Linear Algebra/Length and Angle Measures

From Wikibooks, the open-content textbooks collection

We've translated the first section's results about solution sets into geometric terms for insight into how those sets look. But we must watch out not to be misled by our own terms; labeling subsets of  \mathbb{R}^k of the forms  \{\vec{p}+t\vec{v}\,\big|\, t\in\mathbb{R}\} and  \{\vec{p}+t\vec{v}+s\vec{w}\,\big|\, t,s\in\mathbb{R}\} as "lines" and "planes" doesn't make them act like the lines and planes of our prior experience. Rather, we must ensure that the names suit the sets. While we can't prove that the sets satisfy our intuition (we can't prove anything about intuition), in this subsection we'll observe that a result familiar from  \mathbb{R}^2 and  \mathbb{R}^3 , when generalized to arbitrary  \mathbb{R}^k , supports the idea that a line is straight and a plane is flat. Specifically, we'll see how to do Euclidean geometry in a "plane" by giving a definition of the angle between two \mathbb{R}^n vectors in the plane that they generate.

Definition 2.1

The length of a vector  \vec{v}\in\mathbb{R}^n is this.


|\vec{v}\,|=\sqrt{v_1^2+\cdots+v_n^2}
Remark 2.2

This is a natural generalization of the Pythagorean Theorem. A classic discussion is in (Pólya 1954).

We can use that definition to derive a formula for the angle between two vectors. For a model of what to do, consider two vectors in  \mathbb{R}^3 .

Linalg two vectors in R3.png

Put them in canonical position and, in the plane that they determine, consider the triangle formed by  \vec{u} ,  \vec{v} , and  \vec{u}-\vec{v} .

Linalg triangle formed by two vectors.png

Apply the Law of Cosines, |\vec{u}-\vec{v}\,|^2
=
|\vec{u}\,|^2+|\vec{v}\,|^2-
2\,|\vec{u}\,|\,|\vec{v}\,|\cos\theta, where θ is the angle between the vectors. Expand both sides

(u_1-v_1)^2+(u_2-v_2)^2+(u_3-v_3)^2
=(u_1^2+u_2^2+u_3^2)+(v_1^2+v_2^2+v_3^2)-
2\,|\vec{u}\,|\,|\vec{v}\,|\cos\theta

and simplify.


\theta
=
\arccos(\,\frac{u_1v_1+u_2v_2+u_3v_3}{
|\vec{u}\,|\,|\vec{v}\,| }\,)

In higher dimensions no picture suffices but we can make the same argument analytically. First, the form of the numerator is clear: it comes from the middle terms of the squares (u_1-v_1)^2, (u_2-v_2)^2, etc.

Definition 2.3

The dot product (or inner product, or scalar product) of two n-component real vectors is the linear combination of their components.


\vec{u}\cdot\vec{v}=u_1v_1+u_2v_2+\cdots +u_nv_n

Note that the dot product of two vectors is a real number, not a vector, and that the dot product of a vector from  \mathbb{R}^n with a vector from  \mathbb{R}^m is defined only when n equals m. Note also this relationship between dot product and length: dotting a vector with itself gives its length squared  \vec{u}\cdot\vec{u}=u_1u_1+\cdots+u_nu_n=|\vec{u}\,|^2 .
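As a minimal sketch (not part of the original text), the two definitions so far can be written in Python. The names `dot` and `length` are illustrative choices, and vectors are assumed to be plain lists of numbers.

```python
import math

def dot(u, v):
    """Dot product of two real vectors given as equal-length sequences."""
    if len(u) != len(v):
        # The dot product is defined only when the vectors have the same size.
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))

def length(v):
    """Length of a vector: the square root of the vector dotted with itself."""
    return math.sqrt(dot(v, v))

u = [3, 4]
print(dot(u, u))    # 25, the length squared
print(length(u))    # 5.0
```

Defining `length` through `dot` mirrors the relationship noted above, that dotting a vector with itself gives its length squared.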

Remark 2.4

The wording in that definition allows one or both of the two to be a row vector instead of a column vector. Some books require that the first vector be a row vector and that the second vector be a column vector. We shall not be that strict.

Still reasoning with letters, but guided by the pictures, we use the next theorem to argue that the triangle formed by  \vec{u} ,  \vec{v} , and  \vec{u}-\vec{v} in  \mathbb{R}^n lies in the planar subset of  \mathbb{R}^n generated by  \vec{u} and  \vec{v} .

Theorem 2.5 (Triangle Inequality)

For any  \vec{u},\vec{v}\in\mathbb{R}^n ,


|\vec{u}+\vec{v}\,|\leq|\vec{u}\,|+|\vec{v}\,|

with equality if and only if one of the vectors is a nonnegative scalar multiple of the other one.

This inequality is the source of the familiar saying, "The shortest distance between two points is in a straight line."

Linalg triangle inequality.png

Proof

(We'll use some algebraic properties of dot product that we have not yet checked, for instance that  \vec{u}\cdot(\vec{a}+\vec{b})
=\vec{u}\cdot\vec{a}+\vec{u}\cdot\vec{b} and that \vec{u}\cdot\vec{v}=\vec{v}\cdot\vec{u}. See Problem 8.) The desired inequality holds if and only if its square holds.

\begin{array}{rl}
|\vec{u}+\vec{v}\,|^2
&\leq(\,|\vec{u}\,|+|\vec{v}\,|\,)^2                            \\
(\,\vec{u}+\vec{v}\,)\cdot(\,\vec{u}+\vec{v}\,)
&\leq|\vec{u}\,|^2+2\,|\vec{u}\,|\,|\vec{v}\,|
+|\vec{v}\,|^2                                         \\
\vec{u}\cdot\vec{u}+\vec{u}\cdot\vec{v}
+\vec{v}\cdot\vec{u}+\vec{v}\cdot\vec{v}
&\leq\vec{u}\cdot\vec{u}+2\,|\vec{u}\,|\,|\vec{v}\,|
+\vec{v}\cdot\vec{v}                                          \\
2\,\vec{u}\cdot\vec{v}
&\leq 2\,|\vec{u}\,|\,|\vec{v}\,|
\end{array}

That, in turn, holds if and only if the relationship obtained by multiplying both sides by the nonnegative numbers  |\vec{u}\,| and  |\vec{v}\,|


2\,(\,|\vec{v}\,|\,\vec{u}\,)\cdot(\,|\vec{u}\,|\,\vec{v}\,)
\leq
2\,|\vec{u}\,|^2\,|\vec{v}\,|^2

and rewriting


0
\leq
|\vec{u}\,|^2\,|\vec{v}\,|^2
-2\,(\,|\vec{v}\,|\,\vec{u}\,)\cdot(\,|\vec{u}\,|\,\vec{v}\,)
+|\vec{u}\,|^2\,|\vec{v}\,|^2

is true. But factoring


0\leq
(\,|\vec{u}\,|\,\vec{v}-|\vec{v}\,|\,\vec{u}\,)\cdot
(\,|\vec{u}\,|\,\vec{v}-|\vec{v}\,|\,\vec{u}\,)

shows that this certainly is true since it only says that the square of the length of the vector  |\vec{u}\,|\,\vec{v}-|\vec{v}\,|\,\vec{u}\, is not negative.

As for equality, it holds when, and only when,  |\vec{u}\,|\,\vec{v}-|\vec{v}\,|\,\vec{u} is  \vec{0} . The check that  |\vec{u}\,|\,\vec{v}=|\vec{v}\,|\,\vec{u}\, if and only if one vector is a nonnegative real scalar multiple of the other is easy.

This result supports the intuition that even in higher-dimensional spaces, lines are straight and planes are flat. For any two points in a linear surface, the line segment connecting them is contained in that surface (this is easily checked from the definition). But if the surface has a bend then that would allow for a shortcut (shown here grayed, while the segment from P to Q that is contained in the surface is solid).

Linalg shortest path on surface.png

Because the Triangle Inequality says that in any \mathbb{R}^n, the shortest cut between two endpoints is simply the line segment connecting them, linear surfaces have no such bends.
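A numerical spot check is no substitute for the proof, but it can make the Triangle Inequality concrete. The sketch below assumes vectors are lists of floats; the helper name `triangle_gap` and the small rounding tolerance are our own choices.

```python
import math
import random

def length(v):
    return math.sqrt(sum(x * x for x in v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def triangle_gap(u, v):
    """|u| + |v| - |u + v|, which the Triangle Inequality says is never negative."""
    return length(u) + length(v) - length(add(u, v))

random.seed(0)  # fixed seed so that the run is repeatable
for _ in range(1000):
    n = random.randint(1, 6)
    u = [random.uniform(-10, 10) for _ in range(n)]
    v = [random.uniform(-10, 10) for _ in range(n)]
    assert triangle_gap(u, v) >= -1e-9  # tolerance for floating-point rounding

# Equality case: the second vector is a nonnegative multiple of the first.
print(triangle_gap([1, 2, 2], [2, 4, 4]))   # 0.0
```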

Back to the definition of angle measure. The heart of the Triangle Inequality's proof is the " \vec{u}\cdot\vec{v}\leq |\vec{u}\,|\,|\vec{v}\,| " line. At first glance, a reader might wonder if some pairs of vectors satisfy the inequality in this way: while  \vec{u}\cdot\vec{v} is a large number, with absolute value bigger than the right-hand side, it is a negative large number. The next result says that no such pair of vectors exists.

Corollary 2.6 (Cauchy-Schwarz Inequality)

For any  \vec{u},\vec{v}\in\mathbb{R}^n ,


|\,\vec{u}\cdot\vec{v}\,|
\leq
|\,\vec{u}\,|\,|\vec{v}\,|

with equality if and only if one vector is a scalar multiple of the other.

Proof

The Triangle Inequality's proof shows that  \vec{u}\cdot\vec{v}\leq |\vec{u}\,|\,|\vec{v}\,| so if \vec{u}\cdot\vec{v} is positive or zero then we are done. If  \vec{u}\cdot\vec{v} is negative then this holds.


|\,\vec{u}\cdot\vec{v}\,|
=-(\,\vec{u}\cdot\vec{v}\,)
=(-\vec{u}\,)\cdot\vec{v}
\leq
|-\vec{u}\,|\,|\vec{v}\,|
=|\vec{u}\,|\,|\vec{v}\,|

The equality condition is Problem 9.
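To see the two sides of the inequality side by side, here is a small sketch under the assumption that vectors are lists of numbers; the function name `cauchy_schwarz_sides` is hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def length(v):
    return math.sqrt(dot(v, v))

def cauchy_schwarz_sides(u, v):
    """Return (|u . v|, |u| |v|); the first is never larger than the second."""
    return abs(dot(u, v)), length(u) * length(v)

# A pair with a negative dot product: its absolute value is still bounded.
print(cauchy_schwarz_sides([1, 2], [-3, 1]))

# One vector is a negative scalar multiple of the other: the sides agree.
print(cauchy_schwarz_sides([3, 4], [-6, -8]))   # (50, 50.0)
```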

The Cauchy-Schwarz inequality assures us that the next definition makes sense because the fraction has absolute value less than or equal to one.

Definition 2.7

The angle between two nonzero vectors  \vec{u},\vec{v}\in\mathbb{R}^n is


\theta
=
\arccos(\,\frac{\vec{u}\cdot\vec{v}}{
|\vec{u}\,|\,|\vec{v}\,| }\,)

(the angle between the zero vector and any other vector is defined to be a right angle).

Thus vectors from  \mathbb{R}^n are orthogonal (or perpendicular) if and only if their dot product is zero.

Example 2.8

These vectors are orthogonal.

Linalg orthog vectors in R2.png

\begin{pmatrix} 1 \\ -1 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 1 \end{pmatrix}=0

The arrows are shown away from canonical position but nevertheless the vectors are orthogonal.

Example 2.9

The  \mathbb{R}^3 angle formula given at the start of this subsection is a special case of the definition. Between these two

Linalg nonorthog vectors in R3.png

the angle is


\arccos(\frac{(1)(0)+(1)(3)+(0)(2)}{\sqrt{1^2+1^2+0^2}\sqrt{0^2+3^2+2^2}})
=\arccos(\frac{3}{\sqrt{2}\sqrt{13}})

approximately 0.94 radians. Notice that these vectors are not orthogonal. Although the yz-plane may appear to be perpendicular to the xy-plane, in fact the two planes are that way only in the weak sense that there are vectors in each orthogonal to all vectors in the other. Not every vector in each is orthogonal to all vectors in the other.
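The angle definition translates directly into a short computation. The following is a sketch only: the function names are ours, and the clamping step is an added precaution against floating-point rounding rather than part of the definition.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def length(v):
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in radians between u and v, following Definition 2.7."""
    lu, lv = length(u), length(v)
    if lu == 0 or lv == 0:
        # Convention: the zero vector makes a right angle with every vector.
        return math.pi / 2
    # Clamp to [-1, 1] so rounding cannot push the ratio outside arccos's domain.
    c = max(-1.0, min(1.0, dot(u, v) / (lu * lv)))
    return math.acos(c)

print(round(angle([1, 1, 0], [0, 3, 2]), 2))   # 0.94, as in Example 2.9
print(angle([1, -1], [1, 1]))                  # a right angle, as in Example 2.8
```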

Exercises

This exercise is recommended for all readers.
Problem 1

Find the length of each vector.

  1.  \begin{pmatrix} 3 \\ 1 \end{pmatrix}
  2.  \begin{pmatrix} -1 \\ 2 \end{pmatrix}
  3.  \begin{pmatrix} 4 \\ 1 \\ 1 \end{pmatrix}
  4.  \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
  5.  \begin{pmatrix} 1 \\ -1 \\ 1 \\ 0 \end{pmatrix}
Answer
  1.  \sqrt{3^2+1^2}=\sqrt{10}
  2.  \sqrt{5}
  3.  \sqrt{18}
  4. 0
  5.  \sqrt{3}
This exercise is recommended for all readers.
Problem 2

Find the angle between each two, if it is defined.

  1.  \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 4 \end{pmatrix}
  2.  \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 4 \\ 1 \end{pmatrix}
  3.  \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 1 \\ 4 \\ -1 \end{pmatrix}
Answer
  1.  \arccos(9/\sqrt{85})\approx 0.22\text{ radians}
  2.  \arccos(8/\sqrt{85})\approx 0.52\text{ radians}
  3. Not defined.
This exercise is recommended for all readers.
Problem 3

During maneuvers preceding the Battle of Jutland, the British battle cruiser Lion moved as follows (in nautical miles): 1.2 miles north, 6.1 miles 38 degrees east of south, 4.0 miles at 89 degrees east of north, and 6.5 miles at 31 degrees east of north. Find the distance between starting and ending positions (O'Hanian 1985).

Answer

We express each displacement as a vector (rounded to one decimal place because that's the accuracy of the problem's statement) and add to find the total displacement (ignoring the curvature of the earth).


\begin{pmatrix} 0.0 \\ 1.2 \end{pmatrix}
+\begin{pmatrix} 3.8 \\ -4.8 \end{pmatrix}
+\begin{pmatrix} 4.0 \\ 0.1 \end{pmatrix}
+\begin{pmatrix} 3.3 \\ 5.6 \end{pmatrix}
=\begin{pmatrix} 11.1 \\ 2.1 \end{pmatrix}

The distance is  \sqrt{11.1^2+2.1^2}\approx 11.3 .
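The arithmetic in this answer can be reproduced with a short script. This is a sketch under our own conventions: each leg is given as a distance and a compass bearing in degrees east of north, so "38 degrees east of south" becomes 180 - 38 = 142, and the helper name `displacement` is hypothetical.

```python
import math

def displacement(distance, degrees_east_of_north):
    """(east, north) components of a leg whose heading is measured clockwise from north."""
    theta = math.radians(degrees_east_of_north)
    return (distance * math.sin(theta), distance * math.cos(theta))

# (distance in nautical miles, bearing in degrees east of north)
legs = [(1.2, 0), (6.1, 142), (4.0, 89), (6.5, 31)]

# Round each component to one decimal place, as the answer above does.
rounded = [tuple(round(c, 1) for c in displacement(d, b)) for d, b in legs]
east = sum(x for x, _ in rounded)
north = sum(y for _, y in rounded)

print(round(east, 1), round(north, 1))      # 11.1 2.1
print(round(math.hypot(east, north), 1))    # 11.3
```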

Problem 4

Find k so that these two vectors are perpendicular.


\begin{pmatrix} k \\ 1 \end{pmatrix}
\qquad
\begin{pmatrix} 4 \\ 3 \end{pmatrix}
Answer

Solve (k)(4) + (1)(3) = 0 to get k = -3/4.

Problem 5

Describe the set of vectors in  \mathbb{R}^3 orthogonal to this one.


\begin{pmatrix} 1 \\ 3 \\ -1 \end{pmatrix}
Answer

The set


\{\begin{pmatrix} x \\ y \\ z \end{pmatrix}\,\big|\, 1x+3y-1z=0\}

can also be described with parameters in this way.


\{\begin{pmatrix} -3 \\ 1 \\ 0 \end{pmatrix}y+\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}z
\,\big|\, y,z\in\mathbb{R}\}
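One way to gain confidence in the parametrized description is to test a few of its members against the given vector. The sketch below uses a hypothetical helper `member` and a handful of arbitrarily chosen parameter values.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

given = [1, 3, -1]

def member(y, z):
    """The vector (-3, 1, 0) y + (1, 0, 1) z from the parametrized set."""
    return [-3 * y + z, y, z]

for y, z in [(1, 0), (0, 1), (2, -5), (-7, 4)]:
    print(member(y, z), dot(given, member(y, z)))   # the dot product is 0 each time
```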
This exercise is recommended for all readers.
Problem 6
  1. Find the angle between the diagonal of the unit square in  \mathbb{R}^2 and one of the axes.
  2. Find the angle between the diagonal of the unit cube in  \mathbb{R}^3 and one of the axes.
  3. Find the angle between the diagonal of the unit cube in  \mathbb{R}^n and one of the axes.
  4. What is the limit, as n goes to  \infty , of the angle between the diagonal of the unit cube in  \mathbb{R}^n and one of the axes?
Answer
  1. We can use the x-axis.
    
\arccos (\frac{(1)(1)+(0)(1)}{\sqrt{1}\sqrt{2}})
\approx 0.79 \text{ radians}
  2. Again, use the x-axis.
    
\arccos (\frac{(1)(1)+(0)(1)+(0)(1)}{\sqrt{1}\sqrt{3}})
\approx 0.96 \text{ radians}
  3. The x-axis worked before and it will work again.
    
\arccos (\frac{(1)(1)+\cdots+(0)(1)}{\sqrt{1}\sqrt{n}})
=\arccos (\frac{1}{\sqrt{n}})
  4. Using the formula from the prior item, \lim_{n\to\infty} \arccos(1/\sqrt{n})
=\pi/2\text{ radians}.
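The trend toward a right angle can also be observed numerically. In this sketch the diagonal is taken to be the vector of all ones and the axis is the first coordinate axis, matching the choice of the x-axis above; the function name is our own.

```python
import math

def diagonal_axis_angle(n):
    """Angle in radians between (1, 1, ..., 1) and (1, 0, ..., 0) in R^n."""
    diagonal = [1] * n
    axis = [1] + [0] * (n - 1)
    dot = sum(d * a for d, a in zip(diagonal, axis))
    lengths = math.sqrt(sum(d * d for d in diagonal)) * math.sqrt(sum(a * a for a in axis))
    return math.acos(dot / lengths)

for n in (2, 3, 100, 10**6):
    print(n, round(diagonal_axis_angle(n), 4))

print(round(math.pi / 2, 4))   # the limiting value, for comparison
```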
Problem 7

Is any vector perpendicular to itself?

Answer

Clearly  u_1u_1+\cdots+u_nu_n is zero if and only if each  u_i is zero. So only  \vec{0}\in\mathbb{R}^n is perpendicular to itself.

This exercise is recommended for all readers.
Problem 8

Describe the algebraic properties of dot product.

  1. Is it right-distributive over addition: 
(\vec{u}+\vec{v})\cdot\vec{w}
=
\vec{u}\cdot\vec{w}+\vec{v}\cdot\vec{w} ?
  2. Is it left-distributive (over addition)?
  3. Does it commute?
  4. Associate?
  5. How does it interact with scalar multiplication?

As always, any assertion must be backed by either a proof or an example.

Answer

Assume that  \vec{u},\vec{v},\vec{w}\in\mathbb{R}^n have components  u_1,\ldots,u_n,v_1,\ldots,w_n .

  1. Dot product is right-distributive.
    \begin{array}{rl}
(\vec{u}+\vec{v})\cdot\vec{w}
&=[\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
+\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}]\cdot
\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}               \\
&=
\begin{pmatrix} u_1+v_1 \\ \vdots \\ u_n+v_n \end{pmatrix}\cdot
\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}               \\
&=
(u_1+v_1)w_1+\cdots+(u_n+v_n)w_n              \\
&=
(u_1w_1+\cdots+u_nw_n)+(v_1w_1+\cdots+v_nw_n)  \\
&=
\vec{u}\cdot\vec{w}+\vec{v}\cdot\vec{w}
\end{array}
  2. Dot product is also left distributive: \vec{w}\cdot(\vec{u}+\vec{v})=
\vec{w}\cdot\vec{u}+\vec{w}\cdot\vec{v}. The proof is just like the prior one.
  3. Dot product commutes.
    
\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}\cdot\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}
=u_1v_1+\cdots+u_nv_n
=v_1u_1+\cdots+v_nu_n
=\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}\cdot\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
  4. Because  \vec{u}\cdot\vec{v} is a scalar, not a vector, the expression  (\vec{u}\cdot\vec{v})\cdot\vec{w} makes no sense; the dot product of a scalar and a vector is not defined.
  5. This is a vague question so it has many answers. Some are (1)  k(\vec{u}\cdot\vec{v})=(k\vec{u})\cdot\vec{v} and  k(\vec{u}\cdot\vec{v})=\vec{u}\cdot(k\vec{v}) , (2)  k(\vec{u}\cdot\vec{v})\neq (k\vec{u})\cdot(k\vec{v}) (in general; an example is easy to produce), and (3)  |k\vec{v}\,|=|k||\vec{v}\,| (the connection between norm and dot product is that the square of the norm is the dot product of a vector with itself).
Problem 9

Verify the equality condition in Corollary 2.6, the Cauchy-Schwarz Inequality.

  1. Show that if  \vec{u} is a negative scalar multiple of  \vec{v} then  \vec{u}\cdot\vec{v} and  \vec{v}\cdot\vec{u} are less than or equal to zero.
  2. Show that  |\vec{u}\cdot\vec{v}|=
|\vec{u}\,|\,|\vec{v}\,| if and only if one vector is a scalar multiple of the other.
Answer
  1. Verifying that  (k\vec{x})\cdot\vec{y}=k(\vec{x}\cdot\vec{y})
=\vec{x}\cdot(k\vec{y}) for  k\in\mathbb{R} and  \vec{x},\vec{y}\in\mathbb{R}^n is easy. Now, for  k\in\mathbb{R} and  \vec{u},\vec{v}\in\mathbb{R}^n , if  \vec{u}=k\vec{v} then  \vec{u}\cdot\vec{v}=(k\vec{v})\cdot\vec{v}
=k(\vec{v}\cdot\vec{v}) , which is k times a nonnegative real, and so is less than or equal to zero when k is negative. The  \vec{v}=k\vec{u} half is similar (actually, taking the k in this paragraph to be the reciprocal of the k above gives that we need only worry about the k = 0 case).
  2. We first consider the \vec{u}\cdot\vec{v}\geq 0 case. From the Triangle Inequality we know that  \vec{u}\cdot\vec{v}=|\vec{u}\,|\,|\vec{v}\,| if and only if one vector is a nonnegative scalar multiple of the other. But that's all we need because the first part of this exercise shows that, in a context where the dot product of the two vectors is positive, the two statements "one vector is a scalar multiple of the other" and "one vector is a nonnegative scalar multiple of the other", are equivalent. We finish by considering the \vec{u}\cdot\vec{v}< 0 case. Because 0<|\vec{u}\cdot\vec{v}|=-(\vec{u}\cdot\vec{v})
=(-\vec{u})\cdot\vec{v} and |\vec{u}\,|\,|\vec{v}\,|
=|-\vec{u}\,|\,|\vec{v}\,|, we have that 0<(-\vec{u})\cdot\vec{v}=|-\vec{u}\,|\,|\vec{v}\,|. Now the prior paragraph applies to give that one of the two vectors -\vec{u} and \vec{v} is a scalar multiple of the other. But that's equivalent to the assertion that one of the two vectors \vec{u} and \vec{v} is a scalar multiple of the other, as desired.
Problem 10

Suppose that  \vec{u}\cdot\vec{v}=\vec{u}\cdot\vec{w} and  \vec{u}\neq\vec{0} . Must  \vec{v}=\vec{w} ?

Answer

No. These give an example.


\vec{u}=\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad
\vec{v}=\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad
\vec{w}=\begin{pmatrix} 1 \\ 1 \end{pmatrix}
This exercise is recommended for all readers.
Problem 11

Does any vector have length zero except a zero vector? (If "yes", produce an example. If "no", prove it.)

Answer

We prove that a vector has length zero if and only if all its components are zero.

Let  \vec{u}\in\mathbb{R}^n have components  u_1,\ldots,u_n . Recall that the square of any real number is greater than or equal to zero, with equality only when that real is zero. Thus  |\vec{u}\,|^2={u_1}^2+\cdots+{u_n}^2 is a sum of numbers greater than or equal to zero, and so is itself greater than or equal to zero, with equality if and only if each  u_i is zero. Hence  |\vec{u}\,|=0 if and only if all the components of  \vec{u} are zero.

This exercise is recommended for all readers.
Problem 12

Find the midpoint of the line segment connecting  (x_1,y_1) with  (x_2,y_2) in  \mathbb{R}^2 . Generalize to  \mathbb{R}^n .

Answer

We can easily check that


\bigl( \frac{x_1+x_2}{2},\frac{y_1+y_2}{2}  \bigr)

is on the line connecting the two, and is equidistant from both. The generalization is obvious.

Problem 13

Show that if  \vec{v}\neq\vec{0} then  \vec{v}/|\vec{v}\,| has length one. What if  \vec{v}=\vec{0} ?

Answer

Assume that  \vec{v}\in\mathbb{R}^n has components  v_1,\ldots,v_n . If  \vec{v}\neq \vec{0} then we have this.


\sqrt{\left(\frac{v_1}{\sqrt{{v_1}^2+\cdots+{v_n}^2}}\right)^2+
\dots+\left(\frac{v_n}{\sqrt{{v_1}^2+\cdots+{v_n}^2}}\right)^2}

\begin{align}
&=\sqrt{\left(\frac{{v_1}^2}{{v_1}^2+\cdots+{v_n}^2}\right)+
\dots+\left(\frac{{v_n}^2}{{v_1}^2+\cdots+{v_n}^2}\right)}   \\
&=1
\end{align}

If  \vec{v}=\vec{0} then  \vec{v}/|\vec{v}\,| is not defined.

Problem 14

Show that if  r\geq 0 then  r\vec{v} is r times as long as  \vec{v} . What if r < 0?

Answer

For the first question, assume that  \vec{v}\in\mathbb{R}^n and  r\geq 0 , take the root, and factor.


|r\vec{v}\,|
=\sqrt{(rv_1)^2+\cdots+(rv_n)^2}
=\sqrt{r^2({v_1}^2+\cdots+{v_n}^2)}
=r|\vec{v}\,|

For the second question, the result is |r| times as long, but it points in the opposite direction in that  r\vec{v}+(-r)\vec{v}=\vec{0} .

This exercise is recommended for all readers.
Problem 15

A vector  \vec{v}\in\mathbb{R}^n of length one is a unit vector. Show that the dot product of two unit vectors has absolute value less than or equal to one. Can "less than" happen? Can "equal to"?

Answer

Assume that  \vec{u},\vec{v}\in\mathbb{R}^n both have length 1. Apply Cauchy-Schwarz: |\vec{u}\cdot\vec{v}|
\leq|\vec{u}\,|\,|\vec{v}\,|=1.

To see that "less than" can happen, in  \mathbb{R}^2 take


\vec{u}=\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\qquad
\vec{v}=\begin{pmatrix} 0 \\ 1 \end{pmatrix}

and note that  \vec{u}\cdot\vec{v}=0 . For "equal to", note that  \vec{u}\cdot\vec{u}=1 .

Problem 16

Prove that 
|\vec{u}+\vec{v}\,|^2+|\vec{u}-\vec{v}\,|^2
=2|\vec{u}\,|^2+2|\vec{v}\,|^2.

Answer

Write


\vec{u}=\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
\qquad
\vec{v}=\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}

and then this computation works.

\begin{array}{rl}
|\vec{u}+\vec{v}\,|^2+|\vec{u}-\vec{v}\,|^2
&=(u_1+v_1)^2+\cdots+(u_n+v_n)^2   \\
&\quad +(u_1-v_1)^2+\cdots+(u_n-v_n)^2     \\
&={u_1}^2+2u_1v_1+{v_1}^2+\cdots+{u_n}^2+2u_nv_n+{v_n}^2       \\
&\quad +{u_1}^2-2u_1v_1+{v_1}^2+\cdots+{u_n}^2-2u_nv_n+{v_n}^2 \\
&=2({u_1}^2+\cdots+{u_n}^2)+2({v_1}^2+\cdots+{v_n}^2) \\
&=2|\vec{u}\,|^2+2|\vec{v}\,|^2
\end{array}
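As a quick sanity check on the identity just proved, the following sketch evaluates both sides for one pair of integer vectors, so the comparison is exact. The function name `parallelogram_sides` is an illustrative choice.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def parallelogram_sides(u, v):
    """Return (|u+v|^2 + |u-v|^2, 2|u|^2 + 2|v|^2) for equal-length vectors."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    return dot(plus, plus) + dot(minus, minus), 2 * dot(u, u) + 2 * dot(v, v)

print(parallelogram_sides([1, 2, 3], [4, -5, 6]))   # (182, 182)
```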
Problem 17

Show that if  \vec{x}\cdot\vec{y}=0 for every  \vec{y} then  \vec{x}=\vec{0} .

Answer

We will prove this by demonstrating that the contrapositive statement holds: if  \vec{x}\neq\vec{0} then there is a  \vec{y} with  \vec{x}\cdot\vec{y}\neq 0 .

Assume that  \vec{x}\in\mathbb{R}^n . If  \vec{x}\neq\vec{0} then it has a nonzero component, say the i-th one  x_i . But the vector  \vec{y}\in\mathbb{R}^n that is all zeroes except for a one in component i gives  \vec{x}\cdot\vec{y}=x_i . (A slicker proof just considers \vec{x}\cdot\vec{x}.)

Problem 18

Is  |\vec{u}_1+\cdots+\vec{u}_n| \leq
|\vec{u}_1|+\cdots+|\vec{u}_n| ? If it is true then it would generalize the Triangle Inequality.

Answer

Yes; we can prove this by induction.

Assume that the vectors are in some  \mathbb{R}^k . Clearly the statement applies to one vector. The Triangle Inequality is this statement applied to two vectors. For an inductive step assume the statement is true for n or fewer vectors. Then this


|\vec{u}_1+\cdots+\vec{u}_n+\vec{u}_{n+1}|
\leq
|\vec{u}_1+\cdots+\vec{u}_n|+|\vec{u}_{n+1}|

follows by the Triangle Inequality for two vectors. Now the inductive hypothesis, applied to the first summand on the right, gives that as less than or equal to  |\vec{u}_1|+\cdots+|\vec{u}_n|+|\vec{u}_{n+1}| .

Problem 19

What is the ratio between the sides in the Cauchy-Schwarz inequality?

Answer

By definition


\frac{\vec{u}\cdot\vec{v}}{
|\vec{u}\,|\,|\vec{v}\,|}=\cos\theta

where θ is the angle between the vectors. Thus the ratio is  |\cos\theta| .

Problem 20

Why is the zero vector defined to be perpendicular to every vector?

Answer

So that the statement "vectors are orthogonal iff their dot product is zero" has no exceptions.

Problem 21

Describe the angle between two vectors in  \mathbb{R}^1 .

Answer

The angle between (a) and (b) is found (for  a,b\neq 0 ) with


\arccos(\frac{ab}{\sqrt{a^2}\sqrt{b^2}}).

If a or b is zero then the angle is π / 2 radians. Otherwise, if a and b are of opposite signs then the angle is π radians, else the angle is zero radians.

Problem 22

Give a simple necessary and sufficient condition to determine whether the angle between two vectors is acute, right, or obtuse.

Answer

The angle between  \vec{u} and  \vec{v} is acute if  \vec{u}\cdot\vec{v}> 0 , is right if  \vec{u}\cdot\vec{v}=0 , and is obtuse if  \vec{u}\cdot\vec{v}<0 . That's because, in the formula for the angle, the denominator is never negative.

This exercise is recommended for all readers.
Problem 23

Generalize to  \mathbb{R}^n the converse of the Pythagorean Theorem, that if  \vec{u} and  \vec{v} are perpendicular then  |\vec{u}+\vec{v}\,|^2=|\vec{u}\,|^2+|\vec{v}\,|^2 .

Answer

Suppose that  \vec{u},\vec{v}\in\mathbb{R}^n . If  \vec{u} and  \vec{v} are perpendicular then


|\vec{u}+\vec{v}\,|^2
=(\vec{u}+\vec{v})\cdot(\vec{u}+\vec{v})
=\vec{u}\cdot\vec{u}+2\,\vec{u}\cdot\vec{v}
+\vec{v}\cdot\vec{v}
=\vec{u}\cdot\vec{u}+\vec{v}\cdot\vec{v}
=|\vec{u}\,|^2+|\vec{v}\,|^2

(the third equality holds because  \vec{u}\cdot\vec{v}=0 ).

Problem 24

Show that  |\vec{u}\,|=|\vec{v}\,| if and only if  \vec{u}+\vec{v} and  \vec{u}-\vec{v} are perpendicular. Give an example in  \mathbb{R}^2 .

Answer

Where  \vec{u},\vec{v}\in\mathbb{R}^n , the vectors  \vec{u}+\vec{v} and  \vec{u}-\vec{v} are perpendicular if and only if 0=(\vec{u}+\vec{v})\cdot(\vec{u}-\vec{v})
=\vec{u}\cdot\vec{u}-\vec{v}\cdot\vec{v}, which shows that those two are perpendicular if and only if  \vec{u}\cdot\vec{u}=\vec{v}\cdot\vec{v} . That holds if and only if  |\vec{u}\,|=|\vec{v}\,| .
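For an example in  \mathbb{R}^2 , one possible choice (ours, not the original answer's) is  \vec{u}=(3,4) and  \vec{v}=(5,0) , which both have length 5. The sketch below checks that pair, and contrasts it with a pair of unequal lengths.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

u, v = [3, 4], [5, 0]               # both have length 5
print(dot(add(u, v), sub(u, v)))    # 0, so u+v and u-v are perpendicular

w = [1, 0]                          # length 1, which differs from |u| = 5
print(dot(add(u, w), sub(u, w)))    # 24, which is |u|^2 - |w|^2
```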

Problem 25

Show that if a vector is perpendicular to each of two others then it is perpendicular to each vector in the plane they generate. (Remark. They could generate a degenerate plane— a line or a point— but the statement remains true.)

Answer

Suppose  \vec{u}\in\mathbb{R}^n is perpendicular to both  \vec{v}\in\mathbb{R}^n and  \vec{w}\in\mathbb{R}^n . Then, for any  k,m\in\mathbb{R} we have this.


\vec{u}\cdot(k\vec{v}+m\vec{w})
=k(\vec{u}\cdot\vec{v})+m(\vec{u}\cdot\vec{w})
=k(0)+m(0)=0
Problem 26

Prove that, where  \vec{u},\vec{v}\in\mathbb{R}^n are nonzero vectors, the vector


\frac{\vec{u}}{|\vec{u}\,| }+\frac{\vec{v}}{|\vec{v}\,| }

bisects the angle between them. Illustrate in  \mathbb{R}^2 .

Answer

We will show something more general: if  |\vec{z}_1|=|\vec{z}_2| for  \vec{z}_1,\vec{z}_2\in\mathbb{R}^n , then  \vec{z}_1+\vec{z}_2 bisects the angle between  \vec{z}_1 and  \vec{z}_2

Linalg angle bisection.svg

(we ignore the case where  \vec{z}_1 and  \vec{z}_2 are the zero vector).

The  \vec{z}_1+\vec{z}_2=\vec{0} case is easy. For the rest, by the definition of angle, we will be done if we show this.


\frac{\vec{z}_1\cdot(\vec{z}_1+\vec{z}_2)}{
|\vec{z}_1|\,|\vec{z}_1+\vec{z}_2| }
=
\frac{\vec{z}_2\cdot(\vec{z}_1+\vec{z}_2)}{
|\vec{z}_2|\,|\vec{z}_1+\vec{z}_2| }

But distributing inside each expression gives


\frac{\vec{z}_1\cdot\vec{z}_1+\vec{z}_1\cdot\vec{z}_2}{
|\vec{z}_1|\,|\vec{z}_1+\vec{z}_2| }
\qquad
\frac{\vec{z}_2\cdot\vec{z}_1+\vec{z}_2\cdot\vec{z}_2}{
|\vec{z}_2|\,|\vec{z}_1+\vec{z}_2| }

and  \vec{z}_1\cdot\vec{z}_1=|\vec{z}_1|^2
=|\vec{z}_2|^2=\vec{z}_2\cdot\vec{z}_2 , so the two are equal.
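To illustrate in  \mathbb{R}^2 , here is a sketch that builds the bisecting vector for one pair of our own choosing and compares the two angles it makes. The names `bisector` and `angle` are hypothetical helpers.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def length(v):
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in radians between nonzero vectors, clamped against rounding error."""
    c = dot(u, v) / (length(u) * length(v))
    return math.acos(max(-1.0, min(1.0, c)))

def bisector(u, v):
    """The vector u/|u| + v/|v| for nonzero u and v."""
    lu, lv = length(u), length(v)
    return [a / lu + b / lv for a, b in zip(u, v)]

u, v = [1, 0], [0, 3]
w = bisector(u, v)
print(w)                           # [1.0, 1.0]
print(angle(u, w), angle(v, w))    # the two angles agree up to rounding
```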

Problem 27

Verify that the definition of angle is dimensionally correct: (1) if k > 0 then the cosine of the angle between  k\vec{u} and  \vec{v} equals the cosine of the angle between  \vec{u} and  \vec{v} , and (2) if k < 0 then the cosine of the angle between  k\vec{u} and  \vec{v} is the negative of the cosine of the angle between  \vec{u} and  \vec{v} .

Answer

We can show the two statements together. Let  \vec{u}, \vec{v}\in\mathbb{R}^n , write


\vec{u}=\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}
\qquad
\vec{v}=\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}

and calculate.


\cos\theta=
\frac{ku_1v_1+\cdots+ku_nv_n}{
\sqrt{{(ku_1)}^2+\cdots+{(ku_n)}^2}\sqrt{{v_1}^2+\cdots+{v_n}^2} }
=\frac{k}{|k|}
\frac{\vec{u}\cdot\vec{v}}{|\vec{u}\,|\,|\vec{v}\,| }
=\pm
\frac{\vec{u}\cdot\vec{v}}{|\vec{u}\,|\,|\vec{v}\,| }
This exercise is recommended for all readers.
Problem 28

Show that the inner product operation is linear: for  \vec{u},\vec{v},\vec{w}\in\mathbb{R}^n and  k,m\in\mathbb{R} , \vec{u}\cdot(k\vec{v}+m\vec{w})=
k(\vec{u}\cdot\vec{v})+m(\vec{u}\cdot\vec{w}).

Answer

Let


\vec{u}=\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix},
\quad
\vec{v}=\begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}
\quad
\vec{w}=\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}

and then

\begin{array}{rl}
\vec{u}\cdot\bigl(k\vec{v}+m\vec{w}\bigr)
&=\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}\cdot
\bigl( \begin{pmatrix} kv_1 \\ \vdots \\ kv_n \end{pmatrix}
+\begin{pmatrix} mw_1 \\ \vdots \\ mw_n \end{pmatrix} \bigr)   \\
&=\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}\cdot
\begin{pmatrix} kv_1+mw_1 \\ \vdots \\ kv_n+mw_n \end{pmatrix}    \\
&=u_1(kv_1+mw_1)+\cdots+u_n(kv_n+mw_n)    \\
&=ku_1v_1+mu_1w_1+\cdots+ku_nv_n+mu_nw_n    \\
&=(ku_1v_1+\cdots+ku_nv_n)+(mu_1w_1+\cdots+mu_nw_n)    \\
&=k(\vec{u}\cdot\vec{v})+m(\vec{u}\cdot\vec{w})
\end{array}

as required.

This exercise is recommended for all readers.
Problem 29

The geometric mean of two positive reals x,y is  \sqrt{xy} . It is analogous to the arithmetic mean (x + y) / 2. Use the Cauchy-Schwarz inequality to show that the geometric mean of any  x,y\in\mathbb{R}^+ is less than or equal to the arithmetic mean.

Answer

For  x,y\in\mathbb{R}^+ , set


\vec{u}=\begin{pmatrix} \sqrt{x} \\ \sqrt{y} \end{pmatrix}
\qquad
\vec{v}=\begin{pmatrix} \sqrt{y} \\ \sqrt{x} \end{pmatrix}

so that the Cauchy-Schwarz inequality asserts that (after squaring)

\begin{array}{rl}
(\sqrt{x}\sqrt{y}+\sqrt{y}\sqrt{x})^2
&\leq(\sqrt{x}\sqrt{x}+\sqrt{y}\sqrt{y})(\sqrt{y}\sqrt{y}
+\sqrt{x}\sqrt{x})   \\
(2\sqrt{x}\sqrt{y})^2
&\leq(x+y)^2                            \\
\sqrt{xy}
&\leq\frac{x+y}{2}
\end{array}

as desired.
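The same construction can be tried numerically. This sketch assumes positive inputs and reads the two means off the two sides of the inequality; the function name `cauchy_schwarz_means` is hypothetical.

```python
import math

def cauchy_schwarz_means(x, y):
    """Return (geometric mean, arithmetic mean) of positive x and y,
    computed as half of each side of the Cauchy-Schwarz inequality."""
    u = [math.sqrt(x), math.sqrt(y)]
    v = [math.sqrt(y), math.sqrt(x)]
    lhs = abs(sum(a * b for a, b in zip(u, v)))     # |u . v| = 2 sqrt(xy)
    rhs = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))      # |u| |v| = x + y
    return lhs / 2, rhs / 2

print(cauchy_schwarz_means(4, 9))   # roughly (6, 6.5)
print(cauchy_schwarz_means(5, 5))   # the two means agree when x equals y
```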

Problem 30

A ship is sailing with speed and direction  \vec{v}_1 ; the wind blows apparently (judging by the vane on the mast) in the direction of a vector  \vec{a} ; on changing the direction and speed of the ship from  \vec{v}_1 to  \vec{v}_2 the apparent wind is in the direction of a vector  \vec{b} .

Find the vector velocity of the wind (Ivanoff & Esty 1933).

Answer

This is how the answer was given in the cited source.

The actual velocity  \vec{v} of the wind is the sum of the ship's velocity and the apparent velocity of the wind. Without loss of generality we may assume  \vec{a} and  \vec{b} to be unit vectors, and may write


\vec{v}=\vec{v}_1+s\vec{a}=\vec{v}_2+t\vec{b}

where s and t are undetermined scalars. Take the dot product first by  \vec{a} and then by  \vec{b} to obtain

\begin{array}{rl}
s-t\vec{a}\cdot\vec{b}
&=\vec{a}\cdot(\vec{v}_2-\vec{v}_1)    \\
s\vec{a}\cdot\vec{b}-t
&=\vec{b}\cdot(\vec{v}_2-\vec{v}_1)
\end{array}

Multiply the second by  \vec{a}\cdot\vec{b} , subtract the result from the first, and find


s=
\frac{[\vec{a}-(\vec{a}\cdot\vec{b})\vec{b}]
\cdot(\vec{v}_2-\vec{v}_1)
}{1-(\vec{a}\cdot\vec{b})^2}.

Substituting in the original displayed equation, we get


\vec{v}=\vec{v}_1+
\frac{[\vec{a}-(\vec{a}\cdot\vec{b})\vec{b}]
\cdot(\vec{v}_2-\vec{v}_1)
\vec{a}}{1-(\vec{a}\cdot\vec{b})^2}.
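The closing formula can be exercised on made-up data. In the sketch below the function name `wind_velocity` and the sample numbers are our own; following the convention of the answer above, the apparent directions are taken parallel to  \vec{v}-\vec{v}_1 and  \vec{v}-\vec{v}_2 , and the case of parallel apparent directions (where the formula's denominator vanishes) is rejected.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def wind_velocity(v1, a, v2, b):
    """Wind velocity from two ship velocities v1, v2 and the apparent
    wind directions a, b, using the formula derived in the answer above."""
    a, b = unit(a), unit(b)
    ab = dot(a, b)
    if abs(1 - ab * ab) < 1e-12:
        raise ValueError("apparent directions are parallel; wind not determined")
    d = [p - q for p, q in zip(v2, v1)]                        # v2 - v1
    s = dot([ai - ab * bi for ai, bi in zip(a, b)], d) / (1 - ab * ab)
    return [p + s * ai for p, ai in zip(v1, a)]                # v = v1 + s a

# Made-up data built from a wind of (3, 4): with v1 = (1, 0) the apparent
# direction is parallel to (2, 4), and with v2 = (0, 2) it is parallel to (3, 2).
print(wind_velocity([1, 0], [1, 2], [0, 2], [3, 2]))   # approximately [3.0, 4.0]
```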
Problem 31

Verify the Cauchy-Schwarz inequality by first proving Lagrange's identity:


\left(\sum_{1\leq j\leq n} a_jb_j \right)^2
=
\left(\sum_{1\leq j\leq n}a_j^2\right)
\left(\sum_{1\leq j\leq n}b_j^2\right)
-
\sum_{1\leq k < j\leq n}(a_kb_j-a_jb_k)^2

and then noting that the final term is nonnegative. (Recall the meaning


\sum_{1\leq j\leq n}a_jb_j=
a_1b_1+a_2b_2+\cdots+a_nb_n

and


\sum_{1\leq j\leq n}{a_j}^2=
{a_1}^2+{a_2}^2+\cdots+{a_n}^2

of the Σ notation.) This result is an improvement over Cauchy-Schwarz because it gives a formula for the difference between the two sides. Interpret that difference in  \mathbb{R}^2 .

Answer

We use induction on n.

In the n = 1 base case the identity reduces to


(a_1b_1)^2=({a_1}^2)({b_1}^2)-0

and clearly holds.

For the inductive step assume that the formula holds for the 1, ..., n cases. We will show that it then holds in the n + 1 case. Start with the right-hand side


\bigl( \sum_{1\leq j\leq n+1}{a_j}^2\bigr)
\bigl( \sum_{1\leq j\leq n+1}{b_j}^2\bigr)
-
\sum_{1\leq k<j\leq n+1}\bigl(a_kb_j-a_jb_k\bigr)^2

\begin{align}
&=
\bigl[ (\sum_{1\leq j\leq n}{a_j}^2)+{a_{n+1}}^2\bigr]
\bigl[ (\sum_{1\leq j\leq n}{b_j}^2)+{b_{n+1}}^2\bigr]   \\
&\quad -
\bigl[\sum_{1\leq k<j\leq n}\bigl(a_kb_j-a_jb_k\bigr)^2+
\sum_{1\leq k\leq n}\bigl(a_kb_{n+1}-a_{n+1}b_k\bigr)^2  \bigr] \\
&=
\bigl( \sum_{1\leq j\leq n}{a_j}^2\bigr)
\bigl( \sum_{1\leq j\leq n}{b_j}^2\bigr)
+
\sum_{1\leq j\leq n}{b_j}^2{a_{n+1}}^2
+
\sum_{1\leq j\leq n}{a_j}^2{b_{n+1}}^2
+
{a_{n+1}}^2{b_{n+1}}^2                                 \\
&\qquad -
\bigl[\sum_{1\leq k<j\leq n}\bigl(a_kb_j-a_jb_k\bigr)^2+
\sum_{1\leq k\leq n}\bigl(a_kb_{n+1}-a_{n+1}b_k\bigr)^2  \bigr] \\
&=
\bigl( \sum_{1\leq j\leq n}{a_j}^2\bigr)
\bigl( \sum_{1\leq j\leq n}{b_j}^2\bigr)
-\sum_{1\leq k<j\leq n}\bigl(a_kb_j-a_jb_k\bigr)^2   \\
&\quad +
\sum_{1\leq j\leq n}{b_j}^2{a_{n+1}}^2
+
\sum_{1\leq j\leq n}{a_j}^2{b_{n+1}}^2
+
{a_{n+1}}^2{b_{n+1}}^2                                 \\
&\qquad -
\sum_{1\leq k\leq n}\bigl(a_kb_{n+1}-a_{n+1}b_k\bigr)^2
\end{align}

and apply the inductive hypothesis

\begin{array}{rl}
&=
\bigl( \sum_{1\leq j\leq n}a_jb_j\bigr)^2
+
\sum_{1\leq j\leq n}{b_j}^2{a_{n+1}}^2
+
\sum_{1\leq j\leq n}{a_j}^2{b_{n+1}}^2
+
{a_{n+1}}^2{b_{n+1}}^2                          \\
&\qquad-
\bigl[\sum_{1\leq k\leq n}{a_k}^2{b_{n+1}}^2
-2\sum_{1\leq k\leq n}a_kb_{n+1}a_{n+1}b_k
+\sum_{1\leq k\leq n}{a_{n+1}}^2{b_k}^2\bigr]        \\
&=
\bigl( \sum_{1\leq j\leq n}a_jb_j\bigr)^2
+2\bigl(\sum_{1\leq k\leq n}a_kb_{n+1}a_{n+1}b_k\bigr)
+{a_{n+1}}^2{b_{n+1}}^2                                 \\
&=
\bigl[\bigl(\sum_{1\leq j\leq n}a_jb_j\bigr)+a_{n+1}b_{n+1}\bigr]^2
\end{array}

to derive the left-hand side.
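Lagrange's identity is easy to check on integer data, where the arithmetic is exact. The following sketch uses a hypothetical function `lagrange_sides` and 0-based indices in place of the 1-based indices in the statement.

```python
from itertools import combinations

def lagrange_sides(a, b):
    """Return the left-hand and right-hand sides of Lagrange's identity
    for equal-length sequences a and b."""
    n = len(a)
    lhs = sum(a[j] * b[j] for j in range(n)) ** 2
    # Sum over all index pairs k < j of (a_k b_j - a_j b_k)^2.
    cross = sum((a[k] * b[j] - a[j] * b[k]) ** 2
                for k, j in combinations(range(n), 2))
    rhs = sum(x * x for x in a) * sum(y * y for y in b) - cross
    return lhs, rhs

print(lagrange_sides([1, 2, 3], [4, -5, 6]))   # (144, 144)
```

In  \mathbb{R}^2 the subtracted sum has the single term  (a_1b_2-a_2b_1)^2 , which is the square of the area of the parallelogram formed by the two vectors.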

References

  • O'Hanian, Hans (1985), Physics, 1, W. W. Norton 
  • Ivanoff, V. F. (proposer); Esty, T. C. (solver) (Feb. 1933), "Problem 3529", American Mathematical Monthly 39 (2): 118 
  • Pólya, G. (1954), Mathematics and Plausible Reasoning: Volume II Patterns of Plausible Inference, Princeton University Press