Special Relativity/Mathematical approach
From Wikibooks, the open-content textbooks collection
Vectors
Physical effects involve things acting on other things to produce a change of position, tension and so on. These effects usually depend on the strength, angle of contact, separation etc. of the interacting things rather than on any absolute reference frame. It is therefore useful to describe the rules that govern the interactions in terms of the relative positions and lengths of the interacting things rather than in terms of any fixed viewpoint or coordinate system. Vectors were introduced in physics to allow such relative descriptions.
The use of vectors in elementary physics often avoids any real understanding of what they are. They are a new concept, as unique as numbers themselves, which has been related to the rest of mathematics and geometry by a series of constructions such as linear combinations, scalar products etc.
Vectors are defined as "directed line segments", which means they are lines drawn in a particular direction. The introduction of time as a geometric entity makes this definition rather archaic; a better definition might be that a vector is information arranged as a continuous succession of points in space and time. Vectors have length and direction, the direction being from earlier to later.
Vectors are represented by lines terminated with arrow symbols to show the direction. A point that moves from the left to the right for about three centimetres can be represented by an arrow about three centimetres long that points to the right.
If a vector is represented within a coordinate system it has components along each of the axes of the system. These components do not normally start at the origin of the coordinate system.
In three dimensions a vector has components a, b and c, which are lengths on the coordinate axes. If the vector starts at the origin, the components become simply the coordinates of the end point of the vector, and the vector is known as the position vector of the end point.
Addition of Vectors
If two vectors are connected so that the end point of one is the start of the next, the sum of the two vectors is defined as a third vector drawn from the start of the first to the end of the second.
If $\mathbf{c}$ is the sum of $\mathbf{a}$ and $\mathbf{b}$:
$$\mathbf{c} = \mathbf{a} + \mathbf{b}$$
If the components of $\mathbf{a}$ are $a, b, c$ and the components of $\mathbf{b}$ are $d, e, f$, then the components of the sum of the two vectors are $(a+d)$, $(b+e)$ and $(c+f)$. In other words, when vectors are added it is the components that add numerically rather than the lengths of the vectors themselves.
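The component rule can be checked with a few lines of Python. This is a minimal sketch; the helper names `add_vectors` and `length` are illustrative and do not come from the text.

```python
import math

def add_vectors(u, v):
    """Add two vectors component by component."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def length(u):
    """Euclidean length of a vector given by its components."""
    return math.sqrt(sum(ui * ui for ui in u))

a = (3.0, 0.0, 0.0)
b = (0.0, 4.0, 0.0)
c = add_vectors(a, b)

print(c)                      # the components add: (3.0, 4.0, 0.0)
print(length(a) + length(b))  # 7.0
print(length(c))              # 5.0, so the lengths do not simply add
```

The two vectors here are perpendicular, which makes the difference between adding components and adding lengths easy to see.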
Rules of Vector Addition
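1. Commutativity: $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$
2. Associativity: $(\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})$
If the zero vector (which has no length) is labelled as $\mathbf{0}$:
3. $\mathbf{a} + (-\mathbf{a}) = \mathbf{0}$
4. $\mathbf{a} + \mathbf{0} = \mathbf{a}$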
Rules of Vector Multiplication by a Scalar
The discussion of components and vector addition shows that if vector $\mathbf{a}$ has components $a, b, c$ then $q\mathbf{a}$ has components $qa, qb, qc$. Multiplication by a whole number is repeated addition: joining three copies of a vector $\mathbf{c}$ end to end is equivalent to multiplying it by 3.
1. Distributive laws: $q(\mathbf{a} + \mathbf{b}) = q\mathbf{a} + q\mathbf{b}$ and $(q + p)\mathbf{a} = q\mathbf{a} + p\mathbf{a}$
2. Associativity: $q(p\mathbf{a}) = (qp)\mathbf{a}$
Also $1\mathbf{a} = \mathbf{a}$
If the rules of vector addition and multiplication by a scalar apply to a set of elements they are said to define a vector space.
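As an illustration, the rules for multiplication by a scalar can be verified numerically for vectors stored as tuples of components. The functions `add` and `scale` are hypothetical helpers written for this sketch, and the particular vectors and scalars are arbitrary.

```python
def add(u, v):
    """Component-wise vector addition."""
    return tuple(ui + vi for ui, vi in zip(u, v))

def scale(q, u):
    """Multiply every component of u by the scalar q."""
    return tuple(q * ui for ui in u)

a = (1, 2, 3)
b = (4, -1, 0)
q, p = 3, 5

# Distributive laws
print(scale(q, add(a, b)) == add(scale(q, a), scale(q, b)))  # True
print(scale(q + p, a) == add(scale(q, a), scale(p, a)))      # True
# Associativity and the unit scalar
print(scale(q, scale(p, a)) == scale(q * p, a))              # True
print(scale(1, a) == a)                                      # True
# Adding a vector three times is the same as multiplying by 3
print(add(add(a, a), a) == scale(3, a))                      # True
```

A check on one pair of vectors is not a proof, but it shows how each rule reduces to ordinary arithmetic on the components.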
Linear Combinations and Linear Dependence
An element of the form:
$$q_1\mathbf{a}_1 + q_2\mathbf{a}_2 + \dots + q_m\mathbf{a}_m$$
is called a linear combination of the vectors $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_m$.
The set of all linear combinations of these vectors is called the span of the vectors. The word span is used because the scalars $q_i$ can have any value, which means that every point in the subset of the vector space defined by the span can be reached by some linear combination.
Suppose there is a set of vectors $(\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_m)$. If it is possible to express one of these vectors in terms of the others, using any linear combination, then the set is said to be linearly dependent. If it is not possible to express any one of the vectors in terms of the others, using any linear combination, the set is said to be linearly independent.
In other words, if there are values of the scalars, not all zero, such that:
(1) $$q_1\mathbf{a}_1 + q_2\mathbf{a}_2 + \dots + q_m\mathbf{a}_m = \mathbf{0}$$
the set is said to be linearly dependent.
There is a way of determining linear dependence. From (1) it can be seen that if $q_1$ is set to minus one then:
$$\mathbf{a}_1 = q_2\mathbf{a}_2 + q_3\mathbf{a}_3 + \dots + q_m\mathbf{a}_m$$
So in general, if a linear combination whose scalars are not all zero can be written that sums to a zero vector, then the set of vectors $(\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_m)$ is not linearly independent.
If two vectors are linearly dependent then they lie along the same line (wherever a and b lie on the line, scalars can be found to produce a linear combination which is a zero vector). If three vectors are linearly dependent they lie on the same line or on a plane (collinear or coplanar).
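One practical test, which is not given in the text above, uses the standard result that three vectors in three dimensions are linearly dependent exactly when the determinant of the matrix formed from their components is zero. The sketch below assumes that result; `det3` and `linearly_dependent` are illustrative names, and the tolerance `tol` is an arbitrary choice for floating point input.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def linearly_dependent(a, b, c, tol=1e-12):
    """True if the three vectors are coplanar or collinear."""
    return abs(det3(a, b, c)) <= tol

a = (1, 0, 0)
b = (0, 1, 0)

print(linearly_dependent(a, b, (2, 3, 0)))  # True: (2, 3, 0) = 2a + 3b
print(linearly_dependent(a, b, (0, 0, 1)))  # False: the third vector leaves the plane
```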
Dimension
If a vector space contains a set of n linearly independent vectors, but every set of n+1 vectors in it is linearly dependent, the space is said to have a dimension of n. Such a set of n linearly independent vectors is said to be a basis of the vector space.
Scalar Product
Also known as the 'dot product' or 'inner product'. The scalar product is a way of removing the problem of angular measures from the relationship between vectors and, as Weyl put it, a way of comparing the lengths of vectors that are arbitrarily inclined to each other.
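Consider two vectors $\mathbf{a}$ and $\mathbf{b}$ with a common origin and an angle $\theta$ between them.
The projection of $\mathbf{a}$ on the adjacent side is:
$$P = |\mathbf{a}| \cos\theta$$
where $|\mathbf{a}|$ is the length of $\mathbf{a}$.
The scalar product is defined as:
(2) $$\mathbf{a.b} = |\mathbf{a}||\mathbf{b}| \cos\theta$$
Notice that $\cos\theta$ is zero if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular. This means that if the scalar product of two non-zero vectors is zero, the vectors composing it are orthogonal (perpendicular to each other).
(2) also allows $\cos\theta$ to be defined as:
$$\cos\theta = \frac{\mathbf{a.b}}{|\mathbf{a}||\mathbf{b}|}$$
The definition of the scalar product also allows a definition of the length of a vector in terms of the concept of a vector itself. The scalar product of a vector with itself is:
$$\mathbf{a.a} = |\mathbf{a}||\mathbf{a}| \cos 0$$
$\cos 0$ (the cosine of zero) is one so:
$$\mathbf{a.a} = |\mathbf{a}|^2$$
which is our first direct relationship between vectors and scalars. This can be expressed as:
(3) $$\mathbf{a.a} = a^2$$
where $a$ is the length of $\mathbf{a}$.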
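Properties:
1. Linearity: $[G\mathbf{a} + H\mathbf{b}].\mathbf{c} = G\mathbf{a.c} + H\mathbf{b.c}$
2. Symmetry: $\mathbf{a.b} = \mathbf{b.a}$
3. Positive definiteness: $\mathbf{a.a}$ is greater than or equal to 0
4. Distributivity for vector addition: $(\mathbf{a} + \mathbf{b}).\mathbf{c} = \mathbf{a.c} + \mathbf{b.c}$
5. Schwarz inequality: $|\mathbf{a.b}| \le |\mathbf{a}||\mathbf{b}|$
6. Parallelogram equality: $|\mathbf{a} + \mathbf{b}|^2 + |\mathbf{a} - \mathbf{b}|^2 = 2(|\mathbf{a}|^2 + |\mathbf{b}|^2)$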
From the point of view of vector physics the most important property of the scalar product is its expression in terms of coordinates.
7. $\mathbf{a.b} = a_1b_1 + a_2b_2 + a_3b_3$
This gives us the length of a vector in terms of coordinates (Pythagoras' theorem) from:
8. $\mathbf{a.a} = a^2 = a_1^2 + a_2^2 + a_3^2$
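The coordinate form of the scalar product translates directly into code. The following sketch computes the scalar product as the sum of products of corresponding components, the length as the square root of a vector's scalar product with itself, and the angle from the definition of $\cos\theta$; the function names are illustrative only.

```python
import math

def dot(a, b):
    """Scalar product as the sum of products of corresponding components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length of a vector: the square root of its scalar product with itself."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two non-zero vectors, in radians."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

a = (1.0, 2.0, 2.0)
b = (2.0, -1.0, 0.0)

print(dot(a, b))     # 0.0, so a and b are perpendicular
print(length(a))     # 3.0
print(angle(a, b))   # approximately pi / 2
print(abs(dot(a, b)) <= length(a) * length(b))  # Schwarz inequality holds: True
```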
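The derivation of 7 is:
$$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$$
where $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are unit vectors along the coordinate axes. From property 4, together with linearity (property 1):
$$\mathbf{a.b} = a_1\mathbf{i.b} + a_2\mathbf{j.b} + a_3\mathbf{k.b}$$
but $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$
so:
$$\mathbf{a.b} = a_1b_1\mathbf{i.i} + a_1b_2\mathbf{i.j} + a_1b_3\mathbf{i.k} + a_2b_1\mathbf{j.i} + a_2b_2\mathbf{j.j} + a_2b_3\mathbf{j.k} + a_3b_1\mathbf{k.i} + a_3b_2\mathbf{k.j} + a_3b_3\mathbf{k.k}$$
$\mathbf{i.j}$, $\mathbf{i.k}$, $\mathbf{j.k}$ etc. are all zero because the vectors are orthogonal, also $\mathbf{i.i}$, $\mathbf{j.j}$ and $\mathbf{k.k}$ are all one (these are unit vectors defined to be 1 unit in length).
Using these results:
$$\mathbf{a.b} = a_1b_1 + a_2b_2 + a_3b_3$$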
Matrices
Matrices are sets of numbers arranged in a rectangular array. They are especially important in linear algebra because they can be used to represent the elements of linear equations.
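$$11a + 2b = c$$
$$5a + 7b = d$$
The constants in the equations above can be represented as a matrix:
$$\mathbf{A} = \begin{bmatrix} 11 & 2 \\ 5 & 7 \end{bmatrix}$$
The elements of matrices are usually denoted symbolically using lower case letters:
$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
Matrices are said to be equal if all of the corresponding elements are equal.
Eg: if $a_{ij} = b_{ij}$ for every $i$ and $j$
then $\mathbf{A} = \mathbf{B}$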
Matrix Addition
Matrices are added by adding the individual elements of one matrix to the corresponding elements of the other matrix.
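$$c_{ij} = a_{ij} + b_{ij}$$
or $\mathbf{C} = \mathbf{A} + \mathbf{B}$
Matrix addition has the following properties:
1. Commutativity: $\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}$
2. Associativity: $(\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C})$
and, where $\mathbf{0}$ is the matrix whose elements are all zero:
3. $\mathbf{A} + \mathbf{0} = \mathbf{A}$
4. $\mathbf{A} + (-\mathbf{A}) = \mathbf{0}$
From matrix addition it can be seen that the product of a matrix $\mathbf{A}$ and a number $p$ is simply $p\mathbf{A}$, where every element of the matrix is multiplied individually by $p$.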
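Element-wise addition and multiplication by a number can be sketched with matrices stored as lists of rows. The names `mat_add` and `mat_scale` are illustrative, the matrix `A` reuses the constants 11, 2, 5, 7 from the equations above, and `B` is an arbitrary second matrix.

```python
def mat_add(A, B):
    """Add two matrices of the same shape element by element."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(p, A):
    """Multiply every element of A by the number p."""
    return [[p * a for a in row] for row in A]

A = [[11, 2], [5, 7]]
B = [[1, 0], [-5, 3]]

print(mat_add(A, B))                   # [[12, 2], [0, 10]]
print(mat_add(A, B) == mat_add(B, A))  # True (commutativity)
print(mat_scale(2, A))                 # [[22, 4], [10, 14]]
print(mat_add(A, A) == mat_scale(2, A))  # True
```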
Transpose of a Matrix
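A matrix is transposed when the rows and columns are interchanged:
$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad \mathbf{A}^T = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}$$
Notice that the principal diagonal elements stay the same after transposition.
A matrix is symmetric if it is equal to its transpose, $\mathbf{A}^T = \mathbf{A}$, eg: $a_{kj} = a_{jk}$.
It is skew symmetric if $\mathbf{A}^T = -\mathbf{A}$, eg: $a_{kj} = -a_{jk}$. The principal diagonal of a skew symmetric matrix is composed of elements that are zero.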
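A short sketch of transposition and of the symmetry tests, again with matrices as lists of rows. The helper names and the example matrices `S` and `K` are assumptions made for illustration.

```python
def transpose(A):
    """Interchange the rows and columns of A."""
    return [list(column) for column in zip(*A)]

def is_symmetric(A):
    """True if A equals its transpose."""
    return A == transpose(A)

def is_skew_symmetric(A):
    """True if the transpose of A equals minus A."""
    return transpose(A) == [[-a for a in row] for row in A]

S = [[1, 2], [2, 5]]
K = [[0, 3], [-3, 0]]

print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
print(is_symmetric(S))                    # True
print(is_skew_symmetric(K))               # True
print(is_symmetric(K))                    # False
```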
Other Types of Matrix
Diagonal matrix: all elements above and below the principal diagonal are zero.
Unit matrix: denoted by I, is a diagonal matrix where all elements of the principal diagonal are 1.
Matrix Multiplication
Matrix multiplication is defined in terms of the problem of determining the coefficients in linear transformations.
Consider a set of linear transformations between 2 coordinate systems that share a common origin and are related to each other by a rotation of the coordinate axes.
[Figure: Two Coordinate Systems Rotated Relative to Each Other]
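If there are 3 coordinate systems, $x$, $y$ and $z$, these can be transformed from one to another:
$$x_1 = a_{11}y_1 + a_{12}y_2$$
$$x_2 = a_{21}y_1 + a_{22}y_2$$
$$y_1 = b_{11}z_1 + b_{12}z_2$$
$$y_2 = b_{21}z_1 + b_{22}z_2$$
$$x_1 = c_{11}z_1 + c_{12}z_2$$
$$x_2 = c_{21}z_1 + c_{22}z_2$$
By substitution:
$$x_1 = a_{11}(b_{11}z_1 + b_{12}z_2) + a_{12}(b_{21}z_1 + b_{22}z_2)$$
$$x_2 = a_{21}(b_{11}z_1 + b_{12}z_2) + a_{22}(b_{21}z_1 + b_{22}z_2)$$
$$x_1 = (a_{11}b_{11} + a_{12}b_{21})z_1 + (a_{11}b_{12} + a_{12}b_{22})z_2$$
$$x_2 = (a_{21}b_{11} + a_{22}b_{21})z_1 + (a_{21}b_{12} + a_{22}b_{22})z_2$$
Therefore:
$$c_{11} = a_{11}b_{11} + a_{12}b_{21}$$
$$c_{12} = a_{11}b_{12} + a_{12}b_{22}$$
$$c_{21} = a_{21}b_{11} + a_{22}b_{21}$$
$$c_{22} = a_{21}b_{12} + a_{22}b_{22}$$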
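The coefficient matrices are:
$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}, \quad \mathbf{C} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$$
From the linear transformation the product of $\mathbf{A}$ and $\mathbf{B}$ is defined as:
$$\mathbf{C} = \mathbf{AB} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}$$
In the discussion of scalar products it was shown that, for a plane, the scalar product is calculated as:
$$\mathbf{a.b} = a_1b_1 + a_2b_2$$
where $a_1, a_2$ and $b_1, b_2$ are the coordinates of the vectors $\mathbf{a}$ and $\mathbf{b}$.
Now mathematicians define the rows and columns of a matrix as vectors:
A column vector is $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$
and a row vector is $\mathbf{a} = \begin{bmatrix} a_1 & a_2 \end{bmatrix}$
Matrices can be described as vectors eg:
$$\mathbf{A} = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \end{bmatrix}, \quad \mathbf{a}_1 = \begin{bmatrix} a_{11} & a_{12} \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} a_{21} & a_{22} \end{bmatrix}$$
and
$$\mathbf{B} = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 \end{bmatrix}, \quad \mathbf{b}_1 = \begin{bmatrix} b_{11} \\ b_{21} \end{bmatrix}, \quad \mathbf{b}_2 = \begin{bmatrix} b_{12} \\ b_{22} \end{bmatrix}$$
Matrix multiplication is then defined as the scalar products of the vectors so that:
$$\mathbf{AB} = \begin{bmatrix} \mathbf{a}_1.\mathbf{b}_1 & \mathbf{a}_1.\mathbf{b}_2 \\ \mathbf{a}_2.\mathbf{b}_1 & \mathbf{a}_2.\mathbf{b}_2 \end{bmatrix}$$
From the definition of the scalar product
$$\mathbf{a}_1.\mathbf{b}_1 = a_{11}b_{11} + a_{12}b_{21}$$
etc.
In the general case:
$$c_{ij} = \sum_k a_{ik}b_{kj}$$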
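This is described as the multiplication of rows into columns (eg: row vectors into column vectors). The first matrix must have the same number of columns as there are rows in the second matrix or the multiplication is undefined.
After matrix multiplication the product matrix has the same number of rows as the first matrix and the same number of columns as the second matrix:
$$\begin{bmatrix} 1 & 3 & 4 \\ 6 & 3 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix} = \begin{bmatrix} 39 \\ 35 \end{bmatrix}$$
has 2 rows and 1 column, ie: the first row is $1 \times 2 + 3 \times 3 + 4 \times 7 = 39$ and the second row is $6 \times 2 + 3 \times 3 + 2 \times 7 = 35$.
Similarly, a matrix with 2 rows multiplied by a matrix with 3 columns gives a product that has 2 rows and 3 columns.
Notice that a product cannot be determined when the two matrices do not match in this way, because the number of columns in the first matrix must equal the number of rows in the second matrix to perform matrix multiplication.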
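The rows-into-columns rule can be written as a short function. This sketch uses plain Python lists; `mat_mul` is an illustrative name, and the matrices are the ones implied by the arithmetic in the text (first row 1, 3, 4; second row 6, 3, 2; column 2, 3, 7).

```python
def mat_mul(A, B):
    """Multiply A (n x m) by B (m x p), rows into columns."""
    if len(A[0]) != len(B):
        raise ValueError("columns of first matrix must equal rows of second")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 3, 4],
     [6, 3, 2]]
x = [[2],
     [3],
     [7]]

print(mat_mul(A, x))  # [[39], [35]]: 2 rows and 1 column

# The reverse order is undefined: x has 1 column but A has 2 rows.
try:
    mat_mul(x, A)
except ValueError as error:
    print("undefined:", error)
```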
Properties of Matrix Multiplication
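1. Not commutative: in general $\mathbf{AB} \ne \mathbf{BA}$
2. Associative: $\mathbf{A}(\mathbf{BC}) = (\mathbf{AB})\mathbf{C}$
3. Distributive for matrix addition: $\mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{AB} + \mathbf{AC}$
Matrix multiplication is not commutative so $(\mathbf{B} + \mathbf{C})\mathbf{A} = \mathbf{BA} + \mathbf{CA}$ is a separate case.
4. The cancellation law is not always true: $\mathbf{AB} = \mathbf{0}$ does not mean $\mathbf{A} = \mathbf{0}$ or $\mathbf{B} = \mathbf{0}$
There is a case where matrix multiplication is commutative. This involves the scalar matrix, where the values of the principal diagonal are all equal and all other elements are zero. Eg:
$$\mathbf{S} = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}$$
In this case $\mathbf{AS} = \mathbf{SA} = k\mathbf{A}$. If the scalar matrix is the unit matrix: $\mathbf{AI} = \mathbf{IA} = \mathbf{A}$.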
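Small numerical examples make these properties concrete. The matrices below are arbitrary choices made for this sketch, and `mat_mul` is an illustrative helper rather than anything defined in the text.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(mat_mul(A, B))  # [[2, 1], [4, 3]]
print(mat_mul(B, A))  # [[3, 4], [1, 2]], so AB and BA differ

# Cancellation fails: neither P nor Q is zero, yet their product is.
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
print(mat_mul(P, Q))  # [[0, 0], [0, 0]]

# A scalar matrix commutes with A and multiplies it by k = 5.
S = [[5, 0], [0, 5]]
print(mat_mul(A, S))  # [[5, 10], [15, 20]]
print(mat_mul(S, A))  # [[5, 10], [15, 20]]
```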
Linear Transformations
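A simple linear transformation such as:
$$x_1 = a_{11}y_1 + a_{12}y_2$$
$$x_2 = a_{21}y_1 + a_{22}y_2$$
can be expressed as:
$$\mathbf{x} = \mathbf{Ay}$$
eg:
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$$
and
$$y_1 = b_{11}z_1 + b_{12}z_2$$
$$y_2 = b_{21}z_1 + b_{22}z_2$$
as: $\mathbf{y} = \mathbf{Bz}$
Using the associative law:
$$\mathbf{x} = \mathbf{A}(\mathbf{Bz}) = (\mathbf{AB})\mathbf{z}$$
and so:
$$\mathbf{x} = \mathbf{Cz}, \qquad \mathbf{C} = \mathbf{AB}$$
as before.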
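The composition of two transformations can be illustrated with rotations, where the expected result is known: a rotation by 30 degrees combined with one by 60 degrees should equal a single rotation by 90 degrees. This is a sketch only; the sign convention chosen in `rotation` is an assumption, and the function names are not from the text.

```python
import math

def rotation(theta):
    """Coefficient matrix for a rotation by theta radians (one common convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = rotation(math.radians(30))
B = rotation(math.radians(60))
z = [[1.0], [2.0]]

x_two_steps = mat_mul(A, mat_mul(B, z))  # x = A(Bz)
C = mat_mul(A, B)                        # C = AB
x_one_step = mat_mul(C, z)               # x = Cz

print(x_two_steps)  # approximately [[-2.0], [1.0]]
print(x_one_step)   # the same, up to rounding error
```

Because the arithmetic is done in floating point, the two results agree to within rounding error rather than exactly.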
Indicial Notation
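Consider a simple rotation of coordinates:
$x^\mu$ is defined as $x^1$, $x^2$
$x^\nu$ is defined as $x'^{1}$, $x'^{2}$
The scalar product can be written as:
$$\mathbf{s.s} = g_{\mu\nu}x^\mu x^\nu$$
Where:
$$g_{\mu\nu} = \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix}$$
and is called the metric tensor for this 2D space.
$$\mathbf{s.s} = g_{11}x^1x'^{1} + g_{12}x^1x'^{2} + g_{21}x^2x'^{1} + g_{22}x^2x'^{2}$$
Now, $g_{11} = 1$, $g_{12} = 0$, $g_{21} = 0$, $g_{22} = 1$ so:
$$\mathbf{s.s} = x^1x'^{1} + x^2x'^{2}$$
If there is no rotation of coordinates the scalar product is:
$$\mathbf{s.s} = x^1x^1 + x^2x^2$$
$$s^2 = (x^1)^2 + (x^2)^2$$
Which is Pythagoras' theorem.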
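The double sum over the metric tensor can be spelled out as two nested loops. In this sketch `scalar_product` is an illustrative name, indices run from 0 rather than 1 as is usual in Python, and the vector (3, 4) is an arbitrary example.

```python
def scalar_product(g, x, y):
    """Sum g[mu][nu] * x[mu] * y[nu] over both indices."""
    n = len(x)
    return sum(g[mu][nu] * x[mu] * y[nu]
               for mu in range(n)
               for nu in range(n))

# Metric tensor with g11 = 1, g12 = 0, g21 = 0, g22 = 1
g = [[1, 0],
     [0, 1]]
x = (3, 4)

print(scalar_product(g, x, x))  # 25, the squared length 3**2 + 4**2
```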
The Summation Convention
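Indexes that appear as both subscripts and superscripts are summed over.
$$g_{\mu\nu}x^\mu x^\nu = g_{11}x^1x^1 + g_{12}x^1x^2 + g_{21}x^2x^1 + g_{22}x^2x^2$$
By promoting $\nu$ to a superscript it is taken out of the summation, ie:
$$g_\mu{}^\nu x^\mu = g_1{}^\nu x^1 + g_2{}^\nu x^2$$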
Matrix Multiplication in Indicial Notation
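Consider:
Columns times rows:
$\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ times $\begin{bmatrix} b_1 & b_2 \end{bmatrix}$ = $\begin{bmatrix} a_1b_1 & a_1b_2 \\ a_2b_1 & a_2b_2 \end{bmatrix}$
Matrix product $M_{ij} = a_ib_j$
Where $i = 1, 2$ and $j = 1, 2$
There being no summation the indexes are both subscripts.
Rows times columns:
$\begin{bmatrix} a^1 & a^2 \end{bmatrix}$ times $\begin{bmatrix} b^1 \\ b^2 \end{bmatrix}$ = $\begin{bmatrix} a^1b^1 + a^2b^2 \end{bmatrix}$
Matrix product $\delta_{ij}a^ib^j$
Where $\delta_{ij}$ is known as the Kronecker delta and has the value 0 when $i \ne j$ and 1 when $i = j$. It is the indicial equivalent of the unit matrix:
$$\delta_{ij} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
There being summation one value of $i$ is a subscript and the other a superscript.
A matrix in general can be specified by any of: $M_i{}^j$, $M_{ij}$, $M^i{}_j$, $M^{ij}$ depending on which subscript or superscript is being summed over.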
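The difference between the two products can be sketched in Python: columns times rows gives a matrix with no summation, while rows times columns sums over the indices, with the Kronecker delta selecting the terms where both indices match. All names below are illustrative, and indices start at 0 as usual in Python.

```python
def outer(a, b):
    """Columns times rows: M[i][j] = a[i] * b[j], with no summation."""
    return [[ai * bj for bj in b] for ai in a]

def kronecker_delta(i, j):
    """1 when the indices are equal, 0 otherwise."""
    return 1 if i == j else 0

def inner(a, b):
    """Rows times columns: sum of delta(i, j) * a[i] * b[j] over i and j."""
    n = len(a)
    return sum(kronecker_delta(i, j) * a[i] * b[j]
               for i in range(n)
               for j in range(n))

a = (1, 2)
b = (3, 4)

print(outer(a, b))  # [[3, 4], [6, 8]]
print(inner(a, b))  # 11, i.e. 1*3 + 2*4
```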
Vectors in Indicial Notation
A vector can be expressed as a sum of basis vectors.
$$\mathbf{x} = a^1\mathbf{e}_1 + a^2\mathbf{e}_2 + a^3\mathbf{e}_3$$
In indicial notation this is: $\mathbf{x} = a^i\mathbf{e}_i$
Linear Transformations in indicial notation
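Consider $\mathbf{x} = \mathbf{Ay}$ where $\mathbf{A}$ is a coefficient matrix and $\mathbf{x}$ and $\mathbf{y}$ are coordinate matrices.
In indicial notation this is:
$$x^\mu = A^\mu{}_\nu y^\nu$$
which becomes, in three dimensions:
$$x^1 = A^1{}_1y^1 + A^1{}_2y^2 + A^1{}_3y^3$$
$$x^2 = A^2{}_1y^1 + A^2{}_2y^2 + A^2{}_3y^3$$
$$x^3 = A^3{}_1y^1 + A^3{}_2y^2 + A^3{}_3y^3$$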
The Scalar Product in indicial notation
In indicial notation the scalar product is:
$$\mathbf{a.b} = g_{\mu\nu}a^\mu b^\nu$$
Analysis of curved surfaces and transformations
It became apparent at the start of the nineteenth century that issues such as Euclid's parallel postulate required the development of a new type of geometry that could deal with curved surfaces and real and imaginary planes. At the foundation of this approach is Gauss's analysis of curved surfaces which allows us to work with a variety of coordinate systems and displacements on any type of surface.
Elementary geometric analysis is useful as an introduction to Special Relativity because it suggests the physical meaning of the coefficients that appear in coordinate transformations.
Suppose there is a line on a surface. The length of this line can be expressed in terms of a coordinate system. A short length of line Δs in a two dimensional space may be expressed in terms of Pythagoras' theorem as:
$$\Delta s^2 = \Delta x^2 + \Delta y^2$$
Suppose there is another coordinate system on the surface with two axes, $x_1$ and $x_2$. How can the length of the line be expressed in terms of these coordinates? Gauss tackled this problem and his analysis is quite straightforward for two coordinate axes:
Figure 1:
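It is possible to use elementary differential geometry to describe displacements along the plane in terms of displacements on the curved surface:
$$\Delta Y = \frac{\partial Y}{\partial x_1}\Delta x_1 + \frac{\partial Y}{\partial x_2}\Delta x_2$$
$$\Delta Z = \frac{\partial Z}{\partial x_1}\Delta x_1 + \frac{\partial Z}{\partial x_2}\Delta x_2$$
The displacement of a short line is then assumed to be given by a formula, called a metric, such as Pythagoras' theorem:
$$\Delta S^2 = \Delta Y^2 + \Delta Z^2$$
The values of $\Delta Y$ and $\Delta Z$ can then be substituted into this metric:
$$\Delta S^2 = \left(\frac{\partial Y}{\partial x_1}\Delta x_1 + \frac{\partial Y}{\partial x_2}\Delta x_2\right)^2 + \left(\frac{\partial Z}{\partial x_1}\Delta x_1 + \frac{\partial Z}{\partial x_2}\Delta x_2\right)^2$$
Which, when expanded, gives the following:
$$\begin{aligned}
\Delta S^2 = {} & \left(\frac{\partial Y}{\partial x_1}\frac{\partial Y}{\partial x_1} + \frac{\partial Z}{\partial x_1}\frac{\partial Z}{\partial x_1}\right)\Delta x_1\Delta x_1 \\
& + \left(\frac{\partial Y}{\partial x_1}\frac{\partial Y}{\partial x_2} + \frac{\partial Z}{\partial x_1}\frac{\partial Z}{\partial x_2}\right)\Delta x_1\Delta x_2 \\
& + \left(\frac{\partial Y}{\partial x_2}\frac{\partial Y}{\partial x_1} + \frac{\partial Z}{\partial x_2}\frac{\partial Z}{\partial x_1}\right)\Delta x_2\Delta x_1 \\
& + \left(\frac{\partial Y}{\partial x_2}\frac{\partial Y}{\partial x_2} + \frac{\partial Z}{\partial x_2}\frac{\partial Z}{\partial x_2}\right)\Delta x_2\Delta x_2
\end{aligned}$$
This can be represented using summation notation:
$$\Delta S^2 = \sum_{i=1}^{2}\sum_{k=1}^{2}\left(\frac{\partial Y}{\partial x_i}\frac{\partial Y}{\partial x_k} + \frac{\partial Z}{\partial x_i}\frac{\partial Z}{\partial x_k}\right)\Delta x_i\Delta x_k$$
Or, using indicial notation (with $\Delta x^i$ denoting the displacement $\Delta x_i$):
$$\Delta S^2 = g_{ik}\Delta x^i\Delta x^k$$
Where:
$$g_{ik} = \frac{\partial Y}{\partial x_i}\frac{\partial Y}{\partial x_k} + \frac{\partial Z}{\partial x_i}\frac{\partial Z}{\partial x_k}$$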
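If the coordinates are not merged then $\Delta s$ is dependent on both sets of coordinates. In matrix notation:
$$\Delta s^2 = g_{ik}\Delta x^i\Delta x^k$$
becomes:
$$\Delta s^2 = \begin{bmatrix} \Delta x_1 & \Delta x_2 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix}$$
Where $a, b, c, d$ stand for the values of $g_{ik}$.
Therefore:
$$\Delta s^2 = \begin{bmatrix} a\Delta x_1 + c\Delta x_2 & b\Delta x_1 + d\Delta x_2 \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix}$$
Which is:
$$\Delta s^2 = (a\Delta x_1 + c\Delta x_2)\Delta x_1 + (b\Delta x_1 + d\Delta x_2)\Delta x_2$$
So:
$$\Delta s^2 = a\Delta x_1^2 + (b + c)\Delta x_1\Delta x_2 + d\Delta x_2^2$$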
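$\Delta s^2$ is a bilinear form that depends on both $\Delta x_1$ and $\Delta x_2$. It can be written in matrix notation as:
$$\Delta s^2 = \Delta\mathbf{x}^T\mathbf{A}\,\Delta\mathbf{x}$$
Where $\mathbf{A}$ is the matrix containing the values in $g_{ik}$. This is a special case of the bilinear form known as the quadratic form because the same matrix ($\Delta\mathbf{x}$) appears twice; in the generalised bilinear form $\mathbf{x}^T\mathbf{A}\mathbf{y}$ the matrices $\mathbf{x}$ and $\mathbf{y}$ are different.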
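The quadratic form can be evaluated directly as a double sum and compared with the expanded expression $a\Delta x_1^2 + (b + c)\Delta x_1\Delta x_2 + d\Delta x_2^2$. The sketch below uses arbitrary values for the coefficients; `bilinear_form` and `quadratic_form` are illustrative names.

```python
def bilinear_form(A, x, y):
    """Evaluate x^T A y as a double sum over the elements of A."""
    return sum(x[i] * A[i][k] * y[k]
               for i in range(len(x))
               for k in range(len(y)))

def quadratic_form(A, dx):
    """Special case of the bilinear form with the same vector on both sides."""
    return bilinear_form(A, dx, dx)

a, b, c, d = 2, 3, 5, 7          # arbitrary stand-ins for the values of g_ik
A = [[a, b],
     [c, d]]
dx = (1, 2)

expanded = a * dx[0] ** 2 + (b + c) * dx[0] * dx[1] + d * dx[1] ** 2
print(quadratic_form(A, dx))  # 46
print(expanded)               # 46
```

With the unit matrix in place of `A` the same function returns the sum of the squares of the displacements.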
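If the surface is a Euclidean plane, with coordinates chosen so that $Y = x_1$ and $Z = x_2$, then the values of $g_{ik}$ are:
$$g_{ik} = \begin{bmatrix} \frac{\partial Y}{\partial x_1}\frac{\partial Y}{\partial x_1} + \frac{\partial Z}{\partial x_1}\frac{\partial Z}{\partial x_1} & \frac{\partial Y}{\partial x_1}\frac{\partial Y}{\partial x_2} + \frac{\partial Z}{\partial x_1}\frac{\partial Z}{\partial x_2} \\ \frac{\partial Y}{\partial x_2}\frac{\partial Y}{\partial x_1} + \frac{\partial Z}{\partial x_2}\frac{\partial Z}{\partial x_1} & \frac{\partial Y}{\partial x_2}\frac{\partial Y}{\partial x_2} + \frac{\partial Z}{\partial x_2}\frac{\partial Z}{\partial x_2} \end{bmatrix}$$
Which become:
$$g_{ik} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
So the matrix $\mathbf{A}$ is the unit matrix $\mathbf{I}$ and:
$$\Delta s^2 = \Delta\mathbf{x}^T\mathbf{I}\,\Delta\mathbf{x}$$
and:
$$\Delta s^2 = \Delta x_1^2 + \Delta x_2^2$$
Which recovers Pythagoras' theorem yet again.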
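The coefficients $g_{ik}$ can also be computed for coordinates that are not Cartesian. As a hedged illustration, the sketch below assumes polar coordinates, which are not used in the text: $x_1 = r$, $x_2 = \phi$, with $Y = r\cos\phi$ and $Z = r\sin\phi$. It takes $g_{ik}$ to be the sum over $Y$ and $Z$ of the products of partial derivatives with respect to $x_i$ and $x_k$, which is what substituting $\Delta Y$ and $\Delta Z$ into the Pythagorean metric gives.

```python
import math

def metric_polar(r, phi):
    """g_ik for Y = r cos(phi), Z = r sin(phi), with x_1 = r and x_2 = phi."""
    dY = (math.cos(phi), -r * math.sin(phi))  # (dY/dr, dY/dphi)
    dZ = (math.sin(phi), r * math.cos(phi))   # (dZ/dr, dZ/dphi)
    return [[dY[i] * dY[k] + dZ[i] * dZ[k] for k in range(2)]
            for i in range(2)]

g = metric_polar(2.0, 0.7)
for row in g:
    print(row)
# Up to rounding error the result is [[1, 0], [0, r**2]], here [[1, 0], [0, 4]],
# which corresponds to the metric dr**2 + r**2 * dphi**2.
```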
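If the surface is derived from some other metric such as $\Delta s^2 = -\Delta Y^2 + \Delta Z^2$ then the values of $g_{ik}$ are:
$$g_{ik} = -\frac{\partial Y}{\partial x_i}\frac{\partial Y}{\partial x_k} + \frac{\partial Z}{\partial x_i}\frac{\partial Z}{\partial x_k}$$
Which becomes:
$$g_{ik} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$
Which allows the original metric to be recovered ie: $\Delta s^2 = -\Delta x_1^2 + \Delta x_2^2$.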
It is interesting to compare the geometrical analysis with the transformation based on matrix algebra that was derived in the section on indicial notation above:
$$\mathbf{s.s} = g_{\mu\nu}x^\mu x^\nu$$
Now, $g_{\mu\nu} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
ie: $g_{11} = 1$, $g_{12} = 0$, $g_{21} = 0$, $g_{22} = 1$ so:
$$\mathbf{s.s} = x^1x'^{1} + x^2x'^{2}$$
If there is no rotation of coordinates the scalar product is:
$$\mathbf{s.s} = x^1x^1 + x^2x^2$$
$$s^2 = (x^1)^2 + (x^2)^2$$
Which recovers Pythagoras' theorem. However, the reader may have noticed that Pythagoras' theorem had been assumed from the outset in the derivation of the scalar product (see above).
The geometrical analysis shows that if a metric is assumed and the conditions that allow differential geometry are present then it is possible to derive one set of coordinates from another. This analysis can also be performed using matrix algebra with the same assumptions.
The example above used a simple two dimensional Pythagorean metric. Some other metric, such as the metric of a 4D Minkowskian space:
$$\Delta S^2 = -\Delta T^2 + \Delta X^2 + \Delta Y^2 + \Delta Z^2$$
could be used instead of Pythagoras' theorem.
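As a sketch of how such a metric is used, the interval can be computed as a double sum over a diagonal metric with entries -1, 1, 1, 1. The function name `interval_squared` is illustrative, and time is assumed to be measured in the same units as length, as the form of the metric in the text implies.

```python
# Minkowski metric with the signature used in the text: (-, +, +, +)
MINKOWSKI = [[-1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]

def interval_squared(dt, dx, dy, dz, g=MINKOWSKI):
    """Sum g[i][k] * d[i] * d[k] over the four displacement components."""
    d = (dt, dx, dy, dz)
    return sum(g[i][k] * d[i] * d[k]
               for i in range(4)
               for k in range(4))

print(interval_squared(5, 3, 0, 0))  # -16
print(interval_squared(1, 1, 0, 0))  # 0
print(interval_squared(0, 1, 2, 2))  # 9, the Pythagorean result when dt is zero
```

Unlike the Pythagorean metric, this quantity can be negative or zero for displacements that are not themselves zero.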























