# Statistics/Multivariate Data Analysis

## Distributions

### Multivariate Normal

The multivariate normal distribution extends the univariate normal distribution to random vectors. The simplest definition can be given as follows:

Definition (Multivariate Normal Distribution):

A random vector ${\displaystyle \mathbf {X} }$ of dimension ${\displaystyle p}$ is said to follow a multivariate normal distribution with mean ${\displaystyle \mu }$ and covariance matrix ${\displaystyle \Sigma }$ if ${\displaystyle \forall \mathbf {a} \in \mathbb {R} ^{p},\ \mathbf {a} ^{T}\mathbf {X} \sim {\mathcal {N}}(\mathbf {a} ^{T}\mu ,\mathbf {a} ^{T}\Sigma \mathbf {a} )}$. It is denoted by ${\displaystyle \mathbf {X} \sim {\mathcal {N}}_{p}(\mu ,\Sigma )}$.
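The defining property can be checked empirically: every fixed linear combination of the coordinates of a multivariate normal vector is univariate normal with the stated mean and variance. The sketch below uses hypothetical parameters chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters for illustration.
mu = np.array([1.0, -2.0, 0.5])
A = rng.standard_normal((3, 3))
Sigma = A @ A.T  # a random positive definite covariance matrix

# Draw samples from N_3(mu, Sigma).
X = rng.multivariate_normal(mu, Sigma, size=100_000)

# Any fixed linear combination a^T X should be univariate normal
# with mean a^T mu and variance a^T Sigma a.
a = np.array([0.3, -1.0, 2.0])
proj = X @ a

print(proj.mean(), a @ mu)              # sample vs. theoretical mean
print(proj.var(ddof=1), a @ Sigma @ a)  # sample vs. theoretical variance
```

The sample mean and variance of the projection should match ${\displaystyle \mathbf {a} ^{T}\mu }$ and ${\displaystyle \mathbf {a} ^{T}\Sigma \mathbf {a} }$ up to Monte Carlo error.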

At first glance, the definition seems rather abstract and esoteric. After all, the univariate normal distribution has a specific density and a specific characteristic function, both of which are mathematically valid characterisations of a probability distribution. However, a definition of this kind is necessary to deal with the case where ${\displaystyle \Sigma }$ is not strictly positive definite. When ${\displaystyle \Sigma }$ is positive definite, it can be shown (for instance via the characteristic function, or by a change of variables from independent standard normals) that ${\displaystyle \mathbf {X} }$ has density ${\displaystyle f_{\mathbf {X} }(\mathbf {x} )={\frac {1}{(2\pi )^{\frac {p}{2}}|\Sigma |^{\frac {1}{2}}}}e^{-{\frac {1}{2}}(\mathbf {x} -\mu )^{T}\Sigma ^{-1}(\mathbf {x} -\mu )}}$. When ${\displaystyle \Sigma }$ is singular, however, no density with respect to Lebesgue measure on ${\displaystyle \mathbb {R} ^{p}}$ exists: the distribution is concentrated on a lower-dimensional affine subspace determined by the range of ${\displaystyle \Sigma }$. The definition above, or an equivalent one based on the characteristic function, still works in this degenerate case.
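For positive definite ${\displaystyle \Sigma }$, the density formula can be verified against `scipy.stats.multivariate_normal`. The parameters below are a hypothetical 2-dimensional example.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical 2-dimensional example.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.3, 0.7])

p = len(mu)
diff = x - mu
# Density for positive definite Sigma:
# (2*pi)^(-p/2) |Sigma|^(-1/2) exp(-0.5 (x-mu)^T Sigma^{-1} (x-mu))
manual = (2 * np.pi) ** (-p / 2) / np.sqrt(np.linalg.det(Sigma)) \
         * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

print(manual)
print(multivariate_normal(mu, Sigma).pdf(x))  # should agree with manual
```

Note the use of `np.linalg.solve` rather than explicitly inverting ${\displaystyle \Sigma }$, which is both faster and numerically safer.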

### Matrix-variate Normal

We will first need to develop some notation. Let ${\displaystyle X_{m\times n}}$ be a matrix with columns ${\displaystyle c_{(1)},c_{(2)},\ldots ,c_{(n)}}$. Then we define the column vector ${\textstyle vec(X):={\begin{bmatrix}c_{(1)}\\c_{(2)}\\\vdots \\c_{(n)}\end{bmatrix}}}$, and we call it the vectorisation of ${\displaystyle X}$.
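Since ${\displaystyle vec(X)}$ stacks the columns of ${\displaystyle X}$, some care is needed in row-major environments. In NumPy, for example, the default memory order is row-major, so the column-stacking vectorisation requires Fortran (column-major) order explicitly:

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])  # a 2x3 matrix

# vec(X) stacks the COLUMNS of X. NumPy flattens in row-major
# order by default, so we request column-major ("F") order.
vec_X = X.reshape(-1, order="F")
print(vec_X)  # [1 4 2 5 3 6]
```

The columns ${\displaystyle (1,4)^{T},(2,5)^{T},(3,6)^{T}}$ are stacked in order, as the definition requires.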

Definition (Matrix-variate Normal):

We say ${\displaystyle X_{m\times n}}$ follows a matrix-variate normal distribution with mean matrix ${\displaystyle \mu _{m\times n}}$ and covariance matrix ${\displaystyle \Sigma _{mn\times mn}}$ if ${\displaystyle vec(X)\sim {\mathcal {N}}_{mn}(vec(\mu ),\Sigma )}$.

The reader should notice that this simply imposes a normal distribution on the vectorisation of ${\displaystyle X}$. Thus, many of the results that hold for a multivariate normal random vector also hold for the vectorisation of a matrix-variate normal random matrix.

Now that we have a definition of the multivariate and matrix-variate normal distribution, our next aim should be to find a similar analogue of the univariate ${\displaystyle \chi _{(p)}^{2}}$ distribution with ${\displaystyle p}$ degrees of freedom and Student's ${\displaystyle t}$ distribution, both of which are very closely related to the univariate normal distribution. We know that if ${\displaystyle X_{1},X_{2},\ldots ,X_{n}}$ are independent with ${\displaystyle X_{i}\sim {\mathcal {N}}(\mu _{i},\sigma _{i}^{2})}$ for each ${\displaystyle i}$, then ${\displaystyle \sum _{i=1}^{n}{\frac {(X_{i}-\mu _{i})^{2}}{\sigma _{i}^{2}}}\sim \chi _{(n)}^{2}}$. What would be an analogue of this for the multivariate case?
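The univariate fact above is easy to check by simulation: standardising each independent normal and summing the squares should produce a ${\displaystyle \chi _{(n)}^{2}}$ variable. The means and variances below are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical means and standard deviations for n = 4 independent normals.
mu = np.array([0.0, 1.0, -2.0, 3.0])
sigma = np.array([1.0, 2.0, 0.5, 1.5])
n = len(mu)

# 50,000 replications of sum_i (X_i - mu_i)^2 / sigma_i^2.
X = rng.normal(mu, sigma, size=(50_000, n))
Q = (((X - mu) / sigma) ** 2).sum(axis=1)

# Q should follow a chi-square distribution with n degrees of freedom,
# whose mean is n and whose variance is 2n.
print(Q.mean())  # approximately n = 4
print(Q.var())   # approximately 2n = 8
```

The sample mean and variance of ${\displaystyle Q}$ match the ${\displaystyle \chi _{(4)}^{2}}$ values ${\displaystyle n=4}$ and ${\displaystyle 2n=8}$ up to Monte Carlo error.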

### Wishart Distribution

Definition (Wishart Distribution):

If ${\displaystyle \mathbf {X} _{i}{\overset {iid}{\sim }}{\mathcal {N}}_{p}(\mu ,\Sigma )}$ for ${\displaystyle i\in \{1,2,\ldots n\}}$, then ${\displaystyle S=\sum _{i=1}^{n}(\mathbf {X_{i}} -\mu )(\mathbf {X_{i}} -\mu )^{T}}$ is said to have a Wishart distribution with ${\displaystyle n}$ degrees of freedom and associated matrix ${\displaystyle \Sigma }$. It is denoted by ${\displaystyle S\sim W_{p}(n,\Sigma )}$.

Although a form of density does exist for the Wishart distribution, it is not needed to prove most of the results we will require. An important fact, however, is that if ${\displaystyle S\sim W_{p}(n,\Sigma )}$, then for any fixed ${\displaystyle \mathbf {a} \neq \mathbf {0} }$, ${\displaystyle {\frac {\mathbf {a} ^{T}S\mathbf {a} }{\mathbf {a} ^{T}\Sigma \mathbf {a} }}\sim \chi _{(n)}^{2}}$. This follows by multiplying ${\displaystyle S}$ on the left and right by ${\displaystyle \mathbf {a} ^{T}}$ and ${\displaystyle \mathbf {a} }$ to get ${\displaystyle \mathbf {a} ^{T}S\mathbf {a} =\sum _{i=1}^{n}\left(\mathbf {a} ^{T}(\mathbf {X} _{i}-\mu )\right)^{2}}$, and then noting that the ${\displaystyle \mathbf {a} ^{T}(\mathbf {X} _{i}-\mu )\sim {\mathcal {N}}(0,\mathbf {a} ^{T}\Sigma \mathbf {a} )}$ are iid, so their standardised squares sum to a ${\displaystyle \chi _{(n)}^{2}}$ random variable.
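The quadratic-form fact can be verified by simulation: build Wishart matrices from iid multivariate normal draws and check that the ratio behaves like a ${\displaystyle \chi _{(n)}^{2}}$ variable. All parameters below are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

p, n, reps = 3, 5, 20_000

# Hypothetical positive definite Sigma.
A = rng.standard_normal((p, p))
Sigma = A @ A.T + p * np.eye(p)
mu = np.zeros(p)
a = np.array([1.0, -0.5, 2.0])

# For S ~ W_p(n, Sigma), the ratio a^T S a / (a^T Sigma a)
# should be chi-square with n degrees of freedom.
ratios = np.empty(reps)
for r in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n)  # n iid N_p(mu, Sigma) rows
    S = (X - mu).T @ (X - mu)                       # sum of outer products
    ratios[r] = (a @ S @ a) / (a @ Sigma @ a)

print(ratios.mean())  # approximately n = 5
print(ratios.var())   # approximately 2n = 10
```

The sample mean and variance of the ratios match the ${\displaystyle \chi _{(5)}^{2}}$ values ${\displaystyle n=5}$ and ${\displaystyle 2n=10}$ up to Monte Carlo error.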