Statistics/Numerical Methods

From Wikibooks, open books for an open world

Often the solution of statistical problems involves the use of tools from numerical mathematics. An example is Maximum-Likelihood estimation of \widehat{\Theta}, which requires the maximization of the likelihood function L:

\widehat{\Theta} = \arg\max_\theta\,\, L(\theta \mid x_1, \ldots, x_n).

The maximization here requires the use of optimization routines. Other numerical methods and their application in statistics are described in this section.

Contents of this section:

This section is dedicated to the Gram-Schmidt Orthogonalization, which occurs frequently in the solution of statistical problems. Additionally, some results from linear algebra which are necessary to understand the Gram-Schmidt Orthogonalization are provided. The Gram-Schmidt Orthogonalization is an algorithm which generates from a set of linearly independent vectors a new set of mutually orthogonal vectors spanning the same space. Computation based on orthogonal vectors is simpler than computation based on arbitrary linearly independent vectors.
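The procedure described above can be sketched as follows. This is a minimal illustration of the classical Gram-Schmidt algorithm (the function name and the example vectors are chosen here for illustration only): each vector has its projections onto the previously accepted basis vectors subtracted, and the remainder is normalized.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a set of linearly independent vectors
    via classical Gram-Schmidt."""
    basis = []
    for v in vectors:
        # Subtract the projection of v onto every vector accepted so far.
        w = v - sum(np.dot(v, b) * b for b in basis)
        # Normalize the orthogonal remainder.
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

# Three linearly independent (but not orthogonal) vectors in R^3.
V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(V)
# The rows of Q are orthonormal, so Q @ Q.T is (numerically) the identity.
```

In finite precision the classical variant can lose orthogonality for ill-conditioned inputs; the modified Gram-Schmidt variant, which subtracts projections one at a time from the working vector, is numerically more robust.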

Numerical optimization occurs in all kinds of problems - a prominent example being the Maximum-Likelihood estimation described above. Hence this section describes one important class of optimization algorithms, namely the so-called Gradient Methods. After describing the theory and developing an intuition for the general procedure, three specific algorithms (the Method of Steepest Descent, the Newtonian Method, and the class of Variable Metric Methods) are described in more detail. In particular, we provide a graphical evaluation of the performance of these three algorithms on specific criterion functions (the Himmelblau function and the Rosenbrock function). Furthermore, we return to Maximum-Likelihood estimation and give a concrete example of how to tackle this problem with the methods developed in this section.
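As a taste of what follows, here is a minimal sketch of the Method of Steepest Descent applied to the Rosenbrock function mentioned above. The fixed step size, tolerance, and iteration limit are illustrative choices, not part of the section's algorithms; the Rosenbrock function itself has its well-known minimum at (1, 1).

```python
import numpy as np

def rosenbrock(x):
    """Rosenbrock function: minimum at (1, 1)."""
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosenbrock_grad(x):
    """Analytic gradient of the Rosenbrock function."""
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

def steepest_descent(grad, x0, lr=1e-3, tol=1e-8, max_iter=200_000):
    """Fixed-step steepest descent: move against the gradient
    until its norm falls below tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - lr * g
    return x

x_min = steepest_descent(rosenbrock_grad, [-1.2, 1.0])
```

The very large iteration budget is no accident: the Rosenbrock function's narrow curved valley makes steepest descent converge extremely slowly, which is exactly why the Newtonian and Variable Metric Methods discussed later are of interest.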

In OLS, the primary goal is to determine the conditional mean of a random variable Y given some explanatory variable x_i, i.e. E[Y|x_i]. Quantile Regression goes beyond this and enables us to pose such a question at any quantile of the conditional distribution function. It thereby focuses on the interrelationship between a dependent variable and its explanatory variables for a given quantile.
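The key idea behind quantile regression is that, just as the mean minimizes the sum of squared errors, the tau-th quantile minimizes the sum of the Koenker-Bassett "check" (pinball) loss. A minimal sketch of this fact, using a brute-force grid search rather than the linear-programming methods used in practice (the function names and the toy data are illustrative):

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

# Toy sample with one large outlier.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
tau = 0.5  # the median

# Minimizing sum of rho_tau(y - q) over q yields the tau-th sample quantile.
grid = np.linspace(0.0, 110.0, 11001)
losses = [check_loss(y - q, tau).sum() for q in grid]
q_hat = grid[int(np.argmin(losses))]
# q_hat recovers the median (3.0), unaffected by the outlier 100.
```

Choosing tau = 0.9 instead would recover the 0.9-quantile; in full quantile regression q is replaced by a linear function of the explanatory variables and the same loss is minimized over its coefficients.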

Statistical calculations require extra accuracy and are prone to errors such as truncation and cancellation. These errors arise from binary representation and finite precision and may lead to inaccurate results. In this section we discuss the accuracy of statistical software, the different tests and methods available for measuring accuracy, and a comparison of different packages.
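Cancellation error can be shown with a classic example: the "textbook" one-pass variance formula subtracts two nearly equal large numbers and can lose every significant digit, while the two-pass formula stays accurate. The function names below are illustrative; the data set is chosen to have a large mean and a tiny spread, the worst case for cancellation.

```python
# Naive one-pass formula: E[X^2] - (E[X])^2.
# Both terms are ~1e16 while their difference is ~1, so almost all
# significant digits cancel in double precision.
def var_naive(x):
    n = len(x)
    return sum(xi * xi for xi in x) / n - (sum(x) / n) ** 2

# Two-pass formula: center first, then sum squared deviations.
# The deviations are small, so no catastrophic cancellation occurs.
def var_two_pass(x):
    n = len(x)
    m = sum(x) / n
    return sum((xi - m) ** 2 for xi in x) / n

x = [1e8 + 1.0, 1e8 + 2.0, 1e8 + 3.0]  # large mean, tiny spread
# True population variance is 2/3; the naive formula loses essentially
# all accuracy here, the two-pass formula does not.
```

This is precisely the kind of defect the accuracy tests discussed in this section are designed to detect in statistical packages.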

The purpose of this section is to evaluate the accuracy of MS Excel's statistical procedures and to conclude whether MS Excel should be used for (statistical) scientific purposes or not. The evaluation covers MS Excel versions 97, 2000, XP and 2003.