Statistical Analysis: an Introduction using R/Cover


An understanding of statistics is an invaluable tool in much of modern science. This is particularly true of fields such as the biosciences and social sciences, in which data are “noisy”: that is, where chance or uncontrollable effects have a substantial impact on the data recorded. This wikibook is aimed at the scientifically minded reader who has not been exposed to much statistical thinking. Besides the Statistics wikibook and the guides on the RWiki, there are many introductory statistics texts on the market. This book differs from most in that, although it quickly introduces “advanced” topics such as likelihood and Bayesian methods, it does so not through formal mathematical proofs but primarily through graphical methods and simulation. The aim is to produce a text that can be used as a “crash course” in statistics for readers at a basic undergraduate level who want to understand how to analyse their data using modern techniques, but who are not particularly mathematical.

The advent of computers has had an enormous impact on the way in which statistics is done in the real world. Although many introductory statistics courses start by describing simple t-tests and chi-squared tests, most scientists now use more advanced techniques to analyse their data. In particular, there is substantial focus on what are known as general and generalised linear models, as well as (more recently) a renewed interest in Bayesian techniques and methods such as bootstrapping. Rather than explaining simple t-tests and the like first and leaving the underlying theory until later, this book aims to do the opposite. The computer program R, with its flexible graphical capabilities, is used to give a feel for the underlying theory of statistics, notably the concepts of probability, likelihood, probability density functions, sampling distributions, and hypothesis testing. These principles can then be applied to understanding the various commonly used statistical tests, and easily extended to more sophisticated statistics such as generalised linear modelling.
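As a rough illustration of what learning "via graphical methods and simulation" can look like, the following minimal R sketch simulates a sampling distribution and plots it. It is not taken from the book's own R topics: the variable names, the choice of an exponential population, the sample size and the number of replicates are all assumptions made purely for this example.

```r
# Illustrative sketch only: the names and values below are assumptions,
# not material from the book itself.
set.seed(1)                 # make the simulation reproducible
n_samples   <- 10000        # number of simulated "experiments"
sample_size <- 25           # observations per experiment

# Each experiment draws sample_size values from a skewed (exponential)
# population with mean 1 and records the sample mean.
sample_means <- replicate(n_samples, mean(rexp(sample_size, rate = 1)))

# The histogram approximates the sampling distribution of the mean.
hist(sample_means, breaks = 50, freq = FALSE,
     main = "Simulated sampling distribution of the mean",
     xlab = "Sample mean")

# Overlay the normal approximation suggested by the central limit theorem:
# mean 1, standard deviation 1 / sqrt(sample_size).
curve(dnorm(x, mean = 1, sd = 1 / sqrt(sample_size)), add = TRUE, lwd = 2)
```

Changing sample_size and re-running the code shows how the spread of the sample means shrinks as more observations are collected, without requiring any formal proof.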

The main text of the book is intended to give a general understanding of statistical methodology, as independent as possible of the statistical package used. This text should be interspersed with relevant "R topics": sections of annotated R code, clearly distinguished from the main body of the book. It should be possible to use these R topics by themselves as a set of tutorials that introduce basic concepts in R.

Using this book, or any portion of it, in other R or statistics-related projects is encouraged: please feel free to copy and use any parts as you see fit. Books whose aims partially overlap with those of this one are detailed at http://www.ling.uni-potsdam.de/~vasishth/SFLS.html and http://wiener.math.csi.cuny.edu/UsingR/. However, neither of these is editable or redistributable.

Notes