R Programming/Introduction
From Wikibooks, open books for an open world
What is R?
R is statistical software used for data analysis. It includes a large number of statistical procedures, such as t-tests, chi-square tests, standard linear models, instrumental variables estimation, and local polynomial regressions. It also provides high-level graphics capabilities.
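As a minimal sketch of two of the procedures named above, the following runs a t-test and fits a standard linear model. It assumes the `mtcars` example dataset that ships with R; the choice of variables (`mpg`, `am`, `wt`) is only for illustration.

```r
# Built-in example data: fuel economy and other specifications for 32 cars
data(mtcars)

# Two-sample t-test: does mpg differ between automatic (am = 0)
# and manual (am = 1) cars?
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)

# Standard linear model: mpg explained by car weight
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))
```

Both `tt` and `fit` are ordinary R objects, so their components (such as the p-value or the coefficients) can be reused in later computations.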
Why use R?
- R is free software. R is an official GNU project and is distributed under the GNU General Public License (GPL).
- R is a powerful data-analysis package with many standard and cutting-edge statistical functions. See the Comprehensive R Archive Network's Task Views to get an idea of what you can do with R.
- R is a programming language, so its abilities can easily be extended through the use of user-defined functions. A large collection of user-contributed functions and packages can be found at CRAN's Contributed Packages page.
- R is widely used in political science, statistics, econometrics, actuarial sciences, sociology, finance, etc.
- R is available for all major operating systems (Windows, macOS, Linux).
- R is object oriented. Virtually anything (e.g., complex data structures) can be stored as an R object.
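- R is a matrix language: vectors and matrices are basic data types, and arithmetic operators apply to whole vectors and matrices at once.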
- R syntax is much more systematic than Stata or SAS syntax.
- R can be installed on your USB stick[1].
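Two of the points above, user-defined functions and matrix operations, can be sketched as follows. The function name `relative.sd` is hypothetical and chosen only for this example.

```r
# A user-defined function (hypothetical name): standard deviation
# divided by the mean
relative.sd <- function(x) {
  sd(x) / mean(x)
}
cv <- relative.sd(c(2, 4, 6, 8))

# Matrices are ordinary objects; matrix() fills column by column,
# so the rows of m are (1, 3) and (2, 4)
m <- matrix(1:4, nrow = 2)
m2 <- m * 2      # element-wise arithmetic
mm <- m %*% m    # matrix product
mt <- t(m)       # transpose
```

Once defined, `relative.sd` is called like any built-in function.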
Alternatives to R
- S-PLUS is a commercial implementation of the S programming language, of which R is a free implementation.
- Gretl is free software for econometrics. It has a graphical user interface and is nice for beginners.
- SPSS is proprietary software which is often used in sociology, psychology and marketing. It is known to be easy to use.
- GNU PSPP is a free-software alternative to SPSS.
- SAS is proprietary software that can be used with very large datasets such as census data.
- Stata is proprietary software that is often used in economics and epidemiology.
- MATLAB is proprietary software used widely in the mathematical sciences and engineering.
- Octave is free software similar to MATLAB. The syntax is largely compatible, and much MATLAB code can be used in Octave.
Beginners can have a look at GNU PSPP or Gretl. Intermediate users can check out Stata. Advanced users who like matrix programming may prefer MATLAB or Octave. Very advanced users may use C or Fortran.
R programming style
- R is an object-oriented programming language. Virtually everything can be stored as an R object, and each object has a class. The class describes what the object contains and how functions treat it. For instance, plot(x) produces different output depending on whether x is a vector or the result of a regression.
- The assignment symbol is "<-". Alternatively, the classical "=" symbol can be used.
The following two statements are equivalent:
> a <- 2
> a = 2
- Arguments are passed to functions inside round brackets (parentheses).
- It is often better to put quotes around names but this is not always required.
- Functions can easily be combined by nesting calls. For instance, you can directly type:
mean(rnorm(1000)^2)
- The "#" symbol starts a comment that runs to the end of the line.
# This is a comment
5 + 7 # This is also a comment
- Commands are separated by a semicolon ";" or a newline. To put more than one statement on a line, use the ";" delimiter.
a <- 1:10 ; mean(a) ;
- You can also have one statement on multiple lines.
- R is case sensitive: "a" and "A" are two different objects.
- Traditionally, underscores "_" are not used in names; dots "." are often preferred. An object name cannot begin with an underscore.
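Several of the style points above can be seen together in a short sketch. The data values and object names here are made up for illustration only.

```r
# Each object has a class, and the same function behaves
# differently depending on that class
x <- c(1.5, 2.0, 3.5, 4.0)
y <- c(2.1, 3.9, 7.2, 8.1)
fit <- lm(y ~ x)
class(x)      # "numeric"
class(fit)    # "lm"
summary(x)    # quartiles and mean of the vector
summary(fit)  # coefficient table and fit statistics for the regression

# One statement spread over several lines: R keeps reading while the
# expression is incomplete (here, until the closing parenthesis)
total <- sum(1,
             2,
             3)

# Case sensitivity: "a" and "A" are distinct objects
a <- 1
A <- 2
a == A   # FALSE
```

Splitting a long call after a comma or an operator is a common way to keep lines short, since R cannot mistake the first line for a complete statement.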
See also
- Google's R Style Guide: a set of rules for R programmers
References
- [1] Portable R by Andrew Redd, http://sourceforge.net/projects/rportable/