R Programming/Introduction

From Wikibooks, open books for an open world

What is R?

R is a programming language and software environment for statistical computing and data analysis. It includes a large number of statistical procedures, such as t-tests, chi-square tests, standard linear models, instrumental variables estimation, and local polynomial regressions. It also provides high-level graphics capabilities.
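As a rough illustration of the procedures mentioned above, the following sketch runs a t-test, a chi-square test and a linear model using functions from R's standard stats package. The data are simulated and the variable names (x, y, counts) are assumptions made for this example only.

```r
# Simulated data; all names below are illustrative, not part of any dataset.
set.seed(1)
x <- rnorm(50, mean = 0.5)
y <- 2 + 3 * x + rnorm(50)

# One-sample t-test of the hypothesis that the mean of x is zero
tt <- t.test(x, mu = 0)

# Chi-square test of independence on a made-up 2 x 2 table of counts
counts <- matrix(c(30, 20, 15, 35), nrow = 2)
chi <- chisq.test(counts)

# Standard linear model of y on x
fit <- lm(y ~ x)

print(tt$p.value)
print(chi$p.value)
print(coef(fit))
```

Each call returns an object that can be printed, summarized with summary(), or inspected component by component.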

Why use R?

  • R is free software. R is an official GNU project and is distributed under the GNU General Public License (GPL).
  • R is a powerful data-analysis package with many standard and cutting-edge statistical functions. See the Comprehensive R Archive Network's Task Views to get an idea of what you can do with R.
  • R is a programming language, so its abilities can easily be extended through the use of user-defined functions. A large collection of user-contributed functions and packages can be found at CRAN's Contributed Packages page.
  • R is widely used in political science, statistics, econometrics, actuarial sciences, sociology, finance, etc.
  • R is available for all major operating systems (Windows, Mac OS, Linux).
  • R is object oriented. Virtually anything (e.g., complex data structures) can be stored as an R object.
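  • R is a matrix language: vectors and matrices are basic data types, and arithmetic operators apply to them element by element.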
  • R syntax is much more systematic than Stata or SAS syntax.
  • R can be installed on, and run from, a USB stick[1].
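Several of the points above (user-defined functions, objects, matrix operations) can be sketched in a few lines. The function name standardize and the variable names are hypothetical and chosen only for this example.

```r
# A user-defined function (hypothetical helper for illustration):
# centers a numeric vector and scales it to unit standard deviation.
standardize <- function(v) {
  (v - mean(v)) / sd(v)
}
z <- standardize(c(2, 4, 6, 8))

# Matrices are basic objects; a matrix is filled column by column.
m <- matrix(1:4, nrow = 2)
m2 <- m * 2      # element-wise multiplication
mm <- m %*% m    # matrix product

# Almost anything can be stored as an object, including a list
# that bundles results of different types.
results <- list(scores = z, product = mm)
print(results)
```

Here "*" works element by element, while "%*%" is the matrix product.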

Alternatives to R

  • S-PLUS is a commercial implementation of the S programming language; R is a free implementation of the same language.
  • Gretl is free software for econometrics. It has a graphical user interface and is nice for beginners.
  • SPSS is proprietary software which is often used in sociology, psychology and marketing. It is known to be easy to use.
  • GNU PSPP is a free-software alternative to SPSS.
  • SAS is proprietary software that can be used with very large datasets such as census data.
  • Stata is proprietary software that is often used in economics and epidemiology.
  • MATLAB is proprietary software used widely in the mathematical sciences and engineering.
  • Octave is free software similar to MATLAB. The syntax is largely compatible, and much MATLAB code runs in Octave unchanged.

Beginners can have a look at GNU PSPP or Gretl. Intermediate users can check out Stata. Advanced users who like matrix programming may prefer MATLAB or Octave. Very advanced users may use C or Fortran.

R programming style

  • R is an object-oriented programming language. Virtually everything can be stored as an R object, and each object has a class. The class describes what the object contains and determines what generic functions do with it. For instance, plot(x) produces different output depending on whether x is the result of a regression or a numeric vector.
  • The assignment operator is "<-". The classical "=" symbol can be used as an alternative.

The following two statements are equivalent:

 > a <- 2
 > a = 2
  • Arguments are passed inside round brackets (parentheses).
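  • Names passed as arguments can often be written with or without quotes. Quoting is usually the safer habit, but it is not always required. For instance, library(stats) and library("stats") both work.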
  • One can easily combine functions. For instance you can directly type
mean(rnorm(1000)^2)
  • The "#" character starts a comment that runs to the end of the line.
 # This is a comment
 5 + 7 # This is also a comment
  • Commands are separated by a semicolon ";" or newline. If you want to put more than one statement on a line, you can use the ";" delimiter.
 a <- 1:10 ; mean(a) ;
  • A single statement can also span multiple lines, as long as each line leaves the expression incomplete (for example, an unclosed parenthesis or a trailing operator).
  • R is case sensitive: "a" and "A" are two different objects.
  • Traditionally, underscores "_" are not used in names; dots "." are often used instead. An underscore cannot be the first character of an ordinary object name.
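The style points above can be tried out together in a short session. This is only a sketch on simulated data; the object names (fit, s.fit, sample.mean and so on) are assumptions made for the example.

```r
# Each object has a class, and generic functions dispatch on it.
set.seed(42)
x <- rnorm(20)
y <- 1 + 2 * x + rnorm(20)
fit <- lm(y ~ x)

class(x)    # "numeric"
class(fit)  # "lm"

# summary() is generic, like plot(): it returns a different kind of
# result for a numeric vector than for a fitted regression.
s.vec <- summary(x)
s.fit <- summary(fit)

# R is case sensitive: a and A are two different objects.
a <- 2
A <- 3

# One statement on several lines: the trailing "+" tells R that the
# expression is not finished yet.
total <- 1 +
  2 +
  3

# Dots are commonly used inside object names.
sample.mean <- mean(c(a, A, total))
print(sample.mean)
```

Calling plot(x) and plot(fit) in an interactive session shows the same dispatch mechanism for graphics: the first draws the values of the vector, the second draws regression diagnostics.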
