R Programming/Introduction
From Wikibooks, the open-content textbooks collection
Why use R?
- R is free and open source.
- R is a very general statistical package. See CRAN Task View to get an idea of what you can do with R.
- R is widely used in political science, statistics, econometrics, actuarial sciences, sociology, finance, etc.
- R is available for all major operating systems (Windows, Mac OS, Linux).
- R includes the very latest methods.
- R is object oriented. Virtually everything can be stored as an R object.
- R is a matrix language.
- R has a long list of user-written functions and packages (see CRAN Contributed packages).
- R syntax is much more systematic than Stata or SAS syntax.
Obtaining Help
- Each package has a reference manual. Typing library(help="package_name") displays the package description and an index of its help pages, and the full manual is available as a PDF on the CRAN website. Many packages also have vignettes, which are easier to read.
- 'help()' or '?' opens the help page of a function from a loaded package.
> help(lm)
> ?lm
- Sometimes you need quotes to request help, for instance for reserved words and operators.
> ?"for"
> ?"[["
- '??' searches the help pages of all installed packages.
> ??"lm"
- 'apropos()' lists all the functions whose names contain a keyword.
> apropos("norm")
[1] "dlnorm" "dnorm" "plnorm"
[4] "pnorm" "qlnorm" "qnorm"
[7] "qqnorm" "qqnorm.default" "rlnorm"
[10] "rnorm" "normalizePath"
- args() gives the full argument list of a function, with default values.
> args("dotchart")
function (x, labels = NULL, groups = NULL, gdata = NULL, cex = par("cex"),
pch = 21, gpch = 21, bg = par("bg"), color = par("fg"), gcolor = par("fg"),
lcolor = "gray", xlim = range(x[is.finite(x)]), main = NULL,
xlab = NULL, ylab = NULL, ...)
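- RSiteSearch() searches resources on the internet and shows the results in a web browser.
- help.search() searches the help pages of all installed packages, like '??'.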
help.search("covariance")
- See the 'Help' menu for access to the HTML help and to vignettes (handouts for specific packages).
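As a minimal sketch of how these helpers combine, the lines below look up function names with apropos(), which also accepts a regular expression, and then inspect the arguments of one result. formals() is not mentioned above: it is a base R function that returns the argument list as an object. The variable names are chosen for illustration only.

```r
# Find attached functions whose names start with "q" and end with "norm".
# apropos() treats its first argument as a regular expression.
quantile.like <- apropos("^q.*norm$")
print(quantile.like)

# Show the arguments of one of them, with their default values.
args(qnorm)

# formals() returns the argument list as an object that can be inspected.
arg.names <- names(formals(rnorm))
print(arg.names)   # "n" "mean" "sd"
```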
R programming style
- R is an object-oriented programming language. This means that virtually everything can be stored as an R object. Each object has a class. This class describes what the object contains and what each function does with it. For instance, plot(x) does not produce the same output if x is the result of a regression or a vector.
- The assignment symbol is "<-". Alternatively it is also possible to use the classical "=" symbol.
The following two statements are equivalent:
> a <- 2
> a = 2
- Arguments are passed to functions inside round brackets (parentheses).
- It is often better to put quotes around names, for instance library("rgrs"), but this is not always required.
- Commands are separated by a semicolon ";" or a newline.
- One can easily combine functions. For instance you can directly type
mean(rnorm(1000)^2)
- "#" starts a comment. Everything after it on the same line is ignored.
# This is a comment
5 + 7 # This is also a comment
- If you want to put more than one statement on a line, you can use the ";" delimiter.
a <- 1:10 ; mean(a) ;
- You can also have one statement on multiple lines.
- R is case sensitive: "a" and "A" are two different objects.
- Traditionally, underscores "_" are not used in names. It is often better to use dots ".". One should avoid an underscore as the first character of an object name, since such a name is not syntactically valid.
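Several of the points above can be sketched in a few lines: classes and the way functions react to them, a statement spread over multiple lines, and case sensitivity. This is only an illustration. The data are made up, and summary() is used instead of plot() so that no graphics device is needed.

```r
# Every object has a class, and generic functions dispatch on it.
x <- c(2, 4, 6, 8)               # a numeric vector (made-up data)
y <- c(1.1, 1.9, 3.2, 3.9)
fit <- lm(y ~ x)                 # the result of a regression

class(x)                         # "numeric"
class(fit)                       # "lm"

# summary() returns a different kind of result for each class.
summary(x)
summary(fit)

# A statement can span several lines as long as each line is incomplete.
total <- 1 +
  2 +
  3
total                            # 6

# Names are case sensitive: a and A are two different objects.
a <- 1
A <- 2
a == A                           # FALSE
```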
See Also
- Google's R Style Guide: a set of rules for R programmers
Some useful commands
- Get/Set working directory :
- Note for Windows users: R uses forward slashes "/" in paths instead of backslashes "\".
- The rgrs package includes a dialog box for changing the directory: selectwd()
> setwd("~/Desktop") # Sets working directory (character string enclosed in "...")
> getwd() # Returns current working directory
[1] "/Users/username/Desktop"
> library("rgrs") # Loads rgrs library (character string enclosed in "...")
> selectwd() # dialog box for changing directory in rgrs package
- Listing the content of the working directory
dir()
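A short sketch of the working directory commands used together, assuming a temporary directory is acceptable for the experiment. The file name example.txt is hypothetical. file.path() is not mentioned above: it is a base R helper that builds paths with forward slashes on every platform.

```r
old.wd <- getwd()               # remember the current working directory
setwd(tempdir())                # move to a temporary directory

file.create("example.txt")      # hypothetical file, for illustration only
files <- dir()                  # list the content of the working directory
print(files)

file.remove("example.txt")      # clean up
setwd(old.wd)                   # go back to the original directory

# Build a path from its components, using "/" as separator.
file.path("C:", "Users", "username")   # "C:/Users/username"
```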
- List and remove objects in the workspace :
> z1 = 5+4i # Create complex number z1
> z2 = 5 - 4i # Create complex number z2
> z = c(z1, z2) # Combine complex numbers z1 and z2 into z
> ls() # List objects
[1] "z" "z1" "z2"
> rm(z1) # Remove object z1
> ls() # List defined objects
[1] "z" "z2"
> z # Get value of z
[1] 5+4i 5-4i
- Type and class of an object
typeof()
class()
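The difference between the two functions is easiest to see on a few objects. In this small sketch, typeof() reports how R stores the object internally, while class() reports the class that functions use to decide how to treat it. The object names are arbitrary.

```r
n  <- 5L                       # an integer
d  <- 5                        # a double
z  <- 5 + 4i                   # a complex number
df <- data.frame(a = 1:2)      # a data frame

typeof(n)    # "integer"
class(n)     # "integer"

typeof(d)    # "double"
class(d)     # "numeric"

typeof(z)    # "complex"

typeof(df)   # "list": a data frame is stored as a list of columns
class(df)    # "data.frame"
```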
- Date :
> date()
[1] "Sat Jun 20 17:05:11 2009"
> Sys.time()
[1] "2009-06-20 17:05:54 CEST"
> Sys.Date()
[1] "2009-06-20"
> Sys.timezone()
[1] "CEST"
- Get CPU time :
proc.time()[1]
> proc.time()
[1] 489.28 68.60 29523.36 0.00 0.03
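One common use of proc.time() is to time a piece of code by taking the difference between two calls. The sketch below does this for an arbitrary computation. system.time() is not mentioned above: it is a base R shortcut for the same measurement.

```r
start <- proc.time()                 # time stamp before the computation
total <- sum(sqrt(1:1e6))            # some arbitrary work to be timed
elapsed <- proc.time() - start       # difference between two time stamps

print(elapsed)
elapsed[1]                           # user CPU time, in seconds
elapsed[3]                           # elapsed (wall clock) time, in seconds

# system.time() wraps the same measurement around one expression.
timing <- system.time(sum(sqrt(1:1e6)))
print(timing)
```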
- Insert comments :
#
- Exit R : q()
q("no")
The "no" argument specifies that the workspace is not saved when the session ends.
- All the information on the current session is available with the sessionInfo() command.
> sessionInfo()
R version 2.2.0, 2005-10-06, powerpc-apple-darwin7.9.0
attached base packages:
[1] "methods" "stats" "graphics" "grDevices" "utils"
[6] "datasets" "base"
other attached packages:
foreign
"0.8-10"
- Managing name conflicts: conflicts() lists the objects that exist under the same name in several places on the search path.
conflicts(detail=TRUE)
- Get version of the current R session
> getRversion()
[1] ‘2.11.0’
> R.version
_
platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status Under development (unstable)
major 2
minor 11.0
year 2009
month 09
day 30
svn rev 49906
language R
version.string R version 2.11.0 Under development (unstable) (2009-09-30 r49906)
> #R.Version() ## similar to R.version, returns a list
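The value returned by getRversion() is a version object rather than a plain string, so it can be compared directly with a version number. A minimal sketch, assuming a script that requires at least R 2.11.0 (the threshold is taken from the output above and is otherwise arbitrary):

```r
v <- getRversion()                  # a version object, not a character string

if (v >= "2.11.0") {
  cat("Running R", as.character(v), "\n")
} else {
  stop("This script needs R 2.11.0 or later")
}

# The components of R.version are character strings.
R.version$major
R.version$minor
```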
Controlling output
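- sink() diverts R output to a file.
- print() prints an object.
- cat() concatenates its arguments and prints them with minimal formatting.
- format() formats an object for printing, for instance with a given number of significant digits:
> format(pi,digits=22)
[1] "3.141592653589793"
> format(pi,digits=10)
[1] "3.141592654"
> format(2*10^6, scientific=FALSE)
[1] "2000000"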
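The following sketch shows these functions working together: output is diverted to a file with sink(), written with print() and cat(), and then read back. The temporary file is an assumption made for the example, and any writable path would do.

```r
out.file <- tempfile(fileext = ".txt")   # hypothetical output file

sink(out.file)                  # start diverting output to the file
print(summary(c(1, 2, 3)))      # print() shows the object as the console would
cat("mean:", mean(c(1, 2, 3)), "\n")   # cat() writes plain text
sink()                          # stop diverting, output goes to the console again

out.lines <- readLines(out.file)
print(out.lines)

# format() returns a character string; nsmall sets a minimum number of decimals.
format(1, nsmall = 2)           # "1.00"
```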
Working Sessions
You can save an R session (all the objects in memory):
> save.image(file="~/Documents/Logiciels/R/test.rda")
and load it again later:
> load("~/Documents/Logiciels/R/test.rda")
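To save only some objects rather than the whole session, save() can be used in the same way. It is not mentioned above, and save.image() is essentially save() applied to every object in the workspace. A minimal round trip, assuming a temporary file as the destination:

```r
session.file <- tempfile(fileext = ".rda")   # hypothetical file location

x <- 1:10
y <- "some text"
save(x, y, file = session.file)   # write the two objects to the file

rm(x, y)                          # remove them from memory

restored <- load(session.file)    # load() returns the names of the restored objects
print(restored)                   # "x" "y"
print(x)
```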
Special Values
- NA: Not Available (i.e. missing values)
- NaN: Not a Number (e.g. 0/0)
- Inf: Infinity
- -Inf: Minus Infinity
For instance, 0 divided by 0 gives NaN:
> 0/0
[1] NaN
But 1 divided by 0 gives Inf:
> 1/0
[1] Inf
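These values can be detected with the base R functions is.na(), is.nan() and is.finite(), which are not mentioned above. A short sketch on a made-up vector:

```r
x <- c(1, NA, 0/0, 1/0, -1/0)    # 1, NA, NaN, Inf, -Inf

is.na(x)       # FALSE  TRUE  TRUE FALSE FALSE (NaN also counts as missing)
is.nan(x)      # FALSE FALSE  TRUE FALSE FALSE
is.finite(x)   #  TRUE FALSE FALSE FALSE FALSE

# Many functions can skip missing values on request.
mean(c(1, NA, 3))                # NA
mean(c(1, NA, 3), na.rm = TRUE)  # 2
```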