R Programming/Introduction

From Wikibooks, the open-content textbooks collection

Why use R?

  • R is free and open source.
  • R is a very general statistical package. See the CRAN Task Views to get an idea of what you can do with R.
  • R is widely used in political science, statistics, econometrics, actuarial sciences, sociology, finance, etc.
  • R is available for all major operating systems (Windows, Mac OS, Linux).
  • R often includes the latest statistical methods, usually as contributed packages.
  • R is object oriented. Virtually everything can be stored as an R object.
  • R is a matrix language.
  • R has a long list of user-written functions and packages (see CRAN Contributed packages).
  • R syntax is much more systematic than Stata or SAS syntax.

Obtaining Help

  • For each package, typing library(help="package_name") shows an index of its documentation, and the full reference manual is available as a PDF on the CRAN website. Many packages also have vignettes, which are often easier to read.
  • 'help()' or '?' opens the help page of a function from a loaded package.
> help(lm)
> ?lm
  • Reserved words and operators must be quoted when requesting help.
> ?"for"
> ?"[["
  • '??' searches the help files of all installed packages.
> ??"lm"
  • 'apropos()' lists all the objects whose names contain a keyword.
> apropos("norm")
 [1] "dlnorm"         "dnorm"          "plnorm"        
 [4] "pnorm"          "qlnorm"         "qnorm"         
 [7] "qqnorm"         "qqnorm.default" "rlnorm"        
[10] "rnorm"          "normalizePath" 
  • args() shows the argument list of a function, with default values.

> args("dotchart")

function (x, labels = NULL, groups = NULL, gdata = NULL, cex = par("cex"), 
    pch = 21, gpch = 21, bg = par("bg"), color = par("fg"), gcolor = par("fg"), 
    lcolor = "gray", xlim = range(x[is.finite(x)]), main = NULL, 
    xlab = NULL, ylab = NULL, ...) 
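  • RSiteSearch() searches R resources on the internet (it requires an internet connection).
  • help.search() searches the help files of all installed packages; '??' is a shortcut for it.
> help.search("covariance")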
  • See the 'Help' menu for access to HTML help and vignettes (handouts for specific packages).
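The search helpers above return ordinary R objects, so their results can be filtered like any other vector. The lines below are a minimal sketch, assuming a standard installation with the stats and graphics packages attached. The variable names and the regular expression are only illustrative, and formals() is not discussed above.

```r
# apropos() returns a character vector of matching object names.
norm_functions <- apropos("norm")
is.character(norm_functions)

# The keyword is treated as a regular expression, so anchors narrow the search.
quantile_like <- apropos("^q.*norm$")
print(quantile_like)

# formals() (an addition, not covered above) returns the argument list
# that args() displays, as a named list that can be inspected in code.
dotchart_args <- names(formals(dotchart))
head(dotchart_args)
```

Working with the returned vectors is handy when the console listing is too long to scan by eye.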

R programming style

  • R is an object-oriented programming language. This means that virtually everything can be stored as an R object. Each object has a class, which describes what the object contains and what each function does with it. For instance, plot(x) does not produce the same output if x is the result of a regression or a vector.
  • The assignment symbol is "<-". It is also possible to use the classical "=" symbol.

The following two statements are equivalent:

> a <- 2
> a = 2
  • Arguments are passed to functions inside round brackets (parentheses).
  • It is often better to put quotes around names passed as arguments (for instance a package name in library()), but this is not always required.
  • Commands are separated by a semicolon ";" or a newline.
  • One can easily combine functions. For instance you can directly type
mean(rnorm(1000)^2)
  • "#" starts a comment; the rest of the line is ignored.
# This is a comment
5 + 7 # This is also a comment 
  • If you want to put more than one statement on a line, you can use the ";" delimiter.
a <- 1:10 ; mean(a) ;
  • You can also have one statement on multiple lines: R keeps reading until the expression is complete.
  • R is case sensitive: "a" and "A" are two different objects.
  • Traditionally, underscores "_" were avoided in names and dots "." were preferred, although both are valid. An object name cannot begin with an underscore.
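Several of the points above can be checked directly at the console. The following is a small sketch, not a definitive illustration: it uses summary() in place of plot() so that the class-dependent behaviour shows up as printed text, and all object names are made up for the example.

```r
# The same generic function behaves differently depending on the class
# of its argument (summary() stands in for plot() here).
x <- c(1.5, 2.5, 4)
f <- factor(c("a", "b", "a"))
class(x)      # "numeric"
class(f)      # "factor"
summary(x)    # quartiles and mean of a numeric vector
summary(f)    # counts per level of a factor

# A statement may span several lines: R keeps reading until the
# expression is complete.
total <- 1 +
  2 +
  3
total         # 6

# Names are case sensitive: a and A are different objects.
a <- 2
A <- "two"
identical(a, A)   # FALSE
```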

See Also

Google's R Style Guide: a set of rules for R programmers

Some useful commands

  • Get/Set working directory:
    • Note for Windows users: R uses forward slashes "/" in paths instead of backslashes "\".
    • The rgrs package provides selectwd(), a dialog box for changing directory.
> setwd("~/Desktop")            # Sets working directory (character string enclosed in "...")
> getwd()                       # Returns current working directory
[1] "/Users/username/Desktop"
> library("rgrs")               # Loads rgrs library (character string enclosed in "...")
> selectwd()                    # dialog box for changing directory in rgrs package
  • List the contents of the working directory:
dir()
  • List and remove objects in the workspace:
> z1 = 5+4i                     # Create complex number z1
> z2 = 5 - 4i                   # Create complex number z2
> z = c(z1, z2)                 # Combine complex numbers z1 and z2 into z
> ls()                          # List objects
[1] "z"  "z1" "z2"
> rm (z1)                       # Remove object z1
> ls()                          # List defined objects
[1] "z"  "z2"
> z                             # Get value of z
[1] 5+4i 5-4i
  • Type and class of an object:
> typeof(z)
[1] "complex"
> class(z)
[1] "complex"
  • Date and time:
> date()
[1] "Sat Jun 20 17:05:11 2009"
> Sys.time()
[1] "2009-06-20 17:05:54 CEST"
> Sys.Date()
[1] "2009-06-20"
> Sys.timezone()
[1] "CEST"
  • Get CPU time: proc.time() reports the user, system and elapsed times in seconds; proc.time()[1] is the user CPU time.
> proc.time()
[1]   489.28    68.60 29523.36     0.00     0.03
  • Insert comments: #
  • Exit R: q()
q("no")
The "no" argument specifies that the workspace is not saved on exit.
  • Information about the current session is available with the sessionInfo() command.
> sessionInfo()
R version 2.2.0, 2005-10-06, powerpc-apple-darwin7.9.0 

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"    
[6] "datasets"  "base"     

other attached packages:
 foreign 
"0.8-10" 
  • Manage name conflicts: conflicts() lists the objects that exist under the same name in several places on the search path.
conflicts(detail=TRUE)


  • Get version of the current R session
> getRversion()
[1] ‘2.11.0’
> R.version
               _                                                                
platform       i686-pc-linux-gnu                                                
arch           i686                                                             
os             linux-gnu                                                        
system         i686, linux-gnu                                                  
status         Under development (unstable)                                     
major          2                                                                
minor          11.0                                                             
year           2009                                                             
month          09                                                               
day            30                                                               
svn rev        49906                                                            
language       R                                                                
version.string R version 2.11.0 Under development (unstable) (2009-09-30 r49906)

> # R.Version() is similar to R.version but returns a plain list
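A few of the commands listed above are easier to grasp with concrete values. The sketch below is only illustrative: the object names are invented, the timed computation is arbitrary, and exists() is an addition not mentioned in the list.

```r
# typeof() reports the internal storage type; class() reports the class
# used to choose methods.
z <- c(5+4i, 5-4i)
typeof(z)          # "complex"
n <- 1:10
typeof(n)          # "integer"
typeof(1.5)        # "double"
class(1.5)         # "numeric"

# Time a computation by subtracting two proc.time() readings.
start <- proc.time()
s <- sum(sqrt(seq_len(1e5)))
timing <- proc.time() - start
user_seconds <- timing[["user.self"]]   # user CPU time in seconds
user_seconds

# rm() removes an object; exists() (an addition) checks for it by name.
tmp_values <- c(1, 2, 3)
rm(tmp_values)
exists("tmp_values")
```

The difference of two proc.time() readings keeps the same named components, so the user, system and elapsed times can each be extracted by name.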

Controlling output

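  • sink() diverts console output to a file.
  • print() displays an object.
  • cat() writes its arguments as plain text.
  • format() converts a value to a character string, for instance with a chosen number of digits: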


> format(pi, digits=22)
[1] "3.141592653589793"
> format(pi, digits=10)
[1] "3.141592654"
> format(2*10^6, scientific=FALSE)
[1] "2000000"
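The output functions named in this section can be combined as in the following sketch. It is an illustration under assumptions rather than a recommended workflow: the file is a temporary one created with tempfile(), and the variable names are hypothetical.

```r
# print() shows an object as the console would; cat() writes plain text.
x <- c(a = 1.5, b = 2)
print(x)
cat("The value of pi is about", format(pi, digits = 4), "\n")

# format() returns a character string; the number itself is unchanged.
formatted <- format(2 * 10^6, scientific = FALSE)
formatted            # "2000000"

# sink() diverts console output to a file until sink() is called again
# without arguments.
log_file <- tempfile(fileext = ".txt")   # hypothetical temporary file
sink(log_file)
cat("written to the file, not the console\n")
sink()                                   # restore normal output
readLines(log_file)
```

Calling sink() with no argument is what restores output to the console; forgetting it leaves later output going to the file.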

Working Sessions

You can save the workspace of an R session (all the objects in memory) to a file:

> save.image(file="~/Documents/Logiciels/R/test.rda")

and load it again later:

> load("~/Documents/Logiciels/R/test.rda")
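When only a few objects need to be kept, the same mechanism works on a selection of them. The sketch below assumes the base function save(), which is not covered above, and uses a temporary file and an invented object name instead of a real path.

```r
# save() (an addition to the text above) stores selected objects rather
# than the whole workspace.
scores <- c(12, 15, 9)
session_file <- tempfile(fileext = ".rda")  # hypothetical temporary path
save(scores, file = session_file)

rm(scores)              # remove the object from memory
load(session_file)      # restores it under its original name
scores
```

As with save.image(), load() recreates the objects under the names they had when they were saved, overwriting any object of the same name.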

Special Values

  • NA: Not Available (i.e. missing values)
  • NaN: Not a Number (e.g. 0/0)
  • Inf: Infinity
  • -Inf: Minus Infinity

For instance, 0 divided by 0 gives NaN:

> 0/0
[1] NaN

But 1 divided by 0 gives Inf (positive infinity):

> 1/0
[1] Inf
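Special values can be detected with dedicated test functions. The following sketch uses is.na(), is.nan() and is.finite(), which are not introduced above, on an illustrative vector.

```r
values <- c(1, NA, 0/0, 1/0, -1/0)
is.na(values)       # FALSE  TRUE  TRUE FALSE FALSE (NaN also counts as missing)
is.nan(values)      # FALSE FALSE  TRUE FALSE FALSE
is.finite(values)   #  TRUE FALSE FALSE FALSE FALSE

# Missing values propagate through computations unless removed explicitly.
mean(c(1, NA, 3))                 # NA
mean(c(1, NA, 3), na.rm = TRUE)   # 2
```

Note that comparing with == does not work for these values (NA == NA is itself NA), which is why the test functions are needed.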


