R Programming/Control Structures

From Wikibooks, open books for an open world

Conditional execution

  • Help on control-flow constructs:
> ?Control

if takes a single condition, that is, a logical value of length one.

> if (condition){
+     statement  
+     } else {
+     alternative
+     }

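At the prompt, else must be on the same line as the closing brace of the if block; otherwise R treats the if statement as complete before it reads else.

The condition may be TRUE or FALSE (T and F are abbreviations, but they are ordinary variables that can be overwritten), a number (0 is treated as FALSE, any other number as TRUE), or an expression built with the comparison operators: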

  • x == y "x is equal to y"
  • x != y "x is not equal to y"
  • x > y "x is greater than y"
  • x < y "x is less than y"
  • x <= y "x is less than or equal to y"
  • x >= y "x is greater than or equal to y"

Conditions can be combined with AND (& or &&) and OR (| or ||). Inside if, prefer && and ||, which work on single values; & and | operate element by element on vectors.

> if(TRUE){
+     print("This is true")
+     }
  [1] "This is true"
> x <- 2  # x gets the value 2
> if(x==3){
+     print("This is true")
+     } else {
+     print("This is false")
+     }
 [1] "This is false"
> y <- 4 # y gets the value 4
> if(x==2 && y>2){
+     print("x equals 2 and y is greater than 2")
+     }
 [1] "x equals 2 and y is greater than 2"
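
The difference between the short and long forms of these operators can be sketched as follows. This is a minimal illustration with made-up variable names: & and | return one result per element, && stops evaluating as soon as the answer is known, and any() or all() reduce a vector condition to the single value that if expects.

```r
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)

# Element-wise AND and OR return one value per position
elementwise_and <- a & b    # TRUE FALSE FALSE
elementwise_or  <- a | b    # TRUE TRUE TRUE

# && stops at the first FALSE, so the right-hand side is never evaluated
short_circuit <- FALSE && stop("this is never reached")

# Reduce a vector condition to a single value before passing it to if
x <- c(1, 5, 9)
if (any(x > 8)) {
  message_any <- "at least one value is greater than 8"
}
if (!all(x > 8)) {
  message_all <- "not every value is greater than 8"
}
```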

The ifelse() function takes the condition as its first argument, the value to return where the condition is true as its second argument, and the value to return where it is false as its third argument. Here the condition can be a vector. For instance, we generate a sequence from 1 to 10 and keep the values which are lower than 5 or greater than 8, replacing all other values with 0.

> x <- 1:10 
> ifelse(x<5 | x>8, x, 0)
 [1]  1  2  3  4  0  0  0  0  9 10
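
As a further sketch, ifelse() can return labels instead of numbers, and calls can be nested to handle more than two cases. The names parity and size are illustrative only.

```r
x <- 1:10

# Label each element; the condition is evaluated for every element of x
parity <- ifelse(x %% 2 == 0, "even", "odd")

# Nested ifelse() calls handle three cases: below 5, above 8, and the rest
size <- ifelse(x < 5, "low", ifelse(x > 8, "high", "middle"))
```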

R also makes it easy to select a subset of a vector with a logical condition:

> x = runif(10)
> x<.5
 [1]  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE
> x
 [1] 0.32664759 0.57826623 0.98171138 0.01718607 0.24564238 0.62190808 0.74839301 
 [8] 0.32957783 0.19302650 0.06013694
> x[x<.5]
[1] 0.32664759 0.01718607 0.24564238 0.32957783 0.19302650 0.06013694
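
The same idea extends to combined conditions, positions, and counts. The sketch below assumes an arbitrary seed so that the random draw is reproducible; the variable names are illustrative.

```r
set.seed(1)            # arbitrary seed so the draw is reproducible
x <- runif(10)

low     <- x[x < 0.5]               # values below 0.5
middle  <- x[x > 0.25 & x < 0.75]   # element-wise & combines two conditions
low_idx <- which(x < 0.5)           # positions of the matches, not the values
n_low   <- sum(x < 0.5)             # TRUE counts as 1, so this counts matches
```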

To exclude a subset of a vector, use negative indices:

> x = 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> x[-1:-5]
[1]  6  7  8  9 10
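
Exclusion can also be written with a negated logical condition, which is often safer than negative positions. A small sketch with illustrative names, including one pitfall of combining the minus sign with which():

```r
x <- 1:10

drop_first_five <- x[-(1:5)]             # same result as x[-1:-5]
drop_values     <- x[!(x %in% c(2, 4))]  # exclude by value rather than position
drop_large      <- x[!(x > 100)]         # no element matches, so all are kept

# Pitfall: which() finds nothing here, and indexing with an empty vector
# returns an empty result instead of the full vector
pitfall <- x[-which(x > 100)]
```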

Loops

Implicit loops

R supports implicit loops through vectorization, which is built into many functions and standard operators. For example, the + operator can add two arrays of numbers without the need for an explicit loop.

Explicit loops are generally slow, and it is better to avoid them when possible. The apply family of functions also loops implicitly:
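
As a sketch of what vectorisation replaces, the following adds two vectors once with + and once with an explicit for loop; both give the same result. The vectors and names are made up for the illustration.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

# Vectorised: the loop over elements happens inside the + operator
sum_vectorised <- a + b

# Explicit equivalent: preallocate the result, then fill it element by element
sum_loop <- numeric(length(a))
for (i in seq_along(a)) {
  sum_loop[i] <- a[i] + b[i]
}

same_result <- identical(sum_vectorised, sum_loop)
```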

  • apply() applies a function over the margins of a matrix or an array: the rows of a matrix (1) or the columns (2).
  • lapply() applies a function to each element of a list or vector (for a data frame, to each column) and returns a list.
  • sapply() is similar but simplifies the output, which may be a vector or a matrix depending on the function.
  • tapply() applies a function to a vector separately for each level of a factor.
> N <- 10
> x1 <- rnorm(N)
> x2 <- rnorm(N) + x1 + 1
> male <- rbinom(N,1,.48)
> y <- 1 + x1 + x2 + male + rnorm(N)
> mydat <- data.frame(y,x1,x2,male)
> lapply(mydat,mean) # returns a list
$y
[1] 3.247

$x1
[1] 0.1415

$x2
[1] 1.29

$male
[1] 0.5

> sapply(mydat,mean) # returns a vector
     y     x1     x2   male 
3.2468 0.1415 1.2900 0.5000 
> apply(mydat,1,mean) # applies the function to each row
 [1]  1.1654  2.8347 -0.9728  0.6512 -0.0696  3.9206 -0.2492  3.1060  2.0478  0.5116
> apply(mydat,2,mean) # applies the function to each column
     y     x1     x2   male 
3.2468 0.1415 1.2900 0.5000 
> tapply(mydat$y,mydat$male,mean) # applies the function to each level of the factor
    0     1 
1.040 5.454
  • See also aggregate(), which is similar to tapply() but is applied to a data frame instead of a vector.
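
A minimal sketch of aggregate() with its formula interface, using a small made-up data frame (dat is hypothetical and only stands in for mydat). The last line also shows sapply() returning a matrix when the function returns more than one value per column.

```r
# Small hypothetical data frame, standing in for mydat above
dat <- data.frame(
  y    = c(1, 2, 3, 4, 5, 6),
  x1   = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
  male = c(0, 1, 0, 1, 0, 1)
)

# Mean of y for each value of male; the result is a data frame
mean_y <- aggregate(y ~ male, data = dat, FUN = mean)

# Several columns at once: cbind() on the left-hand side of the formula
mean_both <- aggregate(cbind(y, x1) ~ male, data = dat, FUN = mean)

# range() returns two values, so sapply() simplifies to a 2-row matrix
ranges <- sapply(dat, range)
```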

Explicit loops

R provides three ways to write loops: for, repeat and while. The for statement is very simple: define an index (here k) and a vector to iterate over (in the example below, 1:5), and put the action to perform between braces.

> for (k in 1:5){
+ print(k)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
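
When the loop fills a result, a common pattern is to preallocate the result and index with seq_along(). The sketch below uses made-up names and also shows a loop that iterates directly over the elements.

```r
values <- c(2, 4, 8)

# Preallocate the result instead of growing it inside the loop
squares <- numeric(length(values))

# seq_along() gives 1, 2, ..., length(values); unlike 1:length(values),
# it yields no iterations at all when values is empty
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

# A for loop can also iterate directly over the elements
total <- 0
for (v in values) {
  total <- total + v
}
```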

When a for statement is not suitable, you can use repeat or while together with a stopping rule. Be careful with these loops: if the stopping rule is misspecified, the loop will never end. In the two examples below, values are drawn from the standard normal distribution for as long as the drawn value is lower than 1. The cat() function is used to display the current value on screen. Note that the while version also prints the final value, because its condition is only checked at the start of each iteration.

> repeat { 
+ 	g <- rnorm(1) 
+ 	if (g > 1.0) break 
+ 	cat(g,"\n")
+ 	} 
-1.214395 
0.6393124 
0.05505484 
-1.217408 
> g <- 0
> while (g < 1){
+ 	g <- rnorm(1) 
+ 	cat(g,"\n")
+ 	}
-0.08111594 
0.1732847 
-0.2428368 
0.3359238 
-0.2080000 
0.05458533 
0.2627001 
1.009195
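
One way to guard against a loop that never ends is to add an upper limit on the number of iterations. This is only a sketch: max_iter and n_iter are hypothetical names, and the seed is arbitrary.

```r
set.seed(42)       # arbitrary seed, only to make the run reproducible
max_iter <- 1000   # hypothetical safety limit on the number of draws
n_iter <- 0

repeat {
  n_iter <- n_iter + 1
  g <- rnorm(1)
  if (g > 1.0) break                           # the intended stopping rule
  if (n_iter >= max_iter) {
    warning("stopped after max_iter draws")    # guard against an endless loop
    break
  }
}
```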

The next statement ends the current iteration and skips to the next one.

> for (k in 1:10) { 
+   if(k==8) {
+     print("skipped")
+     next
+   }
+   print(k)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] "skipped"
[1] 9
[1] 10
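
The contrast between next and break can be sketched in a single loop: next skips the rest of the current iteration, while break leaves the loop altogether. The variable odd_sum is illustrative.

```r
odd_sum <- 0
for (k in 1:10) {
  if (k %% 2 == 0) next   # skip even numbers and continue with the next k
  if (k > 7) break        # leave the loop entirely once k exceeds 7
  odd_sum <- odd_sum + k  # reached only for k = 1, 3, 5, 7
}
```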
