User:SBabovic/CytometRy


Analyzing flow cytometry data using R ::: PAGE UNDER DEVELOPMENT


This is a very basic guide to using R for analysis of flow cytometry data. It is intended for biologists already comfortable with flow cytometry but with minimal or no knowledge of programming. For people totally new to R, we highly recommend completing tutorials #1 and #2 on Cyclismo.org to get a basic sense of the syntax of the language. An even more user-friendly introduction to R can be found at CodeSchool. People already familiar with other programming languages, or who already have some knowledge of R, may find aRrgh highly useful.

Before you start


You first need to install the actual program R. This page has a great set of instructions.

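The packages used to analyze flow data are part of a collection called Bioconductor. Current Bioconductor releases are installed through the BiocManager package (the older biocLite.R script no longer works with recent versions of R). To install BiocManager, open the R console and type in the following line:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

This installs the helper package that is used to install all other Bioconductor packages.

You now need to install the packages flowCore and flowViz:

BiocManager::install("flowCore")
BiocManager::install("flowViz")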

You only need to install these packages once, but have to load them every time you need to use them, by using the 'library' command:

library(flowCore)
library(flowViz)

While working in R, please take note of the following rules:

  • Capitalization matters; 'flowcore' is not the same as 'flowCore'
  • R does not work well with file names that include spaces; avoid these whenever possible.

Sample files


A set of sample files that accompany this tutorial can be downloaded from this page.

Getting started


First you need to set your working directory. This directory should contain all FCS files used in this tutorial; in addition, new files created will be placed in this directory. In Windows, this will look something like this (replace the path name with the folder you're using to store files for this tutorial, and be sure to replace the default back slash with a forward slash!):

setwd("C:/Documents and Settings/username/My Documents/R tutorial")

On a Mac, this will look something like:

setwd("/Users/username/R tutorial")
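The same step can be sketched with a safety check. This is only an illustration: the folder name "R_tutorial" is hypothetical, and a temporary folder stands in for your own tutorial folder. file.path builds the path with forward slashes on any operating system, and dir.exists confirms the folder is there before setwd is called.

```r
# Hypothetical folder for this sketch; replace with your own tutorial folder.
tutorial_dir <- file.path(tempdir(), "R_tutorial")

# Create the folder only if it does not exist yet.
if (!dir.exists(tutorial_dir)) {
  dir.create(tutorial_dir, recursive = TRUE)
}

# setwd() returns the previous working directory, so it can be restored later.
old_dir <- setwd(tutorial_dir)

# Confirm where R is now looking for files.
getwd()
```

Calling setwd(old_dir) afterwards returns R to the folder it was using before.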

Now load the libraries flowCore and flowViz:

library(flowCore)
library(flowViz)

If you're getting an error message at this point, it's probably because you haven't installed the relevant packages yet. Please see the Before you start guide before proceeding further.
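If you are unsure whether the packages are installed, a quick check along these lines can help. It is a minimal sketch using only base R; the variable names are made up for this example.

```r
# Packages this tutorial relies on.
needed <- c("flowCore", "flowViz")

# requireNamespace() returns TRUE if a package is installed, FALSE otherwise,
# without attaching it to the session.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still have to be installed (may be empty).
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Please install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```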

The next step is to define the function fluortrans. It can be used to transform any channel by applying an inverse hyperbolic sine (arcsinh) transformation, which behaves much like a biexponential display: values up to about 150 are shown on an approximately linear scale, and higher values on an approximately logarithmic scale. You can change the cutoff by replacing 150 in the parameter b.

fluortrans<-arcsinhTransform(transformationId="fluorTransform",a=0,b=(1/150),c=0)
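To get a feel for what this transformation does to the numbers, the sketch below applies the same formula with base R only. It assumes the form asinh(a + b*x) + c given in the flowCore documentation for arcsinhTransform, with a = 0, b = 1/150 and c = 0 as above; the raw values are invented for illustration.

```r
# Cutoff between the roughly linear and roughly logarithmic regions.
cofactor <- 150

# Made-up raw fluorescence values spanning several decades.
raw <- c(0, 15, 150, 1500, 15000, 150000)

# Same arithmetic as fluortrans: asinh(0 + x/150) + 0.
transformed <- asinh(raw / cofactor)

# For comparison: a purely linear and a purely logarithmic scaling.
linear_approx <- raw / cofactor
log_approx <- log(2 * raw / cofactor)

# Well below the cutoff the result tracks the linear column;
# well above it, the result tracks the logarithmic column.
print(data.frame(raw,
                 transformed = round(transformed, 3),
                 linear = round(linear_approx, 3),
                 logarithmic = round(log_approx, 3)))
```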

Furthermore, we need to define the function draw2dgate in order to be able to draw two dimensional gates.

LINK TO FUNCTION HERE

The function draw2dplot allows the construction of pseudocolor plots.

LINK TO FUNCTION HERE

At this point, you can choose how to proceed further. If you want to learn how to set up an automatic compensation script in R without relying on any previous matrix, follow the instructions outlined here. If you already did your compensation on the flow cytometry instrument and would like to use that compensation matrix for further analysis, you can skip to this section.

Using R to create a compensation matrix


This section will use sample flow cytometry files, available for download on this page. The full, unannotated compensation script can be downloaded from this page; an explanation of the steps is below.

To start, we need to define positive and negative populations for each of the single stained controls. The first step is to load a file (NB: avoid using spaces in the file name). If it doesn't load, make sure the file is in your working directory and that its name and extension match exactly what you type (here, APC.fcs).

APCss<-read.FCS("APC.fcs")
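If a file refuses to load, it can help to list what R actually sees in the folder. The sketch below is a base R illustration: the demo folder and the empty placeholder files are made up (they are not real FCS files), and the pattern matches the extension regardless of capitalization.

```r
# Hypothetical demo folder with empty placeholder files.
demo_dir <- file.path(tempdir(), "fcs_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("APC.fcs", "FITC.FCS", "notes.txt")))

# List every file ending in .fcs, ignoring upper/lower case.
fcs_files <- list.files(demo_dir, pattern = "\\.fcs$", ignore.case = TRUE)
fcs_files

# Check that the exact name you plan to pass to read.FCS() is present.
"APC.fcs" %in% fcs_files
```

For your own data, replace demo_dir with "." to list the current working directory.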

Then, apply the biexponential-like transformation (function fluortrans) to the APC channel.

APCss<-transform(APCss,`APC`=fluortrans(`APC-A`))

Finally, draw gates for the positive and negative populations (in that order). A gate can be closed by pressing the "Escape" key (Mac) or ??? (Windows, Unix). The positive and negative cell populations should have similar forward scatter (FSC), an indication of cell size.

APCpos<-Subset(APCss,polygonGate(draw2dgate(APCss,"APC","FSC-A")))
APCneg<-Subset(APCss,polygonGate(draw2dgate(APCss,"APC","FSC-A")))

Figures: APC positive cell gate; APC negative cell gate.

To check if the gates have been defined successfully, you can try to call up the associated objects, e.g.

APCneg

The result should be something like this:

Example of a subsetted APC negative population

Or you can plot it using the draw2dplot function:

draw2dplot(APCneg,"APC","FSC-A")

Figure: APC negative cell population.

These steps should be repeated for each of the single stained controls. The full script is available here. You can paste all of the subsetting commands into R at once; closing one gate will automatically move on to the next subset.

We now need to define the parameters to be used to create the compensation matrix:

params<-c("FITC-A","PerCP-Cy5-5-A","Alexa Fluor 700-A","APC-A","DAPI-A","PE-A")

We also need to create lists of the positive and negative populations we just defined:

posfiles<-list(FITCpos,PerCPpos,AF700pos,APCpos,DAPIpos,PEpos)
negfiles<-list(FITCneg,PerCPneg,AF700neg,APCneg,DAPIneg,PEneg)

It's very important to keep the channels in the same order across all three objects: params, posfiles and negfiles.

Finally, we need to run a piece of code that will create a compensation matrix from the variables and populations defined:

# Keep only the fluorescence channels listed in params
posfiles2<-list()
for(i in seq_along(posfiles)){
  posfiles2[[i]]<-posfiles[[i]][,params]
}
negfiles2<-list()
for(i in seq_along(negfiles)){
  negfiles2[[i]]<-negfiles[[i]][,params]
}

# Median signal of each positive population minus that of its negative population
tt<-fsApply(as(posfiles2,"flowSet"),each_col,"median")[,params]-
    fsApply(as(negfiles2,"flowSet"),each_col,"median")[,params]
# Set negative differences to zero, then divide each row by its largest value
Rcompmatrix<-pmax(tt,0)
Rcompmatrix<-sweep(Rcompmatrix,1,apply(Rcompmatrix,1,max),"/")
rownames(Rcompmatrix)<-params

Rcompmatrix

The resulting matrix should look something like this:

Compensation matrix created in R

The numbers are displayed as proportions; to obtain a value for % spillover, simply multiply by 100. For example, spillover from APC-A into Alexa Fluor 700-A is 50.00%.
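The arithmetic behind the matrix can be followed on a small example. The sketch below uses base R only and three channels; all median values are invented for illustration, chosen so that the APC-A to Alexa Fluor 700-A entry comes out at 0.5.

```r
params <- c("APC-A", "Alexa Fluor 700-A", "PE-A")

# Hypothetical medians: one row per single stained control,
# one column per detector channel.
pos_medians <- matrix(c(10100, 5100,  100,
                          300, 8100,  120,
                          150,   90, 6100),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(params, params))
neg_medians <- matrix(100, nrow = 3, ncol = 3,
                      dimnames = list(params, params))

# Signal above background; negative differences are set to zero.
signal <- pmax(pos_medians - neg_medians, 0)

# Divide each row by its largest value, so the stained channel becomes 1
# and every other entry is a proportion of it.
comp <- sweep(signal, 1, apply(signal, 1, max), "/")

# Multiply by 100 to read the entries as % spillover.
round(100 * comp, 2)
```

Each row describes one fluorochrome, and each column shows how much of its signal appears in that detector.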

It's possible to save this matrix to a .CSV file, for future reference:

write.table(Rcompmatrix,"R Comp Matrix.csv",quote=FALSE,sep=",",row.names=TRUE)

A sample matrix can be downloaded here. To load it into R, use the following command. If it doesn't load, make sure the file is in the working directory you've set.

Rcompmatrix<-read.csv("R Comp Matrix.csv",row.names=1,header=TRUE,check.names=FALSE)
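To see that saving and reloading preserves the values and the channel names, the round trip can be tried on a toy matrix first. This is a sketch under stated assumptions: the two-channel matrix and the file name are made up, and the file is written to a temporary folder so nothing in your working directory is overwritten.

```r
params <- c("APC-A", "Alexa Fluor 700-A")

# Hypothetical two-channel compensation matrix.
toy <- matrix(c(1,    0.5,
                0.02, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(params, params))

# Write and read back with the same arguments used in this tutorial.
csv_path <- file.path(tempdir(), "toy_comp_matrix.csv")
write.table(toy, csv_path, quote = FALSE, sep = ",", row.names = TRUE)
reloaded <- read.csv(csv_path, row.names = 1, header = TRUE, check.names = FALSE)

# read.csv() returns a data frame; convert it if a plain matrix is needed.
class(reloaded)
reloaded_matrix <- as.matrix(reloaded)

# TRUE if values, row names and column names survived the round trip.
all.equal(reloaded_matrix, toy)
```

The check.names=FALSE argument matters here: without it, R would rewrite column names such as "Alexa Fluor 700-A" into syntactically valid names with dots.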

We have previously defined the function fluortrans, a biexponential-like (arcsinh) transformation that can be applied to one channel at a time. We now need to create a transformation function, trans, that will apply the fluortrans transformation to all the fluorescence channels in a particular file:

trans<-function(flowframe){
  output<-transform(flowframe,
    `APC`=fluortrans(`APC-A`),`AF700`=fluortrans(`Alexa Fluor 700-A`),
    `PE`=fluortrans(`PE-A`),`FITC`=fluortrans(`FITC-A`),
    `PerCP`=fluortrans(`PerCP-Cy5-5-A`),`DAPI`=fluortrans(`DAPI-A`))
  return(output)
}

Finally, defining the compPlots function will allow the generation of compensation plots for each of the channels.

LINK TO FUNCTION HERE

We also need to define a new set of parameters, consisting of the transformed fluorescence channels:

parameters<-c("APC","AF700","PE","FITC","PerCP","DAPI")

We can now generate compensation plots, e.g. for the channel APC:

compPlots(APCss,Rcompmatrix,"APC",parameters)

This generates a file, "temp.pdf", in the working directory, containing the single stain control channel plotted against all other fluorescence channels. An example of this file is also available here. The single stain control is always on the y-axis, whereas the other fluorescence channels are indicated in the title of each graph and are plotted on the x-axis.

Upon examination of the compensation plots, it may appear that some channels are over- or under-compensated. It is possible to change each of the values in the matrix individually. In this example, it appears that APC is undercompensated from AF700.

In this plot, APC is undercompensated from AF700.

The current compensation value can be obtained thus:

Rcompmatrix['APC-A','Alexa Fluor 700-A']

In this example, the value is 0.5 (50%). We can increase the value, for example to 0.61 (61%):

Rcompmatrix['APC-A','Alexa Fluor 700-A']<-0.61

We need to generate the compensation plots again, using the altered matrix. This overwrites the previous "temp.pdf" file.

compPlots(APCss,Rcompmatrix,"APC",parameters)

Figure: APC is now correctly compensated out of AF700.

An example of the new file is available here.

It's necessary to check the spillover of all channels relative to all other channels, and change the values in the compensation matrix as required. When the values are finalized, save the modified compensation matrix for future reference (this overwrites the original version unless you change the file name):

write.table(Rcompmatrix,"R Comp Matrix.csv",quote=FALSE,sep=",",row.names=TRUE)
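When checking every channel against every other channel, it can help to list all the spillover values in one table, largest first, so the pairs most likely to need attention are inspected first. The sketch below is a base R illustration; the three-channel matrix is invented, and the labels "from" and "into" are names chosen for this example.

```r
params <- c("APC-A", "Alexa Fluor 700-A", "PE-A")

# Hypothetical compensation matrix: rows are the fluorochrome,
# columns are the channel its signal spills into.
toy <- matrix(c(1,     0.61, 0,
                0.025, 1,    0.0025,
                0.008, 0,    1),
              nrow = 3, byrow = TRUE,
              dimnames = list(from = params, into = params))

# Reshape the matrix into one row per (from, into) pair.
pairs <- as.data.frame(as.table(toy), responseName = "spillover",
                       stringsAsFactors = FALSE)

# Drop the diagonal (each channel against itself) and express as %.
pairs <- pairs[pairs$from != pairs$into, ]
pairs$percent <- round(100 * pairs$spillover, 2)

# Largest spillover first.
pairs <- pairs[order(-pairs$spillover), ]
pairs
```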

Using the compensation matrix created during file acquisition
