DChip/Classify Genes


Classify genes by annotation terms

After a list of genes is obtained by “Compare samples” or “Filter genes”, the “Tools/Classify Genes” dialog can classify these genes into groups according to GeneOntology or other annotation terms. Each gene group has a header line such as “Found 15 GeneOntology 'response to external stimulus' genes in a 120-group (all: 1068/7734, PValue: 0.661181)”. The p-values are calculated in the same way as for the significant gene clusters. Here 120 is the number of genes in the input gene list that have GeneOntology annotation, so it may be smaller than the total number of genes in the list. Note that “Tools/Classify Genes” assesses enrichment over the whole gene list, whereas clustering considers every gene cluster with at least 4 annotated genes. The former therefore gives fewer significant gene groups than the latter.
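The text does not spell out the p-value formula. A common choice for this kind of annotation enrichment is the upper tail of the hypergeometric distribution, and the sketch below illustrates that idea only. It assumes that “all: 1068/7734” means 1068 of the 7734 annotated genes on the array carry the term; the function name and parameter names are hypothetical, and the result is not guaranteed to match the value dChip reports.

```python
from math import comb

def enrichment_p_value(hits, group_size, all_hits, all_total):
    """Upper-tail hypergeometric probability P(X >= hits).

    Assumed reading of the header line (not confirmed by the source):
      hits       - genes in the list annotated with the term (15)
      group_size - annotated genes in the list (120)
      all_hits   - annotated genes on the array with the term (1068)
      all_total  - annotated genes on the array (7734)
    """
    upper = min(group_size, all_hits)
    total = comb(all_total, group_size)
    # Count the ways to draw i term genes and (group_size - i) other genes.
    tail = sum(
        comb(all_hits, i) * comb(all_total - all_hits, group_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total

p = enrichment_p_value(15, 120, 1068, 7734)
print(f"hypergeometric upper-tail p-value: {p:.6f}")
```

A large p-value here is expected: 120 × 1068 / 7734 is roughly 16.6, so observing 15 term genes in the list is slightly below what chance alone would give.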

Significant p-values, as defined in the “Tools/Options/Clustering” dialog, are marked with a star suffix (“***”) in the output file. To output only gene groups with significant p-values, check the “Only report significant results” box. Any additional data columns in the “gene list file”, such as expression values or fold changes, are copied into the output “classified file”.
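For downstream processing of a classified file, the header lines can be picked apart with a regular expression. The sketch below is based only on the single example header quoted above; the pattern, the function name `parse_header`, and the assumption that the stars appear at the very end of the line are illustrative guesses, not a description of the full dChip output format.

```python
import re

# Pattern modelled on the one example header line; other variants may exist.
HEADER_RE = re.compile(
    r"Found (?P<hits>\d+) (?P<source>\S+) '(?P<term>[^']*)' genes in a "
    r"(?P<group>\d+)-group \(all: (?P<all_hits>\d+)/(?P<all_total>\d+), "
    r"PValue: (?P<p>[0-9.eE+-]+)\)"
)

def parse_header(line):
    """Return the fields of a gene-group header line, or None if no match."""
    m = HEADER_RE.search(line)
    if m is None:
        return None
    return {
        "hits": int(m["hits"]),
        "source": m["source"],
        "term": m["term"],
        "group_size": int(m["group"]),
        "all_hits": int(m["all_hits"]),
        "all_total": int(m["all_total"]),
        "p_value": float(m["p"]),
        # Assumption: significant groups end with one or more '*'.
        "starred": line.rstrip().endswith("*"),
    }

example = ("Found 15 GeneOntology 'response to external stimulus' genes "
           "in a 120-group (all: 1068/7734, PValue: 0.661181)")
print(parse_header(example))
```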

To prevent multiple probe sets for the same gene from biasing the functional significance computation, it is best to check “Analysis/Open group/Options/Analysis/Mask redundant probe sets”, which excludes the redundant probe sets (identified by LocusLink ID) from a gene list. The same option is available at “Tools/Options/Analysis/Mask redundant probe sets”, but “Analysis/Open group” should then be redone, since the array background information on gene annotation is computed after reading in the “gene information file”.
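The effect of masking can be pictured as keeping one probe set per LocusLink ID. The sketch below is a minimal illustration under stated assumptions: it keeps the first probe set seen for each ID and leaves probe sets without an ID untouched. Which probe set dChip actually retains is not stated in the text, and the function name and sample identifiers are hypothetical.

```python
def mask_redundant_probe_sets(rows):
    """Keep the first probe set for each LocusLink ID.

    rows: iterable of (probe_set, locuslink_id) pairs; locuslink_id may be
    None when a probe set has no annotation (kept as-is in this sketch).
    """
    seen = set()
    kept = []
    for probe_set, locuslink_id in rows:
        if locuslink_id is not None:
            if locuslink_id in seen:
                continue  # redundant probe set for an already-seen gene
            seen.add(locuslink_id)
        kept.append((probe_set, locuslink_id))
    return kept

# Hypothetical gene list: two probe sets share the same LocusLink ID.
gene_list = [
    ("1000_at", "5595"),
    ("1001_at", "7075"),
    ("1002_f_at", "5595"),
    ("1003_s_at", None),
]
print(mask_redundant_probe_sets(gene_list))
```

Without this step, a gene represented by several probe sets would be counted several times toward its annotation terms, inflating the apparent enrichment.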