Next Generation Sequencing (NGS)/Genotyping by sequencing
Genotyping By Sequencing (GBS) is a method of cataloguing genetic variation between individuals on a common reference genome by using Next Generation Sequencing.
Overview
Genetic variation between individuals is a key component of explaining phenotype, traits or characteristics. For this reason, cataloguing variations between individuals is an important step towards understanding, predicting, and ultimately engineering biology.
There are many methods of cataloguing variation between individuals; Genotyping By Sequencing (GBS) is a common one.
GBS relies on generating sequencing data for an individual and aligning it to the reference genome. Variants are then called against the reference using a variety of methods.
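The align-then-call idea can be illustrated with a toy sketch. It assumes reads are already aligned without gaps (so no InDels), and it uses a simple majority rule rather than any of the statistical models that real genotype callers use. The function name `call_variants` and the thresholds `min_depth` and `min_fraction` are made up for illustration.

```python
from collections import Counter


def call_variants(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Return {position: (ref_base, alt_base)} where reads disagree with the reference.

    aligned_reads is a list of (start, sequence) pairs, with a 0-based start
    on the reference. Alignments are assumed to be ungapped.
    """
    # Build a pileup: for every reference position, count the bases observed.
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    variants = {}
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads cover this site to call anything
        base, count = counts.most_common(1)[0]
        # Call a variant when the dominant base differs from the reference.
        if base != reference[pos] and count / depth >= min_fraction:
            variants[pos] = (reference[pos], base)
    return variants


reference = "ACGTACGTAC"
reads = [(0, "ACGT"), (2, "GTTCGT"), (3, "TTCGTAC"), (4, "TCGTAC")]
print(call_variants(reference, reads))
```

With these toy reads, the only site with enough depth and a non-reference majority is position 4, where the reference A is replaced by T.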
Useful background
There are three key 'biological' concepts related to GBS: sequencing, alignment and genotyping.
From an informatics perspective, imputation is also relevant.
Objective
For a given individual, GBS produces a catalogue of variation for that individual. It is a relatively cheap and 'un-biased' method of discovering variations, not being limited to variations within genes, or transcripts, and without any prior expectation of polymorphic sites.
Biological questions
Variation from GBS can be used to explain quantitative differences between individuals, to better understand genetic relationships between groups of individuals, and to construct high-density genetic maps. It can also serve as input to genome-wide association studies (GWAS).
Inputs and outputs
Inputs
- A reference genome
- Whole genome sequencing data for one or more individuals.
Outputs
- For each individual, a list of variant loci.
- Variations may include SNPs, InDels, or Structural Variations, depending on the sequencing technology used.
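As a rough sketch of what such a per-individual catalogue might look like, the snippet below stores each variant as a (chromosome, position, REF, ALT) tuple and sorts it into the classes listed above by allele length alone. The data, the function name `classify_variant`, and the 50 bp cut-off between InDels and structural variations are assumptions made for illustration, not part of any standard described here.

```python
def classify_variant(ref, alt, sv_min_length=50):
    """Classify one REF/ALT allele pair by allele length alone.

    sv_min_length is an assumed cut-off between InDels and structural
    variations, chosen for illustration. Symbolic alleles such as <DEL>
    are not handled.
    """
    if len(ref) == len(alt):
        # Same length: a single-base substitution, or a multi-base one (MNP).
        return "SNP" if len(ref) == 1 else "MNP"
    if abs(len(ref) - len(alt)) >= sv_min_length:
        return "SV"
    return "InDel"


# Hypothetical catalogue: one list of (chromosome, position, REF, ALT) per individual.
catalogue = {
    "individual_1": [("chr1", 1042, "A", "G"), ("chr1", 2310, "AT", "A")],
    "individual_2": [("chr2", 77, "C", "C" + "T" * 60)],
}

for individual, variants in catalogue.items():
    for chrom, pos, ref, alt in variants:
        print(individual, chrom, pos, classify_variant(ref, alt))
```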
Experimental design
There are two broad approaches to GBS: 1) relatively high depth of coverage over N individuals, or 2) relatively low coverage per individual over N groups of individuals (with relatively high coverage per group).
If only short-read sequencing technology is used, then only SNPs and small InDels will be discovered. If longer reads and paired-end sequencing technology are used, larger InDels and structural variation may be discovered.
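When choosing between the two approaches, a back-of-the-envelope coverage estimate can help. The sketch below uses the usual approximation that expected depth equals the number of reads times the read length divided by the genome size. The genome size, read length, target depth and pool size are invented numbers, and splitting a pool's coverage evenly between its members is a simplifying assumption.

```python
import math


def expected_coverage(n_reads, read_length, genome_size):
    """Average depth of coverage, assuming reads are spread evenly over the genome."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


genome_size = 100_000_000  # hypothetical 100 Mb genome
read_length = 100          # hypothetical 100 bp reads

# Approach 1: high coverage for a single individual.
print(reads_needed(30, read_length, genome_size))

# Approach 2: the same 30x spent on a pool of 10 individuals,
# assuming each individual contributes equally to the pool.
pool_coverage = expected_coverage(30_000_000, read_length, genome_size)
print(pool_coverage / 10)
```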
Typical steps in the method
Sample collection
- Summary
- Brief bit of information.
Sequencing
- Summary
- Brief bit of information.
QC and filtering
- Summary
- Brief bit of information.
Alignment
- Summary
- Brief bit of information.
Genotype calling
- Summary
- Brief bit of information.
Data submission
- Summary
- Brief bit of information.
Next steps
Depending on the type of experiment, next steps may include GWAS, gene set enrichment analysis (GSEA), or genetic mapping.
Workflows
Example Galaxy workflow
Link to an example Galaxy workflow for the method (including example datasets) on a given Galaxy instance or to the XML document describing the workflow.
Example command line workflow
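One possible command line workflow, sketched as a small Python script that only prints the commands instead of running them. The choice of tools (bwa, samtools and bcftools) is an assumption, since this page does not prescribe any, and the file names are hypothetical. QC and filtering, and data submission, are left out of the sketch.

```python
import shlex


def build_gbs_commands(reference, reads_1, reads_2, sample):
    """Build shell commands for alignment and genotype calling of one sample.

    Tool choice (bwa, samtools, bcftools) is an assumption for illustration.
    Nothing is executed here; the commands are returned as strings.
    """
    ref = shlex.quote(reference)
    r1 = shlex.quote(reads_1)
    r2 = shlex.quote(reads_2)
    bam = shlex.quote(f"{sample}.sorted.bam")
    vcf = shlex.quote(f"{sample}.vcf.gz")
    return [
        # Index the reference genome once, before any alignment.
        f"bwa index {ref}",
        # Align paired-end reads and sort the alignments by coordinate.
        f"bwa mem {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the sorted alignments.
        f"samtools index {bam}",
        # Pile up reads against the reference and keep variant sites only.
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -Oz -o {vcf}",
    ]


# Hypothetical file names for a single individual.
commands = build_gbs_commands("ref.fa", "s1_R1.fastq", "s1_R2.fastq", "s1")
for command in commands:
    print(command)
```

Printing the commands first makes it easy to inspect them before pasting them into a shell or wrapping them in a job scheduler script.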
Discussion
POV discussion about the method.
Links to related discussion on BioStar: Genotyping By Sequencing on Biostars