Bioinformatics/Sequence Data/Polymers Sequences
From Wikibooks, the open-content textbooks collection
Sequences of DNA, RNA and proteins
DNA, RNA and proteins are linear (unbranched) chains: DNA and RNA are chains of nucleotides, and proteins are chains of amino acids. The chemical term for this type of molecule is linear heteropolymer: 'hetero' because it is built from several types of monomer (usually 4 types of nucleotides or 20 types of amino acids), and 'poly' because it contains many monomers. Because they occur in living cells, they belong to the class of biopolymers. Polymerisation is a condensation reaction in which water molecules are released. Strictly speaking, therefore, the chains of DNA, RNA and proteins are not composed of nucleotides or amino acids but of nucleotide residues or amino acid residues, often abbreviated to 'residues'.
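The difference between a monomer and a residue can be made concrete with a little arithmetic. The sketch below assumes that one water molecule is lost per bond formed, so a linear chain of n residues has lost n - 1 water molecules. The function name chain_mass and the rounded average masses (in daltons) for glycine, alanine and water are illustrative assumptions, not part of the original text.

```python
# Approximate average molecular masses in daltons (rounded, illustrative values).
MONOMER_MASS = {"G": 75.07, "A": 89.09}  # free glycine and free alanine
WATER_MASS = 18.02


def chain_mass(sequence: str) -> float:
    """Mass of a linear chain: sum of the free monomers minus one water per bond."""
    if not sequence:
        return 0.0
    bonds = len(sequence) - 1  # a linear chain of n residues has n - 1 bonds
    total = sum(MONOMER_MASS[residue] for residue in sequence)
    return total - bonds * WATER_MASS


# The dipeptide glycine-alanine weighs less than the two free amino acids together.
print(round(chain_mass("GA"), 2))
```

The chain is lighter than the sum of its free monomers, which is why its building blocks are called residues rather than nucleotides or amino acids.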
Nucleotide and amino acid sequences are conventionally written in a single-letter code, in which each residue is represented by one letter of the alphabet. The linear sequence of residues is thus written as a sequence of code characters, in computer terms a 'string'.
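As a minimal sketch of what this means in practice, the Python example below stores sequences as ordinary strings and checks them against a single-letter alphabet. The alphabets cover only the standard unambiguous codes (4 nucleotides for DNA and RNA, 20 amino acids). The helper is_valid and the example sequences are hypothetical and serve only as illustration.

```python
# Single-letter alphabets (standard, unambiguous codes only).
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid(sequence: str, alphabet: set) -> bool:
    """Return True if every character of the sequence belongs to the alphabet."""
    return all(residue in alphabet for residue in sequence.upper())


# Hypothetical example sequences, stored as plain strings.
dna = "ATGGCC"
protein = "MKTAYIAK"

print(len(dna))                           # number of residues in the chain
print(dna[0])                             # first residue of the chain
print(is_valid(dna, DNA_ALPHABET))        # every letter is a DNA code
print(is_valid(protein, DNA_ALPHABET))    # 'M' and others are not DNA codes
```

Because a sequence is just a string, the usual string operations (length, indexing, slicing, searching) apply to it directly.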