Structural Biochemistry/Proteins/Purification/Edman Sequencing

Edman Degradation

Edman degradation is a method of sequencing a peptide by sequentially removing one residue at a time from its amino (N-terminal) end.

To solve the problem of hydrolysis conditions damaging the protein, Pehr Edman devised a new way of labeling and cleaving the peptide that removes only one residue at a time and leaves the rest of the chain intact. Phenyl isothiocyanate is added, which reacts with the N-terminal amino group to form a phenylthiocarbamoyl derivative. The labeled N-terminal residue is then cleaved under mildly acidic conditions, releasing a cyclic compound, the phenylthiohydantoin (PTH) amino acid. This does not damage the protein and leaves two products: the PTH-amino acid, which can be identified, and the original peptide shortened by one residue. The cycle can then be repeated on the shortened peptide, removing and identifying one residue at a time.
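
The cycle described above can be sketched in a few lines of Python. This is only a bookkeeping model, assuming the peptide is written as a string of one-letter amino acid codes; the function name `edman_cycles` and the example peptide are hypothetical, and none of the chemistry is simulated.

```python
def edman_cycles(peptide, cycles=None):
    """Yield (released_residue, remaining_peptide) for each Edman cycle.

    Each cycle removes exactly one residue from the N-terminal end
    (the left end of the string) and leaves the rest unchanged.
    """
    remaining = peptide
    n = len(peptide) if cycles is None else min(cycles, len(peptide))
    for _ in range(n):
        released, remaining = remaining[0], remaining[1:]
        yield released, remaining


peptide = "AGDFRG"  # hypothetical example peptide
steps = list(edman_cycles(peptide))

for i, (residue, rest) in enumerate(steps, start=1):
    print(f"cycle {i}: PTH-{residue} released, remaining peptide: {rest or '(none)'}")

# Reading the released residues in order reconstructs the sequence.
sequence_read = "".join(residue for residue, _ in steps)
```

Because every cycle leaves a peptide that is one residue shorter but otherwise intact, the residues are read off in order from the N-terminal end.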

Edman degradation is very useful because it does not damage the rest of the protein, so the same sample can be carried through successive cycles. This allows sequencing of the protein to be done in less time.

Bergmann Degradation

Another, less conventional way of sequencing polypeptides stepwise is to degrade peptide bonds using benzyl chemistry. The process is carried out in three major steps. First, an acyl azide undergoes a Curtius rearrangement in the presence of benzyl alcohol. In the Curtius rearrangement, nitrogen gas is released and an R group shifts to nitrogen, generating an isocyanate; the reaction is often drawn as two steps through an acyl nitrene, but the thermal rearrangement is generally regarded as concerted. Second, the isocyanate reacts with benzyl alcohol to give the intermediate product, a benzyl carbamate. Lastly, the carbobenzyloxy group, which acts as a protecting group, is cleaved by hydrogenolysis, finally yielding a primary amide and an aldehyde. In summary, the process requires benzoylation, conversion to azides, and treatment of the azides with benzyl alcohol; this treatment yields, via rearrangement to isocyanates, carbobenzoxy compounds, which undergo catalytic hydrogenation and hydrolysis to yield the final products.

Amino Acid Composition

Edman sequencing works best if the amino acid composition of the peptide is known. To determine the composition, the peptide must be hydrolyzed into its constituent amino acids. This can be done by heating it in 6 M HCl at about 110 °C for roughly 24 hours. The freed amino acids can then be separated by ion-exchange chromatography. They are reacted with ninhydrin, which gives a colored product, and the amount of each amino acid is determined from its optical absorbance. In this way the composition, but not the sequence, can be determined.
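
A short sketch may help show why composition alone does not give the sequence. It assumes peptides are written as strings of one-letter codes; the function name `composition` and both example peptides are hypothetical.

```python
from collections import Counter


def composition(peptide):
    """Count how many times each amino acid occurs, ignoring order.

    This mirrors total hydrolysis: the residues are all released,
    but the information about their positions is gone.
    """
    return Counter(peptide)


peptide_a = "AGDFRG"  # hypothetical peptide
peptide_b = "GFARDG"  # same residues in a different order

comp_a = composition(peptide_a)
comp_b = composition(peptide_b)

print(sorted(comp_a.items()))
print("same composition:", comp_a == comp_b)
print("same sequence:", peptide_a == peptide_b)
```

The two peptides give identical counts even though their sequences differ, so the composition can check a sequencing result but cannot replace it.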

Dabsyl Chloride

A less useful approach is to label only the N-terminal residue. This can be done with dabsyl chloride, which forms a covalent bond with the amino group and gives intensely colored derivatives that are easy to detect. A strong covalent bond is needed because the label must remain stable while the protein is hydrolyzed. After hydrolysis, the N-terminal residue can be identified; however, this method is not very practical for sequencing because the hydrolysis conditions destroy the rest of the protein. The procedure cannot be repeated on the same sample, so the sequence information beyond the N-terminal residue is lost.

Sequencing Larger Proteins

Larger proteins cannot be sequenced directly by Edman degradation because each cycle is less than perfectly efficient, and the losses accumulate over many cycles. A strategy called divide and conquer cleaves the larger protein into smaller peptides of practical size. This is done by using a chemical or enzyme that cleaves the protein at specific amino acid residues. The resulting peptides can be isolated by chromatography. Because of their smaller size, they can then be sequenced using the Edman method.
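
The effect of imperfect efficiency can be made concrete with a little arithmetic. The sketch below assumes that every cycle succeeds on the same fraction of the peptide molecules, so the fraction still in step after n cycles is the per-cycle efficiency raised to the power n. The 98% figure is an illustrative assumption, not a value taken from this text, and `cumulative_yield` is a hypothetical helper.

```python
def cumulative_yield(efficiency, cycles):
    """Fraction of molecules still in step after the given number of cycles,
    assuming each cycle independently succeeds with the same efficiency."""
    return efficiency ** cycles


ASSUMED_EFFICIENCY = 0.98  # illustrative per-cycle efficiency (assumption)

for n in (10, 50, 100):
    fraction = cumulative_yield(ASSUMED_EFFICIENCY, n)
    print(f"after {n:3d} cycles: {fraction:.1%} of molecules still in step")
```

Under this assumption the correct signal shrinks steadily with each cycle while out-of-step molecules contribute growing background, which is why short peptides are sequenced far more reliably than long ones.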

To put the sequences of the different peptides in the correct order, a method of overlapping peptides is used. The strategy of divide and conquer followed by Edman sequencing is applied a second time, but with a different enzyme or chemical that cleaves the protein at different residues. This gives two different sets of peptide sequences from the same protein, cut at different points. By comparing the two sets and looking for overlaps between them, the sequence of the original protein can be deduced.
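
The overlap idea can be sketched as a small search, assuming both sets of peptides have already been sequenced and are written as one-letter strings. The function name `assemble` and the peptide sets are hypothetical, and trying every ordering is only practical for a handful of peptides; it is a simple illustration, not how real assembly software works.

```python
from itertools import permutations


def assemble(first_set, second_set):
    """Return every ordering of first_set (joined into one string) in which
    each peptide of second_set appears as a contiguous stretch."""
    candidates = set()
    for order in permutations(first_set):
        candidate = "".join(order)
        if all(peptide in candidate for peptide in second_set):
            candidates.add(candidate)
    return sorted(candidates)


# Hypothetical peptides from two different cleavages of the same protein,
# listed in scrambled order as they would come off a column.
first_set = ["LYK", "SE", "GAK", "FVR"]
second_set = ["KSE", "Y", "GAKF", "VRL"]

solutions = assemble(first_set, second_set)
print(solutions)
```

Each peptide in the second set spans a junction between two peptides of the first set, and those junctions are what fix the order. If the overlaps were not informative enough, the function would return more than one candidate.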

For example, trypsin can be used on the initial protein to cleave it on the carboxyl side of arginine and lysine residues. Cleaving the protein with trypsin and sequencing each resulting peptide with Edman degradation yields many separate sequences. Although the sequence of each individual peptide is known, the order of the peptides in the protein is not. Chymotrypsin, which cleaves on the carboxyl side of aromatic and other bulky nonpolar residues, can then be used on a second sample. The sequences of the chymotryptic peptides overlap those of the tryptic peptides, and the overlaps can be used to find the sequence of the initial protein. However, this method is limited in analyzing larger proteins (more than 100 amino acids). It also yields only the linear sequence of a protein, provided the pieces are small enough to sequence: hydrogen-bonded secondary structure and other weak interactions, such as hydrophobic interactions, cannot be properly predicted from it.
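
The two digests can be sketched with one helper that cuts after a chosen set of residues. This is a simplified model: the residue sets below (K and R for trypsin; F, W, Y, L and M for chymotrypsin) are assumptions standing in for the rules stated above, real enzymes have exceptions that are ignored here, and the function name `cleave_after` and the example protein are hypothetical.

```python
def cleave_after(protein, residues):
    """Split the protein on the carboxyl side of every residue in `residues`."""
    fragments, start = [], 0
    for i, amino_acid in enumerate(protein):
        if amino_acid in residues:
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


TRYPSIN_SITES = set("KR")         # lysine, arginine
CHYMOTRYPSIN_SITES = set("FWYLM")  # simplified: aromatic and bulky nonpolar

protein = "GAKFVRLYKSE"  # hypothetical example protein

tryptic = cleave_after(protein, TRYPSIN_SITES)
chymotryptic = cleave_after(protein, CHYMOTRYPSIN_SITES)

print("tryptic peptides:     ", tryptic)
print("chymotryptic peptides:", chymotryptic)
```

In this example the chymotryptic peptide GAKF begins with the tryptic peptide GAK and runs into FVR, which shows that GAK is followed by FVR in the protein; the other junctions are ordered in the same way.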