# Structural Biochemistry/Stability, Mutation and Evolvability

Until recently, the accepted view was that proteins can tolerate most amino acid substitutions. This view has been tested and replaced by the concept that the deleterious effects of mutations are the major constraint on a protein's ability to change its sequence and function. The review summarized here analyzes the different methods for predicting the stability effects of a mutation and the different mechanisms that compensate for destabilizing effects (and thereby promote protein evolvability). The most widely accepted idea had been that most positions in a protein can endure drastic sequence changes while the protein retains its configurational stability and function. Although there were exceptions to this view, the hypothesis assumed that stability is correlated with activity changes. In 2005, two papers were published that highlighted the importance of the stability effects of mutations for protein evolution; these were then studied further and led to a new link between protein biophysics and molecular evolution.

## Protein Fitness

Mutations can be described as the "raw material" for evolution, but selection to sustain a protein's existing structure and function purges most mutations, and this reduces the potential for future adaptations. As a result, only a small fraction of all mutations that occur will be fixed under positive selection to adopt and maintain a new function. Neutral mutations can also be fixed by random drift in small populations, a process termed "neutral drift". At the level of the organism, reproduction rates (fitness, W) are not simple, and they rarely relate to the properties of a single gene or a single protein. Redundancy, backup, and robustness at many different levels mask the effects of mutations. For these reasons, quantifying the effects of mutations is difficult for evolutionary biologists. A simple model of protein fitness can nevertheless be written as an equation. Protein fitness (W) is taken to be the flux of an enzyme-catalyzed reaction, which is in turn related to the fitness of the organism in which the enzyme functions. This flux is proportional to the concentration of functional protein, ${\displaystyle [E_{0}]}$, and to its function, f.

${\displaystyle W=[E_{0}]f}$

Research has shown that the concentration of functional protein ${\displaystyle [E_{0}]}$ is related to protein stability. The deleterious effects of roughly 80% or more of mutations stem from their effects on stability and folding. Destabilizing mutations beyond a certain level reduce the amount of soluble, functional protein, and this causes loss of function. Experimental measurements on different proteins show that the probability that a mutation is deleterious is in the range of 33-40% (36% on average). Therefore, as mutations accumulate, protein fitness declines exponentially:

${\displaystyle W\approx e^{-0.36n}}$

The equation above shows how protein fitness declines as mutations accumulate, where n is the average number of mutations. By the time an average protein has accumulated about five mutations, its fitness will have declined to less than 20% of its original value. Because the initial stability of a protein can buffer some of the destabilizing effects of mutations, the rate of protein evolution is dictated in part by the stability of the protein, and stability can thus be related to the acquisition of new functions.
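As a quick numerical check of the exponential model, the short sketch below evaluates W ≈ e^(-0.36n) for a few mutation counts. The function name `expected_fitness` and the constant name are illustrative choices, not part of the source; the 0.36 rate is the average probability quoted in the text.

```python
import math

# Average probability that a single mutation is deleterious (value quoted in the text).
DELETERIOUS_FRACTION = 0.36


def expected_fitness(n_mutations, rate=DELETERIOUS_FRACTION):
    """Relative protein fitness W ~ exp(-rate * n) after n accumulated mutations."""
    return math.exp(-rate * n_mutations)


if __name__ == "__main__":
    for n in range(0, 11):
        print(f"{n:2d} mutations -> W ~ {expected_fitness(n):.3f}")
```

With n = 5 the function returns a value below 0.2, which matches the statement that about five mutations reduce fitness to less than 20% of its starting value.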

## Thermodynamic Stability

The free energy of folding (∆G) is used in various models of protein evolution because it is the standard measure of stability. Thermodynamic stability is the energy difference between the unfolded and native states of the protein, but this measurement is only reasonable for small proteins, and it does not represent the stability of proteins within cellular environments. Kinetic stability is therefore also of great value: it relates to the energy levels of the folding intermediates between the unfolded and native states, and it can include misfolded forms of the protein, which can lead to aggregation and/or degradation. Experimental data on the changes in thermodynamic stability caused by mutations (∆∆G) are available only for a small range of proteins. Recent advances in computation, however, have enabled the prediction of ∆∆G values of mutations across a much wider range of proteins. The predictions can be based on sequence, on 3D structure, or on a combination of both. These predictions relate largely to the effects of mutations on the native state and therefore do not include effects on non-native states. The effects on folding in vivo have nevertheless been observed to overlap greatly with the thermodynamic stability effects. Predictions of kinetic stability effects would therefore be of great value, and a remaining challenge is to predict more accurately how mutations affect protein levels in vivo.

## Relationship between Stability and Protein Fitness

The relationship between protein fitness and stability is described by the following equation:

${\displaystyle W\propto [E_{0}]=1-{\frac {1}{e^{-{\frac {\Delta G}{RT}}}+1}}}$

This sigmoidal relationship shows that at a stability of -3 kcal/mol more than 99% of the protein molecules are folded; many proteins have stabilities in the range of several kcal/mol. However, proteins less stable than -3 kcal/mol (that is, with ΔG closer to zero) risk shifting the equilibrium away from the folded, functional state. By the above equation, such a stability implies a certain fraction of misfolded or partially folded protein, which could lead to irreversible aggregation and degradation.
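The folded fraction implied by the equation above can be evaluated directly. The following is a minimal sketch, assuming ΔG is given in kcal/mol with negative values meaning a stable fold, and assuming a temperature of 298.15 K; the function and constant names are illustrative.

```python
import math

# Gas constant in kcal/(mol*K); the temperature default is an assumption (25 C).
R_KCAL = 0.0019872


def fraction_folded(delta_g, temperature=298.15):
    """Fraction of folded protein from the two-state equation in the text.

    delta_g is the folding free energy in kcal/mol (negative = stable).
    """
    rt = R_KCAL * temperature
    return 1.0 - 1.0 / (math.exp(-delta_g / rt) + 1.0)


if __name__ == "__main__":
    for dg in (-10.0, -5.0, -3.0, -1.0, 0.0, 1.0):
        print(f"dG = {dg:6.1f} kcal/mol -> folded fraction = {fraction_folded(dg):.4f}")
```

At ΔG = -3 kcal/mol the function returns a value above 0.99, and at ΔG = 0 it returns 0.5, the midpoint of the sigmoid.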

### Threshold model

Figure: Threshold Robustness Model

The above equation also implies that a protein can tolerate only a limited number of mutations before its fitness decreases. ${\displaystyle [E_{0}]}$ (or protein fitness, since the two are proportional) stays essentially constant as long as stability remains above a certain threshold, referred to as ΔGt, as shown in the figure “Threshold Robustness Model.” If the threshold were to increase (threshold robustness line in green), there would be a higher tolerance to mutations. Once accumulating mutations push stability past the threshold, however, protein fitness decreases rapidly. Many mutations leading to monogenic diseases show this sigmoidal relationship.

### Epistatic effects

The threshold model exemplifies negative epistasis (an increase in the harmful effect of a mutation when other mutations are present). The first few mutations have little or no effect on protein fitness because an excess of stability buffers their destabilizing effects. The stability effects of further mutations are additive, however, so stability keeps falling until it crosses the threshold and fitness declines. A further consequence of this buffering is that, because extra stability has no immediate effect on protein fitness, stability beyond the threshold is not favored by natural selection.
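The buffering described here can be illustrated with a small simulation. This is only a sketch under stated assumptions: every mutation is given the same hypothetical ΔΔG of +1 kcal/mol, the starting stability of -8 kcal/mol is invented for illustration, and RT is taken as about 0.593 kcal/mol. The folded fraction is written as 1/(1 + e^(ΔG/RT)), which is algebraically the same as the two-state equation given earlier.

```python
import math

RT = 0.593  # kcal/mol near 25 C (assumed temperature)


def fraction_folded(delta_g):
    """Two-state folded fraction; equivalent to 1 - 1/(exp(-dG/RT) + 1)."""
    return 1.0 / (1.0 + math.exp(delta_g / RT))


def fitness_trajectory(initial_dg, ddg_per_mutation, n_max):
    """Folded fraction after 0..n_max mutations with additive stability effects."""
    return [
        fraction_folded(initial_dg + n * ddg_per_mutation)
        for n in range(n_max + 1)
    ]


# Hypothetical protein: starts at -8 kcal/mol, loses 1 kcal/mol per mutation.
trajectory = fitness_trajectory(initial_dg=-8.0, ddg_per_mutation=1.0, n_max=10)

if __name__ == "__main__":
    for n, w in enumerate(trajectory):
        print(f"{n:2d} mutations -> folded fraction {w:.3f}")
```

Although each mutation removes the same amount of stability, the first four barely change the folded fraction, while the same number of later mutations removes most of it. That difference between early and late mutations is the negative epistasis described in the text.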

### Environmental Robustness and Phenotypic mutations

Genetic robustness can be explained by the threshold model, in which a protein maintains a higher tolerance to mutations with an increase in its threshold (ΔGt). Environmental robustness is one theory as to why genetic robustness arises in proteins: fluctuations in temperature, salinity, and other environmental factors could have influenced the evolution of higher stability. Another factor could be phenotypic mutations (errors in transcription and translation). Because these occur more frequently than genetic mutations, phenotypic mutations are believed to exert an immediate effect on protein fitness. The evolution of higher stability thresholds is therefore understood as a buffer against phenotypic mutations and other environmental factors.

Another type of robustness to mutations, gradient robustness, is associated with a small initial stability margin but also with a shallower slope, so that each mutation causes, on average, a smaller loss of stability. In a protein that is not tightly packed, a smaller stability change is expected because there are few residue contacts to begin with and therefore few interactions to lose. Proteins from RNA viruses do in fact show this type of relationship. These viruses have mutation rates several orders of magnitude higher than most other organisms, and their proteins show lower overall stability and are often loosely packed or partially disordered. In short, gradient robustness is the notion that proteins with strong, well-packed structures suffer larger stability losses per mutation than those whose residues make few contacts.
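The contrast between the two kinds of robustness comes down to simple arithmetic: the number of mutations a protein can absorb is its stability margin divided by the average loss per mutation. The sketch below uses two invented proteins and a threshold of -3 kcal/mol purely for illustration; none of the numbers are measurements.

```python
import math


def mutations_tolerated(initial_dg, ddg_per_mutation, threshold_dg=-3.0):
    """Largest whole number of mutations that keeps dG at or below threshold_dg.

    All values are in kcal/mol; ddg_per_mutation is the average destabilization
    per mutation (positive). The numbers used below are hypothetical.
    """
    margin = threshold_dg - initial_dg  # positive when the protein starts out stable
    if margin <= 0:
        return 0
    return math.floor(margin / ddg_per_mutation)


# Threshold robustness: large margin, steep slope (well-packed protein).
well_packed = mutations_tolerated(initial_dg=-10.0, ddg_per_mutation=1.0)

# Gradient robustness: small margin, shallow slope (loosely packed protein).
loosely_packed = mutations_tolerated(initial_dg=-5.0, ddg_per_mutation=0.25)

if __name__ == "__main__":
    print("well packed:", well_packed, "mutations tolerated")
    print("loosely packed:", loosely_packed, "mutations tolerated")
```

In this toy comparison the loosely packed protein starts with a much smaller margin yet tolerates a similar number of mutations, because each mutation costs it less stability.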

### New Function and Stability

Adaptive, new-function mutations typically occur at more buried residues and are more destabilizing than non-adaptive, neutral mutations. The ability of a protein to acquire mutations that confer new functions is therefore limited by the destabilizing effects of such mutations: if they accumulate, protein stability falls below ∆Gt, which decreases ${\displaystyle [E_{0}]}$ and ultimately protein fitness, even if the mutations improve function. The observation that mutations that improved the catalytic efficiency of TEM-1 β-lactamase toward third-generation antibiotics were destabilizing suggests a tradeoff between protein stability and the evolution of new functions. Consistent with this, compensatory mutations that re-establish protein stability are often seen after a change in function. FoldX predictions indicated that new-function mutations, while destabilizing, are not more destabilizing than the average mutation. Compared with neutral mutations, however, new-function mutations were found to be more destabilizing, and they occur more often at internal residues. Stabilized variants of P450 and TEM-1 exhibited greater evolvability, since they were able to accommodate more new-function mutations without a corresponding drop in enzyme levels.

## Stability and Evolutionary Change through Uphill Divergence, Downhill Divergence, and Chaperones

Figure: Graphical representation of evolutionary change as protein stability shifts under different types of mutations.
The green area represents protein stability and the red area represents instability. Blue arrows represent stabilizing mutations and orange arrows represent new-function (destabilizing) mutations. The area boxed in yellow in Figure B represents chaperone buffering.

### Uphill Divergence

Compensatory mutations restore the stability margin of evolving proteins. They are also called global suppressors because they can suppress the harmful effects of a wide range of mutations, and they play an important role in the evolutionary dynamics of proteins (they have been observed in both natural and in vitro evolution). Most compensatory mutations are stabilizing. For example, in developing resistance to the antibiotic cefotaxime, TEM-1 acquired active-site mutations that provided the new resistance, followed by the stabilizing compensatory mutation Met182Thr. The rate of evolution is limited by the need for compensatory mutations to restore the stability of the evolving protein. Still, mutations that change protein function and destabilize the protein beyond ∆Gt cannot become fixed, except when buffered by chaperones.
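The bookkeeping behind uphill divergence can be sketched as a running sum of ΔΔG values along a mutational path. Everything numeric here is hypothetical: the starting stability, the threshold of -3 kcal/mol, and the ΔΔG values assigned to the new-function and compensatory mutations are invented for illustration and are not measured values for TEM-1 or Met182Thr.

```python
def stability_path(initial_dg, ddg_steps, threshold_dg=-3.0):
    """Track dG (kcal/mol) along a series of mutations with additive effects.

    Returns the list of dG values and whether every step stayed at or below
    the threshold, i.e. whether the protein stayed stable enough throughout.
    """
    dg = initial_dg
    path = [dg]
    for ddg in ddg_steps:
        dg += ddg
        path.append(dg)
    viable = all(value <= threshold_dg for value in path)
    return path, viable


# Hypothetical stability effects, in kcal/mol.
NEW_FUNCTION = 2.0    # destabilizing
COMPENSATORY = -2.5   # stabilizing

# Two new-function mutations in a row cross the threshold...
path_without, viable_without = stability_path(-6.0, [NEW_FUNCTION, NEW_FUNCTION])

# ...but a compensatory mutation in between keeps the path viable.
path_with, viable_with = stability_path(
    -6.0, [NEW_FUNCTION, COMPENSATORY, NEW_FUNCTION]
)

if __name__ == "__main__":
    print(path_without, viable_without)
    print(path_with, viable_with)
```

The second path needs one extra mutation before the second new-function mutation can be accepted, which is one way to picture why the need for compensatory mutations limits the rate of evolution.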

### Stabilizing Ancestor/Consensus Mutations and Downhill Divergence

By pairing a compensatory mutation (a stabilizing factor) with a new-function mutation (a destabilizing factor), the overall stability of the protein is maintained. However, a large excess of stability could hinder evolvability if the protein becomes rigid and alternative conformations that could support new functions are restricted. One way of using downhill divergence in protein engineering would be to incorporate compensatory mutations into the library that is selected for the enzyme’s new function; however, this would require the ability to predict stabilizing compensatory mutations. A hint toward predicting them came from a neutral drift experiment (multiple rounds of mutation and purifying selection to maintain the enzyme’s function): several different mutations were enriched, and the five mutations showing the highest enrichment increased stability and acted as compensatory mutations for a range of destabilizing mutations. The enriched mutations had one thing in common: they all changed the sequence of TEM-1 to be closer to its family consensus and/or its ancestor. A mutation at a conserved residue usually causes a large drop in stability, while stability can be increased by reverting residues that deviate from the consensus amino acid. Ancestral inference and/or consensus analysis can therefore possibly be used to predict compensatory mutations, which can then facilitate the engineering of more stable proteins with new functions through downhill divergence.
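Consensus analysis as described here can be sketched in a few lines: take the most common residue in each column of an alignment of homologs, then list the positions where the target protein deviates from it. The sequences below are made-up toy strings, not real β-lactamase sequences. The sketch assumes the alignment is gap-free and already aligned to the target, and ties between residues are broken by first appearance.

```python
from collections import Counter


def consensus_sequence(alignment):
    """Most common residue in each column of a gap-free alignment."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*alignment)
    )


def back_to_consensus(target, alignment):
    """Candidate mutations (e.g. 'S3T') that move the target toward consensus.

    Positions are 1-based. These are only candidates: whether a given
    back-to-consensus change is stabilizing must be tested experimentally.
    """
    consensus = consensus_sequence(alignment)
    return [
        f"{t}{i}{c}"
        for i, (t, c) in enumerate(zip(target, consensus), start=1)
        if t != c
    ]


# Toy data: four hypothetical homologs and a target that deviates at two sites.
homologs = ["MKTAYL", "MKTAYL", "MRTAYL", "MKTGYL"]
target = "MKSAYV"

if __name__ == "__main__":
    print("consensus:", consensus_sequence(homologs))
    print("candidates:", back_to_consensus(target, homologs))
```

In a protein engineering setting, the candidate list would be a starting point for a library of possible compensatory mutations rather than a prediction that each change is stabilizing.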

### Chaperones and Protein Evolvability

Chaperones are known to assist in the folding of proteins, but they can also buffer the effects of mutations. Though the extent of this buffering and its impact on evolutionary rates are unknown, chaperones seem to extend the zone of neutrality, allowing destabilizing mutations to accumulate. A means of measuring the buffering capacity of the bacterial chaperonin GroEL/ES was established recently: mutation accumulation experiments were performed with overexpression of GroEL/ES, and the proteins that accumulated mutations were then examined for the number and type of acquired mutations and for the amount of buffering needed for stability. Under overexpression of GroEL/ES, the number of accumulated neutral mutations doubled, with increased variability. There were more mutations in the proteins’ cores, and the mutations had, on average, much higher destabilizing effects than in the absence of GroEL/ES. Overexpression of GroEL/ES has also been shown to speed up the acquisition of a new enzymatic specificity. In one case, variants of an enzyme selected under overexpression of GroEL/ES had a mutation that greatly improved the newly evolving activity but was also strongly destabilizing. Variants selected without GroEL/ES carried a different mutation that gave a smaller improvement with no destabilization, and variants selected without overexpression of chaperonins showed no improved function, or even decreased function owing to lower enzyme concentrations.

The chaperonin GroEL is known to use ATP to help proteins fold. In this process, an unfolded protein binds to GroEL where it is not blocked by GroES. ATP then binds to the GroEL heptamers, which leads to ATP hydrolysis and the release of 14 ADP and GroES. GroEL bound to 7 ATP and GroES then forms a pocket in which the protein can fold. Proteins released from the pocket are completely or partially folded, while proteins that remain unfolded are sent back through the cycle.

## Reference

Tokuriki, Nobuhiko, and Dan S. Tawfik. "Stability Effects of Mutations and Protein Evolvability." Current Opinion in Structural Biology 19.5 (2009): 596-604. Print.