Proteomics/Proteomics and Drug Discovery/Structure-Based Drug Design

From Wikibooks, open books for an open world
  1. Introduction
  2. Structure-Based Drug Design
  3. Virtual Compound Libraries
  4. Docking and Scoring
  5. Software Tools
  6. Protein Aggregation

Structure-Based Drug Design


In the early 1980s, researchers could not take advantage of structure-based methods in the drug discovery process. Several factors were responsible, the most important being a lack of computing power and of docking programs that could test potential models, together with a lack of interest in the established community that stemmed in part from this lack of tools. In the 1990s, however, computing power and the number of available programs grew rapidly, and the X-ray crystallographic structures needed for any type of computational study became cheaper and more reliable to obtain. This marked the start of a new era in which the first attempts and successes were published, the two most famous being the HIV-1 protease inhibitors and the renin inhibitors (developed to combat hypertension) (Lunney). In contemporary drug discovery, structure-based methods are an integral part of the drug development process. This change can be attributed to rapid advances in genomics and structural biology, as well as to developments in information technology. Advances in several fields that are vital to drug discovery have increased the pace of drug development. However, many years of research are still required before a drug that is both effective and tolerated by the human body can be marketed (Anderson). Despite the new technology and increased funding by pharmaceutical companies, the difficulty of finding safe, effective compounds has meant that the rate at which therapeutic agents are released to the general public has not increased significantly (Lindsay).

The drug discovery process includes many steps. First, the drug target has to be selected. In most cases this will be a protein; however, recent studies have shown that RNA, with its well-defined secondary structure, is also an effective drug target. DNA is also a target of newer drugs, especially in chemotherapy (see below). At least 25 percent of currently marketed drugs target G-protein-coupled receptors (GPCRs), while another 22 percent affect the function of ion channels, proteases, kinases, and nuclear hormone receptors. In 2003, only 2 percent of therapeutic targets were DNA or RNA. Once a target is selected, it has to be cloned and purified so that its structure can be determined. The most common method of structure determination is X-ray crystallography, but two other approaches are often employed as well: NMR, which for proteins relies on multidimensional experiments that increase the number of unique signals and the signal-to-noise ratio, and homology modeling, which uses the structures of closely related proteins to model a protein whose structure is unclear. New methods in phase determination, along with complete automation, have allowed for high-throughput X-ray crystallography, which has greatly increased the speed of structure determination. As soon as the structure of the target is known and a potential ligand binding site is identified, computational methods can be used to dock a large number of small molecules into the specified position. These small molecules, which are stored in a database, are then scored and ranked based on their steric and electrostatic interactions with the target site. The best-performing compounds, called ‘hits’, are selected for biochemical assays and further testing. Hit compounds have to be effective at very low concentrations, at the micromolar level or below. The hits are then optimized by chemical synthesis into more potent ‘leads’. Selected leads are further scrutinized in rigorous cytotoxicity tests, pharmacokinetic studies, and toxicology investigations to search for side effects; successful drug prototypes eventually proceed to phase 0, I, and II clinical trials (Alanine, Anderson).
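
The scoring-and-ranking step described above can be illustrated with a minimal sketch. It assumes a toy scoring function that sums a Lennard-Jones term (steric) and a Coulomb term (electrostatic) over all ligand-site atom pairs. The parameter values, the compound names, and the coordinates are hypothetical and chosen only for illustration; real docking programs use far more elaborate force fields, and they also search over ligand poses.

```python
import math

def pair_energy(r, q1, q2, epsilon=0.2, sigma=3.5, dielectric=4.0):
    """Toy interaction energy (kcal/mol) for two atoms r angstroms apart.

    epsilon, sigma and dielectric are illustrative values, not taken
    from any particular force field.
    """
    # Steric term: 12-6 Lennard-Jones potential.
    sr6 = (sigma / r) ** 6
    steric = 4.0 * epsilon * (sr6 * sr6 - sr6)
    # Electrostatic term: Coulomb's law; 332.06 converts
    # (elementary charge)^2 / angstrom into kcal/mol.
    electrostatic = 332.06 * q1 * q2 / (dielectric * r)
    return steric + electrostatic

def score_pose(ligand_atoms, site_atoms):
    """Sum pair energies over all ligand-site atom pairs (lower is better).

    Each atom is a tuple (x, y, z, partial_charge).
    """
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for sx, sy, sz, sq in site_atoms:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            total += pair_energy(r, lq, sq)
    return total

def rank_compounds(poses, site_atoms):
    """Return (name, score) pairs sorted from best to worst score."""
    scored = [(name, score_pose(atoms, site_atoms)) for name, atoms in poses.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical binding site: a single negatively charged atom at the origin.
site = [(0.0, 0.0, 0.0, -0.5)]

# Hypothetical one-atom "compounds" already placed in the site.
poses = {
    "cmpd_A": [(4.0, 0.0, 0.0, +0.5)],  # complementary charge, good distance
    "cmpd_B": [(4.0, 0.0, 0.0, -0.5)],  # same charge, repelled
    "cmpd_C": [(2.0, 0.0, 0.0, +0.5)],  # complementary charge, steric clash
}

ranking = rank_compounds(poses, site)
for name, score in ranking:
    print(f"{name}: {score:.2f} kcal/mol")
```

In this toy setup the compound with a complementary charge at a comfortable distance ranks first, while the one placed too close to the site atom is penalized heavily by the steric term, even though its electrostatic interaction is favorable.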

Drug Target Selection and Identification of the Target Site

The selection of the drug target is mainly based on biological and biochemical considerations. Proteomics is still of relatively limited use as a tool in this area because of the sheer complexity of protein expression in any given cell. Despite this, the usefulness of proteomics has grown in other areas of the drug discovery process, including biomarker identification and tracking. The ideal drug target for structure-based drug design should bind a small molecule and should be closely associated with the disease. The small molecule then either changes the function of the drug target or, in the case of a pathogenic organism, inhibits the function of the target, which ideally leads to the death of the pathogen. In the latter case, the drug target should be present only in the diseased cells or pathogen and should have a unique function that allows for and encourages this selectivity. Furthermore, the uniqueness of the drug target helps ensure that another pathway cannot restore the function of an inhibited target. The structure-based search for anti-cancer and autoimmune drugs is much more challenging, since the drug targets regulate essential cell functions. Hence, these targets are not unique and isolated; inhibiting their function affects not only mutant or over- or under-activated cells but also normal cells. For example, the phosphoinositide 3-kinase (PI3K) pathway is involved in cellular growth and has also been directly linked with pancreatic cancer (Reddy). Trying to separate convoluted pathways such as this makes the drug discovery process much more complex than it is for other diseases. Another option for targeting in cancer is DNA. Cisplatin and bleomycin, which cross-link and cleave DNA respectively, are used to slow down cell division, particularly in cancerous tissues (Singh).

The ligand binding site of a drug target should be a pocket containing both hydrogen bond donors and acceptors, as well as hydrophobic residues. In many cases, the selected target location is the active site of an enzyme, as with sildenafil citrate (Viagra), which targets the catalytic subunit of NADPH oxidase (Jeremy); however, it can also be an assembly or regulatory site, as is the case with the phosphotransferase regulatory domain of the Bacillus subtilis Spo0F protein, a response regulator in the sporulation phosphorelay (Dai-Fu). Even protein-protein interaction sites, which are often large and planar, have been selected as target sites: 2-oxoglutarate, a naturally occurring molecule, affects the monomer-monomer interactions of GlnK, an inhibitor of the ammonia transporter AmtB (Anderson, Stroud).
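
As a rough illustration of the pocket criterion above, the following sketch counts how many residues lining a candidate pocket can act as side-chain hydrogen bond donors, acceptors, or hydrophobic contacts. The residue classification is a deliberately simplified assumption (backbone NH and C=O groups, protonation states, and geometry are ignored), and the function names and example pockets are hypothetical.

```python
# Simplified side-chain classification by three-letter residue code.
# Assumption: only side chains are considered, at roughly neutral pH.
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO"}
HBOND_DONOR = {"ARG", "LYS", "ASN", "GLN", "HIS", "SER", "THR", "TYR", "TRP"}
HBOND_ACCEPTOR = {"ASP", "GLU", "ASN", "GLN", "HIS", "SER", "THR", "TYR"}

def pocket_profile(residues):
    """Count donor, acceptor and hydrophobic residues lining a pocket."""
    names = [r.upper() for r in residues]
    return {
        "donors": sum(n in HBOND_DONOR for n in names),
        "acceptors": sum(n in HBOND_ACCEPTOR for n in names),
        "hydrophobic": sum(n in HYDROPHOBIC for n in names),
    }

def has_mixed_character(residues):
    """True if the pocket offers at least one residue of each kind."""
    return all(count > 0 for count in pocket_profile(residues).values())

# Hypothetical pockets, listed by the residues that line them.
mixed_pocket = ["ASP", "SER", "LEU", "PHE", "LYS"]
greasy_pocket = ["LEU", "VAL", "ILE"]

print(pocket_profile(mixed_pocket), has_mixed_character(mixed_pocket))
print(pocket_profile(greasy_pocket), has_mixed_character(greasy_pocket))
```

A check like this is only a first filter; it says nothing about the shape or accessibility of the pocket.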

Proteomics as a Tool to Discover Biomarkers

‘Biomarker’ is short for ‘biological marker.’ A biomarker is a molecule, indicator, or test that can be used to measure processes such as disease progression, infection stage, and drug efficacy, as well as various other biological functions. Biomarkers can also be used in safety studies for therapeutic agents. Although the most common biomarkers in use today are small molecules and proteins, the growing fields of pharmacogenetics and pharmacogenomics are attempting to use genotypes, haplotypes, and single nucleotide polymorphisms as biomarkers (Frank).

With the growing emphasis on biomarkers as indicators, the National Institutes of Health (NIH) set up a ‘Biomarkers and Surrogate Endpoint Working Group,’ which has established a classification system for biomarkers. Type 0 biomarkers are more symptomatic in nature and track disease progression over its complete history. They are used in phase 0 clinical studies, which apply well-developed assay techniques to highly regulated populations for specific durations. The goal of these studies is to achieve simple positive or negative results for the drug or system being studied. Type I biomarkers are used to track any type of compound injected into the biological system. The most familiar examples are drug trials, in which researchers look for a specific effect that may be positive or negative. The final classification is Type II biomarkers, which are used to determine a ‘surrogate endpoint.’ A surrogate endpoint, as defined by the FDA, “is a marker – a laboratory measurement or physical sign – that is used in the therapeutic trials as a substitute for a clinically meaningful endpoint that is a direct measure of how a patient feels, functions or survives and is expected to predict the effect of the therapy.” In other words, a surrogate endpoint moves beyond the concept of a single biomarker and into a realm where many biomarkers, or none, may be sufficient. Other measures, including overall health and mortality, may be studied to determine the efficacy of a treatment regimen. Although the use of surrogate endpoints is still in its early stages, two examples that have been accepted are blood pressure and cholesterol, which have a firm connection to cardiovascular health and mortality (Frank).

In addition to the above effects, a biomarker should correlate well with the disease condition and minimize the number of false positives as well as false negatives. Thus, a biomarker should be able to discriminate accurately between a normal and an infected condition with high reproducibility. The challenge in proteomics research is to identify unique biomarkers from complex biological mixtures that unequivocally correlate with the disease condition. Biomarkers can be used for many purposes. Established biomarkers are useful as risk factor indicators, capable of showing that a person is susceptible to a disease. QT prolongation, a measure of change in the ventricular electrical cycle, is used to assess a patient’s chances of survival after a heart attack, as is troponin T, a cardiac protein whose levels rise following a heart attack. 5-Hydroxytryptophan is a metabolic precursor to serotonin and has been found to localize around neoplasms in neuroendocrine tissue. Labelling this molecule allows these neoplasms to be visualized by positron emission tomography (PET), a technique that, with fluorodeoxyglucose as the tracer (FDG-PET), is used in visualizing many types of malignancies. Another PET application uses SPA-RQ, a neurokinin-1 binder, whose injection is used to track binding of Aprepitant™, a drug used in chemotherapy patients to control vomiting (Frank).
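
The requirement that a biomarker minimize false positives and false negatives is usually quantified with sensitivity and specificity. The sketch below computes these, together with the positive and negative predictive values, from the four counts of a confusion matrix. The function name and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Summarize how well a biomarker test separates diseased from healthy.

    tp: diseased subjects the test flags      (true positives)
    fp: healthy subjects the test flags       (false positives)
    tn: healthy subjects the test clears      (true negatives)
    fn: diseased subjects the test misses     (false negatives)
    """
    def ratio(numerator, denominator):
        # Undefined when a group is empty; report None rather than guess.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of diseased detected
        "specificity": ratio(tn, tn + fp),  # fraction of healthy cleared
        "ppv": ratio(tp, tp + fp),          # chance a positive is real
        "npv": ratio(tn, tn + fn),          # chance a negative is real
    }

# Hypothetical study: 100 diseased and 200 healthy subjects.
metrics = diagnostic_metrics(tp=90, fp=40, tn=160, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the test detects 90 percent of diseased subjects but only about 69 percent of its positive results are real, which shows why a biomarker needs both few false negatives and few false positives to be useful.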

Many more biomarkers are currently being studied and developed, and this number will continue to grow as our understanding of the body evolves. The combination of tracer molecules and imaging techniques, as with 5-hydroxytryptophan, creates a powerful method for visualizing areas that should not be touched surgically except as a last resort. The growing use of proteomics to find biomarkers is important as well. Recent advances in analytical techniques such as mass spectrometry and column chromatography have allowed for more comprehensive studies of protein expression. In addition, 2-dimensional electrophoresis provides an overall view of expression, in similar fashion to gene studies. Proteomics studies using these techniques have identified biomarkers for ovarian cancer and macular degeneration and have characterized lipoprotein composition. Development of these techniques will undoubtedly continue in the future (Frank).


Asano, T.; Yao, Y.; Shin, S.; McCubrey, J.; Abbruzzese, J. L.; Reddy, S. A. Cancer Res. 2005, 65, 9164-9168.

Bleicher, K. H.; Bohm, H. J.; Muller, K.; Alanine, A. I. Nat. Rev. Drug Discov. 2003, 2, 369-378.

Cai, X. H.; Zhang, Q.; Shi, S. Y.; Ding, D. F. Acta Biochim. Biophys. Sin. (Shanghai) 2005, 37, 293-302.

Frank, R.; Hargreaves, R. Nat. Rev. Drug Discov. 2003, 2, 566-580.

Gruswitz, F.; O'Connell, J., 3rd; Stroud, R. M. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 42-47.

Lindsay, M. A. Nat. Rev. Drug Discov. 2003, 2, 831-838.

Lunney, E. Structure-Based Design and Two Aspartic Proteases.

Muzaffar, S.; Shukla, N.; Srivastava, A.; Angelini, G. D.; Jeremy, J. Y. Br. J. Pharmacol. 2005, 146, 109-117.

Singh, S.; Malik, B. K.; Sharma, D. K. Bioinformation 2006.
