Metabolomics/Analytical Methods/Statistical Methods

Statistical Methods

Every scientific experiment produces data. Once the data have been collected, they must be quantified and analyzed. Methods of statistical analysis are used to draw conclusions from the experiment.

Articles and Web Pages for Review and Inclusion

Peer-Reviewed Article #1:

Metabolic network discovery through reverse engineering of metabolome data

Metabolomics. 2009 September; 5(3): 318–329.

Reviewer: Ahman N

Main Focus

Developing and evaluating network inference on integrated metabolic data, using computational methods based on statistical similarity measures.

New Terms

Reverse engineering: Studying or analyzing a system (here, metabolite data) to learn the details of its operation or network structure, often in order to produce a copy or an improved version. (source: http://en.wikipedia.org/wiki/Reverse_engineering)
Pruning: An alternative mathematical approach for removing indirect interactions in network inference. (source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731157/?tool=pubmed)
Network inference: The prediction of a set of nodes and a set of directed or undirected edges between the nodes. (source: http://www.sysbio.org/sysbio/networkbio/)
in silico: Performed on a computer or via computer simulation. (source: http://en.wikipedia.org/wiki/In_silico)
False-positive rate: The probability of falsely rejecting the null hypothesis for a particular test among all the tests performed. (source: http://en.wikipedia.org/wiki/False_positive_rate)
Stochastic simulation: A simulation in which variables change randomly according to given probabilities; used to analyze chemical reaction systems, including those with many species and complex reaction kinetics. (source: http://en.wikipedia.org/wiki/Stochastic_simulation)
Permutation test: A type of statistical significance test in which a reference distribution is obtained by calculating all possible values of the test statistic under rearrangements of the labels on the observed data points. (source: http://en.wikipedia.org/wiki/Permutation_test#Permutation_tests)
Receiver Operating Characteristic (ROC): A curve of the true-positive rate against the false-positive rate as the decision threshold is varied; used in the article as a global measure of network inference quality for larger systems. (source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731157/?tool=pubmed)
Intrinsic variability: The fluctuation that occurs within cellular processes due to complex regulation patterns (e.g., pH, temperature, etc.). (source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2731157/?tool=pubmed)
Jacobian matrix: The matrix of first-order partial derivatives of a vector-valued function; used in the article to quantify the interaction strength between metabolite pairs in the in silico kinetic system. (source: http://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant)
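
As an illustration of the permutation test defined above, the following minimal Python sketch estimates how significant a correlation between two metabolite concentration profiles is. It is not the article's procedure: the data are synthetic, the names (pearson, permutation_p_value, met_a, met_b, met_c) are made up for this example, and it samples random permutations instead of enumerating all possible rearrangements, which is a common shortcut when the number of rearrangements is large.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided p-value for the correlation of x and y.

    The labels of y are shuffled repeatedly to build a reference
    distribution of |r| under the null hypothesis of no association.
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic example data (assumption: not taken from the article).
_rng = random.Random(1)
met_a = [_rng.gauss(0, 1) for _ in range(30)]
met_b = [2 * v + _rng.gauss(0, 0.5) for v in met_a]  # depends on met_a
met_c = [_rng.gauss(0, 1) for _ in range(30)]        # independent of met_a

p_linked = permutation_p_value(met_a, met_b)
p_unlinked = permutation_p_value(met_a, met_c)
print("p-value, linked pair:  ", p_linked)
print("p-value, unlinked pair:", p_unlinked)
```

A small p-value for a metabolite pair suggests that a similarity as large as the observed one is unlikely to arise by chance, which is the kind of criterion that can decide whether an edge is kept in an inferred network.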

Summary

The article is mainly concerned with developing and evaluating metabolic network inference by reverse engineering. Reverse engineering is widely used in transcriptomics to infer genetic regulatory networks, but much less so for metabolic network inference. Simulating metabolic pathways is important because it can show the relationship between structure and function. The authors therefore tested network inference on three known metabolic pathways: the threonine synthesis pathway of E. coli (4 metabolites), the S. cerevisiae glycolysis pathway (13 metabolites), and the E. coli central metabolism pathway (18 metabolites). These three pathways were used for in silico data generation. The study also set out to show that steady-state data, the simplest kind of experimental data, are informative enough to reveal the connectivity of the underlying metabolic network.

Various methods exist for reverse engineering omics data. The authors chose statistical similarity measures as the network inference tool because they are widely employed and are best suited to the analysis of steady-state data. Before the similarity measures were applied, three sources of variability were compared to emulate conditions in the cell: enzymatic variability, intrinsic variability and environmental variability. Some of the indirect interactions in the inferred similarity networks were then identified and removed. Conditioning was used to identify the indirect interactions, and pruning was used to remove them from the inferred network. The significance of each similarity score was then determined for the in silico data to give the connectivity pattern of the inferred network, which could be compared with the actual metabolic network derived from the in silico model.

Lastly, the authors validated the results using interaction strength, since the strength of interactions in a cellular network is not the same for all edges. The results showed that the networks could be inferred successfully. Network inference on metabolomics data makes it possible to test nonlinear measures as well as measures that eliminate indirect interactions. Eliminating a high percentage of indirect links by conditioning and pruning made the inferred network more accurate. The authors also found that intrinsic variability provides more information for network inference than the other sources of variability.
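
The idea of conditioning out indirect interactions can be sketched in Python. This is a toy example under stated assumptions, not the authors' implementation: it uses first-order partial Pearson correlation as the conditioning step, a fixed cutoff of 0.3 in place of a significance test, and synthetic data for a three-metabolite chain A -> B -> C. The function names (pearson, partial_corr, infer_edges) are hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on a single variable z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def infer_edges(data, threshold=0.3):
    """Keep a pair only if its similarity survives conditioning.

    A pair is kept when its plain correlation and its partial correlation
    given every other single variable are all at least `threshold` in
    absolute value. The threshold is an arbitrary choice for this sketch.
    """
    names = sorted(data)
    edges = set()
    for a, b in itertools.combinations(names, 2):
        if abs(pearson(data[a], data[b])) < threshold:
            continue
        others = [n for n in names if n not in (a, b)]
        if all(abs(partial_corr(data[a], data[b], data[c])) >= threshold
               for c in others):
            edges.add((a, b))
    return edges


# Synthetic chain A -> B -> C: A and C are related only through B.
_rng = random.Random(42)
a_vals = [_rng.gauss(0, 1) for _ in range(500)]
b_vals = [v + _rng.gauss(0, 0.5) for v in a_vals]
c_vals = [v + _rng.gauss(0, 0.5) for v in b_vals]
data = {"A": a_vals, "B": b_vals, "C": c_vals}

edges = infer_edges(data)
print("plain correlation A-C:      ", pearson(a_vals, c_vals))
print("partial correlation A-C | B:", partial_corr(a_vals, c_vals, b_vals))
print("inferred edges:", sorted(edges))
```

In this toy system A and C are strongly correlated even though they do not interact directly; conditioning on B brings their partial correlation close to zero, so the indirect A-C link is dropped while the direct links A-B and B-C remain.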

Relevance to a Traditional Metabolism Course

Some of the material in the article connects to a traditional metabolism course. One example is enzyme variability, which affects the construction of the inferred network. In glycolysis, three enzymes (hexokinase, phosphofructokinase-1 and pyruvate kinase) are regulated depending on the cell state (starved state or fed state). To infer an accurate network, the regulation of enzymes has to be taken into account. The article also points out the importance of structure and function in network inference. This is closely related to a traditional metabolism course, since the structure of a protein determines properties such as its polarity and charge, which can influence the role of the protein. Moreover, network inference on metabolome data is a significant development, since reverse engineering of metabolome data is a new analytical approach even though it has been used in other fields such as transcriptomics.