Measuring correlations in metabolomic networks with mutual information

J. Numata, Oliver Ebenhoeh, E. W. Knapp

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

27 Citations (Scopus)

Abstract

Non-linear correlations based on mutual information are evaluated to measure statistical dependencies among metabolic data points in two-dimensional space. While the Pearson correlation coefficient is rigorously applicable only to strictly linear correlations with Gaussian noise, the mutual information coefficient is more generally valid. Here, we use recent distribution-free (non-parametric) mutual information estimators based on k-nearest neighbor distances. The mutual information algorithm of Kraskov et al. is found to yield estimates with low systematic and statistical error. The significance of the different methods is probed for artificial sets of tens to hundreds of data points, a size currently typical for metabolomic data. We analyze experimental data on metabolite concentrations from Arabidopsis thaliana by using these procedures. The mutual information detected additional non-linear correlations that the Pearson coefficient could not detect.
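The contrast the abstract draws between the Pearson coefficient and a k-nearest neighbor mutual information estimate can be illustrated with a minimal sketch. It assumes the first of the two estimators published by Kraskov et al. (the abstract does not say which variant or which k the chapter uses), a brute-force neighbor search, k = 3, and synthetic data; the function names and the test data are hypothetical and are not taken from the chapter.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ksg_mutual_information(x, y, k=3):
    """Mutual information estimate in nats (assumed variant: KSG estimator 1).

    I = psi(k) + psi(N) - mean_i[psi(n_x(i) + 1) + psi(n_y(i) + 1)],
    where n_x(i) counts the other points whose x-distance to point i is
    strictly smaller than the max-norm distance from i to its k-th nearest
    neighbor in the joint (x, y) space; n_y(i) is defined the same way.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    total = 0.0
    for i in range(n):
        # Marginal distances from point i to every other point.
        dx = [abs(x[i] - x[j]) for j in range(n) if j != i]
        dy = [abs(y[i] - y[j]) for j in range(n) if j != i]
        # Max-norm distance in the joint space; the k-th smallest is the radius.
        radius = sorted(max(a, b) for a, b in zip(dx, dy))[k - 1]
        n_x = sum(1 for d in dx if d < radius)
        n_y = sum(1 for d in dy if d < radius)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


# Synthetic illustration (hypothetical data): a noisy parabola sampled on a
# symmetric grid is strongly dependent on x yet almost linearly uncorrelated.
rng = random.Random(0)
n_points = 200
x = [-1.0 + 2.0 * i / (n_points - 1) for i in range(n_points)]
y_quadratic = [v * v + rng.gauss(0.0, 0.05) for v in x]
y_noise = [rng.gauss(0.0, 1.0) for _ in x]

print("quadratic   Pearson:", pearson(x, y_quadratic))
print("quadratic   MI (nats):", ksg_mutual_information(x, y_quadratic))
print("independent MI (nats):", ksg_mutual_information(x, y_noise))
```

The brute-force search costs O(N^2 log N), which is acceptable for the tens to hundreds of points the abstract mentions; larger sets would call for a spatial index. Because the estimator is built from differences of digamma terms, small negative values can occur for independent variables and should be read as zero within statistical error.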
Original language: English
Title of host publication: Genome Informatics 2008
Subtitle of host publication: Proceedings of the 8th Annual International Workshop on Bioinformatics and Systems Biology (IBSB 2008)
Editors: Ernst-Walter Knapp
Place of Publication: London, United Kingdom
Publisher: Imperial College Press
Pages: 112-122
Volume: 20
ISBN (Print): 978-1848162990
DOIs: https://doi.org/10.1142/9781848163003_0010
Publication status: Published - 8 Dec 2008
Event: 8th Annual International Workshop on Bioinformatics and Systems Biology (IBSB 2008) - Zeuten Lake, Berlin, Germany
Duration: 9 Jun 2008 to 11 Jun 2008

Publication series

Name: Genome Informatics Series
Publisher: Imperial College Press
Volume: 20
ISSN (Print): 0919-9454

Conference

Conference: 8th Annual International Workshop on Bioinformatics and Systems Biology (IBSB 2008)
Country: Germany
City: Zeuten Lake, Berlin
Period: 9/06/08 to 11/06/08

Keywords

  • statistical correlation
  • Pearson coefficient
  • non-linear correlation
  • mutual information
  • k-nearest neighbour
  • entropy
  • metabolomics
  • Arabidopsis thaliana

Cite this

Numata, J., Ebenhoeh, O., & Knapp, E. W. (2008). Measuring correlations in metabolomic networks with mutual information. In E-W. Knapp (Ed.), Genome Informatics 2008: Proceedings of the 8th Annual International Workshop on Bioinformatics and Systems Biology (IBSB 2008) (Vol. 20, pp. 112-122). (Genome Informatics Series; Vol. 20). London, United Kingdom: Imperial College Press. https://doi.org/10.1142/9781848163003_0010