Co-evolution of metabolism and protein sequences

M Schuette, N Klitgord, D Segre, Oliver Ebenhoeh

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

The set of chemicals producible and usable by metabolic pathways must have evolved in parallel with the enzymes that catalyze them. One implication of this common historical path should be a correspondence between the innovation steps that gradually added new metabolic reactions to the biosphere-level biochemical toolkit, and the gradual sequence changes that must have slowly shaped the corresponding enzyme structures. However, global signatures of a long-term co-evolution have not been identified. Here we search for such signatures by computing correlations between inter-reaction distances on a metabolic network, and sequence distances of the corresponding enzyme proteins. We perform our calculations using the set of all known metabolic reactions, available from the KEGG database. Reaction-reaction distance on the metabolic network is computed as the length of the shortest path on a projection of the metabolic network, in which nodes are reactions and edges indicate whether two reactions share a common metabolite, after removal of cofactors. Estimating the distance between enzyme sequences in a meaningful way requires some special care: for each enzyme commission (EC) number, we select from KEGG a consensus set of protein sequences using the cluster of orthologous groups of proteins (COG) database. We define the evolutionary distance between protein sequences as an asymmetric transition probability between two enzymes, derived from the corresponding pair-wise BLAST scores. By comparing the distances between sequences to the minimal distances on the metabolic reaction graph, we find a small but statistically significant correlation between the two measures. This suggests that the evolutionary walk in enzyme sequence space has locally mirrored, to some extent, the gradual expansion of metabolism.
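The two distance measures described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy reactions, the cofactor list, the score values, and all function names are hypothetical, and row-normalising pairwise scores is only one assumed way to obtain an asymmetric transition probability; the abstract does not give the exact formula.

```python
from collections import deque
from itertools import combinations


def reaction_graph(reactions, cofactors):
    """Project a reaction -> metabolites mapping onto a reaction-reaction graph.

    Two reactions are linked when they share at least one metabolite
    after the cofactors have been removed.
    """
    cleaned = {r: set(m) - set(cofactors) for r, m in reactions.items()}
    graph = {r: set() for r in cleaned}
    for a, b in combinations(cleaned, 2):
        if cleaned[a] & cleaned[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: edge count from source to each reachable reaction."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def transition_probabilities(scores):
    """Row-normalise pairwise scores so that each row sums to 1.

    ASSUMPTION: this normalisation is illustrative only; the source does
    not specify how the probabilities are derived from BLAST scores.
    """
    probs = {}
    for a, row in scores.items():
        total = sum(row.values())
        probs[a] = {b: s / total for b, s in row.items()} if total else {}
    return probs


# Hypothetical toy data, not taken from KEGG.
reactions = {
    "R1": {"A", "B", "ATP"},
    "R2": {"B", "C"},
    "R3": {"C", "D", "ATP"},
    "R4": {"E", "F"},
}
cofactors = {"ATP"}

graph = reaction_graph(reactions, cofactors)
dist = shortest_path_lengths(graph, "R1")

# Hypothetical symmetric pairwise scores between three enzymes.
scores = {
    "E1": {"E2": 30.0, "E3": 10.0},
    "E2": {"E1": 30.0, "E3": 90.0},
    "E3": {"E1": 10.0, "E2": 90.0},
}
probs = transition_probabilities(scores)
```

In this toy case R1 and R3 share only ATP, so removing cofactors leaves them two steps apart via R2, and R4 is unreachable from R1. The raw score between E1 and E2 is symmetric, but the normalised values differ by direction because each enzyme's row total differs, which is what makes the measure asymmetric.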

Original language: English
Pages (from-to): 156-166
Number of pages: 11
Journal: Genome Informatics
Volume: 22
Publication status: Published - Jan 2010

Fingerprint

Metabolic Networks and Pathways
Enzymes
Proteins
Protein Databases
Databases

Cite this

Schuette, M., Klitgord, N., Segre, D., & Ebenhoeh, O. (2010). Co-evolution of metabolism and protein sequences. Genome Informatics, 22, 156-166.

@article{60cad7a90e784e4087f8a468ffe78f15,
title = "Co-evolution of metabolism and protein sequences",
abstract = "The set of chemicals producible and usable by metabolic pathways must have evolved in parallel with the enzymes that catalyze them. One implication of this common historical path should be a correspondence between the innovation steps that gradually added new metabolic reactions to the biosphere-level biochemical toolkit, and the gradual sequence changes that must have slowly shaped the corresponding enzyme structures. However, global signatures of a long-term co-evolution have not been identified. Here we search for such signatures by computing correlations between inter-reaction distances on a metabolic network, and sequence distances of the corresponding enzyme proteins. We perform our calculations using the set of all known metabolic reactions, available from the KEGG database. Reaction-reaction distance on the metabolic network is computed as the length of the shortest path on a projection of the metabolic network, in which nodes are reactions and edges indicate whether two reactions share a common metabolite, after removal of cofactors. Estimating the distance between enzyme sequences in a meaningful way requires some special care: for each enzyme commission (EC) number, we select from KEGG a consensus set of protein sequences using the cluster of orthologous groups of proteins (COG) database. We define the evolutionary distance between protein sequences as an asymmetric transition probability between two enzymes, derived from the corresponding pair-wise BLAST scores. By comparing the distances between sequences to the minimal distances on the metabolic reaction graph, we find a small but statistically significant correlation between the two measures. This suggests that the evolutionary walk in enzyme sequence space has locally mirrored, to some extent, the gradual expansion of metabolism.",
author = "M Schuette and N Klitgord and D Segre and Oliver Ebenhoeh",
year = "2010",
month = jan,
language = "English",
volume = "22",
pages = "156--166",
journal = "Genome Informatics",
issn = "0919-9454",
publisher = "Universal Academy Press",
}

TY - JOUR
T1 - Co-evolution of metabolism and protein sequences
AU - Schuette, M
AU - Klitgord, N
AU - Segre, D
AU - Ebenhoeh, Oliver
PY - 2010/1
Y1 - 2010/1

N2 - The set of chemicals producible and usable by metabolic pathways must have evolved in parallel with the enzymes that catalyze them. One implication of this common historical path should be a correspondence between the innovation steps that gradually added new metabolic reactions to the biosphere-level biochemical toolkit, and the gradual sequence changes that must have slowly shaped the corresponding enzyme structures. However, global signatures of a long-term co-evolution have not been identified. Here we search for such signatures by computing correlations between inter-reaction distances on a metabolic network, and sequence distances of the corresponding enzyme proteins. We perform our calculations using the set of all known metabolic reactions, available from the KEGG database. Reaction-reaction distance on the metabolic network is computed as the length of the shortest path on a projection of the metabolic network, in which nodes are reactions and edges indicate whether two reactions share a common metabolite, after removal of cofactors. Estimating the distance between enzyme sequences in a meaningful way requires some special care: for each enzyme commission (EC) number, we select from KEGG a consensus set of protein sequences using the cluster of orthologous groups of proteins (COG) database. We define the evolutionary distance between protein sequences as an asymmetric transition probability between two enzymes, derived from the corresponding pair-wise BLAST scores. By comparing the distances between sequences to the minimal distances on the metabolic reaction graph, we find a small but statistically significant correlation between the two measures. This suggests that the evolutionary walk in enzyme sequence space has locally mirrored, to some extent, the gradual expansion of metabolism.

M3 - Article
VL - 22
SP - 156
EP - 166
JO - Genome Informatics
JF - Genome Informatics
SN - 0919-9454
ER -