Tidying up international nucleotide sequence databases: ecological, geographical and sequence quality annotation of ITS sequences of mycorrhizal fungi

Leho Tedersoo, Kessy Abarenkov, R Henrik Nilsson, Arthur Schuessler, Gwen-Aelle Grelet, Petr Kohout, Jane Oja, Gregory M. Bonito, Vilmar Veldre, Teele Jairus, Martin Ryberg, Karl-Henrik Larsson, Urmas Koeljalg

Research output: Contribution to journal, Article

Abstract

Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether, 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information on country of collection, geographical coordinates, interacting taxon and isolation source was supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.
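
To put the figures in the abstract on a common scale, the short Python sketch below converts the reported counts to percentages and the reported coverage percentages to approximate sequence counts. It is only an illustration and rests on two assumptions that the abstract does not state explicitly: that every percentage refers to the full set of 35,632 sequences, and that the chimeric and low-quality sets do not overlap. The function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
TOTAL = 35_632          # mycorrhizal (or mycorrhizal-root-derived) ITS sequences
CHIMERIC = 677          # sequences considered chimeric
LOW_QUALITY = 2_174     # sequences considered of low read quality

# Metadata coverage after annotation, in percent, as quoted in the abstract.
COVERAGE_PCT = {
    "country of collection": 78.0,
    "geographical coordinates": 33.0,
    "interacting taxon": 41.7,
    "isolation source": 96.4,
}


def share(count, total=TOTAL):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def approx_count(pct, total=TOTAL):
    """Return the approximate number of sequences that pct percent corresponds to.

    Assumption: the percentage is relative to `total`; the result is rounded
    and is therefore only an estimate, not a figure reported by the authors.
    """
    return round(total * pct / 100.0)


if __name__ == "__main__":
    print(f"chimeric:    {share(CHIMERIC):.2f}% of {TOTAL}")
    print(f"low quality: {share(LOW_QUALITY):.2f}% of {TOTAL}")
    # Assumption: the two problem sets are disjoint, so the counts can be added.
    print(f"flagged:     {share(CHIMERIC + LOW_QUALITY):.2f}% of {TOTAL}")
    for field, pct in COVERAGE_PCT.items():
        print(f"{field}: about {approx_count(pct)} sequences ({pct}%)")
```

Under these assumptions, roughly one sequence in twelve was flagged as chimeric or of low read quality, while isolation source is the most completely annotated metadata field and geographical coordinates the least.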

Original language: English
Article number: e24940
Number of pages: 7
Journal: PLoS ONE
Volume: 6
Issue number: 9
DOI: https://doi.org/10.1371/journal.pone.0024940
Publication status: Published - 15 Sep 2011

Cite this

Tedersoo, L, Abarenkov, K, Nilsson, RH, Schuessler, A, Grelet, G-A, Kohout, P, Oja, J, Bonito, GM, Veldre, V, Jairus, T, Ryberg, M, Larsson, K-H & Koeljalg, U 2011, 'Tidying up international nucleotide sequence databases: ecological, geographical and sequence quality annotation of ITS sequences of mycorrhizal fungi', PLoS ONE, vol. 6, no. 9, e24940. https://doi.org/10.1371/journal.pone.0024940
@article{4805388b28c64506bd5f415e17aee9f7,
title = "Tidying up international nucleotide sequence databases: ecological, geographical and sequence quality annotation of {ITS} sequences of mycorrhizal fungi",
author = "Leho Tedersoo and Kessy Abarenkov and Nilsson, {R Henrik} and Arthur Schuessler and Gwen-Aelle Grelet and Petr Kohout and Jane Oja and Bonito, {Gregory M.} and Vilmar Veldre and Teele Jairus and Martin Ryberg and Karl-Henrik Larsson and Urmas Koeljalg",
year = "2011",
month = "9",
day = "15",
doi = "10.1371/journal.pone.0024940",
language = "English",
volume = "6",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "9",
}
