De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads

Harish Nagarajan, Jessica E Butler, Anna Klimes, Yu Qiu, Karsten Zengler, Joy Ward, Nelson D Young, Barbara A Methé, Bernhard Ø Palsson, Derek R Lovley, Christian L Barrett

Research output: Contribution to journal, Article

Abstract

State-of-the-art DNA sequencing technologies are transforming the life sciences due to their ability to generate nucleotide sequence information with a speed and quantity that is unapproachable with traditional Sanger sequencing. Genome sequencing is a principal application of this technology, where the ultimate goal is the full and complete sequence of the organism of interest. Due to the nature of the raw data produced by these technologies, a full genomic sequence attained without the aid of Sanger sequencing has yet to be demonstrated.

We have successfully developed a four-phase strategy for using only next-generation sequencing technologies (Illumina and 454) to assemble a complete microbial genome de novo. We applied this approach to completely assemble the 3.7 Mb genome of a rare Geobacter variant (KN400) that is capable of unprecedented current production at an electrode. Two key components of our strategy enabled us to achieve this result. First, we integrated the two data types early in the process to maximally leverage their complementary characteristics. And second, we used the output of different short read assembly programs in such a way so as to leverage the complementary nature of their different underlying algorithms or of their different implementations of the same underlying algorithm.

The significance of our result is that it demonstrates a general approach for maximizing the efficiency and success of genome assembly projects as new sequencing technologies and new assembly algorithms are introduced. The general approach is a meta strategy, wherein sequencing data are integrated as early as possible and in particular ways and wherein multiple assembly algorithms are judiciously applied such that the deficiencies in one are complemented by another.

Original language: English
Article number: e10922
Number of pages: 9
Journal: PloS ONE
Volume: 5
Issue number: 6
DOIs: https://doi.org/10.1371/journal.pone.0010922
Publication status: Published - 8 Jun 2010

Fingerprint

Geobacter sulfurreducens
Geobacter
Electricity
Genes
Genome
Technology
genome assembly
Program assemblers
Microbial Genome
application technology
Electrodes
Biological Science Disciplines
DNA Sequence Analysis
sequence analysis
genomics
nucleotide sequences
Nucleotides

Keywords

  • algorithms
  • electricity
  • genome, bacterial
  • Geobacter
  • polymerase chain reaction

Cite this

De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads. / Nagarajan, Harish; Butler, Jessica E; Klimes, Anna; Qiu, Yu; Zengler, Karsten; Ward, Joy; Young, Nelson D; Methé, Barbara A; Palsson, Bernhard Ø; Lovley, Derek R; Barrett, Christian L.

In: PloS ONE, Vol. 5, No. 6, e10922, 08.06.2010.

Nagarajan, H, Butler, JE, Klimes, A, Qiu, Y, Zengler, K, Ward, J, Young, ND, Methé, BA, Palsson, BØ, Lovley, DR & Barrett, CL 2010, 'De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads', PloS ONE, vol. 5, no. 6, e10922. https://doi.org/10.1371/journal.pone.0010922
Nagarajan, Harish ; Butler, Jessica E ; Klimes, Anna ; Qiu, Yu ; Zengler, Karsten ; Ward, Joy ; Young, Nelson D ; Methé, Barbara A ; Palsson, Bernhard Ø ; Lovley, Derek R ; Barrett, Christian L. / De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads. In: PloS ONE. 2010 ; Vol. 5, No. 6.
@article{87ceedf9b9fa437d81eb228fe7dc4324,
title = "De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads",
abstract = "State-of-the-art DNA sequencing technologies are transforming the life sciences due to their ability to generate nucleotide sequence information with a speed and quantity that is unapproachable with traditional Sanger sequencing. Genome sequencing is a principal application of this technology, where the ultimate goal is the full and complete sequence of the organism of interest. Due to the nature of the raw data produced by these technologies, a full genomic sequence attained without the aid of Sanger sequencing has yet to be demonstrated. We have successfully developed a four-phase strategy for using only next-generation sequencing technologies (Illumina and 454) to assemble a complete microbial genome de novo. We applied this approach to completely assemble the 3.7 Mb genome of a rare Geobacter variant (KN400) that is capable of unprecedented current production at an electrode. Two key components of our strategy enabled us to achieve this result. First, we integrated the two data types early in the process to maximally leverage their complementary characteristics. And second, we used the output of different short read assembly programs in such a way so as to leverage the complementary nature of their different underlying algorithms or of their different implementations of the same underlying algorithm. The significance of our result is that it demonstrates a general approach for maximizing the efficiency and success of genome assembly projects as new sequencing technologies and new assembly algorithms are introduced. The general approach is a meta strategy, wherein sequencing data are integrated as early as possible and in particular ways and wherein multiple assembly algorithms are judiciously applied such that the deficiencies in one are complemented by another.",
keywords = "algorithms, electricity, genome, bacterial, Geobacter, polymerase chain reaction",
author = "Harish Nagarajan and Butler, {Jessica E} and Anna Klimes and Yu Qiu and Karsten Zengler and Joy Ward and Young, {Nelson D} and Meth{\'e}, {Barbara A} and Palsson, {Bernhard {\O}} and Lovley, {Derek R} and Barrett, {Christian L}",
year = "2010",
month = "6",
day = "8",
doi = "10.1371/journal.pone.0010922",
language = "English",
volume = "5",
journal = "PloS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "6",

}

TY - JOUR

T1 - De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads

AU - Nagarajan, Harish

AU - Butler, Jessica E

AU - Klimes, Anna

AU - Qiu, Yu

AU - Zengler, Karsten

AU - Ward, Joy

AU - Young, Nelson D

AU - Methé, Barbara A

AU - Palsson, Bernhard Ø

AU - Lovley, Derek R

AU - Barrett, Christian L

PY - 2010/6/8

Y1 - 2010/6/8

AB - State-of-the-art DNA sequencing technologies are transforming the life sciences due to their ability to generate nucleotide sequence information with a speed and quantity that is unapproachable with traditional Sanger sequencing. Genome sequencing is a principal application of this technology, where the ultimate goal is the full and complete sequence of the organism of interest. Due to the nature of the raw data produced by these technologies, a full genomic sequence attained without the aid of Sanger sequencing has yet to be demonstrated. We have successfully developed a four-phase strategy for using only next-generation sequencing technologies (Illumina and 454) to assemble a complete microbial genome de novo. We applied this approach to completely assemble the 3.7 Mb genome of a rare Geobacter variant (KN400) that is capable of unprecedented current production at an electrode. Two key components of our strategy enabled us to achieve this result. First, we integrated the two data types early in the process to maximally leverage their complementary characteristics. And second, we used the output of different short read assembly programs in such a way so as to leverage the complementary nature of their different underlying algorithms or of their different implementations of the same underlying algorithm. The significance of our result is that it demonstrates a general approach for maximizing the efficiency and success of genome assembly projects as new sequencing technologies and new assembly algorithms are introduced. The general approach is a meta strategy, wherein sequencing data are integrated as early as possible and in particular ways and wherein multiple assembly algorithms are judiciously applied such that the deficiencies in one are complemented by another.

KW - algorithms

KW - electricity

KW - genome, bacterial

KW - Geobacter

KW - polymerase chain reaction

U2 - 10.1371/journal.pone.0010922

DO - 10.1371/journal.pone.0010922

M3 - Article

VL - 5

JO - PloS ONE

JF - PloS ONE

SN - 1932-6203

IS - 6

M1 - e10922

ER -