HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes

Wenwei Xiong, Limei He, Jinsheng Lai, Hugo K. Dooner, Chunguang Du

Research output: Contribution to journal (Article)

45 Citations (Scopus)

Abstract

Transposons make up the bulk of eukaryotic genomes, but are difficult to annotate because they evolve rapidly. Most of the unannotated portion of sequenced genomes is probably made up of various divergent transposons that have yet to be categorized. Helitrons are unusual rolling circle eukaryotic transposons that often capture gene sequences, making them of considerable evolutionary importance. Unlike other DNA transposons, Helitrons do not end in inverted repeats or create target site duplications, so they are particularly challenging to identify. Here we present HelitronScanner, a two-layered local combinational variable (LCV) tool for generalized Helitron identification that represents a major improvement over previous identification programs based on DNA sequence or structure. HelitronScanner identified 64,654 Helitrons from a wide range of plant genomes in a highly automated way. We tested HelitronScanner's predictive ability in maize, a species with highly heterogeneous Helitron elements. LCV scores for the 5′ and 3′ termini of the predicted Helitrons provide a primary confidence level and element copy number provides a secondary one. Newly identified Helitrons were validated by PCR assays or by in silico comparative analysis of insertion site polymorphism among multiple accessions. Many new Helitrons were identified in model species, such as maize, rice, and Arabidopsis, and in a variety of organisms where Helitrons had not been reported previously to our knowledge, leading to a major upward reassessment of their abundance in plant genomes. HelitronScanner promises to be a valuable tool in future comparative and evolutionary studies of this major transposon superfamily.

Original language: English
Pages (from-to): 10263-10268
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 111
Issue number: 28
DOI: 10.1073/pnas.1410068111
State: Published - 15 Jul 2014

Fingerprint

Plant Genome
Zea mays
Genome
DNA Transposable Elements
Arabidopsis
Computer Simulation
Polymerase Chain Reaction
Genes
Oryza

Keywords

  • Algorithm
  • Bioinformatic analysis
  • Computational tool
  • Transposition

Cite this

@article{a59571289c164acdacd3cae2fc1dd8b6,
title = "HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes",
abstract = "Transposons make up the bulk of eukaryotic genomes, but are difficult to annotate because they evolve rapidly. Most of the unannotated portion of sequenced genomes is probably made up of various divergent transposons that have yet to be categorized. Helitrons are unusual rolling circle eukaryotic transposons that often capture gene sequences, making them of considerable evolutionary importance. Unlike other DNA transposons, Helitrons do not end in inverted repeats or create target site duplications, so they are particularly challenging to identify. Here we present HelitronScanner, a two-layered local combinational variable (LCV) tool for generalized Helitron identification that represents a major improvement over previous identification programs based on DNA sequence or structure. HelitronScanner identified 64,654 Helitrons from a wide range of plant genomes in a highly automated way. We tested HelitronScanner's predictive ability in maize, a species with highly heterogeneous Helitron elements. LCV scores for the 5' and 3' termini of the predicted Helitrons provide a primary confidence level and element copy number provides a secondary one. Newly identified Helitrons were validated by PCR assays or by in silico comparative analysis of insertion site polymorphism among multiple accessions. Many new Helitrons were identified in model species, such as maize, rice, and Arabidopsis, and in a variety of organisms where Helitrons had not been reported previously to our knowledge, leading to a major upward reassessment of their abundance in plant genomes. HelitronScanner promises to be a valuable tool in future comparative and evolutionary studies of this major transposon superfamily.",
keywords = "Algorithm, Bioinformatic analysis, Computational tool, Transposition",
author = "Xiong, Wenwei and He, Limei and Lai, Jinsheng and Dooner, {Hugo K.} and Du, Chunguang",
year = "2014",
month = "7",
day = "15",
doi = "10.1073/pnas.1410068111",
language = "English",
volume = "111",
pages = "10263--10268",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "28",
}

HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes. / Xiong, Wenwei; He, Limei; Lai, Jinsheng; Dooner, Hugo K.; Du, Chunguang.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 111, No. 28, 15.07.2014, p. 10263-10268.

TY - JOUR

T1 - HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes

AU - Xiong, Wenwei

AU - He, Limei

AU - Lai, Jinsheng

AU - Dooner, Hugo K.

AU - Du, Chunguang

PY - 2014/7/15

Y1 - 2014/7/15

N2 - Transposons make up the bulk of eukaryotic genomes, but are difficult to annotate because they evolve rapidly. Most of the unannotated portion of sequenced genomes is probably made up of various divergent transposons that have yet to be categorized. Helitrons are unusual rolling circle eukaryotic transposons that often capture gene sequences, making them of considerable evolutionary importance. Unlike other DNA transposons, Helitrons do not end in inverted repeats or create target site duplications, so they are particularly challenging to identify. Here we present HelitronScanner, a two-layered local combinational variable (LCV) tool for generalized Helitron identification that represents a major improvement over previous identification programs based on DNA sequence or structure. HelitronScanner identified 64,654 Helitrons from a wide range of plant genomes in a highly automated way. We tested HelitronScanner's predictive ability in maize, a species with highly heterogeneous Helitron elements. LCV scores for the 5′ and 3′ termini of the predicted Helitrons provide a primary confidence level and element copy number provides a secondary one. Newly identified Helitrons were validated by PCR assays or by in silico comparative analysis of insertion site polymorphism among multiple accessions. Many new Helitrons were identified in model species, such as maize, rice, and Arabidopsis, and in a variety of organisms where Helitrons had not been reported previously to our knowledge, leading to a major upward reassessment of their abundance in plant genomes. HelitronScanner promises to be a valuable tool in future comparative and evolutionary studies of this major transposon superfamily.

AB - Transposons make up the bulk of eukaryotic genomes, but are difficult to annotate because they evolve rapidly. Most of the unannotated portion of sequenced genomes is probably made up of various divergent transposons that have yet to be categorized. Helitrons are unusual rolling circle eukaryotic transposons that often capture gene sequences, making them of considerable evolutionary importance. Unlike other DNA transposons, Helitrons do not end in inverted repeats or create target site duplications, so they are particularly challenging to identify. Here we present HelitronScanner, a two-layered local combinational variable (LCV) tool for generalized Helitron identification that represents a major improvement over previous identification programs based on DNA sequence or structure. HelitronScanner identified 64,654 Helitrons from a wide range of plant genomes in a highly automated way. We tested HelitronScanner's predictive ability in maize, a species with highly heterogeneous Helitron elements. LCV scores for the 5′ and 3′ termini of the predicted Helitrons provide a primary confidence level and element copy number provides a secondary one. Newly identified Helitrons were validated by PCR assays or by in silico comparative analysis of insertion site polymorphism among multiple accessions. Many new Helitrons were identified in model species, such as maize, rice, and Arabidopsis, and in a variety of organisms where Helitrons had not been reported previously to our knowledge, leading to a major upward reassessment of their abundance in plant genomes. HelitronScanner promises to be a valuable tool in future comparative and evolutionary studies of this major transposon superfamily.

KW - Algorithm

KW - Bioinformatic analysis

KW - Computational tool

KW - Transposition

UR - http://www.scopus.com/inward/record.url?scp=84904310867&partnerID=8YFLogxK

U2 - 10.1073/pnas.1410068111

DO - 10.1073/pnas.1410068111

M3 - Article

C2 - 24982153

AN - SCOPUS:84904310867

VL - 111

SP - 10263

EP - 10268

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 28

ER -