Living fossil genome unveiled
An article published today in the open-access journal GigaScience presents the genome sequence of Ginkgo biloba, the oldest extant tree species. The research was carried out by a team of scientists at BGI, Zhejiang University and the Chinese Academy of Sciences, who tackled and analyzed an exceptionally large genome, totalling more than 10 billion DNA "letters". Ginkgo is considered a "living fossil", meaning its form and structure have changed very little in the 270 million years since it first came into existence. Given its longevity as a species and its unique position in the evolutionary tree of life, the ginkgo genome will provide an extensive resource for studies of plant defenses against insects and pathogens, and for research into early events in tree evolution and in evolution overall.
Sequencing the ginkgo genome has long been high on the wish list of plant biologists who want to study the tree's extraordinary biology at a genetic and molecular level. However, because of the genome's size and its enormous number of repeat sequences, assembling the whole sequence was a difficult task. The ginkgo genome stretches over more than 10 Gb, which is 80 times larger than the genome of the "model plant" Arabidopsis thaliana. It is also larger than the genomes of other plant species known for extremely big genomes, such as maize or orchids. The great interest in the history and biology of ginkgo, however, made sequencing and assembling the genome a challenge that the researchers from China felt was worth taking on, and one they met successfully.
Wenbin Chen from BGI explains some of the difficulties that they had to overcome: "A huge amount of raw data (~2 TB) was generated, and the computing capability for genome assembly was challenged by both the huge data and the remarkably high proportion of repetitive sequences. So an incredible amount of memory was required." He went on to highlight several genome features: "The large genome of ginkgo may have resulted from whole genome duplication and insertion of a remarkably high proportion of repetitive sequences, at least 76.58%, and the longest introns among all sequenced species due to insertions of transposable elements."
Meeting the sequencing challenge was worth it for a variety of reasons. One certainly relates to the ginkgo's status as a "living fossil", a title shared by few other species, including the horseshoe crab and the nautilus. As the only surviving representative of a highly unusual group of non-flowering plants that appeared at least 270 million years ago, the ginkgo has retained traits over millions of years, such as its emblematic fan-shaped leaves, that are not seen in any other surviving plant species. It also holds a unique position in the plant evolutionary tree.
Professor Yunpeng Zhao, one of the authors from Zhejiang University, explains how this evolutionary placement is of great interest to researchers: "Ginkgo represents one of the five living groups of seed plants, and has no living relatives. Such a genome fills a major phylogenetic gap of land plants, and provides key genetic resources to address evolutionary questions like phylogenetic relationships of gymnosperm lineages, evolution of genome and genes in land plants, innovation of developmental traits, evolution of sex as well as history of demography and distribution, resistance and conservation of ginkgo."
Researchers are also fascinated by the ginkgo's resilience under adverse conditions: ginkgo trees were among the few living things to survive the blast of the atomic bombing of Hiroshima. This hardiness likely helped the ginkgo survive periods of glaciation in China that killed many other species, and may also promote the longevity of individual trees, some of which are reported to live for up to several thousand years. The ginkgo is also able to defend itself against a wide range of attackers, employing an arsenal of chemical weapons against insects, bacteria and fungi.
To better understand the ginkgo's defensive systems, the authors analysed the repertoire of genes in the genome that are known to play a role in fending off attackers. An initial analysis of the tree's more than 40,000 predicted genes showed extensive expansion of gene families that provide a variety of defensive mechanisms. Genes that enable resistance against pathogens are often duplicated. Additionally, the ginkgo has a one-two punch in its fight against insects: it synthesizes chemicals that act directly against insects, and it releases volatile organic compounds that specifically attract enemies of plant-eating insects. These findings indicate that having multiple mechanisms (the expansion of gene families, higher doses of specific genes, and versatility in its defence genes) might be linked to the ginkgo's extraordinary resilience. This information may in turn aid the understanding of plant defence systems, with an eye to improving food security.
In keeping with the journal's goal of making the data underlying published research fully and freely available, all data from this project are available under a CC0 waiver in the GigaScience database, GigaDB, in a citable format (http://dx.doi.org/10.5524/100209), and, as is standard, the sequence data are available in the NCBI public repository under accession number PRJNA307642.
Rui Guan, Yunpeng Zhao, He Zhang, Guangyi Fan, Xin Liu, Wenbin Zhou, Chengcheng Shi, Jiahao Wang, Weiqing Liu, Xinming Liang, Yuanyuan Fu, Kailong Ma, Lijun Zhao, Fumin Zhang, Zuhong Lu, Simon Ming-Yuen Lee, Xun Xu, Jian Wang, Huanming Yang, Chengxin Fu, Song Ge, Wenbin Chen: Draft genome of the living fossil Ginkgo biloba. GigaScience 2016 http://dx.doi.org/10.1186/s13742-016-0154-1.
Notes to News Writers:
1. Publication in GigaScience includes storage of relevant associated data in the journal's affiliated database, GigaDB, where every dataset is provided with a digital object identifier (DOI). This makes it possible to cite and track data in standard scientific literature, which serves as a strong incentive for researchers to more rapidly release expensive and work-intensive datasets for community use. For this paper, the dataset citation is: Guan, R; Zhao, Y; Zhang, H; Fan, G; Liu, X; Zhou, W; Shi, C; Wang, J; Liu, W; Liang, X; Fu, Y; Ma, K; Zhao, L; Zhang, F; Lu, Z; Lee, SM; Xu, X; Wang, J; Yang, H; Fu, C; Ge, S; Chen, W: De novo sequencing of Ginkgo biloba. GigaScience Database 2016, http://dx.doi.org/10.5524/100209
2. GigaScience is co-published by BGI, the world's largest genomics organization, and BioMed Central, the world's largest open-access publisher. The journal covers research that uses or produces 'big data' from the full spectrum of the life sciences. It also serves as a forum for discussing the difficulties of and unique needs for handling large-scale data from all areas of the life sciences. The journal has a completely novel publication format, one that integrates manuscript publication with complete data hosting and the incorporation of analysis tools. To encourage transparent reporting of scientific research and to enable future access and analyses, manuscript submission to GigaScience requires that all supporting data and source code be made available in the GigaScience database, GigaDB, as well as in publicly available repositories. GigaScience will provide users with access to associated online tools and workflows, and has integrated a data analysis platform, maximizing the potential utility and re-use of data. Follow GigaScience on Twitter @GigaScience and on Facebook https://www.facebook.com/GigaScience/, and keep up to date with our blog http://blogs.biomedcentral.com/gigablog/.
GigaScience, BGI Hong Kong
Tel: +852 3610 3531
Mob: +852 92490853