Extensive variation revealed in 1,001 genomes and epigenomes of Arabidopsis
An international team of scientists has sequenced the whole genomes and epigenomes of more than 1,000 Arabidopsis thaliana plants, sampled from geographically diverse locations. The collection of 1,001 genomes and 1,001 epigenomes not only illuminates new aspects of the species' evolutionary history, but also provides a comprehensive, species-wide picture of the interaction between genetic and epigenetic variation in this important model plant.
When next-generation sequencing appeared around 2007, it became possible to sequence genomes relatively rapidly and cheaply. Human geneticists quickly developed a project to sequence 1,000 genomes to catalog human genetic variation. Not to be outdone, plant biologists decided that "if our colleagues will have one thousand genomes, then we have to have at least one thousand and one genomes," joked Detlef Weigel, Director of the Max Planck Institute for Developmental Biology in Germany, who co-led the 1,001 genomes project. But sequencing the genome provided only part of the story: the researchers went further and sequenced the transcriptome and methylome of these plants, too.
Many questions about plant evolution and adaptation can be addressed with the new data. "It's an enormous hypothesis generator in terms of trying to understand what happens in the natural world," explained Howard Hughes Medical Institute (HHMI) and Gordon and Betty Moore Foundation (GBMF) Investigator Joe Ecker, a plant biologist at the Salk Institute who directed the 1,001 epigenomes project. "In the past, we've made mutations in almost all genes in the laboratory. But here you're looking at both subtle and not so subtle variants, both genetic and epigenetic, that are captured from the wild." The dataset therefore offers an opportunity for scientists interested in, for example, how wild plants adapt to climate change.
"[Researchers] will have tools to look at what kind of natural variation exists in a gene of interest," said Magnus Nordborg, Director of the Gregor Mendel Institute in Austria, who co-led the 1,001 genomes project. "The ultimate goal is to move away from the reference genome, and get a complete picture of all the genetic variation–and everything that it is associated with."
The two new studies, which will be published together in the July 14, 2016, issue of Cell, show that approximately 25 percent of genes in the Arabidopsis genome exhibit diversity in their methylation state. Methylation, the addition of methyl groups to a strand of DNA, is related to silencing of transposable elements, the "jumping genes" in the genome. "Methylation can also modify gene expression, for example by blocking a transcription factor from landing on a gene promoter and activating it," said Carol Huang, a computational biologist at the Salk Institute who co-led the epigenomes study.
The researchers also found that the genome and epigenome closely interact with each other. "There are genes that control the epigenome in these various plants," Ecker explained, "and variants of those genes potentially alter the epigenome in a way that helps the plant survive better in a particular environment." Taiji Kawakatsu, a plant biologist at the Salk Institute, now working at the National Institute of Agrobiological Sciences in Japan, who co-led the work, added, "Those genes may also play roles in generating cell-type specific epigenome patterns and inter-species epigenome diversity."
Although the researchers had looked for such associations between genome and epigenome before, the studies were limited by smaller sample sizes. "With 1,000 accessions, we are able to measure how much methylation variation can be explained by genetic variants," said Eriko Sasaki, a population geneticist at the Gregor Mendel Institute in Austria, who co-led the epigenomes analysis. "It gives us a much better sense, quantitatively, of the interplay between genetic and methylation variation."
Another striking feature they discovered was that genes involved in immunity show more genetic and epigenetic variation than other classes of genes. The immunity genes not only carry small mutations but also show "variable epiallelic states and are associated with large scale structural rearrangements and transposable elements," said Florian Jupe, a plant biologist at the Salk Institute who co-led the epigenomes project.
The researchers also noticed correlations of genetic and epigenetic variants with climate and geographic location. They are working to identify which genes and epigenetic markers allow a specific variety to thrive in a particular environment. Plants are probably one of the best organisms in which to study adaptation: when plants are placed in a new environment, they have to adapt quickly, because they have nowhere to go, explained HHMI Investigator Joanne Chory, a plant biologist at the Salk Institute who was not involved in the study. "I think plants are the only place where you can really map genotype to phenotype and fitness. It's very hard to do in humans … if they can figure this out in the plant world, that will be a huge contribution to understanding multicellular life in general," Chory said.
The new dataset will also be an inspiration to plant breeders, who typically focus on genetic markers to select for genes of interest. "Breeders could potentially use epigenetic information just like they use genetic information to select for traits; the power of such an approach can now be tested," said Ecker. "Beyond individual genes being useful, the idea that there are epigenetic variants that could be selected for is something that they should pay attention to," he said.