BOSTON — It's not unusual for siblings to seem more dissimilar than similar: one becoming a florist, for example, another becoming a flutist, and another becoming a physicist.
Something of the same diversity applies to the "brood" of proteins produced from any single gene in human cells, a new study led by scientists at Dana-Farber Cancer Institute, University of California, San Diego School of Medicine, and McGill University has found. In the first large-scale systematic study of its kind, the researchers found that sibling proteins (known as "protein isoforms" when encoded by the same gene) often play radically different roles within tissues and cells, however alike they may be structurally.
The research, published online today by the journal Cell, stands to have a powerful effect on the understanding of human biology and the direction of future research. For one thing, it may help explain how the mere 20,000 protein-coding genes in the human genome, fewer than are found in the genome of a grape, can give rise to creatures of such enormous complexity. Scientists know that the number of different proteins in human cells, thought to be upwards of 100,000, far exceeds the number of genes, but many questions have remained. Do most of those proteins have a unique function in the cell, or do their roles sometimes overlap? The discovery that different protein isoforms encoded by the same gene may have divergent functions on a larger scale than previously realized suggests that they vastly multiply what our genes are capable of.
This diversity also suggests that each protein isoform needs to be studied individually to understand its normal role and its potential involvement in disease, the study authors state.
"Research into cancer-related proteins, for example, often focuses on the most prevalent isoforms in a given cell, tissue, or organ," said co-senior author David E. Hill, PhD, associate director of the Center for Cancer Systems Biology (CCSB) at Dana-Farber. "Since less-prevalent protein isoforms may also contribute to disease, and may prove to be valuable targets for drug therapy, their role should be examined as well; and to do that properly, we also need comprehensive clone collections covering all expressed isoforms."
Previous functional studies of protein isoforms have generally been done on a gene-by-gene basis. Furthermore, researchers frequently compared the activity of a gene's "minor" isoforms to that of its predominant isoform in a particular tissue. The new study approached the functional question from a larger perspective: it gathered multiple protein isoforms from hundreds of genes and compared how each one interacts with other human proteins.
One of the ways that cells produce multiple protein isoforms from individual genes is a process called alternative splicing. Most human genes contain multiple segments called exons, separated by intervening non-coding sequences called introns. In the cell, different combinations of these individual exons are "glued" or spliced together to generate a final expressed gene product; thus, a single gene can encode a set of distinct but related protein isoforms, depending on the specific exons that are spliced. One isoform, for example, may result from splicing exons A-B-C-D of a particular gene. Another may arise from the skipping of exon C, resulting in a product with only exons A-B-D.
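The exon-skipping example above can be sketched in a few lines of Python. This is only an illustration: the exon sequences and the `splice` helper are hypothetical, invented for this sketch, and do not come from the study.

```python
# Hypothetical exon sequences for an illustrative gene (not real data).
EXONS = {"A": "ATGGCC", "B": "GGTCTG", "C": "AAACCC", "D": "TTTTAA"}


def splice(exon_order, exons=EXONS):
    """Join the named exons, in the given order, into one spliced product."""
    return "".join(exons[name] for name in exon_order)


# One isoform keeps all four exons; the other skips exon C.
full_length = splice("ABCD")
exon_c_skipped = splice("ABD")

print(full_length)
print(exon_c_skipped)
```

The two products share the segments contributed by exons A, B, and D and differ only by the exon C segment, which is the sense in which isoforms of one gene are distinct but related.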
For the new study, researchers devised a technique called "ORF-Seq" that allowed them to identify and clone large numbers of alternatively spliced gene products in the form of open reading frames (ORFs), and use them to produce multiple protein isoforms for hundreds of genes.
Of the roughly 20,000 genes in the human genome that code for proteins, researchers concentrated on about eight percent. Using ORF-Seq, they ultimately created a collection of 1,423 protein isoforms for 506 genes, of which more than 50 percent were entirely novel gene products. They put 1,035 of these protein isoforms through a mass screening test that paired them with 15,000 human proteins to see which would interact.
"The exciting discovery was that isoforms coming from the same gene often interacted with different protein partners," remarked Gloria Sheynkman, PhD, of Dana-Farber and one of the lead authors. "This suggests that the isoforms play very different roles within the cell" – much as siblings with different careers often interact with different sets of friends and co-workers.
The researchers found that in most cases, related isoforms shared less than half of their protein partners. Sixteen percent of related isoforms shared no protein partners at all. "From the perspective of all the protein interactions within a cell, related isoforms behave more like distinct proteins than minor variants of one another," said Tong Hao of Dana-Farber, one of the lead authors.
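One way to picture what "sharing less than half of their partners" means is to compare the partner sets of two isoforms directly. The sketch below is an assumption-laden illustration: it measures overlap as shared partners divided by all partners of either isoform (the Jaccard index), which may not be the exact measure used in the study, and the isoform and partner names are hypothetical.

```python
def shared_partner_fraction(partners_a, partners_b):
    """Fraction of all partners (of either isoform) that both isoforms share."""
    a, b = set(partners_a), set(partners_b)
    union = a | b
    if not union:
        return 0.0  # neither isoform has any partners
    return len(a & b) / len(union)


# Hypothetical partner sets for three isoforms of one gene.
isoform_1 = {"P1", "P2", "P3", "P4"}
isoform_2 = {"P3", "P4", "P5", "P6", "P7"}
isoform_3 = {"P8", "P9"}

partial_overlap = shared_partner_fraction(isoform_1, isoform_2)
no_overlap = shared_partner_fraction(isoform_1, isoform_3)

print(partial_overlap)
print(no_overlap)
```

In this toy data, the first pair shares two of seven partners, well under half, and the second pair shares none, mirroring the two situations described above.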
Intriguingly, isoforms that stem from a minuscule difference in DNA, a difference of just one letter of the genetic code, sometimes had starkly different roles within the cell, researchers found. At the same time, related isoforms that are structurally quite different may have very similar roles.
Quite often, the interaction partners of related isoforms vary from tissue to tissue, the researchers found. In the liver, for example, an isoform may interact with one set of proteins. In the brain, a relative of that isoform may interact with a largely different set of protein partners.
"A more detailed view at protein interaction networks, as presented in our paper, is especially important in relation to human diseases," said co-senior author Lilia Iakoucheva of UC San Diego. "Drastic differences in interaction partners among splicing isoforms strongly suggest that identification of the disease-relevant pathways at the gene level is not sufficient. This is because different variants could participate in different pathways leading to the same disease or even to different diseases. It's time to take a deeper dive into the networks that we are building and analyzing."
Author information and funding sources
The co-lead authors of the study are Xinping Yang, PhD, of Dana-Farber and Nanfang Hospital, Southern Medical University, Guangzhou, China; Gloria Sheynkman, PhD, and Tong Hao of Dana-Farber; Jasmin Coulombe-Huntington, PhD, of McGill University; and Shuli Kang, PhD, of UCSD. The co-senior authors are David E. Hill, PhD, and Marc Vidal, PhD, of Dana-Farber; Frederick P. Roth, PhD, of the University of Toronto, Mt. Sinai Hospital in Toronto, and the Canadian Institute for Advanced Research; Lilia Iakoucheva, PhD, of UCSD; and Yu Xia, PhD, of McGill University. The co-authors are Aaron Richardson, PhD, Yun Shen, Ryan Murray, Kerstin Spirohn, Bridget Begg, Andrew MacWilliams, Quan Zhong, PhD, Shelly Trigg, Stanley Tam, Lila Ghamsari, PhD, Nidhi Sahni, PhD, Song Yi, PhD, Maria Rodriguez, Dawit Balcha, Kourosh Salehi-Ashtiani, PhD, Benoit Charloteaux, PhD, Alyce Chen, PhD, MPH, and Michael Calderwood, PhD, of Dana-Farber; Samuel Pevzner, PhD, of Dana-Farber and Boston University; Guihong Tan, PhD, Michael Costanzo, PhD, Brenda Andrews, PhD, and Charles Boone, PhD, of the University of Toronto; Fan Yang of the University of Toronto and Mt. Sinai Hospital, Toronto; Song Sun, PhD, of the University of Toronto, Mt. Sinai Hospital, Toronto, and Uppsala University, Uppsala, Sweden; Miquel Duran-Frigola, PhD, of the Barcelona Institute of Science and Technology; Patrick Aloy, PhD, of the Barcelona Institute of Science and Technology and Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain; and Xianghong Zhou, PhD, of the University of Southern California at Los Angeles.
The work was supported by the National Human Genome Research Institute (grants P50HG004233 and U01HG001715); the Ellison Foundation; the National Cancer Institute (grant R33CA132073); the Krembil Foundation; a Canada Excellence Research Chair Award; an Ontario Research Fund-Research Excellence Award; the Eunice Kennedy Shriver National Institute of Child Health and Human Development (grant R01HD065288); the National Institute of Mental Health (grants R01MH091350, R01MH105524, and R21MH104766); the National Science Foundation (grant CCF-1219007); the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant RGPIN-2014-03892); the Canada Foundation for Innovation (grant JELF-33732); the Canada Research Chairs Program; the National Institutes of Health (training grant T32CA009361); an NSERC fellowship; the National Institute of General Medical Sciences (grant R01GM105431); and a Swedish Research Council International Postdoc Grant.
About Dana-Farber Cancer Institute
From achieving the first remissions in childhood cancer with chemotherapy in 1948, to developing the very latest new therapies, Dana-Farber Cancer Institute is one of the world's leading centers of cancer research and treatment. It is the only center ranked in the top 4 of U.S. News and World Report's Best Hospitals for both adult and pediatric cancer care.
Dana-Farber sits at the center of a wide range of collaborative efforts to reduce the burden of cancer through scientific inquiry, clinical care, education, community engagement, and advocacy. Dana-Farber/Brigham and Women's Cancer Center provides the latest in cancer care for adults; Dana-Farber/Boston Children's Cancer and Blood Disorders Center for children. The Dana-Farber/Harvard Cancer Center unites the cancer research efforts of five Harvard academic medical centers and two graduate schools, while Dana-Farber Community Cancer Care provides high quality cancer treatment in communities outside Boston's Longwood Medical Area.
Dana-Farber is dedicated to a unique, 50/50 balance between cancer research and care, and much of the Institute's work is dedicated to translating the results of its discovery into new treatments for patients locally and around the world.