Researchers predict risk for common deadly diseases from millions of genetic variants


A research team at the Broad Institute of MIT and Harvard, Massachusetts General Hospital (MGH), and Harvard Medical School reports a new kind of genome analysis that could identify large fractions of the population who have a much higher risk of developing serious common diseases, including coronary artery disease, breast cancer, or type 2 diabetes.

These tests, which use information from millions of places in the genome to ascertain risk for five diseases, can flag greater likelihood of developing the potentially fatal conditions well before any symptoms appear. While the study was conducted with data from the UK, it suggests that up to 25 million people in the US may be at more than triple the normal risk for coronary artery disease, and millions more may be at similarly elevated risk for the other conditions, based on genetic variation alone. The genomic information could allow physicians to focus particular attention on these individuals, perhaps enabling early interventions to prevent disease.

The research raises important questions about how this method, called polygenic risk scoring, should be further developed and used in the medical system. In addition, the authors note that the genetic tests are largely based on information from individuals of European descent, and the results underscore the need for larger studies of other ethnic groups to ensure equity. The study appears in Nature Genetics.

"We've known for a long time that there are people out there at high risk for disease based just on their overall genetic variation," said senior author Sekar Kathiresan, an institute member and director of the Cardiovascular Disease Initiative at the Broad Institute, as well as director of the Center for Genomic Medicine at MGH and a professor of medicine at Harvard Medical School. "Now, we're able to measure that risk using genomic data in a meaningful way. From a public health perspective, we need to identify these higher-risk segments of the population so we can provide appropriate care."

Kathiresan led the work with first authors Amit V. Khera, a cardiologist at MGH and junior faculty member in Kathiresan's lab, and Mark Chaffin, a computational biologist also in Kathiresan's lab.

To develop the algorithms for scoring disease risk, the researchers first gathered data from large-scale genome-wide association studies to identify genetic variants associated with coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, or breast cancer. For each disease, they applied a computational algorithm to combine information from all of the variants — most of which individually have an extremely small impact on risk — into a single number, or polygenic risk score. This number could be used to predict a person's chances of getting these diseases based on his or her genome.
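
As a rough illustration of how many small effects can be combined into one number, the sketch below computes a score as a weighted sum of risk-allele counts. This is the textbook form of a polygenic score, not necessarily the algorithm the researchers used; the function name, the three weights, and the allele counts are all hypothetical toy values chosen for the example.

```python
def polygenic_risk_score(allele_counts, weights):
    """Combine per-variant effects into a single score.

    allele_counts: number of risk alleles a person carries at each
                   variant (0, 1, or 2).
    weights:       per-variant effect sizes, e.g. taken from a
                   genome-wide association study (hypothetical here).
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    # Each variant contributes (effect size) x (number of risk alleles).
    return sum(w * c for w, c in zip(weights, allele_counts))


# Toy example with three made-up variants; a real score would use millions.
toy_weights = [0.02, -0.01, 0.005]
toy_counts = [2, 1, 0]
toy_score = polygenic_risk_score(toy_counts, toy_weights)
print(toy_score)
```

On its own, each term in the sum is tiny; the score becomes informative only when many such terms are added together, which is the point the paragraph above makes.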

The team tested and validated the polygenic risk score algorithms on data from over 400,000 individuals in the UK Biobank, an extensive database of genomic data and medical information from participants of British ancestry.

Importantly, according to Khera, the people with high polygenic risk scores for coronary artery disease did not necessarily exhibit other warning signs of disease risk (such as hypertension or high cholesterol).

"These individuals, who are at several times the normal risk for having a heart attack just because of the additive effects of many variations, are mostly flying under the radar," he explained. "If they came into my clinical practice, I wouldn't be able to pick them out as high risk with our standard metrics. There's a real need to identify these cases so we can target screening and treatments more effectively, and this approach gives us a potential way forward."

Here's how the score worked for coronary artery disease: The algorithm pored over more than 6.6 million locations in the genome to estimate a person's risk of developing the deadly disease, which is the most common type of heart disease and a leading cause of death for adults in the United States. Of the individuals in the UK Biobank dataset, 8 percent were more than three times as likely to develop the disease compared to everyone else, based on their genetic variation. In absolute terms, only 0.8 percent of individuals with the very lowest polygenic risk scores had coronary artery disease, as compared to 11 percent for the people with the top scores.
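
The comparison described above, between people with the highest scores and everyone else, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis: the function name, the ten-person cohort, and the 20 percent cutoff are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def prevalence_by_score_group(scores, has_disease, top_fraction):
    """Return disease prevalence in the top-scoring group and in the rest.

    scores:       one polygenic score per person.
    has_disease:  1 if the person has the disease, else 0.
    top_fraction: share of the cohort to treat as "high score" (0 to 1).
    """
    n = len(scores)
    if n < 2 or n != len(has_disease):
        raise ValueError("need at least two people and matching list lengths")
    if not 0 < top_fraction < 1:
        raise ValueError("top_fraction must be between 0 and 1")
    # Rank people from highest score to lowest.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Keep both groups non-empty.
    n_top = min(n - 1, max(1, round(n * top_fraction)))
    top, rest = order[:n_top], order[n_top:]
    prev_top = sum(has_disease[i] for i in top) / len(top)
    prev_rest = sum(has_disease[i] for i in rest) / len(rest)
    return prev_top, prev_rest


# Hypothetical cohort of ten people.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_disease = [0, 0, 0, 0, 1, 0, 0, 0, 1, 1]
prev_top, prev_rest = prevalence_by_score_group(toy_scores, toy_disease, 0.2)
print(prev_top, prev_rest, prev_top / prev_rest)
```

Applying the same division to the figures reported above, 11 percent against 0.8 percent, gives a roughly 14-fold difference in prevalence between the top-scoring and lowest-scoring groups.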

For breast cancer, a leading cause of malignancy-related death in women, the polygenic predictor found that 1.5 percent of the UK Biobank population had more than triple the risk for having the disease when compared to everyone else. Those with the very highest polygenic risk scores had five times the risk — meaning, in absolute terms, that 19 percent of people with the top scores had breast cancer, versus about 4 percent of the remaining individuals. The researchers applied a similar approach to polygenic risk scoring for type 2 diabetes, atrial fibrillation, and inflammatory bowel disease.

To develop polygenic risk scoring tests for other common diseases, the team notes that additional research will be necessary to collect genome-wide association data and validate the scores with reference biobanks. In addition, the current polygenic risk calculations are largely derived from genetic studies done in people of European ancestry — so more studies are needed to optimize the algorithms for other ethnic groups.

Nevertheless, the researchers propose that it is time for the biomedical community to consider including this approach in clinical care. To do this, a number of factors need to be considered, such as whether the disease has a genetic component; whether the disease is prevalent enough in the general population to make screening worth incorporating into routine clinical care; and whether knowing the genetic risk for a disease would be useful in guiding care to offset this inherited risk.

"Ultimately, this is a new type of genetic risk factor," said Kathiresan. "We envision polygenic risk scores as a way to identify people at high or low risk for a disease, perhaps as early as birth, and then use that information to target interventions, either lifestyle modifications or treatments, to prevent disease. For heart attack, I foresee that each patient will have the opportunity to know his or her polygenic risk number in the near future, similar to the way they can know their cholesterol number right now."


Funding for this study was provided in part by the National Institutes of Health, the National Lipid Association, and the Ofer and Shelly Nemirovsky Research Scholar Award from Massachusetts General Hospital.

Khera and Kathiresan are listed as co-inventors on a patent application for the use of genetic risk scores to determine risk and guide therapy.
