Data scientists invent new tools to analyze the spread, evolution of novel coronavirus

Scripps Research is collaborating with top computational scientists at Johns Hopkins University and UCLA.

LA JOLLA, CA — For as much as the scientific community has learned about the novel coronavirus, SARS-CoV-2, since it emerged in China last year, many key aspects of the pandemic remain a mystery. And for that reason, COVID-19 has been an especially tricky disease to contain.

For example, how did the virus travel from country to country, or region to region? Do weather patterns affect its ability to spread? What demographic or socioeconomic factors put certain populations at higher risk?  

“The problem is that existing tools for analyzing infectious diseases can’t see how all of these factors are interconnected,” says Kristian Andersen, PhD, professor in the Department of Immunology and Microbiology at Scripps Research. “Even the most advanced tools either aren’t capable of dealing with the amount of data we have today or aren’t appropriate for the types of questions we’re trying to answer.”

That’s why Andersen and his collaborators (Lauren Gardner, PhD, of Johns Hopkins University and Marc Suchard, MD, PhD, of the University of California, Los Angeles) are now working to develop better statistical models and visualization software.

The project has won a $1.3 million grant from the National Institutes of Health, with operations based out of the Scripps Research-led Center for Viral Systems Biology. The funding supplements an initial $15 million NIH grant that enabled Andersen to launch the center in 2018, with the goal of helping eradicate infectious diseases such as Ebola and Lassa.

Seeing everything at once

The team has already started its effort to build tools that can show how SARS-CoV-2 is moving around the world and what factors may be driving its spread and evolution. “The idea is to be able to analyze everything at the same time,” Andersen says.

“Everything” encompasses diverse factors such as airline traffic patterns, socioeconomic and demographic data, and weather conditions. It also includes viral genomes sequenced from COVID-19 patients. Every day, hundreds of new genomes are shared openly on research databases; Andersen and others use that data to look for mutations, or slight changes in the genetic sequence, that show how the virus moved from person to person.

Once the new tools are developed, the genomic data and the other information will build on the Johns Hopkins COVID-19 Dashboard data and Scripps Research’s Outbreak.info website, both of which are available to the public. The Johns Hopkins dashboard, developed by Lauren Gardner, has become the world’s most accessed resource for real-time COVID-19 information.

Data layers tell the story

For this project, Gardner draws from her expertise in epidemiological risk and mathematical modeling to integrate new layers of information, such as climate, land use and mobility.

“Our goal is to weave together rich data layers that we will continuously analyze, creating real-time updates on the rapidly evolving pandemic,” says Gardner, associate professor in the Department of Civil and Systems Engineering at Johns Hopkins. “From a public health perspective, it’s essential to see how the virus is really spreading and how mitigation efforts are working.”

Another key collaborator is statistician Marc Suchard, a professor in UCLA’s Departments of Biomathematics and Human Genetics. He is the senior developer of an open-source software program that’s used by more than 1,000 research groups worldwide to understand, on a genomic level, how infectious diseases spread. 

“Through the creation of new, scalable statistical models, we’ll be able to more clearly identify the factors that affect viral transmission and virulence for SARS-CoV-2,” Suchard says. “Not only will this allow us to understand whether certain public health measures are working, but it also will help predict how the disease could spread under different circumstances.”

At its highest level, the project seeks to make complex information easier to understand, revealing patterns that would otherwise go unnoticed. By fostering a greater understanding of the virus among researchers and the public, the team hopes that governments around the world can improve their response to the COVID-19 pandemic and minimize future outbreaks.

Media Contact
Kelly Quigley
https://www.scripps.edu/news-and-events/press-room/2020/20200611-COVID19-datascience.html
