Gathering big data to accelerate the COVID-19 fight
Scientists creating secure, central database of electronic health records from coronavirus patients
A nationwide collaboration of clinicians, informaticians and other biomedical researchers aims to turn data from hundreds of thousands of medical records from coronavirus patients into effective treatments and predictive analytical tools that could help lessen or end the global pandemic.
Through the National COVID Cohort Collaborative, about 60 clinical institutions affiliated with the National Institutes of Health-supported Clinical and Translational Science Awards Program are invited to partner with U.S. Department of Health & Human Services agencies and clinical organizations. Together, Collaborative members will support the analysis of electronic health records on a new, secure database.
The National COVID Cohort Collaborative is supported as part of a $25 million NIH award to the National Center for Data to Health, which is coordinating the collaborative’s efforts and is based at Oregon Health & Science University’s Oregon Clinical and Translational Research Institute. NIH’s National Center for Advancing Translational Sciences, also known as NCATS, is providing overall stewardship of the Collaborative.
“There is no centralized health care data in the United States,” explained Melissa Haendel, Ph.D., the Collaborative’s lead investigator, National Center for Data to Health director, an associate professor of medical informatics and clinical epidemiology in the OHSU School of Medicine, and translational data science director at Oregon State University.
“The coronavirus pandemic has spurred us to build, for the first time, a process for collecting and harmonizing electronic health records from many different institutions, storing it in one secure location, and making it available in a collaborative platform for use by diverse experts,” she added.
The secure, cloud-based database is certified through the Federal Risk and Authorization Management Program, or FedRAMP, which provides standardized assessment, authorization and continuous monitoring for cloud products and services. The National Center for Advancing Translational Sciences is providing the database, which contains records from patients who have undergone coronavirus testing or are suspected to be infected.
Individuals granted access to the database will be able to run algorithms on this first-of-its-kind patient data set without seeing actual patient records. A safe derivative of the patient data called synthetic data also will be available.
The database will enable new machine learning and rigorous modern statistical analyses to support key goals such as predicting patient responses to antiviral or anti-inflammatory therapies, identifying potential new drugs and treatments, and finding other indicators such as biomarkers that can inform clinical decision making.
“This effort demonstrates how the existing resources of the National Center for Advancing Translational Sciences and Clinical and Translational Science Awards Program hubs can be leveraged to quickly address public health emergencies,” said Michael G. Kurilla, M.D., Ph.D., Division of Clinical Innovation director at NCATS. “The National COVID Cohort Collaborative represents a shared vision to make data more meaningful, open and accessible to the research community to study COVID-19 and help identify urgently needed treatments.”
The first sampling of electronic health records was transferred to the database May 12, and more will be uploaded as additional partners join the effort. Fifteen institutions have agreed to contribute data thus far: Oregon Health & Science University, Johns Hopkins University, University of North Carolina at Chapel Hill, Rockefeller University, Washington University, University of Kentucky, Medical University of South Carolina, Stony Brook University, University of Alabama at Birmingham, Tufts University, University of Wisconsin-Madison, University of Massachusetts, Wake Forest University Health Sciences, Maine Medical Center Research Institute and Penn State.
While the Collaborative’s database is not intended to be a repository of all coronavirus patient records, organizers want to make its data fully reflective of America’s diverse residents and have diverse clinicians and health care researchers from across the U.S. analyze the data. More partners are needed to make this happen. Those interested in contributing data or participating in this effort should send an email to [email protected]
- National COVID Cohort Collaborative (N3C): https://covid.cd2h.org/N3C, https://ncats.nih.gov/n3c
- Clinical and Translational Science Awards Program (CTSA): https://ncats.nih.gov/ctsa
- National Center for Data to Health (CD2H): https://cd2h.org
- National Center for Advancing Translational Sciences (NCATS): https://ncats.nih.gov/