In a new study published in Science Advances, researchers from Boston University School of Public Health and Stanford University report that the true death toll of COVID-19 in the United States during the first two years of the pandemic was substantially higher than official records show. Using an innovative machine learning methodology, the team quantified a 19 percent undercount in official COVID-19 mortality records, exposing a significant number of deaths that were not recognized as resulting directly from SARS-CoV-2 infection. This analysis not only redefines our understanding of the pandemic’s impact but also underscores systemic flaws and inequalities in the US death investigation infrastructure.
Traditional epidemiological approaches to estimating unrecognized COVID-19 deaths have predominantly relied on excess mortality models, which compare observed all-cause deaths against expected death rates absent the pandemic. While insightful, such models conflate deaths caused directly by the virus with fatalities indirectly triggered by pandemic-related societal disruptions, such as delayed medical care or economic distress. The novel machine learning technique employed in this study refines these estimates by specifically targeting deaths linked to SARS-CoV-2 infection, thereby delivering a more precise assessment of the pandemic’s direct lethality.
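The excess mortality calculation described above reduces to a subtraction, which a minimal Python sketch can make concrete. The function name and the monthly counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def excess_deaths(observed, expected):
    """Excess mortality per period: observed all-cause deaths minus the expected baseline."""
    return [o - e for o, e in zip(observed, expected)]

# Hypothetical monthly all-cause death counts (illustration only, not study data).
observed = [1200, 1500, 1350]
# Hypothetical baseline expected in the absence of the pandemic.
expected = [1000, 1010, 1005]

excess = excess_deaths(observed, expected)
total_excess = sum(excess)
```

The total counts every death above the baseline regardless of cause, which is why this approach alone cannot separate deaths caused by infection from deaths caused by disrupted care.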
The researchers trained their machine learning algorithm on a robust dataset of hospital-verified inpatient deaths attributed to COVID-19. This choice was deliberate: hospitalized patients were tested for COVID-19 almost universally during the pandemic’s peak periods, so cause-of-death attribution for inpatient deaths is highly reliable. With these verified inpatient cases as a foundation, the model extrapolated predictions to out-of-hospital settings, where COVID-19 diagnoses were often lacking or inconsistent, particularly for deaths at home.
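The train-on-inpatient, predict-elsewhere workflow can be illustrated with a deliberately simple stand-in. This article does not specify the study's model or features, so the sketch below uses a frequency-table classifier instead; the strata, records, and function names are all hypothetical. It assumes that the relationship learned from inpatient deaths also holds for deaths outside hospitals.

```python
from collections import defaultdict

def fit_rates(inpatient_deaths):
    """Learn the share of COVID-19 deaths per stratum from labelled inpatient deaths.

    Each record is (stratum, is_covid), where is_covid is the test-verified label.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [covid deaths, all deaths]
    for stratum, is_covid in inpatient_deaths:
        counts[stratum][0] += int(is_covid)
        counts[stratum][1] += 1
    return {s: covid / total for s, (covid, total) in counts.items()}

def expected_unrecognized(rates, other_cause_deaths):
    """Sum predicted COVID-19 probabilities over deaths recorded as other causes.

    Strata never seen in training contribute 0 (a conservative assumption).
    """
    return sum(rates.get(stratum, 0.0) for stratum in other_cause_deaths)

# Hypothetical strata: (contributing condition group, local epidemic phase).
inpatient = (
    [(("respiratory", "surge"), True)] * 3
    + [(("respiratory", "surge"), False)]
    + [(("cardiac", "surge"), True)]
    + [(("cardiac", "surge"), False)] * 3
    + [(("cardiac", "lull"), False)] * 2
)
rates = fit_rates(inpatient)

# Hypothetical home deaths whose certificates list a cause other than COVID-19.
home_deaths = [("respiratory", "surge"), ("respiratory", "surge"),
               ("cardiac", "surge"), ("cardiac", "lull")]
estimate = expected_unrecognized(rates, home_deaths)
```

Summing probabilities instead of counting hard yes/no predictions gives an expected number of unrecognized deaths without forcing a decision on any single certificate.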
The findings paint a sobering picture: between March 2020 and December 2021, over 155,000 COVID-19 deaths went unrecognized, roughly 19 percent more than federal counts record. The discrepancy was most pronounced for deaths in private residences, which the study reports were underreported by a staggering 160 percent. By the study’s estimate, over 111,000 people died at home from COVID-19 without the death certificate reflecting the true cause.
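These percentages and counts can be cross-checked with simple arithmetic. The sketch below assumes each percentage is the ratio of unrecognized deaths to officially recorded COVID-19 deaths in the same category; that is an interpretation, not something the article states. The function name is hypothetical, and because the article gives the counts only as lower bounds ("over"), the results are rough.

```python
def implied_recorded(unrecognized, undercount_pct):
    """Recorded deaths implied by an unrecognized count and an undercount percentage.

    Assumes undercount_pct = 100 * unrecognized / recorded (an interpretation).
    """
    return unrecognized * 100 / undercount_pct

# Figures quoted in the article, treated here as point values.
national_recorded = implied_recorded(155_000, 19)   # about 816,000
home_recorded = implied_recorded(111_000, 160)      # about 69,000

# Share of all unrecognized deaths that occurred at home.
home_share = 111_000 / 155_000                      # about 0.72
```

Under this reading, a 160 percent undercount means that for every home death recorded as COVID-19, roughly 1.6 more went unrecorded.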
Demographic analysis revealed disturbing inequities in undercounted fatalities. Marginalized groups were disproportionately affected, including racial and ethnic minorities (Hispanic, American Indian and Alaska Native, Black, and Asian populations), socioeconomically disadvantaged individuals, people without a high school diploma, and residents of counties with a high burden of preexisting health conditions. Geographically, the Southern United States exhibited significant underreporting, with states like Alabama undercounting COVID-19 deaths by over 67 percent. This regional disparity highlights glaring inconsistencies in death recording practices across jurisdictions.
Underlying cause-of-death coding further complicated the accurate enumeration of COVID-19 fatalities. Often, deaths attributable to the virus were misclassified under chronic conditions such as Alzheimer’s disease and related dementias, cardiovascular disease, and diabetes. This misclassification points toward systemic challenges within the death certification process, including inadequate training, resource limitations, and political influences that may compromise objectivity—particularly in counties relying on elected coroners without requisite medical qualifications.
The implications of these findings extend far beyond mortality statistics. By obscuring the true extent of COVID-19 impact, the undercounting phenomenon conceals the profound structural inequities embedded within public health and social policy frameworks. The failure to accurately record deaths in vulnerable populations delays targeted intervention, obscures the urgency of policy responses, and perpetuates disparities that compound the pandemic’s devastation.
Experts involved in the study argue vehemently for sweeping reforms in the US death investigation system. Presently, this network operates as a fragmented, underfunded patchwork, often staffed by personnel lacking sufficient scientific expertise and comprehensive training in forensic epidemiology. Enhanced funding, standardized protocols, and increased employment of medically trained examiners are essential measures to modernize this critical public health infrastructure.
Importantly, while the study showcases the immense potential of machine learning to enhance death surveillance, the researchers caution against viewing these computational tools as standalone solutions. Instead, artificial intelligence should complement broader systemic reforms, ensuring that cause-of-death data collection becomes more accurate, timely, and equitable. Further, such techniques could be adapted to address other public health challenges characterized by incomplete or biased data, including drug overdose mortality, custodial deaths, and fatalities related to environmental hazards.
The study also challenges ongoing public debates that question the proportionality of pandemic control measures. By demonstrating that excess deaths predominantly stemmed from the viral infection itself rather than mitigation policies, the research reorients discourse toward prioritizing comprehensive, evidence-based public health strategies over politicized narratives.
In sum, this pioneering machine learning approach not only recalibrates the national understanding of COVID-19’s deadly reach but also spotlights the urgent necessity for systemic reform in mortality data collection—a reform that promises to improve responses to future public health crises and advance health equity at a foundational level.
Subject of Research: People
Article Title: Applying machine learning to identify unrecognized COVID-19 deaths recorded as other causes of death in the United States
News Publication Date: 18-Mar-2026
Web References:
- DOI link to the article
- CDC Excess Deaths Estimates
- Boston University SPH Article on Mortality
- National Academies Report on Death Investigation System
References:
- Prior studies estimating undercounted COVID-19 deaths using excess mortality frameworks
- Publications highlighting systemic challenges in death certification processes during COVID-19 (PubMed: 32520302, 36545301, 34515787)
Keywords:
COVID-19, SARS-CoV-2, Mortality rates, Machine learning, Death investigation system, Health disparities, Public health, Pandemic mortality, Death certification, Racial and ethnic inequities, Computational modeling, Excess mortality

