In a groundbreaking advancement poised to reshape the landscape of precision medicine, researchers at the Icahn School of Medicine at Mount Sinai have unveiled a sophisticated artificial intelligence (AI) framework designed to estimate the penetrance of rare genetic variants, that is, the likelihood that a person carrying a given variant will go on to develop the associated disease. Traditionally, clinicians and patients grappling with the implications of genetic testing have been confronted with ambiguous interpretations, especially when encountering uncommon DNA mutations. This pioneering study, published in the prestigious journal Science on August 28, 2025, introduces a machine learning-based methodology that integrates electronic health records with routine laboratory data to generate a nuanced, probabilistic measurement of disease risk linked to genetic variants.
Conventional genetic assessments have long operated within a binary diagnostic framework—classifying individuals as either affected or unaffected by certain diseases. However, this categorical approach inadequately captures the complexities inherent in many common conditions such as hypertension, diabetes, and various forms of cancer, where phenotypic expression can span a spectrum of severity and onset. Addressing this limitation, the Mount Sinai team employed advanced machine learning algorithms to quantify disease expression continuously, thereby providing a more refined and clinically actionable insight into penetrance. This approach transcends simplistic yes/no verdicts, offering patients and healthcare providers a dynamic and scalable risk assessment tool.
At the core of this innovation is the integration of over one million electronic health records (EHRs), which furnish the AI models with an unprecedented depth of longitudinal clinical data. Variables such as lipid profiles, complete blood counts, and markers of renal function—parameters routinely collected in clinical practice—serve as real-world physiological indicators that enrich the model’s predictive capacity. By harmonizing these diverse data streams, the AI system calculates an individualized penetrance score ranging from 0 to 1, wherein values nearing unity denote a higher probability that a particular genetic variant will precipitate disease, and values closer to zero suggest negligible or absent risk.
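To make the idea of a 0-to-1 penetrance score concrete, the following is a minimal illustrative sketch, not the study's actual model. It assumes a simple logistic function over standardized lab values to produce a per-person disease probability, and then averages those probabilities across carriers of one variant. The feature names (`ldl_z`, `egfr_z`, `hba1c_z`), the weights, the bias, and the averaging step are all hypothetical choices made for illustration.

```python
import math
from statistics import mean

# Hypothetical weights and bias, for illustration only; not taken from the study.
WEIGHTS = {"ldl_z": 1.2, "egfr_z": -0.8, "hba1c_z": 0.9}
BIAS = -1.0


def disease_probability(labs):
    """Map standardized lab values to a probability in (0, 1) with a logistic function.

    Missing lab values are treated as 0.0, i.e. the population mean after standardization.
    """
    score = BIAS + sum(WEIGHTS[name] * labs.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def variant_penetrance(carriers):
    """Average the per-person probabilities across all carriers of one variant."""
    if not carriers:
        raise ValueError("need at least one carrier")
    return mean(disease_probability(labs) for labs in carriers)


# Two made-up carriers of the same hypothetical variant.
carriers = [
    {"ldl_z": 2.1, "egfr_z": -0.5, "hba1c_z": 0.3},
    {"ldl_z": 0.2, "egfr_z": 0.4, "hba1c_z": -0.1},
]
print(round(variant_penetrance(carriers), 3))
```

In this toy version, a variant whose carriers tend to show more abnormal lab values receives a score closer to 1, while a variant whose carriers look clinically unremarkable receives a score closer to 0, mirroring the interpretation described above.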
Senior author Dr. Ron Do, Charles Bronfman Professor in Personalized Medicine, articulates the transformative potential of this approach: “Our goal was to move beyond binary interpretations that often leave patients and clinicians uncertain about the real-world implications of genetic test results. By harnessing artificial intelligence alongside routinely available clinical laboratory data, we can now deliver more precise estimates of disease risk for patients harboring specific variants, particularly those that are rare or previously uncharacterized.” This paradigm shift promises to enhance clinical decision-making by facilitating personalized risk stratification grounded in empirical evidence rather than theoretical assumptions.
The study’s development of the “ML penetrance” score entailed rigorous data curation and algorithmic training across ten prevalent diseases. The spectrum of diseases was carefully chosen to encompass conditions with heterogeneous presentation and variable genetic etiology, ensuring robust applicability of the model. When applied to over 1,600 rare genetic variants, the AI revealed unexpected patterns: some variants formerly deemed of “uncertain significance” exhibited clear associations with disease phenotypes, while others previously implicated as pathogenic showed minimal effect in population-level clinical data. These findings underscore the critical importance of leveraging large-scale, real-world datasets to revisit and refine the pathogenicity classification of genetic variants.
Lead study author Dr. Iain S. Forrest emphasizes the clinical utility of these findings, cautioning that while the AI tool is not intended to supplant physician judgment, it offers an invaluable adjunct in ambiguous cases. For instance, in carriers of variants linked to Lynch syndrome—a hereditary cancer predisposition syndrome—the penetrance score could prompt timely screening interventions in high-risk individuals, thereby preventing cancer development or enabling early detection. Conversely, a low-risk score might spare patients from unnecessary surveillance and the anxiety associated with overdiagnosis. This precision-guided approach fosters a balance between proactive care and avoidance of overtreatment.
Moreover, the investigators are expanding the scope of their model to incorporate additional diseases and a broader array of genetic alterations, including structural variants and complex haplotypes. A critical future direction involves validating the predictive accuracy longitudinally by monitoring whether individuals with high penetrance scores indeed manifest disease and assessing the impact of early clinical interventions prompted by AI-based risk assessment. Such longitudinal studies will be pivotal in solidifying the clinical integration of AI-driven penetrance estimation.
Beyond the algorithmic innovation, this research exemplifies the fruitful synergy achievable through the confluence of genomics, clinical informatics, and artificial intelligence. Mount Sinai’s Windreich Department of AI and Human Health, under the leadership of Dr. Girish N. Nadkarni, who is internationally recognized for his expertise in ethical AI deployment in healthcare, played an instrumental role in driving this interdisciplinary endeavor. The department’s commitment to responsible AI research ensures that technologies like the ML penetrance model are developed with rigorous attention to clinical applicability, patient safety, and ethical considerations.
This work also benefits from Mount Sinai’s partnership with the Hasso Plattner Institute for Digital Health, a unique collaboration between the Mount Sinai Health System and the Hasso Plattner Institute for Digital Engineering in Germany. Their combined expertise in biomedical informatics, machine learning, and digital engineering accelerates the translation of computational breakthroughs into practical clinical tools, fostering scalable innovations geared toward improving health outcomes globally.
The broader institutional context is equally significant. The Icahn School of Medicine at Mount Sinai, one of the preeminent academic medical centers in the United States, boasts extensive expertise in translational research and clinical care. Its integration within a large, diverse health system provides unparalleled access to rich clinical datasets, enabling the development of data-driven approaches such as the ML penetrance model on a population scale. This infrastructure is essential for validating AI models across heterogeneous patient populations and ensuring their generalizability and equity.
In an era when the volume of genetic testing continues to surge, yielding a vast number of rare and ambiguous variants awaiting clinical interpretation, the integration of AI-driven penetrance estimation represents a crucial advancement. This methodology has the potential to demystify genetic risk, foster precision interventions, and ultimately improve patient outcomes through data-driven personalization. As genetic medicine moves toward this more refined, continuous risk assessment paradigm, patients and clinicians alike stand to gain clarity amidst the complexity of genomic information.
The study, titled “Machine learning-based penetrance of genetic variants,” signifies a landmark step in moving beyond traditional genetics into an era where machine learning and comprehensive clinical data converge to illuminate the nuanced realities of disease risk. By equipping healthcare providers with probabilistic tools grounded in rigorous data analysis, this research heralds a future where genetic information is no longer a source of uncertainty but a guiding beacon for tailored medical care.
Subject of Research: People
Article Title: Machine learning-based penetrance of genetic variants
News Publication Date: 28-Aug-2025
Web References: https://ai.mssm.edu/
References: Forrest IS, Vy HMT, Rocheleau G, Jordan DM, Petrazzini BO, Nadkarni GN, Cho JH, Ganapathi M, Huang K-L, Chung WK, Do R. Machine learning-based penetrance of genetic variants. Science. 2025 Aug 28.
Keywords: Genetic algorithms, Machine learning, Genetic penetrance, Precision medicine, Electronic health records, Rare genetic variants