On and within the sediments of the ocean floor, foraminifera (tiny shelled microorganisms, often called forams) play a pivotal role in our understanding of Earth’s changing climate and marine ecosystems. These organisms, often no larger than a grain of sand, are abundant in seabed sediments worldwide, and their shells carry a detailed record of environmental conditions spanning millions of years. Precise classification of foram species is crucial for scientists seeking insights into past climate fluctuations, marine health, and potential sites for carbon capture and storage, a key technology for mitigating global warming.
Traditionally, the identification and classification of foraminifera have demanded painstaking manual work by skilled geoscientists. The process involves careful morphological examination under a microscope and requires expert knowledge and attention to detail. In recent years, deep learning (DL), a subset of artificial intelligence that excels at pattern recognition, has promised to transform this process. By automating classification tasks, DL methods can potentially accelerate research and increase accuracy. However, while many studies have demonstrated that DL can classify foraminifera, few have rigorously addressed a critical aspect: the uncertainty inherent in these automated predictions.
Understanding uncertainty in classification is not merely an academic exercise; it has profound ecological and practical implications. Misclassifying rare or ecologically significant species could lead to flawed interpretations of marine environments or misguided decisions regarding carbon sequestration projects. Recognizing this gap, a team of researchers from UiT The Arctic University of Norway, the Centre for Research-based Innovation SFI Visual Intelligence, Nofima, and the Norwegian School of Economics embarked on a comprehensive study to evaluate both the performance of DL models in foram classification and the reliability of their uncertainty estimates. Their work, recently published in Artificial Intelligence in Geosciences, sheds new light on the interface between artificial intelligence and human expertise in geoscience applications.
Using a dataset comprising 260 high-resolution images of various foraminifera and sediment grains, the researchers trained multiple state-of-the-art deep learning algorithms to detect, classify, and estimate the uncertainty of their predictions. This approach focused not only on the accuracy of classification but also on how well the models could quantify their confidence levels—a challenging task rarely prioritized in earlier studies. Accurate uncertainty quantification ensures that models can signal when their predictions might be unreliable, prompting human experts to review ambiguous cases rather than blindly accepting automated results.
To establish a robust comparative framework, the researchers devised an innovative method involving four senior geoscientists. These experts examined the same set of 260 images, providing both classifications and self-reported confidence levels for each specimen analyzed. The resulting dataset thus offered a rare human-derived baseline of uncertainty estimation against which the performance of DL models could be benchmarked. This comparative human-machine assessment is critical for transitioning AI tools from experimental research to dependable instruments in operational settings.
The study’s findings are striking. The deep learning methods estimated uncertainty about as reliably as seasoned geoscientists, and occasionally more reliably. This suggests that modern DL models, though often criticized as black boxes, can express not just what they predict but how confident they are in those predictions. The capacity to distinguish high-certainty from ambiguous classifications permits more nuanced decision-making workflows in geological surveys and environmental monitoring.
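One common way to check whether stated confidence can be trusted, for a model or for a human expert, is the expected calibration error (ECE): predictions are grouped by confidence, and within each group the average confidence is compared with the fraction that turned out correct. The sketch below is illustrative only; the function name, the bin count, and the sample numbers are assumptions, and the study may have used a different measure.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1], one per classified specimen.
    correct:     booleans, True where the classification was right.
    n_bins:      number of equal-width confidence bins (an assumed default).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 belongs in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Hypothetical confidences and outcomes, not data from the study.
sample_confidences = [0.95, 0.90, 0.60, 0.55, 0.85]
sample_correct = [True, True, False, True, True]
sample_ece = expected_calibration_error(sample_confidences, sample_correct)
```

A value near zero means that, for example, specimens labeled with 80% confidence are correct about 80% of the time. The same function can be applied to a model's scores and to an expert's self-reported confidence, which puts the two on a common scale.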
Delving deeper, the research highlights several implications for the future deployment of AI in the geosciences. First, by integrating reliable uncertainty estimation, automated classification systems can alleviate the burden on human experts, enabling them to focus their attention where it matters most—on specimens that defy straightforward categorization. This selective scrutiny enhances research efficiency without compromising data integrity. Second, confidence-aware DL models can bolster trust in automated analytical tools, addressing skepticism often associated with AI decisions in scientific disciplines traditionally dominated by human judgment.
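The selective scrutiny described above can be sketched as a simple triage rule: confident predictions are accepted automatically, and the rest are queued for an expert, most uncertain first. This is a minimal illustration, not the workflow used in the study; the function name, the 0.9 threshold, the specimen identifiers, and the class labels are all hypothetical.

```python
def triage(predictions, threshold=0.9):
    """Split predictions into an auto-accepted list and an expert-review queue.

    predictions: list of (specimen_id, label, confidence) tuples.
    threshold:   minimum confidence for automatic acceptance (assumed value).
    """
    accepted, review = [], []
    for specimen_id, label, confidence in predictions:
        target = accepted if confidence >= threshold else review
        target.append((specimen_id, label, confidence))
    # Experts see the least confident specimens first.
    review.sort(key=lambda item: item[2])
    return accepted, review


# Hypothetical model output, not data from the study.
example_predictions = [
    ("img_001", "calcareous", 0.97),
    ("img_002", "sediment", 0.55),
    ("img_003", "agglutinated", 0.80),
]
auto_accepted, needs_review = triage(example_predictions, threshold=0.9)
```

Such a threshold only saves expert time safely if the confidence scores are well calibrated, which is why the quality of the uncertainty estimates matters as much as raw accuracy.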
Importantly, the successful human-machine comparison underscores the potential for collaborative frameworks where AI supplements expert knowledge rather than replaces it. By quantifying their own uncertainty, models can act as intelligent assistants, flagging contentious cases and empowering geoscientists to make more informed interpretations. Such partnerships are essential as environmental datasets continue to grow in scale and complexity, challenging conventional analysis methods.
From a technical standpoint, the study leveraged advanced convolutional neural networks (CNNs) fine-tuned for object detection and classification in microscopic imagery. The researchers incorporated probabilistic modeling techniques to capture uncertainty, employing methods like Monte Carlo dropout and ensemble learning to generate confidence scores. Evaluations against human benchmarks revealed that these approaches effectively capture both aleatoric uncertainty—stemming from image noise or inherent biological variability—and epistemic uncertainty arising from model limitations.
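A standard way to separate the two kinds of uncertainty, given several predictions for the same image (one per ensemble member, or one per forward pass with dropout left active), is an entropy decomposition: the entropy of the averaged prediction is the total uncertainty, the average entropy of the individual predictions approximates the aleatoric part, and the difference approximates the epistemic part. The sketch below assumes this textbook decomposition; the article does not say which formulation the researchers used, and the function names and sample probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in nats, of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for a single image.

    member_probs: list of class-probability vectors, one per ensemble member
    or per stochastic forward pass (Monte Carlo dropout).
    Returns (mean_probs, total, aleatoric, epistemic).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [
        sum(member[k] for member in member_probs) / n_members
        for k in range(n_classes)
    ]
    total = entropy(mean_probs)  # uncertainty of the averaged prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n_members
    epistemic = total - aleatoric  # disagreement between members
    return mean_probs, total, aleatoric, epistemic


# Members agree that the image is ambiguous: uncertainty is mostly aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])

# Members are each certain but contradict one another: mostly epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

Both examples produce the same averaged prediction, yet the decomposition tells them apart: in the first case more training data would not help much, while in the second the disagreement between members points to a limitation of the model.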
Moreover, by focusing on marine microfossils such as foraminifera, the research bridges domains of machine learning, geology, and environmental science. The interdisciplinary collaboration exemplifies how AI can accelerate knowledge generation in complex natural systems, facilitating climate change studies, marine conservation, and the exploration of carbon capture technologies. As sustainable practices take center stage globally, tools that enhance the precision and scalability of ecological assessments become invaluable.
This breakthrough research signals a leap forward in the automated analysis of marine microfossils. By rigorously quantifying uncertainty and comparing predictions to human expertise, the study lays foundational work for trustworthy AI applications in geosciences. Future work might expand datasets, explore additional microorganisms, and refine uncertainty estimation techniques further, ultimately integrating these tools into standard workflows in oceanographic and climate research institutions worldwide.
In summary, the deployment of deep learning models capable of not only classifying foraminifera with human-level accuracy but also quantifying prediction uncertainty marks a transformative advancement. This technological synergy promises to enhance our understanding of Earth’s marine ecosystems, improve environmental monitoring, and support strategic climate action. The journey toward fully reliable AI-augmented geoscience research continues, empowered by studies such as this that carefully balance innovation with expert human judgment.
Contacts:
Iver Martinsen, PhD Candidate at SFI Visual Intelligence: iver.martinsen@uit.no
Petter Bjørklund, Communications Officer at SFI Visual Intelligence: petter.bjorklund@uit.no
Subject of Research: Classification and uncertainty quantification of foraminifera using deep learning
Article Title: Quantifying uncertainty in foraminifera classification: How deep learning methods compare to human experts
Web References: DOI: 10.1016/j.aiig.2025.100145
Image Credits: Petter Bjørklund, Communications Officer at SFI Visual Intelligence and UiT The Arctic University of Norway
Keywords: Microscopy, Deep Learning, Foraminifera, Uncertainty Quantification, Geoscience, Climate Change, Marine Microorganisms, AI in Environmental Science