Cognitive neuroscience researchers continue to look for reliable biomarkers of human mental processes. Among the candidates, pupil dilation has emerged as a physiological measure linked to attention, memory load, and cognitive effort. A study by Bényei and Pajkossy, published in Scientific Reports, examines the test–retest reliability and behavioral correlates of pupil responses during the n-back task, a widely used paradigm for probing working memory and executive function.
The investigation addresses a gap in cognitive psychophysiology by evaluating how consistent pupil size changes are across repeated sessions. Such reliability is a precondition for treating pupil metrics as robust indicators of cognitive load, and for potential applications in clinical diagnostics, brain-computer interfaces, and adaptive learning technologies. The n-back task requires participants to continuously update and monitor a sequence of stimuli, which makes it a suitable framework for eliciting autonomic nervous system responses that can be measured with pupillometry.
The experimental design is described in detail. Participants performed the n-back task under conditions varying in memory load, typically a contrast between a simpler 1-back condition and more demanding 2- or 3-back conditions. During these sessions, high-resolution eye-tracking equipment recorded moment-to-moment fluctuations in pupil diameter. By testing the same individuals in multiple sessions separated in time, the researchers could quantify the stability of pupil responses, a prerequisite for any credible biomarker.
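To make the task logic concrete, the following is a minimal Python sketch of how targets are defined in an n-back sequence: a trial is a target when its stimulus matches the one presented n trials earlier. The function name `nback_targets` and the letter stimuli are illustrative assumptions, not details taken from the study.

```python
def nback_targets(stimuli, n):
    """Return one boolean per trial: True if the stimulus matches
    the one shown n trials earlier (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The first n trials have no stimulus n steps back, so they cannot be targets.
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


# Illustrative letter sequence scored under two load levels.
sequence = list("ABABCC")
one_back = nback_targets(sequence, 1)
two_back = nback_targets(sequence, 2)
```

The same sequence yields different targets at different values of n, which is how the task varies memory load while keeping the stimuli comparable.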
The results showed clear patterns linking pupil dynamics to cognitive demand. Pupil dilation was consistently larger under higher working memory loads, in line with earlier findings that pupil size scales with mental effort. More importantly, these dilation responses showed high test–retest reliability at the individual level, suggesting that pupillometric indicators could serve not only as momentary snapshots of cognitive state but also as stable individual trait markers. This opens new avenues for personalized cognitive monitoring.
Alongside the physiological measurements, behavioral data such as reaction times and accuracy rates were analyzed. Relating behavioral performance to pupil dynamics highlighted interdependencies between cognitive effort, task success, and physiological arousal. The study illustrates how pupil dilation can serve as a proxy for the brain's allocation of attentional resources, allowing researchers to infer mental workload non-invasively and in real time.
From a technical standpoint, the authors quantified reliability with statistical techniques that include intraclass correlation coefficients (ICCs), which express the share of total variance attributable to stable differences between individuals. Such metrics indicate whether pupillary responses remain stable enough over time to justify their use in longitudinal cognitive assessments. The high ICC values reported support the robustness of pupillometry in experimental and potentially clinical settings, moving beyond correlational observations toward validated biomarkers.
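As an illustration of how such a coefficient is computed, the sketch below implements one common form, ICC(2,1) in the Shrout and Fleiss notation (two-way random effects, absolute agreement, single measurement). The article does not say which ICC form the authors used, so this choice is an assumption, as are the function name and the example values.

```python
def icc_absolute_agreement(scores):
    """ICC(2,1) for a table with one row per participant and one
    column per session. Illustrative sketch, not the authors' code."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 participants and 2 sessions")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA decomposition of the total sum of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_err) / denom


# Hypothetical mean dilations (arbitrary units) for four participants, two sessions.
example = [[0.30, 0.34], [0.52, 0.49], [0.41, 0.45], [0.25, 0.22]]
reliability = icc_absolute_agreement(example)
```

Because this form measures absolute agreement, a uniform shift between sessions lowers the coefficient even when participants keep their rank order. A consistency-type ICC would ignore such a shift.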
The study also considered confounding factors known to influence pupil size, such as ambient lighting, fatigue, and circadian variables. Controlling these factors through standardized protocols reduced noise and improved signal quality. This careful operationalization guards against spurious findings and offers a model for experimental pupillometry in cognitive research.
Importantly, the authors contextualize their findings within broader neuroscientific frameworks. They discuss the role of the locus coeruleus–norepinephrine system, widely recognized as a key neuromodulatory influence on pupil size modulation under cognitive load. By anchoring pupillometric changes to underlying neurochemical circuits, the study bridges phenomenological measures and biological mechanisms, contributing to our integrative understanding of cognition and autonomic function.
The translational implications of this research are multifaceted. Reliable pupillometric biomarkers could revolutionize cognitive health assessments, enabling early detection of impairments in populations vulnerable to neurodegenerative conditions or psychiatric disorders. Additionally, neuroergonomic applications may harness real-time pupil data to optimize environments and interfaces for peak mental performance, adapting tasks dynamically based on cognitive load inferred from pupil signals.
Furthermore, the n-back task remains a foundational tool in cognitive neuroscience due to its elegant balance between experimental control and ecological validity. By elucidating pupillometric correlates of n-back performance, the study advances quantitative methods to characterize executive functions such as updating, inhibition, and shifting. This precision empowers both fundamental research and applied clinical approaches.
The repeated-session design merits emphasis. Reliability over time is difficult to establish for psychological metrics because human performance and physiology vary from day to day. That pupillary responses remain stable despite this variability supports their use as stable markers. It also makes repeated assessments more practical, since an individual's earlier responses remain a meaningful reference point for later sessions.
On the practical side, the study details technical parameters for pupil data acquisition and preprocessing, including baseline correction and artifact rejection. These procedural details give other researchers concrete protocols for replicating and extending the findings. Standardizing pupillometric methods improves reproducibility, a current priority in cognitive science.
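The two preprocessing steps named above can be sketched in a few lines of Python. The sketch assumes that invalid samples (for example during blinks) are recorded as None or as non-positive values, fills them by linear interpolation, and then subtracts the mean of a pre-stimulus window. These are common choices in pupillometry, but the article does not specify the authors' algorithms, so the function names and parameters here are hypothetical.

```python
def interpolate_blinks(trace):
    """Fill invalid samples (None or <= 0) by linear interpolation
    between the nearest valid neighbours. Illustrative sketch."""
    valid = [i for i, v in enumerate(trace) if v is not None and v > 0]
    if not valid:
        raise ValueError("trace contains no valid samples")
    cleaned = list(trace)
    # Gaps at the edges have only one valid neighbour: copy its value.
    for i in range(valid[0]):
        cleaned[i] = trace[valid[0]]
    for i in range(valid[-1] + 1, len(trace)):
        cleaned[i] = trace[valid[-1]]
    # Interior gaps: draw a straight line between the bounding valid samples.
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            cleaned[i] = trace[left] + frac * (trace[right] - trace[left])
    return [float(v) for v in cleaned]


def baseline_correct(trace, n_baseline):
    """Subtract the mean of the first n_baseline samples (subtractive baseline)."""
    if not 0 < n_baseline <= len(trace):
        raise ValueError("n_baseline must be between 1 and len(trace)")
    baseline = sum(trace[:n_baseline]) / n_baseline
    return [v - baseline for v in trace]


# Hypothetical single-trial trace in millimetres; zeros mark a blink.
raw = [4.0, 4.0, 0.0, 0.0, 5.0, 5.5]
cleaned = interpolate_blinks(raw)
corrected = baseline_correct(cleaned, n_baseline=2)
```

Interpolating before baseline correction matters: a blink inside the baseline window would otherwise pull the baseline toward zero and inflate every corrected value in the trial.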
While the study confirms robust reliability, the authors acknowledge limitations such as sample size and demographic variability. They advocate for expanded research encompassing diverse populations and task paradigms to generalize findings. Such ambitious future directions highlight the dynamic nature of this research domain and the continual refinement of pupillometry as a tool.
In sum, Bényei and Pajkossy's work supports the use of pupil responses in the n-back task as reliable, sensitive indices of cognitive load and behavioral engagement. By relating these responses to both behavior and underlying neural mechanisms, the study strengthens the case for pupillometry as an objective cognitive assessment tool with scientific and clinical potential.
As technological advances render eye-tracking devices more accessible and portable, integrating pupillometry in real-world settings becomes increasingly plausible. The intersection of neurophysiology, behavioral science, and cutting-edge analytics embodied in this research promises ongoing breakthroughs in decoding the language of the mind through the window of the pupil.
Follow-up investigations will show how far these results generalize. For now, the work adds to the mechanistic understanding of pupil responses and gives practitioners reliability estimates for metrics used to measure human cognition in health and disease.
Subject of Research: Test–retest reliability and behavioral correlates of pupil responses during the n-back working memory task.
Article Title: Investigating the test–retest reliability and behavioral correlates of pupil responses in the n-back task.
Article References:
Bényei, G.L., Pajkossy, P. Investigating the test–retest reliability and behavioral correlates of pupil responses in the n-back task. Sci Rep (2026). https://doi.org/10.1038/s41598-026-46731-3

