In the ever-evolving landscape of artificial intelligence, deep neural networks (DNNs) have established themselves as the cornerstone of modern AI systems. These networks excel at deciphering complex patterns within vast datasets, such as images, audio, and text, enabling groundbreaking advancements in prediction and classification tasks across numerous disciplines. Traditionally, the training of these networks has leaned heavily on the back-propagation (BP) algorithm, a method that iteratively propagates errors backward through the network to adjust weights. Despite its ubiquity, back-propagation suffers from intrinsic limitations, including slow convergence, susceptibility to overfitting, significant computational demands, and an often impenetrable “black box” nature. Such issues highlight the pressing need for alternative training paradigms capable of delivering both efficiency and interpretability.
Emerging as a promising contender in this quest is the forward-forward network (FFN) framework, which diverges radically from the back-propagation approach. Rather than performing end-to-end training that propagates errors backward through the network’s layers, FFNs train each layer independently by optimizing a local goodness function. This paradigm promises faster convergence and more transparent training dynamics, potentially mimicking biological learning mechanisms more closely than conventional algorithms. However, the straightforward application of FFN to convolutional neural networks (CNNs)—the backbone of image-related AI tasks—has been fraught with challenges. CNNs rely on intricate spatial feature extraction, and splitting their training into independent layers risks losing critical information embedded within the input images, adversely impacting accuracy.
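To make the layer-wise idea concrete, the following is a minimal sketch of forward-forward-style training for a single fully connected layer, following Hinton's original formulation in which "goodness" is the mean of squared activations. This is an illustration of the general forward-forward principle, not the authors' VFF-Net implementation; the threshold, optimizer, and goodness definition are assumptions.

```python
# Sketch of forward-forward layer-wise training (illustrative, not VFF-Net's code).
# Each layer is trained with a local goodness objective: positive samples should
# yield high goodness (mean squared activation), negative samples low goodness.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    def __init__(self, in_dim, out_dim, threshold=2.0, lr=1e-3):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.threshold = threshold  # goodness threshold separating positive/negative
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize so only the direction of the previous layer's activity is passed on.
        x = F.normalize(x, dim=1)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).mean(dim=1)  # goodness of positive samples
        g_neg = self.forward(x_neg).pow(2).mean(dim=1)  # goodness of negative samples
        # Push positive goodness above the threshold and negative goodness below it.
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()  # the gradient stays local to this layer
        self.opt.step()
        # Detach outputs so no gradient flows back to earlier layers (no end-to-end BP).
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Because each layer only sees its own local objective and detached inputs, training proceeds layer by layer without a global backward pass, which is precisely the property that makes naive application to CNNs lose spatial information.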
Addressing this formidable challenge, a team of researchers from Seoul National University of Science and Technology, led by Ph.D. candidate Gilha Lee and Associate Professor Hyun Kim from the Department of Electrical and Information Engineering, has developed an innovative solution dubbed the Visual Forward-Forward Network (VFF-Net). Their study, published in the journal Neural Networks in October 2025, delineates how VFF-Net transcends the inherent limitations of FFNs when applied to convolutional architectures. By preserving spatial features and optimizing efficiency, their approach represents a significant step toward more brain-like, resource-efficient AI training methods.
Central to the VFF-Net’s success is the introduction of three novel methodologies: Label-Wise Noise Labeling (LWNL), Cosine Similarity-based Contrastive Loss (CSCL), and Layer Grouping (LG). LWNL innovatively expands the training process by incorporating three distinct types of data: clean original images, positively labeled images correctly corresponding to their targets, and negatively labeled images deliberately assigned incorrect labels. This strategy counters the tendency of traditional FFNs to lose pixel-level information by reinforcing the network’s capacity to differentiate between genuine and corrupted inputs. Essentially, LWNL acts as a sophisticated noise-augmented training mechanism designed to safeguard the integrity of spatial details critical for accurate classification.
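A hypothetical sketch of assembling the three LWNL input types is shown below. The label-embedding scheme used here (overwriting the first pixels of each image with a one-hot code, as in Hinton's original forward-forward paper) is an assumption for illustration; the published VFF-Net may embed label information differently.

```python
# Hypothetical construction of the three LWNL input types: clean images,
# positively labeled images (correct label embedded), and negatively labeled
# images (a deliberately wrong label embedded). The embedding scheme is assumed.
import torch

def embed_label(images, labels, num_classes=10):
    """Overwrite the first `num_classes` pixel values of each image with a one-hot label."""
    x = images.clone()
    flat = x.view(x.size(0), -1)
    flat[:, :num_classes] = 0.0
    flat[torch.arange(x.size(0)), labels] = flat.max()
    return flat.view_as(x)

def make_lwnl_batch(images, labels, num_classes=10):
    x_clean = images                                    # original, unlabeled images
    x_pos = embed_label(images, labels, num_classes)    # correct labels embedded
    wrong = (labels + torch.randint(1, num_classes, labels.shape)) % num_classes
    x_neg = embed_label(images, wrong, num_classes)     # deliberately wrong labels embedded
    return x_clean, x_pos, x_neg
```

Feeding all three variants through each layer gives the network explicit positive and negative contrasts while the clean images anchor the pixel-level detail that naive FFN training tends to discard.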
Complementing LWNL, the Cosine Similarity-based Contrastive Loss (CSCL) function refines the goodness criterion by leveraging the directional similarity of feature maps rather than mere magnitude comparisons. Unlike conventional approaches that often utilize Euclidean distances or simplistic statistical measures, CSCL evaluates how closely feature representations align in their vector orientations within high-dimensional space. This subtle yet powerful adjustment enables VFF-Net to retain intricate spatial relationships and nuanced patterns indispensable for discerning subtle differences across image categories. Consequently, CSCL imparts an enhanced ability to capture the essence of visual features, elevating classification performance.
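A hedged sketch of a cosine-similarity-based contrastive objective follows. The exact pairing used in VFF-Net's CSCL is not reproduced here; this illustration assumes the loss compares the flattened feature map of each labeled image against that of its clean counterpart, rewarding directional alignment for positively labeled inputs and penalizing it for negatively labeled ones. The InfoNCE-style formulation and temperature are assumptions.

```python
# Illustrative cosine-similarity-based contrastive loss (not the paper's exact CSCL).
import torch
import torch.nn.functional as F

def cscl_loss(f_clean, f_pos, f_neg, temperature=0.1):
    # Flatten feature maps to vectors and compare orientations, not magnitudes.
    f_clean = f_clean.flatten(1)
    f_pos = f_pos.flatten(1)
    f_neg = f_neg.flatten(1)
    sim_pos = F.cosine_similarity(f_clean, f_pos, dim=1) / temperature
    sim_neg = F.cosine_similarity(f_clean, f_neg, dim=1) / temperature
    # Contrastive objective: the positive pair should dominate the negative pair.
    logits = torch.stack([sim_pos, sim_neg], dim=1)
    targets = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, targets)
```

Because cosine similarity ignores vector magnitude, the criterion focuses on the direction of feature representations, which is where the spatial and semantic structure of an image tends to be encoded.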
To address the issue of training convolutional layers individually—a process that might otherwise degrade model performance due to inconsistent optimization—VFF-Net introduces Layer Grouping (LG). This methodology clusters layers exhibiting similar output characteristics into cohesive groups, training them collectively rather than in isolation. To further bolster this collective learning, auxiliary layers are incorporated, allowing the network to propagate meaningful gradients within grouped units. By doing so, LG mitigates the pitfalls of layer-wise greediness inherent in FFNs and nurtures a more holistic convergence resembling end-to-end training. The strategic grouping procedure significantly enhances stability and responsiveness, ensuring that the network’s layered structure contributes constructively to overall task performance.
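The sketch below illustrates the grouping mechanism: convolutional layers are partitioned into small groups, each trained with its own auxiliary head so gradients flow within a group but are cut between groups. The group sizes, auxiliary-head design, and the cross-entropy loss used here are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative layer grouping (LG): gradients stay inside a group; groups are
# decoupled from one another by detaching the features passed between them.
import torch
import torch.nn as nn

class LayerGroup(nn.Module):
    def __init__(self, layers, feat_dim, num_classes=10, lr=1e-3):
        super().__init__()
        self.body = nn.Sequential(*layers)              # the grouped conv layers
        self.aux_head = nn.Sequential(                  # auxiliary classifier head
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(feat_dim, num_classes))
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def train_step(self, x, labels):
        feats = self.body(x)
        loss = nn.functional.cross_entropy(self.aux_head(feats), labels)
        self.opt.zero_grad()
        loss.backward()        # gradients propagate only within this group
        self.opt.step()
        return feats.detach()  # detach: no gradient crosses the group boundary

# Usage sketch: two groups of two conv layers each, trained group by group.
g1 = LayerGroup([nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                 nn.Conv2d(32, 32, 3, padding=1), nn.ReLU()], feat_dim=32)
g2 = LayerGroup([nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                 nn.Conv2d(64, 64, 3, padding=1), nn.ReLU()], feat_dim=64)
```

Training each group as a unit, rather than each layer in isolation, lets nearby layers with similar output characteristics coordinate their updates while still avoiding a full end-to-end backward pass.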
The empirical results validating VFF-Net’s innovations are compelling. When applied to a CNN architecture with four convolutional layers, VFF-Net achieved notable reductions in test error rates on benchmark image datasets. Specifically, it decreased errors by 8.31% on CIFAR-10, a widely utilized dataset comprising 10 classes of natural images, and by 3.80% on the more challenging CIFAR-100 dataset containing 100 distinct categories. Additionally, when applied to a fully connected architecture, VFF-Net attained an impressively low test error of 1.70% on the MNIST dataset, a foundational corpus for handwritten digit recognition. These quantitative gains underscore VFF-Net’s capability to bridge the performance gap between FFNs and conventional back-propagation-trained CNNs while harnessing the benefits of layer-wise training.
Beyond the quantitative metrics, VFF-Net’s broader implications are profound. By circumventing the expensive and computationally intensive back-propagation paradigm, VFF-Net charts a path toward AI models that are inherently lighter and more biologically plausible. Such models have the transformative potential to decentralize AI processing, enabling the deployment of powerful neural architectures directly on edge devices like smartphones, medical instruments, and household electronics. This local computation paradigm reduces reliance on massive cloud-based data centers, helping curb the growing energy consumption footprint of AI and promoting sustainability in technological development.
Moreover, VFF-Net’s design echoes cognitive principles observed in neurological learning processes. By emphasizing local learning rules, noise handling, similarity-based contrastive criteria, and modular training units, it recapitulates aspects of synaptic plasticity and cortical area specialization found in biological brains. This convergence of neuroscience insights and computational innovation paves the way for more naturalistic, interpretable, and trustworthy AI systems. With interpretability being a critical concern in deploying AI in domains like healthcare and autonomous systems, VFF-Net’s brain-inspired paradigm could foster greater adoption and regulatory acceptance.
The conceptual significance of forward-forward algorithms extends well beyond the immediate context of CNN training. VFF-Net exemplifies how reimagining fundamental learning mechanisms can unlock new avenues for efficiency and capability. While back-propagation has demonstrated remarkable success over the past decades, its scaling limitations and biological implausibility leave room for alternative frameworks. Forward-forward networks and their derivatives such as VFF-Net could serve as pivotal milestones in the search for energy-efficient, interpretable, and robust AI.
As research progresses, the integration of VFF-Net with emerging hardware accelerators and neuromorphic chips could catalyze further acceleration and optimization. Its compatibility with computational simulations and modeling environments also invites broad experimental validation and extension across diverse modalities like speech recognition, natural language processing, and robotics. Thus, VFF-Net does not merely refine convolutional training—it ignites a paradigm shift in how neural computation can be conceptualized, engineered, and applied.
In summary, the Visual Forward-Forward Network developed by the Seoul National University of Science and Technology team marks a decisive evolution in AI training methodologies. By harmonizing label-wise noise interventions, cosine similarity-guided contrastive objectives, and strategic grouping of layers, VFF-Net transcends previous limitations of the forward-forward framework. This advancement promises faster, cheaper, and more biologically inspired AI, enabling powerful neural networks to run on devices far beyond traditional computing centers. The breakthrough holds promising ramifications for sustainability, accessibility, and trustworthiness in the future of artificial intelligence.
Subject of Research: Not applicable
Article Title: VFF-Net: Evolving forward–forward algorithms into convolutional neural networks for enhanced computational insights
News Publication Date: October 1, 2025
Web References: https://doi.org/10.1016/j.neunet.2025.107697
References: DOI: 10.1016/j.neunet.2025.107697
Image Credits: Hyun Kim from Seoul National University of Science and Technology
Keywords: Artificial intelligence, Machine learning, Deep learning, Neural networks, Computer science, Algorithms, Computer architecture, Pattern recognition