For decades, the prevailing belief in artificial intelligence has rested on a simple premise: a model's performance scales with the quantity and diversity of the data it saw during training. The more extensive the dataset, the better the AI performs; limited datasets inevitably restrict its capabilities. A new study from the USC Viterbi School of Engineering challenges this foundational assumption, offering an approach that unlocks latent capability in AI models beyond the boundaries imposed by their original training data.
This research, set to be presented at IEEE SoutheastCon 2026, reports a striking finding: with a precise feedback mechanism in place, AI models can significantly improve in domains where they have had little exposure. The study's authors, led by undergraduate researcher Minda Li and her advisor, Professor Bhaskar Krishnamachari, chose an unconventional testing ground: Idris, a niche programming language with a scant online presence compared to mainstream languages like Python. That scarcity makes Idris a formidable challenge, since AI models encounter very little Idris code during training.
To appreciate the crux of this research, it helps to understand the disparity in data availability between Python and Idris. Python boasts over 24 million public code repositories, giving AI models like GPT-5 an immense reservoir of examples to learn from. Idris, in stark contrast, has only about 2,000 repositories, a difference of more than four orders of magnitude. The decision to experiment with Idris was a deliberate gambit by Li and Krishnamachari: they sought a language so obscure that neither of them could write it themselves. This extreme knowledge gap not only magnified the challenge but also ensured that any improvement would come from the feedback mechanism itself rather than from human guidance.
At the outset, GPT-5's performance on Idris coding exercises was underwhelming. Presented with 56 tasks from the popular code-learning platform Exercism, the model managed a modest 39% success rate, far below the 70-90% it typically achieves on more commonly encountered languages. Initial attempts to boost performance by supplementing GPT-5's prompts with language documentation, error references, and manuals yielded limited gains, nudging the success rate into the low 60s but failing to deliver a breakthrough.
The paradigm shift occurred when Li introduced what she and her team term the "compiler feedback loop." A compiler translates human-written source code into executable instructions and, crucially, emits detailed technical diagnostics when the code contains errors. By capturing these compiler-generated error messages and systematically feeding them back into GPT-5, the team prompted the model to revise and improve its code iteratively, attempting up to 20 recompilations per problem. This seemingly straightforward process triggered a profound transformation in the AI's capabilities.
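The paper's actual implementation is not reproduced here, but the general shape of such a loop can be sketched in Python. In this illustrative sketch, `generate` stands in for a call to GPT-5 and `check` for an invocation of the Idris compiler; the stub functions `toy_generate` and `toy_check` (which uses Python's built-in `compile()` as a mock compiler) are assumptions for demonstration only, not the authors' code.

```python
def refine_with_feedback(generate, check, max_attempts=20):
    """Repeatedly regenerate a candidate until the checker accepts it.

    generate(feedback) -> candidate source string (feedback is None on
    the first attempt, otherwise the previous diagnostics);
    check(candidate)  -> (ok, diagnostics).
    Returns (candidate, attempts_used), or (None, max_attempts) on failure.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        ok, feedback = check(candidate)
        if ok:
            return candidate, attempt
    return None, max_attempts


# Toy demo: Python's own compile() stands in for the Idris compiler,
# and a canned sequence of candidates stands in for GPT-5's outputs.
_candidates = iter([
    "def add(x, y) return x + y",   # missing colon -> syntax error
    "def add(x, y): return x + y",  # corrected on the second attempt
])

def toy_generate(feedback):
    # A real system would embed `feedback` in the next model prompt.
    return next(_candidates)

def toy_check(source):
    try:
        compile(source, "<candidate>", "exec")
        return True, ""
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"

code, attempts = refine_with_feedback(toy_generate, toy_check)
print(attempts)  # 2 -- one failed compile, then a clean one
```

The 20-attempt cap mirrors the per-problem recompilation limit described in the study; the key design point is that the loop needs only an automated pass/fail signal with diagnostics, not any human in the loop.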
Contrary to expectations that this feedback-driven method might produce only incremental improvement, the results were nothing short of astonishing. GPT-5’s success rate soared to an impressive 96%, surpassing even the most optimistic projections. This leap demonstrates that an AI model’s potential significantly exceeds what its training data might predict—a revelation that calls into question long-held assumptions about AI learning limitations and generalization boundaries.
The researchers emphasize that this methodology reveals capabilities already latent in the model but previously inaccessible without structured feedback. The loop creates a dynamic learning environment at inference time, allowing the model to self-correct in a manner reminiscent of human trial-and-error learning. Importantly, the approach is not limited to programming tasks. The conceptual framework extends to virtually any AI system where objective, rule-based feedback can be automated, including complex fields such as 3D architectural modeling, mathematical theorem proving, legal reasoning, and natural language translation for low-resource languages.
Professor Krishnamachari envisions a future where AI systems are continually refined by external evaluation tools that guide their iterative improvements, pushing AI outputs to levels previously considered unattainable. For example, an AI tasked with creating structural models could receive real-time feedback about safety, cost, and material use, iteratively adjusting the design until it meets stringent criteria. This interactivity effectively transforms AI from static data-driven predictors into dynamic problem solvers that thrive on continuous evaluation and refinement.
The implications of this research reach beyond technical prowess, touching on the revitalization of endangered human languages. The study aligns with parallel efforts to deploy AI in preserving and translating languages with sparse textual resources, such as Owens Valley Paiute. Here, the ability of AI to self-improve based on iterative feedback could become a critical tool in linguistic research and cultural preservation, leveraging minimal data to produce meaningful outputs.
Yet the journey is far from complete. The current feedback loop relies heavily on brute-force trial and error, resetting the model's state for each new problem with no memory of previous attempts. Li's next steps aim to enable the model to accumulate and apply learned insights across problems, fostering progressive improvement rather than a fresh start every time. This evolution holds promise for AI systems that grow smarter and more efficient through experience, much as human learners refine skills over time.
Krishnamachari reflects on the broader ramifications of this research: the creation of AI tools that not only perform tasks beyond human expertise but also transcend the limitations of their own initial training data. Far from provoking fear, this prospect is met with enthusiasm—AI technologies are poised to liberate human creativity by automating routine or complex tasks, enabling focus on innovation and conceptual breakthroughs. The humble origins of this project—two researchers casually exploring obscure coding languages—highlight how curiosity and experimentation can yield transformative advances in AI.
The USC Viterbi team’s research fundamentally redefines the relationship between AI models and their training data, signaling a shift towards more adaptable, feedback-informed artificial intelligence. This development promises to accelerate AI applications in diverse specialized and low-data domains, heralding a new era where AI models are not confined by the past but can actively learn and improve through interaction with evaluative systems. As the paradigm moves from static knowledge ingestion to dynamic iterative refinement, the frontier of AI capabilities expands dramatically—opening doors to innovations previously deemed impossible.
Subject of Research: Not applicable
Article Title: Compiler-Guided Inference-Time Adaptation: Improving GPT-5 Programming Performance in Idris
News Publication Date: 13-Mar-2026
Web References:
– USC Viterbi School of Engineering: https://viterbischool.usc.edu/
– IEEE SoutheastCon 2026: https://ieeesoutheastcon.org/
– Paper: https://arxiv.org/abs/2602.11481
– Owens Valley Paiute Language Research: https://viterbischool.usc.edu/news/2024/06/imagine-hearing-a-distant-relative-telling-stories-in-a-nearly-forgotten-language-what-would-you-do/
Keywords
Artificial Intelligence, GPT-5, compiler feedback loop, Idris programming language, inference-time adaptation, low-resource languages, iterative learning, AI generalization, computational modeling, code debugging, AI autonomy

