PolyU Pioneers Protein-Based Data Storage Breakthroughs Amid AI-Driven Data Surge, Achieving Superior Capacity, Stability, and Encryption – 15 May 2026

May 28, 2026
in Technology and Engineering

Digital data is growing at an unprecedented rate, driven largely by artificial intelligence (AI) development, big data analytics, and the proliferation of smart devices, and the demand for sustainable, efficient storage is growing with it. Traditional storage platforms, such as hard drives and cloud infrastructures, are approaching their practical limits: they face elevated costs, thermodynamic inefficiencies, capacity ceilings, and degradation over time. To address these challenges, an interdisciplinary team from The Hong Kong Polytechnic University (PolyU) has unveiled a method that uses engineered proteins as carriers for digital information. The approach advances storage capacity and stability and also demonstrates robust encryption and random access, pointing to a new direction for data preservation and retrieval.

The work was led by Professor Zhongping YAO, Associate Head of the Department of Applied Biology and Chemical Technology at PolyU. His collaborative team includes Dr. Cheuk-chi NG and Professor Chung-Ming Francis LAU and combines expertise in protein engineering, synthetic biology, biochemistry, analytical chemistry, and computer science. Their findings, published in Nature Communications, describe a full storage cycle in de novo designed unnatural proteins: digital data encoding, protein expression in living cells, and data retrieval through mass spectrometric analysis. The result points to proteins as a sustainable and scalable medium with notable longevity and functional versatility.

Digital information is inherently binary and is conventionally stored as sequences of 0s and 1s in electronic or magnetic media. Storing such data in molecules means encoding binary strings as sequences of monomer units in a macromolecule. DNA has long served as the prototype because it stores information naturally, but it has intrinsic limitations: its four nucleotide monomer types limit storage density, and it is susceptible to chemical and enzymatic degradation. Previous work by Prof. Yao’s group explored peptides (polymers of amino acids) as alternative carriers. Peptides draw on 20 natural amino acid monomers and a host of non-natural analogs, which allows higher information density and greater molecular stability. However, their relatively short sequences and costly chemical synthesis restricted their use for large-scale data storage.
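To make the density comparison concrete, here is a minimal Python sketch of how binary data could be mapped onto amino acid letters. It is an illustration only, not the encoding scheme used in the published work: the 16-letter alphabet, the function names, and the choice of 4 bits per residue are all assumptions made for this example.

```python
import math


def bits_per_monomer(alphabet_size: int) -> float:
    """Upper bound on bits per monomer if all monomer types are equally usable."""
    return math.log2(alphabet_size)


# Hypothetical alphabet (an assumption for this sketch): 16 of the 20 natural
# amino acids, so each residue carries exactly 4 bits. Isoleucine is left out
# because it has the same mass as leucine; the other omissions are arbitrary.
ALPHABET = "ADEFGHKLNPQRSTVY"


def encode(data: bytes) -> str:
    """Map each byte to two residues (high 4 bits, then low 4 bits)."""
    residues = []
    for byte in data:
        residues.append(ALPHABET[byte >> 4])
        residues.append(ALPHABET[byte & 0x0F])
    return "".join(residues)


def decode(sequence: str) -> bytes:
    """Invert encode(); raises ValueError on letters outside ALPHABET."""
    if len(sequence) % 2:
        raise ValueError("sequence length must be even")
    values = [ALPHABET.index(residue) for residue in sequence]
    return bytes((hi << 4) | lo for hi, lo in zip(values[0::2], values[1::2]))


if __name__ == "__main__":
    print(bits_per_monomer(4))             # DNA: 2.0 bits per nucleotide
    print(round(bits_per_monomer(20), 2))  # 20 amino acids: about 4.32 bits
    print(encode(b"Hi"))                   # GNKP
    print(decode(encode(b"PolyU")))        # b'PolyU'
```

With four nucleotide types, DNA tops out at 2 bits per monomer, while a 20-letter amino acid alphabet allows roughly 4.32 bits in principle. The sketch gives up a little of that headroom in exchange for a simple byte-aligned mapping.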

Expanding on this foundation, the team’s innovative leap involves harnessing full-length proteins as data storage media. Unlike peptides, proteins possess vastly longer sequences of amino acids, which drastically elevates potential storage density and efficiency. Moreover, proteins can be biosynthesized enzymatically within living cells, such as genetically engineered Escherichia coli strains, circumventing expensive chemical syntheses and enabling mass production at scale. Proteins also demonstrate enhanced stability during storage, whether in dry powdered forms or aqueous solutions, sometimes enduring conditions that degrade DNA swiftly. This combination fosters a sustainable, cost-effective platform primed for the demands of future data-intensive landscapes.

However, transitioning to protein-based data storage involves surmounting formidable challenges. First, the data-encoded amino acid sequences typically manifest as highly irregular and non-naturally evolved patterns, which often impair protein folding, solubility, and stability, complicating both design and expression within host organisms. Second, contemporary protein sequencing techniques primarily aim to identify proteins by matching partial sequences to known databases rather than reconstructing entire sequences—yet full-length sequence retrieval is imperative to accurately decode embedded digital information. The PolyU team confronted these hurdles with originality and ingenuity.

Drawing inspiration from collagen’s biophysical properties—an archetypal natural protein famed for its striking stability and longevity—the researchers engineered a collagen-like protein scaffold to serve as a robust backbone. This scaffold enhances structural integrity and resists chemical degradation, providing an ideal framework for incorporating data-encoding segments within its architecture. Through precise genetic engineering, these bespoke sequences were inserted into the collagen template, enabling successful expression of the hybrid protein constructs in E. coli. Such bio-fabrication methodologies mark a crucial advance, marrying synthetic biology with information technology.

Subsequent data retrieval involved enzymatic digestion of the expressed proteins into smaller peptide fragments, followed by comprehensive liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis. This analytical regimen allowed discrete peptide sequences to be identified with high resolution. The mass spectrometric data were then processed using specially developed algorithmic software capable of assembling full-length protein sequences from overlapping peptide fragments. This sophisticated bioinformatics pipeline also incorporated error-correcting codes to rectify minor sequence ambiguities, collectively ensuring that the original digital bit strings could be reconstructed with remarkable accuracy, thereby validating the feasibility of the entire storage-retrieval cycle.
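The idea of rebuilding a full-length sequence from overlapping peptide fragments can be sketched in a few lines of Python. This is a simple greedy overlap merge written for illustration, not the team's software. The function names, the minimum overlap of three residues, and the example fragments are assumptions, and the sketch ignores measurement errors and error-correcting codes entirely.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that is a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments: list[str], min_len: int = 3) -> str:
    """Greedily merge the pair of fragments with the largest overlap until one remains."""
    if not fragments:
        raise ValueError("no fragments given")
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            raise ValueError("fragments do not overlap enough to assemble")
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags[0]


if __name__ == "__main__":
    # Hypothetical peptide reads covering one protein segment, in arbitrary order.
    reads = ["FTSVQRLH", "GNKPAD", "PADEFTS"]
    print(assemble(reads))  # GNKPADEFTSVQRLH
```

A greedy merge like this can go wrong on repetitive sequences, where several overlaps look equally good. That is one reason a real pipeline would need the kind of error correction the article describes.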

Comparisons with previously developed peptide-based systems further underscored the advantages of protein-based storage. Those peptide systems had already demonstrated notable stability under conditions relevant to space exploration missions in China’s next-generation manned spacecraft. Prof. Yao emphasized that “the protein samples in our research achieved 30 times the storage density at only 10% of the cost of the peptide-based method.” Additionally, unlike DNA, which degrades rapidly in acidic environments or aqueous solutions, the protein samples remained intact and readable after protracted periods, underscoring their chemical resilience.

Beyond basic data encoding, the team advanced the concept of “functionalizing” these proteins to implement random data access and cryptographic protections. Traditional molecular storage systems require decoding the entire data set to extract specific information segments—a process that is inefficient and inflexible. By grafting specific affinity tags onto data-bearing proteins, the researchers enabled selective binding and isolation of targeted sequences via corresponding antibodies, permitting random access to discrete data portions without full dataset decoding. Moreover, embedding encrypted messages into these proteins and selectively recovering them only with predefined affinity compounds demonstrated an innovative approach to molecular-level data encryption, underscoring the potential for secure information storage at the biochemical scale.
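Tag-based random access can be modelled as a toy Python sketch. The article does not say which affinity tags were used. The FLAG and HA epitope sequences below are well-known tags chosen purely for illustration, and placing the tag at the start of the sequence, like the function names, is an assumption of this example.

```python
# Illustrative affinity tags (assumed, not taken from the article).
TAGS = {
    "FLAG": "DYKDDDDK",
    "HA": "YPYDVPDYA",
}


def tag_protein(tag_name: str, payload: str) -> str:
    """Prepend an affinity tag to a data-bearing sequence."""
    return TAGS[tag_name] + payload


def pull_down(pool: list[str], tag_name: str) -> list[str]:
    """Model antibody capture: keep only molecules carrying the tag, then strip it."""
    tag = TAGS[tag_name]
    return [protein[len(tag):] for protein in pool if protein.startswith(tag)]


if __name__ == "__main__":
    # A mixed pool of three hypothetical data-bearing proteins.
    pool = [
        tag_protein("FLAG", "GNKP"),
        tag_protein("HA", "ADEF"),
        tag_protein("FLAG", "TSVQ"),
    ]
    print(pull_down(pool, "FLAG"))  # ['GNKP', 'TSVQ']
    print(pull_down(pool, "HA"))    # ['ADEF']
```

Only the molecules carrying the requested tag are isolated and read, so the rest of the pool never needs to be sequenced. The same selectivity underlies the encryption idea: without the matching affinity reagent, the tagged message stays mixed in with everything else.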

Professor Yao highlighted the broader implications of protein-based data storage: “Their inherent biocompatibility implies the intriguing prospect of embedding digital archives within living organisms, opening new frontiers in biological data integration.” This prospect invites visionary applications spanning bioinformatics, synthetic biology, and personalized medicine, where biological systems could not only process but inherently retain digital information. The research team envisages next-generation endeavors targeting mass storage scalability, acceleration of data writing and reading processes, reduction of biosynthetic costs, and diversification of protein scaffolds to incorporate additional functionalities and improved performance parameters.

This pioneering work intersects multiple high-impact disciplines, spanning protein engineering, synthetic biology, analytical chemistry, bioinformatics, and computer science. Beyond pushing the scientific envelope, it addresses a pressing societal need precipitated by the deluge of AI-generated data worldwide. Supported by the Hong Kong Research Grants Council through the Collaborative Research Fund and Research Impact Fund, this breakthrough represents a beacon of innovation in sustainable, ultra-dense molecular data storage technologies.


Subject of Research: Protein-based molecular data storage and retrieval using engineered collagen-like proteins expressed via E. coli.

Article Title: Data storage and retrieval with unnatural proteins expressed via E. coli

News Publication Date: 28-Feb-2026

Web References:
https://www.nature.com/articles/s41467-026-70061-7
http://dx.doi.org/10.1038/s41467-026-70061-7

Image Credits: PolyU

Keywords

Proteins, Data storage, Collagen, Artificial intelligence, Sequence analysis, Biochemistry

Tags: AI-driven big data storage solutions, biochemical data storage innovations, engineered protein information carriers, high-capacity biomolecular storage, Hong Kong Polytechnic University research, interdisciplinary synthetic biology research, next-generation data storage breakthroughs, overcoming traditional storage limitations, protein-based data storage technology, random access protein storage systems, stable protein data encryption methods, sustainable digital data preservation