Thursday, March 30, 2023
SCIENMAG: Latest Science and Health News

An innovative computational tool that integrates and analyzes different health databases

October 4, 2022

Brazilian researchers have created an innovative and agile computational tool to link and analyze different health databases with millions of patient records. Called Tucuxi-BLAST, the platform encodes identification records in a database, such as patient name, mother’s name and place of birth, using letters that represent the nucleotides in a DNA sequence (A, T, C or G). This “conversion” of individuals to DNA enables accurate record linkage across databases despite typographical errors and other inconsistencies.

The tool can be used in research, epidemiological analysis and public policy formulation.

For example, people who have been vaccinated by the SUS, Brazil’s national health service, can be cross-referenced to other datasets to find vaccinated patients with a specific disease. Even if a vaccination record contains errors or incomplete fields, Tucuxi-BLAST is able to link it to the same patient in another database because it treats inconsistencies as if they were DNA mutations. Genomics tools routinely compare sequence fragments to decide whether they are more similar than different and whether the sequences in question should be linked. If each individual corresponds to a sequence of letters, data from different repositories can be cross-referenced and linked by the tool.
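The idea of treating typos as mutations can be illustrated with a minimal Python sketch. It is not the Tucuxi-BLAST implementation: the fixed codon table, the function names `encode_to_dna` and `similarity`, and the sample names are all assumptions made for illustration, and Python's `difflib.SequenceMatcher` stands in for BLAST.

```python
from difflib import SequenceMatcher
from itertools import product
import string

# Hypothetical fixed codon table: one 3-letter codon per supported character.
ALPHABET = string.ascii_uppercase + string.digits + " "
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]  # 64 possible codons
CODON_TABLE = dict(zip(ALPHABET, CODONS))

def encode_to_dna(text: str) -> str:
    """Encode text as a DNA-like string, skipping unsupported characters."""
    return "".join(CODON_TABLE[ch] for ch in text.upper() if ch in CODON_TABLE)

def similarity(record_a: str, record_b: str) -> float:
    """Return a 0..1 similarity ratio between two DNA-encoded records.

    SequenceMatcher is a stand-in for a real sequence aligner such as BLAST.
    """
    dna_a = encode_to_dna(record_a)
    dna_b = encode_to_dna(record_b)
    return SequenceMatcher(None, dna_a, dna_b).ratio()

# A one-letter typo changes a single codon, like a point mutation.
same_person = similarity("MARIA DA SILVA", "MARIA DA SILVS")
different_person = similarity("MARIA DA SILVA", "JOAO PEREIRA")
print(same_person > different_person)
```

In this sketch a misspelled name alters only one codon, so the two encoded sequences remain far more similar to each other than to the sequence of an unrelated name.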

“The SUS is a valuable source of information for medical and epidemiological research because it stores health data for millions of patients. However, records relating to diseases and other types of data are stored in different databases that don’t always talk to each other. The method we’ve developed is able to effect record linkage accurately and at great speed,” Helder Nakaya, corresponding author of an article on the study published in the journal PeerJ, told Agência FAPESP. 

Nakaya is an immunologist affiliated with the University of São Paulo’s School of Pharmaceutical Sciences (FCF-USP), the Albert Einstein Jewish Hospital (HIAE), the Scientific Platform Pasteur-USP, and the Todos pela Saúde Institute. He also belongs to the Center for Research on Inflammatory Diseases (CRID), one of the Research, Innovation and Dissemination Centers (RIDCs) funded by FAPESP.

The study was also supported by FAPESP via two other projects (18/14933-2 and 19/27139-5).

Using the tool in practice

Tucuxi-BLAST was already in practical use before the article was published. It was used, for example, to cross-reference four years of data from the Ministry of Health’s Malaria Surveillance System with clinical data from the Dr. Heitor Vieira Dourado Tropical Medicine Foundation (in Manaus, Amazonas state), a branch of the Oswaldo Cruz Foundation (Fiocruz), another arm of the ministry.

The result showed that being HIV positive is a risk factor for patients with Plasmodium vivax malaria, representing an additional challenge for public policy. Because the databases lacked a single unique identifier, Tucuxi-BLAST used patient name, mother’s name and date of birth. The findings were described in an article published in May 2022 in Scientific Reports.

The study was led by researchers at Amazonas State University (UEA). Nakaya and FCF-USP’s José Deney Alves Araújo, first author of the PeerJ article, also participated. Araújo named the tool Tucuxi in honor of Sotalia fluviatilis, a freshwater dolphin that inhabits the rivers of the Amazon Basin.

BLAST (Basic Local Alignment Search Tool) refers to a suite of programs used in bioinformatics to generate alignments between nucleotide or protein sequences across large databases.

How it works

To develop the new method, the scientists translated patient data into DNA sequences using a codon wheel that changed dynamically over different runs without impairing the efficiency of the process. Codons are sequences of three nucleotides in a DNA or RNA molecule that code for a specific amino acid. Codon wheels are used to identify the amino acid encoded by any DNA or RNA codon.

This encoding scheme enabled real-time data encryption, thus providing an additional layer of privacy during the linking process. “It used DNA to encrypt the information and guarantee privacy,” Nakaya said.
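A codon wheel that changes between runs can be pictured as a character-to-codon table that is reshuffled each time. The Python sketch below illustrates that idea under stated assumptions: the seed-based shuffle, the names `make_codon_wheel` and `encode`, and the 37-symbol character set are hypothetical and are not taken from the study. A seeded shuffle like this only obscures the data and should not be read as a secure encryption scheme.

```python
import random
import string
from itertools import product

# Hypothetical character set; 64 codons are enough to cover its 37 symbols.
ALPHABET = string.ascii_uppercase + string.digits + " "
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def make_codon_wheel(seed: int) -> dict:
    """Build a character -> codon table whose assignment depends on the seed."""
    shuffled = CODONS.copy()
    random.Random(seed).shuffle(shuffled)  # a new seed gives a new wheel
    return dict(zip(ALPHABET, shuffled))

def encode(text: str, wheel: dict) -> str:
    """Encode text with the given wheel, skipping unsupported characters."""
    return "".join(wheel[ch] for ch in text.upper() if ch in wheel)

wheel_run_1 = make_codon_wheel(seed=1)
wheel_run_2 = make_codon_wheel(seed=2)

name = "ANA SOUZA"
encoded_run_1 = encode(name, wheel_run_1)
encoded_run_2 = encode(name, wheel_run_2)
print(encoded_run_1)
print(encoded_run_2)
```

Within one run, every database is encoded with the same wheel, so identical names still produce identical sequences and can be aligned. A sequence saved from one run is hard to read in another run, because the wheel will almost certainly differ.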

The DNA-encoded identification fields were compared using BLAST, and machine learning algorithms automatically classified the final results. 
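The two-stage flow described here, comparing fields and then classifying the candidate pair, can be sketched as follows. This is a simplified illustration, not the published pipeline: the field names, the sample records and the `0.85` threshold are assumptions, `difflib.SequenceMatcher` on plain strings replaces BLAST on DNA-encoded fields, and a fixed threshold on the mean score replaces the trained machine learning classifier.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "mother_name", "birth_date")  # hypothetical field names

def field_scores(rec_a: dict, rec_b: dict) -> list:
    """Compute one 0..1 similarity score per identification field."""
    return [SequenceMatcher(None, rec_a[f], rec_b[f]).ratio() for f in FIELDS]

def is_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Classify a record pair as a link when the mean field score is high.

    A trained classifier would take the place of this fixed threshold rule.
    """
    scores = field_scores(rec_a, rec_b)
    return sum(scores) / len(scores) >= threshold

vaccination = {"name": "JOSE ALMEIDA", "mother_name": "RITA ALMEIDA",
               "birth_date": "1980-05-17"}
clinic_typo = {"name": "JOSE ALMEDA", "mother_name": "RITA ALMEIDA",
               "birth_date": "1980-05-17"}
clinic_other = {"name": "CARLA NUNES", "mother_name": "LUCIA NUNES",
                "birth_date": "1992-11-03"}

print(is_match(vaccination, clinic_typo))
print(is_match(vaccination, clinic_other))
```

Scoring each field separately keeps the comparison step apart from the decision step, which is where a learned model could weigh fields differently.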

As in comparative genomics, where genes from different genomes are compared to determine common and unique sequences, Tucuxi-BLAST also permits the simultaneous integration of data from multiple administrative databases without the need for complex data pre-processing. 

In the study, the group tested Tucuxi-BLAST and compared it with existing methods using a simulated database containing 300 million records, as well as four large administrative databases containing data for real cases of patients infected with different pathogens.

The conclusion was that Tucuxi-BLAST successfully processed record linkages for the largest dataset (200,000 records), despite misspellings and other errors and omissions, in a fifth of the time: 23 hours, compared with 127 hours (five days and seven hours) for the state-of-the-art method.
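The reported timings can be checked with a few lines of arithmetic. The variable names below are illustrative; the figures are the ones quoted above.

```python
# Timings quoted in the article for the largest dataset.
baseline_hours = 5 * 24 + 7   # five days and seven hours
tucuxi_hours = 23

speedup = baseline_hours / tucuxi_hours    # how many times faster
fraction = tucuxi_hours / baseline_hours   # share of the baseline time

print(baseline_hours)        # 127
print(round(speedup, 1))     # 5.5
print(round(fraction, 2))    # 0.18
```

Five days and seven hours is indeed 127 hours, and 23 hours is about 18% of that, or roughly a fifth.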

The researchers set up a website where users can translate words, phrases and names into DNA.

Several countries, such as the UK, Canada and Australia, have invested in successful initiatives to integrate databases and develop novel data analysis strategies, Nakaya noted.

A Brazilian example is the Center for Health Data and Knowledge Integration (CIDACS/Fiocruz), which has integrated administrative and health databases to assemble records for 114 million people.

About São Paulo Research Foundation (FAPESP)

The São Paulo Research Foundation (FAPESP) is a public institution with the mission of supporting scientific research in all fields of knowledge by awarding scholarships, fellowships and grants to investigators linked with higher education and research institutions in the State of São Paulo, Brazil. FAPESP is aware that the very best research can only be done by working with the best researchers internationally. Therefore, it has established partnerships with funding agencies, higher education, private companies, and research organizations in other countries known for the quality of their research and has been encouraging scientists funded by its grants to further develop their international collaboration. You can learn more about FAPESP at www.fapesp.br/en and visit FAPESP news agency at www.agencia.fapesp.br/en to keep updated with the latest scientific breakthroughs FAPESP helps achieve through its many programs, awards and research centers. You may also subscribe to FAPESP news agency at http://agencia.fapesp.br/subscribe



Journal: PeerJ

DOI: 10.7717/peerj.13507

Article Title: Tucuxi-BLAST: Enabling fast and accurate record linkage of large-scale health-related administrative databases through a DNA-encoded approach

Article Publication Date: 11-Jul-2022

© 2023 Scienmag- Science Magazine: Latest Science News.
