Thursday, May 26, 2022
SCIENMAG: Latest Science and Health News

Mimicking the brain to realize ‘human-like’ virtual assistants

February 3, 2022
in Mathematics

Ishikawa, Japan — Speech is more than just a form of communication. A person’s voice conveys emotions and personality and is a unique trait we can recognize. Our use of speech as a primary means of communication is a key reason for the development of voice assistants in smart devices and technology. Typically, virtual assistants analyze speech and respond to queries by converting the received speech signals into a model they can understand and process to generate a valid response. However, they often have difficulty capturing and incorporating the complexities of human speech and end up sounding very unnatural.

Figure 1. A representation of the algorithm used to mimic human speech.

Credit: Masashi Unoki from JAIST.

Now, in a study published in the journal IEEE Access, Professor Masashi Unoki of the Japan Advanced Institute of Science and Technology (JAIST) and Dung Kim Tran, a doctoral student at JAIST, have developed a system that can capture the information in speech signals in a manner similar to how we perceive speech.

“In humans, the auditory periphery converts the information contained in input speech signals into neural activity patterns (NAPs) that the brain can identify. To emulate this function, we used a ‘matching pursuit algorithm’ to obtain ‘sparse representations’ of speech signals, that is, signal representations with the minimum possible number of significant coefficients,” explains Prof. Unoki. “We then used psychoacoustic principles, such as the equivalent rectangular bandwidth scale, the gammachirp function, and masking effects, to ensure that the auditory sparse representations are similar to those of the NAPs.”
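The greedy matching-pursuit step Prof. Unoki describes can be illustrated in isolation. The sketch below is only a rough illustration: it uses a random unit-norm dictionary rather than the paper's psychoacoustically shaped gammachirp atoms, and all names and parameters are assumptions, not taken from the paper. At each iteration, the atom best correlated with the residual is selected, its contribution is recorded as a coefficient, and the residual is updated.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iters):
    """Greedily approximate `signal` as a sparse combination of
    dictionary atoms (columns, assumed to have unit norm)."""
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iters):
        # Correlate the residual with every atom and keep the best match.
        projections = dictionary.T @ residual
        best = int(np.argmax(np.abs(projections)))
        coeffs[best] += projections[best]
        residual -= projections[best] * dictionary[:, best]
    return coeffs, residual

# Toy demo: an overcomplete random dictionary with unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 3] - 1.5 * D[:, 100]      # a 2-sparse test signal
c, r = matching_pursuit(x, D, n_iters=10)
# The invariant D @ c + r == x holds exactly throughout, and the
# residual shrinks toward zero as more coefficients are selected.
```

The number of iterations caps the number of non-zero coefficients, so a sparser representation trades reconstruction quality for compactness, which is the tension the paper's evaluation measures.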

To test the effectiveness of their model in understanding voice commands and generating an understandable and natural response, the duo performed experiments to compare the signal reconstruction quality and the perceptual structures of the auditory representations against conventional methods. “The effectiveness of an auditory representation can be evaluated in terms of three aspects: the quality of the resynthesized speech signals, the number of non-zero elements, and the ability to represent perceptual structures of speech signals,” elaborates Prof. Unoki.

To evaluate the quality of the resynthesized speech signals, the duo reconstructed 630 speech samples spoken by different speakers. The resynthesized signals were then rated using PEMO-Q and PESQ scores, two objective measures of sound quality, and were found to be comparable to the original signals. Additionally, they generated auditory representations of certain phrases spoken by six speakers.

The duo also tested the model's ability to capture voice structures accurately, using a pattern-matching experiment to determine whether the auditory representations of the phrases could be matched to spoken utterances or queries made by the same speakers.
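The paper's exact matching procedure is not spelled out in this summary, but the general shape of such an experiment can be sketched: score a query's representation against each stored representation and return the best match. The function names and the use of cosine similarity below are illustrative assumptions only.

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity of two coefficient vectors, ignoring overall scale."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_query(query_rep, reference_reps):
    """Index of the stored representation most similar to the query."""
    scores = [cosine_similarity(query_rep, ref) for ref in reference_reps]
    return int(np.argmax(scores))

# Toy demo with hand-made "representations".
refs = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 3.0, 0.0])]
query = np.array([0.9, 0.1, 1.8])   # a noisy version of refs[0]
print(match_query(query, refs))     # → 0
```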

“Our results showed that the auditory sparse representations produced by our method can achieve high-quality resynthesized signals with only 1066 coefficients per second. Furthermore, the proposed method also provides the highest matching accuracy in a pattern-matching experiment,” comments Prof. Unoki.
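The “coefficients per second” figure is simply the sparsity of the representation normalized by signal duration. A minimal sketch of that bookkeeping (variable names assumed, not from the paper):

```python
import numpy as np

def coeffs_per_second(sparse_coeffs, duration_s):
    """Average number of non-zero coefficients per second of audio."""
    return np.count_nonzero(sparse_coeffs) / duration_s

c = np.zeros(10_000)
c[::10] = 1.0                       # 1000 non-zero coefficients
print(coeffs_per_second(c, 2.0))    # → 500.0
```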

From smartphones to smart televisions and even smart cars, the role of voice assistants is becoming more and more indispensable in our daily lives. The quality and the continued usage of these services will rely on their ability to understand our accents and our pronunciation and respond in a way we find natural. The model developed in this study could go a long way in imparting human-like qualities to our voice assistants, making our interactions not only more convenient but also psychologically satisfying.

 

###

 

Reference

Title of original paper: Matching Pursuit and Sparse Coding for Auditory Representation
Journal: IEEE Access
DOI: 10.1109/ACCESS.2021.3135011

 

 

About Japan Advanced Institute of Science and Technology, Japan

Founded in 1990 in Ishikawa prefecture, the Japan Advanced Institute of Science and Technology (JAIST) was the first independent national graduate school in Japan. Now, after 30 years of steady progress, JAIST has become one of Japan’s top-ranking universities. JAIST has multiple satellite campuses and strives to foster capable leaders with a state-of-the-art education system in which diversity is key; about 40% of its alumni are international students. The university has a unique style of graduate education based on a carefully designed coursework-oriented curriculum to ensure that its students have a solid foundation on which to carry out cutting-edge research. JAIST also works closely with both local and overseas communities by promoting industry–academia collaborative research.

 

About Professor Masashi Unoki from Japan Advanced Institute of Science and Technology, Japan

Dr. Masashi Unoki is a Professor in the School of Information Science at the Japan Advanced Institute of Science and Technology (JAIST), where he received his M.S. and Ph.D. degrees in 1996 and 1999, respectively. His main research interests lie in auditory-motivated signal processing and the modeling of auditory systems. Dr. Unoki received the Sato Prize for an Outstanding Paper from the Acoustical Society of Japan (ASJ) in 1999, 2010, and 2013, and the Yamashita Taro “Young Researcher” Prize from the Yamashita Taro Research Foundation in 2005. He has published 198 papers and authored 14 books to date.

 

Funding information

This work was supported in part by the Grant-in-Aid for Scientific Research (B) under Grant 17H01761, in part by JSPS KAKENHI under Grant 20J20580, in part by the Fund for the Promotion of Joint International Research (Fostering Joint International Research (B)) under Grant 20KK0233, in part by I-O DATA Foundation, and in part by KDDI Foundation (Research Grant Program).



Journal: IEEE Access
DOI: 10.1109/ACCESS.2021.3135011
Article Title: Matching Pursuit and Sparse Coding for Auditory Representation
Article Publication Date: 13-Dec-2021

© 2022 Scienmag- Science Magazine: Latest Science News.
