
Using onomatopoeic words to reproduce an ambulance siren or the meow of a cat asking for food comes as naturally to people as putting an idea into words. Drawing on this human capacity, scientists have developed an AI system that can imitate sounds with a human-like voice. Remarkably, the model was never trained on, or exposed to, examples of human vocal imitation.
The Science Behind the Model
The AI is built on a model of the human vocal tract, which simulates how sound generated by the voice box is shaped by the throat, tongue, and lips. This model is driven by a cognitively inspired algorithm that lets it produce convincing imitations of a wide range of sounds, from rustling leaves to the hiss of a snake. The system can also work in reverse: given a human vocal imitation, it can guess which real-world sound is being imitated.
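The idea of shaping a raw voice-box signal with the throat, tongue, and lips can be illustrated with a classic source-filter sketch: a glottal pulse train passed through resonators tuned to formant frequencies. This is a minimal toy, not the researchers' actual model; the pitch, formant values, and bandwidths below are illustrative assumptions.

```python
import numpy as np

def glottal_source(f0, duration, sr):
    """Impulse train approximating glottal pulses at pitch f0 (Hz)."""
    n = int(duration * sr)
    src = np.zeros(n)
    period = int(sr / f0)
    src[::period] = 1.0
    return src

def formant_filter(signal, freq, bandwidth, sr):
    """Two-pole resonator that boosts the spectrum around one formant."""
    r = np.exp(-np.pi * bandwidth / sr)       # pole radius from bandwidth
    theta = 2 * np.pi * freq / sr             # pole angle from center freq
    a1, a2 = -2 * r * np.cos(theta), r * r
    out = np.zeros_like(signal)
    for i in range(len(signal)):
        out[i] = signal[i]
        if i >= 1:
            out[i] -= a1 * out[i - 1]
        if i >= 2:
            out[i] -= a2 * out[i - 2]
    return out

sr = 16000
src = glottal_source(f0=120, duration=0.5, sr=sr)
# Cascade resonators roughly matching an "ah" vowel (values are guesses).
voiced = src
for freq, bw in [(700, 80), (1200, 90), (2600, 120)]:
    voiced = formant_filter(voiced, freq, bw, sr)
voiced /= np.max(np.abs(voiced))  # normalize to [-1, 1]
```

Changing the formant frequencies while keeping the same source is, in miniature, what the tongue and lips do to a fixed voice-box signal.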
To refine the model, the researchers built three successive versions. The final version adds a reasoning-like step that mimics human communicative behavior: it focuses on the most distinctive features of a sound while expending as little vocal effort as possible. This trade-off yields imitations that sound far more natural and human-like.
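The trade-off described above can be sketched as a simple scoring rule: each candidate imitation is scored by how well it matches the target sound, minus a penalty for the effort of producing it. The feature vectors, the scalar effort cost, and the linear weighting below are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def imitation_score(candidate, target, effort, weight=0.3):
    """Trade off perceptual match against articulatory effort.

    candidate/target: toy acoustic feature vectors; effort: scalar cost
    of producing the candidate. Higher scores are better.
    """
    similarity = -np.linalg.norm(candidate - target)
    return similarity - weight * effort

target = np.array([1.0, 0.2, 0.5])
candidates = [
    (np.array([1.0, 0.2, 0.5]), 5.0),  # perfect match, high effort
    (np.array([0.9, 0.3, 0.5]), 1.0),  # close match, low effort
    (np.array([0.1, 0.9, 0.0]), 0.5),  # poor match, minimal effort
]
best = max(candidates, key=lambda c: imitation_score(c[0], target, c[1]))
```

Under this rule the close-but-cheap candidate wins over the perfect-but-costly one, mirroring how people settle for a "good enough" imitation rather than an exhausting exact one.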
AI: Testing and Results
In listening experiments, human judges often rated the model's imitations more favorably than imitations produced by people, even for reference sounds such as motorboats and gunshots. These results suggest the model captures something of how the human brain decides what to reproduce when imitating a sound.
Although this line of AI research is still young, it has potential applications in sound design, language learning, and even the study of animal vocalizations such as birdsong. For artists and filmmakers, it could streamline the creation of context-appropriate sound effects. Musicians could also benefit from tools that search sound libraries using a vocal imitation as the query.
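Query-by-vocal-imitation search can be sketched as nearest-neighbor lookup over audio embeddings: the user's imitation is embedded in the same space as the sound library, and entries are ranked by similarity. The random stand-in embeddings and library names below are purely hypothetical; a real system would use learned audio features.

```python
import numpy as np

# Toy library of sound-effect "embeddings" (random stand-ins).
rng = np.random.default_rng(0)
library = {name: rng.normal(size=8)
           for name in ["siren", "meow", "rustle", "hiss"]}

def search_by_imitation(query_embedding, library):
    """Rank library entries by cosine similarity to the query embedding."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(library,
                  key=lambda name: cosine(query_embedding, library[name]),
                  reverse=True)

# Pretend the user's vocal imitation embeds close to the "meow" entry.
query = library["meow"] + 0.05 * rng.normal(size=8)
ranked = search_by_imitation(query, library)
```

The top-ranked entry is the library sound whose embedding best matches the imitation, which is exactly the interaction a musician's "hum to search" tool would offer.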
Future Directions
The current model reproduces most phonemes, vowels, and many consonants well, but it still struggles with certain consonants and with imitations of speech and musical sounds that depend on context. The researchers expect to develop these aspects further and to generalize the approach toward models of language acquisition and interaction. Presented at SIGGRAPH Asia, this research is an important step in understanding the role AI can play in human auditory communication.
Author
Muhammad Hashir is the proud owner of Spotlight Celeb, a platform dedicated to delivering the latest and most engaging celebrity news. As a passionate writer, he excels in crafting compelling stories that dive into the lives, achievements, and journeys of stars from around the world. With a keen eye for detail and a flair for storytelling, he brings a unique perspective to celebrity culture, connecting readers to the glitz and glamour of the entertainment industry while uncovering the human side of fame.