As Mark Hasegawa-Johnson combed through data from his latest project, he was pleasantly surprised to discover a recipe for eggs Florentine. Sifting through hundreds of hours of recordings, he said, was bound to turn up a treasure or two.
Hasegawa-Johnson leads the Speech Accessibility Project, an initiative at the University of Illinois Urbana-Champaign that aims to make speech recognition devices more useful for people with speech disabilities.
In the program’s first published study, researchers trained an automatic speech recognizer on up to 151 hours (almost six and a half days) of recordings of people with Parkinson’s disease-related speech disorders. Their model transcribed a new set of similar recordings with 30% greater accuracy than a control model that had never heard a Parkinson’s patient speak.
The study was published in the Journal of Speech, Language, and Hearing Research. The voice recordings used in the study are freely available to researchers, nonprofits, and companies looking to improve voice recognition tools.
“Our results show that a large database of atypical speech can significantly improve speech technology for people with disabilities,” said Hasegawa-Johnson, a professor of electrical and computer engineering at Illinois and a researcher at the university’s Beckman Institute for Advanced Science and Technology, where the program is housed. “I look forward to seeing how other organizations can use this data to make speech recognition devices more inclusive.”
Devices such as smartphones and virtual assistants use automatic speech recognition to derive meaning from sound, allowing people to queue up playlists, dictate messages hands-free, participate seamlessly in virtual meetings, and communicate clearly with friends and family.
Speech recognition technology does not work equally well for everyone, however, especially for people with neuromotor disorders such as Parkinson’s disease, which can cause strained, slurred, or discoordinated speech patterns, collectively known as dysarthria.
“Unfortunately, this means that many of the people who need voice-controlled devices the most may have the greatest difficulty using them,” Hasegawa-Johnson said.
“We know from existing research that if you train an ASR on someone’s voice, it will start to understand them more accurately. We asked: Can you train an automatic speech recognizer to understand people with Parkinson’s disease by exposing it to a small group of people with Parkinson’s?”
Hasegawa-Johnson and his colleagues recruited about 250 adults with varying degrees of dysarthria related to Parkinson’s disease. Before joining the study, prospective participants met with speech-language pathologists to confirm their eligibility.
“Many people who have struggled with communication disorders for a long time, especially progressive communication disorders, may withdraw from daily communication,” said Clarion Mendes, a speech-language pathologist on the team. “They may share fewer and fewer of their unique thoughts, needs, and ideas, believing that their communication is too impaired for meaningful conversation.”
“These are the people we are looking for,” she said.
Selected participants used their personal computers and smartphones to submit recordings. Working at their own pace, and with the optional help of caregivers, they repeated common voice commands such as “set an alarm,” read passages from novels, and shared opinions in response to open-ended prompts such as “Please explain the steps for making breakfast for four people.”
In response to the latter, one participant listed the steps for making eggs Florentine, hollandaise sauce and all, while another pragmatically suggested ordering delivery.
“We heard from many participants that the process was not only enjoyable but also gave them the confidence to communicate with their families again,” Mendes said. “This project brings hope, excitement, and vitality to many of our participants and their loved ones: qualities unique to humans.”
She said the team consulted Parkinson’s experts and community members to develop prompts relevant to participants’ lives. The prompts were both specific and spontaneous: training speech algorithms to recognize drug names, for example, may help end users communicate with their pharmacies, while casual conversation starters mimicked the rhythm of everyday chatter.
“We told participants: We know you can make your speech clearer by putting in all the effort you can, but you may be tired of having to work to be understood for the benefit of others. Try to relax and talk as if you were chatting with family on the sofa,” Mendes said.
To measure how effectively the speech algorithm listened and learned, the researchers divided the speech samples into three groups. The first group of 190 participants (151 hours of recordings) trained the model. As its performance improved, the researchers confirmed that the model was genuinely learning, rather than merely memorizing participants’ responses, by introducing it to a second, smaller set of recordings. When the model’s performance peaked on the second group, the researchers challenged it with a test set.
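The three-way split described above can be sketched in code. This is an illustrative sketch only: the study reports 190 participants in the training group and roughly 250 participants overall, but the validation and test group sizes used below (30 each), along with all function and variable names, are assumptions for illustration.

```python
import random

def split_by_speaker(speaker_ids, n_train, n_dev, seed=0):
    """Partition speakers (not individual recordings) into train,
    validation, and test groups, so the model is always evaluated
    on voices it never heard during training."""
    ids = sorted(set(speaker_ids))
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    dev = ids[n_train:n_train + n_dev]
    test = ids[n_train + n_dev:]
    return train, dev, test

# ~250 participants: 190 train the model, the rest verify its learning.
train, dev, test = split_by_speaker(range(250), n_train=190, n_dev=30)
print(len(train), len(dev), len(test))  # 190 30 30
```

Splitting by speaker rather than by recording is what lets the second group detect memorization: a model that merely memorized training speakers' voices would gain nothing on entirely new speakers.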
On average, research team members manually transcribed 400 recordings per participant to check the model’s work.
They found that after training on the first set, the ASR system transcribed recordings from the test set with a word error rate of 23.69%. For comparison, a system trained only on speech samples from people without Parkinson’s disease had a word error rate of 36.3% on the same test set, making it approximately 30% less accurate.
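Word error rate, the metric behind both figures, counts the word-level edits (substitutions, insertions, and deletions) needed to turn the model’s transcript into the human reference transcript, divided by the number of words in the reference. A minimal sketch follows; the example sentences are invented for illustration, not taken from the study’s data.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions)
    divided by the number of words in the reference transcript."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word ("the") and one substitution ("seven" -> "eleven")
# against a five-word reference: 2 / 5 = 0.4, i.e. a 40% word error rate.
print(wer("set the alarm for seven", "set alarm for eleven"))  # 0.4
```

A perfect transcript scores 0.0; the study’s adapted model, at 23.69%, got roughly three of every four words right by this measure.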
Error rates also dropped for almost all individuals in the test set. Even people with Parkinson’s disease whose speech was less typical, such as abnormally rapid speech or stuttering, saw modest improvements.
“I’m happy to see such a huge benefit,” Hasegawa-Johnson said.
He added that feedback from participants reinforced his enthusiasm:
“I interviewed a participant who was interested in the future of this technology,” he said. “That’s the beauty of this project: seeing how excited people are about the possibility that their smart speakers and phones might understand them. That’s exactly what we’re trying to do.”
Original article: https://medicalxpress.com/news/2024-09-automatic-speech-recognition-people-parkinson.html
More information: Mark Hasegawa-Johnson et al, “Community-supported shared infrastructure supports speech accessibility,” Journal of Speech, Language, and Hearing Research (2024). DOI: 10.1044/2024_JSLHR-24-00122
Provided by Beckman Institute for Advanced Science and Technology