The Elhuyar Foundation, headquartered in Usurbil (Gipuzkoa), has created and launched a voice recognition program that automatically transcribes and subtitles audio and video in Basque and Spanish. The technology, Aditu, can serve as the basis for introducing the Basque language into devices that allow interaction between people and machines. Jakes Goikoetxea wrote the original report for Berria, which we have condensed here for our readers.
The interview on which this report is based could have been transcribed with Aditu: take the WMA audio file generated by the recorder, convert it to one of the audio formats Aditu accepts, drag it from the computer’s desktop to the box on the Aditu website, request a transcription, and that’s it! A 40-minute interview with Igor Leturia is converted into text in the blink of an eye. After reviewing the text, and listening to the audio again wherever a passage raised doubts, it would have been ready to edit, cut, paste, and trim. It is a great tool for journalists: a task that would have taken two or three hours is reduced to practically a single click. The requirements are audio of a certain quality and an interview conducted in standard Basque.
Aditu is a bilingual voice recognition program created by the Elhuyar Foundation. It is a platform that converts audio and video in Basque and Spanish into text in those languages. It can transcribe audio, transcribe and subtitle videos, and process audio and video archives. It can also transcribe live: if you are speaking into a microphone on a phone or computer, it transcribes what is said in real time, like dictation. This makes it possible to create subtitles at live events. Combined with an automatic translator, it can also provide translation.
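To illustrate what subtitling involves once speech has been turned into timed text, here is a minimal sketch in Python that writes transcript segments in the widely used SubRip (.srt) subtitle layout. The report does not say what formats Aditu exports, so the `Segment` type, the function names, and the sample segments are all hypothetical and are not Aditu's own interface.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed piece of a transcript (hypothetical structure, not Aditu's)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # transcribed words for this interval


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, the SubRip timestamp layout."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Turn timed segments into numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{timing}\n{seg.text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Made-up sample segments, only to show the output shape.
    sample = [
        Segment(0.0, 2.5, "Kaixo, egun on."),
        Segment(2.5, 5.0, "Hola, buenos días."),
    ]
    print(to_srt(sample))
```

Each subtitle block carries a running number, a start and end time, and the text to display, which is why a transcription tool needs timing information as well as words in order to subtitle a video.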
Aditu was developed by the speech technology department at the Elhuyar Foundation, headed by Igor Leturia, who directs both the department and the Aditu project.
From Phonemes to Words
Aditu is based on neural networks and artificial intelligence, as well as on various products and services developed in language and speech technologies in recent years, such as Elhuyar's own translator.
“In the end, what lies behind neural networks are mathematical operations,” Igor Leturia explains. “In part, these mathematical calculations imitate the electrical signals in our brain.”
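As a rough illustration of the kind of mathematical operation Leturia refers to, the sketch below computes the output of a single artificial neuron: a weighted sum of its inputs passed through a sigmoid function. It is a generic textbook example with made-up numbers, not a description of Aditu's actual models, which the report does not detail.

```python
import math


def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    """Output of one artificial neuron: sigmoid of the weighted sum of its inputs."""
    # Weighted sum: each input is multiplied by its weight, then the bias is added.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The sigmoid squashes the sum into the range (0, 1), loosely like a neuron
    # that fires more or less strongly depending on the signal it receives.
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Illustrative values only; in a real network the weights are learned from data.
    print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))
```

A real speech recognition network chains a very large number of such operations, and training consists of adjusting the weights until the outputs match the example data.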
Neural networks have to be trained in order to learn. For that you need data, a lot of data. The computers of the past could not process all the necessary data; today's can. “Today, neural network technologies and the most powerful machines have coincided,” summarizes Leturia. “On the one hand, neural networks have been shown to be very suitable and effective for language technologies; on the other, today's computers allow neural networks to be trained with much more data. These types of technologies have taken a tremendous leap in a short time.”
Aditu is a paid service. The price depends on the number of hours to be transcribed. Nevertheless, anyone wishing to try it out can do so by registering and using the free trial.
(Summary in English of the original text published in Berria)