Laryngo Voice
A laryngectomy is an operation in which the patient’s entire larynx, including the vocal cords, is removed. The permanent loss of the ability to speak changes the quality of a person’s life. This is something we decided to change, with help from our colleagues at the University of West Bohemia in Pilsen (ZČU).
The Laryngo Voice solution preserves a person’s unique voice and lets them continue to use it for communication. The patient’s voice is generated from stored recordings of phonetically rich sentences made before the operation.
The voice synthesizer is installed on a portable device (e.g. a mobile phone, tablet or notebook) and the patient can adjust the artificially created voice themselves. This cutting-edge solution ensures that the artificially generated voice closely resembles the patient’s original voice.
The voice preservation process
The first phase of the voice synthesis process is recording the patient’s voice before the operation. The recording takes place in a specialised recording studio.
Texts are carefully compiled and recorded so as to capture a wide range of the specific characteristics of the individual patient’s vocal expression. Once the patient’s voice has been recorded, the recording is checked and a phonetic-prosodic analysis is performed. The result is a voice module compatible with the following voice synthesis methods:
- Statistical-parametric voice synthesis (speech is generated on the basis of statistical models, using Markov models)
- Unit selection (speech is produced by joining carefully selected short segments of actual recorded speech)
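To illustrate the second method, here is a minimal sketch of how unit selection might choose which recorded segments to join. It is not the Laryngo Voice implementation. The `Unit` class, the single `pitch` feature and both cost functions are hypothetical simplifications; real systems use many acoustic and prosodic features. The sketch searches for the sequence of segments with the lowest combined cost, where the target cost measures how well a segment matches what is wanted and the join cost measures how smoothly two neighbouring segments connect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded speech segment (hypothetical, heavily simplified)."""
    phone: str    # phone label the segment realises
    pitch: float  # mean pitch in Hz, the only feature used in this sketch


def target_cost(unit, wanted_pitch):
    # How far the segment is from the pitch we want at this position.
    return abs(unit.pitch - wanted_pitch)


def join_cost(left, right):
    # How audible the seam between two neighbouring segments would be.
    return abs(left.pitch - right.pitch)


def select_units(targets, inventory):
    """Pick one unit per target so that total target + join cost is minimal.

    targets:   list of (phone, wanted_pitch) pairs
    inventory: dict mapping phone -> list of candidate Unit objects
    """
    if not targets:
        return []

    # table[i][j] = (best cost of a path ending in candidate j, backpointer)
    table = []
    for i, (phone, wanted) in enumerate(targets):
        row = []
        for cand in inventory[phone]:
            cost = target_cost(cand, wanted)
            if i == 0:
                row.append((cost, None))
                continue
            prev_cands = inventory[targets[i - 1][0]]
            best_prev, best_k = min(
                (table[i - 1][k][0] + join_cost(prev_cands[k], cand), k)
                for k in range(len(prev_cands))
            )
            row.append((best_prev + cost, best_k))
        table.append(row)

    # Trace the cheapest path back from the last position.
    j = min(range(len(table[-1])), key=lambda k: table[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(inventory[targets[i][0]][j])
        j = table[i][j][1]
    path.reverse()
    return path


if __name__ == "__main__":
    inventory = {
        "a": [Unit("a", 100.0), Unit("a", 200.0)],
        "b": [Unit("b", 110.0), Unit("b", 190.0)],
    }
    chosen = select_units([("a", 110.0), ("b", 180.0)], inventory)
    print([(u.phone, u.pitch) for u in chosen])
```

In this toy inventory the second segment closest to its target pitch is not the one selected, because joining it to the best first segment would create a larger pitch jump. This trade-off between matching the target and joining smoothly is the core idea of the method.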
We are directing our current development of the system towards full automation, so that no human assistance would be required at any point in the process. Patients would then be able to create, on their own, a program that reads out a given text in their own synthesized voice, and install it on a device of their choosing.
Currently available solutions for creating personal synthetic speech systems mostly focus on statistical-parametric speech synthesis. Typically, an average statistical model built from a large number of different voices is adapted to a specific user on the basis of their voice recordings. However, the resulting output is usually not satisfactorily similar to the person’s own voice.
Laryngo Voice and its testing
From the very beginning of the project, we have been cooperating closely with the Department of Otorhinolaryngology and Head and Neck Surgery at Motol University Hospital and the First Faculty of Medicine, Charles University.