
Speech Research Laboratory
A.I. duPont Hospital for Children
and the
University of Delaware


Info for Clinicians


Inventory Design

One issue we are exploring is how the amount and content of recorded speech affect the quality of the resulting synthesized speech. The composition of the inventory directly affects synthesis quality: in general, the more words, phrases, and examples of each biphone that are recorded, the better the synthesized speech. However, there is a limit to how much speech a person can record before becoming tired, especially for people with ALS. The inventory must contain the recorded speech necessary for unrestricted English synthesis, as well as the words and phrases that the user wants synthesized with "recording quality" (e.g. "Have a nice day!", family names, etc.), yet it should remain compact and manageable in size.

Our goal is to discover and eliminate redundancies in the current inventory, which consists of about 1400 words and phrases. It is very likely that a number of those words and phrases can be removed without any noticeable effect on the quality of the resulting synthesized speech. It is also possible that users may be willing to sacrifice some quality in exchange for the ease of recording a smaller inventory. We intend to quantify the degradation in synthetic speech quality for inventories of different sizes, giving users a choice of inventory size and a clear idea of the synthetic speech quality they can expect from each.

The goal is to select the smallest list that provides high-quality speech, allows synthesis of any possible English utterance, and leads to user satisfaction.
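
One way to picture the search for a small inventory is as a coverage problem: choose as few recording items as possible while still covering every biphone that the full list covers. The sketch below is only an illustration under stated assumptions, not the procedure used by the laboratory. The function names (biphones, greedy_inventory), the toy phone transcriptions, and the greedy strategy are all hypothetical. A greedy selection is a common approximation and is not guaranteed to find the smallest possible list. It also ignores synthesis quality and the phrases a user specifically wants recorded.

```python
from typing import Dict, List, Set, Tuple

Biphone = Tuple[str, str]


def biphones(phones: List[str]) -> Set[Biphone]:
    """Return the set of adjacent phone pairs in one transcription."""
    return set(zip(phones, phones[1:]))


def greedy_inventory(candidates: Dict[str, List[str]]) -> List[str]:
    """Pick items greedily until every biphone found in the candidates is covered."""
    units = {item: biphones(phones) for item, phones in candidates.items()}
    uncovered: Set[Biphone] = set().union(*units.values())
    chosen: List[str] = []
    while uncovered:
        # Take the item that covers the most still-uncovered biphones.
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(units), key=lambda item: len(units[item] & uncovered))
        chosen.append(best)
        uncovered -= units[best]
    return chosen


# Hypothetical toy transcriptions, for illustration only.
candidates = {
    "cat": ["k", "ae", "t"],
    "at": ["ae", "t"],
    "cats": ["k", "ae", "t", "s"],
    "sat": ["s", "ae", "t"],
}
chosen = greedy_inventory(candidates)
print(chosen)
```

In this toy example, "cat" and "at" contribute no biphone that the selected items do not already supply, which is the kind of redundancy the inventory reduction aims to find. A realistic version would also need to weigh how many examples of each biphone are needed to keep quality acceptable.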