A functional articulatory dynamic model for speech production

Title: A functional articulatory dynamic model for speech production
Publication Type: Conference Paper
Year of Publication: 2001
Authors: Lee, L. J., P. Fieguth, and L. Deng
Conference Name: 27th International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Conference Location: Utah
Keywords: acoustic feature space, articulatory trajectories, coarticulation, dynamic properties, functional articulatory dynamic model, jaw, larynx, lower lip, matrix algebra, maximum likelihood estimation, natural speech, nonlinear parameter estimation methodology, parameter estimation, phonetic reduction phenomena, speech recognition, speech synthesis, state-space framework, state-space methods, statistical speech production model, tongue dorsum, tongue tip, upper lip, velum, vocal articulators

Introduces a statistical speech production model. The model synthesizes natural speech by modeling key dynamic properties of the vocal articulators in a linear/nonlinear state-space framework. The goal-oriented movements of the articulators (tongue tip, tongue dorsum, upper lip, lower lip, and jaw) are described by a linear dynamic state equation. The resulting articulatory trajectories, combined with the effects of the velum and larynx, are nonlinearly mapped into the acoustic feature space (MFCCs). The key challenges in this model are the development of a nonlinear parameter estimation methodology and the incorporation of appropriate prior assumptions into the articulatory dynamic structure. Such a model can also be applied directly to speech recognition, to better account for coarticulation and phonetic reduction phenomena with considerably fewer parameters than HMM-based approaches.
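
The two-stage structure described in the abstract (a linear state equation driving the articulators toward goals, followed by a nonlinear map to acoustic features) can be illustrated with a minimal sketch. The abstract does not give the paper's equations, so everything specific here is an assumption made for illustration: the target-directed form x[k+1] = Phi x[k] + (I - Phi) t[k] + w[k], the diagonal Phi, the tanh observation map with random weights, and the choice of 12 output coefficients. The function name `simulate` and all of its parameters are hypothetical, and the velum and larynx effects are left out.

```python
import numpy as np

def simulate(targets, phi=0.9, noise_std=0.0, n_mfcc=12, seed=0):
    """Simulate a toy articulatory state-space model.

    targets: array of shape (n_frames, dim), one articulatory target per
             frame (dim = 5 for tongue tip, tongue dorsum, upper lip,
             lower lip, jaw in this sketch).
    Returns (states, observations).
    """
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets, dtype=float)
    n_frames, dim = targets.shape

    # Assumed system matrix: each articulator relaxes toward its target
    # independently, at a rate set by phi (0 < phi < 1).
    Phi = phi * np.eye(dim)
    identity = np.eye(dim)

    # Hypothetical nonlinear observation map h(x) = tanh(W x + b),
    # standing in for the articulatory-to-MFCC mapping.
    W = rng.standard_normal((n_mfcc, dim))
    b = rng.standard_normal(n_mfcc)

    x = np.zeros(dim)  # articulators start at a neutral position
    states = np.empty((n_frames, dim))
    observations = np.empty((n_frames, n_mfcc))
    for k in range(n_frames):
        # Linear, target-directed state equation with Gaussian state noise.
        x = Phi @ x + (identity - Phi) @ targets[k] \
            + noise_std * rng.standard_normal(dim)
        states[k] = x
        # Nonlinear observation equation with Gaussian observation noise.
        observations[k] = np.tanh(W @ x + b) \
            + noise_std * rng.standard_normal(n_mfcc)
    return states, observations

# Hold one target for 200 frames; without noise the state settles on it.
targets = np.tile([1.0, -0.5, 0.2, 0.0, 0.8], (200, 1))
states, observations = simulate(targets)
```

In this sketch a smaller `phi` makes the articulators reach their targets faster, while a value close to 1 leaves targets undershot when they change quickly, which is one simple way to picture the reduction phenomena the abstract mentions. Estimating `Phi`, the targets, and the observation map from data is the nonlinear parameter estimation problem the abstract names, and it is not attempted here.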