Yinghua Zhu1, Yoon-Chul Kim1, Michael I. Proctor1, Shrikanth S. Narayanan1, Krishna S. Nayak1
1Ming Hsieh Department of Electrical Engineering,
We reconstruct the 3D dynamics of the vocal tract based on 1) 2D real-time imaging of parallel slices during 15 repetitions of the utterances /asa/, /aʃa/, /ala/, and /ara/, with synchronized, noise-cancelled audio recorded simultaneously, and 2) alignment of the 2D real-time movies using dynamic time warping of the recorded audio tracks, with mel-frequency cepstral coefficients as the acoustic feature. The resulting 3D movies show several vocal tract features that cannot be seen in a single 2D slice, and therefore offer unique value to speech research.
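The alignment step can be illustrated with a minimal sketch of dynamic time warping between two feature sequences. It assumes the mel-frequency cepstral coefficients have already been extracted into arrays of shape (frames, coefficients); the function names `dtw_path` and `frames_for_reference`, the Euclidean frame distance, and the three-way step pattern are illustrative assumptions, not details taken from this work.

```python
import numpy as np

def dtw_path(ref, qry):
    """Align two feature sequences of shape (frames, coeffs).

    Returns the accumulated cost and the warping path as (ref, qry) index pairs.
    Assumption: Euclidean frame distance and the basic three-way step pattern.
    """
    n, m = len(ref), len(qry)
    # Pairwise Euclidean distance between every reference and query frame.
    dist = np.linalg.norm(ref[:, None, :] - qry[None, :, :], axis=2)
    # Accumulated cost with an infinite border so paths start at (0, 0).
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]

def frames_for_reference(path, n_ref):
    """For each reference frame, keep the last query frame aligned to it."""
    mapping = [0] * n_ref
    for i, j in path:
        mapping[i] = j
    return mapping

# Toy one-coefficient "features": the query holds its first frame one step longer.
ref = np.array([[0.0], [1.0], [2.0], [3.0]])
qry = np.array([[0.0], [0.0], [1.0], [2.0], [3.0]])
cost, path = dtw_path(ref, qry)
mapping = frames_for_reference(path, len(ref))
```

In this sketch, one repetition would serve as the reference, and the mapping would select which image frame of another slice's movie to place at each reference time point. How audio frames are related to image frames is left out, since it depends on the acquisition timing.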
Keywords