Predicting tongue motion in unlabeled ultrasound videos using convolutional LSTM neural network

February 19, 2019 · Entered Twilight · 🏛 IEEE International Conference on Acoustics, Speech, and Signal Processing

🌅 TWILIGHT: Old Age
Predates the code-sharing era: a pioneer of its time

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .ipynb_checkpoints, README.md, convlstm.ipynb, cw-ssim.ipynb, get_testdata.ipynb, get_traindata.ipynb, predict.ipynb

Authors: Chaojie Zhao, Peng Zhang, Jian Zhu, Chengrui Wu, Huaimin Wang, Kele Xu
arXiv ID: 1902.06927
Category: cs.CV: Computer Vision
Cross-listed: cs.LG, cs.MM
Citations: 25
Venue: IEEE International Conference on Acoustics, Speech, and Signal Processing
Repository: https://github.com/shuiliwanwu/ConvLstm-ultrasound-videos ⭐ 19
Last Checked: 1 month ago
Abstract
A challenge in speech production research is to predict future tongue movements from a short period of past tongue movements. This study tackles the speaker-dependent tongue motion prediction problem in unlabeled ultrasound videos with convolutional long short-term memory (ConvLSTM) networks. The model has been tested on two different ultrasound corpora. ConvLSTM outperforms a 3-dimensional convolutional neural network (3DCNN) in predicting the 9th frame from the 8 preceding frames, and also demonstrates a good capacity to predict only the tongue contours in future frames. Further tests reveal that ConvLSTM can also learn to predict tongue movements in more distant frames beyond the immediately following frame. Our code is available at: https://github.com/shuiliwanwu/ConvLstm-ultrasound-videos.
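The core idea, feeding 8 frames through a ConvLSTM and reading out a prediction for the 9th, can be illustrated with a minimal single-channel numpy sketch of the ConvLSTM recurrence (Shi et al., 2015). This is not the authors' implementation: the kernel size, the 16×16 toy frames, and the random untrained weights are all illustrative assumptions.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 2D convolution with zero padding ('same' output size).
    x: (H, W) array, k: (kh, kw) kernel with odd dimensions."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    H, W = x.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """Single-channel ConvLSTM cell: LSTM gates whose input and
    hidden-state transforms are convolutions instead of dense matmuls."""
    def __init__(self, ksize=3, seed=0):
        rng = np.random.default_rng(seed)
        # one (W_x, W_h, bias) triple per gate: input, forget, cell, output
        self.params = {g: (rng.standard_normal((ksize, ksize)) * 0.1,
                           rng.standard_normal((ksize, ksize)) * 0.1,
                           0.0)
                       for g in "ifco"}

    def step(self, x, h, c):
        def gate(name):
            Wx, Wh, b = self.params[name]
            return conv2d_same(x, Wx) + conv2d_same(h, Wh) + b
        i = sigmoid(gate("i"))          # input gate
        f = sigmoid(gate("f"))          # forget gate
        o = sigmoid(gate("o"))          # output gate
        g = np.tanh(gate("c"))          # candidate cell update
        c_new = f * c + i * g           # new cell state
        h_new = o * np.tanh(c_new)      # new hidden state
        return h_new, c_new

# Roll the cell over 8 "frames" and read the final hidden state as a
# crude stand-in for the predicted 9th frame (untrained, shapes only).
frames = np.random.default_rng(1).random((8, 16, 16))  # toy ultrasound clip
cell = ConvLSTMCell()
h = c = np.zeros((16, 16))
for x in frames:
    h, c = cell.step(x, h, c)
predicted_9th = h
print(predicted_9th.shape)  # (16, 16)
```

In the paper's setting the readout would be a trained decoder over stacked ConvLSTM layers; the sketch only shows why a convolutional recurrence preserves the spatial layout of each ultrasound frame while carrying temporal state across frames.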
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt: Computer Vision