Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding
September 24, 2019 · Entered Twilight · IEEE International Conference on Acoustics, Speech, and Signal Processing
"Last commit was 5.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, data, models, requirements.txt, src
Authors
Chao-Wei Huang, Yun-Nung Chen
arXiv ID
1909.10861
Category
cs.CL: Computation & Language
Cross-listed
cs.LG, eess.AS
Citations
45
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Repository
https://github.com/MiuLab/SpokenVec
⭐ 24
Last Checked
1 month ago
Abstract
Employing pre-trained language models (LMs) to extract contextualized word representations has achieved state-of-the-art performance on various NLP tasks. However, applying this technique to noisy transcripts generated by an automatic speech recognizer (ASR) remains a concern. Therefore, this paper focuses on making contextualized representations more ASR-robust. We propose a novel confusion-aware fine-tuning method to mitigate the impact of ASR errors on pre-trained LMs. Specifically, we fine-tune LMs to produce similar representations for acoustically confusable words that are obtained from word confusion networks (WCNs) produced by ASR. Experiments on the benchmark ATIS dataset show that the proposed method significantly improves the performance of spoken language understanding on ASR transcripts. Our source code is available at https://github.com/MiuLab/SpokenVec
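As a rough illustration of the idea in the abstract (pulling the representations of acoustically confusable words closer together), the sketch below computes an average cosine distance over confusable word pairs. It is not taken from the paper or the repository: the function names, the toy vectors, and the choice of cosine distance are all assumptions made for illustration, and a fixed word-to-vector dictionary stands in for what would really be context-dependent LM outputs.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def confusion_penalty(embeddings, confusable_pairs):
    """Average cosine distance over pairs of confusable words.

    embeddings: dict mapping word -> vector (hypothetical toy stand-in
        for contextualized LM representations).
    confusable_pairs: list of (word_a, word_b) tuples, assumed here to
        have been read off word confusion network bins beforehand.
    """
    if not confusable_pairs:
        return 0.0
    total = sum(
        cosine_distance(embeddings[a], embeddings[b])
        for a, b in confusable_pairs
    )
    return total / len(confusable_pairs)


# Toy data (invented for this example): "fare" and "fair" sound alike,
# yet their vectors start out orthogonal, so the penalty is at its
# largest value for non-negative vectors.
toy_embeddings = {
    "fare": [1.0, 0.0],
    "fair": [0.0, 1.0],
    "flight": [1.0, 1.0],
}
toy_pairs = [("fare", "fair")]

print(confusion_penalty(toy_embeddings, toy_pairs))
```

In a fine-tuning setup, a term of this kind would presumably be minimized alongside the usual LM objective so that confusable words drift toward similar representations; how the two are weighted is left open here.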
Similar Papers
In the same crypt: Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P. · 👻 Ghosted
Language Models are Few-Shot Learners
R.I.P. · 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P. · 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P. · 👻 Ghosted