| 101 |
Deep Contextualized Acoustic Representations For Semi-Supervised Speech Recognition
Shaoshi Ling, Yuzong Liu, ... (+2 more)
|
⏳
Coming Soon™
|
eess.AS
|
145 |
6 years ago |
| 102 |
High-quality nonparallel voice conversion based on cycle-consistent adversarial network
Fuming Fang, Junichi Yamagishi, ... (+2 more)
|
👻
Ghosted
|
eess.AS
|
144 |
8 years ago |
| 103 |
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff
Eitan Borgnia, Valeriia Cherepanova, ... (+6 more)
|
👻
Ghosted
|
cs.CR
|
143 |
5 years ago |
| 104 |
Knowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes
Anurag Kumar, Maksim Khadkevich, Christian Fugen
|
👻
Ghosted
|
cs.SD
|
142 |
8 years ago |
| 105 |
Domain Adversarial Training for Accented Speech Recognition
Sining Sun, Ching-Feng Yeh, ... (+3 more)
|
👻
Ghosted
|
cs.CL
|
139 |
7 years ago |
| 106 |
Learning Sparse Graphs Under Smoothness Prior
Sundeep Prabhakar Chepuri, Sijia Liu, ... (+2 more)
|
👻
Ghosted
|
cs.LG
|
137 |
9 years ago |
| 107 |
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes
Bo Li, Yu Zhang, ... (+3 more)
|
👻
Ghosted
|
eess.AS
|
136 |
7 years ago |
| 108 |
A Comparison of deep learning methods for environmental sound
Juncheng Li, Wei Dai, ... (+3 more)
|
👻
Ghosted
|
cs.SD
|
135 |
9 years ago |
| 109 |
Accelerating Deep Convolutional Networks using low-precision and sparsity
Ganesh Venkatesh, Eriko Nurvitadhi, Debbie Marr
|
⏳
Coming Soon™
|
cs.LG
|
135 |
9 years ago |
| 110 |
Speech Emotion Recognition with Dual-Sequence LSTM Architecture
Jianyou Wang, Michael Xue, ... (+4 more)
|
👻
Ghosted
|
eess.AS
|
132 |
6 years ago |
| 111 |
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori, Huda Alamri, ... (+11 more)
|
👻
Ghosted
|
cs.CL
|
131 |
7 years ago |
| 112 |
Speech Emotion Recognition Using Multi-hop Attention Mechanism
Seunghyun Yoon, Seokhyun Byun, ... (+2 more)
|
👻
Ghosted
|
eess.AS
|
130 |
6 years ago |
| 113 |
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation
Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang
|
👻
Ghosted
|
cs.CV
|
130 |
7 years ago |
| 114 |
Exposing GAN-generated Faces Using Inconsistent Corneal Specular Highlights
Shu Hu, Yuezun Li, Siwei Lyu
|
👻
Ghosted
|
cs.CV
|
129 |
5 years ago |
| 115 |
Self-Supervised Learning For Few-Shot Image Classification
Da Chen, Yuefeng Chen, ... (+4 more)
|
🌅
Old Age
|
cs.CV
|
128 |
6 years ago |
| 116 |
Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
Yoshiaki Bando, Masato Mimura, ... (+3 more)
|
👻
Ghosted
|
cs.SD
|
128 |
8 years ago |
| 117 |
Learning Filterbanks from Raw Speech for Phone Recognition
Neil Zeghidour, Nicolas Usunier, ... (+4 more)
|
👻
Ghosted
|
cs.CL
|
127 |
8 years ago |
| 118 |
Random Projections through multiple optical scattering: Approximating kernels at the speed of light
Alaa Saade, Francesco Caltagirone, ... (+5 more)
|
👻
Ghosted
|
cs.ET
|
125 |
10 years ago |
| 119 |
Deep-FSMN for Large Vocabulary Continuous Speech Recognition
Shiliang Zhang, Ming Lei, ... (+2 more)
|
👻
Ghosted
|
cs.NE
|
124 |
8 years ago |
| 120 |
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg, RJ Skerry-Ryan, ... (+5 more)
|
👻
Ghosted
|
cs.CL
|
122 |
6 years ago |
| 121 |
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung, Yuxuan Wang, ... (+3 more)
|
👻
Ghosted
|
cs.CL
|
121 |
7 years ago |
| 122 |
End-To-End Visual Speech Recognition With LSTMs
Stavros Petridis, Zuwei Li, Maja Pantic
|
👻
Ghosted
|
cs.CV
|
120 |
9 years ago |
| 123 |
Backdoor Attack against Speaker Verification
Tongqing Zhai, Yiming Li, ... (+4 more)
|
💀
404 Not Found
|
cs.CR
|
119 |
5 years ago |
| 124 |
FedPrompt: Communication-Efficient and Privacy Preserving Prompt Tuning in Federated Learning
Haodong Zhao, Wei Du, ... (+3 more)
|
👻
Ghosted
|
cs.LG
|
116 |
3 years ago |
| 125 |
Speaker-Invariant Training via Adversarial Learning
Zhong Meng, Jinyu Li, ... (+6 more)
|
👻
Ghosted
|
eess.AS
|
115 |
8 years ago |
| 126 |
Meta Learning for End-to-End Low-Resource Speech Recognition
Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee
|
👻
Ghosted
|
cs.SD
|
114 |
6 years ago |
| 127 |
AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka, Hirokazu Kameoka, ... (+2 more)
|
👻
Ghosted
|
eess.AS
|
114 |
7 years ago |
| 128 |
Connecting Speech Encoder and Large Language Model for ASR
Wenyi Yu, Changli Tang, ... (+7 more)
|
👻
Ghosted
|
eess.AS
|
114 |
2 years ago |
| 129 |
End-to-End ASR-free Keyword Search from Speech
Kartik Audhkhasi, Andrew Rosenberg, ... (+3 more)
|
👻
Ghosted
|
cs.CL
|
113 |
9 years ago |
| 130 |
Recurrent Neural Network Training with Dark Knowledge Transfer
Zhiyuan Tang, Dong Wang, Zhiyong Zhang
|
👻
Ghosted
|
stat.ML
|
112 |
10 years ago |
| 131 |
Using Intelligent Reflecting Surfaces for Rank Improvement in MIMO Communications
Özgecan Özdogan, Emil Björnson, Erik G. Larsson
|
👻
Ghosted
|
eess.SP
|
110 |
6 years ago |
| 132 |
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran Miao, Gaofeng Cheng, ... (+3 more)
|
👻
Ghosted
|
eess.AS
|
109 |
6 years ago |
| 133 |
Self-supervised Learning for ECG-based Emotion Recognition
Pritam Sarkar, Ali Etemad
|
👻
Ghosted
|
cs.LG
|
108 |
6 years ago |
| 134 |
Efficient keyword spotting using dilated convolutions and gating
Alice Coucke, Mohammed Chlieh, ... (+4 more)
|
👻
Ghosted
|
cs.LG
|
108 |
7 years ago |
| 135 |
MMSE precoder for massive MIMO using 1-bit quantization
Ovais Bin Usman, Hela Jedda, ... (+2 more)
|
👻
Ghosted
|
cs.IT
|
107 |
8 years ago |
| 136 |
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss, RJ Skerry-Ryan, ... (+3 more)
|
👻
Ghosted
|
cs.CL
|
106 |
5 years ago |
| 137 |
Quaternion Convolutional Neural Networks for Heterogeneous Image Processing
Titouan Parcollet, Mohamed Morchid, Georges Linarès
|
👻
Ghosted
|
cs.CV
|
106 |
7 years ago |
| 138 |
Learning Representations of Emotional Speech with Deep Convolutional Generative Adversarial Networks
Jonathan Chang, Stefan Scherer
|
👻
Ghosted
|
cs.CL
|
105 |
8 years ago |
| 139 |
Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation
Thai-Son Nguyen, Sebastian Stueker, ... (+2 more)
|
👻
Ghosted
|
eess.AS
|
104 |
6 years ago |
| 140 |
Evaluating Voice Conversion-based Privacy Protection against Informed Attackers
Brij Mohan Lal Srivastava, Nathalie Vauquier, ... (+4 more)
|
👻
Ghosted
|
cs.CL
|
104 |
6 years ago |
| 141 |
Conditional Teacher-Student Learning
Zhong Meng, Jinyu Li, ... (+2 more)
|
👻
Ghosted
|
cs.LG
|
104 |
6 years ago |
| 142 |
On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition
Rohit Prabhavalkar, Ouais Alsharif, ... (+2 more)
|
👻
Ghosted
|
cs.CL
|
104 |
10 years ago |
| 143 |
Correction of Automatic Speech Recognition with Transformer Sequence-to-sequence Model
Oleksii Hrinchuk, Mariya Popova, Boris Ginsburg
|
👻
Ghosted
|
cs.CL
|
101 |
6 years ago |
| 144 |
Convolutional Neural Network Approach for EEG-based Emotion Recognition using Brain Connectivity and its Spatial Information
Seong-Eun Moon, Soobeom Jang, Jong-Seok Lee
|
👻
Ghosted
|
cs.HC
|
101 |
7 years ago |
| 145 |
Advancing Acoustic-to-Word CTC Model
Jinyu Li, Guoli Ye, ... (+3 more)
|
👻
Ghosted
|
cs.CL
|
100 |
8 years ago |
| 146 |
FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization
Jiahui Yu, Chung-Cheng Chiu, ... (+9 more)
|
👻
Ghosted
|
eess.AS
|
99 |
5 years ago |
| 147 |
Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Zhong-Qiu Wang, Ke Tan, DeLiang Wang
|
👻
Ghosted
|
cs.SD
|
99 |
7 years ago |
| 148 |
Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation
Zhong Meng, Jinyu Li, ... (+3 more)
|
👻
Ghosted
|
eess.AS
|
99 |
8 years ago |
| 149 |
This dataset does not exist: training models from generated images
Victor Besnier, Himalaya Jain, ... (+3 more)
|
👻
Ghosted
|
cs.CV
|
98 |
6 years ago |
| 150 |
Sequence-based Multi-lingual Low Resource Speech Recognition
Siddharth Dalmia, Ramon Sanabria, ... (+2 more)
|
👻
Ghosted
|
cs.CL
|
98 |
8 years ago |