| 201 | Generative Adversarial Source Separation<br>Cem Subakan, Paris Smaragdis | 👻 Ghosted | cs.SD | 74 | 8 years ago |
| 202 | End-to-End Optimized Speech Coding with Deep Neural Networks<br>Srihari Kankanahalli | 👻 Ghosted | cs.SD | 74 | 8 years ago |
| 203 | End-to-End Streaming Keyword Spotting<br>Alvarez Raziel, Park Hyun-Jin | 👻 Ghosted | cs.CL | 73 | 7 years ago |
| 204 | Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition<br>Xuesong Yang, Kartik Audhkhasi, ... (+4 more) | 👻 Ghosted | cs.CL | 73 | 8 years ago |
| 205 | Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction<br>Daniel Stoller, Sebastian Ewert, Simon Dixon | 🌅 Old Age | cs.LG | 73 | 8 years ago |
| 206 | Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning<br>Baolin Peng, Xiujun Li, ... (+4 more) | 👻 Ghosted | cs.CL | 73 | 8 years ago |
| 207 | Encoder-decoder with Focus-mechanism for Sequence Labelling Based Spoken Language Understanding<br>Su Zhu, Kai Yu | 👻 Ghosted | cs.CL | 73 | 9 years ago |
| 208 | Modality Attention for End-to-End Audio-visual Speech Recognition<br>Pan Zhou, Wenwen Yang, ... (+3 more) | 👻 Ghosted | cs.CL | 72 | 7 years ago |
| 209 | Adversarial Inpainting of Medical Image Modalities<br>Karim Armanious, Youssef Mecky, ... (+2 more) | 👻 Ghosted | cs.CV | 72 | 7 years ago |
| 210 | Scaling Recurrent Neural Network Language Models<br>Will Williams, Niranjani Prasad, ... (+3 more) | 👻 Ghosted | cs.CL | 72 | 11 years ago |
| 211 | Unsupervised Contrastive Learning of Sound Event Representations<br>Eduardo Fonseca, Diego Ortego, ... (+3 more) | 👻 Ghosted | cs.SD | 71 | 5 years ago |
| 212 | Upsampling artifacts in neural audio synthesis<br>Jordi Pons, Santiago Pascual, ... (+2 more) | 🌅 Old Age | cs.SD | 71 | 5 years ago |
| 213 | Analyzing ASR pretraining for low-resource speech-to-text translation<br>Mihaela C. Stoian, Sameer Bansal, Sharon Goldwater | 👻 Ghosted | cs.CL | 71 | 6 years ago |
| 214 | Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models<br>Herman Kamper | 👻 Ghosted | cs.CL | 71 | 7 years ago |
| 215 | Two-Step Sound Source Separation: Training on Learned Latent Targets<br>Efthymios Tzinis, Shrikant Venkataramani, ... (+3 more) | 👻 Ghosted | cs.LG | 70 | 6 years ago |
| 216 | Towards Language-Universal End-to-End Speech Recognition<br>Suyoun Kim, Michael L. Seltzer | 👻 Ghosted | cs.CL | 70 | 8 years ago |
| 217 | A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis<br>Xin Wang, Jaime Lorenzo-Trueba, ... (+3 more) | 👻 Ghosted | eess.AS | 69 | 8 years ago |
| 218 | FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only<br>Jinhwan Park, Wonyong Sung | 👻 Ghosted | cs.AR | 69 | 10 years ago |
| 219 | Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training<br>Sameer Khurana, Niko Moritz, ... (+2 more) | 👻 Ghosted | cs.CL | 68 | 5 years ago |
| 220 | Deep Joint Source-Channel Coding for Wireless Image Retrieval<br>Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk | 👻 Ghosted | cs.IT | 68 | 6 years ago |
| 221 | Towards Audio to Scene Image Synthesis using Generative Adversarial Network<br>Chia-Hung Wan, Shun-Po Chuang, Hung-Yi Lee | 👻 Ghosted | cs.CL | 68 | 7 years ago |
| 222 | Character-Level Language Modeling with Hierarchical Recurrent Neural Networks<br>Kyuyeon Hwang, Wonyong Sung | 👻 Ghosted | cs.LG | 68 | 9 years ago |
| 223 | When BERT Meets Quantum Temporal Convolution Learning for Text Classification in Heterogeneous Computing<br>Chao-Han Huck Yang, Jun Qi, ... (+3 more) | 👻 Ghosted | cs.CL | 68 | 4 years ago |
| 224 | Improved Mask-CTC for Non-Autoregressive End-to-End ASR<br>Yosuke Higuchi, Hirofumi Inaguma, ... (+3 more) | 👻 Ghosted | eess.AS | 67 | 5 years ago |
| 225 | Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems<br>Yinghui Huang, Hong-Kwang Kuo, ... (+6 more) | 👻 Ghosted | cs.CL | 67 | 5 years ago |
| 226 | Adversarial Speaker Verification<br>Zhong Meng, Yong Zhao, ... (+2 more) | 👻 Ghosted | cs.SD | 67 | 6 years ago |
| 227 | Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study<br>Xuankai Chang, Brian Yan, ... (+15 more) | 👻 Ghosted | cs.CL | 67 | 2 years ago |
| 228 | Training neural audio classifiers with few data<br>Jordi Pons, Joan Serrà, Xavier Serra | 🌅 Old Age | cs.SD | 66 | 7 years ago |
| 229 | Invariances and Data Augmentation for Supervised Music Transcription<br>John Thickstun, Zaid Harchaoui, ... (+2 more) | 👻 Ghosted | stat.ML | 66 | 8 years ago |
| 230 | Muse: Multi-modal target speaker extraction with visual cues<br>Zexu Pan, Ruijie Tao, ... (+2 more) | 👻 Ghosted | eess.AS | 65 | 5 years ago |
| 231 | Diffusion-based Generative Speech Source Separation<br>Robin Scheibler, Youna Ji, ... (+4 more) | 👻 Ghosted | eess.AS | 65 | 3 years ago |
| 232 | A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet<br>David Ditter, Timo Gerkmann | 👻 Ghosted | eess.AS | 64 | 6 years ago |
| 233 | Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions<br>Simon Mittermaier, Ludwig Kürzinger, ... (+2 more) | 👻 Ghosted | eess.AS | 64 | 6 years ago |
| 234 | Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments<br>Giovanni Morrone, Luca Pasa, ... (+4 more) | 👻 Ghosted | cs.CL | 64 | 7 years ago |
| 235 | Fixed-Point Performance Analysis of Recurrent Neural Networks<br>Sungho Shin, Kyuyeon Hwang, Wonyong Sung | 👻 Ghosted | cs.LG | 64 | 10 years ago |
| 236 | Demystifying TasNet: A Dissecting Approach<br>Jens Heitkaemper, Darius Jakobeit, ... (+3 more) | 👻 Ghosted | cs.SD | 63 | 6 years ago |
| 237 | Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events<br>Danilo Comminiello, Marco Lella, ... (+2 more) | 👻 Ghosted | eess.AS | 63 | 7 years ago |
| 238 | Character-Level Incremental Speech Recognition with Recurrent Neural Networks<br>Kyuyeon Hwang, Wonyong Sung | 👻 Ghosted | cs.CL | 63 | 10 years ago |
| 239 | FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models<br>Dongyu Yao, Jianshu Zhang, ... (+2 more) | 👻 Ghosted | cs.CR | 63 | 2 years ago |
| 240 | VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching<br>Yiwei Guo, Chenpeng Du, ... (+3 more) | 👻 Ghosted | eess.AS | 62 | 2 years ago |
| 241 | Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining<br>Cheng-I Lai, Yung-Sung Chuang, ... (+3 more) | 👻 Ghosted | cs.CL | 62 | 5 years ago |
| 242 | Frequency and temporal convolutional attention for text-independent speaker recognition<br>Sarthak Yadav, Atul Rai | 👻 Ghosted | cs.SD | 62 | 6 years ago |
| 243 | Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification<br>Gautam Bhattacharya, Joao Monteiro, ... (+2 more) | 👻 Ghosted | eess.AS | 62 | 7 years ago |
| 244 | Effect of data reduction on sequence-to-sequence neural TTS<br>Javier Latorre, Jakub Lachowicz, ... (+5 more) | 👻 Ghosted | cs.CL | 62 | 7 years ago |
| 245 | High efficiency compression for object detection<br>Hyomin Choi, Ivan V. Bajic | 👻 Ghosted | eess.IV | 62 | 8 years ago |
| 246 | Distributed Scheduling using Graph Neural Networks<br>Zhongyuan Zhao, Gunjan Verma, ... (+3 more) | 👻 Ghosted | eess.SP | 61 | 5 years ago |
| 247 | Low-resource expressive text-to-speech using data augmentation<br>Goeric Huybrechts, Thomas Merritt, ... (+4 more) | 👻 Ghosted | eess.AS | 61 | 5 years ago |
| 248 | On the Influence of Momentum Acceleration on Online Learning<br>Kun Yuan, Bicheng Ying, Ali H. Sayed | 👻 Ghosted | math.OC | 61 | 10 years ago |
| 249 | Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models<br>Minki Kang, Dongchan Min, Sung Ju Hwang | 👻 Ghosted | eess.AS | 61 | 3 years ago |
| 250 | Speaker-invariant Affective Representation Learning via Adversarial Training<br>Haoqi Li, Ming Tu, ... (+3 more) | 👻 Ghosted | eess.AS | 60 | 6 years ago |