| 151 | Sample-level CNN Architectures for Music Auto-tagging Using Raw Waveforms<br>Taejun Kim, Jongpil Lee, Juhan Nam | 👻 Ghosted | cs.SD | 98 | 8 years ago |
| 152 | LoRa Digital Receiver Analysis and Implementation<br>Reza Ghanaatian, Orion Afisiadis, ... (+2 more) | 👻 Ghosted | cs.IT | 97 | 7 years ago |
| 153 | Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning<br>Shansong Liu, Atin Sakkeer Hussain, ... (+2 more) | 👻 Ghosted | cs.SD | 96 | 2 years ago |
| 154 | Short-segment heart sound classification using an ensemble of deep convolutional neural networks<br>Fuad Noman, Chee-Ming Ting, ... (+2 more) | 👻 Ghosted | cs.SD | 95 | 7 years ago |
| 155 | Bootstrapping Graph Convolutional Neural Networks for Autism Spectrum Disorder Classification<br>Rushil Anirudh, Jayaraman J. Thiagarajan | 👻 Ghosted | stat.ML | 95 | 8 years ago |
| 156 | Advances in All-Neural Speech Recognition<br>G. Zweig, C. Yu, ... (+2 more) | 👻 Ghosted | cs.CL | 95 | 9 years ago |
| 157 | Lip2AudSpec: Speech reconstruction from silent lip movements video<br>Hassan Akbari, Himani Arora, ... (+2 more) | 👻 Ghosted | cs.CV | 94 | 8 years ago |
| 158 | Generalized linear mixing model accounting for endmember variability<br>Tales Imbiriba, Ricardo Augusto Borsoi, José Carlos Moreira Bermudez | 👻 Ghosted | cs.CV | 94 | 8 years ago |
| 159 | Adversarial Attacks on GMM i-vector based Speaker Verification Systems<br>Xu Li, Jinghua Zhong, ... (+4 more) | 👻 Ghosted | eess.AS | 93 | 6 years ago |
| 160 | DADA: Deep Adversarial Data Augmentation for Extremely Low Data Regime Classification<br>Xiaofeng Zhang, Zhangyang Wang, ... (+2 more) | 🌅 Old Age | cs.CV | 93 | 7 years ago |
| 161 | Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning<br>Wei Xia, Chunlei Zhang, ... (+3 more) | 👻 Ghosted | eess.AS | 91 | 5 years ago |
| 162 | StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization<br>Ahmed Mustafa, Nicola Pia, Guillaume Fuchs | 👻 Ghosted | eess.AS | 91 | 5 years ago |
| 163 | Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems<br>Nick Rossenbach, Albert Zeyer, ... (+2 more) | 👻 Ghosted | cs.CL | 91 | 6 years ago |
| 164 | A Recurrent Variational Autoencoder for Speech Enhancement<br>Simon Leglaive, Xavier Alameda-Pineda, ... (+2 more) | 👻 Ghosted | cs.LG | 91 | 6 years ago |
| 165 | Cycle-consistency training for end-to-end speech recognition<br>Takaaki Hori, Ramon Astudillo, ... (+4 more) | 👻 Ghosted | cs.CL | 90 | 7 years ago |
| 166 | Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition<br>Zhong Meng, Shinji Watanabe, ... (+2 more) | 👻 Ghosted | eess.AS | 90 | 8 years ago |
| 167 | Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language<br>Yusuke Yasuda, Xin Wang, ... (+2 more) | 👻 Ghosted | eess.AS | 89 | 7 years ago |
| 168 | How should we evaluate supervised hashing?<br>Alexandre Sablayrolles, Matthijs Douze, ... (+2 more) | 👻 Ghosted | cs.CV | 89 | 9 years ago |
| 169 | Training Speech Recognition Models with Federated Learning: A Quality/Cost Framework<br>Dhruv Guliani, Francoise Beaufays, Giovanni Motta | 👻 Ghosted | cs.LG | 88 | 5 years ago |
| 170 | Cascaded encoders for unifying streaming and non-streaming ASR<br>Arun Narayanan, Tara N. Sainath, ... (+6 more) | 👻 Ghosted | eess.AS | 88 | 5 years ago |
| 171 | Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer<br>Genta Indra Winata, Samuel Cahyawijaya, ... (+3 more) | 👻 Ghosted | cs.CL | 87 | 6 years ago |
| 172 | Learning Compact Recurrent Neural Networks<br>Zhiyun Lu, Vikas Sindhwani, Tara N. Sainath | 👻 Ghosted | cs.LG | 87 | 10 years ago |
| 173 | Attention-Augmented End-to-End Multi-Task Learning for Emotion Prediction from Speech<br>Zixing Zhang, Bingwen Wu, Bjoern Schuller | 👻 Ghosted | cs.CL | 86 | 7 years ago |
| 174 | Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis<br>Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai | 👻 Ghosted | cs.CL | 86 | 7 years ago |
| 175 | FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention<br>Yist Y. Lin, Chung-Ming Chien, ... (+3 more) | 👻 Ghosted | eess.AS | 85 | 5 years ago |
| 176 | End-to-End Monaural Multi-speaker ASR System without Pretraining<br>Xuankai Chang, Yanmin Qian, ... (+2 more) | 👻 Ghosted | cs.CL | 85 | 7 years ago |
| 177 | Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models<br>Zhiyun Lu, Liangliang Cao, ... (+3 more) | 👻 Ghosted | cs.CL | 84 | 6 years ago |
| 178 | Detecting Multiple Speech Disfluencies using a Deep Residual Network with Bidirectional Long Short-Term Memory<br>Tedd Kourkounakis, Amirhossein Hajavi, Ali Etemad | 👻 Ghosted | eess.AS | 84 | 6 years ago |
| 179 | Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks<br>Amanda Duarte, Francisco Roldan, ... (+8 more) | 🌅 Old Age | cs.MM | 84 | 7 years ago |
| 180 | Attentive Filtering Networks for Audio Replay Attack Detection<br>Cheng-I Lai, Alberto Abad, ... (+4 more) | 👻 Ghosted | eess.AS | 84 | 7 years ago |
| 181 | Advanced LSTM: A Study about Better Time Dependency Modeling in Emotion Recognition<br>Fei Tao, Gang Liu | 👻 Ghosted | cs.LG | 84 | 8 years ago |
| 182 | Parsimonious Online Learning with Kernels via Sparse Projections in Function Space<br>Alec Koppel, Garrett Warnell, ... (+2 more) | 👻 Ghosted | stat.ML | 82 | 9 years ago |
| 183 | End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager<br>Xuesong Yang, Yun-Nung Chen, ... (+5 more) | 👻 Ghosted | cs.CL | 82 | 9 years ago |
| 184 | Cross-lingual and Multilingual Speech Emotion Recognition on English and French<br>Michael Neumann, Ngoc Thang Vu | 👻 Ghosted | cs.CL | 81 | 8 years ago |
| 185 | A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks<br>Yun Tang, Juan Pino, ... (+3 more) | 👻 Ghosted | cs.CL | 80 | 5 years ago |
| 186 | Scalable Mutual Information Estimation using Dependence Graphs<br>Morteza Noshad, Yu Zeng, Alfred O. Hero | 👻 Ghosted | cs.IT | 80 | 8 years ago |
| 187 | Improving Universal Sound Separation Using Sound Classification<br>Efthymios Tzinis, Scott Wisdom, ... (+3 more) | 👻 Ghosted | cs.SD | 79 | 6 years ago |
| 188 | Sound Source Localization in a Multipath Environment Using Convolutional Neural Networks<br>Eric L. Ferguson, Stefan B. Williams, Craig T. Jin | 👻 Ghosted | cs.SD | 79 | 8 years ago |
| 189 | Emotion recognition by fusing time synchronous and time asynchronous representations<br>Wen Wu, Chao Zhang, Philip C. Woodland | 👻 Ghosted | cs.CL | 78 | 5 years ago |
| 190 | RETURNN: The RWTH Extensible Training framework for Universal Recurrent Neural Networks<br>Patrick Doetsch, Albert Zeyer, ... (+4 more) | 👻 Ghosted | cs.LG | 78 | 9 years ago |
| 191 | Knowledge Distillation for Small-footprint Highway Networks<br>Liang Lu, Michelle Guo, Steve Renals | 👻 Ghosted | cs.CL | 78 | 9 years ago |
| 192 | STC Anti-spoofing Systems for the ASVspoof 2015 Challenge<br>Sergey Novoselov, Alexandr Kozlov, ... (+3 more) | 👻 Ghosted | cs.SD | 78 | 10 years ago |
| 193 | Attention Driven Fusion for Multi-Modal Emotion Recognition<br>Darshana Priyasad, Tharindu Fernando, ... (+3 more) | 👻 Ghosted | eess.AS | 77 | 5 years ago |
| 194 | Synchronous Transformers for End-to-End Speech Recognition<br>Zhengkun Tian, Jiangyan Yi, ... (+4 more) | 👻 Ghosted | eess.AS | 77 | 6 years ago |
| 195 | The CORAL+ Algorithm for Unsupervised Domain Adaptation of PLDA<br>Kong Aik Lee, Qiongqiong Wang, Takafumi Koshinaka | 👻 Ghosted | cs.LG | 77 | 7 years ago |
| 196 | Breast density classification with deep convolutional neural networks<br>Nan Wu, Krzysztof J. Geras, ... (+7 more) | 👻 Ghosted | cs.CV | 77 | 8 years ago |
| 197 | Sketching for Large-Scale Learning of Mixture Models<br>Nicolas Keriven, Anthony Bourrier, ... (+2 more) | 👻 Ghosted | cs.LG | 77 | 9 years ago |
| 198 | A Probabilistic Interpretation of Sampling Theory of Graph Signals<br>Akshay Gadde, Antonio Ortega | 👻 Ghosted | cs.LG | 77 | 11 years ago |
| 199 | DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech<br>Keon Lee, Kyumin Park, Daeyoung Kim | 👻 Ghosted | eess.AS | 76 | 3 years ago |
| 200 | Filterbank design for end-to-end speech separation<br>Manuel Pariente, Samuele Cornell, ... (+2 more) | 👻 Ghosted | cs.SD | 74 | 6 years ago |