| 301 | Improving the Performance of Online Neural Transducer Models<br>Tara N. Sainath, Chung-Cheng Chiu, ... (+5 more) | 👻 Ghosted | cs.CL | 50 | 8 years ago |
| 302 | CopyPaste: An Augmentation Method for Speech Emotion Recognition<br>Raghavendra Pappagari, Jesús Villalba, ... (+3 more) | 👻 Ghosted | cs.SD | 49 | 5 years ago |
| 303 | Multimodal Metric Learning for Tag-based Music Retrieval<br>Minz Won, Sergio Oramas, ... (+3 more) | 👻 Ghosted | cs.IR | 49 | 5 years ago |
| 304 | MoGA: Searching Beyond MobileNetV3<br>Xiangxiang Chu, Bo Zhang, Ruijun Xu | 🌅 Old Age | cs.LG | 49 | 6 years ago |
| 305 | Generalization of Spoofing Countermeasures: a Case Study with ASVspoof 2015 and BTAS 2016 Corpora<br>Dipjyoti Paul, Md Sahidullah, Goutam Saha | 👻 Ghosted | cs.MM | 49 | 7 years ago |
| 306 | Pixel Level Data Augmentation for Semantic Image Segmentation using Generative Adversarial Networks<br>Shuangting Liu, Jiaqi Zhang, ... (+4 more) | 👻 Ghosted | cs.CV | 49 | 7 years ago |
| 307 | Attention-Based LSTM for Psychological Stress Detection from Spoken Language Using Distant Supervision<br>Genta Indra Winata, Onno Pepijn Kampman, Pascale Fung | 👻 Ghosted | cs.CL | 49 | 7 years ago |
| 308 | Retrieval-Generation Synergy Augmented Large Language Models<br>Zhangyin Feng, Xiaocheng Feng, ... (+3 more) | 👻 Ghosted | cs.CL | 49 | 2 years ago |
| 309 | Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech<br>Farhad Javanmardi, Saska Tirronen, ... (+3 more) | 👻 Ghosted | eess.AS | 49 | 2 years ago |
| 310 | Branchy-GNN: a Device-Edge Co-Inference Framework for Efficient Point Cloud Processing<br>Jiawei Shao, Haowei Zhang, ... (+2 more) | 👻 Ghosted | cs.DC | 48 | 5 years ago |
| 311 | Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks<br>Andros Tjandra, Chunxi Liu, ... (+6 more) | 👻 Ghosted | cs.CL | 48 | 6 years ago |
| 312 | Adaptive Scenario Discovery for Crowd Counting<br>Xingjiao Wu, Yingbin Zheng, ... (+4 more) | 👻 Ghosted | cs.CV | 48 | 7 years ago |
| 313 | Learning to detect dysarthria from raw speech<br>Juliette Millet, Neil Zeghidour | 👻 Ghosted | cs.CL | 48 | 7 years ago |
| 314 | On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression<br>Kobi Cohen, Angelia Nedic, R. Srikant | 👻 Ghosted | cs.IT | 48 | 9 years ago |
| 315 | MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning<br>Ruize Xu, Ruoxuan Feng, ... (+2 more) | 👻 Ghosted | cs.SD | 48 | 3 years ago |
| 316 | BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers<br>Eunjung Han, Chul Lee, Andreas Stolcke | 👻 Ghosted | cs.SD | 47 | 5 years ago |
| 317 | Efficient Arabic emotion recognition using deep neural networks<br>Ahmed Ali, Yasser Hifny | 👻 Ghosted | cs.CL | 47 | 5 years ago |
| 318 | BBAND Index: A No-Reference Banding Artifact Predictor<br>Zhengzhong Tu, Jessie Lin, ... (+3 more) | 👻 Ghosted | eess.IV | 47 | 6 years ago |
| 319 | Simultaneous Separation and Transcription of Mixtures with Multiple Polyphonic and Percussive Instruments<br>Ethan Manilow, Prem Seetharaman, Bryan Pardo | 👻 Ghosted | eess.AS | 47 | 6 years ago |
| 320 | PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network<br>Chengqi Deng, Chengzhu Yu, ... (+3 more) | 👻 Ghosted | cs.SD | 47 | 6 years ago |
| 321 | Class-conditional embeddings for music source separation<br>Prem Seetharaman, Gordon Wichern, ... (+2 more) | 👻 Ghosted | cs.SD | 47 | 7 years ago |
| 322 | Dense Multimodal Fusion for Hierarchically Joint Representation<br>Di Hu, Feiping Nie, Xuelong Li | 👻 Ghosted | cs.CV | 47 | 7 years ago |
| 323 | Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model<br>Alexander H. Liu, Hung-yi Lee, Lin-shan Lee | 👻 Ghosted | cs.CL | 47 | 7 years ago |
| 324 | A Coupled Compressive Sensing Scheme for Unsourced Multiple Access<br>Vamsi K. Amalladinne, Avinash Vem, ... (+3 more) | 👻 Ghosted | cs.IT | 47 | 7 years ago |
| 325 | Learning Online Alignments with Continuous Rewards Policy Gradient<br>Yuping Luo, Chung-Cheng Chiu, ... (+2 more) | 👻 Ghosted | cs.LG | 47 | 9 years ago |
| 326 | FAPM: Fast Adaptive Patch Memory for Real-time Industrial Anomaly Detection<br>Donghyeong Kim, Chaewon Park, ... (+2 more) | 👻 Ghosted | cs.CV | 47 | 3 years ago |
| 327 | Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations<br>Wen-Chin Huang, Yi-Chiao Wu, ... (+2 more) | 👻 Ghosted | eess.AS | 46 | 5 years ago |
| 328 | Representation Mixing for TTS Synthesis<br>Kyle Kastner, João Felipe Santos, ... (+2 more) | 👻 Ghosted | cs.LG | 46 | 7 years ago |
| 329 | Deep Multi-view Models for Glitch Classification<br>Sara Bahaadini, Neda Rohani, ... (+4 more) | 👻 Ghosted | cs.LG | 46 | 8 years ago |
| 330 | Image denoising via group sparsity residual constraint<br>Zhiyuan Zha, Xin Liu, ... (+8 more) | 👻 Ghosted | cs.CV | 46 | 9 years ago |
| 331 | Temporally Aligned Audio for Video with Autoregression<br>Ilpo Viertola, Vladimir Iashin, Esa Rahtu | 👻 Ghosted | cs.CV | 46 | 1 year ago |
| 332 | StemGen: A music generation model that listens<br>Julian D. Parker, Janne Spijkervet, ... (+7 more) | 👻 Ghosted | cs.SD | 46 | 2 years ago |
| 333 | A Foundation Model for Music Informatics<br>Minz Won, Yun-Ning Hung, Duc Le | 👻 Ghosted | cs.SD | 46 | 2 years ago |
| 334 | Untargeted Backdoor Attack against Object Detection<br>Chengxiao Luo, Yiming Li, ... (+2 more) | 👻 Ghosted | cs.CV | 46 | 3 years ago |
| 335 | AMC-Net: An Effective Network for Automatic Modulation Classification<br>Jiawei Zhang, Tiantian Wang, ... (+2 more) | 👻 Ghosted | eess.SP | 45 | 3 years ago |
| 336 | End-to-End Speaker Diarization as Post-Processing<br>Shota Horiguchi, Paola Garcia, ... (+3 more) | 👻 Ghosted | eess.AS | 45 | 5 years ago |
| 337 | Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding<br>Chao-Wei Huang, Yun-Nung Chen | 🌅 Old Age | cs.CL | 45 | 6 years ago |
| 338 | Similarity Learning for Authorship Verification in Social Media<br>Benedikt Boenninghoff, Robert M. Nickel, ... (+2 more) | 👻 Ghosted | cs.CL | 45 | 6 years ago |
| 339 | A Recurrent Graph Neural Network for Multi-Relational Data<br>Vassilis N. Ioannidis, Antonio G. Marques, Georgios B. Giannakis | 👻 Ghosted | cs.LG | 45 | 7 years ago |
| 340 | End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator<br>Andros Tjandra, Sakriani Sakti, Satoshi Nakamura | 👻 Ghosted | cs.CL | 45 | 7 years ago |
| 341 | Decoding visemes: improving machine lipreading<br>Helen L. Bear, Richard Harvey | 👻 Ghosted | cs.CV | 45 | 8 years ago |
| 342 | Learning From Yourself: A Self-Distillation Method for Fake Speech Detection<br>Jun Xue, Cunhang Fan, ... (+5 more) | 👻 Ghosted | cs.SD | 45 | 3 years ago |
| 343 | Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech<br>Cheol Jun Cho, Peter Wu, ... (+2 more) | 👻 Ghosted | eess.AS | 45 | 3 years ago |
| 344 | Similarity Analysis of Self-Supervised Speech Representations<br>Yu-An Chung, Yonatan Belinkov, James Glass | 🌅 Old Age | eess.AS | 44 | 5 years ago |
| 345 | Dynamic Sparsity Neural Networks for Automatic Speech Recognition<br>Zhaofeng Wu, Ding Zhao, ... (+4 more) | 👻 Ghosted | eess.AS | 44 | 5 years ago |
| 346 | Deep geometric knowledge distillation with graphs<br>Carlos Lassance, Myriam Bontonou, ... (+4 more) | 👻 Ghosted | cs.LG | 44 | 6 years ago |
| 347 | Towards Unsupervised Speech-to-Text Translation<br>Yu-An Chung, Wei-Hung Weng, ... (+2 more) | 👻 Ghosted | cs.CL | 44 | 7 years ago |
| 348 | Transfer learning of language-independent end-to-end ASR with language model fusion<br>Hirofumi Inaguma, Jaejin Cho, ... (+3 more) | 👻 Ghosted | cs.CL | 44 | 7 years ago |
| 349 | Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition<br>Wei-Ning Hsu, James Glass | 👻 Ghosted | cs.CL | 44 | 8 years ago |
| 350 | Deep Multimodal Learning for Emotion Recognition in Spoken Language<br>Yue Gu, Shuhong Chen, Ivan Marsic | 👻 Ghosted | cs.CL | 44 | 8 years ago |