| 351 | Real-Time Target Sound Extraction<br>Bandhav Veluri, Justin Chan, ... (+4 more) | 🌅 Old Age | cs.SD | 44 | 3 years ago |
| 352 | Visual Prompting for Adversarial Robustness<br>Aochuan Chen, Peter Lorenz, ... (+3 more) | 👻 Ghosted | cs.CV | 44 | 3 years ago |
| 353 | Streaming Simultaneous Speech Translation with Augmented Memory Transformer<br>Xutai Ma, Yongqiang Wang, ... (+3 more) | 👻 Ghosted | cs.CL | 43 | 5 years ago |
| 354 | Emotional Voice Conversion using Multitask Learning with Text-to-speech<br>Tae-Ho Kim, Sungjae Cho, ... (+3 more) | 👻 Ghosted | eess.AS | 43 | 6 years ago |
| 355 | To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition<br>Yossi Adi, Neil Zeghidour, ... (+4 more) | 👻 Ghosted | cs.LG | 43 | 7 years ago |
| 356 | Contextual Speech Recognition with Difficult Negative Training Examples<br>Uri Alon, Golan Pundak, Tara N. Sainath | 👻 Ghosted | eess.AS | 43 | 7 years ago |
| 357 | Geometry of Deep Learning for Magnetic Resonance Fingerprinting<br>Mohammad Golbabaee, Dongdong Chen, ... (+3 more) | 👻 Ghosted | cs.LG | 43 | 7 years ago |
| 358 | Dynamic Temporal Alignment of Speech to Lips<br>Tavi Halperin, Ariel Ephrat, Shmuel Peleg | 👻 Ghosted | cs.CV | 43 | 7 years ago |
| 359 | Visual Features for Context-Aware Speech Recognition<br>Abhinav Gupta, Yajie Miao, ... (+2 more) | 👻 Ghosted | cs.CL | 43 | 8 years ago |
| 360 | Revisiting the problem of audio-based hit song prediction using convolutional neural networks<br>Li-Chia Yang, Szu-Yu Chou, ... (+3 more) | 👻 Ghosted | cs.SD | 43 | 9 years ago |
| 361 | Son of Zorn's Lemma: Targeted Style Transfer Using Instance-aware Semantic Segmentation<br>Carlos Castillo, Soham De, ... (+4 more) | 👻 Ghosted | cs.CV | 43 | 9 years ago |
| 362 | Egocentric Activity Recognition with Multimodal Fisher Vector<br>Sibo Song, Ngai-Man Cheung, ... (+3 more) | 👻 Ghosted | cs.MM | 43 | 10 years ago |
| 363 | SpeechLMScore: Evaluating speech generation using speech language model<br>Soumi Maiti, Yifan Peng, ... (+2 more) | 👻 Ghosted | eess.AS | 43 | 3 years ago |
| 364 | Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input<br>Daisuke Niizumi, Daiki Takeuchi, ... (+3 more) | 👻 Ghosted | eess.AS | 43 | 3 years ago |
| 365 | Generative AI-aided Joint Training-free Secure Semantic Communications via Multi-modal Prompts<br>Hongyang Du, Guangyuan Liu, ... (+6 more) | 👻 Ghosted | eess.IV | 42 | 2 years ago |
| 366 | Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization<br>Tian Li, Xiang Chen, ... (+3 more) | 👻 Ghosted | cs.CL | 42 | 5 years ago |
| 367 | PPG-based singing voice conversion with adversarial representation learning<br>Zhonghao Li, Benlai Tang, ... (+5 more) | 👻 Ghosted | cs.SD | 42 | 5 years ago |
| 368 | CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition<br>Ruchao Fan, Wei Chu, ... (+2 more) | 👻 Ghosted | eess.AS | 42 | 5 years ago |
| 369 | DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures<br>Yang Zhao, Chaojian Li, ... (+4 more) | ⏳ Coming Soon™ | cs.LG | 42 | 6 years ago |
| 370 | SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition<br>Zhen Huang, Tim Ng, ... (+4 more) | 👻 Ghosted | cs.LG | 42 | 6 years ago |
| 371 | Distributed Gradient Descent with Coded Partial Gradient Computations<br>Emre Ozfatura, Sennur Ulukus, Deniz Gunduz | 👻 Ghosted | cs.LG | 42 | 7 years ago |
| 372 | Nose, eyes and ears: Head pose estimation by locating facial keypoints<br>Aryaman Gupta, Kalpit Thakkar, ... (+2 more) | 👻 Ghosted | cs.CV | 42 | 7 years ago |
| 373 | Pixel-Superpixel Contrastive Learning and Pseudo-Label Correction for Hyperspectral Image Clustering<br>Renxiang Guan, Zihao Li, ... (+2 more) | 👻 Ghosted | cs.CV | 42 | 2 years ago |
| 374 | Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation<br>Hyelin Nam, Jihong Park, ... (+3 more) | 👻 Ghosted | eess.SP | 42 | 2 years ago |
| 375 | Toward Universal Text-to-Music Retrieval<br>SeungHeon Doh, Minz Won, ... (+2 more) | 👻 Ghosted | cs.IR | 42 | 3 years ago |
| 376 | AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction<br>Jiuxin Lin, Xinyu Cai, ... (+8 more) | 👻 Ghosted | cs.MM | 41 | 2 years ago |
| 377 | Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input<br>Xingchen Song, Zhiyong Wu, ... (+4 more) | 👻 Ghosted | cs.SD | 41 | 5 years ago |
| 378 | Federated Neuromorphic Learning of Spiking Neural Networks for Low-Power Edge Intelligence<br>Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone | 👻 Ghosted | cs.LG | 41 | 6 years ago |
| 379 | Balanced Binary Neural Networks with Gated Residual<br>Mingzhu Shen, Xianglong Liu, ... (+2 more) | 👻 Ghosted | cs.CV | 41 | 6 years ago |
| 380 | A Learning-Based Framework for Line-Spectra Super-resolution<br>Gautier Izacard, Brett Bernstein, Carlos Fernandez-Granda | 👻 Ghosted | cs.LG | 41 | 7 years ago |
| 381 | Deep Transfer Learning for EEG-based Brain Computer Interface<br>Chuanqi Tan, Fuchun Sun, Wenchang Zhang | 👻 Ghosted | cs.CV | 41 | 7 years ago |
| 382 | End-to-End Multimodal Speech Recognition<br>Shruti Palaskar, Ramon Sanabria, Florian Metze | 👻 Ghosted | eess.AS | 41 | 7 years ago |
| 383 | A context-aware matching game for user association in wireless small cell networks<br>Nima Namvar, Walid Saad, ... (+2 more) | 👻 Ghosted | cs.NI | 41 | 10 years ago |
| 384 | Building Lane-Level Maps from Aerial Images<br>Jiawei Yao, Xiaochao Pan, ... (+2 more) | 💀 404 Not Found | cs.CV | 41 | 2 years ago |
| 385 | S3T: Self-Supervised Pre-training with Swin Transformer for Music Classification<br>Hang Zhao, Chen Zhang, ... (+3 more) | 👻 Ghosted | eess.AS | 41 | 4 years ago |
| 386 | Learning on Graphs under Label Noise<br>Jingyang Yuan, Xiao Luo, ... (+4 more) | 👻 Ghosted | cs.LG | 40 | 2 years ago |
| 387 | Adaptive Contention Window Design using Deep Q-learning<br>Abhishek Kumar, Gunjan Verma, ... (+3 more) | 👻 Ghosted | eess.SP | 40 | 5 years ago |
| 388 | A ReLU Dense Layer to Improve the Performance of Neural Networks<br>Alireza M. Javid, Sandipan Das, ... (+2 more) | 👻 Ghosted | cs.LG | 40 | 5 years ago |
| 389 | Tie Your Embeddings Down: Cross-Modal Latent Spaces for End-to-end Spoken Language Understanding<br>Bhuvan Agrawal, Markus Müller, ... (+4 more) | 👻 Ghosted | eess.AS | 40 | 5 years ago |
| 390 | Multistream CNN for Robust Acoustic Modeling<br>Kyu J. Han, Jing Pan, ... (+3 more) | 👻 Ghosted | eess.AS | 40 | 5 years ago |
| 391 | Teacher-Student Training for Robust Tacotron-based TTS<br>Rui Liu, Berrak Sisman, ... (+4 more) | 👻 Ghosted | cs.CL | 40 | 6 years ago |
| 392 | Deep Ptych: Subsampled Fourier Ptychography using Generative Priors<br>Fahad Shamshad, Farwa Abbas, Ali Ahmed | 👻 Ghosted | cs.LG | 40 | 7 years ago |
| 393 | Adapting End-to-End Neural Speaker Verification to New Languages and Recording Conditions with Adversarial Training<br>Gautam Bhattacharya, Jahangir Alam, Patrick Kenny | 👻 Ghosted | eess.AS | 40 | 7 years ago |
| 394 | A network of deep neural networks for distant speech recognition<br>Mirco Ravanelli, Philemon Brakel, ... (+2 more) | 👻 Ghosted | cs.CL | 40 | 9 years ago |
| 395 | Improving End-to-End Speech Recognition with Policy Learning<br>Yingbo Zhou, Caiming Xiong, Richard Socher | 👻 Ghosted | cs.CL | 40 | 8 years ago |
| 396 | Optimizing Vision Transformers for Medical Image Segmentation<br>Qianying Liu, Chaitanya Kaul, ... (+4 more) | 👻 Ghosted | cs.CV | 40 | 3 years ago |
| 397 | M3ST: Mix at Three Levels for Speech Translation<br>Xuxin Cheng, Qianqian Dong, ... (+4 more) | 👻 Ghosted | cs.CL | 40 | 3 years ago |
| 398 | Retrieval-Augmented Text-to-Audio Generation<br>Yi Yuan, Haohe Liu, ... (+4 more) | 👻 Ghosted | cs.SD | 39 | 2 years ago |
| 399 | Streaming Multi-speaker ASR with RNN-T<br>Ilya Sklyar, Anna Piunova, Yulan Liu | 👻 Ghosted | eess.AS | 39 | 5 years ago |
| 400 | Jointly optimal dereverberation and beamforming<br>Christoph Boeddeker, Tomohiro Nakatani, ... (+2 more) | 👻 Ghosted | cs.SD | 39 | 6 years ago |