| 251 | Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech<br>David Harwath, Galen Chuang, James Glass | 👻 Ghosted | cs.CL | 60 | 8 years ago |
| 252 | EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance<br>Yiwei Guo, Chenpeng Du, ... (+2 more) | 👻 Ghosted | eess.AS | 60 | 3 years ago |
| 253 | Efficient Video and Audio processing with Loihi 2<br>Sumit Bam Shrestha, Jonathan Timcheck, ... (+3 more) | 👻 Ghosted | cs.NE | 59 | 2 years ago |
| 254 | End-to-end contextual speech recognition using class language models and a token passing decoder<br>Zhehuai Chen, Mahaveer Jain, ... (+3 more) | 👻 Ghosted | cs.CL | 59 | 7 years ago |
| 255 | EEG2IMAGE: Image Reconstruction from EEG Brain Signals<br>Prajwal Singh, Pankaj Pandey, ... (+2 more) | 👻 Ghosted | cs.HC | 58 | 3 years ago |
| 256 | Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals<br>Meng Ge, Chenglin Xu, ... (+4 more) | 👻 Ghosted | eess.AS | 58 | 5 years ago |
| 257 | Knowledge Distillation for Improved Accuracy in Spoken Question Answering<br>Chenyu You, Nuo Chen, Yuexian Zou | 👻 Ghosted | cs.CL | 58 | 5 years ago |
| 258 | Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition<br>Qiujia Li, David Qiu, ... (+6 more) | 👻 Ghosted | eess.AS | 58 | 5 years ago |
| 259 | Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation<br>Naveen Arivazhagan, Colin Cherry, ... (+4 more) | 👻 Ghosted | cs.CL | 58 | 6 years ago |
| 260 | FDDWNet: A Lightweight Convolutional Neural Network for Real-time Sementic Segmentation<br>Jia Liu, Quan Zhou, ... (+4 more) | 👻 Ghosted | cs.CV | 58 | 6 years ago |
| 261 | C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network<br>Munan Xu, Junming Chen, ... (+4 more) | 👻 Ghosted | eess.IV | 58 | 6 years ago |
| 262 | Deep Signal Recovery with One-Bit Quantization<br>Shahin Khobahi, Naveed Naimipour, ... (+2 more) | 👻 Ghosted | eess.SP | 58 | 7 years ago |
| 263 | Low-complexity Recurrent Neural Network-based Polar Decoder with Weight Quantization Mechanism<br>Chieh-Fang Teng, Chen-Hsi Wu, ... (+2 more) | 👻 Ghosted | eess.SP | 58 | 7 years ago |
| 264 | Complex Transformer: A Framework for Modeling Complex-Valued Sequence<br>Muqiao Yang, Martin Q. Ma, ... (+3 more) | 👻 Ghosted | cs.LG | 57 | 6 years ago |
| 265 | Sequence-to-sequence Singing Synthesis Using the Feed-forward Transformer<br>Merlijn Blaauw, Jordi Bonada | 👻 Ghosted | cs.SD | 57 | 6 years ago |
| 266 | Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis<br>Thomas Drugman, Alexis Moinet, ... (+2 more) | 👻 Ghosted | cs.SD | 57 | 6 years ago |
| 267 | Leveraging mmWave Imaging and Communications for Simultaneous Localization and Mapping<br>Mohammed Aladsani, Ahmed Alkhateeb, Georgios C. Trichopoulos | 👻 Ghosted | cs.IT | 57 | 7 years ago |
| 268 | Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement<br>Chao-Han Huck Yang, Jun Qi, ... (+3 more) | 👻 Ghosted | eess.AS | 56 | 6 years ago |
| 269 | Developing Far-Field Speaker System Via Teacher-Student Learning<br>Jinyu Li, Rui Zhao, ... (+5 more) | 👻 Ghosted | cs.CL | 56 | 8 years ago |
| 270 | Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis<br>Guanghui Xu, Wei Song, ... (+4 more) | 👻 Ghosted | eess.AS | 55 | 5 years ago |
| 271 | Meta-Learning to Communicate: Fast End-to-End Training for Fading Channels<br>Sangwoo Park, Osvaldo Simeone, Joonhyuk Kang | 👻 Ghosted | eess.SP | 55 | 6 years ago |
| 272 | Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion<br>Suwon Shon, Tae-Hyun Oh, James Glass | 👻 Ghosted | cs.CV | 55 | 7 years ago |
| 273 | Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection<br>Yu-Hsuan Wang, Hung-yi Lee, Lin-shan Lee | 👻 Ghosted | cs.CL | 55 | 7 years ago |
| 274 | Memory Visualization for Gated Recurrent Neural Networks in Speech Recognition<br>Zhiyuan Tang, Ying Shi, ... (+3 more) | 👻 Ghosted | cs.LG | 55 | 9 years ago |
| 275 | Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss<br>Chenglin Xu, Wei Rao, ... (+2 more) | 👻 Ghosted | eess.AS | 54 | 7 years ago |
| 276 | EEG-based video identification using graph signal modeling and graph convolutional neural network<br>Soobeom Jang, Seong-Eun Moon, Jong-Seok Lee | 👻 Ghosted | eess.SP | 54 | 7 years ago |
| 277 | Speech waveform synthesis from MFCC sequences with generative adversarial networks<br>Lauri Juvela, Bajibabu Bollepalli, ... (+5 more) | 👻 Ghosted | eess.AS | 54 | 8 years ago |
| 278 | No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models<br>Tara N. Sainath, Rohit Prabhavalkar, ... (+10 more) | 👻 Ghosted | cs.CL | 54 | 8 years ago |
| 279 | Compressive K-means<br>Nicolas Keriven, Nicolas Tremblay, ... (+2 more) | 👻 Ghosted | cs.LG | 54 | 9 years ago |
| 280 | Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information<br>Efthymios Tzinis, Shrikant Venkataramani, Paris Smaragdis | 👻 Ghosted | cs.LG | 53 | 7 years ago |
| 281 | Dual-fisheye lens stitching for 360-degree imaging<br>Tuan Ho, Madhukar Budagavi | 👻 Ghosted | cs.CV | 53 | 8 years ago |
| 282 | Performance of time delay estimation in a cognitive radar<br>Kumar Vijay Mishra, Yonina C. Eldar | 👻 Ghosted | cs.IT | 53 | 9 years ago |
| 283 | Song Recommendation with Non-Negative Matrix Factorization and Graph Total Variation<br>Kirell Benzi, Vassilis Kalofolias, ... (+2 more) | 🌅 Old Age | stat.ML | 53 | 10 years ago |
| 284 | Prefix tuning for automated audio captioning<br>Minkyu Kim, Kim Sung-Bin, Tae-Hyun Oh | 👻 Ghosted | eess.AS | 53 | 3 years ago |
| 285 | On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments<br>Jisi Zhang, Catalin Zorila, ... (+2 more) | 👻 Ghosted | eess.AS | 52 | 5 years ago |
| 286 | Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders<br>Yin-Jyun Luo, Chin-Chen Hsu, ... (+2 more) | 👻 Ghosted | eess.AS | 52 | 6 years ago |
| 287 | Generating Empathetic Responses by Looking Ahead the User's Sentiment<br>Jamin Shin, Peng Xu, ... (+2 more) | 👻 Ghosted | cs.CL | 52 | 6 years ago |
| 288 | Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning<br>Alexander H. Liu, Tao Tu, ... (+2 more) | 👻 Ghosted | cs.CL | 52 | 6 years ago |
| 289 | Optimal Importance Sampling for Federated Learning<br>Elsa Rizk, Stefan Vlaski, Ali H. Sayed | 👻 Ghosted | cs.LG | 51 | 5 years ago |
| 290 | Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts<br>Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels | 👻 Ghosted | cs.LG | 51 | 7 years ago |
| 291 | How to Improve Your Speaker Embeddings Extractor in Generic Toolkits<br>Hossein Zeinali, Lukas Burget, ... (+3 more) | 👻 Ghosted | cs.SD | 51 | 7 years ago |
| 292 | Advancing Connectionist Temporal Classification With Attention Modeling<br>Amit Das, Jinyu Li, ... (+2 more) | 👻 Ghosted | cs.CL | 51 | 8 years ago |
| 293 | SVSGAN: Singing Voice Separation via Generative Adversarial Network<br>Zhe-Cheng Fan, Yen-Lin Lai, Jyh-Shing Roger Jang | 👻 Ghosted | cs.SD | 51 | 8 years ago |
| 294 | Towards stationary time-vertex signal processing<br>Nathanael Perraudin, Andreas Loukas, ... (+2 more) | 👻 Ghosted | cs.LG | 51 | 9 years ago |
| 295 | Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model<br>Zhiyuan Ren, Zhihong Pan, ... (+2 more) | 👻 Ghosted | cs.CV | 51 | 3 years ago |
| 296 | Language Model is All You Need: Natural Language Understanding as Question Answering<br>Mahdi Namazifar, Alexandros Papangelis, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 5 years ago |
| 297 | Transformer-Transducers for Code-Switched Speech Recognition<br>Siddharth Dalmia, Yuzong Liu, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 5 years ago |
| 298 | Regularized Fourier Ptychography using an Online Plug-and-Play Algorithm<br>Yu Sun, Shiqi Xu, ... (+4 more) | 👻 Ghosted | cs.CV | 50 | 7 years ago |
| 299 | Non-native children speech recognition through transfer learning<br>Marco Matassoni, Roberto Gretter, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 7 years ago |
| 300 | Robust Speech Recognition Using Generative Adversarial Networks<br>Anuroop Sriram, Heewoo Jun, ... (+2 more) | 👻 Ghosted | cs.CL | 50 | 8 years ago |