Additive Margin SincNet for Speaker Recognition
January 28, 2019 · Declared Dead · IEEE International Joint Conference on Neural Networks
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
João Antônio Chagas Nunes, David Macêdo, Cleber Zanchettin
arXiv ID
1901.10826
Category
eess.AS: Audio & Speech
Cross-listed
cs.CL, cs.LG, cs.NE, cs.SD, stat.ML
Citations
17
Venue
IEEE International Joint Conference on Neural Networks
Last Checked
3 months ago
Abstract
Speaker recognition is a challenging task with essential applications such as authentication, automation, and security. SincNet is a new deep-learning-based model that has produced promising results on this task. When training deep learning systems, the loss function is essential to network performance. The Softmax loss function is widely used in deep learning methods, but it is not the best choice for all kinds of problems. For distance-based problems, a new Softmax-based loss function called Additive Margin Softmax (AM-Softmax) is proving to be a better choice than the traditional Softmax. AM-Softmax introduces a margin of separation between the classes that forces samples from the same class to be closer to each other and also maximizes the distance between classes. In this paper, we propose a new approach for speaker recognition systems called AM-SincNet, which is based on SincNet but uses an improved AM-Softmax layer. The proposed method is evaluated on the TIMIT dataset and obtains an improvement of approximately 40% in the Frame Error Rate compared to SincNet.
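To make the margin idea in the abstract concrete, here is a minimal NumPy sketch of the commonly published AM-Softmax formulation. It is not the authors' code, and it does not attempt to reproduce the "improved" layer the abstract mentions. The function name am_softmax_loss, the argument names, and the default values s=30.0 and m=0.35 are assumptions chosen for illustration.

```python
import numpy as np

def am_softmax_loss(embeddings, weights, labels, s=30.0, m=0.35):
    """Mean AM-Softmax loss (illustrative sketch; s and m are assumed defaults).

    embeddings: (n, d) array of sample embeddings
    weights:    (d, c) array with one column per class
    labels:     (n,) integer array of class indices
    """
    # L2-normalize embeddings (rows) and class weights (columns) so that
    # their dot product equals the cosine of the angle between them.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = x @ w  # shape (n, c)

    # Subtract the additive margin m from the target-class cosine only.
    n = cos.shape[0]
    margin = np.zeros_like(cos)
    margin[np.arange(n), labels] = m
    logits = s * (cos - margin)

    # Numerically stable log-softmax followed by cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(n), labels].mean()
```

With m=0 this reduces to a scaled, normalized Softmax loss. A positive margin lowers the target-class logit, so a sample must sit closer to its own class direction to reach the same loss, which is the separation effect the abstract describes.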
Similar Papers
In the same crypt: Audio & Speech
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition (R.I.P., Ghosted)
DiffWave: A Versatile Diffusion Model for Audio Synthesis (R.I.P., Ghosted)
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech (R.I.P., Ghosted)
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis (R.I.P., Ghosted)
Generalized End-to-End Loss for Speaker Verification (R.I.P., Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (R.I.P., Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (R.I.P., Ghosted)
XGBoost: A Scalable Tree Boosting System (R.I.P., Ghosted)