GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network
April 25, 2023 ยท Entered Twilight ยท ๐ IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
"Derived repo from GitHub Pages (backfill)"
Evidence collected by the PWNC Scanner
Repo contents: README.md, img, index.html, music.py, out15.mp4, out16.mp4, style.css, video_ab, video_b_s, video_r, video_r_20, video_r_20_m, video_s
Authors
Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng
arXiv ID
2304.12704
Category
cs.SD: Sound
Cross-listed
cs.MM,
eess.AS
Citations
18
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Repository
https://github.com/im1eon/ICASSP23-GTNB-DG
โญ 4
Last Checked
13 days ago
Abstract
Music-driven 3D dance generation has become an intensive research topic in recent years with great potential for real-world applications. Most existing methods lack the consideration of genre, which results in genre inconsistency in the generated dance movements. In addition, the correlation between the dance genre and the music has not been investigated. To address these issues, we propose a genre-consistent dance generation framework, GTN-Bailando. First, we propose the Genre Token Network (GTN), which infers the genre from music to enhance the genre consistency of long-term dance generation. Second, to improve the generalization capability of the model, the strategy of pre-training and fine-tuning is adopted.Experimental results on the AIST++ dataset show that the proposed dance generation framework outperforms state-of-the-art methods in terms of motion quality and genre consistency.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Sound
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
CNN Architectures for Large-Scale Audio Classification
R.I.P.
๐ป
Ghosted
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
R.I.P.
๐ป
Ghosted
WaveGlow: A Flow-based Generative Network for Speech Synthesis
R.I.P.
๐ป
Ghosted