R.I.P.
👻
Ghosted
Human Motion Instruction Tuning
November 25, 2024 · Declared Dead · 🏛 Computer Vision and Pattern Recognition
Authors
Lei Li, Sen Jia, Jianhao Wang, Zhongyu Jiang, Feng Zhou, Ju Dai, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang
arXiv ID
2411.16805
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.CV
Citations
14
Venue
Computer Vision and Pattern Recognition
Repository
https://github.com/ILGLJ/LLaMo
⭐ 5
Last Checked
1 month ago
Abstract
This paper presents LLaMo (Large Language and Human Motion Assistant), a multimodal framework for human motion instruction tuning. In contrast to conventional instruction-tuning approaches that convert non-linguistic inputs, such as video or motion sequences, into language tokens, LLaMo retains motion in its native form for instruction tuning. This method preserves motion-specific details that are often diminished in tokenization, thereby improving the model's ability to interpret complex human behaviors. By processing both video and motion data alongside textual inputs, LLaMo enables a flexible, human-centric analysis. Experimental evaluations across high-complexity domains, including human behaviors and professional activities, indicate that LLaMo effectively captures domain-specific knowledge, enhancing comprehension and prediction in motion-intensive scenarios. We hope LLaMo offers a foundation for future multimodal AI systems with broad applications, from sports analytics to behavioral prediction. Our code and models are available on the project website: https://github.com/ILGLJ/LLaMo.
📜 Similar Papers
In the same crypt — Artificial Intelligence
R.I.P. · 👻 Ghosted · Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
R.I.P. · 👻 Ghosted · Addressing Function Approximation Error in Actor-Critic Methods
R.I.P. · 👻 Ghosted · Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P. · 👻 Ghosted · Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
R.I.P. · 👻 Ghosted · Complex Embeddings for Simple Link Prediction
Died the same way — ⚰️ The Empty Tomb
R.I.P. · ⚰️ The Empty Tomb · DSFD: Dual Shot Face Detector
R.I.P. · ⚰️ The Empty Tomb · InstanceCut: from Edges to Instances with MultiCut
R.I.P. · ⚰️ The Empty Tomb · FLNet: Landmark Driven Fetching and Learning Network for Faithful Talking Facial Animation Synthesis