ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos

December 29, 2024 · Declared Dead · 🏛 IEEE International Conference on Multimedia and Expo

Repo contents: README.md

Authors Xilei Zhu, Huiyu Duan, Liu Yang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai, Patrick Le Callet arXiv ID 2412.20423 Category cs.CV: Computer Vision Cross-listed cs.MM Citations 4 Venue IEEE International Conference on Multimedia and Expo Repository https://github.com/iamazxl/ESVQA ⭐ 1 Last Checked 1 month ago

Abstract

With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users, delivering more captivating and interactive experiences. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In this paper, we use the concept of embodied experience to highlight this more immersive experience and study the new problem, i.e., embodied perceptual quality assessment for egocentric spatial videos. Specifically, we introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos captured using the Apple Vision Pro and their corresponding mean opinion scores (MOSs). Furthermore, we propose a novel multi-dimensional binocular feature fusion model, termed ESVQAnet, which integrates binocular spatial, motion, and semantic features to predict the overall perceptual quality. Experimental results demonstrate the ESVQAnet significantly outperforms 16 state-of-the-art VQA models on the embodied perceptual quality assessment task, and exhibits strong generalization capability on traditional VQA tasks. The database and code are available at https://github.com/iamazxl/ESVQA.